TY - JOUR
TI - Art ticker
DO - https://doi.org/doi:10.7282/T3D79DGW
PY - 2016
AB - Considering the large number of artists that exist, there is valuable talent to be discovered. But the question arises: how can promising, emerging artists be found among the hundreds of thousands of names listed on aggregation websites such as artfacts.org and on thousands of art gallery websites? We introduce an application named ArtTicker, which uses techniques from machine learning, information retrieval, data mining, and text mining to crawl, rank, and analyze artists and their popularity on the web. We start by identifying names of artists who are not yet listed in large aggregate directories (such as artfacts) but are already represented by some galleries. This task requires crawling thousands of art gallery websites and extracting artist names from them. These websites share many common structures; however, there is also significant variety among them, and artist name extraction requires complex heuristics. We harvest thousands of artist names this way. The second phase of the project ranks these artists by their “web presence”. Since the value of any data mining model lies in its data, the data collection phase consisted of extensive crawling of a large number of news publication websites. To this end, we gather and cluster news from several leading art-related news websites and use many signals to rank and classify these news sources. An artist’s score is based on how prominently that artist is featured in the stream of art news articles. The final objective of finding emerging artists is met by identifying the names that are present on gallery websites, have a high media presence (high score), and are not yet listed on the artist aggregate sites. The working prototype analyzes over 150 English-language sources and can be easily extended by automatically crawling and analyzing related sources. It currently holds over 250,000 artists and over 70,000 articles from these news sources. In essence, this is a streaming application that, given any geographic area (say, Lower Manhattan), identifies the “hottest” artists who are not yet known.
KW - Computer Science
KW - Artists
KW - Data mining
LA - eng
ER -