Keywords: music discovery, music history, similarity search
March 2004 - 2007



As more artists blend different styles of music, traditional genres are becoming less useful for describing an artist's creative output. Listeners also have varied tastes that often span widely different genres, and other listeners often like the same collection of artists. Because of our strong interest in music, Roy Kim and I decided to create a system that could recommend artists matching a listener's tastes. That is how the project was born.

The basic idea is simple: users install a small plugin for their music player, which sends metadata about the currently playing song to our servers. As a first level of service, users can browse this history of played songs and look at various statistics about their listening preferences. They can also generate dynamic text or images based on their current song (e.g. for display on their webpages or AIM profiles).

However, the more interesting use of the data comes from a backend analyzer (written entirely by me), which looks at all the data and automatically determines which artists are similar to each other using a few different techniques. The backend was written with several implementation constraints in mind. Since we do not have the resources to hold all our data in memory at once and process it in a single pass, the program processes one manageable chunk of data at a time (in terms of both memory and execution time) and combines the results once each segment completes. Another major portion of the program deals with incorrect data, of which there is a lot, since we rely on the ID3 tags in music files, and these are frequently mislabelled. The output of the backend is stored in a database, which the frontend uses to display personalized suggestions for each user.
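
The chunk-and-combine approach can be illustrated with a minimal Python sketch. This is not the actual backend code. The function names, the tag cleanup rules, and the cosine-style co-occurrence score are all assumptions made for illustration, since the specific similarity techniques are not spelled out here. The sketch also assumes each user's history falls entirely within one chunk, so that partial counts can simply be added together.

```python
import math
import re
from collections import Counter, defaultdict
from itertools import combinations, islice


def normalize_artist(name):
    """Hypothetical tag cleanup: fold case, strip punctuation, drop a leading 'the'."""
    cleaned = re.sub(r"[^\w\s]+", " ", name.casefold())
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    if cleaned.startswith("the "):
        cleaned = cleaned[4:]
    return cleaned


def chunks(iterable, size):
    """Yield successive lists of at most `size` items."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def process_chunk(user_histories):
    """Count listeners per artist and shared listeners per artist pair for one chunk."""
    listeners = Counter()
    pairs = Counter()
    for plays in user_histories:
        # Deduplicate per user and discard tags that normalize to nothing.
        artists = sorted({normalize_artist(a) for a in plays} - {""})
        listeners.update(artists)
        pairs.update(combinations(artists, 2))
    return listeners, pairs


def similar_artists(all_histories, chunk_size=1000):
    """Combine per-chunk counts, then score each pair (assumed cosine-style measure)."""
    listeners, pairs = Counter(), Counter()
    for batch in chunks(all_histories, chunk_size):
        chunk_listeners, chunk_pairs = process_chunk(batch)
        listeners.update(chunk_listeners)  # Counter.update adds counts
        pairs.update(chunk_pairs)
    scores = defaultdict(dict)
    for (a, b), shared in pairs.items():
        score = shared / math.sqrt(listeners[a] * listeners[b])
        scores[a][b] = score
        scores[b][a] = score
    return scores


if __name__ == "__main__":
    histories = [
        ["The Beatles", "Radiohead"],
        ["beatles", "RADIOHEAD ", "Bjork"],
        ["Bjork"],
    ]
    for artist, score in similar_artists(histories, chunk_size=2)["beatles"].items():
        print(artist, round(score, 2))
```

Because the per-chunk results are plain counts, they can be summed in any order, so the chunk size affects only memory use and run time, not the final scores.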