How Does Spotify Know You So Well? The Science Behind Personalized Music Recommendations

How does Spotify’s magic engine run? How does it manage to nail individual users’ tastes so much more accurately than any other service?

Spotify doesn’t actually use a single revolutionary recommendation model. Instead, they mix together some of the best strategies used by other services to create their own uniquely powerful discovery engine.

Spotify’s Three Types of Recommendation Models:

Recommendation Model #1: Collaborative Filtering (which analyzes both your behavior and others’ behaviors)

Spotify doesn’t have a star-based system with which users rate their music. Instead, Spotify’s data is implicit feedback — specifically, the stream counts of tracks and additional streaming data, such as whether a user saved the track to their own playlist or visited the artist’s page after listening to a song. Collaborative filtering then uses that data to say:

“Hmmm… You both like three of the same tracks — Q, R, and S — so you are probably similar users. Therefore, you’re each likely to enjoy other tracks that the other person has listened to, that you haven’t heard yet.” Picture two such users: the first has listened to tracks P, Q, R, and S, and the second to Q, R, S, and T. The model therefore suggests that the second user check out track P, the one track only the first user has heard and enjoyed, and that the first user check out track T, by the same reasoning. Simple, right? A toy version of this logic is sketched below.
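
Here is a minimal sketch of that logic in Python, using the same invented track letters; the two users and their listening histories are made up purely for illustration.

```python
# Naive collaborative filtering: find a user whose listening history overlaps
# with yours, then suggest the tracks they know that you haven't heard yet.
# Track letters and histories are invented for illustration.
user_one = {"P", "Q", "R", "S"}   # tracks the first user has listened to
user_two = {"Q", "R", "S", "T"}   # tracks the second user has listened to

shared = user_one & user_two              # {"Q", "R", "S"} -> similar taste
suggest_to_two = user_one - user_two      # {"P"}
suggest_to_one = user_two - user_one      # {"T"}

print("Shared tracks:", shared)
print("Suggest to the second user:", suggest_to_two)
print("Suggest to the first user:", suggest_to_one)
```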

But how does Spotify actually use that concept in practice to calculate suggested tracks for millions of users based on millions of other users’ preferences? With matrix math, done with Python libraries!
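
In other words, all of that listening data is poured into one enormous matrix with a row for every user and a column for every song, and that matrix is factored into compact “taste” vectors, one per user and one per song. A user’s predicted affinity for a song is then just the dot product of the two vectors. The sketch below shows the idea with NumPy on a tiny invented play-count matrix; the numbers, vector sizes, and training loop are illustrative assumptions, nowhere near the scale or the exact algorithm Spotify actually runs.

```python
import numpy as np

# Toy implicit-feedback matrix: rows are users, columns are songs, entries are
# play counts (0 means the user never streamed that song). Purely invented data.
plays = np.array([
    [5, 3, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 0, 5, 4],
    [0, 1, 5, 4],
], dtype=float)

n_users, n_songs = plays.shape
n_factors = 2          # length of the latent "taste" vectors
lr, reg = 0.01, 0.1    # learning rate and regularization strength

rng = np.random.default_rng(0)
user_vecs = rng.normal(scale=0.1, size=(n_users, n_factors))
song_vecs = rng.normal(scale=0.1, size=(n_songs, n_factors))

# Factor the matrix with stochastic gradient descent over the observed plays.
for _ in range(200):
    for u, s in zip(*plays.nonzero()):
        p, q = user_vecs[u].copy(), song_vecs[s].copy()
        error = plays[u, s] - p @ q
        user_vecs[u] += lr * (error * q - reg * p)
        song_vecs[s] += lr * (error * p - reg * q)

# Predicted affinity for every (user, song) pair is a dot product of vectors.
scores = user_vecs @ song_vecs.T
scores[plays > 0] = -np.inf       # don't re-recommend songs already played
print("Top new recommendation per user:", scores.argmax(axis=1))
```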

Recommendation Model #2: Natural Language Processing (NLP) (which analyzes text)

The second type of recommendation model that Spotify employs is the Natural Language Processing (NLP) model. The source data for these models, as the name suggests, is regular words: track metadata, news articles, blogs, and other text from around the internet. Natural Language Processing, the ability of a computer to understand human language as it is spoken and written, is a vast field unto itself, often harnessed through sentiment analysis APIs.

Spotify constantly crawls the web for blog posts and other written text about music to figure out what people are saying about specific artists and songs: which adjectives and what particular language are frequently used in reference to them, and which other artists and songs are being discussed alongside them.
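
One way to picture this step: gather the text written about each artist and pull out the terms that most distinguish that artist from the others, each with a weight. The sketch below does this with scikit-learn’s TfidfVectorizer on a few invented snippets; the artists, text, and weighting scheme are assumptions for illustration, not Spotify’s actual pipeline.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Invented blog-style snippets, one blob of text per artist (purely illustrative).
docs = {
    "Artist A": "dreamy lo-fi bedroom pop with hazy reverb-soaked vocals",
    "Artist B": "aggressive industrial techno with relentless pounding basslines",
    "Artist C": "dreamy ambient techno with hazy pads and slow builds",
}

# TF-IDF gives a high weight to words that describe one artist but not the others.
vectorizer = TfidfVectorizer()
tfidf = vectorizer.fit_transform(docs.values())
terms = vectorizer.get_feature_names_out()

for artist, row in zip(docs, tfidf.toarray()):
    top = sorted(zip(terms, row), key=lambda pair: pair[1], reverse=True)[:3]
    print(artist, "->", [(term, round(weight, 2)) for term, weight in top])
```

Artists whose weighted terms look alike can then be treated as similar to one another, in much the same way that users with similar listening histories are.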

Recommendation Model #3: Raw Audio Models (which analyze the raw audio tracks themselves)

First of all, adding a third model further improves the accuracy of the music recommendation service. But this model also serves a secondary purpose: unlike the first two types, raw audio models take new songs into account.

Take, for example, a song your singer-songwriter friend has put up on Spotify. Maybe it only has 50 listens, so there are few other listeners to collaboratively filter it against. It also isn’t mentioned anywhere on the internet yet, so NLP models won’t pick it up. Luckily, raw audio models don’t discriminate between new tracks and popular tracks, so with their help, your friend’s song could end up in a Discover Weekly playlist alongside popular songs!

But how can we analyze raw audio data, which seems so abstract? With convolutional neural networks!

Convolutional neural networks are the same technology used in facial recognition software. In Spotify’s case, they’ve been modified for use on audio data instead of pixels.
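
A minimal sketch of the idea, assuming PyTorch and a log-mel spectrogram as input: the audio is turned into a 2-D time-frequency “image”, convolutional filters slide across it just as they slide across the pixels of a photo, and the pooled features feed a small head that predicts characteristics of the track (tempo, key, loudness, and so on). The layer sizes and output targets here are illustrative assumptions, not Spotify’s actual architecture.

```python
import torch
import torch.nn as nn

class AudioCNN(nn.Module):
    """Toy CNN over a (batch, 1, mel_bins, time_frames) spectrogram tensor."""

    def __init__(self, n_outputs: int = 4):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),   # global pooling: one value per filter
        )
        # Head predicting a handful of audio characteristics (e.g. tempo, key,
        # loudness); the exact targets are an assumption for illustration.
        self.head = nn.Linear(32, n_outputs)

    def forward(self, spectrogram: torch.Tensor) -> torch.Tensor:
        x = self.features(spectrogram)
        return self.head(x.flatten(start_dim=1))

# Fake batch: 8 clips, 1 channel, 128 mel bins, 400 time frames of random noise.
model = AudioCNN()
dummy = torch.randn(8, 1, 128, 400)
print(model(dummy).shape)   # torch.Size([8, 4])
```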

 

Content Courtesy: medium.com