February 25, 2010

Using social networks to guide recommendation systems

punk bffs
Photo courtesy of Walt Jabsco

The problem of trying to incorporate social networks into collaborative filtering recommendations seems to be a hot research topic right now. The basic idea of this problem is that one may have a dataset consisting of many different ratings by a user of a thing, like a movie or product, which takes on a number from 0 to 1. What we would like to do is to predict how much a given user will like something which they have never rated before.

In collaborative filtering the approaches have two axes user/item similarity and memory based or model based approaches.

The first axis describes what kind of similarity you are leveraging in order to make your recommendations. User similarity asserts that people who have rated things similar to you in the past are likely to rate a new thing in the same way. Item similarity assets that you are likely to rate a new thing the same way that you have rated similar things in the past. The first requires a way of determining whether users are similar, the latter a way of determining whether items are similar. In either case you can just use your ratings themselves as the basis for similarity or you can use some external knowledge to judge similarity.

The second axis describes how you store the information from which you base your decisions. In a memory based approach, you just keep all of your rankings around and when it comes time to make a new rating on an unseen item, you go to your data and do your analysis. In a model-based approach, every time a new rating is observed, a new model is constructed for a user. When it is time to rate a new unseen item, then the model is consulted for a rating.

Two recent papers that explore these issues are "On Social Networks and Collaborative Recommendation" by Konstas, Stathopoulus, and Jose and "Learning to Recommend with Social Trust Ensemble" by Ma, King and Lyu, both from SIGIR 2009.

The first paper, "On Social..." undertook the task of trying to create a list of songs that a user would like based on their previous history of listening to songs in Last.fm, who their friends are (and their history), and then a collection of tags which applied to users and music. There approach was an attempt to merge both user and item similarity with a social network in a memory based approach. Because they ultimately created a play list sorted by most-likely-to-be-liked, they had a hard time comparing their results to traditional collaborative filtering systems which produce a hypothetical how-much-do-you-like-a-song rating for every song given a user.

Their approach was very interesting to me because they basically created a graph in which users, songs and tags were nodes and the relations between them were represented as weighted edges in a graph. Then they ran the PageRank algorithm over the personalized graph, pulled out the songs in the graph that were mostly highly rated and that was the recommendation. The weights on the edges of the graph required some black magic tuning.

It appears that any memory-based system that has significant computation, like the previous one, suffers from a real-time response challenge. If you are creating a system like Pandora, it's not really a problem because you have quite a bit of time in which to pick the next song about as long as the current song takes to play.

But if your system is more of a query-response system where you ask "Will I like song X?" then you really only have milliseconds to get an answer. This suggests to me that a model based approach is nearly required in fast-response systems.

The second paper, "Learning to Recommend..." was similar in spirit but very different in execution. The goal of the authors of this paper was to create a recommendation system of products based on the reviews in Epinions.com. The key feature that the authors wanted to include was a social-recommendation component and the basic assumption they were exploring: What you like is based on a combination of your own tastes and the tastes of your social network. When it comes to epinions this work shows that to be true in a 40/60 split respectively.

So they cast the problem of collaborative filtering as a graphical model and used results that showed how the matrix manipulations associated with collaborative filtering can be solved as just such a model. Then they showed how as social graph can also be cast as a graphical model. So the first graphical model says how likely you are to like something based on your previous history of ratings and the second component says how likely is your social network likely to like something. Then they combined the graphical models and derived an optimization technique for finding a local optimal solution to the matrix problem in the beginning. The result was a model that performed a query on an item quickly and did better than simply looking at a users tastes by themselves, or a social networks tastes by themselves.

Posted by djp3 at February 25, 2010 10:14 AM | TrackBack (0)
Post a comment

Post a comment