## Thought this was cool: Steffen Rendle – libFM

News from the KDD CUP workshop. I was highly impressed by Steffen

Rendle, the author of libFM

collaborative filtering library. Steffen won the 2nd place in track1

and the 3rd place in track2.

Unlike our team who had around 15 people, and the Taiwanese team who

had around 20 people

Steffen worked alone, and got a great rating in BOTH tracks.

What is nice about Steffen’s work, is that he is using only a SINGLE

algorithm, and not an ensemble of

methods as typically deployed. The trick is the he does very smart

feature engineering to create a very good

feature matrix. Once he gets the representative feature matrix he uses

the libFM algorithm.

I asked Steffen to explain the essense of the method:

A is the input (design) matrix where each row is a case and each column a

(real-valued) predictor variable. I.e. the same way of feature engineering

as in other standard ML algorithms such as linear/ polynomial regression,

SVM, etc.Internally, the FM model works similarly as a polynomial regression, i.e.

it contains all pairwise interactions between any two variables in A. The

important difference to polynomial regression is that the model parameters

for variable interactions are factorized. This is the reason why FMs

perform well in problems such as recommender systems.

Some of the recent notable work of Steffen is a caching method for

caching ALS computation, that according to Steffen, significantly

speeds up ALS computation and makes it a lighter algorithm like SGD.

The work is described in his recent SIGIR 2011 paper.

A second interesting work is an online matrix factorization

computation described in the paper: Steffen Rendle, Lars Schmidt-Thieme (2008):

Online-Updating Regularized Kernel Matrix Factorization Models forLarge-Scale Recommender Systems, in Proceedings of the 2008 ACM

Conference on Recommender Systems (RecSys 2008), ACM.

When new users/ items are added into the system, only an incremental

update is computed.

Finally, Steffen gave a very detailed tutorial in KDD, about a whole

bunch of matrix factorization methods and the relations between them. I find the tutorial a good overview of the connections between

algorithms, however it is intended for intermediate user level who

actually master some of the algorithms in this domain.

As you may know, we have a very crude and preliminary libFM

implementation in GraphLab. libFM algorithm implementation contains a

subset of the full libFM

functionality with only three predictions: user, item and time. Users

are encouraged to check the original libFM library for enhanced implementation.

libFM library has a track record performance in KDD

CUP and is highly recommended collaborative filtering package.

from Large Scale Machine Learning and Other Animals: http://bickson.blogspot.com/2012/08/steffen-rendle-libfm.html?utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+blogspot%2FsYXZE+%28Large+Scale+Machine+Learning+and+Other+Animals%29

## 发表评论