Post Peer-Review Discussion continues and a remarkable dataset

The discussion on a post peer-review model has drawn more than 40 comments, and one of the most important aspects is how to make the system as trustworthy as possible. If you have any thoughts on the matter, please share them with us.

In one of the threads, there was a discussion about recommender capabilities. Since we were looking at Arxaliv as a model (it is a Reddit clone), I went to the Reddit discussion on the development of that open source platform and found that Reddit is actually looking for a recommender system, and that they have a nice dataset:

There are 23,091,688 votes from 43,976 users over 3,436,063 links in 11,675 reddits. (Interestingly these ~44k users represent almost 17% of our total votes). The dump is 2.2gb uncompressed, 375mb in bz2.

A reddit is a category, and a link is a subject (in Arxaliv it would be a paper), so the user-by-link vote matrix (43,976 x 3,436,063) is very sparsely filled: only about 1.5e-4 of its entries hold a vote. Some SVD approaches have been tried, but I am sure they haven't looked at low-rank solvers. Since Reddit is such a massive platform, if your algorithm provides good results, it will get known well beyond your expectations.
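To make the sparsity figure concrete, here is a minimal Python sketch that recomputes the fill fraction from the counts quoted above and then runs a plain power iteration on a tiny toy matrix to recover its leading singular triplet. The function names (`density`, `top_singular_triplet`), the toy data, and the choice to treat missing votes as zeros are all assumptions made for illustration; this is not what Reddit tried, and it is not a real low-rank solver, just the simplest rank-1 building block.

```python
import math

# Counts quoted in the post.
N_VOTES = 23_091_688
N_USERS = 43_976
N_LINKS = 3_436_063


def density(n_votes, n_rows, n_cols):
    """Fraction of matrix entries that actually hold a vote."""
    return n_votes / (n_rows * n_cols)


def top_singular_triplet(votes, n_rows, n_cols, iters=50):
    """Power iteration for the leading singular triplet of a sparse matrix.

    `votes` maps (row, col) -> value. Missing entries are treated as zeros,
    which is a simplifying assumption, not a claim about the real data.
    """
    v = [1.0 / math.sqrt(n_cols)] * n_cols
    u = [0.0] * n_rows
    sigma = 0.0
    for _ in range(iters):
        # u <- A v, then normalize
        u = [0.0] * n_rows
        for (i, j), x in votes.items():
            u[i] += x * v[j]
        norm_u = math.sqrt(sum(a * a for a in u))
        u = [a / norm_u for a in u]
        # v <- A^T u; its norm is the current singular value estimate
        v = [0.0] * n_cols
        for (i, j), x in votes.items():
            v[j] += x * u[i]
        sigma = math.sqrt(sum(a * a for a in v))
        v = [a / sigma for a in v]
    return sigma, u, v


if __name__ == "__main__":
    print(f"fill fraction: {density(N_VOTES, N_USERS, N_LINKS):.2e}")
    print(f"average votes per user: {N_VOTES / N_USERS:.0f}")

    # Hypothetical rank-1 toy matrix [[3, 4], [6, 8]], stored sparsely.
    toy = {(0, 0): 3.0, (0, 1): 4.0, (1, 0): 6.0, (1, 1): 8.0}
    sigma, u, v = top_singular_triplet(toy, 2, 2)
    print(f"leading singular value of the toy matrix: {sigma:.3f}")
```

Only the stored votes are ever touched, so each iteration costs time proportional to the number of votes rather than to the full 43,976 x 3,436,063 grid. On the real dump you would want a proper sparse linear algebra library (for instance `scipy.sparse.linalg.svds`) or a dedicated matrix completion solver rather than pure Python loops.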

