
Thought this was cool: Misc Updates


How big is Facebook data? I got this update from my collaborator Aapo Kyrola:

This morning, there are more than one billion people using Facebook actively each month….

Facebook has also shared a number of key metrics with users along with the announcement, including 1.13 trillion Likes since its 2009 launch (note that this is actually probably higher, since the official press document contained a note accidentally left in from an editor about rolling back the number because of info shared previously with Businessweek), 140.3 billion friend connections, 219 billion photos uploaded, 17 billion location-tagged posts and 62.6 million songs played some 22 billion times.

I got the following 10 patterns for research in machine learning from Tianqi Chen. The list comes from John Langford's blog:

  1. Separate code from data.
  2. Separate input data, working data and output data.
  3. Save everything to disk frequently.
  4. Separate options from parameters.
  5. Do not use global variables.
  6. Record the options used to generate each run of the algorithm.
  7. Make it easy to sweep options.
  8. Make it easy to execute only portions of the code.
  9. Use checkpointing.
  10. Write demos and tests.
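As a rough illustration of patterns 6 and 7 (record the options used for each run, and make options easy to sweep), here is a minimal Python sketch. It is not taken from John's or Tianqi's code. The option names `learning_rate` and `num_factors`, the helper names `sweep` and `run_once`, and the "score" it writes are all hypothetical placeholders for a real experiment.

```python
import itertools
import json
import tempfile
from pathlib import Path


def sweep(grid):
    """Yield one options dict per combination of the values in `grid`."""
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def run_once(options, out_root):
    """Run one hypothetical experiment and record the options that produced it."""
    # Name the run directory after its options so runs never overwrite each other.
    name = "_".join(f"{k}={options[k]}" for k in sorted(options))
    run_dir = Path(out_root) / name
    run_dir.mkdir(parents=True, exist_ok=True)
    # Pattern 6: store the options next to the output they generated.
    (run_dir / "options.json").write_text(json.dumps(options, sort_keys=True))
    # Placeholder for real training: this "score" is made up for illustration.
    result = {"score": options["learning_rate"] * options["num_factors"]}
    (run_dir / "result.json").write_text(json.dumps(result))
    return run_dir
```

Looping `for opts in sweep({"learning_rate": [0.01, 0.1], "num_factors": [8, 16]})` and calling `run_once(opts, "runs")` would create four run directories, each holding its own `options.json`, so any result can later be traced back to the settings that produced it.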

Following John’s good practice, Tianqi applied some of these ideas when competing in KDD Cup, and here is a summary of his experience. Specifically, Tianqi uses Makefiles to manage multiple complex execution scripts.
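To give a feel for what a Makefile buys you here, the sketch below imitates the core Make rule in Python: rebuild a target only when it is missing or older than its inputs. This is an illustration under my own assumptions, not Tianqi's actual setup. The helper names `is_stale` and `run_stage` are hypothetical, and a real Makefile also handles dependency chains, which this does not.

```python
import os
import tempfile
from pathlib import Path


def is_stale(target, sources):
    """Return True if `target` is missing or older than any of `sources`."""
    target = Path(target)
    if not target.exists():
        return True
    target_mtime = target.stat().st_mtime
    return any(Path(s).stat().st_mtime > target_mtime for s in sources)


def run_stage(target, sources, build):
    """Call build(target, sources) only when the target is stale.

    Returns True if the stage ran and False if it was skipped.
    """
    if is_stale(target, sources):
        build(target, sources)
        return True
    return False
```

With each step of a pipeline wrapped this way, rerunning the whole script redoes only the stages whose inputs changed, which is also one way to get pattern 8 (execute only portions of the code).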

from Large Scale Machine Learning and Other Animals:


Written by cwyalpha

October 10, 2012 at 2:54 am

Posted in Uncategorized

