Just another site

Archive for June 2013

Thought this was cool: How to Compute the Similarity of Two Documents (Full-Text Documents)

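  1. How to Compute the Similarity of Two Documents (Part 1)
  2. How to Compute the Similarity of Two Documents (Part 2)
  3. How to Compute the Similarity of Two Documents (Part 3)
  4. Probabilistic Language Models and Their Variants: LDA and Gibbs Sampling
  5. Probabilistic Language Models and Their Variants: PLSA and the EM Algorithm
  6. LDA-math: The Magical Gamma Function (1)
  7. LDA-math: The Magical Gamma Function (2)
  8. LDA-math: The Magical Gamma Function (3)
  9. Probabilistic Language Models and Their Variants: A Java Implementation of LDA Gibbs Sampling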

from 我爱自然语言处理 (I Love Natural Language Processing):


Written by cwyalpha

June 10, 2013 at 2:38 am

Posted in Uncategorized

Thought this was cool: Python Questions Even More Brain-Dead Than Math Olympiad Problems

The statements below are executed in order in a CPython interactive shell. Fill in the answer at each blank (____).

>>> x = [1, 2, 3]
>>> for number in x:
...     number += 1
...
>>> print x
____
>>> print 1 == True
____
>>> print 0 == False
____
>>> print 2 == True
____
>>> item = 'x'
>>> print item == "y" or "z"
____
>>> l = [1,6,4,2,3,9,4,6]
>>> l = l.sort()
>>> print l
____
>>> ll = [lambda: n for n in range(5)]
>>> print [l() for l in ll]
____
>>> my_gen = (i for i in range(10))
>>> print(3 in my_gen)
____
>>> print(7 in my_gen)
____
>>> print(2 in my_gen)
____
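
For anyone who wants to check their answers, here is a rough sketch that replays the quiz as a plain script. It is a Python 3 port (the quiz itself uses Python 2 `print` statements), and the `run_quiz` function and its dictionary keys are names made up for this sketch, not part of the original post. The comments give the usual explanation for each result.

```python
# Python 3 port of the quiz; run_quiz and the dictionary keys are
# hypothetical names introduced only for this sketch.

def run_quiz():
    answers = {}

    # `number += 1` rebinds the loop variable to a new int;
    # the list elements themselves are never touched.
    x = [1, 2, 3]
    for number in x:
        number += 1
    answers["x"] = x

    # bool is a subclass of int: True == 1 and False == 0,
    # so 2 is equal to neither of them.
    answers["1 == True"] = (1 == True)
    answers["0 == False"] = (0 == False)
    answers["2 == True"] = (2 == True)

    # `==` binds tighter than `or`, so this is (item == "y") or "z".
    # The left side is False, so `or` returns its right operand.
    item = "x"
    answers["item"] = item == "y" or "z"

    # list.sort() sorts in place and returns None.
    l = [1, 6, 4, 2, 3, 9, 4, 6]
    l = l.sort()
    answers["l"] = l

    # Each lambda looks up n when it is called, not when it is defined.
    # By then the comprehension has finished and n is 4.
    ll = [lambda: n for n in range(5)]
    answers["lambdas"] = [f() for f in ll]

    # `in` consumes a generator up to and including the first match,
    # so values that were already consumed can no longer be found.
    my_gen = (i for i in range(10))
    answers["gen"] = [3 in my_gen, 7 in my_gen, 2 in my_gen]

    return answers


if __name__ == "__main__":
    for question, answer in run_quiz().items():
        print(question, "->", answer)
```

The generator case is the one that trips most people up: the third test fails only because the first two already consumed everything up to 7, and the search for 2 then drains the rest.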


from est's blog:

Written by cwyalpha

June 5, 2013 at 10:38 am

Posted in Uncategorized

Thought this was cool: What Exactly Went Wrong at Microsoft

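Recently there was a discussion on HN about the new tickless feature in the Linux 3.1 kernel, and a Microsoft employee, claiming to be from the NT kernel team, posted a complaint about Microsoft: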

See, component owners are generally openly hostile to outside patches: if you’re a dev, accepting an outside patch makes your lead angry (due to the need to maintain this patch and to justify in shiproom the unplanned design change), makes test angry (because test is on the hook for making sure the change doesn’t break anything, and you just made work for them), and PM is angry (due to the schedule implications of code churn). There’s just no incentive to accept changes from outside your own team. You can always find a reason to say “no”, and you have very little incentive to say “yes”.

Another reason for the quality gap is that we’ve been having trouble keeping talented people. Google and other large Seattle-area companies keep poaching our best, most experienced developers, …

More examples:
– We can’t touch named pipes. Let’s add %INTERNAL_NOTIFICATION_SYSTEM%! (Oh, and let’s make %INTERNAL_NOTIFICATION_SYSTEM% inconsistent with virtually every other named NT primitive.)
– We can’t expose %INTERNAL_NOTIFICATION_SYSTEM% to the rest of the world because we don’t want to fill out paperwork and we’re not losing sales because we only have 1990s-era Win32 APIs available publicly.
– We can’t touch DCOM. Let’s create %C#_REMOTING_FLAVOR_OF_THE_WEEK%!
– XNA. Need I say more?
– Why would anyone need an archive format that supports files larger than 2GB?
– Let’s support symbolic links (Can I have a one on my review score now?), but make sure that nobody can use them so I don’t get blamed for security vulnerabilities (Great! I got that one on my review score, and now I get to look sage and responsible!)
– We can’t touch Source Depot, so let’s hack together SDX!
– We can’t touch SDX, so let’s pretend for four releases that we’re moving to TFS while not actually changing anything!
– Oh god, the NTFS code is a purple opium-fueled Victorian horror novel that uses global recursive locks and SEH for flow control. Let’s write ReFS instead. (And hey, let’s start by copying and pasting the NTFS source code and removing half the features! Then let’s add checksums, because checksums are cool, right, and now with checksums we’re just as good as ZFS? Right? Do I get a one on my review score now? And who the hell needs quotas anyway?)
– We just can’t be fucked to implement C11 support, and variadic templates were just too hard to implement in a year. (But ohmygosh I turned “^” into a reference-counted pointer operator. Can I have my patent cube and one on my review score now? Oh, and what’s a reference cycle?)


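  1. Improvements all get shot down by a bloated bureaucracy.
  2. The top talent has all left for Google.
  3. The kernel developers are all freshly hired new graduates, a group highly prone to wheel-reinventing NIH syndrome.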

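I also worked with the Microsoft technology stack back in college. It has its impressive parts and its highlights, but every even-numbered generation repudiates the technology of the odd-numbered generation before it. I remember a friend who put a great deal of effort into studying and translating .NET Atlas AJAX, only for the whole AJAX stack to change beyond recognition as soon as the beta ended.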

Microsoft is already dead on the inside. There is no hope for Wintel. The PC will keep declining too.

btw, this reminds me of a picture

from est's blog:

Written by cwyalpha

June 5, 2013 at 10:23 am

Posted in Uncategorized