CWYAlpha

Just another WordPress.com site

Thought this was cool: Drawing arbitrary shapes with functions

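Lately we keep seeing figures such as Psy or Hatsune Miku drawn as the graph of a function, which looks quite magical. WolframAlpha hosts a large collection of people drawn this way that you can go and browse. The few characters at the very top of this post were also drawn from a function graph, and today I'll explain how it's done.

First, here are the functions for the figure I drew: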

x(t)= 3.69696969697 *cos( 0.0 *t)- 1.78787878788 *sin( 0.0 *t) + -0.608631557183 *cos( 0.190399554763 *t)- 0.637886769069 *sin( 0.190399554763 *t) + -1.00348014017 *cos( 0.380799109526 *t)- -0.251077539499 *sin( 0.380799109526 *t) + 0.0858300019403 *cos( 0.571198664289 *t)- 0.0312569171374 *sin( 0.571198664289 *t) + 0.0403938878342 *cos( 0.761598219052 *t)- -0.424077642684 *sin( 0.761598219052 *t) + -0.029533908125 *cos( 0.951997773815 *t)- -0.612124881382 *sin( 0.951997773815 *t) + -0.196190215552 *cos( 1.14239732858 *t)- 0.0779263864101 *sin( 1.14239732858 *t) + -0.317379458544 *cos( 1.33279688334 *t)- -0.107897599954 *sin( 1.33279688334 *t) + 0.195824060124 *cos( 1.5231964381 *t)- -0.0603270845874 *sin( 1.5231964381 *t) + 0.0206709920939 *cos( 1.71359599287 *t)- -0.192643145398 *sin( 1.71359599287 *t) + -0.0390613766898 *cos( 1.90399554763 *t)- -0.0233815418361 *sin( 1.90399554763 *t) + 0.120124291368 *cos( 2.09439510239 *t)- 0.165578836822 *sin( 2.09439510239 *t) + 0.118680071019 *cos( 2.28479465716 *t)- 0.192116690811 *sin( 2.28479465716 *t) + 0.000254592255497 *cos( 2.47519421192 *t)- 0.0310134506924 *sin( 2.47519421192 *t) + 0.082560849545 *cos( 2.66559376668 *t)- -0.0576194138539 *sin( 2.66559376668 *t) + -0.0926779308527 *cos( 2.85599332145 *t)- -0.0839813133077 *sin( 2.85599332145 *t) + -0.192145472441 *cos( 3.04639287621 *t)- -0.00294302065064 *sin( 3.04639287621 *t) + -0.0467279999084 *cos( 3.23679243097 *t)- 0.0312620057434 *sin( 3.23679243097 *t) + -0.131380808582 *cos( 3.42719198573 *t)- 0.0240572405524 *sin( 3.42719198573 *t) + -0.146394824366 *cos( 3.6175915405 *t)- 0.0428285063991 *sin( 3.6175915405 *t) + -0.0688190483421 *cos( 3.80799109526 *t)- -0.02541184382 *sin( 3.80799109526 *t) + -0.0901484634214 *cos( 3.99839065002 *t)- 0.181258333651 *sin( 3.99839065002 *t) + -0.0898212610648 *cos( 4.18879020479 *t)- -0.0443667156102 *sin( 4.18879020479 *t) + -0.0750426044852 *cos( 4.37918975955 *t)- 0.0736544316679 *sin( 4.37918975955 *t) + 
-0.119525232263 *cos( 4.56958931431 *t)- -0.0355835823163 *sin( 4.56958931431 *t) + -0.193150001399 *cos( 4.75998886908 *t)- 0.0319541260203 *sin( 4.75998886908 *t) + 0.25777892582 *cos( 4.95038842384 *t)- -0.000416186591487 *sin( 4.95038842384 *t) + 0.171027143418 *cos( 5.1407879786 *t)- 0.218778298585 *sin( 5.1407879786 *t) + -0.16745165654 *cos( 5.33118753336 *t)- -0.127545487283 *sin( 5.33118753336 *t) + 0.242332056731 *cos( 5.52158708813 *t)- -0.263812652924 *sin( 5.52158708813 *t) + 0.0367447452299 *cos( 5.71198664289 *t)- -0.534397947337 *sin( 5.71198664289 *t) + -0.325288006456 *cos( 5.90238619765 *t)- 0.163558793585 *sin( 5.90238619765 *t) + -1.13634134796 *cos( 6.09278575242 *t)- -0.84340197599 *sin( 6.09278575242 *t)

y(t)= 1.78787878788 *cos( 0.0 *t)+ 3.69696969697 *sin( 0.0 *t) + 0.637886769069 *cos( 0.190399554763 *t)+ -0.608631557183 *sin( 0.190399554763 *t) + -0.251077539499 *cos( 0.380799109526 *t)+ -1.00348014017 *sin( 0.380799109526 *t) + 0.0312569171374 *cos( 0.571198664289 *t)+ 0.0858300019403 *sin( 0.571198664289 *t) + -0.424077642684 *cos( 0.761598219052 *t)+ 0.0403938878342 *sin( 0.761598219052 *t) + -0.612124881382 *cos( 0.951997773815 *t)+ -0.029533908125 *sin( 0.951997773815 *t) + 0.0779263864101 *cos( 1.14239732858 *t)+ -0.196190215552 *sin( 1.14239732858 *t) + -0.107897599954 *cos( 1.33279688334 *t)+ -0.317379458544 *sin( 1.33279688334 *t) + -0.0603270845874 *cos( 1.5231964381 *t)+ 0.195824060124 *sin( 1.5231964381 *t) + -0.192643145398 *cos( 1.71359599287 *t)+ 0.0206709920939 *sin( 1.71359599287 *t) + -0.0233815418361 *cos( 1.90399554763 *t)+ -0.0390613766898 *sin( 1.90399554763 *t) + 0.165578836822 *cos( 2.09439510239 *t)+ 0.120124291368 *sin( 2.09439510239 *t) + 0.192116690811 *cos( 2.28479465716 *t)+ 0.118680071019 *sin( 2.28479465716 *t) + 0.0310134506924 *cos( 2.47519421192 *t)+ 0.000254592255497 *sin( 2.47519421192 *t) + -0.0576194138539 *cos( 2.66559376668 *t)+ 0.082560849545 *sin( 2.66559376668 *t) + -0.0839813133077 *cos( 2.85599332145 *t)+ -0.0926779308527 *sin( 2.85599332145 *t) + -0.00294302065064 *cos( 3.04639287621 *t)+ -0.192145472441 *sin( 3.04639287621 *t) + 0.0312620057434 *cos( 3.23679243097 *t)+ -0.0467279999084 *sin( 3.23679243097 *t) + 0.0240572405524 *cos( 3.42719198573 *t)+ -0.131380808582 *sin( 3.42719198573 *t) + 0.0428285063991 *cos( 3.6175915405 *t)+ -0.146394824366 *sin( 3.6175915405 *t) + -0.02541184382 *cos( 3.80799109526 *t)+ -0.0688190483421 *sin( 3.80799109526 *t) + 0.181258333651 *cos( 3.99839065002 *t)+ -0.0901484634214 *sin( 3.99839065002 *t) + -0.0443667156102 *cos( 4.18879020479 *t)+ -0.0898212610648 *sin( 4.18879020479 *t) + 0.0736544316679 *cos( 4.37918975955 *t)+ -0.0750426044852 *sin( 4.37918975955 *t) + 
-0.0355835823163 *cos( 4.56958931431 *t)+ -0.119525232263 *sin( 4.56958931431 *t) + 0.0319541260203 *cos( 4.75998886908 *t)+ -0.193150001399 *sin( 4.75998886908 *t) + -0.000416186591487 *cos( 4.95038842384 *t)+ 0.25777892582 *sin( 4.95038842384 *t) + 0.218778298585 *cos( 5.1407879786 *t)+ 0.171027143418 *sin( 5.1407879786 *t) + -0.127545487283 *cos( 5.33118753336 *t)+ -0.16745165654 *sin( 5.33118753336 *t) + -0.263812652924 *cos( 5.52158708813 *t)+ 0.242332056731 *sin( 5.52158708813 *t) + -0.534397947337 *cos( 5.71198664289 *t)+ 0.0367447452299 *sin( 5.71198664289 *t) + 0.163558793585 *cos( 5.90238619765 *t)+ -0.325288006456 *sin( 5.90238619765 *t) + -0.84340197599 *cos( 6.09278575242 *t)+ -1.13634134796 *sin( 6.09278575242 *t)

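Using only cos and sin, as here, we can draw any closed curve we like. The other pictures, which consist of many separate closed curves, rely on a step function trick that I won't cover for now; this post only explains how to draw a single arbitrary closed curve. In other words, anything you can draw in one stroke works, and the line is free to cross or overlap itself.

Because we want a closed curve, and the result has the form x(t) and y(t), we can treat x and y as periodic functions. Periodic functions naturally bring Fourier series to mind, since a Fourier series can approximate any reasonably well-behaved periodic function. As we only have finitely many sample points, the tool we need is the DFT, the discrete Fourier transform.

We need the formulas for the discrete Fourier transform and its inverse:

\hat{x}[k]=\sum_{n=0}^{N-1}e^{-i\frac{2\pi}{N}nk}x[n]
x[n]=\frac{1}{N}\sum_{k=0}^{N-1}e^{i\frac{2\pi}{N}nk}\hat{x}[k]

That is, we sketch the figure we want and list, in drawing order, the points to be connected. These points are x[n], with each point (x, y) treated as the complex number x + iy. Substituting them into the first formula gives \hat{x}[k]. We then apply the second formula, the inverse DFT, with the integer n replaced by a continuous parameter t, and expand it with Euler's formula e^{ix} = \cos x + i\sin x. The real and imaginary parts of the result are functions of the form shown above.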
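The Euler expansion can be written out term by term. In this sketch, a_k, b_k, \omega_k and z(t) are names introduced here for illustration and do not appear in the original post: a_k and b_k stand for the real and imaginary parts of \hat{x}[k], and z(t) is the inverse transform with n replaced by the continuous parameter t.

```latex
\hat{x}[k] = a_k + i\,b_k, \qquad \omega_k = \frac{2\pi k}{N}

z(t) = \frac{1}{N}\sum_{k=0}^{N-1} (a_k + i\,b_k)\,(\cos\omega_k t + i\sin\omega_k t)

x(t) = \operatorname{Re} z(t) = \sum_{k=0}^{N-1} \left( \frac{a_k}{N}\cos\omega_k t - \frac{b_k}{N}\sin\omega_k t \right)

y(t) = \operatorname{Im} z(t) = \sum_{k=0}^{N-1} \left( \frac{b_k}{N}\cos\omega_k t + \frac{a_k}{N}\sin\omega_k t \right)
```

This matches the pattern of the listed functions: every coefficient that multiplies cos in x(t) multiplies sin in y(t), and vice versa.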

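Python code implementing this idea (written here for Python 3):

import cmath
import math

# First input line: the number of points N.
N = int(input())

# Next N lines: "x y". Each point becomes the complex sample x + iy.
f = []
for i in range(N):
    x, y = map(float, input().split())
    f.append(complex(x, y))

# Forward DFT: F[i] = sum over j of f[j] * exp(-2*pi*1j*i*j / N)
F = []
for i in range(N):
    ang = -2j * math.pi * i / N
    r = 0
    for j in range(N):
        r += cmath.exp(ang * j) * f[j]
    F.append(r)

# Emit a gnuplot script that plots the inverse transform as a parametric curve.
print("set parametric")
print("set samples", N + 1)

# x(t) is the real part of the inverse DFT.
print("x(t)=", end=" ")
for i in range(N):
    ang = 2 * math.pi * i / N
    if i > 0:
        print("+", end=" ")
    print(F[i].real / N, "*cos(", ang, "*t)-", end=" ")
    print(F[i].imag / N, "*sin(", ang, "*t)", end=" ")

print()
# y(t) is the imaginary part of the inverse DFT.
print("y(t)=", end=" ")
for i in range(N):
    ang = 2 * math.pi * i / N
    if i > 0:
        print("+", end=" ")
    print(F[i].imag / N, "*cos(", ang, "*t)+", end=" ")
    print(F[i].real / N, "*sin(", ang, "*t)", end=" ")

print()
print("plot [t=0:", N, "] x(t), y(t)")
print("pause 60")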

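The code is also available online. Save it as dft.py and run python dft.py, then enter the number of points followed by the coordinates of each point; the program prints the generated functions. Feed that output to gnuplot to see the plotted curve.

For example, the input for the curve I generated looks like this: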

33
0 0
2 0
1 0
1 3
0 3
2 3
2 2
3.5 0
5 2
5 3
4 3
3.5 2
3 3
2 3
3 3
3.5 2
4 3
5 3
5 1
5.5 0
6.5 0
7 1
7 3
7 1
6.5 0
5.5 0
5 1
5 3
4 3
3.5 2
3 3
1 3
1 0

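Save the input to input.txt and run cat input.txt | python dft.py | gnuplot to see the plotted curve.

In other words, just trace the points of whatever you want to draw, feed them to the program, and out comes an incredible-looking function!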
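As a sanity check on the idea, the following minimal sketch evaluates the cos/sin series at integer values of t and confirms that it passes through the input points again. The function names dft and curve and the four sample points are assumptions made for this illustration; they are not part of the original program.

```python
import cmath
import math


def dft(points):
    """Forward DFT of a list of complex samples."""
    n = len(points)
    return [
        sum(p * cmath.exp(-2j * math.pi * k * j / n) for j, p in enumerate(points))
        for k in range(n)
    ]


def curve(coeffs, t):
    """Evaluate (x(t), y(t)) from the DFT coefficients using only cos and sin."""
    n = len(coeffs)
    x = y = 0.0
    for k, c in enumerate(coeffs):
        w = 2 * math.pi * k / n
        a, b = c.real / n, c.imag / n
        x += a * math.cos(w * t) - b * math.sin(w * t)
        y += b * math.cos(w * t) + a * math.sin(w * t)
    return x, y


# A small hypothetical outline, listed in drawing order.
points = [complex(0, 0), complex(2, 0), complex(1, 3), complex(0, 3)]
coeffs = dft(points)
recovered = [curve(coeffs, t) for t in range(len(points))]
```

At t = 0, 1, ..., N-1 the curve reproduces the sampled points, and at t = N it returns to the starting point, which is why the plot range in the generated gnuplot script runs from 0 to N.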

References:

http://mathematica.stackexchange.com/questions/17704/how-to-create-new-person-curve

http://tieba.baidu.com/p/2156093774

http://www.quora.com/Mathematics/How-is-the-Gangnam-Style-mathematical-plot-made


from IT牛人博客聚合网站: http://www.udpwork.com/item/9767.html

Written by cwyalpha

May 4, 2013 at 5:08 am

Posted in Uncategorized

Thought this was cool: The Boyer-Moore algorithm for string matching

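In the previous article I introduced the KMP algorithm.

However, it is not the most efficient algorithm and is not used much in practice. The "Find" function (Ctrl+F) of text editors is more often implemented with the Boyer-Moore algorithm or a variant of it.

Boyer-Moore is not only efficient but also cleverly conceived and easy to understand. It was invented in 1977 by Professors Robert S. Boyer and J Strother Moore of the University of Texas.

Below, I explain the algorithm using Professor Moore's own example.

1.

Suppose the string is "HERE IS A SIMPLE EXAMPLE" and the search word is "EXAMPLE".

2.

First, align the head of the search word with the head of the string, and start comparing from the tail.

This is a clever idea: if the tail characters don't match, a single comparison tells us that the first 7 characters (as a whole) cannot be the result we are looking for.

We see that "S" and "E" do not match. "S" is then called the "bad character", i.e. the mismatched character.
We also notice that "S" does not occur in the search word "EXAMPLE", which means the search word can be moved directly to the position just after "S".

3.

Comparing from the tail again, we find that "P" and "E" do not match, so "P" is the bad character. But "P" does occur in the search word "EXAMPLE", so we shift the search word by two positions to line up the two "P"s.

4.

From this we derive the "bad character rule":

  shift = position of the bad character - its previous occurrence position in the search word

Here the position of the bad character is the index in the search word at which the mismatch happened. If the bad character does not occur in the search word, its previous occurrence position is -1.

Take "P": as the bad character it sits at position 6 of the search word (counting from 0), and its previous occurrence in the search word is at position 4, so the shift is 6 - 4 = 2. Take "S" from step 2: it sits at position 6 and its previous occurrence position is -1 (no occurrence), so the whole search word shifts by 6 - (-1) = 7.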
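The bad character rule can be sketched in a few lines of Python. The function name bad_character_shift is chosen here for illustration, and the sketch assumes that "previous occurrence" means the rightmost occurrence to the left of the mismatch position, which is how the worked numbers in the text come out.

```python
def bad_character_shift(pattern, mismatch_pos, bad_char):
    """Shift given by the bad character rule as described above.

    mismatch_pos is the index in the pattern where the comparison failed;
    bad_char is the text character found there.
    """
    # str.rfind returns -1 when bad_char does not occur in pattern[:mismatch_pos].
    previous = pattern.rfind(bad_char, 0, mismatch_pos)
    return mismatch_pos - previous


shift_for_s = bad_character_shift("EXAMPLE", 6, "S")
shift_for_p = bad_character_shift("EXAMPLE", 6, "P")
```

With the values from the example, "S" gives a shift of 7 and "P" gives a shift of 2.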

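5.

Comparing from the tail again, "E" matches "E".

6.

Comparing one position further left, "LE" matches "LE".

7.

Comparing one position further left, "PLE" matches "PLE".

8.

Comparing one position further left, "MPLE" matches "MPLE". We call this situation a "good suffix", i.e. every string matched at the tail.
Note that "MPLE", "PLE", "LE", and "E" are all good suffixes.

9.

Comparing one position further left, we find that "I" and "A" do not match. So "I" is the bad character.

10.

By the bad character rule, the search word should now shift 2 - (-1) = 3 positions. The question is whether a better shift exists.

11.

We know that a good suffix exists at this point, so we can apply the "good suffix rule":

  shift = position of the good suffix - its previous occurrence position in the search word

There are three points to note about this rule:

  (1) The position of a good suffix is that of its last character. If "EF" in "ABCDEF" is the good suffix, its position is that of "F", i.e. 5 (counting from 0).

  (2) If the good suffix occurs only once in the search word, its previous occurrence position is -1. For example, "EF" occurs only once in "ABCDEF", so its previous occurrence position is -1 (no occurrence).

  (3) If there are several good suffixes, then every one except the longest must have its previous occurrence at the head of the search word. For example, suppose the good suffixes of "BABCDAB" are "DAB", "AB", and "B". What is the previous occurrence position? The answer is that the good suffix used is "B", whose previous occurrence is at the head, position 0. The rule can also be put this way: if the longest good suffix occurs only once, the search word can be rewritten as "(DA)BABCDAB" for the position calculation, i.e. with a virtual "DA" prepended.

For example, if the second "AB" of the string "ABCDAB" is the good suffix, its position is 5 (counting from 0, taking the final "B"), and its previous occurrence position in the search word is 1 (the position of the first "B"), so the shift is 5 - 1 = 4, which moves the first "AB" to where the second "AB" was.

As another example, if "EF" in the string "ABCDEF" is the good suffix, its position is 5 and its previous occurrence position is -1 (no occurrence), so the shift is 5 - (-1) = 6, i.e. the whole string moves to just after "F".

Back to the running example. Of all the good suffixes (MPLE, PLE, LE, E), only "E" also occurs at the head of "EXAMPLE", so the shift is 6 - 0 = 6.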
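The good suffix rule with its three notes can be sketched as follows. This is one reading of the rule as stated in the text (it does not add the extra preceding-character check used by stricter textbook versions), and the function name good_suffix_shift is an assumption made for this illustration.

```python
def good_suffix_shift(pattern, suffix_len):
    """Shift given by the good suffix rule as described above.

    suffix_len is the length of the longest good suffix, i.e. the number
    of characters that matched at the tail of the pattern.
    """
    m = len(pattern)
    position = m - 1  # note (1): a good suffix is located by its last character
    suffix = pattern[m - suffix_len:]

    # The longest good suffix may reoccur anywhere further left;
    # take the rightmost such occurrence.
    for start in range(m - suffix_len - 1, -1, -1):
        if pattern[start:start + suffix_len] == suffix:
            return position - (start + suffix_len - 1)

    # Note (3): shorter good suffixes only count if they sit at the head.
    for length in range(suffix_len - 1, 0, -1):
        if pattern.startswith(pattern[m - length:]):
            return position - (length - 1)

    # Note (2): no previous occurrence, so its position is -1.
    return position - (-1)


example_shift = good_suffix_shift("EXAMPLE", 4)
```

For the examples in the text this gives 6 for "MPLE" in "EXAMPLE", 4 for "AB" in "ABCDAB", 6 for "EF" in "ABCDEF", and 6 for "DAB" in "BABCDAB".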

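12.

As we can see, the bad character rule allows a shift of only 3, while the good suffix rule allows a shift of 6. The basic idea of the Boyer-Moore algorithm is therefore to shift, each time, by the larger of the two rules' values.

Even better, the shift amounts of both rules depend only on the search word, not on the original string. A bad character table and a good suffix table can therefore be precomputed; during the search it is just a matter of looking up the tables and comparing.

13.

Comparing from the tail again, "P" and "E" do not match, so "P" is the bad character. By the bad character rule, the shift is 6 - 4 = 2.

14.

Comparing position by position from the tail, everything matches, so the search ends. To keep searching (i.e. to find all matches), apply the good suffix rule and shift 6 - 0 = 6, which moves the head "E" to where the tail "E" was.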
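Putting the two rules together, the whole walk-through can be sketched as a search loop. This is a minimal illustration under the rules as stated in the text, not a tuned implementation: it recomputes the shifts on the fly instead of using precomputed tables, and all three function names are assumptions made for this sketch.

```python
def bad_character_shift(pattern, mismatch_pos, bad_char):
    # Rightmost occurrence of bad_char left of the mismatch, or -1 if absent.
    return mismatch_pos - pattern.rfind(bad_char, 0, mismatch_pos)


def good_suffix_shift(pattern, suffix_len):
    m = len(pattern)
    suffix = pattern[m - suffix_len:]
    # The longest good suffix may reoccur anywhere further left.
    for start in range(m - suffix_len - 1, -1, -1):
        if pattern[start:start + suffix_len] == suffix:
            return (m - 1) - (start + suffix_len - 1)
    # Shorter good suffixes only count when they sit at the head.
    for length in range(suffix_len - 1, 0, -1):
        if pattern.startswith(pattern[m - length:]):
            return (m - 1) - (length - 1)
    return m  # no previous occurrence: (m - 1) - (-1)


def boyer_moore_search(text, pattern):
    """Return the start index of every match of pattern in text."""
    n, m = len(text), len(pattern)
    matches = []
    if m == 0:
        return matches
    offset = 0
    while offset <= n - m:
        j = m - 1
        # Compare from the tail of the pattern towards its head.
        while j >= 0 and pattern[j] == text[offset + j]:
            j -= 1
        if j < 0:
            # Full match: record it, then shift by the good suffix rule.
            matches.append(offset)
            offset += good_suffix_shift(pattern, m)
            continue
        shift = bad_character_shift(pattern, j, text[offset + j])
        matched = m - 1 - j
        if matched > 0:
            shift = max(shift, good_suffix_shift(pattern, matched))
        offset += shift
    return matches


found = boyer_moore_search("HERE IS A SIMPLE EXAMPLE", "EXAMPLE")
```

On the example string the loop makes the same moves as steps 2 to 14: shifts of 7, 2, 6, and 2, followed by a match at index 17.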

(End)


from IT牛人博客聚合网站: http://www.udpwork.com/item/9762.html

Written by cwyalpha

May 4, 2013 at 5:08 am

Posted in Uncategorized

Thought this was cool: PonyORM, a new-generation, black-magic-level ORM for Python

A simple example, from the official site.

The query code in Python:

select(c for c in Customer
     if sum(c.orders.total_price) > 1000)

PonyORM translates it into SQL:

SELECT "c"."id"
FROM "Customer" "c"
  LEFT JOIN "Order" "order-1"
    ON "c"."id" = "order-1"."customer"
GROUP BY "c"."id"
HAVING coalesce(SUM("order-1"."total_price"), 0) > 1000

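I used to think peewee's query syntax was clever, a street ahead of Django's hideous price__gt=1000. By that measure, PonyORM is a galaxy ahead of every other Python ORM.

So where is PonyORM's black magic?

  1. First, select(x for x in ...) receives a generator expression. Unlike a list comprehension, it returns a lazily evaluated generator, so
  2. the bytecode of the expression can be decompiled.
  3. The Python AST is translated into a SQL AST.
  4. The query is separated and optimized.
  5. The SQL AST is rendered as a query for the specific database.
  6. The SQL query is executed.
  7. The results are built into Python objects and cached.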
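Steps 1 and 2 rest on the fact that a generator expression is not evaluated when it is created, so its code object can still be inspected. The sketch below only illustrates that starting point using the standard dis module; it is not PonyORM's implementation, and describe_query, customers, and the total attribute are hypothetical names used for the illustration.

```python
import dis


def describe_query(gen):
    """Inspect a generator expression without running it."""
    code = gen.gi_code  # the compiled body of the generator expression
    return {
        "attributes": code.co_names,   # attribute names used, e.g. 'total'
        "constants": code.co_consts,   # literal values, e.g. 1000
        "opcodes": [ins.opname for ins in dis.get_instructions(code)],
    }


customers = []  # stand-in for an entity class; never iterated here
info = describe_query(c for c in customers if c.total > 1000)
```

Everything needed to rebuild the condition (the attribute name, the comparison, and the constant) is visible in the bytecode, which is what makes decompiling it into an AST possible.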

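This comes from PonyORM developer u/amalashkevich. A more detailed explanation of the black magic can be found on Stack Overflow.

You could say that Peewee and PonyORM are the ones with truly elegant designs. But PonyORM is more advanced: it comes close to LINQ and goes beyond Hibernate HQL.

By the way, there is another similar library, PQL, for MongoDB:

For example: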

pql.find("a > 1 and b == 'foo' or not c.d == False")

which translates into the MongoDB query:

{'$or': [{'$and': [{'a': {'$gt': 1}}, {'b': 'foo'}]}, {'$not': {'c.d': False}}]}

Unfortunately it translates from a string rather than from a native Python expression.
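String-based translation of this kind can be sketched with Python's standard ast module. The following is not PQL's implementation: translate, to_mongo, field_name, and OPS are hypothetical names, and only and/or plus simple comparisons against literals are handled.

```python
import ast

# Comparison operators that map to a MongoDB operator document.
OPS = {ast.Gt: "$gt", ast.GtE: "$gte", ast.Lt: "$lt", ast.LtE: "$lte", ast.NotEq: "$ne"}


def field_name(node):
    """Turn a name or dotted attribute (a, c.d) into a field path."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        return field_name(node.value) + "." + node.attr
    raise ValueError("unsupported field expression")


def to_mongo(node):
    if isinstance(node, ast.BoolOp):
        key = "$and" if isinstance(node.op, ast.And) else "$or"
        return {key: [to_mongo(value) for value in node.values]}
    if isinstance(node, ast.Compare) and len(node.ops) == 1:
        field = field_name(node.left)
        value = ast.literal_eval(node.comparators[0])
        op = node.ops[0]
        if isinstance(op, ast.Eq):
            return {field: value}
        return {field: {OPS[type(op)]: value}}
    raise ValueError("unsupported expression")


def translate(query):
    """Parse the query string as a Python expression and translate it."""
    return to_mongo(ast.parse(query, mode="eval").body)


result = translate("a > 1 and b == 'foo' or c.d < 2")
```

Because the input is a string, mistakes in it only surface when the string is parsed at run time, which is the drawback the text points out.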


from est's blog: http://blog.est.im/post/49564925054

Written by cwyalpha

May 4, 2013 at 4:53 am

Posted in Uncategorized

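Thought this was cool: Computing PageRank in Go

PageRank is an important algorithm for ranking search engine results. It relies on link-structure analysis: roughly, when page A contains a link to page B, A casts a vote for B, and the more votes a page receives, the higher its PageRank. Votes are not all weighted equally: a page with a higher PageRank of its own casts a heavier vote. The PageRank formula is therefore recursive and has to be computed iteratively until the result converges.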
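The iteration can be sketched compactly in Python. This is a minimal illustration of the usual iterative scheme with a random-jump probability alpha and with the rank of pages that have no outgoing links spread evenly over all pages; the function name pagerank and the tiny example graph are assumptions made for the sketch, not data from the post.

```python
def pagerank(num_vertices, edges, alpha=0.15, num_iterations=30):
    """Iteratively compute PageRank for vertices 0..num_vertices-1.

    edges is a list of (start, end) pairs meaning "start links to end".
    """
    out_degree = [0] * num_vertices
    for start, _ in edges:
        out_degree[start] += 1

    rank = [1.0 / num_vertices] * num_vertices
    for _ in range(num_iterations):
        # Every page receives the random-jump share first.
        incoming = [alpha / num_vertices] * num_vertices
        # Each page splits its vote evenly among the pages it links to.
        for start, end in edges:
            incoming[end] += (1 - alpha) * rank[start] / out_degree[start]
        # Pages without outgoing links spread their rank over all pages.
        dangling = sum(
            (1 - alpha) * rank[v] / num_vertices
            for v in range(num_vertices)
            if out_degree[v] == 0
        )
        rank = [value + dangling for value in incoming]
    return rank


# Hypothetical graph: 0 -> 1 -> 2 -> 0, 3 -> 0, and page 4 with no links.
ranks = pagerank(5, [(0, 1), (1, 2), (2, 0), (3, 0)])
```

The ranks always sum to 1, and page 0, which receives two votes, ends up ranked above page 3, which receives none.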

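I used Go to compute PageRank on WT2g, a dataset of real web pages. The distribution of the results is shown in the figure below:

Figure: PageRank Distribution

The PageRank values appear to follow a Zipf distribution: 32% of the pages have the minimum PageRank of 9.459×10^-7, more than half of the pages have a PageRank below 6.600×10^-6, and the maximum PageRank is 1.885×10^-3.

Go itself deserves a mention. For an introduction to its features I recommend "Go at Google: Language Design in the Service of Software Engineering". What struck me most when using Go is that functions can return multiple values. This feature is used heavily throughout the APIs, and by convention the last return value is of type error and indicates whether an error occurred. This style of error handling differs from the exceptions used in C++, Java, Python, and JavaScript, and is closer to error handling in C. C code customarily uses a function's return value as the flag for whether an error occurred, and the details are then obtained by other means (such as the global variable errno); Go simply makes the error one of the return values. Go also supports first-class functions and closures, which makes it easy to implement something like yield: the lineReader function in the code below returns a generator that reads a file line by line, one line per call, and releases its resources once the file has been read. Go is also a language with explicit pointers that nevertheless provides garbage collection, which saves the trouble of managing memory by hand.

Here is the Go code for computing PageRank: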

package main

import (
    "bufio"
    "errors"
    "fmt"
    "io"
    "math"
    "os"
    "strings"
)

type vertex struct {
    inDegree  int
    outDegree int
    pagerank  float64
}

type edge struct {
    start int
    end   int
}

var vertexs []vertex
var edges []edge
var vertexID map[string]int = make(map[string]int)
var numVertex int = 0

func lineReader(filename string) (func() (string, error), error) {
    f, err := os.Open(filename)
    if err != nil {
        return nil, err
    }
    buf := bufio.NewReaderSize(f, 64)
    return func() (string, error) {
        line, isPrefix, err := buf.ReadLine()
        if err != nil {
            if err == io.EOF {
                if err := f.Close(); err != nil {
                    return "", err
                }
            }
            return "", err
        }
        if isPrefix {
            return "", errors.New("buffer size too small")
        }
        return string(line), nil
    }, nil
}

func addVertex(vertexName string) int {
    var ID int
    var ok bool
    if ID, ok = vertexID[vertexName]; !ok {
        ID = numVertex
        vertexID[vertexName] = ID
        vertexs = append(vertexs, vertex{})
        numVertex++
    }
    return ID
}

func read() {
    readline, err := lineReader("wt2g_inlinks.source")
    if err != nil {
        panic(err)
    }
    for {
        line, err := readline()
        if err != nil {
            if err == io.EOF {
                break
            }
            panic(err)
        }
        // Line format is like "ID1\tID2"
        sections := strings.Split(line, "\t")
        if len(sections) != 2 {
            panic(errors.New("Illegal line format"))
        }
        start := addVertex(sections[0])
        end := addVertex(sections[1])
        edges = append(edges, edge{start, end})
    }
}

func calcPagerank(alpha float64, numIterations int) {
    // Initialize out degree of every vertex
    for i := range edges {
        edge := &edges[i]
        vertexs[edge.start].outDegree++
        vertexs[edge.end].inDegree++
    }
    var I = make([]float64, numVertex)
    var S float64
    for i := 0; i < numVertex; i++ {
        vertexs[i].pagerank = 1 / float64(numVertex)
        I[i] = alpha / float64(numVertex)
    }
    // Calculate pagerank repeatedly until converge (numIterations times)
    for k := 0; k < numIterations; k++ {
        for i := range edges {
            edge := &edges[i]
            I[edge.end] += (1 - alpha) * vertexs[edge.start].pagerank / float64(vertexs[edge.start].outDegree)
        }
        S = 0
        for i := 0; i < numVertex; i++ {
            if vertexs[i].outDegree == 0 {
                S += (1 - alpha) * vertexs[i].pagerank / float64(numVertex)
            }
        }
        for i := 0; i < numVertex; i++ {
            vertexs[i].pagerank = I[i] + S
            I[i] = alpha / float64(numVertex)
        }
    }
}

func main() {
    read()
    calcPagerank(0.15, 30)
    fmt.Println("Done")
}

BYVNotes is a simple online notepad site that I implemented in Go, using the Revel framework.

from Beyond the Void: http://www.byvoid.com/blog/pagerank-go

Written by cwyalpha

April 26, 2013 at 2:08 am

Posted in Uncategorized

Thought this was cool: A non-magical introduction to Pip and Virtualenv for Python beginners – Blog – DabApps – Brighton, UK


URL: http://dabapps.com/blog/introduction-to-pip-and-virtualenv-python/

One of the hurdles that new Python developers have to get over is understanding the Python packaging ecosystem. This blog post is based on material covered in our Python for Programmers training course, which attempts to explain pip and virtualenv for new Python users.

Prerequisites

Python for Programmers is aimed at developers who are already familiar with one or more programming languages, and so we assume a certain amount of technical knowledge. It will help if you’re reasonably comfortable with a command line. The examples below use bash, which is the default shell on Macs and most Linux systems. But the commands are simple enough that the concepts should be transferable to any terminal, such as PowerShell for Windows.

pip

Let’s dive in. pip is a tool for installing Python packages from the Python Package Index.

PyPI (which you’ll occasionally see referred to as The Cheeseshop) is a repository for open-source third-party Python packages. It’s similar to RubyGems in the Ruby world, PHP’s Packagist, CPAN for Perl, and NPM for Node.js.

Python actually has another, more primitive, package manager called easy_install, which is installed automatically when you install Python itself. pip is vastly superior to easy_install for lots of reasons, and so should generally be used instead. You can use easy_install to install pip as follows:

$ sudo easy_install pip

You can then install packages with pip as follows (in this example, we’re installing Django):

# DON'T DO THIS
$ sudo pip install django

Here, we’re installing Django globally on the system. But in most cases, you shouldn’t install packages globally. Read on to find out why.

virtualenv

virtualenv solves a very specific problem: it allows multiple Python projects that have different (and often conflicting) requirements, to coexist on the same computer.

What problem does it solve?

To illustrate this, let’s start by pretending virtualenv doesn’t exist. Imagine we’re going to write a Python program that needs to make HTTP requests to a remote web server. We’re going to use the Requests library, which is brilliant for that sort of thing. As we saw above, we can use pip to install Requests.

But where on your computer does pip install the packages to? Here’s what happens if I try to run pip install requests:

$ pip install requests
Downloading/unpacking requests
 Downloading requests-1.1.0.tar.gz (337Kb): 337Kb downloaded
 Running setup.py egg_info for package requests
Installing collected packages: requests
 Running setup.py install for requests
 error: could not create '/Library/Python/2.7/site-packages/requests': Permission denied

Oops! It looks like pip is trying to install the package into /Library/Python/2.7/site-packages/requests. This is a special directory that Python knows about. Anything that’s installed in site-packages can be imported by your programs.

We’re seeing the error because /Library/ (on a Mac) is not usually writeable by “ordinary” users. To fix the error, we can run sudo pip install requests (sudo means “run this command as a superuser”). Then everything will work fine:

$ sudo pip install requests
Password:
Downloading/unpacking requests
 Running setup.py egg_info for package requests
Installing collected packages: requests
 Running setup.py install for requests
Successfully installed requests
Cleaning up...

This time it worked. We can now type python and try importing our new library:

>>> import requests
>>> requests.get('http://dabapps.com')
<Response [200]>

So, we now know that we can import requests and use it in our program. We go ahead and work feverishly on our new program, using requests (and probably lots of other libraries from PyPI too). The software works brilliantly, we make loads of money, and our clients are so impressed that they ask us to write another program to do something slightly different.

But this time, we find a brand new feature that’s been added to requests since we wrote our first program that we really need to use in our second program. So we decide to upgrade the requests library to get the new feature:

sudo pip install --upgrade requests

Everything seems fine, but we’ve unknowingly created a disaster!

Next time we try to run it, we discover that our original program (the one that made us loads of money) has completely stopped working and is raising errors when we try to run it. Why? Because something in the API of the requests library has changed between the previous version and the one we just upgraded to. It might only be a small change, but it means our code no longer uses the library correctly. Everything is broken!

Sure, we could fix the code in our first program to use the new version of the requests API, but that takes time and distracts us from our new project. And, of course, a seasoned Python programmer won’t just have two projects but dozens – and each project might have dozens of dependencies! Keeping them all up-to-date and working with the same versions of every library would be a complete nightmare.

How does virtualenv help?

virtualenv solves this problem by creating a completely isolated virtual environment for each of your programs. An environment is simply a directory that contains a complete copy of everything needed to run a Python program, including a copy of the python binary itself, a copy of the entire Python standard library, a copy of the pip installer, and (crucially) a copy of the site-packages directory mentioned above. When you install a package from PyPI using the copy of pip that’s created by the virtualenv tool, it will install the package into the site-packages directory inside the virtualenv directory. You can then use it in your program just as before.

How can I install virtualenv?

If you already have pip, the easiest way is to install it globally with sudo pip install virtualenv. Usually pip and virtualenv are the only two packages you ever need to install globally, because once you’ve got both of these you can do all your work inside virtual environments.

In fact, virtualenv comes with a copy of pip which gets copied into every new environment you create, so virtualenv is really all you need. You can even install it as a separate standalone package (rather than from PyPI). This might be easier for Windows users. See virtualenv.org for instructions.

How do I create a new virtual environment?

You only need the virtualenv tool itself when you want to create a new environment. This is really simple. Start by changing directory into the root of your project directory, and then use the virtualenv command-line tool to create a new environment:

$ cd ~/code/myproject/
$ virtualenv env
New python executable in env/bin/python
Installing setuptools............done.
Installing pip...............done.

Here, env is just the name of the directory you want to create your virtual environment inside. It’s a common convention to call this directory env, and to put it inside your project directory (so, say you keep your code at ~/code/projectname/, the environment will be at ~/code/projectname/env/ – each project gets its own env). But you can call it whatever you like and put it wherever you like!
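Recent versions of Python 3 also ship a standard-library module, venv, that creates environments in the same spirit. It is a different tool from the virtualenv command used in this post, so treat the following as a sketch of the same idea rather than an equivalent: the temporary project directory is made up for the example, and with_pip=False is passed so that nothing is downloaded or bootstrapped.

```python
import os
import tempfile
import venv

# A throwaway project directory, standing in for ~/code/myproject/.
project_dir = tempfile.mkdtemp()
env_dir = os.path.join(project_dir, "env")

# Create the environment; with_pip=False skips installing pip into it.
venv.create(env_dir, with_pip=False)

# The interpreter lives in bin/ on Unix-like systems and Scripts/ on Windows.
bin_dir = os.path.join(env_dir, "Scripts" if os.name == "nt" else "bin")
contents = sorted(os.listdir(env_dir))
```

Listing env_dir afterwards shows the layout of the new environment, including the directory that holds its own python executable.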

Note: if you’re using a version control system like git, you shouldn’t commit the env directory. Add it to your .gitignore file (or similar).

How do I use my shiny new virtual environment?

If you look inside the env directory you just created, you’ll see a few subdirectories:

The one you care about the most is bin. This is where the local copy of the python binary and the pip installer exists. Let’s start by using the copy of pip to install requests into the virtualenv (rather than globally):

$ env/bin/pip install requests
Downloading/unpacking requests
 Downloading requests-1.1.0.tar.gz (337kB): 337kB downloaded
 Running setup.py egg_info for package requests
Installing collected packages: requests
 Running setup.py install for requests
Successfully installed requests
Cleaning up...

It worked! Notice that we didn’t need to use sudo this time, because we’re not installing requests globally, we’re just installing it inside our home directory.

Now, instead of typing python to get a Python shell, we type env/bin/python, and then…

>>> import requests
>>> requests.get('http://dabapps.com')
<Response [200]>

But that’s a lot of typing!

virtualenv has one more trick up its sleeve. Instead of typing env/bin/python and env/bin/pip every time, we can run a script to activate the environment. This script, which can be executed with source env/bin/activate, simply adjusts a few variables in your shell (temporarily) so that when you type python, you actually get the Python binary inside the virtualenv instead of the global one:

$ which python
/usr/bin/python
$ source env/bin/activate
$ which python
/Users/jamie/code/myproject/env/bin/python

So now we can just run pip install requests (instead of env/bin/pip install requests) and pip will install the library into the environment, instead of globally. The adjustments to your shell only last for as long as the terminal is open, so you’ll need to remember to rerun source env/bin/activate each time you close and open your terminal window. If you switch to work on a different project (with its own environment) you can run deactivate to stop using one environment, and then source env/bin/activate to activate the other.

Activating and deactivating environments does save a little typing, but it’s a bit “magical” and can be confusing. Make your own decision about whether you want to use it.
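If activation feels too magical, you can always ask the interpreter itself where it is running from. The sketch below is one common way to check from inside Python; the function name in_virtual_environment is made up for this example, and the check relies on sys.real_prefix (set by older virtualenv releases) and sys.base_prefix (set by Python 3's venv and newer virtualenv releases).

```python
import sys


def in_virtual_environment():
    """Best-effort check for whether this interpreter runs inside an environment."""
    if hasattr(sys, "real_prefix"):
        # Older virtualenv releases record the global prefix here.
        return True
    # In an environment, sys.prefix points at the env directory while
    # sys.base_prefix still points at the global installation.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


active = in_virtual_environment()
interpreter = sys.executable  # path of the python binary actually running
```

Printing interpreter is the in-Python counterpart of running which python in the shell.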

Requirements files

virtualenv and pip make great companions, especially when you use the requirements feature of pip. Each project you work on has its own requirements.txt file, and you can use this to install the dependencies for that project into its virtual environment:

env/bin/pip install -r requirements.txt

See the pip documentation for more details.

Recap

  • pip is a tool for installing packages from the Python Package Index.
  • virtualenv is a tool for creating isolated Python environments containing their own copy of python, pip, and their own place to keep libraries installed from PyPI.
  • It’s designed to allow you to work on multiple projects with different dependencies at the same time on the same machine.
  • You can see instructions for installing it at virtualenv.org.
  • After installing it, run virtualenv env to create a new environment inside a directory called env.
  • You’ll need one of these environments for each of your projects. Make sure you exclude these directories from your version control system.
  • To use the versions of python and pip inside the environment, type env/bin/python and env/bin/pip respectively.
  • You can “activate” an environment with source env/bin/activate and deactivate one with deactivate. This is entirely optional but might make life a little easier.

pip and virtualenv are indispensable tools if you’re a regular Python user. Both are fairly simple to understand, and we highly recommend getting to grips with them.

If this blog post has sparked your interest in learning Python, check out our Python for Programmers workshop at DabApps HQ in Brighton.


from Hacker News 200: http://dabapps.com/blog/introduction-to-pip-and-virtualenv-python/

Written by cwyalpha

April 25, 2013 at 12:38 pm

Posted in Uncategorized

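Thought this was cool: 水木社区 regulars on how big data has developed in China

From: Nineteen (..), Board: Database
Subject: Re: The decentralized topology of a Cassandra cluster is really cool
Posted at: 水木社区 (Sat Mar 9 10:03:09 2013), on-site

As @immars mentioned, what the open-source projects deliver a year or two later is a whole tier below the paper's prototype in performance. In fact it is not just performance; other aspects lag even further behind.

Other companies then take a look: not bad, here is something that can more or less cope with our requirements. So they start using it everywhere and keep at it for a year or two. The thing gets modified beyond recognition, but only as patch upon patch, small cuts around the edges. Want to go deep and make big changes? No chance. The bosses will say: meet the business requirements first. The line you hear most often is: damn it, we're about to die, and you want to spend that long on a major rewrite?

The team keeps growing in this "about to die" state. Another reason for the growth is the endless stream of operations incidents and users' "requirements that can never be satisfied", and the team's voice carries more and more weight.

The clusters get bigger and bigger until it turns out they really can't cope. On one hand they resort to all sorts of crooked tricks, for example Yunti (云梯) actually optimizing the JVM; on the other they start organizing people to develop their own systems. All three big Internet companies seem to have tried the latter: yangzhengkun at Baidu, zhuhuican at Tencent, and wangjian at Alibaba.

But they meet a lot of resistance. Part of it comes from the "team" mentioned above: are you here to take our jobs? The other part is that Internet companies lack experience developing large platforms: all kinds of impatience, all kinds of detours, all kinds of tuition paid. Tencent and Baidu are the kind that paid the tuition and then dropped out.

Alibaba is still moving forward and is far from done, which is also why Alibaba's Yunti system is still around. Not only does it have to stay, it has to be strengthened, because Taobao's business is growing too fast.

Look at how long it has been since the papers came out. If you have a channel, go and find out how fast Google's technology is advancing: it is running faster and faster and the gap keeps widening. If that isn't successfully hitting the competition, what is?

It is also easy to understand from another angle: open-source your own system to strengthen your competitors' technical infrastructure? We haven't reached a communist society yet. As for the projects backed by a big player, what gets open-sourced is never the live system used in production; it is either outdated or stripped down.

As for people who say "so all open-source projects are bad?": no. Open-source screws, clutches, and even engines are not bad. But expecting an open-source space station or spaceship to be trouble-free... forget it. Make do with them, and if you are serious, build your own.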

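From: penny1983 (一只熊猫,两种表述||熊猫永不受伤), Board: Database
Subject: Re: The decentralized topology of a Cassandra cluster is really cool
Posted at: 水木社区 (Wed Apr 10 10:31:16 2013), on-site

No open-source implementation is reliable.

There is still a huge gap between the Paxos algorithm and a system that meets real-world requirements. A fault-tolerant system is hard to get right even in pseudocode. When Google developed Chubby, it wrote a dedicated state machine language and a matching compiler that turns algorithms expressed as state machines into C++, and it invested enormous effort in Chubby's consistency checking and fault tolerance.

Google's Chubby was initially built on a third-party commercial database too, but because of replication problems in the commercial product (bugs, and no way to prove the replication algorithm correct), Google had to implement its own key-value database built around Multi-Paxos. That process was a painful one as well; see Google's paper Paxos Made Live: An Engineering Perspective.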

from est's blog: http://blog.est.im/post/47677324624

Written by cwyalpha

April 11, 2013 at 11:23 am

Posted in Uncategorized

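Thought this was cool: Symptoms of code cleanliness obsession

This article is my own original work; when reposting, please keep it intact and credit 《四火的唠叨》.

Symptoms of code cleanliness obsession: if any one of the following applies, you have caught it. The condition can be mild or severe. A mild case helps you write elegant, tidy code; a severe case drives you off the deep end, beyond all redemption.

  1. Extra blank lines, stray semicolons, unused variables: you delete every one you see.
  2. Misaligned tabs or spaces must be corrected; apart from indentation, two consecutive spaces inside code are not tolerated.
  3. When you see a method of a class without a comment, you can't help adding one, whether or not it means anything.
  4. Misspellings, whether in names or in comments, must be corrected; inconsistent capitalization must be corrected; missing punctuation must be filled in.
  5. Code like if(a==0) must be changed to the form if(0==a).
  6. Every IDE warning about the code must be eliminated, whether or not the way it is done makes any practical sense.
  7. Bare numbers must be defined as constants; even when the number's meaning is obvious, only a named constant is acceptable.
  8. You can't stand non-static public variables; get/set methods must be created.
  9. You keep hitting the code-formatting shortcuts; in Eclipse that means CTRL+Shift+F and CTRL+Shift+O over and over, and even CTRL+S without pause.
  10. As soon as you see more than 3 consecutive if-else branches, they must be optimized; similar method calls that appear one after another must be optimized; a method longer than a certain number of lines must be refactored.
  11. The most essential symptom: you love reading your own code for long stretches, tutting in admiration and quite intoxicated with yourself.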


from IT牛人博客聚合网站: http://www.udpwork.com/item/9287.html

Written by cwyalpha

April 11, 2013 at 11:23 am

Posted in Uncategorized