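I wrote a script for load-testing a website. It is multithreaded, and each thread fetches a web page once per second. Since the script uses the thread, time, and urllib modules, it may be useful to other people, so I am posting it here.
Another reason is that this script showed me how efficient Python really is, in both development time and run time. I had never used Python's thread and time modules before. Writing the script while reading Python Programming took only about an hour, and the result seems to work quite well.
The code is as follows: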
#!/usr/local/bin/python
#FileName = test_Web.py
#get from db
#http://192.168.1.74/spaces/posts/postdetail.aspx?id=
#put into db
#http://192.168.1.74/admin/space/post/post_add.aspx?fid=0&um=300372&v=__VERSION__&title=0&content=hahaasdfasdf
import thread, time, urllib
id_index = 100000
id_count = 20000
id_max = id_index+id_count
i_cnt = 0
time_begin = time.time()
bStop = False
def openurl():
    # write path: request the page that adds a post
    sock = urllib.urlopen("http://192.168.1.74/admin/space/post/post_add.aspx?fid=0&um=300372&v=__VERSION__&title=0&content=hahaasdfasdf")
    htmlSource = sock.read()
    #print htmlSource
    sock.close()

def opengeturl(id):
    # read path: request the detail page of one post
    strUrl = "http://192.168.1.74/spaces/posts/postdetail.aspx?id=%d" % id
    print strUrl
    sock = urllib.urlopen(strUrl)
    htmlSource = sock.read()
    #print htmlSource
    sock.close()
def child( myID ): # this function runs in each thread
    global id_index, id_max, id_count, time_begin, bStop, i_cnt
    while( True ):
        if( i_cnt >= id_count ): # all requests issued: report once, then stop
            now = time.time()
            i_handle_time = now - time_begin
            if( bStop == False ):
                print "%d rows cost %f seconds" % ( id_count, i_handle_time )
                bStop = True
            break
        #id_index = id_index + 1
        i_cnt = i_cnt + 1 # not atomic across threads, so the count is approximate
        print "[%d] ==> %d" % (myID, id_index)
        opengeturl(id_index)
        time.sleep(1)
for i in range( 500 ): # spawn 500 threads
    thread.start_new_thread( child, (i,) )
time.sleep(1000000) # don't exit too early, or the worker threads may be killed
print 'Main thread exiting.'
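
For anyone on Python 3, where the thread module and urllib.urlopen no longer exist under those names, here is a rough sketch of the same idea. It is not the script I ran above. It swaps in threading.Thread and urllib.request, guards the shared counter with a lock, and joins the workers instead of sleeping in the main thread. The names fetch_post and run_load_test, the timeout, and the failure counter are my own assumptions for illustration. Unlike the listing above, it gives every request a different id.

```python
import threading
import time
import urllib.request

# Assumed to follow the URL pattern of the script above.
BASE_URL = "http://192.168.1.74/spaces/posts/postdetail.aspx?id=%d"


def fetch_post(post_id, timeout=10):
    """Fetch one page and return the number of bytes read (hypothetical helper)."""
    with urllib.request.urlopen(BASE_URL % post_id, timeout=timeout) as resp:
        return len(resp.read())


def run_load_test(fetch, first_id, total_requests, num_threads, delay=1.0):
    """Call fetch(id) total_requests times, spread over num_threads threads.

    Returns (completed, failed, elapsed_seconds).
    """
    lock = threading.Lock()
    state = {"next": 0, "done": 0, "failed": 0}

    def worker():
        while True:
            # Claim the next id under the lock so no request is issued twice.
            with lock:
                if state["next"] >= total_requests:
                    return
                post_id = first_id + state["next"]
                state["next"] += 1
            try:
                fetch(post_id)
                ok = True
            except Exception:
                ok = False
            with lock:
                state["done" if ok else "failed"] += 1
            time.sleep(delay)  # pause between requests, as in the original

    start = time.time()
    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # wait for the workers instead of sleeping for a fixed time
    return state["done"], state["failed"], time.time() - start
```

Calling run_load_test(fetch_post, 100000, 20000, 500) would roughly match the numbers in the script above. Because the fetch function is passed in as an argument, a stand-in that makes no network calls can be used to try the threading logic first.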