Python Basics 1

Source: Internet
Author: User
Tags apache log truncated timedelta

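1. Understanding ASCII, Unicode, and UTF-8 encoding

ASCII came first and covers only English text. Unicode added Chinese (and other) characters. In memory, text is held as Unicode; a string declared with the u prefix is of the unicode type. UTF-8 is a variable-length encoding used when saving to a server or to disk (files are generally written this way, which is also how garbled-character problems are resolved). encode converts Unicode to bytes, and decode converts bytes back to Unicode.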
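The round trip between Unicode text and UTF-8 bytes can be sketched as follows. This is a minimal illustration written for Python 3, where every str is already Unicode (the u prefix is a Python 2 convention); the sample string is an assumption chosen for the example.

```python
# Python 3 sketch: str holds Unicode text, bytes holds encoded data.
text = "中文 abc"                     # sample text (assumed for illustration)

encoded = text.encode("utf-8")        # Unicode -> UTF-8 bytes, e.g. before writing to disk
decoded = encoded.decode("utf-8")     # UTF-8 bytes -> Unicode, e.g. after reading back

print(len(text), len(encoded))        # number of characters vs. number of bytes
print(decoded == text)
```

Each Chinese character takes three bytes in UTF-8 while each ASCII character takes one, which is why the two lengths differ.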

2. Using and Importing modules

The import statement, the from ... import ... form, and an introduction to the basic modules.
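The two import forms can be compared in a short sketch. The modules used here (math and os.path) are picked only for illustration and are not named in the original notes; the code is written for Python 3.

```python
import math                    # binds the whole module; members are reached as math.<name>
from os.path import join       # binds a single name directly into this namespace

area = math.pi * 2 ** 2                   # area of a circle with radius 2
log_path = join("logs", "access.log")     # separator depends on the platform

print(area, log_path)
```

The form from module import * copies every public name of the module, which makes it harder to see where a name came from.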

3. User interaction and formatted output

raw_input() handles user interaction; '%s, %s, %s' % (name, age, job) formats the output.
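A minimal sketch of %s formatting, assuming three fields named name, age, and job. It is written for Python 3, where input() plays the role of Python 2's raw_input(); the helper name describe and the sample values are hypothetical.

```python
def describe(name, age, job):
    """Format three fields with the % operator, one %s per value."""
    return "Name: %s, Age: %s, Job: %s" % (name, age, job)


# In an interactive script the values would come from input(),
# the Python 3 counterpart of raw_input().
print(describe("Alice", 30, "engineer"))
```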

4. Flow control and loops

    age = int(raw_input('Age: '))   # raw_input returns a string, so convert before comparing
    if age > 40:
        print 'old!'
    elif age > 30:
        print 'young!'
    else:
        print 'younger!'

An if statement finishes as soon as one branch matches; the remaining branches, including else, are skipped. Variables do not need to be declared. A single equals sign (=) is assignment; a double equals sign (==) tests equality. The for loop has the form for * in *:
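The branch selection and the for * in * form can be combined in one small sketch. It is written for Python 3; the function name classify and the sample ages are assumptions made for the example.

```python
def classify(age):
    """Return the label of the first branch whose condition matches."""
    if age > 40:
        return "old!"
    elif age > 30:
        return "young!"
    else:
        return "younger!"


age = int("35")                   # user input arrives as a string, so convert first
print(age, classify(age))

for age in [45, 35, 20]:          # for <item> in <iterable>:
    print(age, classify(age))
```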

5. while loop

A while loop is an infinite loop by default, whereas a for loop has a start, an end, and a fixed number of iterations.

    print_num = input('Which loop do you want it to be printed out? ')
    count = 0
    while count < 100000:
        if count == print_num:
            print 'There got the number: ', count
            choice = raw_input('Do you want to continue the loop? (y/n) ')
            if choice == 'n':
                break
            else:
                # keep asking until the target is ahead of the current count
                while print_num <= count:
                    print_num = input('Which loop do you want it to be printed out? ')
                    print u"pass"
        else:
            print 'Loop: ', count
        count += 1
    else:
        # runs only when the loop ends without break
        print 'Loop: ', count
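How break interacts with the else clause of a while loop can be shown with a smaller sketch. It is written for Python 3, and the function find_first_multiple is a hypothetical example, not part of the original notes.

```python
def find_first_multiple(n, limit):
    """Return the first positive multiple of n below limit, or None."""
    count = 1
    while count < limit:
        if count % n == 0:
            break            # leaves the loop and skips the else clause
        count += 1
    else:
        return None          # runs only when the condition became false
    return count


print(find_first_multiple(7, 100))
print(find_first_multiple(7, 5))
```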

6. File processing

Reading a file:

    f = file('MyFile.txt', 'r')
    for line in f.readlines():
        line = line.strip('\n').split(':')
        print line

After the split, line is a list.

Written content is held in a memory buffer and goes to the hard disk once it exceeds 1024; call f.flush() to force the write to disk.

File modes:

    r                read-only mode
    w                write-only mode
    a                append mode
    r+b, w+b, a+b    binary forms; + opens the file for both reading and writing, and b
                     matters because Linux and Windows treat line endings differently
                     (dos2unix is the common tool for converting them)

Common file attributes and methods:

    f.closed              # whether the file has been closed
    f.encoding            # the file's encoding
    f.flush()             # force buffered writes to disk
    f.readline([size])    # read one line; if size is given, only part of a line may be returned
    f.readline / f.readlines / f.xreadlines
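The modes and flush() can be exercised together in a short sketch. It is written for Python 3 (open() instead of Python 2's file()), and the scratch file and its contents are made up for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "MyFile.txt")   # hypothetical scratch file

with open(path, "w") as f:        # w: write-only, discards any existing content
    f.write("alice:30\n")
    f.flush()                     # push buffered data to the operating system now

with open(path, "a") as f:        # a: append to the end of the file
    f.write("bob:25\n")

with open(path, "r") as f:        # r: read-only
    rows = [line.strip("\n").split(":") for line in f]

print(rows)
print(f.closed)                   # the with block closed the file
```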

file.readlines() reads the entire file into memory and parses it into a list. When the file is very large this takes a lot of memory, so it is an unwise approach.

On the other hand, starting with Python 2.3, file objects support iteration, so the following two pieces of code are effectively equivalent:

    with open('foo.txt', 'r') as f:
        for line in f.readlines():
            pass  # do_something(line)

    with open('foo.txt', 'r') as f:
        for line in f:
            pass  # do_something(line)

However, the latter takes up less memory and is smarter (depending on how the Python file object is implemented): the file content is read from the buffer automatically as it is needed. It is the encouraged practice.
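The difference between the two styles can be seen in a small sketch. It is written for Python 3, and the scratch file with 1000 generated lines is an assumption made for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "foo.txt")       # hypothetical scratch file
with open(path, "w") as f:
    f.writelines("line %d\n" % i for i in range(1000))

with open(path) as f:
    all_lines = f.readlines()          # builds the full 1000-element list up front

longest = 0
with open(path) as f:
    for line in f:                     # holds only the current line at a time
        longest = max(longest, len(line))

print(len(all_lines), longest)
```

Both loops visit the same lines; only the first one keeps all of them in memory at once.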

As for file.xreadlines(), it simply returns iter(file). It has been deprecated since Python 2.3; the recommended form is:

    for line in f:
        pass  # do_something(line)

f.seek / f.tell

f.truncate (truncating)

The truncate() method truncates the file. If the optional size argument is given, the file is truncated to (at most) that size.

The size defaults to the current position. The current file position does not change. Note that if the specified size exceeds the current size of the file, the result depends on the platform.

Note: this method does not work when the file is opened in read-only mode.

The following is the syntax of the truncate() method:

    fileObject.truncate([size])

size -- optional; if given, the file is truncated to (at most) this size.

This method does not return any value.

The following example shows the use of the truncate() method.

    #!/usr/bin/python

    # Open a file for reading and writing
    fo = open("foo.txt", "r+")
    print "Name of the file: ", fo.name

    # Assuming the file has the following 5 lines
    # This is 1st line
    # This is 2nd line
    # This is 3rd line
    # This is 4th line
    # This is 5th line

    line = fo.readline()
    print "Read Line: %s" % (line)

    # Now truncate the remaining file.
    fo.truncate()

    # Try to read the file now
    line = fo.readline()
    print "Read Line: %s" % (line)

    # Close the opened file
    fo.close()

When we run the above program, it produces the following results:

    Name of the file:  foo.txt
    Read Line: This is 1st line

    Read Line:
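Truncating to an explicit size can be sketched as follows. This version is written for Python 3 and passes the size directly instead of relying on the current position; the scratch file and the size of 16 bytes are assumptions made for the example.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "foo.txt")       # hypothetical scratch file
with open(path, "w") as f:
    f.write("This is 1st line\nThis is 2nd line\nThis is 3rd line\n")

with open(path, "r+") as fo:        # r+ : read and write, so truncate() is allowed
    fo.truncate(16)                 # keep only the first 16 bytes
    fo.seek(0)
    remaining = fo.read()

print(repr(remaining), os.path.getsize(path))
```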

7. Optimal processing of incremental logs

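A file object is only a handle for reading; f.seek moves its position and f.tell reports it.

Example of incremental processing:

    #!/usr/bin/env python
    import sys

    def filerev(fd):
        """Yield the lines of fd from last to first, reading one byte at a time."""
        fd.seek(0, 2)                  # jump to the end of the file
        pos = -1
        line = ''
        while fd.tell() > 0:
            fd.seek(pos, 1)            # step back relative to the current position
            d = fd.read(1)
            if fd.tell() == 1:         # just read the first byte of the file
                yield d + line
                break
            else:
                pos = -2               # undo the read, then move one byte further back
                if d != '\n':
                    line = d + line
                else:
                    if line:
                        yield line
                    line = d

    if __name__ == '__main__':
        with open(sys.argv[1]) as fd:
            for i in filerev(fd):
                print i,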
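The same backward scan can be sketched for Python 3, where relative seeks are only allowed in binary mode. This is an illustrative variant rather than the original author's code: the name reverse_lines is hypothetical, it tracks the position in a variable instead of using relative seeks, and it assumes the file is UTF-8 encoded.

```python
import os
import tempfile


def reverse_lines(path):
    """Yield the lines of a file from last to first, reading one byte at a time."""
    with open(path, "rb") as fd:              # binary mode, so byte offsets are exact
        fd.seek(0, os.SEEK_END)
        pos = fd.tell()
        line = b""
        while pos > 0:
            pos -= 1
            fd.seek(pos)
            d = fd.read(1)
            if d == b"\n" and line:           # newline that ends the previous line
                yield line.decode("utf-8")    # assumption: the file is UTF-8
                line = d
            else:
                line = d + line
        if line:
            yield line.decode("utf-8")


demo = os.path.join(tempfile.mkdtemp(), "demo.log")      # hypothetical scratch file
with open(demo, "wb") as f:
    f.write(b"first\nsecond\nthird\n")

print(list(reverse_lines(demo)))
```

Reading one byte per seek is simple but slow on large files; reading in larger blocks from the end is a common refinement.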

Apache Log Processing example

    #!/usr/bin/env python
    import sys
    import datetime
    import socket
    from file_backwards import *

    # Month abbreviations as they appear in Apache timestamps
    MONTH = {
        'Jan': 1,
        'Feb': 2,
        'Mar': 3,
        'Apr': 4,
        'May': 5,
        'Jun': 6,
        'Jul': 7,
        'Aug': 8,
        'Sep': 9,
        'Oct': 10,
        'Nov': 11,
        'Dec': 12,
    }

    def parse_apache_date(datestr):
        """Parse 'DD/Mon/YYYY:HH:MM:SS' into a datetime with minute precision."""
        day, month, yearandtime = datestr.split('/')
        year, hour, minute, second = yearandtime.split(':')
        return datetime.datetime(int(year), MONTH[month], int(day), int(hour), int(minute))

    def countdict(d, k):
        """Increment the counter stored under key k."""
        if k in d:
            d[k] += 1
        else:
            d[k] = 1

    def parse_apache_log(logfile, ten_m):
        """Count requests newer than ten_m, reading the log from the end."""
        result = {}
        with open(logfile) as fd:
            for line in filerev(fd):
                splited_line = line.split()
                datestr = splited_line[3][1:]      # strip the leading '['
                apache_date = parse_apache_date(datestr)
                if apache_date > ten_m:
                    countdict(result, apache_date.strftime('%s'))
                else:
                    return result
        return result                              # every line was recent enough

    if __name__ == '__main__':
        now = datetime.datetime.now()
        timedelta = datetime.timedelta(minutes=10)
        ten_m_ago = now - timedelta
        key = 'http.count'
        data = parse_apache_log(sys.argv[1], ten_m_ago)
        sock = socket.socket()
        sock.connect(('127.0.0.1', 2003))
        print data
        for k, v in data.items():
            sock.send("%s %d %s\n" % (key, v, k))
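The parsing and counting steps can be tried without a socket or a real log file. The sketch below is written for Python 3 and is a simplified variant, not the original script: the sample log lines are made up, the name count_recent is hypothetical, and counts are keyed by minute as a readable string rather than by the epoch value used above.

```python
import datetime
from collections import Counter

MONTHS = {name: i for i, name in enumerate(
    "Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec".split(), start=1)}


def parse_apache_date(datestr):
    """Parse 'DD/Mon/YYYY:HH:MM:SS' into a naive datetime."""
    day, month, rest = datestr.split("/")
    year, hour, minute, second = rest.split(":")
    return datetime.datetime(int(year), MONTHS[month], int(day),
                             int(hour), int(minute), int(second))


def count_recent(lines_newest_first, cutoff):
    """Count requests per minute, stopping at the first line not newer than cutoff."""
    counts = Counter()
    for line in lines_newest_first:
        stamp = parse_apache_date(line.split()[3][1:])   # fourth field is '[date'
        if stamp <= cutoff:
            break
        counts[stamp.strftime("%Y-%m-%d %H:%M")] += 1
    return counts


sample = [   # made-up log lines, newest first
    '127.0.0.1 - - [10/Oct/2023:13:59:10 +0000] "GET / HTTP/1.1" 200 12',
    '127.0.0.1 - - [10/Oct/2023:13:58:40 +0000] "GET /a HTTP/1.1" 200 34',
    '127.0.0.1 - - [10/Oct/2023:13:58:05 +0000] "GET /b HTTP/1.1" 404 0',
    '127.0.0.1 - - [10/Oct/2023:13:40:00 +0000] "GET /c HTTP/1.1" 200 56',
]
cutoff = datetime.datetime(2023, 10, 10, 13, 50)
print(dict(count_recent(sample, cutoff)))
```

Because the lines arrive newest first, the loop can stop at the first entry outside the window instead of scanning the whole log.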
