Python learning notes (2) and python learning notes

Source: Internet
Author: User

Python learning notes (2) and python learning notes

Note eight
Usage metadata
The biggest difference between a dynamic language and a static language is the definition of functions and classes, instead of at compile time, but dynamic creation at runtime.
Logging module for program debugging
Import logging
Unit Test
To write unit tests, we need to introduce the unittest module that comes with python.

File read/write
Reading and Writing files is the most common IO operation. python has built-in functions for reading and writing files. To read and write a file is to request the operating system to open a file object, and then read data from the file object through the interface provided by the operating system, or write data to the file object.
Step 1: open the file
Use the python built-in open () function
Step 2: read or write the object
Read (), write ()
Step 3: Close the file
Close (), the file must be closed after use, because the file object will occupy the resources of the operating system, and the number of files that can be opened at the same time by the operating system is also limited
However, every write is too cumbersome. Therefore, PYthon introduces the with statement to automatically call the close () method.
With open () as f

Binary files
By default, all text files are read and ASCII-encoded. To read binary files, such as video and video files, open the file in "rb" mode.
Character encoding
To read a non-ASCII text file, it must be opened in binary mode and decoded. Such as GBK-encoded files
F = open ('/Users/michael/gbk.txt', 'rb ')
>>> U = f. read (). decode ('gbk ')
>>> U
U' \ u6d4b \ u8bd5'
>>> Print u
Test
If manual conversion is troublesome, python also provides a codecs module to help us automatically convert the encoding during file reading and directly read unicode
Import codecs
With codecs. open ('/Users/michael/gbk.txt', 'R', 'gbk') as f:
F. read () # U' \ u6d4b \ u8bd5'


Serialization
During the running of the program, all the variables are in the memory and can be modified at any time. However, once the program ends, all the memory occupied by the variables will be reclaimed by the operating system.
The process of changing variables from memory to storage or transmission is called serialization and picking in python. After serialization, The serialized content can be written to the disk or transmitted to another machine over the network.
In turn, it is called deserialization to re-read the variable content from the serialized object to the memory.
Python provides two modules for serialization.
CPickle, pickle

Import Module
: D
Out [11]: {'age': 20, 'name': 'jack', 'score ': 88}

In [12]: try:
...: Import cPickle as pickle
...: Role t ImportError:
...: Import pickle
....:

In [13]: pickle. dumps (d)
Out [13]: "(dp1 \ nS 'age' \ np2 \ nI20 \ nsS 'score '\ np3 \ nI88 \ nsS 'name' \ np4 \ nS 'jack' \ np5 \ ns."
Pickle. the dumps () method serializes any object into a str, then you can write this str into a file, or use another method to pickle. dump () directly serialize the Object and write it into a file-like Object

In [15]: f = open (r "C: \ Users \ MyHome \ Desktop \ dumps.txt", "wb ")

In [16]: pickle. dump (d, f)

In [17]: f. close ()
When we want to read the object from the disk to the memory, we can first read the content into a str, and then use pickle. the loads () method deserializes objects, or you can directly use pickle. the load () method directly deserializes an Object from a file-like Object.

In [18]: f = open (r "C: \ Users \ MyHome \ Desktop \ dumps.txt", "rb ")

In [19]: d = pickle. load (f)

In [20]: f. close ()

In [21]: d
Out [21]: {'age': 20, 'name': 'jack', 'score ': 88}

JSON advanced
If we want to pass objects between different programming languages, we must serialize objects into standard formats, such as XML, but a better way is to serialize them to JSON, JSON is a string that can be read by all languages, stored on disks or transmitted over the network. JSON is not only a standard format, but also faster than XML, and can be read directly on the Web page, which is very convenient.
The built-in json module in python provides excellent Python object to JSON format conversion. Let's take a look at how to convert a python object into a JSON object.

 

In [22]: import json

In [23]: d = dict (name = "Jack", age = 24, score = 96)

In [24]: json. dumps (d)
Out [24]: '{"age": 24, "score": 96, "name": "Jack "}'

The dumps () method returns a str with the standard JSON content. Similarly, the dump () method can directly write JSON into a file-like Object.
To deserialize JSON into a python object, use loads () or the corresponding load () method. The former deserializes JSON strings, the latter reads the string from the file-like Object and deserializes it.
In [25]: json_str = '{"age": 24, "score": 96, "name": "Jack "}'

In [26]: json. loads (json_str)
Out [26]: {u'age': 24, u'name': u'jack', u'score ': 96}


Python dict objects can be directly serialized into JSON {}. However, many times we prefer to use class to represent objects, such as defining the Student class, and then serialize:

Import json

Class Student (object ):
Def _ init _ (self, name, age, score ):
Self. name = name
Self. age = age
Self. score = score

S = Student ('bob', 20, 88)
Print (json. dumps (s ))
Run the code and get a TypeError without mercy:

Traceback (most recent call last ):
...
TypeError: <__ main _. Student object at 0x

The error occurs because the Student object is not a JSON object that can be serialized.

If the Instance Object of the class cannot be serialized as JSON, this is definitely unreasonable!

Don't worry. Let's take a closer look at the parameter list of the dumps () method. In addition to the first required obj parameter, the dumps () method also provides a lot of optional parameters:

Https://docs.python.org/2/library/json.html#json.dumps

These optional parameters allow us to customize JSON serialization. The previous Code failed to serialize the Student class instance to JSON because the dumps () method does not know how to change the Student instance to a JSON {} object by default.

The optional parameter default is to convert any object into a JSON object in sequence. We only need to write a conversion function for Student and then pass the function into it:


Def studentdict (std ):
Return {"name": std. name, "age": std. age, "score": std. score}

Print (json. dumps (s, default = studentdict ))

However, if you encounter a Teacher class instance next time, it cannot be serialized as JSON. We can steal the lazy and convert any class instance into dict:

Print (json. dumps (s, default = lambda obj: obj. _ dict __))


Similarly, if we want to deserialize JSON into a Student object instance, the loads () method first converts a dict object. Then, the input object_hook function converts dict to a Student instance:

Def dict2student (d ):
Return Student (d ['name'], d ['age'], d ['score '])

Json_str = '{"age": 20, "score": 88, "name": "Bob "}'
Print (json. loads (json_str, object_hook = dict2student ))
The running result is as follows:

<__Main _. Student object at 0x10cd3c190>
The output is a deserialized Student instance object.


Analysis
If we want to write a search engine, the first step is to crawl the page of the target website with crawlers. The second step is to parse the HTML page and check whether the content is news, images, or videos.
How to parse HTML? python provides HTMLParser to parse HTML easily

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.