Py-lesson03, lesson.

Source: Internet
Author: User

1. Chicken soup before class

Chicken soup for the soul: recommended books include The Kite Runner, Bai Lu Yuan (White Deer Plain), and Lin Da's writing on the United States. Summary: reading improves one's self-cultivation and character.
Whatever is happening in the outside world, and however warm or cold human relationships turn out to be, when I open these books I feel as if I am sitting next to the masters, listening to them explain how to learn and share their life experience, the ways of the world, and how to look at problems. After reading enough, I find that some of my earlier ideas were silly and some of my habits were foolish. When I then deal with other matters, I find that this wisdom can be applied smoothly and freely, as if a good solution were already at hand.
People come to reading in one of two ways:
One: calm the mind first, and then read.
Two: start reading first, and the mind naturally becomes quiet. The same is true of learning!

2. Set
Deduplication
Sets also support relationship tests such as intersection, union, and difference.
To remove duplicates from a list, convert it into a set:
#!/usr/bin/env python
list_1 = [1, 2, 3, 3, 5, 5, 6, 7, 8, 9]
list_1 = set(list_1)  # converting to a set drops the duplicate 3 and 5
print(list_1, type(list_1))
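
One thing the conversion above does not promise is the order of the result. The following is a minimal sketch, not part of the original lesson, comparing `set()` with an order-preserving alternative based on `dict.fromkeys`. The variable names are illustrative, and it assumes Python 3.7 or later, where dictionaries keep insertion order.

```python
# Illustrative data with the same shape as the list in the lesson.
items = [1, 2, 3, 3, 5, 5, 6, 7, 8, 9]

# set() removes duplicates but makes no promise about element order.
unique_unordered = set(items)

# dict keys keep insertion order (Python 3.7+), so the first occurrence
# of each value survives in its original position.
unique_ordered = list(dict.fromkeys(items))

print(unique_unordered)
print(unique_ordered)
```

Use the set when only membership matters, and the `dict.fromkeys` form when the original order has to survive.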
 
#!/usr/bin/env python
list_1 = [1, 2, 3, 5, 5, 6, 7, 8, 9]
list_1 = set(list_1)
list_2 = set([2, 4, 6, 8])  # illustrative values; the source list is incomplete and shows only "4"
list_3 = set([1, 3, 7])
print(list_1.intersection(list_2))  # intersection
print(list_1 & list_2)
print(list_1.union(list_2))  # union
print(list_2 | list_1)
print(list_1.difference(list_2))  # difference: elements in list_1 that are not in list_2
print(list_1 - list_2)
print(list_1.issubset(list_2))  # is list_1 a subset of list_2?
print(list_1.issuperset(list_2))  # is list_1 a superset of list_2?
print(list_3.issubset(list_1))  # list_3 is a subset of list_1
print(list_1.symmetric_difference(list_2))  # symmetric difference: elements in exactly one of the two sets
print(list_1 ^ list_2)
print(list_1, type(list_1))

print('-------------')
# isdisjoint() returns True when the two sets share no elements
list_4 = set([5, 6, 8])
print(list_2.isdisjoint(list_3))

print('********************')
list_1.add(999)  # add a single element
list_1.update([1, 888, 999, 777])  # add several elements at once
print(list_1)


print('===============================> ')
print(list_1.discard('ddd'))  # discard does not raise an error if the element is missing
print(list_1.remove(999))  # remove raises KeyError if the element is missing; on success it returns None
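
The difference between `discard` and `remove` is easiest to see side by side. This is a small sketch with made-up values, not code from the lesson:

```python
s = {1, 2, 3}

s.discard(99)          # 99 is absent: nothing happens and no exception is raised
result = s.discard(1)  # discard returns None whether or not the element existed

try:
    s.remove(99)       # 99 is absent: remove raises KeyError
    raised = False
except KeyError:
    raised = True

print(s, result, raised)
```

Prefer `discard` when a missing element is acceptable, and `remove` when a missing element signals a bug that should surface.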

3. File Operations
data = open("/Users/F/Documents/yesterdy", encoding="UTF-8").read()
print(data)
f = open("/Users/F/Documents/yesterdy", 'w', encoding="UTF-8")  # file handle; 'w' creates the file and overwrites any existing content
# print('---------------------> data1 info message ')
# data1 = f.read()
# print(data1)

# print('------- data1 -------- %s' % data1)  # read() consumes the whole file, so a second read returns nothing
# Move the pointer back with seek(0) if you want to read the content again

print('starting write ........')
f.write("I love Beijing Tiananmen\n")  # works in 'w' mode; raises an error if the file was opened with 'r'
# 'r' is read-only; 'w' is write-only and creates or truncates the file; 'a' appends and cannot read

f.close()
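
The mode rules above can be checked without touching real data. The sketch below is an illustration rather than the lesson's own code: it writes to a scratch file in a temporary directory (the file name `modes_demo.txt` is hypothetical) and shows that `'w'` overwrites, `'a'` appends, and a handle opened with `'r'` refuses to write.

```python
import io
import os
import tempfile

# Hypothetical scratch file so the example does not touch real data.
path = os.path.join(tempfile.mkdtemp(), "modes_demo.txt")

with open(path, "w", encoding="utf-8") as f:   # 'w' creates the file
    f.write("first\n")

with open(path, "w", encoding="utf-8") as f:   # opening with 'w' again wipes "first"
    f.write("second\n")

with open(path, "a", encoding="utf-8") as f:   # 'a' keeps the existing content
    f.write("third\n")

with open(path, "r", encoding="utf-8") as f:
    content = f.read()
    try:
        f.write("nope")                         # a read-only handle cannot write
        write_failed = False
    except io.UnsupportedOperation:
        write_failed = True

print(repr(content), write_failed)
```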


f = open("/Users/F/Documents/yesterdy", 'r', encoding="UTF-8")  # file handle; 'r' opens the file read-only
for i in range(4):
    print(f.readline())  # read the first four lines, one per call

print('-----------------------> ')
for line in f.readlines():  # readlines() loads all remaining lines into a list
    print(line.strip())

f.seek(0)  # rewind, because the loop above consumed the file
for index, line in enumerate(f.readlines()):
    if index == 3:
        print('----- I am \t ------')
        continue  # skip the line at index 3 (the fourth line)
    print(line.strip())

f.seek(0)
for line in f:
    print(line)  # reads one line at a time; the most efficient way, since only one line is held in memory (the file is an iterator)


# Counter:
f.seek(0)
count = 0
for line in f:
    count += 1
    if count == 3:
        continue  # skip the third line
    print(line)
# How to move the pointer back to the beginning after reading
print(f.tell())  # print the current pointer position
print(f.read(5))  # read up to 5 characters from the current position
f.seek(0)  # move the pointer back to the start of the file
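
The interaction between `readline`, `readlines`, iteration, and the file pointer can be seen in a small self-contained run. This is a sketch on a scratch file with made-up content (the name `read_demo.txt` is hypothetical), not the lesson's `yesterdy` file:

```python
import os
import tempfile

# Hypothetical scratch file with four known lines.
path = os.path.join(tempfile.mkdtemp(), "read_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("line1\nline2\nline3\nline4\n")

with open(path, "r", encoding="utf-8") as f:
    first = f.readline()    # one line, including its trailing newline
    rest = f.readlines()    # every remaining line, as a list
    at_end = f.read()       # the pointer is at the end, so this is empty
    f.seek(0)               # rewind to the start of the file
    position = f.tell()
    # Iterate lazily and skip the line at index 2 (the third line).
    kept = [line.strip() for index, line in enumerate(f) if index != 2]

print(first, rest, repr(at_end), position, kept)
```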
 
print(f.encoding)  # print the file encoding
# f.errors: the error-handling scheme used for encoding and decoding
print(f.fileno())  # file descriptor number that the operating system uses for low-level I/O
f.seekable()  # whether the pointer can be moved; some files, such as terminal devices, are not seekable
f.flush()  # force buffered data to be written to disk; flush can also update a progress bar in real time:
import sys, time
for i in range(50):
    sys.stdout.write('#')
    sys.stdout.flush()
    time.sleep(0.1)
print(dir(f.buffer))
print(f.buffer)

f.seek(10)  # seek does not affect truncate(10): when a size is given, truncation always counts from the start of the file
f.truncate(10)  # keep the first 10 bytes and discard the rest; the file must be opened in a writable mode
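
To make the `seek` and `truncate` behaviour concrete, here is a minimal sketch on a scratch file (the file name and content are made up for illustration). It assumes ASCII content, so bytes and characters coincide:

```python
import os
import tempfile

# Hypothetical scratch file containing 20 ASCII characters.
path = os.path.join(tempfile.mkdtemp(), "truncate_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("0123456789abcdefghij")

with open(path, "r+", encoding="utf-8") as f:  # 'r+' so the handle is writable
    f.seek(15)       # moving the pointer does not change what truncate(10) keeps
    f.truncate(10)   # keep only the first 10 bytes of the file

with open(path, "r", encoding="utf-8") as f:
    remaining = f.read()

print(remaining)
```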

# Reading and writing
# r+: read/write mode; the file must already exist
print(f.readline())
f.write("------- diao --------")
# w+: write/read mode; the file is created first, so the original content is overwritten
f.write("------- diao bu diao ---------")
# a+: append/read mode; writes go to the end of the file and the file can also be read

# rb / wb: binary mode, which works with bytes rather than str; used for socket network transmission and download tools such as Thunder
# Do not pass encoding in binary mode. In Python 3.x bytes are bytes and str is str, so you must convert explicitly
f = open('yesterday', 'wb')
f.write("hello, binary\n".encode('utf-8'))  # encode the string as UTF-8 bytes before writing it to the binary file
f.close()
# ab: append to a binary file
# rU: universal newlines, which converts \r\n and \n into a single \n when reading (the U flag is deprecated in Python 3 and was removed in 3.11)
# On Windows the line break is \r\n; on Unix systems it is \n
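
The rule that binary mode deals in bytes and takes no `encoding` argument can be verified directly. The following is an illustrative sketch on a scratch file (the name `binary_demo.bin` is hypothetical):

```python
import os
import tempfile

# Hypothetical scratch file for the binary round trip.
path = os.path.join(tempfile.mkdtemp(), "binary_demo.bin")

with open(path, "wb") as f:                  # no encoding argument in binary mode
    f.write("hello, binary\n".encode("utf-8"))

with open(path, "rb") as f:
    raw = f.read()                           # bytes, not str

text = raw.decode("utf-8")                   # convert back to str explicitly

try:
    open(path, "rb", encoding="utf-8")       # binary mode rejects an encoding
    rejected = False
except ValueError:
    rejected = True

print(raw, repr(text), rejected)
```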
# File modification: read the original line by line and write the changed text to a new file
f = open('/Users/F/Documents/yesterdy', 'r', encoding='utf-8')
f_new = open('/Users/F/Documents/yesterdy.bak', 'w', encoding='utf-8')
for line in f:
    if "bb" in line:
        line = line.replace("bb", "alex")
    #     f_new.write(line)
    # else:
    #     f_new.write(line)
    f_new.write(line)
f.close()
f_new.close()
Batch modification, similar to sed:
# File modification
import sys
find_str = sys.argv[1]  # sys.argv[0] is the script name, so the arguments start at index 1
replace_str = sys.argv[2]
f = open('/Users/F/Documents/yesterdy', 'r', encoding='utf-8')
f_new = open('/Users/F/Documents/yesterdy.bak', 'w', encoding='utf-8')
for line in f:
    if find_str in line:  # use the variable, not the literal string "find_str"
        line = line.replace(find_str, replace_str)
    #     f_new.write(line)
    # else:
    #     f_new.write(line)
    f_new.write(line)
f.close()
f_new.close()
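
The sed-like replacement can also be wrapped in a function so it is easy to test. This is a sketch under stated assumptions: the helper name `replace_in_file`, the scratch file, and its content are all hypothetical and not part of the lesson.

```python
import os
import tempfile


def replace_in_file(src_path, dst_path, find_str, replace_str):
    """Copy src_path to dst_path with find_str replaced; return the number of changed lines."""
    changed = 0
    with open(src_path, "r", encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:  # one line in memory at a time
            if find_str in line:
                line = line.replace(find_str, replace_str)
                changed += 1
            dst.write(line)
    return changed


# Hypothetical scratch files for the demonstration.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "demo.txt")
dst_path = src_path + ".bak"
with open(src_path, "w", encoding="utf-8") as f:
    f.write("aa bb\ncc\nbb bb\n")

changed = replace_in_file(src_path, dst_path, "bb", "alex")
with open(dst_path, encoding="utf-8") as f:
    result = f.read()

print(changed, repr(result))
```

Writing to a second file and leaving the original untouched mirrors the approach above; replacing the original afterwards would be a separate step.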
# The with statement closes the file for you
with open('/Users/F/Documents/yesterdy', 'r', encoding='utf-8') as f:
    print(f.readline())
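
A short sketch of what "closes the file for you" means in practice, including the case where the block fails part-way through. The scratch file and the simulated error are made up for illustration:

```python
import os
import tempfile

# Hypothetical scratch file.
path = os.path.join(tempfile.mkdtemp(), "with_demo.txt")
with open(path, "w", encoding="utf-8") as f:
    f.write("hello\n")
    open_inside = not f.closed     # still open inside the block

closed_after = f.closed            # closed as soon as the block ends

try:
    with open(path, "r", encoding="utf-8") as f:
        first_line = f.readline()
        raise RuntimeError("simulated failure")
except RuntimeError:
    pass

closed_after_error = f.closed      # closed even though the block raised

print(open_inside, closed_after, closed_after_error)
```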
 


4. Character encoding
bogon:Documents F$ python3
Python 3.5.2 (v3.5.2:4def2a2901a5, Jun 26 2016, 10:47:25)
[GCC 4.2.1 (Apple Inc. build 5666) (dot 3)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> print(sys.getdefaultencoding())
utf-8
bogon:Documents F$ python
Python 2.7.10 (default, Oct 23 2015, 19:19:21)
[GCC 4.2.1 Compatible Apple LLVM 7.0.0 (clang-700.0.59.5)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys
>>> print(sys.getdefaultencoding())
ascii

ASCII is a seven-bit code stored in eight bits, with the highest bit always 0. Its values range from 0 to 127 (decimal), so each character occupies one byte (eight bits) in memory.
A GBK character is one or two bytes. The single-byte range 00-7F is identical to ASCII, and the first byte of a double-byte character lies in the range 81-FE, which makes it possible to tell whether a character is single-byte or double-byte.
Output in Java:
System.out.println("some text".getBytes("GBK").length);
System.out.println("some text".getBytes("GB2312").length);
Both lines above print 9.
Therefore, in GBK an English character occupies one byte and a Chinese character occupies two bytes.
UTF-8 is variable length. The original design allowed 1 to 6 bytes per character, and the current standard (RFC 3629) limits it to 1 to 4 bytes. Testing a handful of Chinese characters does not prove that all of them have the same length: most commonly used Chinese characters occupy 3 bytes each, while rarer characters outside the Basic Multilingual Plane occupy 4 bytes.
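
These byte counts can be checked from Python by encoding single characters and measuring the result. The sample characters below are illustrative choices, written as escapes so the snippet does not depend on the editor's encoding: U+4E2D is a common Chinese character and U+20000 is a rare one outside the Basic Multilingual Plane.

```python
ascii_char = "a"
common_chinese = "\u4e2d"      # a frequently used Chinese character
rare_chinese = "\U00020000"    # a CJK Extension B character

gbk_ascii = len(ascii_char.encode("gbk"))          # single byte, same as ASCII
gbk_chinese = len(common_chinese.encode("gbk"))    # double byte

utf8_ascii = len(ascii_char.encode("utf-8"))
utf8_common = len(common_chinese.encode("utf-8"))
utf8_rare = len(rare_chinese.encode("utf-8"))

print(gbk_ascii, gbk_chinese, utf8_ascii, utf8_common, utf8_rare)
```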
Unicode (also called the unified code or universal code) is a character encoding standard used on computers. It assigns a unified, unique code to every character in every language, to meet the requirements of cross-language and cross-platform text conversion and processing. Development started in 1990 and the standard was officially announced in 1994. As computers became more capable, Unicode became widespread in the decade or so after its launch.
The form of Unicode commonly used in practice at the time corresponded to UCS-2, which uses a 16-bit code space, so each character occupies 2 bytes.
Unicode is a unified international encoding, so it should contain all Chinese characters. Some sources say that a Unicode character occupies 2 bytes, which allows at most 2^16 (65536) characters.
Yet there are at least 90 thousand Chinese characters, so would a Unicode character not need at least 3 bytes? The resolution is that Unicode itself only assigns code points, and the number of bytes per character depends on the encoding: UCS-2 covers only the first 65536 code points, while UTF-8, UTF-16, and UTF-32 can represent every code point.
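
The distinction between a code point and its encoded size can be observed in Python. This is an illustrative sketch; the two sample characters are arbitrary choices, one inside the first 65536 code points and one outside.

```python
bmp_char = "\u4e2d"        # code point below 65536
rare_char = "\U00020000"   # code point above 65535

code_point = ord(rare_char)                     # the code point is just a number
fits_in_16_bits = code_point <= 0xFFFF

bmp_utf16 = len(bmp_char.encode("utf-16-be"))   # one 16-bit unit
rare_utf16 = len(rare_char.encode("utf-16-be")) # a surrogate pair, two 16-bit units
rare_utf32 = len(rare_char.encode("utf-32-be")) # always four bytes

print(hex(code_point), fits_in_16_bits, bmp_utf16, rare_utf16, rare_utf32)
```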

 
 

 

 

 
