Core Python Programming: regular expression study notes (end-of-chapter exercises)

Source: Internet
Author: User

1. Recognize the following strings: "bat", "bit", "but", "hat", "hit", or "hut".

[bh][aiu]t
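A quick way to check this kind of answer is to run it through Python's re module. The sketch below assumes the pattern [bh][aiu]t; the helper name is_match is made up for illustration, and re.fullmatch is used so that longer words such as "bitter" are rejected.

```python
import re

# Assumed pattern: one of b/h, then one of a/i/u, then t.
PATTERN = re.compile(r'[bh][aiu]t')

def is_match(word):
    """Hypothetical helper: True only if the whole word matches."""
    return PATTERN.fullmatch(word) is not None

for word in ['bat', 'bit', 'but', 'hat', 'hit', 'hut', 'bet', 'bitter']:
    print(word, is_match(word))
```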

2. Match any pair of words separated by a single space, that is, a first and last name.

[A-Za-z]+\s[A-Za-z]+

3. Match any word and single letter separated by a comma and a single space, as in a last name followed by a first initial.

[A-Za-z]+,\s[A-Za-z]
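The two name patterns can be tried out as follows. This is only a sketch: the constant names and the sample names are invented for illustration, and re.fullmatch is used so that extra characters cause a mismatch.

```python
import re

# Assumed patterns for exercises 2 and 3.
NAME_PAIR = re.compile(r'[A-Za-z]+\s[A-Za-z]+')       # "First Last"
NAME_INITIAL = re.compile(r'[A-Za-z]+,\s[A-Za-z]')    # "Last, F"

for text in ['John Smith', 'John  Smith', 'Smith, J', 'Smith,J']:
    print(repr(text),
          bool(NAME_PAIR.fullmatch(text)),
          bool(NAME_INITIAL.fullmatch(text)))
```

Note that \s also accepts a tab or newline; a literal space in the pattern would be stricter.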

4. Match the set of all valid Python identifiers.

[A-Za-z_]\w*     # starts with a letter or underscore; the remaining characters may be letters, digits, or underscores
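A small check of the identifier pattern, written with \w* so that single-character names such as x are accepted. The helper is_identifier is a made-up name, and excluding reserved words with keyword.iskeyword is an extra assumption that goes beyond the regular expression itself.

```python
import keyword
import re

# Leading letter or underscore, then any number of word characters.
IDENT = re.compile(r'[A-Za-z_]\w*')

def is_identifier(name):
    """Hypothetical helper: pattern match plus a reserved-word check."""
    return IDENT.fullmatch(name) is not None and not keyword.iskeyword(name)

for name in ['x', '_tmp1', 'spam_eggs', '1abc', 'a-b', 'for']:
    print(name, is_identifier(name))
```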

5. Match a street address in the U.S. format: a house number followed by the street name, for example 1180 Bordeaux Drive. Make the regular expression flexible enough to support multi-word street names, such as 3120 De la Cruz Boulevard.

\d+(\s[A-Za-z]+)+
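The address pattern can be checked against the two example addresses. This sketch assumes a stricter variant that requires at least one whitespace-separated word after the house number; the constant name ADDRESS_RE is illustrative.

```python
import re

# House number, then one or more words each preceded by one whitespace character.
ADDRESS_RE = re.compile(r'\d+(?:\s[A-Za-z]+)+')

for text in ['1180 Bordeaux Drive', '3120 De la Cruz Boulevard',
             'Bordeaux Drive', '1180']:
    print(repr(text), bool(ADDRESS_RE.fullmatch(text)))
```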

6. Match simple web domain names that begin with "www." and end with ".com", for example www.yahoo.com.
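The notes list no pattern for this exercise. One possible sketch, assuming that each domain label consists only of ASCII letters, digits, and hyphens (the constant name DOMAIN_RE is made up):

```python
import re

# "www.", one or more labels each followed by a dot, then "com".
DOMAIN_RE = re.compile(r'www\.(?:[A-Za-z0-9-]+\.)+com')

for text in ['www.yahoo.com', 'www.mail.yahoo.com', 'www.yahoo.org', 'www.com']:
    print(text, bool(DOMAIN_RE.fullmatch(text)))
```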

7. Match the set of all strings that can represent Python integers.

[-+]?\d+            # optional sign

8. Match the set of all strings that can represent Python long integers.

[-+]?\d+[lL]        # long integer (Python 2): an integer followed by an uppercase or lowercase L

9. Match the set of all strings that can represent Python floating-point numbers.

[-+]?\d+\.\d*

10. Match the set of all strings that can represent Python complex numbers.

[-+]?\d+[-+]\d+j
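The numeric patterns from exercises 7 through 10 can be compared side by side. This is a simplified sketch: the patterns assumed here do not cover exponents, hexadecimal literals, or floats without a leading digit such as .5, and the function name classify is invented for illustration.

```python
import re

# Assumed patterns; the long form applies to Python 2 literals only.
PATTERNS = [
    ('int', re.compile(r'[-+]?\d+')),
    ('long', re.compile(r'[-+]?\d+[lL]')),
    ('float', re.compile(r'[-+]?\d+\.\d*')),
    ('complex', re.compile(r'[-+]?\d+[-+]\d+j')),
]

def classify(text):
    """Return the name of the first pattern matching the whole string, else None."""
    for name, pattern in PATTERNS:
        if pattern.fullmatch(text):
            return name
    return None

for text in ['42', '-7L', '+3.14', '3.', '1+2j', '.5', 'abc']:
    print(repr(text), classify(text))
```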

11. Match the set of all strings that can represent valid e-mail addresses.

[^\s]+@(\w+\.?)+

12. Match the set of all strings that can represent valid website addresses (URLs).

(http://([A-Za-z0-9\-]+\.?)+)
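For exercises 11 and 12, a deliberately loose sketch is shown below. The patterns are assumptions in the spirit of the notes rather than full validators; real e-mail and URL syntax is far more permissive than this, and the constant names are illustrative.

```python
import re

# Loose e-mail: non-space, non-@ local part, then dotted domain labels.
EMAIL_RE = re.compile(r'[^\s@]+@(?:\w+\.)+\w+')
# Loose URL: http:// followed by dotted labels ending in letters.
URL_RE = re.compile(r'http://(?:[A-Za-z0-9-]+\.)+[A-Za-z]+')

for text in ['user@example.com', 'user@@example.com', 'userexample.com']:
    print(text, bool(EMAIL_RE.fullmatch(text)))

for text in ['http://www.example.com', 'ftp://www.example.com', 'http://example']:
    print(text, bool(URL_RE.fullmatch(text)))
```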

13. The type() built-in function returns a type object that displays as a string such as <type 'int'>. Create a regular expression that extracts the actual type name from that string; for <type 'int'> it should return int.

<type\s'(\w+)'>
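Python 3 displays types as <class 'int'> rather than <type 'int'>, so a sketch that accepts both spellings may be more convenient. The helper name type_name is made up, and allowing dots inside the name (for types such as collections.OrderedDict) is an added assumption.

```python
import re

# Accept both the Python 2 ("type") and Python 3 ("class") display forms.
TYPE_RE = re.compile(r"<(?:type|class)\s'([\w.]+)'>")

def type_name(obj):
    """Hypothetical helper: extract the type name from str(type(obj))."""
    m = TYPE_RE.match(str(type(obj)))
    return m.group(1) if m else None

print(type_name(1))
print(type_name('spam'))
print(TYPE_RE.match("<type 'int'>").group(1))
```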

14. Processing dates: create a regular expression that represents the remaining three months (October through December) of the standard calendar.

1[012]
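Combined with a pattern for January through September, this gives a full month matcher. The sketch below assumes 0?[1-9] for the first nine months; the constant name MONTH_RE is illustrative.

```python
import re

# 1-9 with an optional leading zero, or 10-12.
MONTH_RE = re.compile(r'0?[1-9]|1[012]')

valid = [str(n) for n in range(1, 13)] + ['0%d' % n for n in range(1, 10)]
invalid = ['0', '00', '13', '012']

print(all(MONTH_RE.fullmatch(s) for s in valid))
print(any(MONTH_RE.fullmatch(s) for s in invalid))
```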

15. Processing credit card numbers

(\d{4}-\d{6}-\d{5}|\d{4}-\d{4}-\d{4}-\d{4})
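The pattern accepts two layouts: 15 digits grouped 4-6-5 and 16 digits grouped 4-4-4-4. A short sketch with invented sample numbers follows; it checks the format only, not whether the number is a real card.

```python
import re

# 15-digit (4-6-5) or 16-digit (4-4-4-4) layouts, hyphen-separated.
CARD_RE = re.compile(r'\d{4}-\d{6}-\d{5}|\d{4}-\d{4}-\d{4}-\d{4}')

for text in ['1234-567890-12345', '1234-5678-9012-3456',
             '1234-5678-9012', '1234567890123456']:
    print(text, bool(CARD_RE.fullmatch(text)))
```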

16. Update gendata.py so that the data is written directly to redata.txt rather than to the screen.

#!/usr/bin/env python
# gendata.py
from random import randrange, choice
from string import ascii_lowercase as lc
# from sys import maxint
from time import ctime

tlds = ('com', 'edu', 'net', 'org', 'gov')

# Raw string, so that \r in the path is not read as a carriage return
with open(r'Path\redata.txt', 'w') as fp:
    for i in range(randrange(5, 11)):
        dtint = randrange(2**32)        # pick date
        dtstr = ctime(dtint)            # date string
        llen = randrange(4, 8)          # login is shorter
        login = ''.join(choice(lc) for j in range(llen))
        dlen = randrange(llen, 13)      # domain is longer
        dom = ''.join(choice(lc) for j in range(dlen))
        # Build the line once so the screen and the file show the same data
        line = '%s::%s@%s.%s::%d-%d-%d' % (dtstr, login, dom, choice(tlds), dtint, llen, dlen)
        print(line)
        fp.write(line + '\n')

17. Determine how many times each day of the week appears in redata.txt.

import re

week = {"Mon": 0, "Tue": 0, "Wed": 0, "Thu": 0, "Fri": 0, "Sat": 0, "Sun": 0}

with open(r'Path\redata.txt', 'r') as fp:
    data = fp.readlines()
    for line in data:
        day = re.match(r'^\w{3}', line)
        # Skip lines that do not start with a known day abbreviation
        if day is not None and day.group() in week:
            week[day.group()] += 1

print(week)
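The same count can be tried without a data file by using collections.Counter on a few in-memory lines. This is a sketch only: the sample lines are invented in the redata.txt layout (their integer fields are arbitrary), and count_days is a made-up helper name.

```python
import re
from collections import Counter

# Match a day abbreviation at the start of the line.
DAY_RE = re.compile(r'(Mon|Tue|Wed|Thu|Fri|Sat|Sun)\s')

def count_days(lines):
    """Hypothetical helper: count leading day names, skipping malformed lines."""
    counts = Counter()
    for line in lines:
        m = DAY_RE.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts

# Invented sample lines; the trailing integers are arbitrary.
sample = [
    'Sat Jun  8 09:14:03 1985::abcd@efghij.com::1000-4-6',
    'Mon Jan  3 11:02:45 2005::wxyz@klmnopq.org::2000-4-7',
    'Sat Feb 14 23:59:59 2009::login@domainname.net::3000-5-10',
    'corrupted line',
]
print(count_days(sample))
```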

18. Check whether the data in redata.txt is damaged.

# Determine whether the data is damaged: every line should start with a well-formed timestamp
import re

with open(r'Path\redata.txt', 'r') as fp:
    data = fp.readlines()
    for line in data:
        time = re.match(r'^\w{3}\s\w{3}\s{1,2}\d{1,2}\s\d{2}:\d{2}:\d{2}\s\d{4}', line)
        if time is not None:
            print('not damaged')
        else:
            print('There is data corruption')

19-25. Create regular expressions that extract the full timestamp, full e-mail address, month, year, time of day, login name, and domain name from each line of redata.txt.

# Exercises 19-25
import re

times = []
emails = []
months = []
years = []
clocks = []
lnames = []
dnames = []

with open(r'Path\redata.txt', 'r') as fp:
    data = fp.readlines()
    for line in data:
        # Group 1 wraps only the timestamp; group 7 wraps the whole domain name
        time = re.match(r'(\w{3}\s(\w{3})\s{1,2}\d{1,2}\s(\d{2}:\d{2}:\d{2})\s(\d{4}))::((\w+)@((?:\w+\.?)+))::', line)
        if time is not None:
            times.append(time.group(1))     # full timestamp
            months.append(time.group(2))    # month
            years.append(time.group(4))     # year
            clocks.append(time.group(3))    # time of day
            emails.append(time.group(5))    # full e-mail address
            lnames.append(time.group(6))    # login name
            dnames.append(time.group(7))    # domain name

print('Full time:', times)
print('Month:', months)
print('Year:', years)
print('Time of day:', clocks)
print('E-mail address:', emails)
print('Login name:', lnames)
print('Domain name:', dnames)
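Numbered groups are easy to miscount in a pattern this long. As an alternative sketch, named groups make each field self-describing. The group names, the helper parse_line, and the sample line are all invented for illustration.

```python
import re

# One pattern with a named group per field of a redata.txt line.
LINE_RE = re.compile(
    r'(?P<timestamp>\w{3}\s(?P<month>\w{3})\s{1,2}\d{1,2}\s'
    r'(?P<clock>\d{2}:\d{2}:\d{2})\s(?P<year>\d{4}))::'
    r'(?P<email>(?P<login>\w+)@(?P<domain>(?:\w+\.)+\w+))::'
)

def parse_line(line):
    """Hypothetical helper: return a dict of fields, or None if the line is malformed."""
    m = LINE_RE.match(line)
    return m.groupdict() if m else None

# Invented sample line; the trailing integer field is arbitrary.
fields = parse_line('Sat Jun  8 09:14:03 1985::abcd@efghij.com::1000-4-6')
print(fields)
```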


26. Replace the e-mail address on each line of data with your own e-mail address.

import re

with open(r'path\redata2.txt', 'r') as fp:
    data = fp.readlines()
    for line in data:
        # The second argument is the replacement address
        print(re.sub(r'\w+@(\w+\.?)+', '[email protected]', line).rstrip())
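The substitution can be tried on a single in-memory line. In this sketch the address me@example.com, the sample line, and the helper replace_email are placeholders invented for illustration; a function is passed as the replacement so that the new address is inserted literally.

```python
import re

# login@domain, where the domain is dotted labels.
EMAIL_RE = re.compile(r'\w+@(?:\w+\.)+\w+')

def replace_email(line, new_address):
    """Hypothetical helper: swap the first e-mail address on the line."""
    return EMAIL_RE.sub(lambda m: new_address, line, count=1)

# Invented sample line; the trailing integer field is arbitrary.
line = 'Sat Jun  8 09:14:03 1985::abcd@efghij.com::1000-4-6'
print(replace_email(line, 'me@example.com'))
```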


27. Extract the month, day, and year from each timestamp and output them in the format "Month, day, year".

import re

with open(r'Path\redata2.txt', 'r') as fp:
    data = fp.readlines()
    for line in data:
        m = re.match(r'^\w{3}\s(\w{3})\s{1,2}(\d{1,2})\s\d{2}:\d{2}:\d{2}\s(\d{4})', line)
        # Skip lines whose timestamp is malformed
        if m is not None:
            print('%s, %s, %s' % (m.group(1), m.group(2), m.group(3)))
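The reformatting step can also be tested on one line at a time. This is a sketch in which format_date and the sample line are invented; a malformed line yields None instead of raising an error.

```python
import re

# Capture month, day, and year from a ctime-style timestamp.
DATE_RE = re.compile(r'\w{3}\s(\w{3})\s{1,2}(\d{1,2})\s\d{2}:\d{2}:\d{2}\s(\d{4})')

def format_date(line):
    """Hypothetical helper: return 'Month, day, year' or None."""
    m = DATE_RE.match(line)
    if m is None:
        return None
    return '%s, %s, %s' % m.groups()

# Invented sample line; the trailing integer field is arbitrary.
print(format_date('Sat Jun  8 09:14:03 1985::abcd@efghij.com::1000-4-6'))
```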

