Python practice exercise: a phone number and e-mail address extraction program

Source: Internet
Author: User

Topic:

Suppose you have the tedious task of finding every phone number and e-mail address in a long web page or document. Scrolling through it by hand could take a long time. With a program that searches the text on the clipboard for phone numbers and e-mail addresses, you only need to press Ctrl-A to select all the text, press Ctrl-C to copy it to the clipboard, and then run the program. It replaces the text on the clipboard with the phone numbers and e-mail addresses it finds.

Test text
Skip to main Contenthomesearch formsearchgo! Topicsarduinoart & designgeneral computinghacking & Computer Securityhardware/diyjavascriptkidslego? Lego? Mindstorms? Linux & Bsdskip to main contenthomesearch formsearchgo! Catalogmediawrite for UsAbout Ustopicsarduinoart & designgeneral computinghacking & Computer securityhardware/d Iyjavascriptkidslego? Lego? Mindstorms? Linux & bsdmangaminecraftprogrammingpythonscience & Mathscratchsystem administrationearly AccessGift Certificatesfree ebook edition with every print book purchased from nostarch.com! Shopping cart3 Items total: $53.48view cart checkoutcontact UsNo Starch Press, inc.245 8th StreetSan Francisco, CA 9410 3 usaphone:800.420.7240 or +1 415.863.9900 (9 a.m. to 5 p.m., M-f, PST) 传真: +1 415.863.9950Reach Us by Emailgeneral Inqui RIES: [Email protected]media requests: [email protected]academic requests: [email protected] (please See this page for academic review requests) help with your OrdER: [Email protected]reach Us on social mediatwitterfacebooknavigationmy accountlog outmanage your subscription Preferences.  About Us | ★jobs!  ★|  Sales and Distribution |  Rights |  Media |  Academic Requests |  Conferences |  Order FAQ |  Contact Us |  Write for Us | Privacycopyright 2018 No Starch Press, Inc.
Results after running
Copied to clipboard:
800-420-7240
415-863-9900
415-863-9950
[email protected]
[email protected]
[email protected]
[email protected]
Hit any key to close this window...
Ideas

When you start a new project, it is tempting to dive straight into writing code. More often, though, it pays to take a step back and consider the bigger picture. I suggest drawing up a high-level plan first and working out what the program needs to do. Don't think about the actual code yet; that comes later.
1. Create one regular expression for phone numbers and another for e-mail addresses
2. Find all matches of both regular expressions in the clipboard text
3. Copy the processed text back to the clipboard

Start writing the program now.
#! python3
# phoneAndEmail.py - Finds phone numbers and email addresses on the clipboard.

import re, pyperclip

# Create the phone number regex
phoneRegex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?  # area code, optional: 444 or (444)
    (\s|-|\.)?          # separator, optional: whitespace, - or .
    (\d{3})             # first three digits
    (\s|-|\.)?          # separator, optional: whitespace, - or .
    (\d{4})             # last four digits
    )''', re.VERBOSE)

# Create the email regex
emailRegex = re.compile(r'''(
    [a-zA-Z0-9._%+-]+   # username
    @
    [a-zA-Z0-9.-]+      # domain name
    (\.[a-zA-Z]{2,4})   # dot-something
    )''', re.VERBOSE)

# Match the clipboard text
text = str(pyperclip.paste())
matches = []
for groups in phoneRegex.findall(text):
    # groups[0] is the whole match; indices 1, 3 and 5 are the digit groups
    phoneNum = '-'.join([groups[1], groups[3], groups[5]])
    matches.append(phoneNum)
for groups in emailRegex.findall(text):
    matches.append(groups[0])

# Copy the processed text to the clipboard
if len(matches) > 0:
    pyperclip.copy('\n'.join(matches))
    print('Copied to clipboard:')
    print('\n'.join(matches))
else:
    print('No phone numbers or email addresses found.')
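To experiment with the patterns without going through the clipboard, the matching step can be pulled into a function that takes a plain string. The following is only a sketch of that idea: the function name find_contacts and the sample string are made up for illustration, the patterns are modelled on the ones in the program, and pyperclip is deliberately left out.

```python
import re

# Phone pattern modelled on the program: optional area code,
# optional separators, then three digits and four digits.
phone_regex = re.compile(r'''(
    (\d{3}|\(\d{3}\))?  # area code, optional
    (\s|-|\.)?          # separator, optional
    (\d{3})             # first three digits
    (\s|-|\.)?          # separator, optional
    (\d{4})             # last four digits
    )''', re.VERBOSE)

email_regex = re.compile(r'''(
    [a-zA-Z0-9._%+-]+   # username
    @
    [a-zA-Z0-9.-]+      # domain name
    (\.[a-zA-Z]{2,4})   # dot-something
    )''', re.VERBOSE)


def find_contacts(text):
    """Hypothetical helper: return phone numbers and e-mail addresses found in text."""
    matches = []
    for groups in phone_regex.findall(text):
        # groups[0] is the whole match; indices 1, 3 and 5 hold the digit groups.
        matches.append('-'.join([groups[1], groups[3], groups[5]]))
    for groups in email_regex.findall(text):
        matches.append(groups[0])
    return matches


sample = 'Phone: 800.420.7240 or +1 415.863.9900, mail: info@example.com'
print(find_contacts(sample))
# ['800-420-7240', '415-863-9900', 'info@example.com']
```

One thing this makes easy to see: because the area code is optional, a seven-digit number such as 420-7240 leaves groups[1] empty, so the joined result starts with a hyphen (-420-7240). Whether that matters depends on the text you expect to process.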
Code analysis

re.VERBOSE is a flag that makes the regular expression compiler ignore whitespace and comments inside the pattern. In verbose mode you can spread a pattern over several lines and annotate each part, which makes it much more readable.
For more on regular expressions, see: Python Regular
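As a small illustration of the flag, the sketch below writes the same pattern twice, once compactly and once in verbose mode, and checks that both match the same text. The pattern and sample strings are made up for this example.

```python
import re

# The same pattern written twice: compactly, and with re.VERBOSE.
compact = re.compile(r'(\d{3})-(\d{4})')
verbose = re.compile(r'''
    (\d{3})   # first three digits
    -         # literal hyphen
    (\d{4})   # last four digits
    ''', re.VERBOSE)

text = 'Call 555-1234 today'
print(compact.search(text).groups())  # ('555', '1234')
print(verbose.search(text).groups())  # ('555', '1234')

# In verbose mode unescaped whitespace in the pattern is ignored, so a
# literal space has to be escaped or placed inside a character class.
spaced = re.compile(r'''(\d{3}) [ ] (\d{4})''', re.VERBOSE)
print(spaced.search('555 1234').groups())  # ('555', '1234')
```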

Another pitfall is groups: at first I did not understand the difference between groups() and group().
group() returns what a single capture group matched (group(0) is the whole match), for example:

import re

a = "123abc456"
m = re.search("([0-9]*)([a-z]*)([0-9]*)", a)
print(m.group(0))  # 123abc456, the whole match
print(m.group(1))  # 123
print(m.group(2))  # abc
print(m.group(3))  # 456

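groups() returns a tuple containing the strings matched by all the capture groups, from group 1 up to the last group in the pattern.
In the program, the groups in the phoneNum = '-'.join(...) line is an ordinary variable holding one tuple returned by findall(), not the groups() method, so don't mix the two up.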
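For comparison, here is a short sketch that calls group() and groups() on the same match object, using the same string as the group() example:

```python
import re

a = "123abc456"
m = re.search("([0-9]*)([a-z]*)([0-9]*)", a)

print(m.group(0))     # 123abc456 (the whole match)
print(m.group(2))     # abc
print(m.groups())     # ('123', 'abc', '456')
print(m.group(1, 3))  # ('123', '456')
```

Note that groups() never includes the whole match; that is only available as group(0).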
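The reason the loop variable holds a tuple is that findall() changes its return type depending on the capture groups in the pattern. The sketch below, with a simplified pattern and sample text chosen for illustration, shows the two cases:

```python
import re

text = "800-420-7240 and 415-863-9900"

# No capture groups: findall returns the matched strings.
print(re.findall(r'\d{3}-\d{3}-\d{4}', text))
# ['800-420-7240', '415-863-9900']

# Several capture groups: findall returns one tuple per match.
for groups in re.findall(r'(\d{3})-(\d{3})-(\d{4})', text):
    # "groups" is an ordinary loop variable that holds a tuple.
    print(groups, groups[0])
# ('800', '420', '7240') 800
# ('415', '863', '9900') 415
```

This is also why the program wraps each whole pattern in an outer pair of parentheses: that outer group becomes index 0 of every tuple, so the full match stays available.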
