Scanning forum replies with Python and automatically emailing attachments (for file requests and the like)

Source: Internet
Author: User

Background:

I share some books on www.kindle114.com. Network-disk services of every kind tend to expire shared links over copyright concerns, and constantly replacing the links is a hassle, so I distribute the files by email instead. That means checking the thread every day for replies that contain an email address and mailing each one by hand, which is just as tedious. Having a Python script scan the pages, collect the email addresses, and send the mail automatically is what a lazy person should do.

Implementation process:

The script needs to be run from crontab on a daily schedule.

The script is divided into three parts:

1. Traverse all reply pages of the specified post and collect every string that looks like an email address, stopping the scan once a page yields no address that the previous page did not already yield

2. Compare the addresses against the list that has already been mailed; any new address is added to the send list

3. Mail the attachment to the send list using Python's email modules
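The first two steps can be tried without touching the network. The following is a minimal sketch, not the script itself: `extract_emails`, `select_new`, and the sample reply text are hypothetical, and the pattern is a simplified, lowercase-only email expression.

```python
import re

# Simplified, lowercase-only email pattern (an assumption for this sketch).
EMAIL_RE = re.compile(r'[_.0-9a-z-]+@(?:[0-9a-z][0-9a-z-]+\.)+[a-z]{2,3}\b')


def extract_emails(page_text):
    """Return the email-like strings found in one page, in order, without repeats."""
    seen = []
    for addr in EMAIL_RE.findall(page_text):
        if addr not in seen:
            seen.append(addr)
    return seen


def select_new(found, already_sent):
    """Keep only the addresses that are not in the already-sent collection."""
    return [addr for addr in found if addr not in already_sent]


if __name__ == "__main__":
    # Hypothetical reply text standing in for a downloaded forum page.
    sample = ("<td>please send to reader_one@example.com</td>"
              "<td>me too: two@mail.example.org</td>")
    sent_before = {"reader_one@example.com"}
    print(select_new(extract_emails(sample), sent_before))
```

Keeping extraction and filtering as separate functions makes each one easy to check against a saved copy of a forum page before any mail is sent.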

Areas that can be improved:

1. Record where each scan ends and start the next scan from that position, to avoid wasting time on rescanning

2. Keep a dictionary that maps URLs to attachments, so that different threads can send different files

3. Build an online system to maintain this data and these lists
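The first two improvements could look roughly like the following. This is a sketch under assumptions: the state file layout (a JSON object keyed by URL template), the helper names `load_last_page` and `save_last_page`, and the `THREAD_ATTACHMENTS` mapping are all hypothetical and not part of the original script.

```python
import json
import os

# Hypothetical mapping from thread URL template to the attachment sent for it.
THREAD_ATTACHMENTS = {
    "http://www.kindle114.com/thread-4567-%d-1.html": "Redis_shejiyushixian.mobi",
}


def load_last_page(state_path, url_template):
    """Return the page number to resume from for a thread (1 if never scanned)."""
    if not os.path.exists(state_path):
        return 1
    with open(state_path) as f:
        state = json.load(f)
    return state.get(url_template, 1)


def save_last_page(state_path, url_template, page):
    """Record the last page reached for a thread, keeping other threads' entries."""
    state = {}
    if os.path.exists(state_path):
        with open(state_path) as f:
            state = json.load(f)
    state[url_template] = page
    with open(state_path, "w") as f:
        json.dump(state, f)
```

A daily run would then loop over `THREAD_ATTACHMENTS.items()`, start each scan at `load_last_page(...)`, and call `save_last_page(...)` with the page where the scan stopped.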

Notes:

1. The recipients argument of Python's sendmail should be a list

2. For a QQ mailbox, remove the mailbox's independent password (I do not know how to authenticate if it is left in place); otherwise login fails with a validation error

3. Some people disguise their email address for privacy reasons; the script cannot do anything about those ...

4. This article is only a discussion. Forum formats differ, so the script has to be adapted to the pages being crawled

5. Since there are so many poor-quality publications, it is reasonable to preview something first. If you think an e-book or video is worthwhile, please go to the cinema or buy a genuine copy. Even a short article like this takes time and energy to write, let alone a book or a film, and people depend on that work to make a living.
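On note 1: `smtplib.SMTP.sendmail` accepts the recipients as a list of address strings and treats a bare string as a single address, so a comma-separated string will not reach several people. One way to guard against that is a small normalizing helper; `as_recipient_list` is a hypothetical name for this sketch, not part of `smtplib`.

```python
def as_recipient_list(recipients):
    """Normalize recipients to a list of address strings for smtplib's sendmail."""
    if isinstance(recipients, str):
        # Split "a@x.com, b@y.com" style input into separate addresses.
        recipients = recipients.split(",")
    # Drop surrounding whitespace and empty entries.
    return [addr.strip() for addr in recipients if addr.strip()]


if __name__ == "__main__":
    print(as_recipient_list("a@example.com, b@example.com"))
    print(as_recipient_list(["c@example.com"]))
```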

#!/usr/bin/env python3
import datetime
import os
import re
import smtplib
import urllib.request
from email.header import Header
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def sendmail(towho, fromwho, bookname, text):
    msg = MIMEMultipart()
    ## Fill in your BT torrent or e-book file ...
    with open(bookname, 'rb') as f:
        att = MIMEApplication(f.read())  # application/octet-stream, base64-encoded
    att.add_header('Content-Disposition', 'attachment', filename=bookname)
    msg.attach(att)
    msg['From'] = fromwho
    msg['Subject'] = Header(bookname + '(' + str(datetime.date.today()) + ')', 'gb2312')
    msg.attach(MIMEText(text))
    server = smtplib.SMTP('smtp.qq.com')
    ## Fill in your QQ number and password
    server.login('1234567', 'xxxxxxxxxxxx')
    error = server.sendmail(msg['From'], towho, msg.as_string())
    server.quit()
    print(error)


## Addresses that were already mailed in earlier runs, one per line
dicsave = {}
if os.path.exists("log"):
    with open("log") as f:
        for line in f:
            dicsave[line.rstrip()] = 1
dic = {}
predic = {}
## Match strings in email format
pattern = re.compile(r'[_.0-9a-z-]+@(?:[0-9a-z][0-9a-z-]+\.)+[a-z]{2,3}\b')
## Possible enhancement: record the stop page as the next start page
for x in range(1, 100):
    ## Fill in the URL of the forum thread to scan
    url = "http://www.kindle114.com/thread-4567-%d-1.html" % x
    print(url)
    wp = urllib.request.urlopen(url)
    ## Addresses are ASCII, so a byte-preserving decode is enough for matching
    content = wp.read().decode('latin-1')
    ## Get the mail list for this page
    m = pattern.findall(content)
    stopflag = 1
    for i in m:
        ## A page with nothing new compared with the previous page means
        ## further scanning is wasted time, so stop
        if i not in predic:
            stopflag = 0
        ## Only addresses not mailed before need delivery
        if i not in dicsave:
            dic[i] = 1
    if stopflag == 1:
        print("Scan page over %d\n" % x)
        break
    predic = m
maillist = []
## Write the delivered addresses to the log to avoid duplicate delivery in the next run
file_object = open("log", 'a+')
for k, v in dic.items():
    print(k)
    maillist.append(k)
    file_object.write(k + "\n")
file_object.close()
if maillist:
    print("delivering...\n")
    ## The sender address appears redacted in the source as '[email protected]'
    sendmail(maillist, '[email protected]', 'Redis_shejiyushixian.mobi', 'What is your interested in?')
    print("Deliver is completed...\n")
else:
    print("Mail List is empty\n")
