Using Python to scan forum replies and automatically email attachments to the people who ask for them
Background:
The author shares some books on www.kindle114.com. Links on the various online storage services tend to be taken down for copyright reasons, and replacing the share links every time is a nuisance, so email is the only practical way to pass the files on. That, however, means going through the replies every day, collecting the email addresses, and sending the files one by one, which is very tedious. A Python script can scan the reply pages instead and send the email automatically once the addresses have been captured.
Implementation process:
You need to use crontab to execute this script regularly every day.
The script consists of three parts:
1. Traverse all reply pages of a specified post, capture every string that looks like an email address, and stop scanning once a page contains nothing that was not already on the previous page.
2. Compare the captured addresses with the list of addresses already served. Any new address is added to the sending list.
3. Use Python's email modules to send the attachment to everyone on the sending list.
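Steps 1 and 2 can be sketched roughly as follows. This is a minimal Python 3 illustration, not the script itself (which is written for Python 2). The function names `extract_emails` and `select_new` and the sample text are made up for the example, and the pattern is a simplified one that will miss unusual addresses.

```python
import re

# Simplified e-mail pattern for illustration only.
EMAIL_RE = re.compile(
    r"[_.0-9a-z-]+@(?:[0-9a-z][0-9a-z-]*\.)+[a-z]{2,}", re.IGNORECASE
)


def extract_emails(page_text):
    """Return the addresses found in one page, lowercased, without repeats."""
    seen = []
    for match in EMAIL_RE.findall(page_text):
        addr = match.lower()
        if addr not in seen:
            seen.append(addr)
    return seen


def select_new(found, already_sent):
    """Keep only the addresses that are not in the log of earlier deliveries."""
    return [addr for addr in found if addr not in already_sent]


# Sample reply text (invented for the example).
page = "Thanks! abc_1@qq.com ... please send to Foo.Bar@163.com, abc_1@qq.com"
found = extract_emails(page)
todo = select_new(found, already_sent={"abc_1@qq.com"})
```

Here `todo` contains only the address that has not been served before, which is what would be handed to the mailing step.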
Improvements:
1. Record where each scan ends, so that the next scan starts from that page and avoids rescanning.
2. Put the URLs and attachments into a dictionary, so that several posts can be served flexibly.
3. Build an online system to maintain the data and the lists.
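Improvements 1 and 2 might look something like the sketch below. It is only one possible design, written in Python 3. The state file name `scan_state.json`, the `JOBS` dictionary, and the helper names are assumptions made for the example and are not part of the script.

```python
import json
import os

STATE_FILE = "scan_state.json"  # hypothetical file name

# Hypothetical mapping: thread URL template -> attachment sent for that thread.
JOBS = {
    "http://www.kindle114.com/thread-4567-%d-1.html": "Redis_shejiyushixian.mobi",
}


def load_state(path=STATE_FILE):
    """Return {url_template: last_scanned_page}; empty on the first run."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def save_state(state, path=STATE_FILE):
    """Persist the last scanned page of every thread."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(state, fh)


def pages_to_scan(state, url_template, max_page=99):
    """Resume at the last recorded page, since it may have gained replies."""
    start = state.get(url_template, 1)
    return range(start, max_page + 1)
```

The main loop would then iterate over `JOBS.items()`, scan `pages_to_scan(state, url)` for each thread, store the page where scanning stopped in `state[url]`, and call `save_state(state)` at the end of the run.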
Note:
1. When sending to several recipients with sendmail in Python, pass the recipients as a list.
2. For a QQ mailbox, remove the mailbox's independent password (I do not know how to authenticate if it is left in place); otherwise the login fails with an authentication error.
3. Some people write their email address in an odd form for privacy reasons. The script cannot recognize those, so there is nothing I can do for them...
4. This article is only a reference. Forum layouts vary widely, so you need to adapt your script to the pages you want to capture.
5. There are far too many publications out there, so previewing them first is reasonable. If you find an e-book or video valuable, please buy the genuine book or go to the cinema. Writing this short article took time and effort, let alone writing a book or making a film.
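Note 1 can be illustrated with a short sketch. It uses the Python 3 `email.message.EmailMessage` API rather than the Python 2 modules of the script, and the helper names `build_message` and `deliver` are invented for the example. The SMTP host and the credentials are placeholders that you have to supply yourself.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, subject, body, attachment, filename):
    """Build a mail with a text body and one binary attachment."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["Subject"] = subject
    msg.set_content(body)
    msg.add_attachment(attachment, maintype="application",
                       subtype="octet-stream", filename=filename)
    return msg


def deliver(msg, recipients, host, user, password):
    """Send one message to every address in `recipients` (a list)."""
    with smtplib.SMTP(host) as server:
        server.login(user, password)
        # sendmail() expects a list of addresses here; a bare string is
        # treated as a single address, even if it contains commas.
        return server.sendmail(msg["From"], list(recipients), msg.as_string())
```

A call would look like `deliver(msg, ["a@example.com", "b@example.com"], host, user, password)`. The return value is a dictionary of refused recipients, which is empty when every address was accepted.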
#!/usr/bin/python
# Note: this script is written for Python 2.
import sys, urllib, re
from email.Header import Header
from email.MIMEText import MIMEText
from email.MIMEMultipart import MIMEMultipart
import smtplib, datetime

def sendMail(toWho, fromWho, bookName, text):
    msg = MIMEMultipart()
    ## fill your BT torrent or eBook ...
    att = MIMEText(open(bookName, 'rb').read(), 'base64', 'gb2312')
    att["Content-Type"] = 'application/octet-stream'
    att["Content-Disposition"] = 'attachment; filename="Redis_shejiyushixian.mobi"'
    msg.attach(att)
    msg['from'] = fromWho
    msg['subject'] = Header(bookName + '(' + str(datetime.date.today()) + ')', 'gb2312')
    msg.attach(MIMEText(text))
    server = smtplib.SMTP('smtp.qq.com')
    ## fill your QQ number and password
    server.login('1234567', 'xxxxxxxxxxxx')
    ## toWho must be a list of addresses
    error = server.sendmail(msg['from'], toWho, msg.as_string())
    server.close()
    print error

## addresses that were already served in earlier runs
dicSave = {}
for line in open("log"):
    dicSave[line.rstrip()] = 1

dic = {}
preDic = {}

## grep Email format string
pattern = re.compile(r'[_.0-9a-z-]+@(?:[0-9a-z][0-9a-z-]+\.)+[a-z]{2,3}\b')

## please enhance this script to mark stop page as next start page
for x in range(1, 100):
    ## fill forum url which need scan
    url = "http://www.kindle114.com/thread-4567-%d-1.html" % x
    print "%s" % url
    wp = urllib.urlopen(url)
    content = wp.read()
    ## get mail list
    m = pattern.findall(content)
    stopFlag = 1
    for i in m:
        ## if page is duplicate means scan is wasting time and need stop
        if i not in preDic:
            stopFlag = 0
        ## only new mail address is needed to deliver
        if i not in dicSave:
            dic[i] = 1
    if stopFlag == 1:
        print "Scan Page Over %d\n" % x
        break
    preDic = m

mailList = []
## write deliver mail address to log to avoid duplicate deliver in next run
file_object = open("log", 'a+')
for k, v in dic.items():
    print k
    mailList.append(k)
    file_object.write(k + "\n")
file_object.close()

if mailList:
    print "Delivering...\n"
    sendMail(mailList, '77167680@qq.com', 'Redis_shejiyushixian.mobi', 'What are you interested in ?')
    print "Deliver is completed...\n"
else:
    print "Mail List is empty\n"
Automatic posting with Python
I don't quite understand what you described, so just a few points:
1. Change the urllib2 User-Agent. Many posting bots use urllib2, so its default User-Agent may be blocked. You can pass yourself off as a browser such as IE or Firefox by changing it.
2. What do you mean by "shipping to the posting box"? Are you trying to post by modifying DOM nodes? A proper posting program should first analyze how the page's form is composed, then assemble the corresponding HTTP request and send it with the POST or GET method.
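The two points above can be sketched together as follows. The answer refers to urllib2 (Python 2); this sketch uses its Python 3 successor `urllib.request`. The forum URL, the form field names, and the helper name `build_post_request` are placeholders invented for the example, because the real field names have to be read from the form of the page you are posting to.

```python
import urllib.parse
import urllib.request

# Example browser-like User-Agent string; any current browser string will do.
BROWSER_UA = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) "
              "Gecko/20100101 Firefox/115.0")


def build_post_request(url, form_fields, user_agent=BROWSER_UA):
    """Assemble a form POST that carries a browser-like User-Agent header."""
    data = urllib.parse.urlencode(form_fields).encode("utf-8")
    # Supplying `data` makes urllib issue a POST instead of a GET.
    return urllib.request.Request(
        url, data=data, headers={"User-Agent": user_agent}
    )


# Hypothetical URL and field names.
req = build_post_request(
    "http://forum.example.com/post.php",
    {"subject": "hello", "message": "test reply"},
)
# response = urllib.request.urlopen(req)  # uncomment to actually send it
```

Most forums also require a logged-in session and hidden form fields, so a real posting program would have to capture and resend those values as well.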
In your Python script for logging in to Baidu automatically, how did you capture the data in postDict? Can it also be used to post messages automatically?
I have already written the tutorials you need:
[Tutorial] How to use a tool (the F12 developer tools of IE9) to analyze the internal logic of logging in to a website (the Baidu homepage).
After reading that, refer to the complete code I wrote:
[Tutorial] Simulating a website login in Python (includes complete, runnable code in two versions)
From these you can see how the data needed for postDict is captured and used.
(No post addresses can be given here. To read a post in full, search for its title on Google to find its address.)