Test and pre-release, and use python to check whether the webpage has a daily link

Source: Internet
Author: User
It is inevitable that the test link will be released to the production environment. Generally, this is to avoid risks through some testing check tools to check the link, below we will briefly describe the general implementation ideas. Many Internet companies may encounter tests, prerelease, and online environments, to isolate the test from the formal online environment. In this case, the test link is released to the production environment, generally, some testing tools are used to check links to avoid risks. I encountered a problem two days ago. Due to development negligence, I published the daily url online. However, there was no automated monitoring tool in the test, which resulted in no timely detection. Because I was just reading python recently, I wanted to use python for simple monitoring after I finished the process.

The general idea is: Use python to write a script to analyze all URLs on the Web page to see if it contains daily links. Then, place the script in the crontab to run the scheduled task and run the check once every 10 minutes. If an illegal link is found, an alert is sent to the relevant personnel. There are about 100 lines of script code. It is easy to understand and paste the code.

I originally wanted to use beautifulsoup, but considering the trouble of installing the third-party library, I still use the built-in sgmllib, so I don't need to care about the library. The mail function is not implemented. You can implement the following based on your smtp server.

The Code is as follows:


#! /Usr/bin/env python
# Coding: UTF-8

Import urllib2
From sgmllib import SGMLParser
Import smtplib
Import time
# From email. mime. text import MIMEText
# From bs4 import BeautifulSoup
# Import re

Class UrlParser (SGMLParser ):
Urls = []
Def do_a (self, attrs ):
'''''Parse tag '''
For name, value in attrs:
If name = 'href ':
Self. urls. append (value)
Else:
Continue

Def do_link (self, attrs ):
'''''Parse tag link '''
For name, value in attrs:
If name = 'href ':
Self. urls. append (value );
Else:
Continue

Def checkUrl (checkurl, isDetail ):
''''' Check whether the webpage source code corresponding to the checkurl has an invalid url '''
Parser = UrlParser ()
Page = urllib2.urlopen (checkurl)
Content = page. read ()
# Content = unicode (content, "gb2312"). encode ("utf8 ")
Parser. feed (content)
Urls = parser. urls

DailyUrls = []
DetailUrl = ""
For url in urls:
If 'daily 'in url:
DailyUrls. append (url );
If not detailUrl and not isDetail and 'www .bc5u.com 'in url:
DetailUrl = url

Page. close ()
Parser. close ()

If isDetail:
Return dailyUrls
Else:
Return dailyUrls, detailUrl

Def sendMail ():
''''' Send reminder email '''
Pass

Def log (content ):
''''' Record execution log '''
LogFile = 'checkdailyurl. Log'
F = open (logFile, 'A ')
F. write (str (time. strftime ("% Y-% m-% d % X", time. localtime () + content + '\ n ')
F. flush ()
F. close ()

Def main ():
''''' Entry method '''
# Check ju
Url = "www.bc5u.com"

DailyUrls, detailUrl = checkUrl (url, False)
If dailyUrls:
# Check the daily link and send an alert email
SendMail ()
Log ('check: find daily url ')
Else:
# Do not check the daily link.
Log ('check: not find daily url ')

# Check judetail
DailyUrls = checkUrl (detailUrl, True)
If dailyUrls:
# Check the daily link and send an alert email
Log ('check: find daily url ')
SendMail ()
Else:
# Do not check the daily link.
Log ('check: not find daily url ')

If _ name _ = '_ main __':
Main ()

Contact Us

The content source of this page is from Internet, which doesn't represent Alibaba Cloud's opinion; products and services mentioned on that page don't have any relationship with Alibaba Cloud. If the content of the page makes you feel confusing, please write us an email, we will handle the problem within 5 days after receiving your email.

If you find any instances of plagiarism from the community, please send an email to: info-contact@alibabacloud.com and provide relevant evidence. A staff member will contact you within 5 working days.

A Free Trial That Lets You Build Big!

Start building with 50+ products and up to 12 months usage for Elastic Compute Service

  • Sales Support

    1 on 1 presale consultation

  • After-Sales Support

    24/7 Technical Support 6 Free Tickets per Quarter Faster Response

  • Alibaba Cloud offers highly flexible support services tailored to meet your exact needs.