I want to extract '56845037' and '59120585' from the following HTML, which is stored in the variable html:
<div class="back fl"><a href="javascript:void(0);" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_shangyipian']);location.href='/u012582664/article/details/56845037';"><span><i class="fa fa-arrow-left"></i></span><em>安装最新版python</em></a></div><div class="forward fr"><a href="javascript:void(0);" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_xiayipian']);location.href='/u012582664/article/details/59120585';"><em>各种数据库的注释</em><span><i class="fa fa-arrow-right"></i></span></a></div>
I first tried a regular expression:
pattern_l = r'''<a href="javascript:void(0);" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_shangyipian']);location.href='(.+?)';">'''
re.findall(pattern_l, html)
This does not work: it returns an empty result. I also tried:
soup = BeautifulSoup(html, "lxml")
print(soup.find_all(onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_shangyipian']);location.href='/u012582664/article/details/(.+?)';"))
That also returns an empty result. How should this be written, and where is the problem?
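One likely cause is that the pattern uses characters such as ( ) [ ] and . as if they were literal text, while the regex engine treats them as metacharacters. For example, (0) becomes a group that matches "0", so the pattern looks for void0; rather than void(0);. The sketch below is a minimal illustration, not the original answer: it assumes the page uses plain straight quotes, uses a shortened copy of the HTML, and the helper name article_id is hypothetical.

```python
import re

# Shortened copy of the fragment from the question.
# Assumption: the real page uses straight single quotes, as typed here.
html = (
    '''<div class="back fl"><a href="javascript:void(0);" '''
    '''onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_shangyipian']);'''
    '''location.href='/u012582664/article/details/56845037';"><em>prev</em></a></div>'''
    '''<div class="forward fr"><a href="javascript:void(0);" '''
    '''onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_xiayipian']);'''
    '''location.href='/u012582664/article/details/59120585';"><em>next</em></a></div>'''
)

# The pattern from the question: ( ) [ ] . are unescaped, so it cannot
# match the literal text "void(0);" or "push([...])".
pattern_l = r'''<a href="javascript:void(0);" onclick="_gaq.push(['_trackEvent','function', 'onclick', 'blog_articles_shangyipian']);location.href='(.+?)';">'''
unescaped_result = re.findall(pattern_l, html)


def article_id(page, marker):
    """Return the numeric id at the end of the location.href that follows marker.

    Hypothetical helper: marker is the event label, e.g. 'blog_articles_shangyipian'.
    """
    # re.escape turns the literal part into a safe pattern; only the id is a group.
    literal = re.escape(marker + "']);location.href='")
    match = re.search(literal + r"[^']*/(\d+)'", page)
    return match.group(1) if match else None


previous_id = article_id(html, "blog_articles_shangyipian")
next_id = article_id(html, "blog_articles_xiayipian")
```

The BeautifulSoup attempt probably fails for a related reason: a plain string passed as onclick= is compared literally against the whole attribute value, so (.+?) is never interpreted as a regex. Passing a compiled pattern, for example onclick=re.compile(r"blog_articles_shangyipian"), should make find_all match by regex search instead, after which the id can be pulled out of the tag's onclick attribute with the same approach as above.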
Python string matching issues >> python
The answer is quite clear:
Http://www.goodpm.net/postreply/python/1010000008985846/python string matching problem. html