I have been learning Python recently and still find it unfamiliar. Plenty of functions that are trivial in PHP take half a day to track down in Python, and quite often you end up implementing them yourself.
Today I was working on a content scraper and needed to filter the tags out of the collected content. It took an afternoon, but I finally got it working, and testing shows it gives the expected result. Without further ado, here is the code:
from html.parser import HTMLParser


def strip_tags(html, save=None):
    """Strip HTML tags from html, keeping only the tags listed in save."""
    result = []  # output fragments: text plus any kept tags
    start = []   # stack of kept tags that are currently open

    def starttag(tag, attrs):
        if tag not in save:
            return
        start.append(tag)
        # attrs is a list of (name, value) pairs; value is None for bare attributes
        parts = [name if value is None else name + '="' + value + '"'
                 for name, value in attrs]
        attr_text = ' ' + ' '.join(parts) if parts else ''
        result.append('<' + tag + attr_text + '>')

    def endtag(tag):
        # Emit a closing tag only when it matches the most recently kept opening tag
        if start and tag == start[-1]:
            start.pop()
            result.append('</' + tag + '>')

    parser = HTMLParser()
    parser.handle_data = result.append
    if save:
        parser.handle_starttag = starttag
        parser.handle_endtag = endtag
    parser.feed(html)
    parser.close()

    # Trim leading and trailing newlines from each fragment and drop empty ones
    data = [piece.strip('\n') for piece in result]
    return ''.join(piece for piece in data if piece)
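The attribute handling in the start-tag handler depends on how HTMLParser reports attributes: handle_starttag receives them as a list of (name, value) tuples, and the value is None when an attribute is written without one. Here is a small sketch for inspecting that. The class name AttrLogger and the sample markup are only illustrative assumptions, not part of the function above.

```python
from html.parser import HTMLParser


class AttrLogger(HTMLParser):
    """Hypothetical helper: records what handle_starttag receives."""

    def __init__(self):
        super().__init__()
        self.seen = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) tuples; value is None for bare attributes
        self.seen.append((tag, attrs))


logger = AttrLogger()
logger.feed('<a target="_blank" href="http://live.500.com/">x</a><input disabled>')
logger.close()
print(logger.seen)
# [('a', [('target', '_blank'), ('href', 'http://live.500.com/')]), ('input', [('disabled', None)])]
```

This is why a kept tag has to be rebuilt by hand from its name and attribute pairs: the parser hands over the parsed pieces, not the original markup.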
Usage:
result = strip_tags("""<a target="_blank" title="soccer livescore Live" href="http://live.500.com/">Soccer livescore Live</a> <a target="_blank" title="competition color Soccer" href="http://zx.500.com/jczq/">Competition football</a><a target="_blank" title="Basketball competition" href="http://zx.500.com/jclq/">Basketball</a></div>"><p>play Snake seven inch, North Single 7 string 1. Since <a target="_blank" title="Beijing single Field" href="http://zx.500.com/zqdc/">Beijing single Field</a>SP value calculation rules and competitive color, 4 strings 1 and below betting to buy more cost-effective, and a 7-string bet of more than 1 is likely to be taxed, but not worth it. According to the calculation, Beijing single-field 4 strings from 1 to 7 strings between 1 betting the most cost-effective.</p>""", ['p', 'img'])
print(result)
Output:
The occurrence of the anti-virus Soccer livescore Live Competition football Basketball<p>play Snake seven inch, North Single 7 string 1. Since Beijing single Field SP value calculation rules and competitive color, 4 strings 1 and below betting to buy more cost-effective, and a 7-string bet of more than 1 is likely to be taxed, but not worth it. According to the calculation, Beijing single-field 4 strings from 1 to 7 strings between 1 betting the most cost-effective.</p>
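Only the tags named in the list passed as the second argument are kept, here <p> and <img>; every other tag, including <a>, is removed and only its text remains.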
That is a Python implementation of a PHP-like strip_tags function, with a customizable list of tags to keep.
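The same idea can also be written as an HTMLParser subclass instead of assigning handlers onto a parser instance. This is only a rough sketch under my own assumptions, not a drop-in replacement: the names TagStripper and strip_tags_sketch are made up for illustration, attribute values are re-escaped with html.escape, and closing tags are emitted for any kept tag without tracking nesting.

```python
from html import escape
from html.parser import HTMLParser


class TagStripper(HTMLParser):
    """Hypothetical sketch: keeps text plus the tags named in keep."""

    def __init__(self, keep=()):
        super().__init__(convert_charrefs=True)
        self.keep = {tag.lower() for tag in keep}
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag not in self.keep:
            return
        # Rebuild the attribute text; a None value means a bare attribute
        rendered = ''.join(
            ' ' + name if value is None
            else ' {}="{}"'.format(name, escape(value, quote=True))
            for name, value in attrs
        )
        self.parts.append('<' + tag + rendered + '>')

    def handle_startendtag(self, tag, attrs):
        # Self-closing form such as <br/>: emit the opening tag only
        self.handle_starttag(tag, attrs)

    def handle_endtag(self, tag):
        if tag in self.keep:
            self.parts.append('</' + tag + '>')

    def handle_data(self, data):
        self.parts.append(data)


def strip_tags_sketch(html, keep=()):
    """Return html with every tag removed except those listed in keep."""
    parser = TagStripper(keep)
    parser.feed(html)
    parser.close()
    return ''.join(parser.parts)


sample = '<div><a href="http://zx.500.com/zqdc/">Beijing</a> <p class="x">text</p></div>'
print(strip_tags_sketch(sample, keep=['p']))   # Beijing <p class="x">text</p>
print(strip_tags_sketch('<b>hi</b> there'))    # hi there
```

Calling it with no keep list behaves like a plain strip of every tag, which is the closest match to calling PHP's strip_tags with only the string argument.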