Over the past two days I have been writing a collector (scraper) in Python. One of its modules converts HTML code to UBB. I could not find a ready-made routine online, so I wrote a function myself, which was also a chance to practice regular expressions.
import re

def Html2ubb(content):
    # Convert HTML tags to UBB tags
    pattern = re.compile(r'<a href="([\s\S]+?)"[^>]*>([\s\S]+?)</a>', re.I)
    content = pattern.sub(r'[url=\1]\2[/url]', content)
    pattern = re.compile(r'<img[^>]+src="([^"]+)"[^>]*>', re.I)
    content = pattern.sub(r'[img]\1[/img]', content)
    pattern = re.compile(r'<strong>([\s\S]+?)</strong>', re.I)
    content = pattern.sub(r'[b]\1[/b]', content)
    # The color value itself becomes the tag name, e.g. [red]text[/red]
    pattern = re.compile(r'<font color="([\s\S]+?)">([\s\S]+?)</font>', re.I)
    content = pattern.sub(r'[\1]\2[/\1]', content)
    # Strip any tags that are still left
    pattern = re.compile(r'<[^>]*?>', re.I)
    content = pattern.sub('', content)
    # Convert HTML escape sequences to ordinary characters
    content = content.replace('&lt;', '<')
    content = content.replace('&gt;', '>')
    content = content.replace('&rdquo;', '”')
    content = content.replace('&ldquo;', '“')
    content = content.replace('&quot;', '"')
    content = content.replace('&copy;', '©')
    content = content.replace('&reg;', '®')
    content = content.replace('&nbsp;', ' ')
    content = content.replace('&mdash;', '—')
    content = content.replace('&ndash;', '–')
    content = content.replace('&lsaquo;', '‹')
    content = content.replace('&rsaquo;', '›')
    content = content.replace('&hellip;', '…')
    # &amp; is handled last so that text such as &amp;lt; is not decoded twice
    content = content.replace('&amp;', '&')
    return content
To use it, call the Html2ubb function directly; the return value is the UBB code.
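As a rough illustration of such a call, here is a minimal, self-contained sketch. It is not the function above: `html_to_ubb_sketch`, `TAG_RULES`, and the sample string are hypothetical names introduced only for this example. The sketch keeps a few of the same tag rules in a table and, as a swapped-in simplification, decodes entities with the standard library's `html.unescape` (Python 3.4+) instead of a chain of `replace` calls.

```python
import re
from html import unescape

# Hypothetical rule table: (compiled pattern, UBB replacement).
# Only a subset of the tag rules is reproduced here for brevity.
TAG_RULES = [
    (re.compile(r'<a href="([\s\S]+?)"[^>]*>([\s\S]+?)</a>', re.I), r'[url=\1]\2[/url]'),
    (re.compile(r'<img[^>]+src="([^"]+)"[^>]*>', re.I), r'[img]\1[/img]'),
    (re.compile(r'<strong>([\s\S]+?)</strong>', re.I), r'[b]\1[/b]'),
    (re.compile(r'<[^>]*?>'), ''),  # strip whatever tags remain
]

def html_to_ubb_sketch(content):
    """Apply each tag rule in order, then decode HTML entities."""
    for pattern, replacement in TAG_RULES:
        content = pattern.sub(replacement, content)
    # Assumption: html.unescape stands in for the manual entity replacements.
    return unescape(content)

sample = ('<p><a href="http://example.com/" target="_blank">Example</a>'
          ' &amp; <strong>bold</strong></p>')
result = html_to_ubb_sketch(sample)
print(result)  # prints: [url=http://example.com/]Example[/url] & [b]bold[/b]
```

The order of the rules matters: the catch-all pattern that strips leftover tags has to run after the specific conversions, and entity decoding comes last so that a decoded `<` is never mistaken for the start of a tag.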