I submitted a question on Stack Overflow just now.
I wanted to extract content from a web page that contains Unicode characters represented in the form "%xx". Since I was using the Perl module LWP to fetch the page, I naturally tried to handle these characters with a Perl regex, as below.
my $html = "%20%26%40 ";
$html =~ s#%([0-9a-f]+)#\x{\1}#ig;
print "$html\n";
But the above code doesn't work; it outputs nothing but "00". I'm stuck now... any hint would be appreciated.
Some people replied very quickly. Below are their answers.
Perl has functions built in for this already, in the URI::Escape module. You don't need to mess with regular expressions.
use URI::Escape;
my $encode = uri_unescape($string);
See this page for more
Funny and ugly code:
my $html = "%20%26%40 ";
$html =~ s#%([0-9a-f]{2})#"chr(0x$1)"#igee;
print "$html\n";
Edit: (I'm obliged to say) This code is maybe cute, but do not use it in production! (There are cases where it does not work.)
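For comparison, here is a minimal sketch of the same decoding with a single /e modifier and no string eval, using only core Perl. My original attempt most likely failed because the \x{...} escape is resolved when the replacement string is parsed, before the capture is available. The helper name decode_percent is my own and does not come from either answer. The result is a string of bytes, so a UTF-8 page would presumably still need a separate decoding step afterwards.

```perl
use strict;
use warnings;

# Hypothetical helper (the name is not from the answers above).
# hex() turns the two captured hex digits into a number and chr() turns
# that number into a character. A single /e evaluates the replacement
# as Perl code once, so the captured text is never eval'ed as a string.
sub decode_percent {
    my ($text) = @_;
    $text =~ s/%([0-9A-Fa-f]{2})/chr(hex($1))/ge;
    return $text;
}

my $html = "%20%26%40";
print "[", decode_percent($html), "]\n";    # prints "[ &@]"
```

A "%" that is not followed by two hex digits is left untouched, which seems the safer default for arbitrary page content.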
You can follow the whole discussion here: http://stackoverflow.com/questions/12144401/how-can-convert-character-xx-in-html-using-perl.
I should say Stack Overflow is indeed a great place for technical people :-)