Capturing the title of a web page from its URL: the title cannot be found for some websites. Is my method flawed? This post was last edited at 11:25:29 on January 11.
Capturing a title with curl
The code is written as shown below. I don't know whether there is a better method, so any advice is welcome.
It works for some websites, such as Baidu, but fails for others, such as the homepage of Pacific Automobile (www.pcauto.com.cn).
public function set_title()
{
    // Get the URL from the request
    $url = $_POST['URL'];
    // $url = "www.pcauto.com.cn"; // cannot be captured!

    // A series of curl settings
    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $url);
    curl_setopt($ch, CURLOPT_HEADER, 0);
    curl_setopt($ch, CURLOPT_ENCODING, 'gzip');
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    $content_source = curl_exec($ch);
    curl_close($ch);

    // Detect the encoding of the captured content
    $encode = mb_detect_encoding($content_source, array('gb2312', 'gbk', 'utf-8', 'ascii'));

    // Convert the content to UTF-8
    $content_source = iconv($encode, 'utf-8//IGNORE', $content_source);

    // Capture the title
    if (preg_match("/<title>(.*?)<\/title>/i", $content_source, $title)) {
        echo $title[1];
    } else {
        echo 'title pulling failed';
    }
}
Reply to discussion (solution)
<title>([\s\S]*?)<\/title>
The problem lies in the regular expression. You just need to add the s modifier:
if (preg_match("/<title>(.*?)<\/title>/is", $content_source, $title))
s: if this modifier is set, the dot metacharacter (.) in the pattern matches all characters, including line breaks. If it is not set, line breaks are excluded.
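To make the effect of the s modifier concrete, here is a minimal sketch. The `$html` string is a made-up sample (not taken from any real site) in which the title spans several lines, which is one situation where the pattern without s would fail. It also shows that a `[\s\S]` character class matches line breaks without needing the modifier.

```php
<?php
// Hypothetical sample page: the title text is split across several lines.
$html = "<html><head><title>\n  Example page\n</title></head><body></body></html>";

// Without the s modifier, "." does not match a line break, so nothing is found.
$without = preg_match('/<title>(.*?)<\/title>/i', $html, $m1);

// With the s modifier, "." also matches line breaks.
$with = preg_match('/<title>(.*?)<\/title>/is', $html, $m2);

// [\s\S] matches any character, so the s modifier is not required here.
$alt = preg_match('/<title>([\s\S]*?)<\/title>/i', $html, $m3);

echo "without s: ", $without, "\n";            // 0 (no match)
echo "with s: ", trim($m2[1]), "\n";           // Example page
echo "with [\\s\\S]: ", trim($m3[1]), "\n";    // Example page
```

The captured text keeps the surrounding line breaks and spaces, so calling `trim()` on it before printing is usually worthwhile.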
Regular expression modifiers
Thank you very much.