Extracting a page's keywords and description with regular expressions is a bit of a pain

Source: Internet
Author: User
Situation One:

<meta name="description" content="Wall Street Bond (bond.wswire.com) is the world's first Bond website, providing you with the fastest and most professional bond information in the global bond market and all-weather bond financing, bond rating and quote services, Wall Street bonds cover the exchange bond market, the interbank bond market, interbank lending and the open market, among other aspects of bond information services. Wall Street bonds bring together a number of top professional institutions to analyze research reports, two times a day of accurate data analysis, and illustrated market reports.">
<meta name="keywords" content="Wall Street, telecommunications, Wall Street telecommunications, global bonds, Treasuries, bonds, bond markets, corporate bonds, debt, convertible bonds, repurchase, repurchase, redemption, bond announcements, interest rates, financial debt, central bank, short-term financing vouchers, Bookkeeping Treasury, monetary policy, finance, exchange rate, notes, open market, stable income, public debt, counter transaction, Interbank bond market, inter-bank borrowing, bond information, financing debt, bond financing, bond rating, Interbank market, exchange market, overseas market, central bank notes">


Situation Two:

<meta name=keywords content="Microwave oven using high-power low energy consumption and more energy saving (picture), environmental knowledge,,, microwave oven,,, high fire,,, energy-saving,,, power,,">
<meta name=description Content="Microwave oven using high-power low energy consumption and more energy saving (figure)">



Note: the attribute names may differ in case (content vs. Content), and the name and content attributes do not always appear in the same order.

I tried writing the patterns myself, but they only match some pages and I cannot tell what the problem is. Could an expert please take a look? Thanks in advance!

Keywords:

1. preg_match("/<meta[\s]+name=['\"]keywords['\"] content=['\"](.*)['\"]/isu", $this->tmphtml, $inarr);
2. preg_match("/<meta[\s]+content=['\"](.*)['\"] name=['\"]keywords['\"]/isu", $this->tmphtml, $inarr2);

Description:

1. preg_match("/<meta[\s]+name=['\"]description['\"] content=['\"](.*)['\"]/isu", $this->tmphtml, $inarr);
2. preg_match("/<meta[\s]+content=['\"](.*)['\"] name=['\"]description['\"]/isu", $this->tmphtml, $inarr2);

Result: some pages match, and some do not.


Replies (solutions)

Oh yes, I forgot to mention that some pages look like this:
<meta name=keywords content="Microwave oven using high-power low energy consumption and more energy saving (picture), environmental knowledge,,, microwave oven,,, high fire,,, energy-saving,,, power,,">

<meta name=description content="Microwave oven using high-power low energy consumption and more energy saving (figure)">

Here the name values keywords and description are not wrapped in quotes, so my patterns fail to match. I hope someone can help me improve them, ideally with a tested solution.
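
To make the failure concrete, here is a minimal sketch (the sample string and variable names are made up for illustration, and the patterns are simplified from the ones in the question). It compares a pattern that insists on quotes with one that treats the quote as optional and uses a backreference so the closing quote must match the opening one:

```php
<?php
// Hypothetical sample: the name value is bare, the content value is quoted.
$html = '<meta name=keywords content="microwave oven, energy-saving, power">';

// Strict: a quote is required on both sides of keywords.
$strict = '/<meta\s+name=[\'"]keywords[\'"]\s+content=[\'"](.*?)[\'"]/is';

// Relaxed: (["\']?) captures an optional quote; \1 and \2 require the
// same character (or nothing) on the closing side.
$relaxed = '/<meta\s+name=(["\']?)keywords\1\s+content=(["\']?)(.*?)\2\s*>/is';

$strictHit  = preg_match($strict, $html, $a);
$relaxedHit = preg_match($relaxed, $html, $b);

var_dump($strictHit);   // 0: the bare name value defeats the strict pattern
var_dump($relaxedHit);  // 1: the relaxed pattern matches
echo $b[3], "\n";       // the captured content value
```

This still assumes name comes before content; the reversed order needs a second pattern or a different approach.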

Isn't there a get_meta_tags function?

+1 for get_meta_tags. It returns an array of the page's meta tags, and you can then pick out the ones you need.
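
A minimal sketch of that suggestion, assuming the page has been saved locally (the sample HTML and the temporary file are only for illustration; get_meta_tags() takes a filename or URL, not an HTML string). The function lowercases the name values it uses as array keys and stops reading at </head>:

```php
<?php
// Hypothetical sample page: one tag with a bare name value, one with
// content before name and a capitalised attribute name.
$html = <<<HTML
<html><head>
<meta name=keywords content="microwave oven, energy-saving, power">
<meta Content="Microwave ovens: high power, low energy use" name="description">
</head><body></body></html>
HTML;

// Write the sample to a temporary file so get_meta_tags() can read it.
$path = tempnam(sys_get_temp_dir(), 'meta');
file_put_contents($path, $html);

$tags = get_meta_tags($path);
unlink($path);

// Missing tags simply have no key, so fall back to null.
$keywords    = $tags['keywords'] ?? null;
$description = $tags['description'] ?? null;

echo $keywords, "\n", $description, "\n";
```

For a live page you would pass the URL directly, provided allow_url_fopen is enabled.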

Ha, how embarrassing. Thank you, Foolbirdflyfirst and Yangball!

name first:

<meta(\s)name=(\'|\"|)keywords(\'|\"|)(\s*)content=(\'|\"|)(.*)(\'|\"|)(\s*)>
<meta(\s)name=(\'|\"|)keywords(\'|\"|)(\s*)content=(\'|\"|)|(\'|\"|)(\s*)>
<meta(\s)name=(\'|\"|)description(\'|\"|)(\s*)content=(\'|\"|)(.*)(\'|\"|)(\s*)>
<meta(\s)name=(\'|\"|)description(\'|\"|)(\s*)content=(\'|\"|)|(\'|\"|)(\s*)>


name last:

<meta(\s)content=(\'|\"|)(.*)(\'|\"|)(\s*)name=(\'|\"|)keywords(\'|\"|)(\s*)>
<meta(\s)content=(\'|\"|)|(\'|\"|)(\s*)name=(\'|\"|)keywords(\'|\"|)(\s*)>
<meta(\s)content=(\'|\"|)(.*)(\'|\"|)(\s*)name=(\'|\"|)description(\'|\"|)(\s*)>
<meta(\s)content=(\'|\"|)|(\'|\"|)(\s*)name=(\'|\"|)description(\'|\"|)(\s*)>

Building on the reply above, here is a more general version that also allows for differences in case:
name first:

<(\s*)(meta|META|Meta)(\s*)(name|NAME|Name)=(\'|\"|)(keywords|KEYWORDS|Keywords)(\'|\"|)(\s*)(content|CONTENT|Content)=(\'|\"|)(.*)(\'|\"|)(\s*)>
<(\s*)(meta|META|Meta)(\s*)(name|NAME|Name)=(\'|\"|)(keywords|KEYWORDS|Keywords)(\'|\"|)(\s*)(content|CONTENT|Content)=(\'|\"|)|(\'|\"|)(\s*)>
<(\s*)(meta|META|Meta)(\s*)(name|NAME|Name)=(\'|\"|)(description|DESCRIPTION|Description)(\'|\"|)(\s*)(content|CONTENT|Content)=(\'|\"|)(.*)(\'|\"|)(\s*)>
<(\s*)(meta|META|Meta)(\s*)(name|NAME|Name)=(\'|\"|)(description|DESCRIPTION|Description)(\'|\"|)(\s*)(content|CONTENT|Content)=(\'|\"|)|(\'|\"|)(\s*)>

name last:

<(\s*)(meta|META|Meta)(\s*)(content|CONTENT|Content)=(\'|\"|)(.*)(\'|\"|)(\s*)(name|NAME|Name)=(\'|\"|)(keywords|KEYWORDS|Keywords)(\'|\"|)(\s*)>
<(\s*)(meta|META|Meta)(\s*)(content|CONTENT|Content)=(\'|\"|)|(\'|\"|)(\s*)(name|NAME|Name)=(\'|\"|)(keywords|KEYWORDS|Keywords)(\'|\"|)(\s*)>
<(\s*)(meta|META|Meta)(\s*)(content|CONTENT|Content)=(\'|\"|)(.*)(\'|\"|)(\s*)(name|NAME|Name)=(\'|\"|)(description|DESCRIPTION|Description)(\'|\"|)(\s*)>
<(\s*)(meta|META|Meta)(\s*)(content|CONTENT|Content)=(\'|\"|)|(\'|\"|)(\s*)(name|NAME|Name)=(\'|\"|)(description|DESCRIPTION|Description)(\'|\"|)(\s*)>
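
Listing every order and case combination grows quickly. As an alternative, here is a rough sketch (not the approach from the replies above; metaContent is a hypothetical helper name) that first grabs each meta tag and then reads its attributes one by one, so attribute order, case, and quoting style no longer need separate patterns:

```php
<?php
// Hypothetical helper: returns the content value of the meta tag whose
// name attribute equals $name (compared without regard to case), or null.
function metaContent(string $html, string $name): ?string
{
    // Step 1: collect every <meta ...> tag, ignoring case.
    if (!preg_match_all('/<meta\b[^>]*>/i', $html, $tags)) {
        return null;
    }
    foreach ($tags[0] as $tag) {
        // Step 2: read attr=value pairs. The branch-reset group (?|...)
        // puts the value in group 2 whether it is double-quoted,
        // single-quoted, or bare.
        preg_match_all(
            '/([a-z][\w:-]*)\s*=\s*(?|"([^"]*)"|\'([^\']*)\'|([^\s"\'>]+))/i',
            $tag,
            $pairs,
            PREG_SET_ORDER
        );
        $attrs = [];
        foreach ($pairs as $pair) {
            $attrs[strtolower($pair[1])] = $pair[2];
        }
        if (isset($attrs['name'], $attrs['content'])
            && strcasecmp($attrs['name'], $name) === 0) {
            return trim($attrs['content']);
        }
    }
    return null;
}

// Usage with made-up sample tags: bare name value, then reversed order
// with single quotes and upper-case attribute names.
$html = '<meta name=keywords content="microwave oven, energy-saving">'
      . "<META Content='High power, low energy use' NAME='description'>";

echo metaContent($html, 'keywords'), "\n";
echo metaContent($html, 'description'), "\n";
```

One known limitation of this sketch: a literal > inside a quoted content value would cut the tag short in step 1.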
