Advanced Search Methods (Baidu Star 2007)

Source: Internet
Author: User

 

 

Baidu's Advanced Search Method (Baidu Star 2007, preliminary round)

Problem description:

Have you ever tried the site: and inurl: query syntax on Baidu? If not, give it a try.

For example, enter site:www.baidu.com inurl:news.

This returns all URLs on www.baidu.com that contain the substring "news".

Now we have two data files: site_inurl.txt and url.txt.

Each line of site_inurl.txt is a query string written in the site/inurl syntax, and url.txt holds a list of URLs.

Can you find all URLs in the list that would be retrieved by the query strings in site_inurl.txt?

For example, suppose site_inurl.txt contains:

site:www.baidu.com inurl:/more

site:zhidao.baidu.com inurl:/browse/

site:www.sina.com.cn inurl:www20041223am

url.txt contains the following content:

http://www.baidu.com/more/

http://www.baidu.com/guding/more.html

http://www.baidu.com/events/20060105/photomore.html

http://hi.baidu.com/browse/

http://hi.baidu.com/baidu/

http://www.sina.com.cn/head/www20021123am.shtml

http://www.sina.com.cn/head/www20041223am.shtml

Your program should then output:

http://www.baidu.com/more/

http://www.baidu.com/guding/more.html

http://www.sina.com.cn/head/www20041223am.shtml
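Each query line carries two fields, the host after site: and the keyword after inurl:. As a minimal sketch of how such a line could be split, the following uses a hypothetical Query struct and hypothetical trim and parseQuery helpers (none of these names come from the problem statement). It assumes each line contains both markers, with site: first, and it tolerates optional spaces after the colons.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper type: the two fields of one query line.
struct Query {
    std::string site;     // text after "site:"
    std::string keyword;  // text after "inurl:"
};

// Strip leading and trailing spaces, tabs, and line-ending characters.
std::string trim(const std::string& s) {
    const char* ws = " \t\r\n";
    std::string::size_type first = s.find_first_not_of(ws);
    if (first == std::string::npos) return "";
    std::string::size_type last = s.find_last_not_of(ws);
    return s.substr(first, last - first + 1);
}

// Split "site:<host> inurl:<keyword>" into its two fields.
// Returns false if either marker is missing or they are out of order.
bool parseQuery(const std::string& line, Query& out) {
    const std::string siteTag = "site:";
    const std::string urlTag = "inurl:";
    std::string::size_type s = line.find(siteTag);
    std::string::size_type u = line.find(urlTag);
    if (s == std::string::npos || u == std::string::npos || u < s) return false;
    std::string::size_type hostStart = s + siteTag.size();
    out.site = trim(line.substr(hostStart, u - hostStart));
    out.keyword = trim(line.substr(u + urlTag.size()));
    return true;
}
```

For the first sample line, parseQuery fills site with www.baidu.com and keyword with /more. A line without a site: marker, such as a plain URL, is rejected.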

The program takes the two file names on the command line. The first parameter is the name of the site_inurl file, and the second parameter is the name of the URL list file. Write the program's output to standard output.

 

The source code follows. The problem is fairly simple: read the contents of both input files and store them. When reading the site_inurl file, discard everything that is not needed for matching; this solution keeps only the keyword after inurl:. Once both files are loaded, compare each keyword against every URL, and print each URL that contains the keyword to standard output.

 

 

#include <fstream>
#include <iostream>
#include <string>
#include <vector>

using namespace std;

// Read the inurl: keyword of each query in file1 into input,
// and each URL in file2 into data.
void inputall(vector<string>& input, vector<string>& data,
              const char* file1, const char* file2)
{
    ifstream in(file1);
    ifstream store(file2);
    string line;
    const string tag = "inurl:";

    // std::getline on a string avoids a fixed-size line buffer.
    while (getline(in, line))
    {
        // Keep only the text after "inurl:"; the site: part is discarded.
        string::size_type pos = line.find(tag);
        if (pos != string::npos)
            input.push_back(line.substr(pos + tag.size()));
    }

    while (getline(store, line))
    {
        if (!line.empty())
            data.push_back(line);
    }
}

// Print every URL in data that contains str as a substring.
void getresult(const string& str, const vector<string>& data)
{
    for (vector<string>::size_type i = 0; i < data.size(); i++)
    {
        if (data[i].find(str) != string::npos)
            cout << data[i] << endl;
    }
}

// Run every query keyword against the full URL list.
void process(const vector<string>& input, const vector<string>& data)
{
    for (vector<string>::size_type i = 0; i < input.size(); i++)
        getresult(input[i], data);
}

int main(int argc, char* argv[])
{
    if (argc < 3)
    {
        cerr << "usage: " << argv[0] << " site_inurl_file url_file" << endl;
        return 1;
    }

    vector<string> input;
    vector<string> data;
    inputall(input, data, argv[1], argv[2]);
    process(input, data);
    return 0;
}
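The listing above matches on the inurl: keyword alone. For the sample data that would also accept http://hi.baidu.com/browse/, which the expected output excludes because its host is not zhidao.baidu.com. The following is a minimal sketch of a match that also checks the site: part. The names hostOf, matches, and search are hypothetical. It assumes the host must equal the site: value exactly, that URLs carry no port or user-info component, that the keyword may appear anywhere in the URL, and that all comparisons are case-sensitive.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Extract the host from a URL such as "http://www.baidu.com/more/".
// Assumes no port or user-info component appears in the URL.
std::string hostOf(const std::string& url) {
    std::string::size_type start = url.find("://");
    start = (start == std::string::npos) ? 0 : start + 3;
    std::string::size_type end = url.find('/', start);
    if (end == std::string::npos) end = url.size();
    return url.substr(start, end - start);
}

// A URL matches when its host equals site and the URL contains keyword.
bool matches(const std::string& site, const std::string& keyword,
             const std::string& url) {
    return hostOf(url) == site && url.find(keyword) != std::string::npos;
}

// For each (site, keyword) query in order, collect the matching URLs.
std::vector<std::string> search(
        const std::vector<std::pair<std::string, std::string> >& queries,
        const std::vector<std::string>& urls) {
    std::vector<std::string> hits;
    for (std::size_t q = 0; q < queries.size(); ++q)
        for (std::size_t u = 0; u < urls.size(); ++u)
            if (matches(queries[q].first, queries[q].second, urls[u]))
                hits.push_back(urls[u]);
    return hits;
}
```

With the three sample queries and seven sample URLs, search returns the three URLs of the expected output, in query order. The URL http://www.baidu.com/events/20060105/photomore.html is rejected because it contains "more" but not "/more".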

 
