Sending a spider out to crawl 1,000 free popular images, as an invitation to explore vertical search

Source: Internet
Author: User
[Abstract]

I am very interested in vertical search and would like to study it in more depth together with the experts in this community, so I am sharing 1,000 popular images crawled by my spider (note: these are only meant to show what the spider software crawled; please do not redistribute them). Image search is just one specific application of vertical search. I don't need to explain this in detail, since you already know that its prospects go far beyond images. The crawler provided in this article is a restricted version (it can crawl only 1,000 popular images). The purpose is not the crawler or image search itself, but to attract experts to explore the field of vertical search.

[Preparation concept]

Vertical search is a specialized search engine for a particular industry. It is a subdivision and extension of general search engines: it integrates a particular kind of specialized information from the web, extracts the required data field by field, processes it, and returns it to the user in some form. It was proposed as a new search engine service model in response to the drawbacks of general search engines, such as the overwhelming amount of information, inaccurate queries, and insufficient depth. It provides valuable information and related services for a specific domain, a specific group of people, or a specific need. Its hallmarks are "specialized, refined, deep", with a strong industry focus. Compared with a general search engine, a vertical search engine is more focused, more specific, and more in-depth.

[View results first]

Install .NET Framework 3.5 or later on your machine (I won't cover the installation here; you should already know how), then download the following program and run it directly (it is a single EXE file):

 

If your browser does not show the download icon above, you can download the program directly from https://skydrive.live.com/?cid=35d7be189926747a&id=35d7be189926747a%211223.

The first run may be slow, depending on your network speed. If a folder appears next to the EXE file and it contains images, the program is running normally. That folder holds the vertical search results for popular images.

[Principles]

My knowledge here is limited, so I will only sketch the principle. The search starts from a known URL and then traverses every URL it discovers; that is, the spider finds new URLs by itself. (Of course, the spider must not fall into an endless loop, so it records the paths it has already crawled to avoid repeats.) Combined with multi-threaded concurrency, the set of discovered URLs keeps growing. On the vast Internet, there is effectively no end point.
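The traversal described above can be sketched in Python as follows. This is a minimal sketch, not the article's actual .NET program: the function names (`crawl`, `extract_links`), the `max_pages` and `workers` parameters, and the injected `fetch` callable are all assumptions made for illustration. The visited set is what keeps the spider out of endless loops, and a thread pool fetches each round of URLs concurrently.

```python
from concurrent.futures import ThreadPoolExecutor
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkParser(HTMLParser):
    """Collects the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(base_url, html):
    """Return absolute, fragment-free URLs linked from one page."""
    parser = LinkParser()
    parser.feed(html)
    return [urldefrag(urljoin(base_url, link)).url for link in parser.links]


def crawl(seed_url, fetch, max_pages=100, workers=4):
    """Breadth-first crawl from seed_url; fetch(url) must return HTML text.

    fetch is a hypothetical callable supplied by the caller, so the sketch
    makes no network requests of its own.
    """
    visited = {seed_url}   # every URL ever queued, to prevent loops
    frontier = [seed_url]  # URLs to fetch in the current round
    order = []             # URLs in the order they were fetched
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier and len(order) < max_pages:
            batch = frontier[: max_pages - len(order)]
            pages = list(pool.map(fetch, batch))  # fetch the batch concurrently
            order.extend(batch)
            next_frontier = []
            for url, html in zip(batch, pages):
                for link in extract_links(url, html):
                    if link not in visited:
                        visited.add(link)
                        next_frontier.append(link)
            frontier = next_frontier
    return order
```

In real use, `fetch` would wrap an HTTP client and handle errors and timeouts; here any exception raised by `fetch` simply propagates. The `max_pages` cap stands in for the "no end point" problem: without some limit, the frontier keeps growing.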

The above is mainly about discovering addresses on various servers, but how do we find data in a specific field, such as the pictures of beauties used in this article? This requires image recognition. An image is a binary file, and the spider cannot look at a picture and judge whether it shows a beautiful girl or an ugly one; it only sees the image's binary data, so an algorithm is needed to determine the image's content and format.
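Recognizing image content is a large topic, but identifying the format from binary data is simple enough to illustrate. The sketch below checks the well-known signature bytes at the start of JPEG, PNG, GIF, and BMP files. The function name `sniff_image_format` is hypothetical, and the article does not say how its own program performs this check.

```python
def sniff_image_format(data):
    """Guess an image format from the first bytes of a file, or return None."""
    signatures = [
        (b"\xff\xd8\xff", "jpeg"),       # JPEG start-of-image marker
        (b"\x89PNG\r\n\x1a\n", "png"),   # 8-byte PNG signature
        (b"GIF87a", "gif"),
        (b"GIF89a", "gif"),
        (b"BM", "bmp"),
    ]
    for magic, name in signatures:
        if data.startswith(magic):
            return name
    return None
```

A spider could apply this to the first few bytes of each download rather than trusting the URL's file extension, which is often missing or wrong.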

If the spider described in the first paragraph, which is simply sent out to crawl URLs, amounts to general search, then the spider in the second paragraph, which crawls URLs and also identifies the image data that meets the criteria, performs vertical search. What remains is to present the results to users in a sensible way, which completes the process. This program analyzes the pixel size of each image it finds: small images are filtered out and not saved. When two images have the same file name, it also checks whether their content is the same, and renames the new one if the content differs. In short, you don't have to worry about downloading lots of small images or about renaming files yourself. If you can, build a client UI to present the results to the user, which would be even better.
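The filtering and renaming rules above could look roughly like this in Python. This is a sketch under several assumptions: it reads dimensions from PNG headers only (other formats would need their own parsers), the 200-pixel thresholds are made-up defaults, and the names `png_size` and `save_image` and the `_1`, `_2` suffix scheme are hypothetical rather than taken from the article's program.

```python
import hashlib
import struct
from pathlib import Path


def png_size(data):
    """Return (width, height) from a PNG header, or None if data is not a PNG."""
    if len(data) < 24 or data[:8] != b"\x89PNG\r\n\x1a\n" or data[12:16] != b"IHDR":
        return None
    return struct.unpack(">II", data[16:24])


def digest(data):
    """Content fingerprint used to decide whether two files are identical."""
    return hashlib.sha256(data).hexdigest()


def save_image(folder, name, data, min_width=200, min_height=200):
    """Save data under folder/name; return the path written, or None if skipped."""
    size = png_size(data)
    if size is None or size[0] < min_width or size[1] < min_height:
        return None  # unreadable or too small: filter it out
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / name
    counter = 1
    while target.exists():
        if digest(target.read_bytes()) == digest(data):
            return None  # same name, same content: a duplicate, skip it
        # same name, different content: try "stem_1.ext", "stem_2.ext", ...
        target = folder / f"{Path(name).stem}_{counter}{Path(name).suffix}"
        counter += 1
    target.write_bytes(data)
    return target
```

Comparing content hashes rather than names alone is what lets the spider keep two different pictures that both happen to be called, say, `1.png`, while still discarding a true duplicate.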

 

[Conclusion]

Because I was short on time, some of the text may not be entirely clear, but the overall flow is: [Spider] -> [data] -> [client presentation]. Recently I used a WP7 student account to upload the demo program "Spider crawler" to the Microsoft Marketplace, and I am very happy that it passed testing without any bugs. Next, I want to write more vertical search products that capture data in specific fields, such as jewelry market prices, mobile phone sales, and hot topics on Weibo. Finally, do you have more advanced insights on vertical search? I have offered my rough ideas in the hope of drawing out better ones, so please don't hold back. Your opinions and suggestions are welcome.

 

[Related]

Spider crawler: a tool for capturing high-resolution images from the web [zspider.Net]

 
