A search engine developed by a three-year-old child

Source: Internet
Author: User
Three-year-olds: This idea is naive.

Note:
QQ 273942617: This article may be reprinted, but do not modify it, errors included.
* When reprinting, please give the direct link to the forum post. Do not add links of any kind within the text of the article.
* Please state the name of the poster in this forum.

Concept for a procurement and engineering search engine
Working names: Procurement search, Chinese procurement, Chinese product and procurement, and product procurement. I have not decided which one to use, and the domain name is not easy to settle either.

Procurement and engineering search
(Yellow Pages, products, procurement, sales, engineering)

Purpose:
1. Collect and organize enterprise yellow pages. I am not satisfied with any existing yellow pages, including New *, Search *, and chinayp (strangely, this website is visited by a large number of Americans, which is worth noting). In summary, current yellow pages are incomplete, copied from one another, and not dynamically updated, and they offer no useful information beyond contact details. (Their weak points are good news for this project.)
2. Use a search over existing search engines to obtain the links of the various enterprises, then crawl those websites with a spider (there is no need to start from scratch, and the results are more comprehensive than Google, Baidu, or Yahoo alone, because they are the combined result of all of them).
3. Query basic enterprise information and collect enterprise-related information:
Enterprise -> information
Enterprise -> product
Product -> enterprise
Product -> sales (secondary, because repeated information produces too much garbage)
Product -> purchase (secondary, because repeated information produces too much garbage)
Enterprise -> counter (exhibition stand): redundant if the aim is simple, practical procurement. My goal is a dedicated search engine, not B2B. As far as I know, raw-material procurement seldom happens on Ali *** except for trade procurement.
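The enterprise/product relations listed above could be held in a two-way index. The following is only a minimal sketch: the class name `EnterpriseProductIndex`, its methods, and the sample names "Factory A" and "Factory B" are made up for illustration and are not part of the original plan.

```python
from collections import defaultdict


class EnterpriseProductIndex:
    """Two-way lookup between enterprises and products (illustrative only)."""

    def __init__(self):
        self.products_of = defaultdict(set)     # enterprise -> products
        self.enterprises_of = defaultdict(set)  # product -> enterprises

    def add(self, enterprise, product):
        # Record the relation in both directions so either side can be queried.
        self.products_of[enterprise].add(product)
        self.enterprises_of[product].add(enterprise)

    def products(self, enterprise):
        # .get avoids creating empty entries for unknown enterprises.
        return sorted(self.products_of.get(enterprise, ()))

    def enterprises(self, product):
        return sorted(self.enterprises_of.get(product, ()))


# Hypothetical sample data.
index = EnterpriseProductIndex()
index.add("Factory A", "silicon film")
index.add("Factory A", "mica film")
index.add("Factory B", "silicon film")
```

Here `index.enterprises("silicon film")` gives the horizontal comparison (all enterprises offering one product), while `index.products("Factory A")` gives the product list of one enterprise.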

The intention of this website is:
Results as rich as, or richer than, those of Google, Baidu, and Yahoo, but filtered and reordered mainly by enterprise, industry, and product characteristics. Ranking follows relevance to the customer's needs, not PageRank, which says little about whether a page belongs to the manufacturer you are looking for.
Software Platform
Java/Windows or .NET for collection, sorting, and aggregation. This part is meant to run on PCs, so I do not want to use Linux here: other people's computers can be enlisted to use their idle CPU time (after official launch, of course, dedicated machines would still be used). Tasks are assigned by a master server through web services, or taken on by decentralized servers.
Java + Linux + PHP + MySQL for accepting queries.
Core steps
1. Use the "Enterprise source" module to collect a complete list of enterprises (legal person, registered capital, and registration ID number)
2. Analyze, deduplicate, and generate the latest, most complete, dynamic, and accurate yellow pages through the "Basic Enterprise Information sorting" module (dynamic verification, multi-point correction).
I want complete information, unlike the yellow pages criticized above, where a search may not even return a manufacturer's address.
3. The "Enterprise -> product" module collects a complete list of each enterprise's products.
4. The "Product -> enterprise" module provides horizontal comparison.
5. The "Industry Classification" module carefully classifies industrial and commercial enterprises. This needs to be done in advance and is assisted by manual work.
6. A list of industry keywords, with weights adjusted according to current popularity (sampling is being considered). To be professional, the site must at least look professional and show some understanding of the industry, rather than just producing a list. To put it bluntly, my idea is a bit more like content than like search.
7. A list of industry-segment (product-level) keywords, with weights adjusted according to current popularity (sampling is being considered).
8. Four levels: region, industry, enterprise, and product.
9. Website classification (easy to say, hard to do; it relies on probability and statistics):
A) Official enterprise website
B) portal website
C) product marketing network
D) Academic Research
E) occasionally mention this keyword
F) spam text
G) In more detail: manufacturing (service), circulation, design, sales (purchase), standards, academic, news, occasional mentions, spam, questions and answers, discussion groups and BBS.
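Step 2 above (deduplicating basic enterprise information) could start from something like the sketch below. It assumes each collected record is a dictionary carrying the registration ID number from step 1. The field names (`registration_id`, `name`, `phone`) and the rule "a later non-empty value overrides an earlier one" are assumptions for illustration, not the original design.

```python
def merge_records(records):
    """Merge records that share a registration ID.

    Assumption: records arrive oldest first, so a later non-empty
    value replaces an earlier one for the same field.
    """
    merged = {}
    for record in records:
        key = record["registration_id"].strip()
        target = merged.setdefault(key, {})
        for field, value in record.items():
            if value in (None, ""):
                continue  # never overwrite known data with an empty value
            target[field] = value.strip() if isinstance(value, str) else value
    return list(merged.values())


# Hypothetical sample: the same enterprise collected from two sources.
records = [
    {"registration_id": "001 ", "name": "Factory A", "phone": ""},
    {"registration_id": "001", "name": "Factory A", "phone": "555-0100"},
    {"registration_id": "002", "name": "Factory B", "phone": "555-0101"},
]
unique = merge_records(records)
```

Real yellow-page data would also need fuzzy matching on names and addresses when the registration ID is missing; that part is not covered here.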
Hardware
Preliminary design: three end-user-facing query servers running Linux, with load balancing to distribute queries.
The target is five, with 10 million regular users and a peak of 5000 million users.
Offline analysis: computers are in short supply. I can only gather 5-10 PCs running Linux, with 10 x 100 GB hard drives.
(The required capacity is still unknown, but I have decided not to store any garbage. Garbage wastes disk space and my electricity.)

Region range:
The design should consider the whole world from the start, while initially supporting only Chinese. The country and the language of the webpage text must nevertheless be accounted for in the database design.
If foreign-language-to-Chinese conversion can be provided, it would help users who cannot read the foreign language to browse foreign information. This may mean machine translation. Many parts are foreign-made, so it is better to also show an abstract of the original text.

Promotion Method:
Promotion will mainly take place in forums where target users are concentrated, because many people gather there. I will also write articles for procurement websites and magazines. Such articles can be widely reprinted and stay in people's favorites for a long time.
Promotion targets: engineering personnel (development), procurement personnel, and business personnel. The aim is for more people to know about the website each year, to visit it on their own initiative, and to come back and use it again (easy for a netizen to say?).

Difficulties:
There are too few target customers. Promotion is difficult because there are too many similar websites, and it is very hard to stand out from the crowd. User stickiness to Google and Baidu is too high. Still, this design is well targeted: it can solve at least one major problem, and I have investigated extensively. The key issues are, first, that many people doubt the design can achieve the expected results and, second, that I doubt the promotion will work.

Summing up the situation of similar websites, there are two points: technology is the key, and promotion is the enemy.

I have recently finished a project at work and would like to try this, but I do not know whether it is worth it. Criticism is welcome.

========================= Author resume
This is an email to a professional purchaser.

My current idea is to create a vertical search engine named <enterprise product and procurement Search>, which integrates "yellow page numbers" with "enterprise and product" information.

The specific ideas are as follows:
1. Retrieve "yellow pages" and product information from yellow pages and business websites around the world through vertical search.
Background: because companies and factories now deal with many telecom providers, their phone numbers change often. The biggest drawbacks of the current yellow pages (the largest in China being those of China Telecom and China Netcom) are inaccuracy (because customers switch providers) and slow updates (because many parties are involved and phone numbers are frequently transferred and changed).
2. Provide clustered information about enterprises at the company level and the factory level, where other yellow pages provide only phone numbers.
That is, current yellow pages websites provide only the address, phone number, contact person, and zip code of an enterprise, and nothing else.
Generally, looking for a supplier begins with a product or service requirement, for example looking for accessories or an OEM, or directly purchasing finished products for trade. At the start, the requirement is only a vague idea.

I want to classify the external attributes of an enterprise, such as products, design capability, manufacturing capability, customers, suppliers, plant area, number of employees, quality certification, and environmental certification.
Then I form keywords from "Enterprise Name" + "attribute" and submit the query to a general search engine, which returns, say, 100,000 results. I take these links and dispatch a web spider to fetch the pages, then display the keywords related to a given type of attribute in bold (or red).
Next, I submit a query to the general search engine using the "Product Name" keyword, which again returns, say, 100,000 results. I take these links and send a web spider to fetch the pages, again displaying the keywords related to a given type of attribute in bold (or red). Of course, this process involves complex computation, mainly scoring how well each webpage matches the industry keywords and the product-characteristic keywords.
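The query-building and highlighting steps might be sketched as follows. This is only an illustration under stated assumptions: the function names `build_queries` and `highlight`, the quoting of the enterprise name, and the use of HTML `<b>` tags for bold display are all choices made here, not part of the original plan. Submitting the queries and running the spider are left out.

```python
import re


def build_queries(enterprise_name, attributes):
    """Combine an enterprise name with each attribute keyword.

    Assumption: the general search engine accepts a quoted phrase
    followed by free keywords.
    """
    return [f'"{enterprise_name}" {attribute}' for attribute in attributes]


def highlight(text, keywords):
    """Wrap every keyword occurrence in <b>...</b>, ignoring case."""
    if not keywords:
        return text
    # Try longer keywords first so a short keyword cannot split a longer one.
    ordered = sorted(keywords, key=len, reverse=True)
    pattern = "|".join(re.escape(keyword) for keyword in ordered)
    return re.sub(pattern, lambda match: f"<b>{match.group(0)}</b>",
                  text, flags=re.IGNORECASE)


# Hypothetical sample values.
queries = build_queries("Factory A", ["quality certification", "number of employees"])
marked = highlight("ISO 9001 quality certification obtained",
                   ["quality certification", "ISO 9001"])
```

The same `highlight` function would serve for the "Product Name" queries, with the keyword list swapped for product-characteristic keywords.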

Webpage sources are then classified further, for example as "enterprise website", "business website", "news website", "industry website", "design and development website", "product standard website", and so on.
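As a very rough stand-in for this classification, a page could be scored against cue words for each source type. The cue words below and the function name `classify_page` are invented for illustration. The plan itself calls for probability and statistics, which a keyword count like this only approximates.

```python
# Hypothetical cue words per source type; a real system would learn these.
CATEGORY_CUES = {
    "enterprise website": ["about us", "our factory", "contact us"],
    "business website": ["supply", "price", "inquiry"],
    "news website": ["reported", "news"],
    "product standard website": ["standard", "specification"],
}


def classify_page(text, cues=CATEGORY_CUES, default="other"):
    """Return the category whose cue words occur most often in the text."""
    lowered = text.lower()
    scores = {
        category: sum(lowered.count(word) for word in words)
        for category, words in cues.items()
    }
    best = max(scores, key=scores.get)  # ties go to the first category listed
    return best if scores[best] > 0 else default
```

For example, `classify_page("About us: our factory produces silicon film. Contact us today.")` picks "enterprise website", and a page with no cue words falls back to "other".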

For example, for "insulating silicon film", Google returns 64,200 results. The useful information we need may be buried, because nobody can view 10,000 webpages in one day.

If you follow my method, the final result is displayed:
1. Manufacturer [30]
Guangzhou richun Electronic Products Factory
Http://www.cps800.com/products/22908.htm
Guangzhou richun Electronic Products Factory is a research and development and production manufacturer focusing on thermal insulation materials for High-Power Electronic Products. All products comply with the EU RoHS environmental protection requirements. Our factory can provide the following products and services:
Supply of silicon film, silicon film, silicon tape, silicone cloth, soft silicon pad, insulation grain, Mica film, silicon rubber cap sleeve, thermal insulation silicon rubber sleeve tube, silicone terminal sleeve, wheat pull (PET), PVC/PC insulation piece, power cord buckle, PCB spacing column, capton (polyimide film ).

Other manufacturers ....
2. Business Information [1, 2000]
3. Industry websites [20]
4. Product Standards [20]
5. Academic Articles [100]
Shenzhen aoce Technology Co., Ltd.
Http://www.dianyuan.com/sale/d/44/57727.html
It is also known as oil paste. It uses special silicone oil as the base oil and new metal oxides as filler, combined with various functional additives and processed with special techniques into a white paste. It has good thermal conductivity, temperature resistance, and stable performance, making it an ideal medium for heat-dissipating devices. It releases no corrosive gas in use and has no effect on the metals it touches. High-purity filler ensures smoothness, uniformity, and high-temperature insulation. Applied to the mating surfaces of power devices and radiators, it helps eliminate air gaps at the contact surface, increases heat flow, reduces thermal resistance, lowers the operating temperature of power devices, improves reliability, and extends service life.

Other ....
6. other websites with low relevance [5000]

In this way, users can choose as needed. For example, if you look at the manufacturer directly, the number has been reduced to 30, which is easy to find. If you look at the product standards, you can also go directly to the target website.
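The grouped display with per-category counts could be produced along these lines. This is a minimal sketch: the category names are taken from the example listing, while the function names `group_results` and `render` and the `(category, title)` input format are assumptions.

```python
from collections import defaultdict

# Category order as in the example listing.
CATEGORY_ORDER = [
    "Manufacturer",
    "Business Information",
    "Industry websites",
    "Product Standards",
    "Academic Articles",
    "Other websites with low relevance",
]


def group_results(results):
    """Group (category, title) pairs by category."""
    groups = defaultdict(list)
    for category, title in results:
        groups[category].append(title)
    return groups


def render(groups, order=CATEGORY_ORDER):
    """Produce one numbered heading per category with its result count."""
    lines = []
    for number, category in enumerate(order, start=1):
        count = len(groups.get(category, []))
        lines.append(f"{number}. {category} [{count}]")
    return "\n".join(lines)


# Hypothetical sample results that have already been classified.
sample = [("Manufacturer", "A"), ("Manufacturer", "B"), ("Product Standards", "S")]
listing = render(group_results(sample))
```

A user who only cares about manufacturers then opens the first heading and ignores the rest.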

By comparison, Google returns websites with high PageRank. Among the six categories above, business information (sales) websites with low relevance may flood the results, because they have many links or use a word such as USB very frequently.
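One way to picture ranking by relevance rather than by links is to score each page against a weighted list of industry keywords. The weights, the sample pages, and the function names `relevance` and `rerank` below are made up for illustration, and the scoring is a plain weighted occurrence count, not a tuned formula.

```python
def relevance(text, keyword_weights):
    """Sum of weight x occurrence count over the industry keywords."""
    lowered = text.lower()
    return sum(
        weight * lowered.count(keyword.lower())
        for keyword, weight in keyword_weights.items()
    )


def rerank(pages, keyword_weights):
    """Order pages by keyword relevance, highest first, ignoring link counts."""
    return sorted(
        pages,
        key=lambda page: relevance(page["text"], keyword_weights),
        reverse=True,
    )


# Hypothetical weights and pages.
weights = {"silicon film": 3.0, "insulation": 1.0}
pages = [
    {"url": "sales", "text": "USB USB USB cheap USB cables"},
    {"url": "maker", "text": "We manufacture silicon film for insulation"},
]
ordered = rerank(pages, weights)
```

The page that repeats "USB" scores zero here, however many links point to it, so the manufacturer's page comes first.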

======================================
What I want to ask now is:
1. Is this pre-classified webpage solution desirable?
2. How should the classification be done for procurement personnel, sales personnel (selling to enterprises), development engineers and technicians, and business personnel?
That is, by what criteria would a purchaser want all the webpages related to an enterprise or a product to be classified?
3. As a professional, can you give me any suggestions?
 
Chen

 
