Baidu (BD) and Google (GG) will surely be replaced one day; of that there is no doubt. Those of us who have just built a site start thinking about the search engines about two weeks in. That is a good thing. After all, the search engine is the standard, and 99% of its judgments are correct: it rewards original content, inbound links, the number of citations, and traffic.
So can I build a search engine myself and sell bid rankings too? The answer is definitely no. Why? Look at what I went through, and if you still think you can, please go ahead.
Nutch is an open source search engine very similar to Google, and that is where I started. Like everyone else, I first wanted to make money online and then got carried away. I was no exception, so I built a search engine.
I did not scrape results from GG and BD; I crawled the pages myself. Sites like baigoogledu only feed Baidu, so do not build that kind of free site. Do not build a Chinese open directory either. DMOZ is a huge community, and you might as well join it and maintain one category for one region. There you are among brothers and sisters, and your questions get more responses. Keep in mind that when it comes to search engines, even Sina is not sure of success.
Let me tell this backwards and start with the problems.
I had not realized that storing this much data would need a cluster. At the same time, there was no traffic. How do you deal with that?
Traffic is king. Without traffic there is no passion, and that really hurts a webmaster who is about 3 weeks in. So I started to think about promotion. It was the usual: traffic-brushing software, mass email, QQ mass messaging, forum mass posting. In early 2006 this was easy to do and still effective, bringing in thousands of visits a day. But I soon found that all that sending had damaged the motherboard and hard drive of my new server. Hardware worth thousands of yuan was dead, and at that time 1,000 IPs of traffic cost only 5 yuan. Not worth it!
So buying traffic became the first thing to consider. I bought 100,000 visits a day and kept buying for 3 months. The passion was burning again, and I got a lot of tangible benefits. First, Baidu immediately indexed all the articles and links on my site, and my pages were not even static! In addition, for the same items, my product search results ranked basically right behind the HC website and ahead of a good number of large vertical search sites. The amount of indexed information grew greatly.
3 months later I bought a mid-page advertising slot on a forum, bbs.**516.com, for 1,500 yuan a month. The result was poor: 70 IPs a day. Disappointing! How much traffic would 1,500 yuan have bought? If I had spent the same money on traffic, my Alexa rank would have been around 170,000. Never mind that. In just 2 months my Alexa rank had already reached a staggering 220,000. Money was tight at the time, and I thought regional promotion would be stronger, so I put the budget into forum reply ads. It turned out I was wrong. This small mistake cost me the 200,000 ranking and wasted the 100,000 investment, because everyone else's ranking keeps moving forward!
Normally, reaching an Alexa rank of 100,000 after 6 months would still be good. I could have bought a ranking service directly, or simply kept the 100,000 visits of purchased traffic as support. But a small site is always limited by its funds. I wanted real IPs and made one wrong move. So my advice is this: for a real site, do not worry about the quality of the traffic. To sum up, big traffic is king. With traffic, your weight is high, your ranking is high, search engines index you quickly, and only then can you talk about the future!
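The forum ad comparison above can be worked through with the prices quoted in the text: 5 per 1,000 IPs for purchased traffic, and 1,500 per month for an ad slot that delivered 70 IPs a day, both in the same currency unit. The sketch below is only that arithmetic. The 30-day month and the function names are my own assumptions, not something the article states.

```python
# Figures quoted in the article; both prices are in the same currency unit.
PRICE_PER_1000_IPS = 5      # purchased traffic, price per 1,000 IPs
AD_COST_PER_MONTH = 1500    # forum ad slot, price per month
AD_IPS_PER_DAY = 70         # what the ad actually delivered
DAYS_PER_MONTH = 30         # assumption: a 30-day month


def ips_for_budget(budget, price_per_1000=PRICE_PER_1000_IPS):
    """How many purchased IPs a given budget buys."""
    return budget * 1000 // price_per_1000


def cost_per_1000_ips(total_cost, total_ips):
    """Effective price per 1,000 IPs for a campaign."""
    return total_cost * 1000 / total_ips


bought_ips = ips_for_budget(AD_COST_PER_MONTH)            # IPs the ad budget could have bought
ad_ips = AD_IPS_PER_DAY * DAYS_PER_MONTH                  # IPs the ad delivered in a month
ad_price = cost_per_1000_ips(AD_COST_PER_MONTH, ad_ips)   # effective ad price per 1,000 IPs

print(f"Same budget as purchased traffic: {bought_ips} IPs")
print(f"Forum ad delivered: {ad_ips} IPs at {ad_price:.2f} per 1,000")
```

On these numbers the ad budget would have bought 300,000 IPs, while the ad itself brought about 2,100. This says nothing about the quality of either kind of visit, only about volume.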
Well, I'm sure a lot of people want to know the technical details.
Nutch is a JSP application, so buy hosting that supports JSP; it is much cheaper now than it used to be. Then set up the database, the configuration files, and Apache, and you are ready.
The Chinese word handling you have to adjust and configure yourself; there are articles about Nutch around the web now. Crawling is time-consuming. Start the crawl, do not set the depth too deep, and 100 URLs is enough. After 24 hours, copy the large directory the crawl produced over the old one. Remember, resuming from a breakpoint is not supported, so you must run the spider on a stable network, or it dies. Once the spider is crawling at scale, you will understand why Baidu hates junk sites. It is that simple: junk data pours in like crazy, so your filtering strategy has to be good.
Inside, URLs are filtered with pattern rules, so study how to write them well. Otherwise a site like x.sina.com.cn will certainly crawl your spider to death.
As for the filters, design them well in Java first.
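To make the filtering idea concrete, here is a minimal sketch of an ordered accept/reject URL filter. It is written in Python purely for illustration. It is not Nutch's code, and a real Nutch filter would live in the Java side as the text says. The rule list, the depth limit, and the decision to reject x.sina.com.cn outright are all hypothetical choices of mine.

```python
import re
from urllib.parse import urlparse

# Hypothetical rules, checked in order; the first pattern that matches decides.
# "-" rejects the URL, "+" accepts it.
RULES = [
    ("-", re.compile(r"\.(gif|jpg|png|css|js|zip|exe)$", re.IGNORECASE)),  # non-page files
    ("-", re.compile(r"[?*!@=]")),                     # query strings and session-style URLs
    ("-", re.compile(r"^https?://x\.sina\.com\.cn/")), # a host known to trap the spider
    ("+", re.compile(r"^https?://")),                  # everything else over HTTP(S)
]

MAX_PATH_DEPTH = 6  # assumption: deeper paths are usually generated junk


def accept_url(url):
    """Return True if the spider should fetch this URL."""
    path = urlparse(url).path
    if path.count("/") > MAX_PATH_DEPTH:
        return False
    for sign, pattern in RULES:
        if pattern.search(url):
            return sign == "+"
    return False  # nothing matched: reject by default


if __name__ == "__main__":
    for candidate in ("http://example.com/news/1.html",
                      "http://x.sina.com.cn/a",
                      "http://example.com/list?page=2"):
        print(candidate, accept_url(candidate))
```

Putting the reject rules first and ending with a single broad accept rule keeps the default safe: anything the rules do not explicitly allow is dropped before it reaches the crawl queue.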
For Chinese, remember to do it yourself, because Chinese word segmentation is not supported: a word like "China" gets broken down into single characters. You have to do the segmentation yourself and load your own dictionary. Dictionaries are available in many places. It is best to trim the word list to about 65,000 entries; with that much data the results are acceptable.
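As a rough illustration of dictionary-based segmentation, the sketch below uses forward maximum matching: at each position it takes the longest dictionary word it can find and falls back to a single character otherwise. This is one common simple technique, not necessarily what the author used, and it is shown in Python rather than inside Nutch. The four-word sample dictionary is a stand-in for a real word list of about 65,000 entries.

```python
def load_dictionary(words):
    """Build a lookup set and record the longest word length."""
    dictionary = set(w for w in words if w)
    max_len = max((len(w) for w in dictionary), default=1)
    return dictionary, max_len


def segment(text, dictionary, max_len):
    """Forward maximum matching: prefer the longest dictionary word at each position."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)  # unknown characters fall through as single tokens
                i += size
                break
    return tokens


# Hypothetical sample entries; a real dictionary would be loaded from a file.
SAMPLE_WORDS = ["中国", "搜索", "引擎", "搜索引擎"]
dictionary, max_len = load_dictionary(SAMPLE_WORDS)

if __name__ == "__main__":
    print(segment("中国搜索引擎", dictionary, max_len))
```

With an empty dictionary the same function degrades to one token per character, which is the behavior the paragraph above complains about. The dictionary is what keeps "China" together as one word.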
Editor's note: Admin5 thanks you for your contribution. Debate leads to exchange, and exchange leads to progress. Friends are welcome to post their suggestions!