A friend of mine is angry because his original article was reprinted by others. The anger is justified, not unnecessary as some people claim: when your original content is reprinted elsewhere, your own article may end up penalized by search engines, and your site may not survive.
Take one of your own original blog sites as an example. Because it is a small site that is updated slowly, search engines crawl and refresh it relatively infrequently, so an original article you publish today will not necessarily be indexed within a day or two. If a large, high-authority site copies your article the same day you publish it, then, because search engines refresh that site frequently and give it a high weight, possibly crawling and indexing it every day, the search engine will conclude that the article originated there. In addition, some sites have no scruples: when they reprint, they do not credit the original author or link to the original address, so numerous further reprinting sites name them as the original author and point the original-link address at their site. This, too, will certainly lead the search engine to identify that site as the article's originator.
Having one or two articles reprinted is no big deal. But if another site sets out to poison yours by plagiarizing an entire column or even the entire site, the consequences are serious: the search engine may eventually misjudge your site as a spam site and even hit it with a devastating penalty. This is what is usually called "K station".
Large sites that plagiarize other people's content this way are contemptible, and their webmasters do it knowingly, because they are after traffic and advertising income. In this situation I suggest turning to legal and similar means to deal with these large-scale spam webmasters. I know of no other effective remedy.
How search engines determine the original:
One. Original: simply put, content published on the web for the first time.
Two. False original: original content that is republished after a second or nth round of modification, for example with a revised title, an added summary, or only part of the content reprinted.
How does a search engine make a decision about originality?
Generally speaking, it decides based on the following factors:
1, the snapshot date.
2, the date the spider crawled the page.
3, the number of external links pointing to the page.
4, the degree to which the article has been modified.
For example: suppose an article titled "How search engines determine whether your content is original" is published for the first time on a blog or website at 10 o'clock today. What happens?
The search engine spider comes to this blog or website, finds the page, analyzes its content, puts it into the database, and marks it as first discovered: this must be the original!
The indexing process involves several details that affect this decision:
1. Necessary conditions
If this site is not indexed, will this article be considered original?
-Of course not! It cannot appear in the search database at all!
--So how does it become original content?
--The first condition: the website must be indexed by search engines.
What if the website is indexed but not updated often?
-Simple: even if the site is not updated often, a published article will still be considered original once it is eventually indexed.
3. Reprinting and Scraping
--What if the article was reprinted?
If the article is reprinted, it depends on which is faster: the update cycle of the site that reprinted it, or the update cycle of the site that first published it.
-I don't quite understand the update cycle.
For example: site A publishes and site B reprints. If the spider visits site A first and finds the article, then comes to site B and finds it there, the original credit obviously goes to site A.
Does the same apply to scraping?
-Yes, it does. If B scrapes A but B is indexed earlier than A, B may become the original!
4. Access time
--What if the spider visits site B first?
Of course the credit goes to site B; that is generally how it works!
What if the article reprinted on site B carries a link to the original article page on site A?
-That case is clear too: right after indexing, when the two results rank side by side, site B may still sit slightly ahead.
Of course, as the article is reprinted more times and site A collects more links, site A's article gains the advantage, and the ranking will gradually put site A in front.
--What if the reprinted copies link to site B's page instead?
That is amusing, a joke played on the search engine; but if it cannot judge well, the matter becomes a link-popularity contest.
However, if both pages have many external links and the difference between them is not large, the rule should return to the starting point: whoever is indexed first is the original.
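The rule of thumb described in this section can be sketched in Python. This is only a toy illustration of the reasoning above, not how any real search engine works: the `Copy` record, its fields, and the `link_gap` threshold are hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Copy:
    site: str
    first_crawled: datetime  # when the spider first fetched this copy (hypothetical field)
    inbound_links: int       # external links pointing at this copy (hypothetical field)


def pick_original(a: Copy, b: Copy, link_gap: float = 2.0) -> str:
    """Return the site this toy rule would credit as the original.

    link_gap is an assumed threshold: how many times more links one
    copy needs before links alone decide the question.
    """
    fewer, more = sorted((a, b), key=lambda c: c.inbound_links)
    # A clear lead in external links turns it into a link-popularity contest.
    if more.inbound_links >= link_gap * max(fewer.inbound_links, 1):
        return more.site
    # Otherwise fall back to the starting rule: first crawled wins.
    return min((a, b), key=lambda c: c.first_crawled).site
```

With similar link counts the copy that was crawled first is credited, even if it is the reprint; only a large lead in links changes the outcome.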
5. Snapshot Date
--The page whose snapshot shows the earliest date is generally the original!
--Not necessarily. This holds within one update cycle: for example, within a week of publication, the address with the earlier snapshot is more likely to be recognized as the original.
But if the article has been published for several months, the search engine may well have re-crawled it by then, and the snapshot date will have changed!
--Are there other possibilities?
-There are. Typically, with Baidu for example, indexing may first collect pages into its database and only show the content in search results after filtering. Problems can arise in between: say site A publishes first and site B reprints, and the spider visits site A first and then site B. The engine may still release site B's result first while site A's page is still sitting in the database.
So the fact that a search engine does not show a page does not mean its spider has not visited the content. The content may already be recorded in the search engine's store but simply not released at the time you check. For instance, content that only appears on the 25th may carry a snapshot dated the 20th: that is the engine's stored copy, and this stored time is the core reference point for judging originality.
This situation generally arises between a new site and an old one: site A publishes and site B reprints, but the search engine's trust in site A is not high. Still, as long as site A is visited first, the original credit remains with A. This is the hardest case to tell apart, because we do not know which site the spider visited first, unless you have the server logs of both websites and can see when the search engine accessed each of the two pages.
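If you do have the access logs of both sites, the comparison can be sketched as follows. This is a minimal example that assumes the server writes the common "combined" log format and that the spider identifies itself in the user-agent string; the function name and the default spider name are illustrative choices, and a user agent can be spoofed, so treat the result as a hint rather than proof.

```python
import re
from datetime import datetime
from typing import Iterable, Optional

# Combined log format (assumed):
# host ident user [time] "METHOD path proto" status size "referer" "agent"
LOG_LINE = re.compile(
    r'\[(?P<time>[^\]]+)\] "(?:GET|HEAD) (?P<path>\S+) [^"]*" '
    r'\d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def first_spider_visit(lines: Iterable[str], path: str,
                       spider: str = "Baiduspider") -> Optional[datetime]:
    """Return the earliest time the named spider requested `path`, or None."""
    earliest = None
    for line in lines:
        m = LOG_LINE.search(line)
        if m is None or m.group("path") != path:
            continue
        if spider.lower() not in m.group("agent").lower():
            continue  # an ordinary visitor or a different crawler
        seen = datetime.strptime(m.group("time"), "%d/%b/%Y:%H:%M:%S %z")
        if earliest is None or seen < earliest:
            earliest = seen
    return earliest
```

Run it once over each site's log for the article's path and compare the two timestamps: the earlier one shows which copy the spider saw first.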
6. False original
-Will a false original also be considered original?
Most of the time, a search engine spider has roughly the intelligence of a three-year-old child and cannot clearly tell these things apart, because its thinking is too mechanical. If you change the title and patch up the body of the article, the spider will find it hard to tell whether the article has already been indexed. It may detect that part of the content is duplicated, but it cannot conclude on that basis alone that the article is a reprint! Of course, as search engine programs improve, something like a similarity measure should emerge, for example treating a text as a reprint once its similarity to another exceeds a certain percentage.
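To make the idea of a similarity percentage concrete, here is a small Python sketch. It uses word shingles and Jaccard similarity, a common textbook technique chosen here for illustration; nothing above says that any particular search engine uses it, and the 0.5 threshold is an arbitrary assumption.

```python
def shingles(text: str, k: int = 3) -> set:
    """Split text into overlapping k-word sequences (shingles)."""
    words = text.lower().split()
    if len(words) < k:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(a: str, b: str, k: int = 3) -> float:
    """Jaccard similarity of the two shingle sets, from 0.0 to 1.0."""
    sa, sb = shingles(a, k), shingles(b, k)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def looks_reprinted(a: str, b: str, threshold: float = 0.5) -> bool:
    """Flag b as a likely reprint of a; the threshold is an assumed value."""
    return similarity(a, b) >= threshold
```

Under this measure, changing only the title or adding a short summary leaves most shingles shared, so the copy still scores high; rewriting every sentence drives the score toward zero.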
With this analysis, I trust you now understand. This is just Walnut's own view; I hope everyone takes from it what they need, and those who disagree are welcome to offer their own opinions!
In addition, a few suggestions:
1, If your site is new and its weight is not high, how do you get the spider to find you and put your pages into the database? It is actually very simple: use web-clipping (social bookmarking) services and Baidu's bookmarking tool so that spiders find your pages faster!
2, A suggestion for everyone: add your own copyright notice and the address of the content page to each article. When others scrape you, this works in your favor: your page may not be indexed quickly, but in the end you have more links, and yours is still the original content.
3, Wait until an article has been indexed on your own site before publishing it on other sites, and include your original address when you do. This method is very safe! Copies on big sites are very likely to be indexed!