Hide specific articles from search engines
The source of the problem is this:
As we all know, as search engines improve they increasingly reject scraped and pseudo-original content. Baidu in particular has introduced its Origin algorithm and punishes scraper sites with measures such as de-indexing (the so-called "K station"). Once a site is labeled a scraper site, all the effort put into it may be wasted.
I believe many webmasters would like to publish more original content and not rely on collecting other people's articles. But on a new site, especially one run by an individual, content builds up very slowly, and we need to please readers, not just curry favor with search engines. If readers cannot find rich information on your site, their experience is bound to be poor. In fact, well-known established sites carry a considerable proportion of collected or adapted content, which is in keeping with the sharing spirit of the Internet. Major television stations and newspapers also reprint and excerpt; as long as the selection is good and meets the needs of a specific audience, it is valuable.
The key is: do not use collected articles to cheat search traffic for your own site. That should be in line with Internet ethics and consensus. Let only the original content take part in the search engine game, and shield the non-original part from search engines. This balances the interests of all three parties: search engines, site owners, and users.
So the question boils down to one point: how can part of a site's articles be blocked from search engines effectively and reliably?
I don't know how common this problem is. But for a website that wants to satisfy its audience with a wealth of articles, yet fears being judged a scraper site by search engines, this is a real, central problem that bears on the site's survival and development.
I have been studying this area recently. In my personal opinion, there are several ways to shield content from search engines:
First, with robots.txt.
Second, on a WordPress site, by detecting visitor characteristics (an idea I had after reading this blog post).
Third, by wrapping links in JS.
Fourth, through redirection, such as short links or PHP back-end redirects.
Comparing these approaches:
The first method: robots.txt is like a seal on the door: "Hey, spider, there is something here I don't want you to retrieve." This is a so-called gentlemen's agreement. A search engine certainly has the ability to look behind the seal on your door; it simply does not index what it finds. To judge whether a site holds a large amount of scraped content, a spider may well have a motive to peek.
This method has the lowest implementation cost and should cover most situations. Baidu seems trustworthy in this respect: for example, it does not index Taobao content, however much it might like to.
The further problem with this approach is:
On a site built with WordPress, how can part of the articles be blocked from search engines efficiently?
1, Mark the article title: for example, add a special character to the title of each such article. Is this feasible, and would "Disallow: *special mark*" in robots.txt work?
2, Identify articles by tag: at the operational level this seems the most convenient, but tags are dynamic, so can they be filtered in robots.txt at all?
3, Put the articles in a specific directory: the robots.txt rule is then relatively easy to write, but how can this be managed conveniently in WordPress content management?
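For the third option, here is a minimal sketch of one way it might be done, assuming the collected articles sit in one category and the permalink structure puts the category slug at the start of the URL (for example /reprints/some-title/). The helper name example_robots_rules and the slug reprints are hypothetical. WordPress serves a virtual robots.txt through the robots_txt filter only when no physical robots.txt file exists in the site root, so the result is worth checking at /robots.txt.

```php
<?php
// Hypothetical helper: appends a Disallow rule for one URL prefix
// to the robots.txt text that WordPress has generated so far.
function example_robots_rules($output, $prefix) {
    // Normalize "reprints", "/reprints" or "reprints/" to "/reprints/".
    $path = '/' . trim($prefix, '/') . '/';
    // Emit the rule in its own group so it does not depend on what precedes it.
    return rtrim($output) . "\n\nUser-agent: *\nDisallow: " . $path . "\n";
}

// Inside WordPress, attach the helper to the virtual robots.txt.
if (function_exists('add_filter')) {
    add_filter('robots_txt', function ($output, $public) {
        // "reprints" is an assumed category slug, not something from the article.
        return example_robots_rules($output, 'reprints');
    }, 10, 2);
}
```

With this in place, moving an article into or out of the assumed category is all the day-to-day management that is needed; robots.txt itself never has to be edited by hand.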
The second method is like checking the ID card of everyone who comes through the door: if the visitor is a search engine, entry is refused. This method is specific to WordPress, and its benefit is that it allows very fine-grained treatment. For example, Baidu is relatively strict about collected content while Google is not, so some articles can be closed to Baidu but open to Google. Another big advantage is that the check can be integrated into the WordPress environment, for example through plug-ins or themes, so the whole operation is automated.
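The per-engine treatment described here could be sketched roughly as follows. The function names are hypothetical, and the sketch assumes the crawlers identify themselves with the tokens Baiduspider and Googlebot in the User-Agent header, which a visitor can of course forge.

```php
<?php
// Hypothetical helper: returns 'baidu', 'google', or '' for any other visitor.
function example_detect_engine($useragent) {
    // Assumed User-Agent tokens for each engine's crawler.
    $engines = array(
        'baidu'  => 'Baiduspider',
        'google' => 'Googlebot',
    );
    foreach ($engines as $name => $token) {
        if (stripos($useragent, $token) !== false) {
            return $name;
        }
    }
    return '';
}

// True when the visitor is a crawler from one of the engines listed as blocked.
function example_is_blocked($useragent, array $blocked_engines) {
    $engine = example_detect_engine($useragent);
    return $engine !== '' && in_array($engine, $blocked_engines, true);
}
```

A per-article setting could then store a list such as array('baidu'), so that the article is closed to Baidu but remains open to Google and to ordinary readers.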
The third method is like changing the number on the door: search engines only know how to follow the number on the gate, while the browser uses JS to point to the other, correct entrance. However, search engines are getting better and better at analyzing JS, and judging from some of Google's statements, search engines also dislike it when the content shown to people differs from the content shown to them.
This method is widely used to hide Taobao affiliate links. I estimate it will not stay effective for long, and it is troublesome to operate. It suits separate static pages better than WordPress articles, which are organized in a database.
The fourth method is like adding a secret code to the door number: only when you knock (click) is it swapped for the correct number. An ordinary visitor will certainly click, while a search engine will not simulate the click.
This approach is relatively thorough and "safe", with these disadvantages:
1, Like the third method, it is somewhat complex to operate. It suits separate static pages or individual links within a page, and is not well suited to the WordPress environment.
2, Heavy use of redirection consumes server computing resources. If a large number of articles have to be redirected, the load adds up and the server may be overwhelmed.
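A PHP back-end redirect of the kind the fourth method refers to might look like the sketch below. Everything named here is an assumption for illustration: the query parameter go, the code a1, the target URL, and the helper example_resolve_short_link. Looking targets up in a fixed table keeps each redirect cheap and prevents the script from redirecting to arbitrary addresses.

```php
<?php
// Hypothetical helper: maps a short code to its target URL, or null if unknown.
function example_resolve_short_link($code, array $links) {
    return isset($links[$code]) ? $links[$code] : null;
}

// Inside WordPress, answer requests such as /?go=a1 with a redirect.
if (function_exists('add_action')) {
    add_action('template_redirect', function () {
        // Assumed lookup table; a real site might keep it in an option or post meta.
        $links = array('a1' => 'https://example.com/original-article');
        $code = isset($_GET['go']) ? (string) $_GET['go'] : '';
        $target = example_resolve_short_link($code, $links);
        if ($target !== null) {
            wp_redirect($target, 302); // temporary redirect to the real address
            exit;
        }
    });
}
```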
Implementation code
So how do you actually hide specific articles from search engines in WordPress? Without further ado, here is the PHP code. Put it in the functions.php of your current theme (save the file with UTF-8 encoding). Note that if your WordPress site has page caching turned on, this feature will not work.

// Add the option to the post and page edit screens
function ludouse_add_custom_box() {
  if (function_exists('add_meta_box')) {
    add_meta_box('ludou_allow_se', 'Search engine', 'ludou_allow_se', 'post', 'side', 'low');
    add_meta_box('ludou_allow_se', 'Search engine', 'ludou_allow_se', 'page', 'side', 'low');
  }
}
add_action('add_meta_boxes', 'ludouse_add_custom_box');

function ludou_allow_se() {
  global $post;

  // Add a nonce field for verification
  wp_nonce_field('ludou_allow_se', 'ludou_allow_se_nonce');

  $meta_value = get_post_meta($post->ID, 'ludou_allow_se', true);
  if ($meta_value)
    echo '<input name="ludou-allow-se" type="checkbox" checked="checked" value="1" /> Block search engines';
  else
    echo '<input name="ludou-allow-se" type="checkbox" value="1" /> Block search engines';
}

// Save the option
function ludouse_save_postdata($post_id) {
  // Make sure the nonce was submitted
  if (!isset($_POST['ludou_allow_se_nonce']))
    return $post_id;

  $nonce = $_POST['ludou_allow_se_nonce'];

  // Verify that the nonce is valid
  if (!wp_verify_nonce($nonce, 'ludou_allow_se'))
    return $post_id;

  // Skip autosaves
  if (defined('DOING_AUTOSAVE') && DOING_AUTOSAVE)
    return $post_id;

  // Verify user permissions
  if ('page' == $_POST['post_type']) {
    if (!current_user_can('edit_page', $post_id))
      return $post_id;
  }
  else {
    if (!current_user_can('edit_post', $post_id))
      return $post_id;
  }

  // Update the setting
  if (!empty($_POST['ludou-allow-se']))
    update_post_meta($post_id, 'ludou_allow_se', '1');
  else
    update_post_meta($post_id, 'ludou_allow_se', '0');
}
add_action('save_post', 'ludouse_save_postdata');

// For posts and pages set to disallow crawling,
// refuse search engines and return 404
function do_ludou_allow_se() {
  // This feature only applies to single posts and pages
  if (is_singular()) {
    global $post;
    $is_robots = 0;
    $ludou_allow_se = get_post_meta($post->ID, 'ludou_allow_se', true);

    if (!empty($ludou_allow_se)) {
      // Keywords used to recognize crawler user agents
      // This is a bit simple; optimize it yourself
      $bots = array('spider', 'bot', 'crawl', 'slurp', 'yahoo-blogs', 'Yandex',
        'Yeti', 'blogsearch', 'ia_archive', 'Google', 'Baidu');

      // The header may be missing, so fall back to an empty string
      $useragent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
      if (!empty($useragent)) {
        foreach ($bots as $lookfor) {
          if (stristr($useragent, $lookfor) !== false) {
            $is_robots = 1;
            break;
          }
        }
      }

      // If the current post/page disallows crawling, return 404
      // Of course you can change this to 403
      if ($is_robots) {
        status_header(404);
        exit;
      }
    }
  }
}
add_action('wp', 'do_ludou_allow_se');
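The keyword check on the user agent can be fooled in both directions, since any client may send any User-Agent string. One possible refinement is to confirm a claimed crawler by reverse DNS, sketched below. The helper names are hypothetical, the domain list passed in is an assumption that should be taken from each engine's own published verification guidance, and the lookup as written only handles IPv4 addresses.

```php
<?php
// Hypothetical helper: true if $host is a subdomain of one of the allowed domains.
function example_host_matches($host, array $domains) {
    $host = strtolower(rtrim($host, '.'));
    foreach ($domains as $domain) {
        $suffix = '.' . strtolower($domain);
        // Compare the tail of the host name, including the leading dot,
        // so that "fakegooglebot.com" does not match "googlebot.com".
        if (substr($host, -strlen($suffix)) === $suffix) {
            return true;
        }
    }
    return false;
}

// Reverse lookup of the IP, then a forward lookup to confirm the name maps back.
function example_verify_crawler_ip($ip, array $domains) {
    $host = gethostbyaddr($ip);
    if ($host === false || $host === $ip) {
        return false; // malformed address or no reverse record
    }
    return example_host_matches($host, $domains) && gethostbyname($host) === $ip;
}
```

Because DNS lookups are slow, a check like this would probably need to run only when the user agent already looks like a crawler, with the result cached per IP address.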
How to use
After adding the code above to the functions.php of your current theme, it is ready to use, and using it is foolproof. On the post and page edit screens in the WordPress admin, you will see a new checkbox in the lower part of the right-hand column.
If the current post/page should be closed to search engines, tick the box. Once it is ticked, a search engine that requests this post/page gets a 404 status and nothing else. If you would rather not return 404 to search engines, for fear that too many dead links will hurt SEO, you can change this part of the code:
status_header(404);
exit;
to:
echo "<meta name=\"robots\" content=\"noindex,noarchive\" />\n";
and then change:
add_action('wp', 'do_ludou_allow_se');
to:
add_action('wp_head', 'do_ludou_allow_se');
This adds a meta declaration directly to the head section of the page:
<meta name="robots" content="noindex,noarchive" />
It tells the search engine not to index this page and not to show a cached snapshot. Note that the header.php in your theme directory must call wp_head() for this to work.
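As a variation on the meta-tag approach, WordPress 5.7 and later expose a wp_robots filter that builds the robots meta tag, and it also relies on the theme calling wp_head() in header.php. The sketch below is only an illustration under that version assumption; the helper example_robots_directives is hypothetical, and it reuses the ludou_allow_se meta key from the code above. Unlike the user-agent check, it sends the same tag to every visitor, so it does not depend on recognizing crawlers.

```php
<?php
// Hypothetical helper: adds noindex/noarchive to a robots directives array.
function example_robots_directives(array $robots, $is_hidden) {
    if ($is_hidden) {
        $robots['noindex'] = true;
        $robots['noarchive'] = true;
    }
    return $robots;
}

// Inside WordPress 5.7+, apply it to posts and pages that have the box ticked.
if (function_exists('add_filter')) {
    add_filter('wp_robots', function ($robots) {
        // The stored value is '1' when ticked and '0' or empty otherwise.
        $hidden = is_singular()
            && (bool) get_post_meta(get_queried_object_id(), 'ludou_allow_se', true);
        return example_robots_directives($robots, $hidden);
    });
}
```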
Set up articles that only search engines can view
Some articles are published purely for SEO. How can WordPress be made to let only search engines crawl these articles, so that ordinary visitors cannot see them?
Implementation code
If your WordPress site does not have page caching turned on, this requirement is not hard to meet: we can reuse the code from the section above on hiding specific articles from search engines, with a few modifications. Add the following PHP code to the functions.php of your current theme and save the file with UTF-8 encoding:

// Add the option to the post and page edit screens
function ludouseo_add_custom_box() {
  add_meta_box('ludou_se_only', 'Search engine exclusive', 'ludou_se_only', 'post', 'side', 'low');
  add_meta_box('ludou_se_only', 'Search engine exclusive', 'ludou_se_only', 'page', 'side', 'low');
}
add_action('add_meta_boxes', 'ludouseo_add_custom_box');

function ludou_se_only() {
  global $post;

  // Add a nonce field for verification
  wp_nonce_field('ludou_se_only', 'ludou_se_only_nonce');

  $meta_value = get_post_meta($post->ID, 'ludou_se_only', true);
  if ($meta_value)
    echo '<input name="ludou-se-only" type="checkbox" checked="checked" value="1" /> Only allow search engines to view';
  else
    echo '<input name="ludou-se-only" type="checkbox" value="1" /> Only allow search engines to view';
}

// Save the option
function ludouseo_save_postdata($post_id) {
  // Make sure the nonce was submitted
  if (!isset($_POST['ludou_se_only_nonce']))
    return $post_id;

  $nonce = $_POST['ludou_se_only_nonce'];

  // Verify that the nonce is valid
  if (!wp_verify_nonce($nonce, 'ludou_se_only'))
    return $post_id;

  // Skip autosaves
  if (defined('DOING_AUTOSAVE') && DOING_AUTOSAVE)
    return $post_id;

  // Verify user permissions
  if ('page' == $_POST['post_type']) {
    if (!current_user_can('edit_page', $post_id))
      return $post_id;
  }
  else {
    if (!current_user_can('edit_post', $post_id))
      return $post_id;
  }

  // Update the setting
  if (!empty($_POST['ludou-se-only']))
    update_post_meta($post_id, 'ludou_se_only', '1');
  else
    delete_post_meta($post_id, 'ludou_se_only');
}
add_action('save_post', 'ludouseo_save_postdata');

function do_ludou_se_only() {
  // This feature only applies to single posts and pages
  if (is_singular()) {
    global $post;
    $is_robots = 0;
    $ludou_se_only = get_post_meta($post->ID, 'ludou_se_only', true);

    if (!empty($ludou_se_only)) {
      // Keywords used to recognize search engine user agents
      // This is a bit simple; optimize it yourself
      $bots = array('spider', 'bot', 'crawl', 'slurp', 'yahoo-blogs', 'Yandex',
        'Yeti', 'blogsearch', 'ia_archive', 'Google');

      // The header may be missing, so fall back to an empty string
      $useragent = isset($_SERVER['HTTP_USER_AGENT']) ? $_SERVER['HTTP_USER_AGENT'] : '';
      if (!empty($useragent)) {
        foreach ($bots as $lookfor) {
          if (stristr($useragent, $lookfor) !== false) {
            $is_robots = 1;
            break;
          }
        }
      }

      // If the visitor is not a search engine, show an error message
      // Logged-in users are not affected
      if (!$is_robots && !is_user_logged_in()) {
        wp_die('You do not have permission to view this article!');
      }
    }
  }
}
add_action('wp', 'do_ludou_se_only');
How to use
After adding the code above to the functions.php of your current theme, it is ready to use, and using it is foolproof. On the post and page edit screens in the WordPress admin, you will see a new checkbox in the lower part of the right-hand column.
If the current post/page should be visible only to search engines, tick the box. Once it is ticked, ordinary visitors who open this post/page see the error message "You do not have permission to view this article!" (search engines and logged-in users are unaffected).