How to trace an error to its root cause, using a problem encountered during a SharePoint crawl as the example

Source: Internet
Author: User

A crawl error occurred a while ago. According to the crawl log, almost none of the content in one site collection had been crawled.

Later analysis showed that this was a SharePoint bug...

Of course, it is not a major bug, and it is easy to avoid once you know about it. I also posted the bug on Weibo. Here I mainly want to show how to work back from an error to its specific cause, as an example of teaching people to fish rather than handing them a fish.

(If the error is in your own code, simply attach a debugger to the process. That is not covered in this article.)

Two kinds of tools are involved. One is a log viewer (I use the most primitive one, Notepad... you can also use Microsoft's log viewer from the SharePoint admin toolkit). The other is a decompiler (I used JustDecompile, a free decompiler with an amusing slogan: "May the source be with you").

 

The process of finding this bug was as follows:

Symptom: after a full crawl, the content of one site collection cannot be found in search (this site collection was created using code).

 

Step 1: if search returns nothing, go straight to the crawl log of the search service in Central Administration. There you can see that an error occurred while crawling this site collection:

 

Step 2: at this point, search for the problem. (Yes, do not try to dig out the root cause yourself from the very beginning, unless you are sure that the error is caused by your own program or you already have a rough idea of the cause. Sensible use of a search engine is definitely the efficient way.) Searching in English is recommended, and this is where running the English version of SharePoint pays off. Most of us use the Chinese version, though, so first try translating the Chinese message back into English. As long as the translation is not too far off and the error is mentioned somewhere on the internet, a search will usually lead you to the exact English wording. (Another method is to look up the error string in the SharePoint resource files, but that is time-consuming and may not turn anything up...) Unfortunately, for this problem I found nothing...

 

Step 3: check the SharePoint logs. Being lazy, I usually look in the Event Viewer first, since some problems are recorded there. Unfortunately, this one was not. Next come the log files under 14\LOGS. I leave them for second because the logs are scattered across many text files, which makes them inconvenient to search, and the format is not very readable (Microsoft's log viewer, to be honest, is still quite slow...).

If you are not using a ready-made tool, you first need to find the right log file, which is fairly simple. If the error has just occurred, open the newest log file; otherwise, use the files' modification times to work out which file is likely to contain the error.
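The file-picking rule above can be sketched in a few lines of Python. This is only an illustration, not something the author used: the function name `candidate_log_files` is made up, and it assumes that the log files end in `.log` and that a file last written before the error cannot contain it.

```python
from datetime import datetime
from pathlib import Path


def candidate_log_files(log_dir, error_time=None):
    """Return .log files in log_dir, most likely candidate first.

    error_time is an optional datetime (local time) of the error.
    """
    files = sorted(Path(log_dir).glob("*.log"), key=lambda p: p.stat().st_mtime)
    if error_time is None:
        # The error just happened: look at the newest file first.
        return files[::-1]
    cutoff = error_time.timestamp()
    # Assumption: a file whose last write is older than the error
    # cannot hold it, so keep only files modified at or after that time.
    # The oldest survivor is the most likely match.
    return [p for p in files if p.stat().st_mtime >= cutoff]
```

Calling `candidate_log_files(r"...\14\LOGS")` with no time lists the newest file first; passing the time shown in the crawl log narrows the list to the files worth opening.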

 

Step 4: find the specific error. Open the log file with Notepad or any other text viewer (the log is plain text, after all). There are only a few ways to locate the error:

The crudest is to look for entries by the time of the error. However, if the logging level is set to a detailed level, there will be hundreds of entries within a single minute, which makes the error hard to spot;

The most precise is to search by correlation ID (the Chinese version of SharePoint shows a translated name for it). It usually appears on the typical SharePoint error page:

If there is no correlation ID, as in the case described in this article, you can search directly for the error message, for example "Data is Null. This method or property cannot be called on Null values." If an exception is involved, the log records the complete error message and call stack, so you can find the problem. For this problem, searching for the message finds the exception entry together with its call stack.
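As a rough illustration of the message search described above (the author simply used Notepad's find), here is a small Python sketch. The function name `find_by_message` and the sample lines are made up. It assumes each log entry is one tab-separated line whose last field is the correlation ID, which matches the article's later remark that the GUID sits at the end of each entry.

```python
def find_by_message(lines, text):
    """Return (line_number, correlation_id, line) for entries containing text.

    The match is case-insensitive. The correlation ID is taken from the
    last tab-separated field (an assumption about the log layout).
    """
    needle = text.lower()
    hits = []
    for number, line in enumerate(lines, start=1):
        clean = line.rstrip("\r\n")
        if needle in clean.lower():
            fields = clean.split("\t")
            correlation = fields[-1].strip() if len(fields) > 1 else ""
            hits.append((number, correlation, clean))
    return hits


# Made-up sample entries, only to show the shape of the input.
sample = [
    "10:00:00\tw3wp.exe\tGeneral\tMedium\tSomething ordinary\taaaaaaaa-0000-0000-0000-000000000001",
    "10:00:01\tw3wp.exe\tGeneral\tHigh\tData is Null.\taaaaaaaa-0000-0000-0000-000000000002",
]
hits = find_by_message(sample, "data is null")
```

Each hit carries the correlation ID of the matching entry, which is the handle needed to follow the same request through the rest of the log.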

 

Step 5: search again (yes, you read that correctly), because we now have more specific error information. My usual approach is to search directly for the name of the method that failed. If the error is common, it is easy to find (or you may only find someone who hit the same problem and never got an answer, haha). This search, however, still did not turn up the cause. (I no longer remember whether I simply searched poorly at the time; I still cannot find anything now...)

 

Step 6: locate the root cause. From the log above, we can see that the call stack contains none of our own code, so the problem lies in SharePoint's code. Looking back through earlier entries with the same correlation ID (the GUID at the end of each log entry), you can see the starting point of the error:

It shows that the exception is thrown when the SiteData web service is called.
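Tracing a request back by its correlation ID can be sketched as follows. This is a hedged Python illustration, not the author's method: `trace_request` is a made-up name, and the code assumes one tab-separated entry per line with the correlation ID as the last field.

```python
def trace_request(lines, correlation_id):
    """Return (line_number, line) for every entry with this correlation ID.

    Entries come back in file order, so the first one is the starting
    point of the request and the later ones lead up to the exception.
    """
    wanted = correlation_id.strip().lower()
    trail = []
    for number, line in enumerate(lines, start=1):
        clean = line.rstrip("\r\n")
        # Assumption: the correlation ID is the last tab-separated field.
        if clean.split("\t")[-1].strip().lower() == wanted:
            trail.append((number, clean))
    return trail


# Made-up entries: two requests interleaved in one file.
sample = [
    "10:00:00\tstart of request\tid-1",
    "10:00:00\tunrelated entry\tid-2",
    "10:00:01\texception thrown\tID-1 ",
]
trail = trace_request(sample, "id-1")
```

Reading `trail[0]` gives the earliest entry of the request, which is where the article finds the call into the web service.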

So how do we find out why the exception occurs? Since the log gives the complete call stack, we only need to find the method that failed, and from there we can probably guess the cause.

As the log above shows, the last SharePoint method in the stack before the error is Microsoft.SharePoint.SoapServer.SiteDataImpl.GetSiteGroupsXml. Combined with the starting point found earlier, we can conclude that this method is called inside the SiteData web service.
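Picking that method out of a call stack can also be done mechanically. The Python sketch below is an illustration under assumptions: it expects the usual .NET stack layout ("at Namespace.Class.Method(...)", innermost frame first), the helper names are made up, and the sample stack is reconstructed from the article's description rather than copied from the real log.

```python
import re

# One frame looks like "at Namespace.Class.Method(arguments)".
FRAME = re.compile(r"\bat\s+([\w.`<>]+)\(")


def frames(stack_text):
    """Return the fully qualified method names, innermost frame first."""
    return FRAME.findall(stack_text)


def first_frame_in(stack_text, namespace_prefix):
    """Return the frame closest to the exception that belongs to the namespace."""
    for frame in frames(stack_text):
        if frame.startswith(namespace_prefix):
            return frame
    return None


# Illustrative stack only; argument lists are elided with "...".
stack = (
    "   at System.Data.SqlClient.SqlBuffer.get_String()"
    "   at System.Data.SqlClient.SqlDataReader.GetString(Int32 i)"
    "   at Microsoft.SharePoint.SoapServer.SiteDataImpl.GetSiteGroupsXml(...)"
)
suspect = first_frame_in(stack, "Microsoft.SharePoint.")
```

Here `suspect` names the SharePoint method that made the failing database call, which is the method worth decompiling.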

 

Step 7: narrow it down further. It is time to bring out the ultimate weapon: the decompiler. To see the specific reason, we need to go deep into SharePoint's source code. The good news is that almost all SharePoint 2010 DLLs are unobfuscated and easy to read.

But which file should be decompiled?

This requires some understanding of how SharePoint is laid out. You need to know that the web service directory "_vti_bin" is a virtual directory (if you don't, a look in IIS will tell you). In IIS it is easy to see that the actual path of the SiteData asmx file is in the 14\ISAPI directory. An asmx file usually contains nothing more than a reference to one assembly:

Short as it is, it tells us which assembly to look at: the assembly name is stssoap. Assembly names and DLL names generally match, so our goal is to find a file named stssoap.dll.
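Reading the assembly name out of an asmx file can be sketched in Python. This is only an illustration: it assumes the file holds a standard ASP.NET `WebService` directive whose `Class` attribute has the form "type name, assembly name, ...". The function name is made up, and the sample directive (including its placeholder public key token) is not copied from the real file.

```python
import re

# Matches <%@ WebService ... Class="..." %> and captures the Class value.
DIRECTIVE = re.compile(
    r'<%@\s*WebService\b.*?\bClass\s*=\s*"([^"]+)"',
    re.IGNORECASE | re.DOTALL,
)


def assembly_from_asmx(text):
    """Return (class_name, assembly_name), or None if no directive is found.

    assembly_name is None when the Class attribute names only a type.
    """
    match = DIRECTIVE.search(text)
    if match is None:
        return None
    parts = [part.strip() for part in match.group(1).split(",")]
    assembly = parts[1] if len(parts) > 1 else None
    return parts[0], assembly


# Illustrative directive; the token below is a placeholder.
sample = (
    '<%@ WebService Class="Microsoft.SharePoint.SoapServer.SiteDataImpl, '
    'STSSOAP, Version=14.0.0.0, Culture=neutral, '
    'PublicKeyToken=0000000000000000" %>'
)
result = assembly_from_asmx(sample)
```

The second element of `result` is the assembly name, and appending ".dll" gives the file name to look for.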

This step also calls for some knowledge of SharePoint: you need to know where its DLLs live. Most of them are in the GAC, but some are elsewhere. stssoap.dll, for example, is under the IIS site path (in the _app_bin folder of the "80" directory. Incidentally, that folder holds another DLL I look at often, Microsoft.SharePoint.ApplicationPages.dll, which contains the code-behind for the settings pages in layouts).

Of course, if you lack this knowledge, there is always the brute-force way: search the whole disk by file name...
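The brute-force search is the same thing Windows Explorer's search would do; as a minimal Python sketch (the function name `find_file` is made up, and the root folder is whatever drive or directory you choose to scan):

```python
import os


def find_file(root, file_name):
    """Yield full paths under root whose file name matches, ignoring case."""
    wanted = file_name.lower()
    for folder, _subfolders, files in os.walk(root):
        for name in files:
            if name.lower() == wanted:
                yield os.path.join(folder, name)
```

For example, `list(find_file("C:\\", "stssoap.dll"))` walks the whole drive, which is slow but needs no knowledge of where SharePoint keeps its files.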

 

Step 8: decompile. Open stssoap.dll with any decompiler (I use JustDecompile), and you can easily find the class behind the web service, which is also the class where the error occurs:

Then look at the code of the method that failed, GetSiteGroupsXml:

The method is a little long (though not very), so I did not paste all of it. Reading it with the error information from the log in mind (the get_String() method of some database-access class, XXX), two things stand out. First, the stored procedure being called and its parameters (we are lucky that the method uses ADO.NET to access the database directly; many of SharePoint's low-level methods end in unmanaged COM code when traced to the bottom, and there we can do nothing). Second, the two statements that could go wrong: the calls to SqlDataReader.GetString().

With some experience you can guess the cause at this point: GetString is being called on a column whose value in the database is NULL. ADO.NET represents that value as DBNull, which cannot be converted to a C# string, so calling GetString on it directly produces the error seen earlier. There are two GetString calls here (plus a few more in the part of the method I did not paste), one for the name and one for the description. From the method name and the field names we can guess that this method retrieves SharePoint user groups. Can a group's name or description be null? In my case, yes (this group was created using code, and I had indeed set its Description property to null when creating it...).

Mystery solved!!
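The failure mode is easy to reproduce in miniature. The sketch below is a Python analogy using the standard `sqlite3` module, not SharePoint's code and not ADO.NET: the table `demo_groups`, its columns, and both helper functions are made up. `get_string` stands in for a typed getter that refuses NULL, and `get_string_or_default` shows the null check that the decompiled method apparently lacks.

```python
import sqlite3


def get_string(row, index):
    """Strict getter: raises if the column holds NULL."""
    value = row[index]
    if value is None:
        raise ValueError("Data is Null; column %d cannot be read as a string" % index)
    return str(value)


def get_string_or_default(row, index, default=""):
    """Null-safe getter: test for NULL first, then read."""
    value = row[index]
    return default if value is None else str(value)


# A group created with a NULL description, as in the article's case.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_groups (title TEXT, description TEXT)")
conn.execute("INSERT INTO demo_groups VALUES (?, ?)", ("Demo group", None))
row = conn.execute("SELECT title, description FROM demo_groups").fetchone()

title = get_string(row, 0)                    # reads fine
description = get_string_or_default(row, 1)   # NULL handled explicitly
```

Calling `get_string(row, 1)` instead raises, which mirrors what happens when a reader's string getter is called on a NULL description without checking first.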

Of course, a good technician is curious. Having come this far, why not keep going?

 

Step 9: go down to the database level. In the content database, find the stored procedure proc_secgetallgroupsandmembershipinfo and try executing it...

Wait. There is a siteid parameter; what should it be set to? Go back to the crawl log from the beginning. Remember the site collection with the error?... How do you get the ID of a site collection? There are plenty of ways... You can read it from the URL parameters of the various site collection settings pages in Central Administration; you can get it with PowerShell; you can get it with a console program; you can read it straight from the database... Take your pick.

The result after executing the stored procedure:

Aha!

 

Step 10: go deeper. How is this stored procedure written?

It uses a function named tvf_groups_site. Read on into that function:

It reads from the groups table, and that table is:

Well, nothing mysterious here... it is simply the table where SharePoint stores user groups.

At this point you can wag your tail and think about writing up this puzzle-solving process as a blog post to share.
