Linux Debugging (Eight): is the core really that hard to track?

Source: Internet
Author: User

I ran into several very typical cores this week, so I'd like to share them here. Anyone with real Linux programming experience has surely met them.

Many people around me seem to have a natural fear of cores, especially newcomers; even some engineers with several years of experience are helpless when they see one. As today's analysis will show, most cores are actually easy to solve. If a core is very hard to reproduce, and its triggering conditions are complex to describe, then it counts as a corner case and may take a long time: you need a clear picture of the runtime execution in your head (read the source code, learn the logic), map the source code to its execution state, analyze the state-machine transitions, and then reason about what could have happened.

I believe the previous articles in this series lay a good foundation for analyzing and resolving such corner cases.

By contrast, a core with a high reproduction rate is very easy to solve.


does multithreading guarantee a core?

If the core was introduced by your newly added code, it is actually very easy to solve: simply diff the changes and look for low-level mistakes.

If you can't find one, check whether it is a multithreading problem: no core when single-threaded, but a core once you switch to multithreading. That means multiple threads are competing for certain variables, and modifying the same variables at the same time causes the problem. Your first reaction here may be to add a lock.

I strongly dislike locks. Even if your lock is fine-grained and its scope is small enough, a lock still means blocking, and maintaining it is troublesome. Are these shared variables really worth locking? Can they be changed into local variables? Suppose one of them is a piece of dynamic memory, allocated once at initialization so that the interface does not have to allocate and free on every call (say the interface is called thousands of times per second); that allocation is perfectly reasonable, but then please make it a thread variable.

That way, each thread allocates its own copy of this memory at initialization time.

Of course, suppose you are implementing a framework, or an interface that is called by a framework, and that interface must be thread-safe. You can't control when threads start or how many there are, so is there really no way out?

In fact, there are plenty of methods. For example, you can maintain the thread-to-buffer correspondence of a "thread variable" in a map:

__gnu_cxx::hash_map<pid_t, void *> thread_data_map;
void *thread_buffer;
__gnu_cxx::hash_map<pid_t, void *>::iterator it;

lock();
it = thread_data_map.find(pid);
if (it == thread_data_map.end()) {
    // init "thread data" for this thread's first call
    thread_buffer = create_buffer();
    thread_data_map[pid] = thread_buffer;
} else {
    thread_buffer = it->second;
}
unlock();


A lock does have to be used here. But because the number of threads is limited, the efficiency is fine. This week I implemented an online service whose QPS reaches 2000+, and the cost of this lock is negligible in the overall call stack.

Of course, a better framework might provide an interface such as OnThreadInit, where you can allocate the thread variable:

int pthread_setspecific (pthread_key_t key, const void *value);

And the function that implements the logic fetches the variable back with:

void *pthread_getspecific (pthread_key_t key);

When should you use thread variables? Check whether the variable is written under multithreading. If so, you must use a thread variable (or a lock); otherwise it will core.


don't dig a hole for yourself:


Today a classmate's core came from an "optimization" that saved the time of allocating a variable:

void init() {
    my_struct *some_var;
    ...
    some_var->res = new some_res;
    some_var->res->set_value1(some_common_value1);
    some_var->res->set_value2(some_common_value2);
}

void *thread_func(my_struct *some_var) {
    some_var->res->set_value3(value_3);
    ...
}

set_value3 is what caused the core: every thread writes to the same shared res object.

This is also a typical multithreaded case that is guaranteed to core. In fact, res does not need to be allocated in advance at all. Change it to a local variable; the performance loss is almost zero. Of course, if the resource is very large, consider making it a thread variable.


so how to analyze the core?

In fact, the clues in the scenarios above come only from the core itself, which may not be the best place to start troubleshooting, because where the core appears may not be where the bug is. Scapegoats inevitably turn up: you call a third-party module, but your own global variable causes the core to surface inside the third party's call stack.

Be sure to first check that your own handling is correct.

Then determine whether the interface you call is thread-safe. Even if the author is a colleague, you still have to verify it yourself. Just like another core today: the caller insisted that what he wrote was thread-safe, as if it could not possibly be otherwise, and in the end it turned out the code he wrote was not thread-safe at all. He then asked what thread safety even is.

So, how do you analyze the core?

Start by understanding the scenarios in which the core appears. For example: long sentences cause a core, multithreading causes a core, special characters cause a core. QA can usually give you a clearer description than this. Then use the core's call stack to determine the approximate source location.

In front of the source code, there are no secrets.

When you read the source code, it shows you the static logic; in your mind you must reconstruct the runtime picture: the live call stacks and the multithreaded scheduling.

Of course, if you still can't solve it, go back to the core itself for more specific information!

Look at the parameters in the call stack, and switch threads to see what frames the other threads are in.
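A typical gdb session over a core file looks like this (the binary and core names are placeholders); these are the standard commands for exactly the steps above, inspecting frame parameters and other threads' stacks:

```shell
gdb ./my_server core.12345

# inside gdb:
(gdb) bt                     # call stack of the crashing thread
(gdb) frame 2                # select a frame to inspect
(gdb) info args              # parameters of that frame
(gdb) info locals            # local variables of that frame
(gdb) info threads           # list all threads in the core
(gdb) thread 3               # switch to thread 3
(gdb) bt                     # its call stack
(gdb) thread apply all bt    # stacks of every thread at once
```

`thread apply all bt` is often the fastest way to spot two threads inside the same non-thread-safe function.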

Can't solve it?

Then it is a corner case: just bring the application back up immediately after it cores, and keep digging when you can. Remember that one of the resolution options for a bug in EMC's tracker is "unable to root cause", which describes this aptly. And don't forget a little self-comfort: with so much code, there are bound to be bugs; who can guarantee a service is 100% correct?

