Node.js Error Handling Best Practices

Reposted from: https://segmentfault.com/a/1190000002741935#articleHeader3

This article answers several questions that newcomers to Node.js often ask:

- When should a function I write throw an exception, when should it pass an error to a callback, and when should it emit an event on an EventEmitter?
- What assumptions should my function make about its arguments? Should I check more specific constraints, for example that an argument is non-null, is greater than 0, or looks like an IP address?
- What should I do with arguments that don't meet expectations? Throw an exception, or pass an error to the callback?
- How can I distinguish between different kinds of errors in a program (such as "Bad Request" and "Service Unavailable")?
- How can I provide enough information for the caller to understand the details of an error?
- How should I deal with unexpected errors? Should I use try/catch, domains, or something else?

This article is divided into several parts that build on one another:

- Background: what you are expected to know already.
- Operational failures and programmer errors: introduces the two basic kinds of error.
- The practice of writing functions: general principles for writing functions that produce useful errors.
- Specific recommendations for writing new functions: a checklist for writing robust functions that produce useful errors.
- An example: documentation and preamble for a connect function.
- Summary: a recap of the main points.
- Appendix: conventions for properties on Error objects: a list of property names for supplying additional information in a standard way.

Background

This article assumes that:

- You are already familiar with exceptions in JavaScript, Java, Python, C++, or a similar language, and you know what it means to throw and catch them.
- You know how to write code for Node.js. You are comfortable with asynchronous operations and with the callback(err, result) pattern for completing them. You should know why the following code does not handle the error correctly [footnote 1]:

function myApiFunc(callback)
{
  /*
   * This pattern does NOT work!
   */
  try {
    doSomeAsynchronousOperation(function (err) {
      if (err)
        throw (err);
      /* continue as normal */
    });
  } catch (ex) {
    callback(ex);
  }
}
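
The pattern fails because the try block has already finished by the time the inner function runs, so the throw happens on a later turn of the event loop with no enclosing catch. The following is a minimal sketch of the conventional alternative, not a definitive implementation. It assumes a hypothetical doSomeAsynchronousOperation, simulated here with setImmediate, that always fails:

```javascript
// Hypothetical stand-in for a real asynchronous operation: it always
// fails, and reports the failure on a later turn of the event loop.
function doSomeAsynchronousOperation(callback) {
  setImmediate(function () {
    callback(new Error('simulated failure'));
  });
}

function myApiFunc(callback) {
  doSomeAsynchronousOperation(function (err) {
    if (err) {
      // Hand the error to the caller instead of throwing it.
      callback(err);
      return;
    }
    /* continue as normal */
    callback(null, 'done');
  });
}

myApiFunc(function (err, result) {
  if (err) {
    console.log('operation failed:', err.message);
    return;
  }
  console.log('result:', result);
});
```

The caller receives the failure as the first argument of its callback and decides what to do with it.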

You should also be familiar with the three main ways of delivering an error:

- Throwing the error as an exception.
- Passing the error to a callback, a function provided specifically for handling errors and the results of asynchronous operations.
- Emitting an "error" event on an EventEmitter.

We'll discuss each of these in more detail in the following sections. This article does not assume that you know anything about domains.

Finally, you should know that in JavaScript there is a difference between an error and an exception. An error is any instance of the Error class. An error can be constructed and then passed directly to another function, or it can be thrown. When an error is thrown, it becomes an exception [footnote 2]. For example:

throw new Error('something bad happened');

But you can just as well create an Error without throwing it:

callback(new Error('something bad happened'));
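
To make the distinction concrete, here is a small sketch. JSON.parse is synchronous and throws a SyntaxError on invalid input, so try/catch works around it. The helper name parseJson and its { err, value } return shape are assumptions made for illustration only:

```javascript
// Hypothetical helper: catches the exception thrown by JSON.parse and
// hands it back to the caller as an ordinary value.
function parseJson(text) {
  try {
    return { err: null, value: JSON.parse(text) };
  } catch (ex) {
    return { err: ex, value: undefined };
  }
}

const good = parseJson('{"port": 8080}');
const bad = parseJson('{not json');

console.log('parsed port:', good.value.port);
console.log('invalid input produced an Error:', bad.err instanceof Error);
```

In the second call the Error was thrown by JSON.parse, caught, and then returned as a plain value rather than as an exception.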

Passing an Error to a callback without throwing it is the more common usage, because most errors in Node.js are asynchronous. In fact, the only common use of try/catch is around JSON.parse and similar functions that validate user input. As we'll see, it is very rare to need to catch an exception from an asynchronous function. This is very different from Java, C++, and other languages that rely heavily on exceptions.

Operational failures and programmer errors

It is useful to divide errors into two broad categories [footnote 3]:

- Operational failures are run-time problems experienced by correctly written programs. They are not bugs in the program. They are usually caused by something else: the system itself (out of memory, or too many open files), the system's configuration (no route to a remote host), the network (a socket hang-up), or a remote service (a 500 error, a failed connection). Examples include:
  - failure to connect to a server
  - failure to resolve a hostname
  - invalid user input
  - request timeout
  - server returned a 500 response
  - socket hang-up
  - system is out of memory
- Programmer errors are bugs in the program. They can always be avoided by changing the code, and they can never be handled properly. Examples include:
  - reading a property of undefined
  - calling an asynchronous function without a callback
  - passing a string where an object was expected
  - passing an object where an IP address string was expected

People use the word "error" for both operational failures and programmer errors, but the two are very different. Operational failures are error conditions that every correct program must deal with. As long as they are handled properly, they do not necessarily indicate a bug or even a serious problem. "File not found" is an operational failure, but it does not necessarily mean that anything is wrong. It may just mean that the program has to create the file first.

By contrast, programmer errors are bugs. They are cases where you made a mistake: forgetting to validate user input, mistyping a variable name, and so on. By definition such errors cannot be handled. If they could, you would simply have written the error-handling code in place of the code that caused the error.

This distinction is important: operational failures are part of the normal operation of a program. Programmer errors are bugs.

Sometimes a single root problem involves both operational failures and programmer errors. If an HTTP server crashes because it accessed an undefined variable, that is a programmer error. Any client with a request in flight at the time of the crash sees an ECONNRESET error, typically reported in Node.js as "socket hang up". For the client this is a separate operational failure, because a correct client must handle a server that crashes or a network that drops out.

Similarly, failing to handle an operational failure is itself a programmer error. For example, if a program tries to connect to a server but gets an ECONNREFUSED error, and it has not registered a listener for the socket's "error" event, then the program crashes, and that is a programmer error. The connection failure is an operational failure (something any correct program can experience when the network or another component of the system fails), but failing to handle it is a bug.
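
The crash described above follows from how EventEmitter treats "error": when no listener is registered for it, emit throws the error instead of delivering it. The sketch below uses a bare EventEmitter as a stand-in for a socket (an assumption made to avoid real network calls), and the try/catch is there only to make the throw visible. With a real socket the emit happens inside the event loop, where nothing catches it.

```javascript
const { EventEmitter } = require('events');

// A bare EventEmitter stands in for a socket in this sketch.
const unhandled = new EventEmitter();
let thrown = null;
try {
  // No 'error' listener is registered, so emit() throws the error.
  unhandled.emit('error', new Error('connect ECONNREFUSED'));
} catch (ex) {
  thrown = ex;
}
console.log('without a listener, emit threw:', thrown.message);

const handled = new EventEmitter();
let received = null;
handled.on('error', function (err) {
  // With a listener, the failure arrives here and can be handled.
  received = err;
});
handled.emit('error', new Error('connect ECONNREFUSED'));
console.log('with a listener, the handler received:', received.message);
```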

Understanding the difference between operational failures and programmer errors is the foundation for working out how to deliver errors and how to handle them. Make sure you understand it before reading on.

Handling operational failures

Just as with performance and security, error handling is not something that can be bolted onto a program that has none. You cannot handle all errors in one central place, just as you cannot solve all performance problems in one central place. You have to consider every piece of code that could fail (opening a file, connecting to a server, forking a child process, and so on), including how it can fail and what the failure means. More on this later, but the key point is that error handling has to be fine-grained, because where an error occurs and why determine both its impact and the right response.
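
As a sketch of what fine-grained handling can look like, the function below decides what to do based on which failure occurred. It treats "file does not exist" (ENOENT) as an expected case and passes every other failure to its caller. The name readOrDefault and its parameters are hypothetical. fs.readFile and the err.code property on system errors are standard Node.js.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: reads a file, treating "file does not exist" as
// an expected case and passing every other failure to the caller.
function readOrDefault(filename, defaultText, callback) {
  fs.readFile(filename, 'utf8', function (err, text) {
    if (err && err.code === 'ENOENT') {
      callback(null, defaultText);
      return;
    }
    if (err) {
      callback(err);
      return;
    }
    callback(null, text);
  });
}

// Usage: a path in the temp directory that is very unlikely to exist.
const missingFile = path.join(os.tmpdir(),
  'missing-' + process.pid + '-' + Date.now() + '.log');
readOrDefault(missingFile, '', function (err, text) {
  if (err) {
    console.log('unexpected failure:', err.message);
    return;
  }
  console.log('characters read:', text.length);
});
```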

You may find yourself handling the same error at several layers of the stack. This happens when the lower layers can do nothing useful except pass the error up to their caller, which passes it up to its caller, and so on. Often only the top-level caller knows what the appropriate response is: retry the operation, report the failure to the user, or something else. But that does not mean you should report every error to a single top-level callback, because that callback cannot know the context in which the error occurred, which operations completed successfully, and which ones actually failed.
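
When a caller does decide that retrying is the right response, the retry should be bounded. The following is a minimal sketch under stated assumptions: the helper name retry and its options (maxAttempts, delayMs, isRetryable) are hypothetical, and the remote call is simulated in memory.

```javascript
// Hypothetical retry helper. The option names (maxAttempts, delayMs,
// isRetryable) are assumptions made for this sketch.
function retry(operation, options, callback) {
  let attempts = 0;

  function attempt() {
    attempts++;
    operation(function (err, result) {
      if (!err) {
        callback(null, result);
        return;
      }
      // Give up on failures that retrying cannot fix, or when the
      // allowed number of attempts has been used up.
      if (!options.isRetryable(err) || attempts >= options.maxAttempts) {
        callback(err);
        return;
      }
      setTimeout(attempt, options.delayMs);
    });
  }

  attempt();
}

// Usage: a simulated remote call that fails twice with a 503-style
// error and then succeeds.
let remoteCalls = 0;
function flakyRemoteCall(callback) {
  remoteCalls++;
  if (remoteCalls < 3) {
    const err = new Error('service unavailable');
    err.statusCode = 503;
    callback(err);
    return;
  }
  callback(null, 'response body');
}

retry(flakyRemoteCall, {
  maxAttempts: 5,
  delayMs: 10,
  isRetryable: function (err) { return err.statusCode === 503; }
}, function (err, result) {
  if (err) {
    console.log('gave up:', err.message);
    return;
  }
  console.log('succeeded after', remoteCalls, 'calls:', result);
});
```

Making the attempt limit and the delay explicit parameters means the caller can see, and document, exactly how long a retried operation may take.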

Let's make this concrete. For any given error, there are a few things you might do:

- Deal with the failure directly. Sometimes it is clear what to do. If you get an ENOENT error when trying to open a log file, chances are you are opening the file for the first time, and all you have to do is create it first. A more interesting example: you maintain a persistent connection to a server (such as a database) and you get a "socket hang-up" error. This usually means that either the remote side or the network failed. Such failures are often transient, so in most cases you reconnect to resolve the problem. (This is not the same as retrying, described below, because there is not necessarily an operation in progress when you get the error.)
- Propagate the failure to the client. If you don't know how to handle the error, the simplest thing is to abort whatever operation you are performing, clean up whatever you have started, and deliver an error to the client. (How to deliver it is another matter, discussed below.) This approach suits failures that are not expected to be resolved any time soon. For example, if the user submitted invalid JSON, parsing it a second time will not help.
- Retry the operation. For failures from the network and from remote services, retrying sometimes solves the problem. For example, if a remote service returns 503 (Service Unavailable), you might retry after a few seconds. If you are going to retry, you should clearly document that you may retry, how many times you will try before giving up, and how long you wait between attempts. Also, don't assume that you should always retry. If you are deep in the stack (for example, you are being called by a client that is itself being called by another client driven by a user), it is usually better to fail fast and let the client retry. If every layer of the stack decides to retry, the user ends up waiting much longer than necessary, because no layer knows that the layers below it are retrying too.
- Crash. For failures that should never happen, or that effectively amount to programmer errors (such as being unable to connect to a local socket in the same program), it is fine to log the error and crash. Other failures, such as running out of memory, cannot be handled in a dynamic language like JavaScript anyway, so crashing is quite reasonable. (That said, you may get ENOMEM from a separate operation such as child_process.exec, which is a failure you can handle reasonably, and you should consider doing so.) You may also crash when there is nothing you can do and an administrator has to fix the problem. If you have run out of file descriptors or do not have permission to read your configuration file, there is nothing you can do about it. Someone has to log in to the system and fix things.
- Log the error and do nothing else. Sometimes there is nothing you can do: nothing to retry or abort, and no reason to crash the program. For example, you are tracking a set of remote services using DNS and one of them drops out of DNS. There is nothing you can do except log a message and carry on with the remaining services. But you should at least log something (there are exceptions to every rule). If this happens thousands of times per second and you can't do anything about it, it may not be worth logging every occurrence, but you should log periodically.

(Not) handling programmer errors

There is nothing you can do to handle a programmer error. By definition, code that was supposed to do something is broken (for example, it has a mistyped variable name), and you cannot fix the problem with more code. If you could, you would simply have written the error-handling code in place of the broken code.

Some people advocate trying to recover from programmer errors: let the current operation fail, but keep handling other requests. This is not recommended. Consider that a programmer error is a case the original code did not take into account. How can you be sure that the problem will not affect other requests? If requests share any state (a server, a socket, a pool of database connections, and so on), there is a significant chance that other requests will misbehave.

A typical example is a REST server (implemented with, say, restify) in which a request handler throws a ReferenceError (for example, because of a mistyped variable name). Carrying on after such an error can easily lead to serious bugs that are extremely hard to track down. For example:

- Some piece of state shared between requests may be left null, undefined, or otherwise invalid, so that the next request that uses it fails.
- A database (or other) connection may be leaked, reducing the number of requests that can be handled in parallel. Eventually only a few usable connections remain, and requests end up being handled serially instead of in parallel.
- Worse, a Postgres connection may be left inside an open transaction. This causes Postgres to "hold on to" old versions of rows in the table, because they may still be visible to that transaction. The problem can persist for weeks, making the table grow without bound and slowing subsequent requests from milliseconds to minutes [footnote 4]. Although this problem is specific to Postgres, it illustrates how a simple programmer error can leave an application in a very scary state.
- A connection may be left in an authenticated state and then reused for a subsequent request. The result is that the request is processed as the wrong user.
- A socket may be left open. Normally Node.js applies a two-minute timeout to idle sockets, but this value can be overridden, in which case a file descriptor is leaked. If this keeps happening, the program runs out of file descriptors and crashes. Even if the timeout is not overridden, the client hangs for two minutes until it sees a "hang-up" error. The two-minute delay makes the problem annoying to deal with and to debug.
- Many memory references may be left behind. This causes leaks, which lead to memory exhaustion, longer GC times, and a sharp drop in performance. This is very hard to debug, and it is especially tricky to tie back to the error that caused the leak.

The best way to recover from a programmer error is to crash immediately. You should run your program under a restarter that automatically restarts it when it crashes. With a restarter in place, crashing is the quickest way to restore reliable service.
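
To illustrate the idea only, here is a minimal sketch of a restarter built on child_process.spawn. The function name supervise, its parameters, and the restart limit are assumptions made for this example. In practice this job usually belongs to a dedicated tool such as the operating system's service manager, not to hand-written code like this.

```javascript
const { spawn } = require('child_process');

// Hypothetical minimal restarter: runs a command and starts it again
// whenever it exits abnormally, up to maxRestarts times.
function supervise(command, args, maxRestarts, onStop) {
  let restarts = 0;

  function start() {
    const child = spawn(command, args, { stdio: 'inherit' });
    let finished = false;

    function finish(err) {
      if (finished) return; // 'error' and 'exit' may both fire
      finished = true;
      if (!err) {
        onStop(null, restarts);
        return;
      }
      if (restarts >= maxRestarts) {
        onStop(err, restarts);
        return;
      }
      restarts++;
      start();
    }

    // 'error' fires if the process could not be spawned at all.
    child.on('error', finish);
    child.on('exit', function (code, signal) {
      if (code === 0) {
        finish(null);
      } else {
        finish(new Error('child exited abnormally (code ' + code +
          ', signal ' + signal + ')'));
      }
    });
  }

  start();
}

// Usage: supervise a child process that always crashes.
supervise(process.execPath, ['-e', 'process.exit(1)'], 2,
  function (err, restarts) {
    console.log('gave up after', restarts, 'restarts:', err.message);
  });
```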

The only downside to crashing is that connected clients are temporarily disrupted, but keep the following in mind. By definition, these errors are bugs. We are not talking about legitimate system or network failures, but about actual bugs in the program. They should be rare in production, and debugging and fixing them should be the top priority. In all the cases discussed above, the request is not necessarily going to succeed anyway. It may complete successfully, it may crash the server again, it may complete incorrectly in some obvious way, or it may complete incorrectly in a way that is hard to debug. In a reliable distributed system, clients must be able to deal with server failure by reconnecting and retrying requests. Network and system failure is a fact of life, whether or not the Node.js program itself is allowed to crash. If your production program crashes so often that the disconnections become a problem, the real problem is that your server has too many bugs, not that it crashes when it hits one.

If a server crashes so often that clients are frequently disconnected, you should focus your effort on the bugs that cause the crashes and make them the exception, rather than trying to avoid crashing in cases where the code is clearly wrong. The best way to debug these problems is to configure Node.js to dump core on an uncaught exception. With core files on GNU/Linux or illumos-based systems, you can see not only the stack trace at the time of the crash, but also the arguments passed to each function and other JavaScript objects, even those referenced only in closures. Even without core dumps configured, you can use the stack trace and logs to start working on the problem.

Finally, remember that a programmer error on the server becomes an operational failure on the client. Clients have to deal with servers that crash and networks that drop out. This is not just theory. It really happens in production systems.

The practice of writing functions

We have discussed how to handle errors. But when you write a new function, how do you deliver errors to the code that called it?

The single most important thing is to document what your function does, including the arguments it takes (with their types and any other constraints), what it returns, what errors can occur, and what those errors mean. If you don't know what errors can happen or don't know what they mean, then your program can only be correct by accident. So when you write a new function, tell your callers which errors can occur and what they mean.

Throw, callback, or EventEmitter?

A function can deliver errors in three basic ways:

- throw delivers an error synchronously, that is, in the same context in which the function was called. If the caller (or the caller's caller, and so on) used try/catch, the exception can be caught. If none of the callers did, the program usually crashes. (The exception can also be handled by domains or by the process-wide "uncaughtException" event, discussed below.)
- A callback is the most basic way of delivering an event asynchronously. The user passes in a function (the callback), and you invoke it some time later, when the asynchronous operation completes. The usual form is callback(err, result), where only one of err and result is non-null, depending on whether the operation succeeded or failed.
- For more complicated cases, instead of taking a callback, the function can return an EventEmitter object, and the caller is expected to listen for "error" events on that object. This is useful in two particular cases:
  - When you are performing a complicated operation that may produce multiple errors or multiple results. For example, a request that fetches rows from a database and streams them back to the client as they arrive, rather than waiting for them all to arrive first. In this case, instead of taking a callback, the function returns an EventEmitter that emits a "row" event for each result, an "end" event when all results have been reported, and an "error" event if any error is encountered.
  - For objects that represent complex state machines, where many different asynchronous things can happen. For example, a socket is an EventEmitter that may emit "connect", "end", "timeout", "drain", and "close". It is natural to make "error" just another type of event the socket can emit. With this approach it is important to be clear about when "error" is emitted, which other events (such as "close") may be emitted as well, the order in which they occur, and whether the socket is closed at the end.
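
The "row", "end", and "error" arrangement described above can be sketched as follows. The function queryRows and its failAtIndex parameter are hypothetical, and the rows come from an in-memory array instead of a real database. Only the EventEmitter class is standard Node.js.

```javascript
const { EventEmitter } = require('events');

// Hypothetical query function: emits one 'row' event per result, then
// 'end', or 'error' if something goes wrong. Passing failAtIndex
// simulates a failure before the row at that index is delivered.
function queryRows(rows, failAtIndex) {
  const emitter = new EventEmitter();
  let i = 0;

  function next() {
    if (i === failAtIndex) {
      emitter.emit('error', new Error('simulated failure at row ' + i));
      return;
    }
    if (i >= rows.length) {
      emitter.emit('end');
      return;
    }
    emitter.emit('row', rows[i++]);
    setImmediate(next);
  }

  // Defer the first event so the caller can attach listeners first.
  setImmediate(next);
  return emitter;
}

// Usage: -1 means "never fail" in this sketch.
const query = queryRows(['alice', 'bob'], -1);
query.on('row', function (row) { console.log('row:', row); });
query.on('end', function () { console.log('all rows delivered'); });
query.on('error', function (err) { console.log('query failed:', err.message); });
```

In this sketch "error" ends the operation: no further "row" or "end" events follow it. That is exactly the kind of ordering guarantee that should be documented for callers.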

For the most part, we lump callbacks and event emitters into the same bucket of "asynchronous error delivery". If you need to deliver an error asynchronously, you generally want to use one or the other, but not both.

So when do you use throw, and when do you use a callback or an EventEmitter? It depends on two things:

- Is the error an operational failure or a programmer error?
- Is the function itself synchronous or asynchronous?
