How to monitor and troubleshoot Linux servers with Sysdig

Source: Internet
Author: User

If you need to trace the system calls that a process makes and receives, what is the first tool that comes to mind? Probably strace, and you would be right. Which tool do you use to monitor raw network traffic from the command line? If you thought of tcpdump, you made a good choice. And if you ever need to keep track of open files (in Unix terms: everything is a file), chances are you will use lsof.

strace, tcpdump, and lsof are great tools that belong in every system administrator's toolbox. That is precisely why you will like Sysdig, a powerful open source tool for system-level exploration and troubleshooting, introduced by its creators as "strace + tcpdump + lsof + awesome sauce with a little Lua cherry on top." Humor aside, one of Sysdig's great features is that it can not only analyze the "live" state of a Linux system but also save that state to a dump file for offline inspection. You can also customize Sysdig's behavior, and even extend its capabilities, with built-in (or your own) small scripts called "chisels". Each chisel analyzes the event stream captured by Sysdig in its own script-specific way.

In this tutorial, we will look at how to install Sysdig on Linux and cover its basic usage for system monitoring and troubleshooting.

Installing Sysdig

For the sake of simplicity, brevity, and independence from any particular Linux distribution, this tutorial uses the automatic installation process described on the official website. The installation script detects the operating system and installs the necessary dependencies.

Run the following command as root to install Sysdig from the official apt/yum repository:
curl -s http://s3.amazonaws.com/download.draios.com/stable/install-sysdig | bash
After the installation is complete, we can run sysdig as follows to get a feel for it:
sysdig
The screen immediately fills with everything that is happening on the system, which leaves us unable to do much with the information. For that reason, we should instead run the following command to view the list of available chisels:
sysdig -cl | less


The following categories are available by default, each populated by multiple built-in chisels:

CPU Usage
Errors
I/O
Logs
Misc
Net
Performance
Security
System State

To display information about a specific chisel (including detailed command-line usage), run:
sysdig -i [chisel_name]
For example, to look at the spy_port chisel in the "Net" category, run:
sysdig -i spy_port

Chisels can be combined with filters to obtain more useful output. Filters can be applied to live data as well as to trace files.
Filters follow a "class.field" structure, for example:

fd.cip: the client IP address.
evt.dir: the event direction, which is either '>' for enter events or '<' for exit events.
The following command displays the complete list of filters:
sysdig -l
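
As a minimal sketch of how "class.field" conditions can be combined, the snippet below joins several conditions with "and" and prints the resulting sysdig command line. The helper name join_filters is hypothetical (it is not part of sysdig), and the process name and IP address are sample values borrowed from the examples in this tutorial.

```shell
#!/bin/sh
# Hypothetical helper (not part of sysdig): join filter conditions with " and ".
join_filters() (
    out=""
    for cond in "$@"; do
        if [ -z "$out" ]; then
            out="$cond"
        else
            out="$out and $cond"
        fi
    done
    printf '%s\n' "$out"
)

# Sample values only: restrict events to one process and one client IP.
filter=$(join_filters "proc.name=apache2" "fd.cip=192.168.0.100")

# Print the command instead of running it; run it as root for a live capture.
echo "sysdig \"$filter\""
```

Keeping the conditions as separate arguments makes it easy to add or drop one without re-quoting the whole expression.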
In the rest of this tutorial, I will demonstrate a few sysdig use cases.

Sysdig Example: Server performance tuning

Suppose your server is experiencing performance problems (such as unresponsiveness or significant delays in responding). You can use the bottlenecks chisel to display the 10 slowest system calls at the moment.

Use the following command to check a live server in real time. The "-c" flag is followed by a chisel name and tells sysdig to run the specified chisel.
sysdig -c bottlenecks
Alternatively, you can analyze server performance offline. In that case, save a complete sysdig trace to a file and then run the bottlenecks chisel against the trace file as follows.

First, save a sysdig trace (press Ctrl + C to stop the collection):
sysdig -w trace.scap
After collecting the trace, you can check the slowest system calls performed during the capture with the following command:
sysdig -r trace.scap -c bottlenecks

Pay attention to columns #2, #3, and #4, which show the execution time, process name, and PID, respectively.
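
The offline workflow above can be wrapped in a small script. The following is only a sketch under a few assumptions: the run helper, the TRACE_FILE and DURATION variables, and the 30-second default are all hypothetical, and the coreutils timeout command is used to send the same interrupt signal that Ctrl + C would. When sysdig is not installed, the script just prints the commands it would have run.

```shell
#!/bin/sh
# Hypothetical wrapper around the two commands above (not part of sysdig).
TRACE_FILE="${TRACE_FILE:-trace.scap}"   # assumed output path
DURATION="${DURATION:-30}"               # assumed capture length in seconds

# Fall back to printing the commands when sysdig is not installed.
command -v sysdig >/dev/null 2>&1 || DRY_RUN=1

# Run a command, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# timeout sends SIGINT after $DURATION seconds, the same signal as Ctrl + C.
run timeout -s INT "$DURATION" sysdig -w "$TRACE_FILE"

# Then list the slowest system calls recorded in the trace.
run sysdig -r "$TRACE_FILE" -c bottlenecks
```

Setting DRY_RUN=1 by hand is a quick way to review the commands before running them as root.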

Sysdig Example: monitoring user behavior

Suppose that you, as a system administrator, want to monitor interactive user activity on a system (for example, which commands a user typed at the command line and which directories the user visited). This is where the spy_users chisel comes in handy.

We will first collect a sysdig trace with a few additional options:

sysdig -s 4096 -z -w /mnt/sysdig/$(hostname).scap.gz

"-s 4096": tells sysdig to capture up to 4096 bytes for each event.
"-z": (used with "-w") compresses the trace file.
"-w <trace-file>": saves the sysdig trace to the specified file.

In the example above, the compressed trace file is named after the hostname. Remember that you can press Ctrl + C at any time to stop sysdig.
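
If you collect traces repeatedly, naming them by hostname alone will overwrite earlier captures. One possible variation, shown here purely as a sketch, appends a timestamp to the file name. The trace_path helper and the TRACE_DIR variable are hypothetical, and uname -n is used here to obtain the same node name that the hostname command prints.

```shell
#!/bin/sh
TRACE_DIR="${TRACE_DIR:-/mnt/sysdig}"    # directory used in the example above

# Hypothetical helper: <dir>/<hostname>-<YYYYmmdd-HHMMSS>.scap.gz
trace_path() {
    printf '%s/%s-%s.scap.gz\n' "$TRACE_DIR" "$(uname -n)" "$(date +%Y%m%d-%H%M%S)"
}

# Print the capture command; run it as root to start an actual capture.
echo "sysdig -s 4096 -z -w $(trace_path)"
```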

Once we have collected enough data, we can view the interactive activity of every user on the system with the following command:
sysdig -r /mnt/sysdig/debian.scap.gz -c spy_users

In the above output, the first column represents the process PID associated with the user activity.

What if you want to target a specific user and monitor only that user's activity? You can filter the results of the spy_users chisel by user name:
sysdig -r /mnt/sysdig/debian.scap.gz -c spy_users "user.name=xmodulo"

Sysdig Example: monitoring file I/O

We can use the "-p" flag to customize the output format of sysdig traces by listing the desired fields (such as user name, process name, and file or socket name) inside double quotes. In the following example, we create a trace file that contains only write events under the home directories (which we can inspect later with "sysdig -r writetrace.scap.gz").
sysdig -p "%user.name %proc.name %fd.name" "evt.type=write and fd.name contains /home/" -z -w writetrace.scap.gz
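
The format string passed to "-p" is simply a list of field names, each prefixed with "%". As an illustration only, the hypothetical helper below (fmt_fields is not part of sysdig) builds such a string from plain field names and prints a command that replays the trace file created above with that format.

```shell
#!/bin/sh
# Hypothetical helper (not part of sysdig): turn field names into a -p format string.
fmt_fields() (
    out=""
    for field in "$@"; do
        # Prefix each field with % and separate the fields with a single space.
        out="$out${out:+ }%$field"
    done
    printf '%s\n' "$out"
)

format=$(fmt_fields user.name proc.name fd.name)

# Print the command instead of running it.
echo "sysdig -r writetrace.scap.gz -p \"$format\""
```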

Sysdig Example: monitoring network I/O

As part of server troubleshooting, you may need to snoop on network traffic, which is usually done with tcpdump. With sysdig, traffic sniffing is just as easy, and it is done in a more user-friendly way.

For example, you can inspect the data (in ASCII) that a particular server process (such as apache2) exchanges with a specific IP address:
sysdig -s 4096 -A -c echo_fds fd.cip=192.168.0.100 -r /mnt/sysdig/debian.scap.gz proc.name=apache2
If you want to monitor the raw data transfer (in binary) instead, replace "-A" with "-X":
sysdig -s 4096 -X -c echo_fds fd.cip=192.168.0.100 -r /mnt/sysdig/debian.scap.gz proc.name=apache2
For more information, examples, and case studies, visit the project website. Believe me, the possibilities with sysdig are endless. But don't just take my word for it: install sysdig and start digging!

Reference website
https://www.ibm.com/developerworks/cn/linux/1607_caoyq_sysdig/index.html
