Sysdig: A Tool for Linux Server Monitoring and Troubleshooting (1)
When you need to trace the system calls made and received by a process, what comes to mind first? You probably think of strace, and you are right. Which command-line tool would you use to monitor raw network traffic? If you thought of tcpdump, you made an excellent choice. And if you need to keep track of open files (in the Unix sense: everything is a file), you would probably use lsof.
strace, tcpdump, and lsof are great tools that belong in every system administrator's toolbox, and that is exactly why you will love sysdig. It is a powerful open-source tool for system-level exploration and troubleshooting, which its creators describe as "strace + tcpdump + lsof + awesome sauce with a little Lua cherry on top". Humor aside, one of the best features of sysdig is that it can not only analyze the live state of a Linux system, but also save that state to a dump file for offline inspection. What is more, you can customize sysdig's behavior, or even extend it, with small built-in scripts called chisels. Each chisel analyzes the event stream captured by sysdig in its own script-specific way.
In this tutorial, we will explore the installation and basic usage of sysdig for system monitoring and troubleshooting on Linux.
Install Sysdig
For this tutorial, we will use the automated installation process described on the official website, which keeps the installation simple, short, and independent of the distribution. The installation script automatically detects the operating system and installs the necessary dependencies.
Run the following command as root to install sysdig from the official apt/yum repository:
- # curl -s https://s3.amazonaws.com/download.draios.com/stable/install-sysdig | bash
After the installation is complete, invoke sysdig as follows:
- # sysdig
Our screen will immediately fill with every event occurring on the system, which makes it hard to do much with the information. For further processing, we can run:
- # sysdig -cl | less
to view the list of available chisels.
By default, the following categories are available, each containing multiple built-in chisels.
- CPU Usage
- Errors
- I/O
- Logs
- Misc
- Net
- Performance
- Security
- System State
To display information about a specific chisel, including detailed command-line usage, run:
- # sysdig -i [chisel name]
For example, we can check the information for the spy_port chisel under the "Net" category:
- # sysdig -i spy_port
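Once a chisel's usage is known, it is run with the -c option. The following is only a minimal sketch: the run wrapper and the DRY_RUN variable are hypothetical helpers (not part of sysdig) that print each command instead of executing it, since sysdig needs root and streams events until interrupted. It also assumes that spy_port takes a port number as its argument, which the output of sysdig -i spy_port should confirm on your system.

```shell
#!/bin/sh
# run: hypothetical wrapper, not part of sysdig.
# With DRY_RUN=1 (the default here) it only prints the command line;
# with DRY_RUN=0 it executes the command for real.
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Show the description and arguments of the spy_port chisel.
run sysdig -i spy_port

# Run the chisel; port 80 is an illustrative argument (assumption).
run sysdig -c spy_port 80
```

To execute the commands instead of printing them, run the script as root with DRY_RUN=0.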
Chisels can be combined with filters, which apply to both live data and trace files, to produce more useful output.
Filters follow a class.field structure. For example:
- fd.cip: client IP address.
- evt.dir: event direction, which is either '>' for enter events or '<' for exit events.
You can use the following command to display the complete filter list:
- # sysdig -l
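As a rough illustration of how class.field conditions combine, the sketch below joins two of the fields mentioned above with the "and" operator and shows where the resulting filter would go on the command line, both for live data and for a saved trace file. The join_and helper, the IP address, and the file name trace.scap are assumptions made up for this example; the commands are printed rather than executed because sysdig requires root.

```shell
#!/bin/sh
# join_and: hypothetical helper (not part of sysdig) that joins
# class.field conditions into one filter using the "and" operator.
join_and() {
    expr_out=""
    for cond in "$@"; do
        if [ -z "$expr_out" ]; then
            expr_out=$cond
        else
            expr_out="$expr_out and $cond"
        fi
    done
    printf '%s\n' "$expr_out"
}

# Exit events ('<') involving a client at an illustrative IP address.
FILTER=$(join_and "evt.dir=<" "fd.cip=192.168.0.100")
echo "$FILTER"

# Printed, not executed: these need root and a working sysdig install.
echo "live:    sysdig \"$FILTER\""            # filter live events
echo "capture: sysdig -w trace.scap"          # save events to a trace file
echo "replay:  sysdig -r trace.scap \"$FILTER\""  # filter the saved trace
```

Quoting the filter matters in a real shell session, because an unquoted '<' or '>' would be interpreted as a redirection.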
In the rest of this tutorial, I will demonstrate several sysdig use cases.