August 11, 2008
Learn these 10 tricks and you will be the most powerful Linux system administrator in the world. Well, "the world" is a bit of an exaggeration, but if you work in a large team, these skills are essential. Learn about SSH tunnels, VNC, password recovery, console spying, and more. Each trick comes with examples that you can reproduce on your own systems.
Good system administrators are distinguished by their efficiency. If an efficient administrator can finish in 10 minutes a task that takes someone else two hours, that administrator should be rewarded (paid more), because the company is saving time, and time is money, right?
These tricks are about improving your efficiency as an administrator. Although this article does not try to cover all of them, it introduces 10 essentials from the "lazy" administrator's bag of tricks. These tips will save you time, and even if you don't get paid more for being efficient, you will at least have more time to play.
Tip 1: Unmount an unresponsive DVD drive
A newcomer to Linux complains: when he presses the eject button on the DVD drive of a server running a certain Redmond-based operating system, the disk ejects immediately, but on most enterprise Linux servers the disk will not eject if a process is running in that directory. For a long time as a Linux administrator, I would reboot the machine to get my disk back if I couldn't figure out what was running and why it wouldn't release the DVD drive. But that is very inefficient.
Here is how to find the process that is holding the DVD drive and eject the disk with ease. First, simulate the problem. Put a disk in the DVD drive, open a terminal, and mount the drive:
# mount /media/cdrom
# cd /media/cdrom
# while [ 1 ]; do echo "All your drives are belong to us!"; sleep 30; done
Now open a second terminal and try to eject the DVD drive:
# eject
The following message is displayed:
umount: /media/cdrom: device is busy
Before releasing the device, let's find out who is using it.
# fuser /media/cdrom
fuser reports the process that is still running there, so it is indeed our own fault that the disk cannot be ejected.
Now, if you are the root user, you can terminate the process at will:
# fuser -k /media/cdrom
Now you can unmount the drive:
# eject
fuser does the job.
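As a rough illustration of what fuser is looking for in this scenario, the sketch below walks /proc and prints the PIDs whose current working directory sits at or below a given directory. The function name pids_with_cwd_under is made up for this example, and it assumes bash and a Linux /proc file system. It covers only the working-directory case simulated above; fuser also catches open files, so treat this as a teaching aid rather than a replacement.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print PIDs whose working directory is inside DIR.
# Assumes Linux /proc; without root you only see your own processes.
pids_with_cwd_under() {
    local target cwd link pid
    target=$(readlink -f "$1") || return 1
    for link in /proc/[0-9]*/cwd; do
        # Skip processes we cannot inspect or that have already exited.
        cwd=$(readlink "$link" 2>/dev/null) || continue
        case "$cwd" in
            "$target"|"$target"/*)
                pid=${link#/proc/}
                echo "${pid%/cwd}"
                ;;
        esac
    done
}

# Example: list the processes sitting in the current directory
# (this includes the shell running the example).
pids_with_cwd_under .
```

On the machine from the example, pids_with_cwd_under /media/cdrom should list the shell that is running the while loop.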
Tip 2: Restore a garbled screen
Try this:
# cat /bin/cat
Whoa! The terminal now looks like garbage, and everything you type comes out as gibberish. So what do you do?
Type reset. But wait, you say, typing reset is awfully close to typing reboot or shutdown. That is enough to make your palms sweat, especially on a production machine.
Rest assured, the machine will not reboot when you do this. Go ahead:
# reset
Now the screen is back to normal. This is much better than closing the window and logging in again, especially if you had to hop through five machines over SSH to reach this one.
Tip 3: Collaborate with screen
David, a high-maintenance user from product engineering, calls and says, "Why can't I compile supercode.c on the new machines you deployed?"
You ask him, "Which machine are you on?"
David replies, "posh." (Yes, this fictional company has named its five production servers after the Spice Girls.) Now you can show off your skills. On another machine, switch to David's account:
# su - david
Go to posh:
# ssh posh
Once you are there, run:
# screen -S foo
Then call David: "David, run the following command in your terminal:"
# screen -x foo
This links your shell session and David's together. Either of you can type, and each of you sees what the other is doing. This saves you a walk to the other floor and gives both parties equal control. An added benefit is that David can watch your troubleshooting skills and see exactly how the problem is solved.
Finally, the problem becomes clear: David's compile script hardcodes an old directory that does not exist on this new server. Mount it, compile again, and the problem is solved. David goes back to work, and you go back to whatever entertainment you were enjoying before.
Note that both parties must be logged in as the same user. The screen command can also provide multiple windows and split screens. Read the manual page for more information.
One last tip for screen sessions: to detach from a session and leave it running, press Ctrl-A D (that is, hold down the Ctrl key and press A, then press D). You can then reattach by running the screen -x foo command again.
Tip 4: Recover the root password
You forgot your root password, so now you have to reinstall the entire machine, right? Sadly, many people do exactly that. However, it is surprisingly easy to get onto the machine and change the password. This does not work in every case (for example, if you set a GRUB password and forgot that too), but here is how it goes in the normal case, using CentOS Linux as an example.
First, reboot the system. As it restarts, you will see the GRUB screen shown in Figure 1. Press an arrow key to stay on this screen instead of continuing with a normal boot.
Figure 1. Grub screen after restart
Next, use the arrow keys to select the kernel you want to boot and press E to edit the kernel line. You will then see a screen like the one in Figure 2:
Figure 2: Prepare to edit the kernel line
Use the arrow keys again to highlight the line that begins with kernel, and press E to edit the kernel parameters. When you reach the screen shown in Figure 3, append the number 1 to the parameters:
Figure 3. append the number 1 after the Parameter
Then press Enter, followed by B, and the kernel boots into single-user mode. Once there, run the passwd command to change the password for user root:
sh-3.00# passwd
New UNIX password:
Retype new UNIX password:
passwd: all authentication tokens updated successfully
Now you can reboot, and the machine will start up with the new password.
Tip 5: SSH back door
Many times I am at a site where I need remote support from someone who is blocked on the outside by the company firewall. Few people realize that if you can get out to the world through a firewall, it is relatively easy to open a hole that lets the world come in to you.
In its crudest form, this is called "poking a hole in the firewall." I call it an SSH back door. To use it, you need a machine on the Internet that can act as an intermediary.
In this example, the intermediary machine is called blackbox.example.com. The machine behind the company firewall is called ginger. The machine that technical support is working from is called tech. Figure 4 shows how this is set up.
Figure 4. Add a hole in the firewall
The procedure is as follows:
- Check that what you are doing is allowed, and make sure you ask the right people. Most people will cringe at the idea of opening the firewall, but what they do not understand is that the connection is completely encrypted. Furthermore, someone would have to break into your outside machine before getting into the company. Alternatively, you may belong to the school of asking forgiveness instead of permission. Either way, use your own judgment, and don't blame anyone else if it doesn't go your way.
- SSH from ginger to blackbox.example.com with the -R flag. I'll assume that you are the root user on ginger and that tech needs the root user ID to help you with the system. The -R flag forwards port 2222 on blackbox to port 22 on ginger. This is how the SSH tunnel is set up. Note that only SSH traffic can come into ginger: you are not exposing ginger to the Internet unprotected. You can do this with the following syntax:
~# ssh -R 2222:localhost:22 thedude@blackbox.example.com
Once you are logged in to blackbox, you just need to stay logged in. I usually enter a command like this:
thedude@blackbox:~$ while [ 1 ]; do date; sleep 300; done
This keeps the session busy. Then minimize the window.
- Instruct your friends at tech to SSH to blackbox as thedude without any special SSH flags. You will have to give them the password:
root@tech:~# ssh thedude@blackbox.example.com
- Once tech is on blackbox, they can SSH to ginger with the following command:
thedude@blackbox:~$ ssh -p 2222 root@localhost
- Tech will then be prompted for a password. They should enter the root password of ginger.
- Now you and tech support can work together to solve the problem. You may even want to use screen together! (See Tip 3.)
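The while loop in the steps above exists only to keep the session from idling out. As an alternative sketch, OpenSSH's -N flag together with the ServerAliveInterval and ExitOnForwardFailure options can hold the tunnel open without an interactive shell on blackbox. The helper below only prints the command so you can inspect it before running it; the function name and the 60-second interval are illustrative choices, not part of the setup described above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print (not run) an ssh command for a reverse tunnel.
#   $1 = port to open on the intermediary
#   $2 = local port to expose
#   $3 = user@host of the intermediary
reverse_tunnel_cmd() {
    local remote_port=$1 local_port=$2 intermediary=$3
    # -N: no remote command, forwarding only.
    # ServerAliveInterval: send keepalives so an idle tunnel is not dropped.
    # ExitOnForwardFailure: fail right away if the remote port is taken.
    printf 'ssh -N -o ServerAliveInterval=60 -o ExitOnForwardFailure=yes -R %s:localhost:%s %s\n' \
        "$remote_port" "$local_port" "$intermediary"
}

# The tunnel from the example: port 2222 on blackbox leads to port 22 on ginger.
reverse_tunnel_cmd 2222 22 thedude@blackbox.example.com
```

Printing rather than executing keeps the sketch safe to try anywhere; copy the printed line to ginger's shell when you are ready to open the tunnel.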
Tip 6: Remote VNC session through an SSH tunnel
VNC, or virtual network computing, has been around for a long time. I typically need it only when the remote server has some graphical program that is available only on that server.
For example, suppose that ginger from Tip 5 is a storage server. Many storage devices come with a GUI program for managing the storage controllers. These GUI management tools often need a direct network connection to the storage, which is sometimes kept on a private subnet. Therefore, the only way to reach the GUI is from ginger.
You can try connecting to ginger over SSH with the -X option and launching the program that way, but this often needs more bandwidth than is available, and the wait is painful. VNC is a much more network-friendly tool and is available for nearly every operating system.
Assume the setup is the same as in Tip 5, but tech wants VNC access instead of SSH. In this case, you do something similar, but forward the VNC port instead. Here is what you do:
- Start a VNC Server session on ginger. Run the following command:
root@ginger:~# vncserver -geometry 1024x768 -depth 24 :99
These options tell the VNC server to start with a resolution of 1024x768 and a pixel depth of 24 bits per pixel. If you are on a really slow connection, a depth of 8 may be a better option. The :99 specifies the display number, which determines the port the VNC server listens on. VNC port numbers start at 5900, so :99 means the server is reachable on port 5999.
When you start the session, you are asked to set a password. The user ID is that of the user who launched the VNC server (root in this example).
- SSH from ginger to blackbox.example.com, forwarding port 5999 on blackbox to ginger. Do this by running the following command on ginger:
root@ginger:~# ssh -R 5999:localhost:5999 thedude@blackbox.example.com
After running this command, you need to keep the SSH session open so that the port stays forwarded to ginger. At this point, if you were on blackbox, you could access the VNC session on ginger by running:
thedude@blackbox:~$ vncviewer localhost:99
That would reach ginger through the forwarded SSH port, but what we really want is to let tech get VNC access to ginger. For that, another tunnel is needed.
- On tech, open a tunnel over SSH that forwards port 5999 on tech to port 5999 on blackbox. Run the following command:
root@tech:~# ssh -L 5999:localhost:5999 thedude@blackbox.example.com
The SSH flag used this time is -L, which, instead of pushing port 5999 to blackbox, pulls it from there. Once you are logged in to blackbox, you need to keep this session open. Now you are ready to use VNC from tech!
- On tech, connect VNC to ginger by running the following command:
root@tech:~# vncviewer localhost:99
Tech will now have a VNC session directly to ginger.
The setup may look a little messy, but it is much less trouble than traveling to the site to fix the storage array, and after you practice it a few times it becomes quite easy.
Let me add one more trick: if tech runs a Windows operating system and has no command-line SSH client, tech can run Putty. Putty can be set to forward SSH ports through the options in its sidebar. If the port were 5902 instead of the 5999 in this example, you would enter something like what is shown in Figure 5.
Figure 5. Putty can forward SSH data used as a channel
With this in place, tech can use VNC to connect to localhost:2, just as if tech were running Linux.
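The relationship between VNC display numbers and TCP ports used throughout this tip is simple arithmetic, sketched below. The function names vnc_port and vnc_display are invented for this illustration; only the 5900 offset comes from the text above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the "display number + 5900" port convention.

# vnc_port DISPLAY: print the TCP port for a VNC display number.
vnc_port() {
    echo $((5900 + $1))
}

# vnc_display PORT: print the display number for a TCP port.
vnc_display() {
    echo $(($1 - 5900))
}

vnc_port 99       # display :99 from the vncserver example, prints 5999
vnc_display 5902  # port from the Putty example, prints 2
```

This is why forwarding port 5999 pairs with vncviewer localhost:99, and why forwarding port 5902 pairs with localhost:2.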
Tip 7: Check your bandwidth
Imagine that Company A has a storage server named ginger whose file system is NFS-mounted by a client node named beckham. Company A has decided that it needs more bandwidth out of ginger, because a large number of nodes need to NFS-mount ginger's shared file system.
The most common and cheapest way to do this is to bond two Gigabit Ethernet NICs together. It is the cheapest because you usually have a spare NIC on board and a spare port on a switch somewhere.
So that is what they do. But now the question is: how much bandwidth do they really have?
Gigabit Ethernet has a theoretical limit of about 128 MB/s. Where does this number come from? Take a look at this calculation:
1 Gb = 1024 Mb; 1024 Mb / 8 = 128 MB; "b" = "bits," "B" = "bytes"
(With decimal prefixes, 1 Gb = 1000 Mb, which gives 125 MB/s.)
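The calculation above can be checked with shell integer arithmetic. This is a minimal sketch; the function name mbit_to_mbyte is made up for the example, and the division is integer division, which is exact for the values used here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: convert megabits per second to megabytes per second.
# There are 8 bits in a byte, so divide by 8 (integer division).
mbit_to_mbyte() {
    echo $(($1 / 8))
}

mbit_to_mbyte 1024   # binary-prefix gigabit, prints 128
mbit_to_mbyte 1000   # decimal-prefix gigabit, prints 125
mbit_to_mbyte 2048   # two bonded links, binary prefix, prints 256
```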
But what do you actually see, and what is a good way to measure it? One tool I recommend is iperf. You can get iperf like this:
# wget http://dast.nlanr.net/Projects/Iperf2.0/iperf-2.0.2.tar.gz
You need to install it on a shared file system that both ginger and beckham can see, or compile and install it on both nodes. I'll compile it in the home directory of the user bob, which is visible on both nodes:
tar zxvf iperf*gz
cd iperf-2.0.2
./configure --prefix=/home/bob/perf
make
make install
On ginger, run:
# /home/bob/perf/bin/iperf -s -f M
This machine now acts as the server and reports throughput in megabytes per second (MB/s).
On the Beckham node, run:
# /home/bob/perf/bin/iperf -c ginger -P 4 -f M -w 256k -t 60
The results on both screens show the speed. On a normal server with a Gigabit Ethernet adapter, you will probably see about 112 MB/s. This is normal, because some bandwidth is lost in the TCP stack and the physical cables. With two servers connected back-to-back, each with two bonded Ethernet cards, I got about 220 MB/s.
In practice, NFS over a bonded network delivers about 150-160 MB/s. Still, this is a good indication that the bandwidth is about what you would expect. If you see something much lower, check for a problem.
I recently ran into a case in which the bonding driver was used to bond two NICs that used different drivers. The performance was extremely poor, with a bandwidth of about 20 MB/s, less than they would have gotten without bonding the Ethernet cards at all!
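To turn "much lower than expected" into something you can script, one option is to express a measurement as a percentage of the theoretical limit. The sketch below assumes figures in MB/s and uses an arbitrary 50 percent threshold; both function names and the threshold are illustrative, not something iperf provides.

```shell
#!/usr/bin/env bash
# percent_of_limit MEASURED LIMIT: integer percentage, rounded down.
percent_of_limit() {
    echo $(($1 * 100 / $2))
}

# check_bandwidth MEASURED LIMIT MIN_PERCENT: print a one-line verdict.
check_bandwidth() {
    if [ "$(percent_of_limit "$1" "$2")" -ge "$3" ]; then
        echo "ok"
    else
        echo "check for a problem"
    fi
}

percent_of_limit 112 128    # single NIC from the text, prints 87
percent_of_limit 220 256    # bonded pair from the text, prints 85
check_bandwidth 20 256 50   # mismatched drivers, prints: check for a problem
```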
Tip 8: Command-line scripts and utilities
A Linux system administrator becomes more efficient by using command-line scripting with authority. This includes crafting loops and knowing how to parse data with utilities such as awk, grep, and sed. In many cases, this means fewer keystrokes and less chance of user error.
For example, suppose you need to generate a new /etc/hosts file for a Linux cluster that you are about to install. The long way is to add the IP addresses in vi or your favorite text editor. However, you can take the existing /etc/hosts file and append the following to it by running this on the command line:
# P=1; for i in $(seq -w 200); do echo "192.168.99.$P n$i"; P=$(expr $P + 1);
done >>/etc/hosts
This creates 200 host names (n001 through n200) with IP addresses 192.168.99.1 through 192.168.99.200. Filling in such a file by hand risks creating duplicate IP addresses or host names, so this is a good example of using the command line to eliminate user errors. Note that this is done in the bash shell, the default in most Linux distributions.
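Because appending to /etc/hosts is hard to undo, it can help to preview the lines first. The sketch below is a variant of the loop above that writes to standard output and uses printf for the zero padding instead of seq -w; the function name gen_hosts is invented here, and the 192.168.99 network is taken from the example.

```shell
#!/usr/bin/env bash
# gen_hosts COUNT: print "IP hostname" pairs for n001 up to nCOUNT.
# Host names are zero-padded to three digits, matching n001..n200.
gen_hosts() {
    local i
    for ((i = 1; i <= $1; i++)); do
        printf '192.168.99.%d n%03d\n' "$i" "$i"
    done
}

# Preview the first few entries before touching any file.
gen_hosts 3
```

Once the preview looks right, running gen_hosts 200 >> /etc/hosts as root appends the full list.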
As another example, suppose you want to check that the memory size is the same on every compute node in the Linux cluster. In most cases of this sort, a distributed or parallel shell is the best tool, but for the sake of illustration, here is a way to do it with SSH.
Assume that SSH is set up to authenticate without a password. Then run:
# for num in $(seq -w 200); do ssh n$num free -tm | grep Mem | awk '{print $2}';
done | sort | uniq
A command line like this looks pretty terse. (It would be worse with regular expressions in it.) Let's pick it apart and go through each part.
First, you loop through 001 to 200. The padding with leading zeros comes from the -w option of the seq command. Then you substitute the num variable to build the name of the host to connect to over SSH. Once you have the target host, you give it the command. In this case, it is:
free -tm | grep Mem | awk '{print $2}'
This command means:
- Use the free command to get the memory size in megabytes.
- Take the output of that command and use grep to get the line that contains the string Mem.
- Take that line and use awk to print the second field, which is the total memory in the node.
This operation is performed on every node.
Once the command has been run on every node, the entire output of all 200 nodes is piped (|) to the sort command so that all the memory values are sorted.
Finally, the uniq command eliminates the duplicates. The command line produces one of the following results:
- If all the nodes, n001 through n200, have the same memory size, only one number is displayed. This is the memory size as seen by each operating system.
- If the memory size differs between nodes, several memory size values are displayed.
- Finally, if SSH fails on a node, you may see some error messages.
This command is not perfect. If you find a memory value different from what you expect, you will not know which node it came from or how many such nodes there are. You may need to issue another command for that.
What this trick does give you, though, is a fast way to check something and learn right away whether anything is wrong. Its real value is the speed of a quick check.
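One possible follow-up command for the limitation just described is to keep the host name next to each value and group by value. The sketch below assumes input lines of the form "host megabytes"; the function name summarize_memory is hypothetical, and the sample data stands in for real SSH output.

```shell
#!/usr/bin/env bash
# summarize_memory: read "host megabytes" lines on stdin and print one line
# per distinct memory size: "megabytes count host host ...".
summarize_memory() {
    awk '{ count[$2]++; hosts[$2] = hosts[$2] " " $1 }
         END { for (v in count) print v, count[v] hosts[v] }' | sort -n
}

# Made-up sample data: two nodes agree, one does not.
printf '%s\n' 'n001 2048' 'n002 2048' 'n003 1024' | summarize_memory
```

With the sample data, this prints "1024 1 n003" followed by "2048 2 n001 n002", so the odd node stands out immediately. To feed it real data, have the SSH loop from the earlier example echo the host name together with the value from free, one pair per line, and pipe the result into summarize_memory.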
Tip 9: Spy on the console
Some software prints error messages to the console, and these may not show up in your SSH session. The vcs devices let you examine them. From within an SSH session, run the following command on the remote server:
# cat /dev/vcs1
This shows you what is on the first console. You can look at the other virtual terminals by using 2, 3, and so on. If a user is typing on the remote system, you will see what that user has typed.
In most data farms, a remote terminal server, KVM, or even Serial over LAN is the best way to view this kind of information, and it has the added benefit of working out of band. The vcs device provides a fast in-band method that may save you a trip to the machine room to look at the console.
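The vcs device returns the screen as one continuous run of characters with no line breaks, so the output of cat can be hard to read. A minimal sketch for making it readable is to fold it at the console width. The function name wrap_console is made up, and the default of 80 columns is an assumption; a console set to a different width needs a different value.

```shell
#!/usr/bin/env bash
# wrap_console [WIDTH]: fold a newline-free screen dump into rows.
# WIDTH defaults to 80, a common text-console width (an assumption here).
wrap_console() {
    fold -w "${1:-80}"
}

# Tiny demonstration with a 3-column "screen" of six characters:
# the output is two rows, abc and def.
printf 'abcdef' | wrap_console 3
```

On the remote server, something like cat /dev/vcs1 | wrap_console 80, run as root, would then show the console row by row.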
Tip 10: Collect random system information
Tip 8 showed an example of using the command line to get the total memory in a system. In this tip, I'll offer a few other methods for collecting important information from a system that you may need for verification, troubleshooting, or remote support.
First, gather information about the processor. This is easily done as follows:
# cat /proc/cpuinfo
This command gives information about the processor speed, quantity, and model. In many cases, you can use grep to pull out just the value you need.
One check I do quite often is to determine the number of processors on the system. So, if I have bought a dual-processor quad-core server, I can run:
# cat /proc/cpuinfo | grep processor | wc -l
I should then see 8 as the value. If I don't, I call the vendor and ask them to send me another processor.
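The same count can be had from grep alone, since grep -c counts matching lines. The sketch below wraps this in a small function so it can be tried against a sample file; the function name count_processors and the sample contents are made up for illustration.

```shell
#!/usr/bin/env bash
# count_processors [FILE]: count lines that start with "processor".
# FILE defaults to /proc/cpuinfo; anchoring with ^ avoids matching other
# fields that merely contain the word.
count_processors() {
    grep -c '^processor' "${1:-/proc/cpuinfo}"
}

# Demonstration on a made-up two-processor sample file.
sample=$(mktemp)
printf 'processor\t: 0\nmodel name\t: Example CPU\nprocessor\t: 1\nmodel name\t: Example CPU\n' > "$sample"
count_processors "$sample"   # prints 2
rm -f "$sample"
```

On a real system, calling count_processors with no argument reads /proc/cpuinfo directly.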
Another piece of information I may need is disk information, which the df command provides. I usually add the -h flag so that the output is shown in gigabytes or megabytes. The output of the following command also shows how the disk is partitioned:
# df -h
Last on the list are ways to look at the firmware of your system: methods for getting the BIOS level and the firmware on the NIC.
To check the BIOS version, you can run the dmidecode command. Unfortunately, you cannot easily grep for the information, so paging through the output is a less efficient way to do this. On my Lenovo T61 laptop, the output looks like this:
# dmidecode | less
...
BIOS Information
Vendor: LENOVO
Version: 7LET52WW (1.22)
Release Date: 08/27/2007
...
This is much more efficient than rebooting the machine and looking at the POST output.
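Although a plain grep is awkward here because several sections contain a Version line, a small awk filter can pick out the one that follows the BIOS Information header. This is a sketch under the assumption that the output is laid out like the excerpt above; the function name bios_version is invented for the example.

```shell
#!/usr/bin/env bash
# bios_version: read dmidecode-style text on stdin and print the value of
# the first "Version:" line that appears after "BIOS Information".
bios_version() {
    awk '/BIOS Information/ { in_bios = 1; next }
         in_bios && /Version:/ { sub(/^[^:]*:[ \t]*/, ""); print; exit }'
}

# Demonstration using the excerpt shown above as sample input.
printf '%s\n' \
    'BIOS Information' \
    '    Vendor: LENOVO' \
    '    Version: 7LET52WW (1.22)' \
    '    Release Date: 08/27/2007' | bios_version
```

With the sample input, this prints 7LET52WW (1.22). On a real machine, the equivalent would be dmidecode | bios_version, run as root. Many dmidecode versions can also print this value directly with the -s bios-version option, if yours supports it.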
To check the driver and firmware versions of your Ethernet adapter, run ethtool:
# ethtool -i eth0
driver: e1000
version: 7.3.20-k2-NAPI
firmware-version: 0.3-0
Conclusion
There are many tricks you can learn from people who are experts at the command line. The best ways to learn are to:
- Work with others. Share screen sessions and watch how others work; you will find new ways of doing things. You may need to swallow your pride and let other people drive, but you can often learn a lot.
- Read the manual pages. Even for commands you know well, reading the manual page carefully can provide new insights. For example, did you know that you can do network programming with awk?
- Solve problems. As a system administrator, you are always solving problems, whether they were created by you or by others. This is called experience, and experience makes you better and more efficient.
I hope at least one of these tricks helped you learn something you didn't know. Essential tricks like these make you more efficient and add to your experience, but most importantly, they give you more free time to do the things you find interesting, such as playing video games. The best administrators are lazy because they don't like to work. They find the fastest way to get a task done and finish it quickly so they can get back to a life of leisure.