The command line has a strong reputation among Linux users. On other operating systems the command line is often regarded as a scary proposition, but experienced members of the Linux community recommend and encourage its use. In many cases the command line offers a more elegant and efficient solution than a graphical user interface.
As the Linux community has grown, Unix shells such as bash and zsh have grown with it into powerful tools and an essential part of working on Unix. With bash and similar shells you get some very useful features, such as piping, filename wildcards, and reading commands from a file, which is called a script.
Let's look at the power of the command line in practice. Suppose that each time a user logs on to a service, their user name is written to a text file. As an example, let's find out how many unique users have used the service.
The following sequence shows how much a short chain of commands can accomplish:
$ cat names.log | sort | uniq | wc -l
The pipe symbol (|) sends the standard output of one command to the standard input of the next. In this example, the output of cat names.log is passed to the input of the sort command, which reorders the lines alphabetically. Next, the pipe sends the sorted output to the uniq command, which removes duplicate names. (uniq only removes adjacent duplicates, which is why the input is sorted first.) Finally, the output of uniq is sent to the wc command. wc is a counting command; with the -l argument it returns the number of lines. Pipes let you string together a series of commands.
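For comparison, the same count can be expressed in a few lines of Python. This is only a sketch: the function name count_unique and the sample names are made up for illustration, and the function takes any iterable of lines rather than a real log file.

```python
def count_unique(lines):
    """Count distinct lines, roughly like `sort | uniq | wc -l`."""
    # A set keeps one copy of each line, so no sorting is needed.
    unique = {line.rstrip("\n") for line in lines}
    return len(unique)


# Hypothetical sample data standing in for names.log.
sample = ["alice\n", "bob\n", "alice\n", "carol\n", "bob\n"]
print(count_unique(sample))
```

Passing an open file object (or sys.stdin) instead of the sample list would count the unique lines of real input.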
However, sometimes the requirements are complex and the command chain becomes unwieldy. In this case, a shell script can solve the problem. A shell script is a series of commands that a shell program reads and executes in order. Shell scripts also support some programming language features, such as variables, control flow, and data structures. Shell scripts are useful for batch jobs that are run repeatedly. However, shell scripts also have some weaknesses:
- Shell scripts can easily grow into complex code that is difficult for developers to read and modify.
- Their syntax and interpretation are often inflexible and unintuitive.
- Their code usually cannot be used by other scripts. Code reuse among scripts is very low, and each script tends to solve one very specific problem.
- They generally have no access to libraries, such as HTML parsers or HTTP request libraries, because such libraries are generally available only for popular general-purpose and scripting languages.
These problems often make scripts inflexible and cost developers a lot of time. Python is a pretty good alternative. Using Python in place of shell scripts has many advantages:
- Python is installed by default on the mainstream Linux distributions. Open a command line and type python to enter the Python world immediately. This makes it a strong choice for most scripting tasks.
- Python is easy to read and its syntax is easy to understand. Its style emphasizes simple, clean code, which lets developers write scripts in the same compact style as shell scripts.
- Python is an interpreted language, so there is no compile step. This makes Python an ideal scripting language. Python also comes with a read-eval-print loop (REPL), an interactive interpreter in which developers can quickly try out new code. Developers can test an idea without having to rewrite the entire program.
- Python is a full-featured programming language. Code reuse is simple because Python modules can easily be imported and used in scripts, and scripts can easily be extended.
- Python has access to an excellent standard library as well as a large number of third-party libraries that implement many kinds of functionality, such as parsers and request libraries. For example, the Python standard library includes a date and time library that lets us convert times into whatever format we want and compare them with other dates.
- Python can be part of a command chain. Python does not have to replace bash completely. Python programs can be written UNIX-style (reading from standard input and writing to standard output), so a Python program can do the job of shell commands such as cat and sort.
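To make the last point concrete, here is a minimal sketch of a UNIX-style filter that behaves somewhat like sort. The function names sort_lines and run_filter are invented for this illustration, and the ordering is Python's default string ordering, which may differ from the locale-aware ordering of the real sort command.

```python
import sys


def sort_lines(lines, reverse=False):
    """Return the input lines in sorted order, similar to `sort`."""
    return sorted(lines, reverse=reverse)


def run_filter(instream, outstream):
    """Read every line from instream and write the sorted lines to outstream."""
    for line in sort_lines(instream):
        outstream.write(line)
```

Calling run_filter(sys.stdin, sys.stdout) at the bottom of a script would let it sit in the middle of a pipeline, because it only ever touches the two streams it is given.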
Let's rebuild the example from earlier in the article with Python. In addition to the work already done, let's find out how many times each user has logged on to the system. Used as above, the uniq command simply removes duplicate records without reporting how many times each was repeated. We replace the uniq command with a Python script that can be used as part of the command chain. Here is how the Python program does it (in this example, the script is called namescount.py):
#!/usr/bin/env python
import sys

if __name__ == "__main__":
    # Initialize an empty names dictionary.
    # It maps each name to the number of times it occurs.
    names = {}
    # sys.stdin is a file object. All methods that apply to a
    # file object can be applied to sys.stdin.
    for name in sys.stdin.readlines():
        # Every line ends with a newline character,
        # so we need to strip it.
        name = name.strip()
        if name in names:
            names[name] += 1
        else:
            names[name] = 1
    # Iterate over the dictionary and output
    # the count, a tab, and then the name.
    for name, count in names.items():
        sys.stdout.write("%d\t%s\n" % (count, name))
Let's look at how the Python script works in the command chain. First, it reads its data from the sys.stdin object, which represents standard input. It then uses a Python dictionary (called a hash table in other languages) to hold the mapping from each name to its number of occurrences. All output is written to the sys.stdout object, which is Python's implementation of standard output. To get the number of logins for every user, simply run the following command:
$ cat names.log | python namescount.py
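The counting step can also be written with collections.Counter from the standard library, which is a dictionary specialized for tallies. The sketch below is an alternative to the hand-written dictionary, not the script used in this article; count_names and the sample input are illustrative names only.

```python
from collections import Counter


def count_names(lines):
    """Map each stripped line to the number of times it occurs."""
    return Counter(line.strip() for line in lines)


# Hypothetical input standing in for the lines read from standard input.
counts = count_names(["alice\n", "bob\n", "alice\n"])
for name, count in counts.items():
    # Same output layout as before: count, tab, name.
    print("%d\t%s" % (count, name))
```

Passing sys.stdin to count_names instead of the sample list would give the same behavior inside a pipeline.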
This prints the number of times each user appears followed by the user's name, separated by a tab. The next step is to print the results in descending order of login count. This could be done in Python, but let's use UNIX commands instead. As mentioned earlier, the sort command sorts alphabetically by default. Given the -rn options, it sorts numerically in descending order. Because the Python script writes to standard output, we can pipe its output into the sort command:
$ cat names.log | python namescount.py | sort -rn
This example uses Python as part of the command chain. The advantages of using Python this way are:
- You can chain it with commands such as cat and sort. For simple tasks (reading files, sorting lines numerically) you can rely on mature UNIX commands. These commands read their input one line at a time, which means they cope with very large files and are highly efficient.
- When one part of the command chain is difficult to implement, a Python script is a clear solution: it lets us do exactly what we need and lightens the load on the rest of the chain.
- The Python script is a reusable module. Although this example counts names, nothing in it is specific to names: for any input with repeated lines, it prints each distinct line and the number of times it occurs. Keeping Python scripts modular like this lets you apply them elsewhere.
To demonstrate the power of combining modular Python scripts with pipes, let's extend the example. Let's find the five users who use the service most. The head command lets us specify how many lines to output. Add it to the command chain:
$ cat names.log | python namescount.py | sort -rn | head -n 5
This command lists only the top five users. Similarly, to get the five users who use the service least, you can use the tail command, which takes the same parameter. Because the Python script writes its result to standard output, you can keep extending and building on its functionality.
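As noted above, the sorting and the selection of the top entries could also be done inside Python. The following sketch assumes the counts are already held in a collections.Counter; the user names, the numbers, and the helper names top_users and bottom_users are invented for illustration.

```python
from collections import Counter

# Hypothetical counts, standing in for the tallies built from a log file.
counts = Counter({"alice": 7, "bob": 3, "carol": 5, "dave": 1})


def top_users(counts, n):
    """Return the n (name, count) pairs with the highest counts."""
    return counts.most_common(n)


def bottom_users(counts, n):
    """Return the n (name, count) pairs with the lowest counts."""
    return sorted(counts.items(), key=lambda item: item[1])[:n]


print(top_users(counts, 2))
print(bottom_users(counts, 2))
```

Whether this is preferable to sort and head is a design choice: keeping the Python script small and leaving ordering to the shell tools is the approach this article follows.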
To demonstrate the modular nature of the script, let's extend the problem again. The service also generates a comma-separated (CSV) log file that contains a list of email addresses and the comment each address left about our service. Here is an example line:
"email@example.com","This service is great."
The task is to send a thank-you message to the 10 users who commented most. First, we need a script that reads the CSV and outputs one of its fields. Python provides a standard CSV reading module. The following Python script implements this functionality:
#!/usr/bin/env python
# The csv module comes with the Python standard library.
import csv
import sys

if __name__ == "__main__":
    # The csv module's reader takes a file object as input.
    # In this case, it's sys.stdin.
    csvfile = csv.reader(sys.stdin)
    # This script accepts a parameter that specifies the index of
    # the column; sys.argv is used to get the parameter.
    column_number = 0
    if len(sys.argv) > 1:
        column_number = int(sys.argv[1])
    # Each row of the CSV file is a list of comma-delimited fields.
    for row in csvfile:
        print(row[column_number])
This script parses the CSV and returns the text of the field specified by the parameter. It uses print instead of sys.stdout.write because print writes to standard output by default.
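The column extraction can be tried without a real log file by feeding the csv module an in-memory stream. In this sketch, extract_column is a hypothetical helper and the second sample row is invented; it also shows that a comma inside a quoted field does not split the field.

```python
import csv
import io


def extract_column(stream, column_number=0):
    """Yield one field from every row of a CSV stream."""
    for row in csv.reader(stream):
        yield row[column_number]


# Hypothetical two-row log in the format shown earlier.
sample = io.StringIO(
    '"email@example.com","This service is great."\n'
    '"other@example.com","Works well, thanks."\n'
)
print(list(extract_column(sample)))
```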
Let's add this step to the command chain. Combined with the other commands, the new script lets us list the email addresses with the most comments. (Assume that the CSV file is called emailcomments.csv and that the new script is called csvcolumn.py.)
Next, we need a way to send mail. The Python standard library includes smtplib, a module for connecting to an SMTP server and sending mail. Let's write a simple Python script that uses this module to send a message to each of the top 10 users.
#!/usr/bin/env python
import smtplib
import sys

gmail_smtp_server = "smtp.gmail.com"
gmail_smtp_port = 587
gmail_email = "Your Gmail email goes here"
gmail_password = "Your Gmail password goes here"


def initialize_smtp_server():
    '''
    This function initializes and greets the SMTP server.
    It logs in using the provided credentials and returns
    the SMTP server object as a result.
    '''
    smtpserver = smtplib.SMTP(gmail_smtp_server, gmail_smtp_port)
    smtpserver.ehlo()
    smtpserver.starttls()
    smtpserver.ehlo()
    smtpserver.login(gmail_email, gmail_password)
    return smtpserver


def send_thank_you_mail(email):
    to_email = email
    from_email = gmail_email
    subj = "For being an active commenter"
    # The header consists of the To, From, and Subject lines
    # separated using a newline character.
    header = "To:%s\nFrom:%s\nSubject:%s \n" % (to_email, from_email, subj)
    # Hard-coded templates are not best practice.
    msg_body = """
    Hi %s,

    Thank you very much for your repeated comments on our service.
    The interaction is much appreciated.

    Thank you.""" % email
    content = header + "\n" + msg_body
    smtpserver = initialize_smtp_server()
    smtpserver.sendmail(from_email, to_email, content)
    smtpserver.close()


if __name__ == "__main__":
    # For every line of input.
    for email in sys.stdin.readlines():
        # Strip the trailing newline so it does not end up in the headers.
        send_thank_you_mail(email.strip())
This Python script can connect to any SMTP server, local or remote. For ease of use, I use Gmail's SMTP server; you must supply a valid address and password to connect to Gmail. The script uses functions in the smtplib library to send the mail. This again demonstrates the power of Python scripts: an interaction such as talking to an SMTP server is easy to read when written in Python. An equivalent shell script would be more complex, and libraries like smtplib are essentially unavailable to it.
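Building the headers by hand with string formatting works for a demonstration, but the standard library can also assemble the message. The sketch below is an alternative, not the approach the script takes: it assumes Python 3.6 or later for email.message.EmailMessage, and build_thank_you_message is a hypothetical helper name.

```python
from email.message import EmailMessage


def build_thank_you_message(to_email, from_email):
    """Build the thank-you mail as an EmailMessage object."""
    msg = EmailMessage()
    msg["To"] = to_email
    msg["From"] = from_email
    msg["Subject"] = "For being an active commenter"
    msg.set_content(
        "Hi %s,\n\nThank you very much for your repeated comments "
        "on our service.\nThe interaction is much appreciated.\n" % to_email
    )
    return msg


# Placeholder addresses, used only to show the resulting message.
print(build_thank_you_message("user@example.com", "service@example.com"))
```

An object built this way can be handed to the send_message method of an smtplib.SMTP connection instead of calling sendmail with a hand-built string.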
To send email to the 10 users who comment most often, we first need to isolate the email column. To extract a column, you can use the cut command in Linux. In the following example, the work is split into two separate command chains. For ease of use, I write the output of the first chain to a temporary file, which is then fed into the second chain. This just makes the process more readable (the Python mail-sending script is called sendemail.py):
$ cat emailcomments.csv | python csvcolumn.py |
    python namescount.py | sort -rn > /tmp/comment_freq
$ cat /tmp/comment_freq | head -n 10 | cut -f2 |
    python sendemail.py
This shows Python's real power as a utility in a bash command chain. Scripts that accept data on standard input and write all their output to standard output let developers chain fast, simple commands and Python programs together. The philosophy of designing small programs that each serve one purpose fits well with the pipeline approach used here.
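The same small-tool idea applies in the other direction: a step such as cut -f2 could be replaced by a few lines of Python if it ever needed custom behavior. The sketch below only approximates cut (for example, it yields an empty string when a line has too few fields), and cut_field is a name invented for this illustration.

```python
def cut_field(lines, field, delimiter="\t"):
    """Yield the 1-based field of each line, similar to `cut -f`."""
    for line in lines:
        parts = line.rstrip("\n").split(delimiter)
        # Lines with too few fields produce an empty string in this sketch.
        yield parts[field - 1] if field <= len(parts) else ""


# Hypothetical count/name lines like those in the temporary file above.
sample = ["7\talice@example.com\n", "3\tbob@example.com\n"]
print(list(cut_field(sample, 2)))
```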
Python scripts run at the command line usually take parameters that the user chooses when running the command. For example, the head command takes the -n flag followed by a number, and prints only that many lines. Every parameter passed to a Python script is available through the sys.argv list, which can be accessed after import sys. The following code shows how to use plain words as parameters. The program is a simple adder that takes two numeric parameters, adds them, and prints the result. However, this way of handling command-line parameters is very basic. It is also error-prone: for example, if you pass two strings such as hello and world, the program fails immediately with an error:
#!/usr/bin/env python
import sys

if __name__ == "__main__":
    # The first element of sys.argv is always the filename,
    # meaning that the length of the system arguments will be
    # more than one when command-line arguments exist.
    if len(sys.argv) > 2:
        num1 = int(sys.argv[1])
        num2 = int(sys.argv[2])
    else:
        print("This command takes two arguments and adds them")
        print("Less than two arguments given.")
        sys.exit(1)
    print("%s" % str(num1 + num2))
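One way to make the adder less fragile is to catch the ValueError that int raises for non-numeric input. This is a minimal sketch of that idea; the function add_arguments and its choice to return None on bad input are assumptions made for the example, not part of the original program.

```python
def add_arguments(argv):
    """Add the two integers in argv[1] and argv[2].

    Returns the sum, or None when the arguments are missing or not integers.
    """
    if len(argv) < 3:
        return None
    try:
        return int(argv[1]) + int(argv[2])
    except ValueError:
        # int() raises ValueError for input such as "hello".
        return None


print(add_arguments(["adder.py", "2", "3"]))
print(add_arguments(["adder.py", "hello", "world"]))
```

In a real script, sys.argv would be passed in and a None result would be turned into a usage message and a non-zero exit code.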
Thankfully, Python has several modules that handle command-line arguments. Personally, I prefer OptionParser. OptionParser is part of the optparse module in the standard library. It lets you do a number of very useful things with command-line parameters:
- If a parameter is not supplied, you can specify a default value for it.
- It supports both flags (which are either present or absent) and parameters with values (-n 10000).
- It supports different formats for passing parameters, for example both --n=100000 and -n 100000.
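The three points above can be seen in a small, self-contained sketch. The option names (-n/--lines and -v/--verbose) and the file name input.log are hypothetical and chosen only for this illustration; parse_args is given an explicit list so the example does not depend on the real command line.

```python
from optparse import OptionParser

parser = OptionParser(usage="usage: %prog [options]")
# An option that takes a value and falls back to a default.
parser.add_option("-n", "--lines", dest="lines", type="int", default=10,
                  help="number of lines to print")
# A flag that is either present or absent.
parser.add_option("-v", "--verbose", dest="verbose", action="store_true",
                  default=False, help="print extra output")

# Short form with a separate value, plus a flag and a positional argument.
options, args = parser.parse_args(["-n", "5", "--verbose", "input.log"])
print(options.lines, options.verbose, args)

# Long form with an equals sign; the flag is absent, so its default applies.
options2, args2 = parser.parse_args(["--lines=7"])
print(options2.lines, options2.verbose, args2)
```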
We are going to use OptionParser to improve the mail-sending script. The original script has many values hard-coded in place, such as the SMTP details and the user's login credentials. In the following code, these values are passed as command-line arguments instead:
#!/usr/bin/env python
import smtplib
import sys
from optparse import OptionParser


def initialize_smtp_server(smtpserver, smtpport, email, pwd):
    '''
    This function initializes and greets the SMTP server.
    It logs in using the provided credentials and returns
    the SMTP server object as a result.
    '''
    smtpserver = smtplib.SMTP(smtpserver, smtpport)
    smtpserver.ehlo()
    smtpserver.starttls()
    smtpserver.ehlo()
    smtpserver.login(email, pwd)
    return smtpserver


def send_thank_you_mail(email, smtpserver, from_email):
    # from_email is passed in because the sender address is no
    # longer a hard-coded global in this version.
    to_email = email
    subj = "For being an active commenter"
    # The header consists of the To, From, and Subject lines
    # separated using a newline character.
    header = "To:%s\nFrom:%s\nSubject:%s \n" % (to_email, from_email, subj)
    # Hard-coded templates are not best practice.
    msg_body = """
    Hi %s,

    Thank you very much for your repeated comments on our service.
    The interaction is much appreciated.

    Thank you.""" % email
    content = header + "\n" + msg_body
    smtpserver.sendmail(from_email, to_email, content)


if __name__ == "__main__":
    usage = "usage: %prog [options]"
    parser = OptionParser(usage=usage)
    parser.add_option("--email", dest="email",
                      help="email to login to SMTP server")
    parser.add_option("--pwd", dest="pwd",
                      help="password to login to SMTP server")
    parser.add_option("--smtp-server", dest="smtpserver",
                      help="SMTP server URL", default="smtp.gmail.com")
    parser.add_option("--smtp-port", dest="smtpserverport", type="int",
                      help="SMTP server port", default=587)
    options, args = parser.parse_args()
    # Both values are required, so fail if either one is missing.
    if not (options.email and options.pwd):
        parser.error("Must provide both an email and a password")
    smtpserver = initialize_smtp_server(options.smtpserver,
                                        options.smtpserverport,
                                        options.email, options.pwd)
    # For every line of input.
    for email in sys.stdin.readlines():
        send_thank_you_mail(email.strip(), smtpserver, options.email)
    smtpserver.close()
This script shows what OptionParser can do. It provides a simple, easy-to-use interface to command-line arguments and lets you define certain attributes for each option. It also lets you specify default values. If required parameters are missing, the script can report a specific error.
So what have we learned? Rather than replacing every bash command with a Python script, we recommend letting Python handle the difficult parts of a task. This calls for more modular, reusable scripts that take advantage of Python's strengths.
Treating stdin as a file object lets Python read input that other commands have written to the pipeline, and writing to stdout lets Python pass information on to the next step in the pipeline. Combining these features, you can build powerful programs. The example presented here processes the log files of a service.
In practice, I have worked with a CSV file several gigabytes in size that had to be converted by a Python script into SQL commands that insert the data. Processing the file and preparing its data for a table took the script 23 hours and generated 20 GB of SQL. The advantage of the Python programming style described in this article is that the file never has to be read into memory as a whole: the entire 20 GB+ file can be processed one line at a time. Each logical step (read, sort, maintain, and output) is also clearly separated. And we can rely on these commands, which are core tools of Unix-like environments: they are very efficient and stable, and they help us build stable, reliable programs.
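The line-at-a-time idea can be sketched with a generator that turns CSV rows into INSERT statements lazily. This is not the script described above, whose details are not given here; the function name rows_to_inserts, the table name comments, and the sample row are all hypothetical, and the quoting is deliberately simplistic.

```python
import csv
import io


def rows_to_inserts(stream, table):
    """Lazily turn CSV rows into SQL INSERT statements, one row at a time."""
    for row in csv.reader(stream):
        # Double any single quotes so the values stay valid SQL string literals.
        values = ", ".join("'%s'" % field.replace("'", "''") for field in row)
        yield "INSERT INTO %s VALUES (%s);" % (table, values)


# Hypothetical one-row input; a real run would pass sys.stdin instead.
sample = io.StringIO('"email@example.com","This service is great."\n')
for statement in rows_to_inserts(sample, "comments"):
    print(statement)
```

Because the generator yields one statement per row, memory use stays flat no matter how large the input is. Code that talks to a database directly would normally use parameterized queries rather than building SQL strings.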
Another advantage is that we do not need to hard-code the filename. This makes the program more flexible, since the input is simply passed in. For example, if the script is interrupted at line 20,000 of a file, we do not need to start over: we can use tail to begin at the line that failed, so that the script continues from that position.
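If the resume logic ever needs to live inside Python, itertools.islice can skip the lines that were already processed without loading the file into memory. In this sketch, resume_from is a hypothetical helper and the line number follows the 1-based convention of tail -n +N.

```python
from itertools import islice


def resume_from(lines, start_line):
    """Yield lines beginning at the 1-based start_line, like `tail -n +N`."""
    return islice(lines, start_line - 1, None)


# Hypothetical input; a real run would pass sys.stdin or an open file.
sample = ["line 1\n", "line 2\n", "line 3\n", "line 4\n"]
print(list(resume_from(sample, 3)))
```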
Python's uses in the shell go well beyond this article, for example the os and subprocess modules. The os module is part of the standard library and performs many operating-system-level operations, such as listing directory structures and retrieving file statistics; it also contains the excellent os.path submodule for handling and normalizing directory paths. The subprocess module lets a Python program run system commands and perform more advanced operations, such as setting up pipes like those described above between Python code and the processes it spawns. If you need to write shell scripts in Python, these libraries are worth studying.
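As a starting point, here is a brief sketch of both modules. It assumes Python 3.7 or later for the capture_output and text arguments of subprocess.run. The path components are made up, and the child process is simply another Python interpreter that upper-cases its input, so the example does not depend on any particular system command being installed.

```python
import os.path
import subprocess
import sys

# os.path handles path manipulation portably.
path = os.path.join("logs", "archive", "..", "names.log")
print(os.path.normpath(path))
print(os.path.basename(path))

# Feed text to a child process over a pipe and capture what it writes back.
child_code = "import sys; sys.stdout.write(sys.stdin.read().upper())"
result = subprocess.run(
    [sys.executable, "-c", child_code],
    input="alice\nbob\n",
    capture_output=True,
    text=True,
    check=True,
)
print(result.stdout)
```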