Article Title: The art of writing Linux utilities in C
Linux and other UNIX-like systems have always come with a large number of tools that perform a wide range of functions, from the obvious to the obscure. The success of UNIX-like programming environments owes much to the quality and selection of these tools, and to the ease with which they can be connected to one another.
As a developer, you may find that the existing utilities do not always solve your problem. Many problems can be solved easily by combining existing utilities, but others require at least some real programming. Those tasks are often candidates for a new utility which, combined with the existing ones, solves the problem with the least work. This article describes the qualities of a good utility and the process of designing one.
What are the qualities of excellent utilities?
The UNIX Programming Environment by Kernighan and Pike contains a wonderful discussion of this question. A good utility is one that does its job as well as possible. It must work well with other utilities, and it must be easy to combine with them. A program that cannot be combined with other utilities is not a utility but an application.
Utilities should let you build one-off applications cheaply and easily from the materials at hand. Many people think of utilities as the tools in a toolbox. The aim is not to have one tool that does everything, but to have a set of tools, each of which does one thing as well as possible.
Some utilities are quite useful on their own, while others are meant to work together with other utilities. Examples of the former include sort and grep. xargs, on the other hand, is rarely used except in combination with other utilities (most commonly find).
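As a small illustration of find and xargs working together, the sketch below lists the C source files under a directory that contain a given pattern. The function name list_matching_sources is made up for this example, and the -print0 and -0 options are assumed to be available (they are in the GNU and BSD tools, but older systems may lack them).

```shell
#!/bin/sh
# list_matching_sources DIR PATTERN
# Hypothetical helper: print the names of *.c files under DIR that contain PATTERN.
# find produces the file names, xargs hands them to grep in batches.
# -print0 / -0 keep names with spaces or newlines intact (assumed to be supported).
list_matching_sources() {
    find "$1" -type f -name '*.c' -print0 | xargs -0 grep -l -- "$2"
}
```

Called as list_matching_sources . main, it prints one matching file name per line, so its output can itself be fed to another utility.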
What language is used to write utilities?
Most UNIX system utilities are written in C. The examples in this article use Perl and sh. Use the right tool for the job. If you use a utility often enough, the cost of writing it in a compiled language may be repaid by better performance. On the other hand, a scripting language may offer faster development when the program's workload is light.
If you are not sure, use the language you know best. At least while you are prototyping a utility or finding out how useful it is, programmer efficiency takes precedence over performance tuning. Most UNIX system utilities are written in C because they are used heavily enough that efficiency matters more than development cost. Perl and sh (or ksh) can be good languages for rapid prototyping. Utilities that tie other programs together may be easier to write in shell than in a more conventional programming language. On the other hand, when you want to work with raw bytes, C is probably the best choice.
Designing utilities
A good rule of thumb is to start thinking about the design of a utility the second time you have to solve a problem. Do not regret the one-off hack you wrote the first time; think of it as a prototype. The second time, compare what you need with what you needed the first time. Around the third time, you should consider taking the time to write a general utility. Even a purely repetitive task may merit a utility. For example, many general-purpose file renaming programs have been written because people were frustrated at trying to rename files in a general way.
Do one thing well; do not do multiple things badly
The best example of doing one thing well may be sort. No utility other than sort has a sorting feature. The basic idea is simple: if you solve only one problem at a time, you can take the time to solve it well.
Imagine how frustrating it would be if most programs had a sorting feature, but some supported only lexical sorting, others only numeric sorting, and a few let you select sort keys instead of sorting by whole lines. At the very least, it would be annoying.
When you find a problem that needs solving, try to break it down into parts, and do not duplicate the parts that already exist in other utilities. The more you focus on tools that work with existing tools, the more useful your utilities will be.
You may need to write more than one program. The best way to accomplish a specialized task is often to write one or two utilities and a little glue to tie them together, rather than a single program that solves the whole thing. A 20-line shell script that combines the new utility with existing tools is perfectly fine. If you try to solve the whole problem at once, the first change that comes along may force you to rethink the whole thing.
I occasionally need to produce two- or three-column output from a database. It is usually more efficient to write one program that produces the output in a single column and then use another program to split it into columns. The shell script that combines the two utilities is throwaway; the individual utilities outlive it.
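The split into columns can often be left to a tool that already exists. The sketch below uses the standard paste utility for that step; the function name three_columns is invented here, and any command that writes one value per line can stand in for the single-column producer.

```shell
#!/bin/sh
# three_columns: read one value per line on standard input and
# write three tab-separated values per output line.
# Each "-" operand makes paste take the next input line from stdin.
three_columns() {
    paste - - -
}

# Stand-in for a program that emits a single column of values.
printf '%s\n' 1 2 3 4 5 6 | three_columns
```

With the six input lines shown, the output is two lines, "1 2 3" and "4 5 6", with the values separated by tabs. The producer stays simple, and changing the layout later means changing only the glue.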
Some utilities serve very specific needs. In a directory with many entries, the output of ls may scroll off the screen very quickly; this can happen because one file has a very long name, which forces ls to use a single column. Paging through the output with more takes a while. Why not just sort the lines by length, as in the following, and pipe the result through tail?
Listing 1. sl, the smallest utility in the world
The script in Listing 1 does exactly one thing. It takes no options because it needs none; it cares only about the length of lines. Thanks to Perl's convenient <> idiom, this small utility works both on standard input and on files named on the command line.
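The same idea can be sketched in sh. This is not the Perl listing the text refers to, only a rough equivalent built from awk, sort, and cut: each line is prefixed with its length, the lines are sorted numerically on that prefix, and the prefix is removed again.

```shell
#!/bin/sh
# sl: sort lines by length, shortest first.
# Reads the files named as arguments, or standard input when there are none.
# Lines of equal length come out in the order sort gives to the whole line.
sl() {
    awk '{ print length($0) "\t" $0 }' "$@" |   # prefix each line with its length
        sort -n |                               # numeric sort on that prefix
        cut -f2-                                # drop the prefix again
}
```

Used as ls | sl | tail, it shows the longest names in the directory last, which is where the offending entry will be.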
Be a filter
Almost every utility is best thought of as a filter, although a few very useful utilities do not fit this model. (For example, a program that counts something can be very useful even though it does not work well as a filter. Programs that take only command-line arguments as input and produce potentially complex output can also be very useful.) Most utilities, however, should work as filters. By convention, filters operate on lines of text. Most filters should support multiple input files.
Remember that a utility needs to work both on the command line and in scripts. Sometimes the ideal behavior differs slightly between the two. For example, most versions of ls automatically arrange their output in multiple columns when writing to a terminal. By default, grep prints the name of the file in which each match was found when more than one file is specified. Such differences should depend on how the user wants the utility to work and on nothing else. For example, old versions of GNU bc displayed an intrusive copyright notice at startup. Do not do that. Make your utility stick to doing its job.
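A shell utility can make the same kind of distinction that ls makes by testing whether standard output is a terminal. The sketch below is only an illustration of that test: show_list is a made-up name, and pr -3 -t is just one convenient way to get three columns.

```shell
#!/bin/sh
# show_list: pass lines from standard input to standard output.
# On a terminal the lines are arranged in three columns for a human reader;
# in a pipeline or when redirected to a file, each line stays a separate record.
show_list() {
    if [ -t 1 ]; then
        pr -3 -t    # stdout is a terminal: three columns, no page headers
    else
        cat         # stdout is a pipe or file: leave the records untouched
    fi
}
```

Because the pipeline case leaves the data alone, show_list can sit in the middle of a script without surprising the programs downstream.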
Utilities like to live in pipelines. A pipeline lets a utility focus on its own job and leave the rest to others. To live in a pipeline, a utility needs to read data from standard input and write data to standard output. If you are processing records, it is best to make each line one "record". Existing programs such as sort and join already think this way, and they will thank you for it.
I occasionally use a utility that calls other programs repeatedly over a file tree. It makes very good use of the standard UNIX filter model, but it only works with utilities that read input and write output; it cannot be used with utilities that operate in place or that take input and output file names.
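The utility itself is not shown, so the following is only a guess at its general shape: a function, here called apply_filter, that runs a filter command over every regular file under a directory and replaces each file with the filter's output. It assumes mktemp is available, and it will misbehave if an argument of the filter command is a lone ";" or "{}", since find treats those specially.

```shell
#!/bin/sh
# apply_filter DIR COMMAND [ARG...]
# Hypothetical sketch: run COMMAND as a filter (stdin to stdout) on every
# regular file under DIR, replacing the file's contents with the output.
# A file is left untouched when COMMAND fails on it.
apply_filter() {
    dir=$1
    shift
    find "$dir" -type f -exec sh -c '
        file=$1
        shift
        tmp=$(mktemp) || exit 1
        if "$@" <"$file" >"$tmp"; then
            cat "$tmp" >"$file"
        fi
        rm -f "$tmp"
    ' sh {} "$@" \;
}
```

For example, apply_filter docs tr a-z A-Z would upper-case every file under docs. This works only because tr follows the filter model; a program that insists on input and output file names could not be plugged in this way.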
Most programs that can run on standard input can also run on a single file or a group of files. Note: arguably this violates the rule against duplicating effort, since the same thing could obviously be done by feeding the output of cat to the next program in the chain. In practice, however, it seems justified.
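In a shell utility, accepting either standard input or a list of files costs almost nothing, because cat already does the work. A minimal sketch, using an invented upcase function wrapped around tr (which on its own reads only standard input):

```shell
#!/bin/sh
# upcase [FILE...]
# Convert text to upper case. With no arguments cat reads standard input;
# with arguments it concatenates the named files, so the filter itself
# never needs to know where its input came from.
upcase() {
    cat -- "$@" | tr '[:lower:]' '[:upper:]'
}
```

Both printf 'abc\n' | upcase and upcase notes.txt more.txt then behave as a user would expect.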
Some programs read records in one format but produce output in a completely different one. An example is a utility that arranges its input in columns. Such a utility may treat each input line as a record but put several records on each output line.
Not every utility fits this model completely. For example, xargs takes file names rather than records as input, and all the actual processing is done by some other program.
Generalization
Try to think of tasks similar to the one you are actually performing. If you can find a general description of these tasks, it is best to write a utility that fits that description. For example, if you find yourself sorting text lexically one day and numerically the next, it may make sense to write a general sorting utility.
Generalizing functionality sometimes leads to the discovery that what looked like a single utility is really two utilities used together. That is fine. Two well-defined utilities can be easier to write than one ugly or complicated one.
Doing one thing well does not mean doing exactly one thing. It means handling a consistent but useful problem space. Many people use grep, yet much of its usefulness comes from its ability to perform related tasks. The various options of grep do the work of a handful of small utilities; had that work been done by separate small utilities, a great deal of shared, duplicated code would have resulted.
This rule and the rule about doing one thing well both follow from one underlying principle: avoid duplicating code whenever possible. If you write half a dozen programs, each of which sorts lines, you may end up fixing similar bugs half a dozen times instead of maintaining a single, better sort program.
This is the part of writing a utility that adds the most work to the process of finishing it. You may not have time to generalize a utility fully at first, but the effort pays off as you keep using the utility.
Sometimes it is useful to add related functionality to a program, even when it is not quite the same task. For example, a program that pretty-prints raw binary data might be more useful if, when run on a terminal device, it put the terminal into raw mode. This makes it much easier to test questions involving keyboard mappings and new keyboards. Not sure why you get a tilde (~) when you press the delete key? This is an easy way to find out what is actually being sent. It is not exactly the same task, but it is similar enough to be a likely addition.
The errno utility in Listing 2 is a good example of generalization because it supports both numeric and symbolic names.
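To show what accepting both forms can look like, here is a deliberately tiny sketch in sh. It is not the listing the text refers to: the table is a hand-written subset of three entries using their Linux values, and the names errno_table and errno_lookup are invented for this example. A real tool would take its data from the system headers.

```shell
#!/bin/sh
# errno_table: illustrative subset only (Linux values), not a complete list.
errno_table() {
    cat <<'EOF'
EPERM 1 Operation not permitted
ENOENT 2 No such file or directory
EACCES 13 Permission denied
EOF
}

# errno_lookup KEY
# KEY may be a symbolic name (ENOENT) or a number (2); either column matches.
# Exits with a non-zero status when nothing matches.
errno_lookup() {
    errno_table | awk -v key="$1" '
        $1 == key || $2 == key { print; found = 1 }
        END { exit !found }
    '
}
```

Because one lookup handles both columns, errno_lookup 2 and errno_lookup ENOENT print the same line, and there is no need for two separate tools.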
Robustness
The robustness of a utility is very important. A utility that crashes easily or cannot handle real data is not a useful utility. The utility should be able to process any length