From Wang Yin: the flaw of Unix
In this article I would like to explain my understanding of the nature of the Unix philosophy. Although I point out a design problem in Unix, my goal is not to attack it. Unix has a serious problem at the level of its basic concepts, but after years of development the problem may have been compensated for by other factors (such as an enormous amount of human effort). Still, if we confront this problem directly, we may be able to gradually improve the structure of the system, making it more efficient, convenient, and safe to use, and that would be a good thing. At the same time, I hope this description of the essence of Unix commands can help people grasp Unix quickly and apply its potential flexibly while avoiding its shortcomings.
The "Unix Philosophy", commonly referred to, consists of the following three principles [Mcllroy]:
1, a program to do only one thing, and do it well.
2, the program can work together.
3, the program handles the text stream, because it is a common interface.
Of these three principles, the first two actually existed before Unix; they describe the most basic principle of programming: modularity. Any programming language with functions and function calls embodies these two principles. In short, the first is about defining functions and the second is about calling them. A "program" is really just a function named "main" (see below).
So only the third principle (interfacing through text streams) is unique to Unix. In what follows, "Unix philosophy", unless otherwise qualified, refers specifically to this third principle. But many facts show that this third principle contains a substantial error. Not only has it been creating unnecessary problems for us, it has largely undermined the implementation of the first two principles. Yet this principle has been treated as sacred by many people. Many programmers use text streams to represent data in their own programs and protocols, triggering all kinds of headaches, and then ignore the cause.
Linux improves on Unix in many ways, but we have to see that it inherits the Unix philosophy. The Linux command line, configuration files, and various tools all pass data as unnormalized text streams. This creates inconsistent information formats and makes it difficult for programs to collaborate. This is not to say that Windows or the Mac do much better, although they have improved on some points. Virtually all common operating systems are influenced by the Unix philosophy, so all of them are shadowed by it to some degree.
The impact of the Unix philosophy is manifold: from the command line to programming languages, to databases, to the Web... every aspect of computer and network systems shows its shadow. Here I will trace many problems back to their common root: the Unix philosophy. I will start with the simplest command line, and hopefully from these simple examples you will see how Unix executes commands and what problems arise. (The essence of a text stream is a string, so the two terms are used interchangeably below.)
The basic process of running a Linux command
Almost every Linux user has been puzzled by the command line. Many people (including me) used Linux for years without fully mastering it. Even after reading the documentation and thinking you have seen through it all, inexplicable problems still appear, sometimes eating up half a day. In fact, once you see through the nature of the command line, you will find that many of these problems are not the user's fault. Linux inherited Unix's "philosophy" of using text streams to represent data and parameters, and that is what makes the command line hard to learn.
Let's start by analyzing how the Linux command line works, using the run of a very simple command, "ls -l *.c". This is not the whole process, of course, but the omitted details have nothing to do with the subject at hand.
During the whole run of this ls command, the following things happen:
1. The shell (in this case bash) reads the input string "ls -l *.c" from the terminal. It then splits the string on whitespace, producing three strings: "ls", "-l", and "*.c".
2. The shell notices that the last string, "*.c", is a wildcard pattern, so it searches the current directory for files matching it. It finds two: foo.c and bar.c.
3. The shell splices the names of these two files together with the remaining strings into a string array {"ls", "-l", "bar.c", "foo.c"} of length 4.
4. The shell spawns a new process in which a program named "ls" is executed, passing the string array {"ls", "-l", "bar.c", "foo.c"} and its length 4 as the parameters of ls's main function. The main function is the "entry point" of a C program, as you may already know.
5. The ls program starts, receives these two parameters (argv, argc), and analyzes them to extract the useful information. For example, ls sees that the second element of argv, "-l", begins with "-" and concludes it is an option: the user wants the detailed listing of the files. So it sets a Boolean variable to record this, which it later consults to decide the output format. (A sketch of this guesswork appears after this list.)
6. ls prints the "long format" information for foo.c and bar.c and then exits, returning the integer 0.
7. The shell learns that ls has exited with return value 0. In the shell, 0 indicates success, and any other value (positive or negative) indicates failure, so the shell knows ls ran successfully. Since there is no other command to run, the shell prints a prompt to the screen and waits for new terminal input...
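To make steps 4 and 5 concrete, here is a minimal sketch (in Python rather than C, purely for brevity) of the guesswork a program like ls must do. Note that by the time the program starts, "-l" is just another string in the array; nothing marks it as an option rather than a file name.

    import sys

    # By the time any program starts, it receives only a flat array of strings;
    # sys.argv plays the role of C's (argc, argv).
    long_format = False
    files = []
    for arg in sys.argv[1:]:
        if arg.startswith("-"):       # guesswork: "-l" merely *looks like* an option
            long_format = long_format or ("l" in arg)
        else:
            files.append(arg)         # everything else is assumed to be a file name
    print("long format:", long_format)
    print("files:", files)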
From the run of this command we can see that text streams (strings) are ubiquitous on the command line:
The user's input at the terminal is a string.
The shell takes that string from the terminal, splits it into 3 strings, and expands the wildcard to obtain 4 strings.
The ls program receives those 4 strings as its parameters, and when it sees the string "-l" it decides to use the long format for its output.
Next you will see the problems this practice causes.
A corner of the iceberg
At the beginning of The Unix-Hater's Handbook (hereafter UHH), the authors enumerate a litany of charges against the Unix command-line user interface, which at first reads like the cursing of bad-tempered beginners. Look carefully, though, and you will find that, attitude aside, some of what they say carries a very deep truth. We can always learn something from the people who scold us. Looking closely, we find that the root of these command-line problems is precisely the "Unix philosophy": using text streams (strings) to represent parameters and data. Many people do not realize how many problems the overuse of text streams has caused. I will list more later; first, here are some of the simplest examples to show the nature of the problem. You can try them yourself right now.
(Here is the ls experiment; I failed to reproduce it myself, but the procedure is as follows.)
Execute the following command in your Linux terminal (type: a greater-than sign, a minus sign, then a lowercase L). This creates a file named "-l" in the current directory.
$ >-l
Now execute the command "ls *" (your intent being to list all the files in the directory, in short format).
What do you see? You gave ls no option, yet the files are listed in "long format", and the listing is missing the file named "-l" that you just created. For example, I get the following output:
-rw-r--r-- 1 wy wy 0 2011-05-22 23:03 bar.c
-rw-r--r-- 1 wy wy 0 2011-05-22 23:03 foo.c
What on earth is going on? Review the process above, paying special attention to step 2. Before calling ls, the shell expanded the wildcard * into all the files in the directory, that is, "foo.c", "bar.c", and the file named "-l". It put these 3 strings, together with ls's own name, into a string array {"ls", "bar.c", "foo.c", "-l"} and handed it to ls. What happened next is that ls received the string array, found the string "-l" in it, and took it for an option: the user wants file information printed in "long format". And because "-l" was taken for an option, it was not listed as a file. Hence the result: long format, and one file missing!
What does this mean? Is it the user's fault? Experts may laugh: who would be so silly as to create a file named "-l" in a directory? But it is precisely this attitude that makes us turn a blind eye to the mistake and even lets it flourish. In fact, if we set aside the feeling of superiority and look at it rationally, we find that all of this is a system design problem, not a user error.
I think a system must provide practical safeguards, not just verbal warnings that ask users to be "careful". It is like digging a big hole in the street: you must put up roadblocks and warning lights. You cannot just plant a little flag with a line of small print: "Construction ahead; proceed at your own risk." I think every normal person would judge that to be the builder's fault.
Yet Unix has been doing exactly this kind of construction to its users all along. It demands: "Read the man page, or suffer the consequences yourself." It is not that users are lazy; there are simply too many such provisions for anyone to remember them all. And who goes looking up such obscure details before being bitten by them? By the time you are bitten, regret comes too late. If a simple task requires knowing this many potential pitfalls, how many more lurk in complex tasks? Nobody knows how much valuable time all these small Unix problems, added together, have cost.
If you want to be more certain of the danger of this problem, try the following. Before you do, please create a new test directory, so that you do not lose your own files!
1. In the new directory, first create two folders, dir-a and dir-b, and three ordinary files, file1, file2, and "-rf". Then run "rm *", intending to delete all the ordinary files without touching the directories.
$ mkdir dir-a dir-b
$ touch file1 file2
$ >-rf
$ rm *
2. Now use ls to view the directory.
You will find only one file left: "-rf". Ordinarily "rm *" can delete only ordinary files, but because a file named "-rf" was present, rm took it for the option that forces recursive removal, and so it deleted all the files and directories in the directory (except "-rf" itself).
Surface Solutions
Does this mean we should simply forbid any file name that begins with "-", since such names confuse the division between options and file names? Unfortunately, because Unix gives programmers "flexibility", not every program treats a parameter beginning with "-" as an option; under Linux, commands such as tar and ps are exceptions. So this scheme is not really feasible.
As the example above shows, the source of the problem seems to be that ls does not know the wildcard * exists at all: the shell expands the wildcard before calling ls, so what ls receives is a string array in which file names and options are mixed together. Hence the authors of UHH say: "The shell should not expand wildcards at all. Wildcards should be passed directly to the program, which expands them itself by calling a library function."
This scheme would indeed cure the case at hand: if the shell handed the wildcard directly to ls, ls would see only a single parameter, "*". It would call a library function to look up all the files of the current directory in the file system, and it would know perfectly well that "-l" is a file, not an option, because it received no option from the shell (it received only the one parameter "*"). So the problem seems solved. A sketch of this idea follows.
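As a minimal illustration of the UHH proposal (my sketch in Python, for a hypothetical ls-like program; not code from the book): the shell passes the pattern through literally, and the program expands it itself, so anything produced by the expansion is known to be a file name.

    import glob
    import sys

    # Under the UHH proposal the shell does not expand "*"; the program
    # receives the pattern itself and expands it with a library function.
    pattern = sys.argv[1]
    files = glob.glob(pattern)   # a file named "-l" found here is unambiguously a file
    for f in files:
        print(f)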
However, making every command check for wildcards itself and then call a library function to interpret them greatly increases the programmer's workload and the probability of error. Besides, the shell expands more than wildcards: environment variables, brace expansion, ~ expansion, command substitution, arithmetic expansion... Should every program redo all of that by itself? That would violate the first principle of the Unix philosophy: modularity. Nor is this method a cure-all; it solves only this one problem. We will meet more problems caused by text streams that it cannot solve. Here is such an example.
Another corner of the iceberg
These seemingly trivial problems actually contain the essence of the Unix problem; if we fail to recognize it correctly, we escape one trap only to fall into another. Let me tell you a personal experience. Something like this happened to me last summer, near the end of my internship at Google...
Because my project depended on an open-source project, I had to submit all of that project's files into Google's Perforce code repository. The open-source project had more than 9,000 files, and Perforce was so slow that an hour into the submission it suddenly exited with an error, saying two files could not be found. I tried twice more (going out for coffee and a game of billiards in the meantime), and it still failed; by then the day was nearly over. So I searched for the two files: they did not exist. How could that be? I had used the command line given in the company manual to import the project's files into Perforce. The command was like this:
find -name '*.java' -print | xargs p4 add
It works like this: find locates all the files ending in ".java" under the directory tree and prints their names, separated by whitespace, as one text stream to xargs. xargs then splits that text back into multiple strings on whitespace, appends them after "p4 add", and assembles the result into a single command to execute. Roughly, you can think of find as Lisp's "filter" and xargs as "map". So this command translates into Lisp-style pseudocode:
(map (lambda (x) (p4-add x)) (filter (lambda (x) (regexp-match "*.java" x)) (files-in-current-dir)))
What went wrong? After an afternoon of confusion I finally found that one directory of this open-source project contained a file called "App Launcher.java". Because its name contains a space, xargs split it into two strings, "App" and "Launcher.java", and of course neither of those two files exists! That is why Perforce complained at submission time that it could not find them. When I told my team lead about the discovery, he said: "How could those guys give a Java program a name like that? Such noobs!"
But I don't think the programmers of that open-source project made any error; the incident actually exposes a Unix problem. The root of the problem is that the Unix commands involved (find, xargs) pass file names as strings, and their default "protocol" is "file names separated by whitespace". This project simply happened to contain a file name with a space in it, and ambiguity resulted. Who should be blamed? Since Linux allows spaces in file names, users are entitled to use that feature. Yet when trouble comes, the users are the ones called rookies and asked why they were not careful, why they did not read the man page.
Later I took a closer look at the man pages of find and xargs and discovered that their designers were actually aware of the problem: find has an option "-print0" and xargs has "-0", which make them use the NUL character (ASCII 0) instead of whitespace as the file-name delimiter, so spaces in file names cause no trouble (for example, "find -name '*.java' -print0 | xargs -0 p4 add"). Yet with problems like this, it seems you always learn the fix only after being bitten. Do users really need to know this much, and be this careful, just to use Unix effectively?
Text streams are not a reliable interface
These examples show the same essential problem from different angles: passing data as text streams is seriously problematic. Yes, the text stream is a "universal" interface, but it is not a "reliable" or "convenient" one. A Unix command basically works like this:
It gets a text stream from standard input, processes it, and prints a text stream to standard output.
Programs communicate through pipes, so that text streams can be passed from one program to another.
Two main processes are involved:
1. When a program "prints" to standard output, its data is converted into text. This is an encoding process.
2. When the text passes through a pipe (or a file) into another program, that program must extract the information it needs from the text. This is a decoding process.
Encoding looks easy: you just design a "syntax", such as "separated by spaces", and print. But designing an encoding well is far harder than it seems. If the format is poorly designed, the decoder suffers: at best it needs regular expressions to extract the information, and for anything more complex (such as program text) it needs a full parser. The most serious problem is that, because the use of text streams is encouraged, many programmers design their encodings casually, without careful thought. As a result, almost every Unix program has its own output format, which makes decoding a real headache, often ambiguous and confusing.
The find/xargs problem above is that find's encoding delimiter (the space) gets confused with spaces that may appear inside file names: this space is not that space. The earlier ls and rm problem is that once the shell "encodes" file names and options into one array of strings, ls cannot decode it well enough to tell which strings are file names and which are options: this string is not that string! The sketch below makes the find/xargs case concrete.
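A minimal sketch of that lossy round trip (plain Python, with one made-up file name, Main.java, alongside the real one from the story):

    files = ["App Launcher.java", "Main.java"]   # two real file names

    encoded = " ".join(files)      # what find effectively emits: one text stream
    decoded = encoded.split(" ")   # what xargs effectively does: split on whitespace

    print(decoded)   # ['App', 'Launcher.java', 'Main.java']: three "names", not two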
If you have used Java or a functional language (Haskell or ML), you may know some type theory. In type theory, data comes in many types: Integer, String, Boolean, List, record... The "data" passed between programs is nothing but data structures of these types. Under the Unix design, however, every type must be converted into a string before it can be passed between programs. The problem is that an unstructured String lacks the expressive power to distinguish the other data types, so ambiguity keeps arising. By contrast, if you used Haskell to represent command-line parameters, it would look like this:
data Parameter = Option String | File String | ...
Although the substance of both is a String, Haskell attaches a "tag" to distinguish an Option from a File. So when ls receives its parameter list, it can tell from the tags which entries are options and which are files, instead of guessing from the contents of the strings.
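The same idea sketched in Python (my illustration, not the author's: a tagged union built with dataclasses). The program dispatches on the tag, so a file that happens to be named "-l" can no longer be mistaken for an option.

    from dataclasses import dataclass

    @dataclass
    class Option:
        name: str

    @dataclass
    class File:
        name: str

    # The shell would hand ls structured parameters, not a flat string array.
    params = [Option("l"), File("bar.c"), File("foo.c"), File("-l")]

    options = [p.name for p in params if isinstance(p, Option)]
    files = [p.name for p in params if isinstance(p, File)]
    print(options)   # ['l']
    print(files)     # ['bar.c', 'foo.c', '-l']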
Too many problems with text streams
In summary, the trouble with text streams is that information which was originally simple and clear becomes hard to extract, or is even lost, once it is encoded into a text stream. The problems above are small ones; text streams have in fact caused many serious problems, even spawning entire fields of research. The idea of the text stream has influenced far too many designs. For example:
Configuration files: almost every one stores its data in a different text format. Think about it: .bashrc, .Xdefaults, .screenrc, .fvwm, .emacs, .vimrc, and that whole series under /etc! Users have to know too many formats between which there is no essential difference, and a great deal of manpower has been spent just keeping these files in order.
Program text: I will say more about this later. Programs are stored as text files, so we need parsers. This has led the entire compiler field to spend enormous effort on parsing research. In fact, a program could be stored directly as its parse tree; the compiler could then read the parse tree directly, saving compilation time and making the parser unnecessary.
Database interfaces: programs talk to relational databases through strings containing SQL statements, which makes debugging very difficult, because the contents of those strings have no connection to the types in the program.
XML: designed to solve the data-encoding problem, yet unfortunately hard to parse itself. Like SQL, it correlates poorly with the types in the program: if a type name in the program drifts from its definition in the XML, the compiler reports no error. Most of the "force close" dialogs that plague Android programs come from this. Several XML-related things, such as XSLT, XQuery, and XPath, are also poorly designed.
The Web: JavaScript is often inserted into Web pages as strings. Because strings can be combined arbitrarily, this causes many security problems; part of Web security research exists just to deal with this kind of problem.
IDE interfaces: many of the interfaces compilers provide to editors and IDEs are text-based. The compiler prints line numbers and error messages, such as "102:32 variable x undefined", and the editor or IDE extracts the information from the text and jumps to the corresponding location. Once the compiler changes its print format, all these editors and IDEs must be modified.
Log analysis: some companies print text logs during debugging and then hire people specifically to write programs that analyze the logs and dig useful information out of them: very time-consuming and laborious.
Testing: when writing unit tests, many people like to convert a data structure to a string with a function such as toString and compare it against a reference string. As a result, whenever the string format changes, the tests expire and must be modified (see the sketch after this list).
There are many more examples; you need only look around you.
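To make the last point (testing) concrete, here is a minimal sketch with a made-up Point class:

    class Point:
        def __init__(self, x, y):
            self.x, self.y = x, y
        def __repr__(self):
            return f"Point({self.x}, {self.y})"
        def __eq__(self, other):
            return (self.x, self.y) == (other.x, other.y)

    p = Point(1, 2)
    assert repr(p) == "Point(1, 2)"   # brittle: breaks whenever __repr__ changes
    assert p == Point(1, 2)           # robust: compares the structure itself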
What are human-readable and generic interfaces?
When I mention the various drawbacks of text streams as an interface, people often counter that although text streams are unreliable and cumbersome, they are more universal than other interfaces because text is the only human-readable format: any editor can show the contents of a text stream directly, while other formats cannot be viewed this way. To this I want to say:
1. What exactly counts as "human-readable"? Is a text stream really that readable? A few years ago, ordinary text editors routinely turned Chinese text into garbage, and it took real effort to make them support it. Fortunately, with the cooperation of the whole world, we now have Unicode.
2. To read a Unicode file today, you need not only an editor or browser that supports Unicode, but also fonts covering the corresponding code ranges. Is the "human readability" of text streams really effortless?
3. Besides text streams, there are in fact many human-readable formats, such as JPEG. It is arguably more "readable" and "universal" than a text stream, and it does not even need a font.
So "human readability" and "universality" are not the real root of the matter. The real key is standardization. If other data types were standardized, we could support them in any editor, browser, or terminal, making them fully readable by both humans and machines, just as we read text and JPEG today.
The solution
In fact, there is a simple way to solve all of these problems once and for all:
1. Preserve the original structure of data. Do not use text streams to represent any data other than text.
2. Represent all data types in an open, standardized, extensible way.
3. Pass and store data between programs in the same structured form it has inside a program.
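As one minimal illustration of this direction (my sketch, using JSON as a stand-in; the author's proposal is more general than any one format): the file names travel as a real list, so a name containing a space, or one beginning with "-", survives the trip intact.

    import json

    # Producer side: emit the structure, not a whitespace-joined blob.
    files = ["App Launcher.java", "-l", "Main.java"]
    stream = json.dumps(files)    # this string is what flows through the pipe

    # Consumer side: decoding recovers exactly the original list.
    decoded = json.loads(stream)
    assert decoded == files       # nothing split, nothing mistaken for an option
    print(decoded)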
The nature of the Unix command line
Although text streams cause so many problems, Unix will not die any time soon, because too many upper-layer applications depend on it; it is almost the backbone of the entire Internet. So one practical use of this article for the present situation may be to help people quickly understand the Unix command-line mechanism, and to encourage programmers to use structured data in new applications.
Unix commands are overly complex and functionally redundant, but if you see their nature, you can easily learn to use them. In short, every Unix command can be explained with ordinary programming concepts (a sketch of part of this mapping follows the list):
1. Function: every Unix program is essentially a function (main).
2. Parameters: the command-line arguments are the parameters of this function. To C they are all strings, but after parsing they may carry several different types:
1. Variable names: a file name is really the name of a variable in the program, like x or y; a file is essentially an object in the program.
2. Strings: these are genuine strings inside the program, like "hello world".
3. Keyword arguments: options are essentially keyword arguments (kwargs), like the corresponding construct in Python or Common Lisp. Short options (which look like "-l", "-c", and so on) are essentially Boolean kwargs: "ls -l" in Python syntax would be ls(l=True). Long options are essentially string-valued kwargs: "ls --color=auto" in Python syntax would be ls(color="auto").
3. Return value: since the main function can only return an integer (int), return values of other types (string, list, record, ...) must be serialized into a text stream and sent to another program through a "file". Here "file" covers disk files, pipes, and so on; they are the channels through which text streams flow. As mentioned above, a file is essentially an object in the program.
4. Composition: a "pipe" is nothing but a simple composition of functions. "A x | B", expressed with functions, is "B(A(x))". Note, though, that the computation here is essentially lazy evaluation (as in Haskell): when B "needs" data, A reads a further portion of x and computes a result for B. Not every composition of functions can be expressed with a pipe; for example, how would you express "C(B(x), A(y))" with a pipe? So function composition is the more general mechanism.
5. Branching: if you need to send a return value to two different programs, you need tee. This is equivalent to saving an intermediate result in a temporary variable and then using it twice.
6. Control flow: the shell uses main's return value (type int) for control flow; it can abort or continue a script depending on that value. This is like Java's exceptions.
7. Shell: the essence of the various shell languages is to be the glue language connecting these main functions; the shell itself is really a REPL (read-eval-print loop, as in Lisp). From a programming-language point of view, a separate shell language is completely superfluous: we could use the same programming language as the applications in the REPL. That is how Lisp systems do it.
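A minimal sketch of points 2 and 4 in Python (my illustration, with a toy ls): options become keyword arguments, and a pipe becomes lazy function composition between generators.

    # Point 2: "ls -l --color=auto foo.c bar.c" viewed as an ordinary function call.
    def ls(*files, l=False, color=None):
        for f in files:
            yield f"{'long ' if l else ''}info for {f} (color={color})"

    # Point 4: "ls ... | grep foo" as lazy composition, B(A(x)).
    def grep(pattern, lines):
        return (line for line in lines if pattern in line)

    for line in grep("foo", ls("foo.c", "bar.c", l=True, color="auto")):
        print(line)   # grep pulls lines from ls on demand, just like a pipe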
The possibility of direct data storage
Because data is stored in structured form, any tool that supports the format lets users manipulate the data structure directly. This can bring unexpected benefits:
1. Because command-line operations take structured parameters, the system can complete commands intelligently by type, making it impossible to type a syntactically invalid command.
2. You can insert rich objects such as images directly on the command line.
3. You can drag and drop an object from the desktop onto the command line and then execute it.
4. Because code is stored as parse trees, an IDE can easily be extended to support every programming language.
5. You can give a code snippet in an email IDE-like structured editing, and even compile and execute it.
6. Structured version control and program comparison (diff). (See my talk.)
There is a lot more, limited only by our imagination.
Programming language, operating system, and database: a trinity
If the main function could accept parameters of several types, including keyword arguments, and could return one or more objects of different types, and if those objects were automatically stored in a special "database", then the shell, pipes, command-line options, and even the file system would no longer need to exist. We could even say that the concept of an "operating system" would become "transparent". For in essence, an operating system is simply the runtime system of some programming language, somewhat as the JVM is to Java; in essence, Unix is the runtime system of the C language.
If we went further and made the connection to the database transparent as well, that is, operated the database "implicitly" from the same programming language instead of through a dedicated database language like SQL, then the concept of a "database" would also become transparent. What we would get is an extremely simple, unified, convenient, and powerful system. There would be only one programming language in the system; programmers would write high-level programs directly, execute them from the command line in the same language, and never worry about where the data lives. This would greatly reduce the complexity of programmers' work, letting them focus on the problem itself rather than on the internal structure of the system.
In fact, systems like this have existed in history (the Lisp machine, System/38, Oberon) and achieved good results, but for various reasons (historical, economic, political, technical) they all died out. Still, it must be said that this way is better than the existing Unix way, so why not learn from it? I believe that, as programming languages and compiler technology develop, this simple and unified design concept will someday change the world.