Article Title: Using escaping and quoting to manage metacharacters in Unix.
Unix has a special class of characters called metacharacters, which carry a special meaning for the shell. For example, * and ? are treated as wildcards. If such characters appear in a path, a file name, or a command argument, the shell may interpret them in a way the user did not intend. There must therefore be a way to make the shell treat metacharacters as ordinary characters. Unix provides two mechanisms for this: escaping and quoting. System engineers need to understand the difference between the two and choose the appropriate one for each situation.
1. Using the escape character.
The escape mechanism is not unique to Unix. Many programming languages have a similar mechanism, so readers with development experience will find it familiar. Simply put, escaping means placing the escape character \ in front of a metacharacter to tell the shell that it is an ordinary character, which cancels its special meaning. For example, * is a wildcard, so in ls * it matches every file and directory in the current directory (apart from hidden ones). Written as \*, the * is an ordinary character and the shell no longer treats it as a wildcard.
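The difference between * and \* can be checked with a short experiment. The following is a minimal sketch, assuming a POSIX-style shell; the scratch directory and the file names a.txt and b.txt are made up for the illustration.

```shell
# Work in a throwaway directory so the example touches nothing else.
# (Delete it afterwards with: rm -r "$demo_dir")
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1
touch a.txt b.txt   # hypothetical file names

# Unescaped: the shell expands * to the matching names before echo runs.
expanded=$(echo *)

# Escaped: the backslash cancels the special meaning, so echo receives a literal *.
literal=$(echo \*)

echo "unescaped: $expanded"
echo "escaped:   $literal"
```

The first line of output lists the two files, while the second shows only the asterisk itself.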
Ordinary use of the escape character is simple: just put \ in front of the metacharacter. The author also wants to highlight a few special uses. Ordinary users rarely need them, but they can be particularly useful for system engineers.
First, putting a space in a file name. For example, Microsoft operating systems have a My Documents folder, with a space in the middle. In Unix the space is also a metacharacter, because the shell uses it to separate arguments. If you type the space directly when creating a file or directory, the result is not what you intended: mkdir My Documents creates two directories, My and Documents. This is where the escape character helps. The command mkdir My\ Documents creates a single directory whose name contains a space. The same applies to file names with spaces. Likewise, if a file or directory name contains other metacharacters, each of them must be escaped. Otherwise puzzling problems may occur.
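The effect of the escaped space can be seen by running both forms side by side. This is a small sketch, assuming a POSIX-style shell and a scratch directory created only for the demonstration.

```shell
# Work in a throwaway directory.
# (Delete it afterwards with: rm -r "$demo_dir")
demo_dir=$(mktemp -d) || exit 1
cd "$demo_dir" || exit 1

# Unescaped space: mkdir receives two arguments and creates two directories.
mkdir My Documents

# Escaped space: mkdir receives the single argument "My Documents".
mkdir My\ Documents

# One name per line: Documents, My, and My Documents.
ls -1
```

The listing contains three entries, which shows that the unescaped command created two separate directories and the escaped one created a single directory with a space in its name.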
Second, using the escape character to continue a command on the next line. Some commands are particularly complex, especially those of applications that run on top of Unix. For example, expdp is the export utility of the Oracle database. It is powerful, but many tasks require a long list of parameters, and a single command can span several lines. The terminal does wrap long lines automatically, but this automatic wrapping is weak: the line breaks wherever the edge of the screen happens to fall, not where the engineer would choose, which makes the command hard to read. Many system engineers therefore prefer to break the line manually after key parameters. Simply pressing Enter does not work, because the shell treats Enter as the signal that the command is complete and runs it immediately. Here the escape character helps. Suppose a command is long and the engineer wants to move everything from a certain parameter onward to a second line. Type the escape character \ just before that parameter, as the last character on the line, and then press Enter. The escape character cancels the special meaning of the newline. A secondary prompt then appears, indicating that the command has not ended and continues on the next line. System engineers can thus split a long command wherever they like, which improves its readability.
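Line continuation can be tried without a database installed. The sketch below uses printf with placeholder arguments (first, second, third) in place of a real long command such as expdp; it assumes a POSIX-style shell. Note that the backslash must be the very last character on the line, with no trailing space.

```shell
# A backslash immediately before the newline joins the lines into one command.
joined=$(printf '%s %s %s\n' \
    first \
    second \
    third)

# The shell saw a single printf call with three arguments.
echo "$joined"
```

When typed interactively, the lines after the first are entered at the secondary prompt (often shown as >).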
Note that the escape character \ is itself a metacharacter. To use a literal \ in a command or file name, you must escape it as well. For example, suppose you want to use the echo or printf command to display an address, such as a shared network path, that contains many \ symbols. Each \ must be escaped so that the shell treats it as an ordinary character.
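What happens to an unescaped backslash can be illustrated as follows. This is a sketch under stated assumptions: a POSIX-style shell, a made-up string server\share, and printf '%s\n' instead of echo, because some echo implementations process backslashes themselves.

```shell
# Unescaped: the shell reads \s as "a literal s" and drops the backslash.
lost=$(printf '%s\n' server\share)

# Escaped: \\ produces one literal backslash.
kept=$(printf '%s\n' server\\share)

printf '%s\n' "unescaped: $lost"   # servershare
printf '%s\n' "escaped:   $kept"   # server\share
```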
2. Using quoting to handle metacharacters.
In addition to the escape mechanism described above, you can use quoting to handle metacharacters. Simply put, a command argument is placed inside a pair of quotation marks, and metacharacters inside the quotation marks lose their special meaning. If escaping already solves the metacharacter problem, is a second mechanism redundant? Actually not. When a command line contains several metacharacters, an escape character must be added before every one of them, which quickly becomes cumbersome. In that case quoting is the better choice. For example, suppose the system engineer wants to print the following shared-file path on the screen: 192.128.11.3\share\IT\software\pdf. The output contains four metacharacters (the escape character \ itself), so with escaping the administrator needs four escape characters: echo 192.128.11.3\\share\\IT\\software\\pdf. This is obviously tedious. With quoting, the command only needs to be written as follows:
echo '192.128.11.3\share\IT\software\pdf'
That is, enclose the whole argument in single quotes. The shell then treats every metacharacter inside the quotes as an ordinary character, so there is no need to escape each one individually. Quoting is clearly much more convenient here than escaping.
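That the two forms produce the same text can be confirmed directly. The sketch below assumes a POSIX-style shell and uses printf '%s\n' instead of echo, since some echo implementations interpret backslash sequences on their own; the path is the one from the example above.

```shell
# Escaping: one extra backslash for each of the four backslashes.
escaped=$(printf '%s\n' 192.128.11.3\\share\\IT\\software\\pdf)

# Quoting: single quotes protect all four backslashes at once.
quoted=$(printf '%s\n' '192.128.11.3\share\IT\software\pdf')

if [ "$escaped" = "$quoted" ]; then
    printf '%s\n' "both forms print: $quoted"
fi
```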
When using quoting, pay attention to the difference between single and double quotation marks. Consider three commands: echo $JAVA_HOME, echo '$JAVA_HOME', and echo "$JAVA_HOME", where JAVA_HOME is the environment variable that holds the Java installation path. What happens if the system engineer runs them in sequence? The first command displays the value of the variable. The second displays $JAVA_HOME literally, because the metacharacter $ has been treated as an ordinary character. The third again displays the value of the variable. So single and double quotation marks do behave differently.
What exactly is the difference? Single quotes protect every metacharacter: whatever appears between them is taken literally. Double quotes are weaker: a few metacharacters keep their special meaning inside them, notably $ (the prefix of variables), the backquote, and \. Single and double quotation marks also protect each other: double quotes protect a single quote, and single quotes protect a double quote. Because both are themselves metacharacters, quoting can be used to protect them. However, when a string contains both single and double quotation marks as ordinary characters, the author recommends protecting them with the escape character instead. This avoids misunderstanding and makes the code easier to read.
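The three echo commands, and the way the two kinds of quotes protect each other, can be reproduced with the following sketch. It assumes a POSIX-style shell; the value /opt/java is a hypothetical setting chosen only for the demonstration, and the assignment overrides any existing JAVA_HOME in the current session, so it is best run in a scratch shell.

```shell
JAVA_HOME=/opt/java   # hypothetical value, set here only for the demo

unquoted=$(echo $JAVA_HOME)     # variable is expanded
single=$(echo '$JAVA_HOME')     # $ is protected, text stays literal
double=$(echo "$JAVA_HOME")     # $ keeps its meaning inside double quotes

apostrophe=$(echo "it's")       # double quotes protect the single quote
dquote=$(echo 'say "hi"')       # single quotes protect the double quotes
both=$(echo It\'s \"quoted\")   # both kinds present: escape each one

echo "$unquoted"
echo "$single"
echo "$double"
echo "$apostrophe"
echo "$dquote"
echo "$both"
```

Writing "$JAVA_HOME" with double quotes is usually the safer habit when the value may contain spaces, since the expanded text is then kept as one argument.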
Similarly, to use the escape character \ as an ordinary character, it is best to protect it with single quotes rather than escaping it with another escape character. The result is the same either way, but the quoted form is easier to read.
From the analysis above, both escaping and quoting make the shell treat metacharacters as ordinary characters. They differ in how this is achieved, not in what they achieve, so the choice between them is mainly a matter of convenience and readability and is best made according to the situation. System engineers should master both methods and apply the appropriate one in each case. Knowing only one of them may not be enough to handle every metacharacter problem.