Created:
Article attributes: Reprinted
Source: Li Guofeng
Article submitted: xundi (xundi_at_xfocus.org)
Name: Li Guofeng
Student ID: 19808056
Date:
Department: Computer Science and Technology, Computer Application Technology
CGI security issues
CGI is frequently used when building a website, especially together with JavaScript to implement powerful functions. Precisely because CGI programs have such powerful processing capability on the server side, a single security mistake in a CGI script can let intruders into the computer, with unpredictable consequences.
1. Scripts and programs
Several factors should be taken into account when choosing the language for a CGI script, and security should be one of them. Shell scripts, Perl scripts, PHP scripts, and C executables are the most common forms of CGI programs, and Microsoft's ASP is now also widely used. In terms of security, each has advantages and disadvantages. None of them is the best overall; based on other considerations, such as speed and reusability, each has its practical niche.
Shell scripts are generally used for small, quick, even throwaway CGI programs, so security is often not considered when writing them. Such negligence can leave defects that let anyone with only general knowledge of the system get in and move around freely.
Although shell CGI programs are the easiest to write and can even be pieced together from fragments, they are hard to control because they usually do their work by executing other external programs. This creates potential risks: the CGI program inherits the security issues of every program it uses.
Perl is a step up from shell scripts. It has many advantages for CGI programming and is quite secure. However, Perl gives CGI authors so much flexibility that it can create a false sense of security. For example, Perl is interpreted: a program is compiled and then executed in one step each time it runs. This makes it easy for bad user data to be included as part of the code, be interpreted incorrectly, and cause the program to abort.
Finally, there is C. C has quickly become a standard application development language, and almost all UNIX and Windows NT systems are developed in it. From a security perspective C looks very good, but because it is so popular, many of its security problems are widely known, and they are easily exploited.
For example, C handles strings very poorly. It performs no automatic allocation or cleanup, so programmers must handle everything themselves. When processing strings, most C programmers simply create a buffer of predefined size and hope that it is large enough to hold any user input.
Of course, shell scripts, Perl, PHP, and C are not the only CGI scripting languages. In fact, any computer language that can interact with the web server in the predefined manner can be used to write CGI programs. On UNIX and Windows NT servers, data is passed to scripts through environment variables and standard input (stdin). Therefore, any language that can read from these two data sources and write to standard output (stdout) can be used to create CGI programs: Fortran, C++, Basic, and COBOL, for example. Windows programmers can use the popular Visual Basic, which means experienced VB programmers do not have to learn a new language. The Macintosh uses AppleEvents and AppleScript to communicate with CGI programs, so any language that can read and write both can be used.
However, shell scripts (no matter which shell is used), Perl, and C are still the most popular languages for writing CGI scripts. This is not to say that they must be used; it is only to say that most libraries, and in particular the most tested and secure ones, are written in these three languages. If you are free to choose your own CGI programming language, it is best to learn from the experience of those who came before.
2. Trust
Almost all CGI security issues come from interaction with users. Once it accepts input from an external data source, a simple, predictable CGI program suddenly extends in many directions, and each one may contain a tiny gap through which hackers can sneak in. It is this interaction with users, through forms or file paths, that gives CGI scripts their power, but it also makes them potentially the most dangerous part of running a web server.
Writing a safe CGI script is largely a combination of creativity and paranoia. The author must be creative enough to think of every way that users, whether unintentionally or otherwise, might send data that causes problems. And the author must be a bit paranoid, because there is no telling when, where, or how users will try those ways one by one.
2.1 Two ways to cause problems
When users visit a web site and start interacting with it, they can cause problems in two ways. One is to break the rules, bending or violating every restriction or constraint built into the page; the other is to do exactly what is asked.
Most CGI scripts run behind HTML forms; they process the information users enter and produce customized output. Because of this, most CGI scripts are written to expect data in a particular format: they rely on the user's input matching the form that is supposed to collect and send it. But this is not always the case. Users have many ways to bypass these predefined formats and send seemingly random data to a script, and CGI programs must be prepared for this.
Second, users can send the CGI script exactly the expected type of data, filling in every field of the form as intended. Such a submission might come from an innocent user interacting with the site, or from a malicious "hacker" who relies on knowledge of the operating system and web server software and exploits common programming errors. On the surface these intrusions look completely normal, which makes them the most dangerous and the hardest to detect. Web site security depends on preventing them.
2.2 Do not trust form data
The most common security mistake in CGI programming is to trust the data passed from the form to the script. Users are an unknown quantity: they can always find ways of sending data that the programmer never imagined, and that the programmer would have thought almost impossible. The script must allow for this. For example, the following situations are all possible:
1) The value selected from a group of radio buttons may not be one of the options offered in the form.
2) The data from a text field may be longer than the length allowed by the maxlength attribute.
3) The field names may not match the names specified in the form.
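The three situations listed above can be checked mechanically. The following is a minimal Python sketch, not a definitive implementation: the field names, length limits, and radio-button values in EXPECTED_FIELDS and ALLOWED_CHOICES are hypothetical and would be replaced by whatever the real form defines.

```python
from urllib.parse import parse_qs

# Hypothetical form definition: field name -> maximum accepted length.
EXPECTED_FIELDS = {"first": 40, "last": 40, "radio_choice": 20}
# Hypothetical radio-button values offered by the form.
ALLOWED_CHOICES = {"button_one", "button_two"}


def validate_form(query_string):
    """Return (cleaned_fields, errors) for a raw URL-encoded submission."""
    fields = parse_qs(query_string, keep_blank_values=True)
    cleaned, errors = {}, []
    for name, values in fields.items():
        if name not in EXPECTED_FIELDS:
            errors.append(f"unexpected field: {name}")
            continue
        if len(values) != 1:
            errors.append(f"field sent more than once: {name}")
            continue
        value = values[0]
        if len(value) > EXPECTED_FIELDS[name]:
            errors.append(f"field too long: {name}")
            continue
        cleaned[name] = value
    choice = cleaned.get("radio_choice")
    if choice is not None and choice not in ALLOWED_CHOICES:
        errors.append("radio_choice is not one of the offered options")
        del cleaned["radio_choice"]
    return cleaned, errors
```

Whether the script then rejects the whole submission or falls back to defaults for the rejected fields is a design choice; the sketch only separates acceptable fields from the rest.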
2.3 Sources of unreasonable data
Whether for unintentional or intentional reasons, a script that receives data it does not know how to handle may behave in unexpected and risky ways.
The following code is a form that submits garbage to the CGI script that searches the Yahoo! database. That script is well designed and safe because it ignores input it does not recognize.
<form method="POST" action="http://search.yahoo.com/bin/search">
Enter your name, first then last:
<input type="text" name="first">
<input type="text" name="last">
</form>
Perhaps the user happens (or deliberately sets out) to edit the URL of the CGI script. When a browser submits form data to a CGI program using the GET method, it simply appends the data from the form to the CGI URL. Just as you can type a web page address into your browser, you can modify the data sent to the script yourself.
For example, when you click the submit button on a form, Netscape puts a long string in the Location field, consisting of the CGI URL followed by a string of data, most of which looks like the names and values defined in the form. If you wish, you can freely edit the contents of the Location field and change the data as you like: add fields that are not in the form, extend text data beyond the limit set by the maxlength option, or almost anything else. The following shows a URL that a CGI script expects to be submitted from a form.
http://www.altavista.digit.com/cgi-bin?Pg=Q&what=web&IBD=&Q=%22an+entirely+other%22
You can modify this URL. The CGI script is still called, but the data it receives is not what it expected. To be safe, the script should be written from the start to recognize such input as unwanted data and reject it.
Finally, an ambitious hacker may write a program that connects to a web server and pretends to be a web browser. This program may do things that no real web browser would ever do, for example, send a hundred megabytes of data to a CGI script. What happens if the CGI script does not limit how much data it reads from the POST method? It may crash, and the crash may give the person who caused it access to the system.
2.4 Reject unqualified form data
CGI scripts can reject unexpected input submitted to them in several ways. Some or all of these techniques should be used when writing CGI programs.
First, the CGI script should limit the amount of data it accepts, not only for the submission as a whole but also for each name/value pair in it. For example, a CGI script that reads the POST method can check the CONTENT_LENGTH environment variable to decide whether the input is of a reasonable, expected size. If the only data the script should receive is a person's name, it is reasonable to return an error when CONTENT_LENGTH is greater than 100 bytes; no reasonable last name is that long. By setting limits, the script no longer blindly reads whatever is sent to it.
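As an illustration of the size limit described above, here is a minimal Python sketch. The function name read_post_body and the constant MAX_BODY_BYTES are hypothetical; the 100-byte limit is the figure used in the text for a form that carries only a name. A real CGI program would pass os.environ and sys.stdin.buffer; the example uses an in-memory stream so that it can be run anywhere.

```python
import io

MAX_BODY_BYTES = 100  # limit from the text: a single name should fit easily


def read_post_body(environ, stream, limit=MAX_BODY_BYTES):
    """Read a POST body, refusing missing, malformed, or oversized lengths."""
    raw_length = environ.get("CONTENT_LENGTH", "")
    if not raw_length.isdecimal():
        raise ValueError("missing or malformed CONTENT_LENGTH")
    length = int(raw_length)
    if length > limit:
        raise ValueError("request body too large")
    # Read exactly the announced number of bytes, never "until end of input".
    return stream.read(length)


# Simulated request: only the 9 announced bytes are read.
demo_body = read_post_body({"CONTENT_LENGTH": "9"},
                           io.BytesIO(b"name=John&ignored=1"))
```

The same idea can be applied per field after parsing, so that both the whole submission and each name/value pair are bounded.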
Next, make sure the script knows what to do when it receives data it does not recognize. For example, if a form asks the user to select one of two radio buttons, the script should not assume that because one button was not selected, the other must have been. The following Perl code makes exactly this mistake:
if ($form_data{"radio_choice"} eq "button_one") {
    # Button one has been clicked
}
else {
    # Button two has been clicked
}
This code assumes that because the form offers only two options and the first was not selected, the second must have been. This is not necessarily true. Although this example does no harm, such assumptions can be dangerous in other situations.
The CGI script should handle this situation properly. For example, it can print an error if an unexpected or "impossible" situation occurs, as shown below:
if ($form_data{"radio_choice"} eq "button_one") {
    # Button one selected
}
elsif ($form_data{"radio_choice"} eq "button_two") {
    # Button two selected
}
else {
    # Error
}
Adding the elsif test, which explicitly checks that "radio_choice" really is "button_two", makes the script safer; it no longer makes assumptions.
Of course, the script does not have to generate an error in these cases. Some scripts are so cautious that they validate every field and produce an error message for even the slightest unexpected data, which is often frustrating for users. An alternative is to have the CGI script recognize unexpected data, discard it, and automatically select a default value.
On the other hand, scripts can help users correct their mistakes rather than simply sending an error message or setting a default value. If the form asks the user to enter a secret word, the script can automatically strip whitespace from the input before making the comparison. The following Perl snippet does this.
$user_input =~ s/\s+//g;
# Remove white space by replacing it with an empty string
if ($user_input eq $secret_word) {
    # Match!
}
Finally, you can go further and make the CGI script handle as many different forms of input as possible. Although it is impossible to anticipate everything that might be sent to a CGI program, there are usually only a few common variations for any particular aspect, so you can check for them one by one.
For example, just because the form you wrote uses the POST method to submit data to the CGI script does not mean that the data must arrive that way. Check the REQUEST_METHOD environment variable to determine whether the GET or POST method was used and read the data accordingly, rather than assuming that the data will arrive on standard input (stdin) as expected. A well-written CGI script can accept data submitted by either method and process it securely.
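The method check described above might be sketched in Python as follows. The function name read_form_data and the MAX_FORM_BYTES bound are assumptions made for illustration; REQUEST_METHOD, QUERY_STRING, and CONTENT_LENGTH are the standard CGI environment variables.

```python
import io
from urllib.parse import parse_qs

MAX_FORM_BYTES = 4096  # assumed upper bound for this hypothetical form


def read_form_data(environ, stream):
    """Return parsed form fields for GET or POST; reject anything else."""
    method = environ.get("REQUEST_METHOD", "").upper()
    if method == "GET":
        raw = environ.get("QUERY_STRING", "")
        if len(raw) > MAX_FORM_BYTES:
            raise ValueError("query string too large")
    elif method == "POST":
        length = environ.get("CONTENT_LENGTH", "")
        if not length.isdecimal() or int(length) > MAX_FORM_BYTES:
            raise ValueError("missing, malformed, or oversized CONTENT_LENGTH")
        raw = stream.read(int(length)).decode("utf-8", errors="replace")
    else:
        raise ValueError("unsupported request method")
    return parse_qs(raw, keep_blank_values=True)


# Simulated requests; a real CGI program would pass os.environ and
# sys.stdin.buffer instead.
get_fields = read_form_data(
    {"REQUEST_METHOD": "GET", "QUERY_STRING": "first=John"}, io.BytesIO(b""))
post_fields = read_form_data(
    {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": "10"}, io.BytesIO(b"last=Smith"))
```

Either way the data ends up in the same dictionary, so the validation that follows does not need to know how the submission arrived.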
All in all, the script should make no assumptions about the form data it receives. Anticipate unexpected situations as far as possible and handle incorrect input properly. Test the data as thoroughly as you can before using it. Reject unreasonable input and print an error message. If something is wrong or missing, select a default value; you can even try to convert the input into something reasonable for the program. The approach you choose depends on how much time and effort you want to spend, but remember never to blindly accept everything that is sent to a CGI script.
2.5 Do not trust path data
Another type of data that users can modify is the PATH_INFO server environment variable. This variable is filled with any path information that follows the script file name in the CGI URL. For example, if foobar.sh is a CGI shell script, requesting http://www.server.com/cgi-bin/foobar.sh/public/faq.txt places /public/faq.txt in PATH_INFO.
If you use the PATH_INFO environment variable, you must verify its contents carefully. Just as form data can be modified in many ways, so can PATH_INFO. A CGI script that blindly acts on the file path given in PATH_INFO may allow malicious users to harm the server.
For example, if a CGI script is designed simply to print the file referenced in PATH_INFO, a user who edits the CGI URL can read almost any file on the machine, as shown below:
#!/bin/sh
# Send the header
echo "Content-type: text/html"
echo ""
# Wrap the file in some HTML
echo "<HTML><BODY>"
echo "Here is the file you requested: <PRE>"
cat $PATH_INFO
echo "</PRE></BODY></HTML>"
Although the script works fine when the user only clicks the predefined link (that is, http://www.server.com/cgi-bin/foobar.sh/public/faq.txt), a more creative (or malicious) user can use it to retrieve any file on the server he wants.
Another, much safer method is to use the PATH_TRANSLATED environment variable whenever possible. Not all servers support this variable, so scripts cannot depend on it. However, where it is available, it provides the fully translated path name rather than a relative URL like PATH_INFO.
However, in some cases, using PATH_TRANSLATED in a CGI script gives access to files that cannot be reached through a browser. You should be aware of this and of its implications.
On most UNIX servers, an .htaccess file can be placed in each subdirectory of the document tree to control who can access particular files in that directory. For example, it can be used to restrict a group of web pages so that they are shown only to company employees.
Although the server knows how to interpret .htaccess, and therefore how to restrict who may view these pages, the CGI script does not. A program that uses PATH_TRANSLATED to access arbitrary files in the document tree may end up bypassing the protection the server provides.
Whether PATH_INFO or PATH_TRANSLATED is used, another important step is to verify the path to ensure it is either a true relative path or one of several exact, predetermined paths that the script accepts. For predetermined paths, the script simply compares the supplied data against an internal list of approved files. This means the script must be modified whenever a path is added or changed, but security is guaranteed: users can select only from a few predefined files and cannot specify an actual path and file name.
The following are some rules that should be followed when handling the paths provided by visitors.
1) A relative path does not start with a slash. A leading slash means "relative to the root", that is, an absolute path. CGI scripts seldom, if ever, need to access data outside the web root, so the paths they use should be relative to the web root directory rather than absolute. Reject anything that starts with a slash.
2) The single-dot (.) and double-dot (..) sequences in a path also have special meanings. A single dot means "the current directory", while a double dot means "the parent directory of the current directory". A clever hacker can build a string such as ../../../etc/passwd that climbs three levels up and then goes down to the /etc/passwd file. Reject anything containing a double-dot sequence.
3) NT-based servers use drive letters to refer to disk volumes. A path that refers to a drive starts with a letter followed by a colon. Reject anything with a colon as the second character.
4) NT-based servers also support Universal Naming Convention (UNC) references. A UNC file name specifies a machine name and a share point, and the rest of the path is relative to that share point on that machine. A UNC file name always starts with two backslashes. Reject any UNC path.
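The four rules above translate almost directly into code. The following Python sketch is one possible reading of them, with a hypothetical function name; it only decides whether a visitor-supplied path should be rejected and does not open any file.

```python
def is_safe_relative_path(path):
    """Apply the four rejection rules to a visitor-supplied path."""
    if not path:
        return False
    if path.startswith("/"):               # rule 1: absolute path
        return False
    if ".." in path:                       # rule 2: double-dot sequence
        return False
    if len(path) > 1 and path[1] == ":":   # rule 3: drive letter reference
        return False
    if path.startswith("\\\\"):            # rule 4: UNC reference
        return False
    return True
```

A script that serves only a handful of files can go one step further and compare the path against an internal list of approved names, as described earlier in this section.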
2.6 Everything looks normal, ...
Now you know several ways in which unexpected data can be supplied to CGI scripts and how to deal with them. The bigger question is how to validate legitimate-looking data that the user submits.
In most cases, correct but cleverly crafted form submissions cause more problems than out-of-bounds data. It is easy to ignore meaningless input, but it is much harder to determine whether input in a valid, correct format will cause problems. CGI scripts are very flexible and can do almost anything a computer can do. A very small security mistake can therefore often be exploited without limit, and this is where the greatest danger lies.
2.7 Process file names
A file name is a simple piece of data to submit to a CGI script, but it can cause a lot of trouble if you are not careful. If the name the user enters contains path elements such as directory slashes and dots, you may end up with /file.txt or ./file.txt instead of a simple name such as file.txt. Depending on how the web server is installed and what is done with the submitted file name, every file on the system may be exposed to a clever hacker.
What if the user enters the name of an existing file, or of a file that is important to the running of the system? What if the name entered is /etc/passwd or C:\winnt\system32\krnl32.dll? Depending on what the CGI script does with these files, they may be sent to the user or overwritten with garbage. Under Windows 95 and Windows NT, if the backslash character (\) is not checked for, a web browser may be given access to files that are not even on the web machine, through UNC file names.
What if an illegal character is entered in the file name? In UNIX, any file name starting with a period (.) is invisible. In Windows, the slash (/) and backslash (\) are both directory separators. It is quite possible to write a Perl program carelessly so that, when the file name begins with a pipe character (|), an external program is actually executed even though the author thought the script was only opening a file. A user who knows how can even send control characters (such as the Escape key or Return key) to the script as part of the file name.
Worse, in shell scripts a semicolon ends one command and starts another. If the script is designed to cat an input file, the user can enter file.txt; rm -rf / as the file name, which returns file.txt and then erases the entire hard disk without any confirmation.
2.8 The input is reasonable, but the output is unreasonable
To avoid all these problems and close all the security gaps they open, check every file name the user enters. Make sure the input is exactly what the program expects.
The best way to do this is to compare each character in the entered file name against a list of accepted characters and return an error if any character does not match. This is much safer than maintaining a list of all invalid characters and comparing against that; it is too easy to let a character slip through.
The following listing is an example of how to make this comparison in Perl. It allows any letter (upper or lower case), any digit, the underscore, and the period. It also checks that the file name does not start with a period. As a result, the directory cannot be changed with slashes, multiple commands cannot be placed on one line with a semicolon, and a pipe cannot subvert Perl's open call.
Listing: ensuring that all characters are valid
if (($file_name =~ /[^A-Za-z0-9_\.]/) || ($file_name =~ /^\./)) {
    # File name contains an illegal character or starts with a period
}
Warning
Although the code in the listing above weeds out most invalid file names, your operating system may impose restrictions that this code does not cover. For example, can a file name start with a digit? Or with an underscore? What if the file name contains multiple periods, or more than three characters after a period? Is the whole file name short enough to fit within the file system's limits?
You must constantly ask yourself questions like these. The most dangerous thing to do when writing a CGI script is to assume that users will follow instructions. In fact, they will not. It is the programmer's own responsibility to ensure that users' mistakes cannot cause harm.
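For comparison, the same allowed-character check can be sketched in Python. The function name and the max_length parameter are hypothetical; the length limit is added only as one example of the extra file system restrictions raised in the warning above, and the value 64 is an arbitrary assumption.

```python
import re

# Letters, digits, underscore, and period only; the first character
# must not be a period.
_SAFE_NAME = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.]*")


def is_safe_file_name(name, max_length=64):
    """True if every character is on the allowed list and the name is short."""
    return len(name) <= max_length and _SAFE_NAME.fullmatch(name) is not None
```

Using fullmatch rather than searching for a forbidden character means that anything not explicitly allowed, including control characters and a trailing newline, is rejected.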
2.9 Process HTML
Another type of input that looks harmless but can cause a lot of trouble is the HTML you may receive when the user is asked to enter text. The following listing is a Perl fragment that sends a greeting to whoever entered a name, such as John Smith, in the $user_name variable.
Listing: a script that sends a customized greeting
print("<HTML><TITLE>Greetings!</TITLE><BODY>\n");
print("Hello, $user_name! It's good to see you!\n");
print("</BODY></HTML>\n");
Imagine that instead of just entering a name, the user enters HTML such as <HR>. More dangerous than entering simple HTML that changes the page or displays images is that a malicious hacker may enter a server-side include directive. If the web server is configured to honor server-side includes, the user can enter
<!--#include file="/secret/project/plan.txt" -->
instead of his name and see the complete text of the secret plan, or enter <!--#include file="/etc/passwd" --> to obtain the machine's password file. Worst of all, the hacker may enter <!--#exec cmd="rm -rf /" --> instead of his name, in which case the code in the listing above would delete almost everything on the hard disk.
Warning
Because they are so often used maliciously, server-side includes are frequently disabled to protect sites from attack. Now suppose that is taken care of. Even if server-side includes are disabled and you do not mind users viewing any image on your hard disk or changing how the page looks, there is still a problem, not only for the programmer but for other users as well.
A common use of CGI scripts is to keep a guestbook: visitors to the site sign their names to let others know they have been there. Normally a user simply enters his name, which then appears in the visitor list. But what if a signer enters <FORM><SELECT> as part of his name? The <SELECT> tag causes the web browser to ignore everything between <SELECT> and a nonexistent </SELECT>, including any names added to the list later. Even if 10 people have signed, only the first three are displayed when the third name contains a <FORM> and a <SELECT> tag. Because the third signer used HTML tags in his name, no name after his is shown.
There are two solutions for users entering HTML instead of plain text:
1) A quick but crude way is to disallow the less-than (<) and greater-than (>) signs. Because every HTML tag must be enclosed in these two characters, removing them (or returning an error when they are encountered) is a simple way to prevent HTML from being submitted and displayed back. The following Perl code removes both characters: $user_input =~ s/[<>]//g;
2) A more refined approach is to convert these two characters into their HTML entities, special codes that represent a character without using the character itself. The following code replaces the less-than sign with &lt; and the greater-than sign with &gt; to complete the conversion:
$user_input =~ s/</&lt;/g;
$user_input =~ s/>/&gt;/g;
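The second approach, converting the characters to entities, can be illustrated in Python with the standard html.escape function, which also converts the ampersand and quotation marks. The greeting function below is a hypothetical stand-in for the greeting script shown earlier.

```python
import html


def greeting(user_name):
    """Build the greeting line with the visitor's name made inert."""
    safe_name = html.escape(user_name, quote=True)
    return f"Hello, {safe_name}! It's good to see you!"


# A server-side include directive is displayed as text, not executed.
inert = greeting('<!--#exec cmd="rm -rf /" -->')
```

Because the angle brackets are replaced before the name is written to the page, neither the browser nor a server-side include processor sees a tag or directive.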
2.10 Process external processes
Finally, be cautious about how CGI scripts pass user input to external processes. Because executing a program outside your CGI script means you cannot control what it does, you must do your best to validate the input sent to it before execution starts.
For example, shell scripts often make the mistake of combining a command-line program with form input and executing the result. If the user's input is what was expected, everything is fine, but otherwise extra commands may be smuggled in and executed.
The following is an example of a script that produces this error:
finger_output=`finger $user_input`
echo $finger_output
If the user politely enters someone's e-mail address to finger, everything works normally. But if he enters an e-mail address followed by a semicolon and another command, that command is executed too. If the user enters webmaster@www.server.com; rm -rf /, you are in serious trouble.
Even if no hidden commands are slipped into the user data, innocent input mistakes can cause problems. For example, the following line produces an unexpected result, a listing of all files in the directory, if the user's input is an asterisk.
echo "your input:" $user_input
When passing user data through a shell, as the previous code snippets do, it is best to check for shell metacharacters, which may cause unexpected behavior.
These characters include the semicolon (which allows multiple commands on one line), the asterisk and question mark (which perform file name matching), the exclamation point (which refers to running jobs under csh), the backquote (which executes the command it encloses), and so on. As with filtering file names, maintaining a list of allowed characters is generally easier than trying to catch every disallowed one. The following Perl snippet validates an e-mail address:
if ($email_address =~ /[^A-Za-z0-9_\-\+\@\.]/) {
    # Illegal character!
}
else {
    system("finger $email_address");
}
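A comparable check in Python is sketched below, with one change of technique that should be stated plainly: instead of building a command string for a shell, the sketch passes an argument list to subprocess.run, so no shell ever parses the address. The function names are hypothetical, and rejecting a leading hyphen is an extra precaution that the Perl snippet above does not take.

```python
import re
import subprocess

# Same allowed characters as the Perl check: letters, digits, _ - + @ .
_ADDRESS_CHARS = re.compile(r"[A-Za-z0-9_.+@-]+")


def finger_argv(address):
    """Return the argument list for finger, or raise on illegal input."""
    if _ADDRESS_CHARS.fullmatch(address) is None or address.startswith("-"):
        raise ValueError("illegal character in address")
    return ["finger", address]


def run_finger(address):
    # No shell is involved: the address is handed to finger as one argument.
    result = subprocess.run(finger_argv(address), capture_output=True,
                            text=True, timeout=10)
    return result.stdout
```

The allowed-character check and the shell-free invocation are independent defenses; either one alone would stop the semicolon example given earlier, and using both leaves less to chance.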
If you decide to allow shell metacharacters in the input, there are ways to make them safer. Simply enclosing the unvalidated user input in quotation marks, to keep the shell from acting on special characters, might seem sufficient, but it does not actually work. Consider the following:
echo "finger information: <HR><PRE>"
finger "$user_input"
echo "</PRE>"
Although the quotation marks around $user_input keep the shell from interpreting a semicolon and so stop a hacker from simply appending a command, the script still has serious security holes. For example, the input might be `rm -rf /`, where the backquotes cause the hacker's command to be executed before finger even sees it.
A better way to handle special characters is to escape them, so that the script takes them literally without interpreting them. Escaping the user input causes all shell metacharacters to be ignored and passed to the program as ordinary data. The following Perl code does this for all non-alphanumeric characters.
$user_input =~ s/([^\w])/\\$1/g;
Now, if a user enters a command, every character, even the special ones, is passed by the shell to finger as-is.
Keep in mind that validating user input, trusting nothing that is sent to you, makes your code easier to read and safer to run. It is far better to check the data once at the door than to deal with a hacker after his commands have been executed.
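The escaping approach described above has a counterpart in Python's standard library: shlex.quote. Note that it is a different technique from the backslash substitution shown in the Perl code; it wraps the whole value in single quotes for a POSIX shell rather than escaping each character. The function name below is hypothetical.

```python
import shlex


def finger_command_line(user_input):
    """Build a shell command line in which user_input is a single literal word."""
    return "finger " + shlex.quote(user_input)


# Metacharacters survive as data: the shell would pass them to finger unchanged.
quoted = finger_command_line("webmaster@www.server.com; rm -rf /")
```

Splitting the resulting command line the way a POSIX shell would yields exactly two words, the program name and the original input, which is the property the escaping is meant to guarantee.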
--------------------------------------------
Process internal functions
In interpreted languages such as shell and Perl, incorrect data entered by the user can cause errors that are not in the program itself. If user data is interpreted as part of the code being executed, everything the user enters must obey the rules of the language; otherwise, an error occurs.
For example, the following Perl snippet may work normally or may produce an error, depending on what the user enters:
if ($search_text =~ /$user_pattern/) {
    # Match!
}
If $user_pattern is a valid regular expression, everything works. But if $user_pattern is invalid, Perl fails and the CGI program fails with it, possibly in an insecure way. To avoid this, Perl provides the eval() operator, which evaluates an expression in isolation from the code that executes it and returns a value indicating whether the expression is valid. The following code is an improved version of the previous code.
if (eval { $search_text =~ /$user_pattern/ }) {
    if ($search_text =~ /$user_pattern/) {
        # Match!
    }
}
Unfortunately, most shells (including the most commonly used, /bin/sh) have no simple way to check for errors like this, which is another reason to avoid them.
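The same guard can be written in Python, where an invalid pattern raises re.error when it is compiled. The function name safe_search and its three-way return value are assumptions made for this sketch.

```python
import re


def safe_search(search_text, user_pattern):
    """Return True or False for a valid pattern, None for an invalid one."""
    try:
        compiled = re.compile(user_pattern)
    except re.error:
        return None  # the pattern itself is malformed; report, do not crash
    return compiled.search(search_text) is not None
```

If the user's text is meant to be matched literally rather than as a pattern, passing it through re.escape first avoids the problem altogether.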
--------------------------------------------
When executing an external program, you must also know how the user input sent to that program affects it. Programmers can protect their own CGI scripts against intrusion, but all that effort is wasted if they carelessly hand whatever a hacker enters to an external program without knowing how the program uses the data.
For example, many CGI scripts run the mail program to send an e-mail containing user input. This can be very dangerous, because mail has many internal commands, any of which can be entered and triggered by the user. For example, if you use mail to send text entered by the user and the text contains a line beginning with a tilde (~), mail interprets the next character on the line as one of the many commands it can execute. For example, ~r /etc/passwd causes mail to read the machine's password file and send it to the recipient (perhaps the hacker himself).
In such a case, rather than using mail to send e-mail on UNIX machines, you should use sendmail, a lower-level mailing program that lacks many of mail's features.
As a general rule, when executing external programs you should use programs that match your requirements as closely as possible, without a lot of unnecessary features. The less an external program can do, the fewer the chances that it will be used to do something bad.
Warning
Here is another point about using mail and sendmail: make sure that the address passed to the mail system is a legitimate e-mail address. Many mail systems treat an e-mail address starting with "|" as a command to execute, which opens the door to any hacker who enters such an address. Once again, remember to validate the data.
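Both mail hazards mentioned in this section, tilde commands in the message body and addresses that begin with "|", can be detected before any mail program is run. The Python sketch below uses a deliberately strict address pattern that is an assumption of this example, not a full e-mail address grammar; the function names are hypothetical.

```python
import re

# Assumed, deliberately strict pattern: a small character set, with no
# leading "|" or "-" possible in either part of the address.
_ADDRESS = re.compile(r"[A-Za-z0-9_][A-Za-z0-9_.+-]*@[A-Za-z0-9][A-Za-z0-9.-]*")


def is_plausible_address(address):
    """True only for addresses made of the allowed characters."""
    return _ADDRESS.fullmatch(address) is not None


def has_tilde_escape(body):
    """True if any line of the message starts with a tilde."""
    return any(line.startswith("~") for line in body.splitlines())
```

What to do with a message that fails either check (reject it, or strip the offending line) is left to the script; the sketch only identifies the risky input.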
Another example of how understanding external programs helps you use them effectively is grep. grep is a simple command-line utility that searches files for a regular expression, which can be a simple string or a complex sequence of characters. Most people would say that nothing can go wrong with grep, but although grep may not cause any damage, it can be fooled. The following code shows how. It is supposed to perform a case-sensitive search for the user's term across a number of files.
print("The following lines contain your term: <HR><PRE>");
$search_term =~ s/([^\w])/\\$1/g;
system("grep $search_term /public/files/*.txt");
print("</PRE>");
All of this looks fine, unless the user enters -i. That is not searched for; it is taken as a switch to grep, as is any input starting with a hyphen. grep then either hangs waiting for the search string on standard input or reports an error if what follows -i is interpreted as another switch. Undoubtedly this is not what the programmer intended. In this case it is not dangerous, but in other cases it could be. Remember, there are no harmless commands, and every command must be considered carefully from all angles.
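One way to keep a search term from being read as a switch is to use grep's -e option, which marks the next argument as the pattern, together with "--", which ends option processing. The Python sketch below builds such an argument list; the function names are hypothetical, and the directory is the one used in the example above.

```python
import glob
import subprocess


def grep_argv(search_term, files):
    """Build an argument list in which the term can never act as a switch."""
    # "-e" tells grep that the next argument is the pattern, even if it
    # begins with a hyphen; "--" marks the end of options before the files.
    return ["grep", "-e", search_term, "--", *files]


def search_public_files(search_term):
    # The wildcard is expanded here because no shell is involved.
    files = glob.glob("/public/files/*.txt")
    if not files:
        return ""  # nothing to search; avoid grep waiting on standard input
    result = subprocess.run(grep_argv(search_term, files),
                            capture_output=True, text=True, timeout=10)
    return result.stdout
```

With this argument list, an input of -i is searched for as the literal text "-i" instead of changing how grep behaves.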
In general, you should know every external program that your CGI script executes as thoroughly as possible. The more you know about a program, the better you can protect it from bad data, both by screening the data and by disabling options or features. External programs are often a quick and convenient solution to many CGI programming problems: they are tested, available, and flexible. But they can also become a doorway for intruders. Do not be afraid to use external programs, since they are often the only way to accomplish something in a CGI program, but be aware of the harm they can cause.
III. Inside attacks
So far, we have considered only the security risks posed by people who visit the site over the Web, from thousands of miles away. But there is another risk factor that is much closer to home.
A common mistake in CGI security is to forget about local users. Although people browsing over the Web cannot directly affect local security, such as file permissions and ownership, local users of the web server machine can, and extra effort is needed to guard against such intrusions. On most multi-user systems, such as UNIX, the web server is just one program among many, and the machine is still used by many people for many other things. Just because someone works with you or attends your school does not mean that he can resist the temptation to poke around your web installation.
3.1 CGI script user
Most web servers are set up to run CGI scripts as a special user. This is the user who owns the CGI process while it runs, and whose permissions limit what the script can do.
In UNIX, the server itself starts as root (the system superuser or administrator) so that it can use port 80 as the place where browsers communicate with it. Only root can use the "reserved" ports 0 to 1023; any user can use the remaining ports. When the server executes a CGI program, most web servers can be configured to run it as a user other than the one the server itself runs as, although not all servers can.
Running CGI scripts as root is dangerous! The server should be set up to run CGI scripts as an unprivileged user, such as the commonly used nobody. The fewer permissions that user has, the less damage a CGI script can do. For example, the Apache web server switches to an unprivileged user (nobody in many older configurations) after startup.
3.2 The dangers of setuid
Programmers should also know whether the setuid bit is set on their UNIX CGI scripts. When this bit is set on an executable file, the program runs with the permissions of the user who owns the file rather than those of the user who executes it. If the setuid bit is set on your CGI script, then no matter which user the server runs it as, it has the permissions of the file's owner. This is obviously a major risk: a script running with your permissions can do anything you can, and if it is subverted you lose control of it. Fortunately, the setuid bit is easy to turn off. Running chmod a-s on all CGI scripts clears it, so that the programs run with the permissions they were meant to have.
Of course, in some cases you may want the setuid bit set, for example if the script needs to run as a particular user in order to access a database. In that case, take extra care that the remaining file permissions on the program restrict access to only the users you intend.
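The chmod a-s step described above can be applied to a whole script directory with find. This is a sketch under the assumption that the scripts live in a single directory tree; the function names are hypothetical. Note that chmod a-s clears the setgid bit as well as the setuid bit.

```shell
#!/bin/sh
# list_setid_files DIR   (hypothetical helper for this sketch)
# Prints regular files under DIR with the setuid (4000) or setgid (2000) bit set.
list_setid_files() {
    find "$1" -type f \( -perm -4000 -o -perm -2000 \) -print
}

# clear_setid_files DIR   (hypothetical helper for this sketch)
# Removes both bits from every such file.
clear_setid_files() {
    find "$1" -type f \( -perm -4000 -o -perm -2000 \) -exec chmod a-s {} +
}
```

Running the listing function first makes it possible to review any script that is intentionally setuid before the bits are cleared.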
3.3 "Community" Web Server
Even if the web server executes scripts as an unprivileged user, a potential problem remains: the server is not always controlled by a single person. If many people share control of the server, each of them can install CGI scripts that run as the nobody user. Any of them can then use a CGI program to reach areas they could not otherwise access, as long as nobody is permitted to access them.
One possible solution to this security problem is to restrict control of CGI to a single person. Although this seems reasonable in some circumstances, it is often impractical for large sites. A university, for example, may have hundreds of students, each of whom wants to write and install CGI scripts.
3.4 Use CGIWrap
When multiple users have access to CGI, a better solution to the problem of deciding which user a script runs as is CGIWrap. CGIWrap can be found on the Using CGI web site. It is a simple wrapper that runs a CGI script as the user who owns the file rather than as the user specified by the server. This simple precaution makes the script owner responsible for any damage the script may do.
Because CGIWrap makes CGI script authors answerable for their own scripts' permissions, it is not only a powerful tool for protecting important files owned by others, but also a powerful incentive to write secure scripts. Only the author's own files are at risk, and that fact is a strong motivation for script authors.
3.5 CGI script permissions
You should also be clear about which user owns each CGI script and what the script's file permissions are. The permissions on the directories that contain the scripts are just as important.
For example, if the cgi-bin directory on the web server is writable by all users, any local user can delete a CGI script and replace it with another. If the script itself is writable by everyone, anyone can modify it to do anything at all.
Consider the following harmless UNIX CGI script:
#!/bin/sh
# Send the header
echo "Content-type: text/html"
echo ""
# Send some HTML
echo "<HTML>"
echo "<BODY>Your fortune:<HR><PRE>"
fortune
echo "</PRE></BODY></HTML>"
Now, suppose the permissions set on the script allow a malicious user to change the program as follows:
#!/bin/sh
# Send the header
echo "Content-type: text/html"
echo ""
# Do some damage!
rm -rf /
echo "<HTML><TITLE>Got you!</TITLE><BODY>"
echo "</BODY></HTML>"
Then the next user who accesses the script over the Web will cause a great deal of damage, even though he has done nothing wrong. Checking the integrity of user input from the Web is important, but it is even more important to make sure that the script itself has not been modified and cannot be modified.
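The permission problems described in this section can be detected with a short check. The following sketch lists every file or directory under a script directory that is writable by all users; the function name list_world_writable is an assumption for this example.

```shell
#!/bin/sh
# list_world_writable DIR   (hypothetical helper for this sketch)
# Prints DIR itself and any file or directory beneath it that has the
# "other" write bit (0002) set.
list_world_writable() {
    find "$1" \( -type f -o -type d \) -perm -0002 -print
}
```

An empty result for the cgi-bin directory means that no script and no script directory can be replaced or edited by an arbitrary local user.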
3.6 local file security
The integrity of the files that the script creates on the local hard disk is equally important. Once you have obtained a reasonable file name from the Web user, what you do with that file name matters just as much. Depending on the operating system that the web server runs on, permission and ownership information can be stored with the file alongside its data.
For example, a UNIX system records file access permissions for the user who created the file, for users in the same group, and for everyone else on the system. Windows NT uses a more complex system of access control lists, but the function is roughly the same. Depending on how these flags are set and which permissions are granted or denied, the user that the web server runs as could cause trouble.
For example, when creating a file, you should know which permissions are set on it. Most web server software sets the umask, or permission mask, to 0000, which means that a file can be created readable and writable by anyone. Although the permissions on a file make no difference to users browsing over the Web, users with local access can exploit lax permissions to cause harm. For this reason, file permissions should be as restrictive as possible.
The simplest way to make sure that every file-opening call has a minimum set of restrictions is to set the script's umask. umask() is a UNIX call that restricts the permissions of all subsequently created files. Its argument is a number that is masked against the permission mode requested for each new file. With a umask of 0022, a newly created file is writable only by its owner, no matter what explicit permissions are granted to group and other users when the file is opened.
Even when the umask has been set, the permissions should still be specified explicitly when the file is created. If only the CGI script needs access to the file, then only the user running the CGI program should be able to access it: permissions 0600. If another program needs access, its owner can be placed in the same group as the CGI script's user, so that only group permissions need to be granted: permissions 0660. If you must give everyone access to the file, make it read-only rather than writable: permissions 0644.
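The combination of a restrictive umask and an explicit mode can be sketched in shell as follows. The function name create_private_file is an assumption for this example. The umask is changed inside a subshell so that the caller's own umask is left untouched.

```shell
#!/bin/sh
# create_private_file PATH   (hypothetical helper for this sketch)
# Creates (or truncates) PATH so that only the owner can read and write it.
create_private_file() {
    (
        umask 077    # mask out all group and other bits in this subshell only
        : > "$1"     # requested mode 0666 masked by 0077 gives 0600
    ) && chmod 600 "$1"   # set the mode explicitly too, in case the file already existed
}
```

For a file shared with another program in the same group, the same pattern would use umask 007 and chmod 660.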
3.7 Use an explicit path
Finally, there is one last way in which a local user can attack the web server: tricking the server into running an external program that he wrote instead of the program named in the CGI script. The following is a simple program that shows the visitor's browser a bit of wisdom from the UNIX fortune command.
#!/bin/sh
# Send the header
echo "Content-type: text/html"
echo ""
# Send the fortune
echo "<HTML>"
echo "<BODY>You crack open the cookie and the fortune reads:<HR><PRE>"
fortune
echo "</PRE></BODY></HTML>"
The script looks harmless. It accepts no user input, so the user cannot play any tricks there. Because it is run only by the web server, the permissions on the script itself can be very strict, which defeats any attempt by a local user to modify it. If the directory that contains the script also has the correct permissions, there seems to be no problem, right?
Of course there is still a problem. Remember to be a little paranoid.
The script above calls external programs, in this case echo and fortune. Because the script does not give explicit paths to their locations on the hard disk, the shell uses the PATH environment variable to locate them, searching each entry in the variable for the program to execute. This can be dangerous. For example, if the fortune program is installed in /usr/games but /tmp is listed before it in PATH, any program in the temporary directory that happens to be named "fortune" will be executed instead of the real fortune.
Such a program can do anything its creator wants, from deleting files to logging information about the request and then passing the data on to the real fortune, leaving both the user and the programmer none the wiser. Whenever you run an external program from a CGI script, specify an explicit path. The PATH environment variable is very useful, but like any other variable it can be abused.
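A minimal sketch of the explicit-path precaution is shown below. The directory list assigned to PATH and the location /usr/games/fortune (taken from the example in the text) are assumptions and will differ between systems; the function name run_fortune is hypothetical.

```shell
#!/bin/sh
# Replace whatever PATH the server passed in with a short, fixed list.
# The directories are an assumption; adjust them for the local system.
PATH=/usr/bin:/bin
export PATH

# Absolute location of the program to run (assumed, from the example in the text).
FORTUNE=/usr/games/fortune

# run_fortune   (hypothetical helper for this sketch)
# Runs the program only through its absolute path, never through a PATH search.
run_fortune() {
    if [ -x "$FORTUNE" ]; then
        "$FORTUNE"
    else
        echo "fortune is not available"
    fi
}
```

Because the program is named by its full path, a file called fortune planted in /tmp or any other directory is never considered.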