C++ backend practice: ancient CGI and web development
This article is intended for C/C++ programmers.
===============================================================
When we talk about web development, we usually think of PHP, Java EE/JSP, .NET/ASP, Ruby on Rails, or Python with Django. C++ seems to have nothing to do with it. However, dynamic web pages existed before any of these technologies were born, so C/C++ can be used for web development too. The technology involved is CGI.
In the beginning, before the dynamic web languages were born, CGI was the only way to build a dynamic website. Search for CGI on Google or Baidu and you will find many terms: CGI scripts, CGI programs, the CGI standard, and so on. These all describe the same thing from different angles. CGI is short for "Common Gateway Interface". The first time I heard the name, I had no idea what it meant. In essence, CGI is an interface protocol. A protocol is a set of agreed standards (so calling it the CGI standard is also fine), just like a network protocol: when everyone follows the same standard, communication becomes easier. CGI development means writing a CGI executable program. CGI programs can be written in many languages: Java, Python, PHP, C#, and even shell scripts all work, and of course so do C and C++. Many early CGI programs were written in Perl, a scripting language, so CGI programs are also known as CGI scripts. That name is not entirely accurate, because an executable compiled from C++ can be a CGI program as well.
Today PHP and Java are everywhere, and many people consider writing CGI in C++ to be nearly obsolete (it is not, although it is a niche). If you are interested in C/C++ or in history, read on.

Webpage requests and responses
When we browse the web, we usually request a page through a URL, the server returns the page file, and the browser parses it locally and renders the page we see. However, many of the pages we see are not static: no such file exists on the server. The page is generated dynamically when the request arrives, as with PHP/JSP pages, and the returned content varies with the request parameters. The same applies when you request a CGI program (for example, by entering its URL directly in the browser or by submitting a form to it). The CGI program parses the parameters passed from the front end, works out what is being asked, and returns data such as HTML, XML, or JSON.
WARNING: Apache does not run CGI programs by default; CGI must be enabled in its configuration. Search online for the specific steps.

Front-end knowledge to prepare
If you are a C++ programmer, you may not be familiar with the front end (I am not either). Before going further you need a little front-end knowledge, and I will keep it to a minimum. You do not need to know how to render a beautiful web page, but you do need to know how the front end and back end interact, that is, how a front-end page sends data. An ordinary HTML page usually does this in one of the following ways:
- form submission (native HTML)
- form submission controlled by JS
- data requests made by JS through Ajax
Only the first and simplest method, form submission, is described here:
<form action="/cgi-bin/hello.cgi" method="get">
  <table>
    <tbody>
      <tr><td>User Name:</td><td><input name="username"/></td></tr>
      <tr><td>Password:</td><td><input name="password"/></td></tr>
      <tr><td><input type="submit" value="OK"/></td></tr>
    </tbody>
  </table>
</form>
The action attribute of the form tag gives the URL that the form is submitted to; the browser jumps to that page after submission (with Ajax you can fetch data and refresh the page without a jump). Here the value of action is the URL of the CGI program. (WARNING: the leading / refers to the root directory of the website, not the root directory of the Linux file system.) The method attribute gives the request method, which can be get or post; we will not go into the details. Suppose I enter the user name jelly and the password 123456 and click OK. The browser then requests /cgi-bin/hello.cgi under the current domain name, carrying the submitted parameters, and the page jumps to this CGI (just like an ordinary page jump, the address bar is updated). For a get request, the URL in the address bar looks like this: localhost/cgi-bin/hello.cgi?username=jelly&password=123456. Obviously this is not secure, so we can use a post request instead, which keeps the submitted parameters out of the address bar. (In fact post is not secure enough either, and submitting a plaintext password is discouraged. This is only an example; secure login is not the focus of this article.)

Environment variables and CGI processing
When the front-end page submits data to the CGI program with get or post, how does the CGI program read it? The answer is environment variables. Both Linux and Windows have the concept of environment variables, and Linux users regularly edit them in system configuration files when setting up an environment. A CGI program obtains its parameters by reading values from environment variables. Here are a few of them (search online for the full list):
Variable       | Meaning
REQUEST_METHOD | Request method used by the front-end page: GET or POST
QUERY_STRING   | Data transmitted with a GET request
CONTENT_LENGTH | Length of the valid data available on STDIN
SCRIPT_NAME    | Name of the CGI program being called
SERVER_NAME    | Server IP address or host name
SERVER_PORT    | Server port number
Who defines these environment variables? Linux? POSIX? Neither. Once again, CGI is an interface protocol, and these environment variables are part of that protocol. Whether your server runs on Linux or Windows, and whether it is Apache or Nginx, the names and meanings of these variables are the same. The web server fills in their values, following the rules laid down by the CGI protocol. In C, the standard library function getenv (declared in stdlib.h) returns the value of an environment variable.
// For example:
char *str = NULL;
str = getenv("QUERY_STRING");
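One detail the short getenv call hides is that getenv returns a null pointer when a variable is not set, which happens whenever the program is run outside a web server. The following is a minimal sketch of a safer wrapper; the names `env_or`, `CgiEnv`, and `read_cgi_env` are hypothetical helpers invented for this illustration, not part of any CGI library.

```cpp
#include <cstddef>
#include <cstdlib>
#include <string>

// Return the value of an environment variable, or a fallback when it is
// unset. getenv yields a null pointer for unset variables, so the pointer
// must be checked before a std::string is built from it.
std::string env_or(const char* name, const std::string& fallback = "") {
    const char* value = std::getenv(name);
    return value ? std::string(value) : fallback;
}

// Hypothetical holder for three of the variables listed in the table above.
struct CgiEnv {
    std::string request_method;  // "GET" or "POST"
    std::string query_string;    // e.g. "username=jelly&password=123456"
    std::size_t content_length;  // number of body bytes waiting on STDIN
};

CgiEnv read_cgi_env() {
    CgiEnv env;
    env.request_method = env_or("REQUEST_METHOD");
    env.query_string = env_or("QUERY_STRING");
    // CONTENT_LENGTH arrives as a decimal string; treat "unset" as zero.
    env.content_length =
        std::strtoul(env_or("CONTENT_LENGTH", "0").c_str(), nullptr, 10);
    return env;
}
```

A CGI program would call `read_cgi_env()` once at startup and branch on `request_method` afterwards.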
For a get request, the string username=jelly&password=123456 can be read from the environment variable QUERY_STRING. The program then parses that string into parameter keys and values. For a post request, the parameter string is read directly from standard input (STDIN), for example with scanf or cin. After parsing the request and running the corresponding logic (such as checking that the user name and password match), the CGI program returns content to the front-end page through standard output (STDOUT), for example with printf or cout. It can return XML, JSON, plain text, or an HTML page. This step completes the HTTP response, so before writing the actual data you must first output the HTTP header. For example, to return an HTML page you must first output:
cout << "Content-Type: text/html\n\n";
WARNING: two line feeds (\n) must be output, because the HTTP header and the message body (such as the HTML code) are separated by a blank line.
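The steps described above (decode the parameter string, then emit the header, a blank line, and the body) can be sketched roughly as follows. This is an illustration under stated assumptions, not a complete CGI implementation: the helper names `url_decode`, `parse_form`, `read_body`, `html_escape`, and `build_login_response` are all hypothetical, and the welcome page stands in for whatever login logic a real program would run.

```cpp
#include <cctype>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Decode application/x-www-form-urlencoded text: '+' becomes a space and
// "%XX" becomes the byte with hex value XX. Malformed escapes are kept as-is.
std::string url_decode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const bool is_escape = in[i] == '%' && i + 2 < in.size() &&
            std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
            std::isxdigit(static_cast<unsigned char>(in[i + 2]));
        if (is_escape) {
            out += static_cast<char>(std::stoi(in.substr(i + 1, 2), nullptr, 16));
            i += 2;
        } else {
            out += (in[i] == '+') ? ' ' : in[i];
        }
    }
    return out;
}

// Split "k1=v1&k2=v2" into a key/value map, decoding both sides.
std::map<std::string, std::string> parse_form(const std::string& data) {
    std::map<std::string, std::string> params;
    std::istringstream pairs(data);
    std::string pair;
    while (std::getline(pairs, pair, '&')) {
        if (pair.empty()) continue;
        const std::size_t eq = pair.find('=');
        const std::string key = url_decode(pair.substr(0, eq));
        const std::string value =
            (eq == std::string::npos) ? std::string() : url_decode(pair.substr(eq + 1));
        params[key] = value;
    }
    return params;
}

// For POST: read up to content_length bytes of the request body from a
// stream (std::cin in a real CGI process).
std::string read_body(std::istream& in, std::size_t content_length) {
    std::string body(content_length, '\0');
    in.read(&body[0], static_cast<std::streamsize>(content_length));
    body.resize(static_cast<std::size_t>(in.gcount()));
    return body;
}

// Escape HTML special characters so user input cannot inject markup.
std::string html_escape(const std::string& text) {
    std::string out;
    for (char c : text) {
        switch (c) {
            case '&': out += "&amp;"; break;
            case '<': out += "&lt;"; break;
            case '>': out += "&gt;"; break;
            case '"': out += "&quot;"; break;
            default: out += c;
        }
    }
    return out;
}

// Build the full response: header line, blank line, then the HTML body.
std::string build_login_response(const std::string& form_data) {
    std::map<std::string, std::string> params = parse_form(form_data);
    std::string response = "Content-Type: text/html\n\n";
    response += "<html><body><p>Welcome, " + html_escape(params["username"]) +
                "!</p></body></html>\n";
    return response;
}
```

In a real program, main would take the parameter string from QUERY_STRING for a get request, or from `read_body(std::cin, ...)` with the CONTENT_LENGTH value for a post request, and then write the result of `build_login_response` to std::cout.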
After the header, the program uses cout to generate the HTML code (for example, printing the user name you just entered along with a login-success message). The front-end page receives the HTML and the browser renders it as a web page. That is one complete dynamic page served by CGI.

The Cgicc library
CGI programming in C++ requires you to parse strings by hand and to manage the headers yourself. For example, to redirect to another resource you must return 302 and give the new address in the Location header. PHP, Python, and other languages have built-in solutions for these things; C++ needs a third-party library. Here I recommend Cgicc, a GNU open-source library. Besides parsing get/post requests, it can also handle redirects, set cookies, upload files, and so on. One shortcoming is that Cgicc does not support sessions, but this is not a real problem, because sessions are easy to implement with cookies. Plain CGI creates a process for each request and the process ends once the response is returned (FastCGI, described below, is the exception), so session variables on the server have to be kept in files or in an in-memory store such as Redis or Memcached. The session ID is sent to the client using the cookie support in Cgicc.

CGI pain points and FastCGI
CGI is a standard that is not tied to any language, so Java, PHP, and Python can all generate dynamic pages this way. In practice they rarely do, because CGI has a serious flaw: for every CGI request, Apache starts a process to run the CGI program, in the classic Unix fork-and-execute style. Under a large number of requests, fork-and-execute slows the server down badly. Java's Servlet technology, by contrast, stays resident in memory and does not keep creating and destroying process contexts. This is the background against which FastCGI came into being.
Simply put, FastCGI (FCGI) is essentially a pool of memory-resident processes. A scheduler hands incoming CGI requests to handler processes. After handling a request, a process is not destroyed but waits for the next one. With FCGI, CGI has to some extent enjoyed a second spring. PHP-FPM started as a patch that added FCGI support to PHP and has since been merged into PHP itself. FCGI support for C++ exists as well, and Apache can be given FCGI support with a module such as mod_fcgid.

The modern CGI programming paradigm
As we saw earlier, a CGI program can return an HTML page directly and can also carry out all kinds of computation and logic. As front-end and back-end technologies have developed, and as big data and highly concurrent servers have become common, the way CGI is used has changed. More and more work has moved from the back end to the front end, where pages do much of the processing with JS. JS can use Ajax to request data from a CGI program in the background, and Ajax can load back-end data (for example, data pulled from a database) without refreshing the whole page. CGI is now generally not used to return HTML pages directly; complex computation and IO are pushed down to the back end (which can route and forward requests further to balance load), and CGI serves as a middle layer between the front end and the back end. In this role CGI handles basic data exchange: it parses the front-end request, forwards it to the appropriate back end, collects the result, and returns XML or JSON to the front end. Front-end JS then uses the XML/JSON data to fill in and draw all kinds of pages.
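
To make the Ajax-facing role a little more concrete, here is a rough sketch of a CGI response that carries JSON instead of HTML. The function names `json_escape` and `build_json_response`, and the field names `ok` and `username`, are assumptions made up for this illustration; a real service would define its own schema and would probably use a JSON library rather than building strings by hand.

```cpp
#include <cstdio>
#include <string>

// Escape a string for use inside a JSON string literal.
std::string json_escape(const std::string& text) {
    std::string out;
    for (unsigned char c : text) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {
                    // Remaining control characters need the \u00XX form.
                    char buf[7];
                    std::snprintf(buf, sizeof buf, "\\u%04x",
                                  static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    return out;
}

// Build a CGI response whose body is a small JSON object.
// As with HTML, the header is followed by a blank line before the body.
std::string build_json_response(const std::string& username, bool ok) {
    std::string body = "{\"ok\": ";
    body += ok ? "true" : "false";
    body += ", \"username\": \"" + json_escape(username) + "\"}";
    return "Content-Type: application/json\n\n" + body + "\n";
}
```

The CGI program would write the returned string to std::cout, and the front-end JS would parse the body and update the page without a full reload.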