Tinyhttpd Source Code Explained in Detail

Source: Internet
Author: User

tinyhttpd is a lightweight web server, and over the past few days I finally found the time to study it. The source is easy to find with a Baidu search, it is a little over 500 lines long, and it really is good material for learning Linux programming. Many people have already blogged about tinyhttpd, but I feel those posts lack depth: they tend to dump the 500-odd lines of code, trace the main flow, draw a flowchart and stop there. I think there is more to dig out, and perhaps the code could be adjusted as well, although I am not sure how much, so I will tread carefully.

My analysis follows the main path, the trunk of the program: the server creates a socket and listens on a port -> the user enters a URL in the browser and sends a request -> the server accepts the connection and creates a thread to handle the request, while the main thread goes back to waiting -> the new thread reads the HTTP request and parses the relevant fields -> it reads the file contents or runs the CGI program and returns the result to the browser -> it closes the client socket and the new thread exits.

Let's first look at the main function

int main(void)
{
 int server_sock = -1;
 u_short port = 0;
 int client_sock = -1;
 struct sockaddr_in client_name;
 int client_name_len = sizeof(client_name);
 pthread_t newthread;

 server_sock = startup(&port);
 printf("httpd running on port %d\n", port);

 while (1)
 {
  client_sock = accept(server_sock,
                       (struct sockaddr *)&client_name,
                       &client_name_len);
  if (client_sock == -1)
   error_die("accept");
  if (pthread_create(&newthread, NULL, accept_request, client_sock) != 0)
   perror("pthread_create");
 }

 close(server_sock);

 return(0);
}

Anyone with a little knowledge of Linux network programming will follow this code easily: create a server socket, bind it, listen, and wait for client connections. The author has simply put the first of these steps into a function called startup. Let's look at startup next.

int startup(u_short *port)
{
 int httpd = 0;
 struct sockaddr_in name;

 httpd = socket(PF_INET, SOCK_STREAM, 0);
 if (httpd == -1)
  error_die("socket");
 memset(&name, 0, sizeof(name)); // bzero would also work
 name.sin_family = AF_INET;
 name.sin_port = htons(*port);
 name.sin_addr.s_addr = htonl(INADDR_ANY); // any network interface
 if (bind(httpd, (struct sockaddr *)&name, sizeof(name)) < 0)
  error_die("bind");
 if (*port == 0)  /* if dynamically allocating a port */
 {
  int namelen = sizeof(name);
  if (getsockname(httpd, (struct sockaddr *)&name, &namelen) == -1)
   error_die("getsockname");
  *port = ntohs(name.sin_port); // the system assigned the port number dynamically
 }
 if (listen(httpd, 5) < 0)
  error_die("listen");
 return(httpd); // return the server socket descriptor
}

These are very common steps, so there is not much to say about them.
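
The dynamic-port branch of startup can be tried in isolation. The following is only a sketch under my own assumptions, not code from tinyhttpd: the name listen_on_any_port is hypothetical, it binds to the loopback address instead of INADDR_ANY to keep the experiment local, and it uses socklen_t for the length argument, which is the type getsockname declares on current Linux systems.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not part of tinyhttpd: bind a TCP socket to
 * port 0 on the loopback interface, let the kernel pick a free port,
 * and report the chosen port through *port.
 * Returns the listening descriptor, or -1 on failure. */
int listen_on_any_port(unsigned short *port)
{
    struct sockaddr_in name;
    socklen_t namelen = sizeof(name);
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd == -1)
        return -1;
    memset(&name, 0, sizeof(name));
    name.sin_family = AF_INET;
    name.sin_port = htons(0);                      /* 0 means "any free port" */
    name.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    if (bind(fd, (struct sockaddr *)&name, sizeof(name)) < 0 ||
        getsockname(fd, (struct sockaddr *)&name, &namelen) == -1 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    *port = ntohs(name.sin_port);                  /* back to host byte order */
    return fd;
}
```

Calling it with a local unsigned short and printing the result should show a different high port on most runs, which matches the changing port numbers printed by httpd.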

After that, the server waits for connections. The author does not care where the client comes from, so the second and third arguments to accept could just as well be NULL. Each client socket is then passed as an argument to a newly created thread, which handles the request; this is a common technique in server programming and improves concurrency. Note that the thread function here is not strictly legal: pthread_create expects a start routine of type void *(*)(void *), while accept_request is declared as void accept_request(int) and client_sock is passed as a plain int. At least on Linux this does not match the prototype of a thread function, and the compiler only issues a warning rather than an error.
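
One way to make the thread start routine match the pthread prototype is a small wrapper. This is a sketch under my own assumptions rather than a change taken from tinyhttpd: request_thread and spawn_request_thread are hypothetical names, handle_client is a stand-in for accept_request, and the descriptor is carried through the void * argument with an intptr_t cast.

```c
#include <pthread.h>
#include <stdint.h>

/* Stand-in for tinyhttpd's accept_request: it only records which
 * descriptor it was handed (hypothetical, for illustration). */
static int last_client = -1;
static void handle_client(int client)
{
    last_client = client;
}

/* Start routine with the signature pthread_create expects. */
static void *request_thread(void *arg)
{
    int client = (int)(intptr_t)arg;   /* recover the descriptor */

    handle_client(client);
    return NULL;
}

/* Create a thread for one client. Returns pthread_create's result,
 * so 0 means success. */
int spawn_request_thread(int client, pthread_t *tid)
{
    return pthread_create(tid, NULL, request_thread,
                          (void *)(intptr_t)client);
}
```

In a real server the thread would probably also be detached with pthread_detach, or created detached, since the main loop never joins it.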

The next focus is on the thread function accept_request.

void accept_request(int client)
{
 char buf[1024];
 int numchars;
 char method[255];
 char url[255];
 char path[512];
 size_t i, j;
 struct stat st;
 int cgi = 0;      /* becomes true if server decides this is a CGI
                    * program */
 char *query_string = NULL;

 numchars = get_line(client, buf, sizeof(buf));
 i = 0; j = 0;
 while (!ISspace(buf[j]) && (i < sizeof(method) - 1))
 {
  method[i] = buf[j];
  i++; j++;
 }
 method[i] = '\0';

 if (strcasecmp(method, "GET") && strcasecmp(method, "POST"))
 {
  unimplemented(client);
  return;
 }

 if (strcasecmp(method, "POST") == 0)
  cgi = 1;

 i = 0;
 while (ISspace(buf[j]) && (j < sizeof(buf)))
  j++;
 while (!ISspace(buf[j]) && (i < sizeof(url) - 1) && (j < sizeof(buf)))
 {
  url[i] = buf[j];
  i++; j++;
 }
 url[i] = '\0';

 if (strcasecmp(method, "GET") == 0)
 {
  query_string = url;
  while ((*query_string != '?') && (*query_string != '\0'))
   query_string++;
  if (*query_string == '?')
  {
   cgi = 1;
   *query_string = '\0';
   query_string++;
  }
 }

 sprintf(path, "htdocs%s", url);
 if (path[strlen(path) - 1] == '/')
  strcat(path, "index.html");
 if (stat(path, &st) == -1) {
  while ((numchars > 0) && strcmp("\n", buf))  /* read & discard headers */
   numchars = get_line(client, buf, sizeof(buf));
  not_found(client);
 }
 else
 {
  if ((st.st_mode & S_IFMT) == S_IFDIR)
   strcat(path, "/index.html");
  if ((st.st_mode & S_IXUSR) ||
      (st.st_mode & S_IXGRP) ||
      (st.st_mode & S_IXOTH))
   cgi = 1;
  if (!cgi)
   serve_file(client, path);
  else
   execute_cgi(client, path, method, query_string);
 }

 close(client);
}

First, it is crucial to understand what get_line does. When you type a URL into the browser and press Enter, what is sent to the server is plain text that follows the HTTP request format, something like this:

GET / HTTP/1.1

Host: www.abc.com

Content-Type: text/html

...

What get_line does is read one line and, whether the line originally ended with \n or \r\n (or even a bare \r), store it ending in a single \n followed by the string terminator. The implementation is as follows:

int get_line(int sock, char *buf, int size)
{
 int i = 0;
 char c = '\0';
 int n;

 while ((i < size - 1) && (c != '\n'))
 {
  n = recv(sock, &c, 1, 0); // read one character at a time from sock, in a loop
  if (n > 0)
  {
   if (c == '\r') // a carriage return is normally followed immediately by \n
   {
    n = recv(sock, &c, 1, MSG_PEEK);
    if ((n > 0) && (c == '\n'))
     recv(sock, &c, 1, 0); // read it for real; c is now \n, so the loop exits
    else
     c = '\n';
   }
   buf[i] = c;
   i++;
  }
  else
   c = '\n';
 }
 buf[i] = '\0';

 return(i); // return the number of characters read
}
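
The line-ending handling can be exercised without a browser. The sketch below repeats the get_line logic so that it compiles on its own, and adds a hypothetical helper, first_line_of, that pushes a request string through a socketpair standing in for the client connection. Both the helper and the use of socketpair are my own assumptions for the experiment, not part of tinyhttpd.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Same logic as get_line above, repeated so the sketch is self-contained. */
int get_line(int sock, char *buf, int size)
{
    int i = 0;
    char c = '\0';
    int n;

    while ((i < size - 1) && (c != '\n')) {
        n = recv(sock, &c, 1, 0);
        if (n > 0) {
            if (c == '\r') {
                n = recv(sock, &c, 1, MSG_PEEK);   /* look, do not consume */
                if ((n > 0) && (c == '\n'))
                    recv(sock, &c, 1, 0);          /* swallow the \n of \r\n */
                else
                    c = '\n';                      /* bare \r becomes \n */
            }
            buf[i] = c;
            i++;
        } else {
            c = '\n';                              /* EOF or error ends the line */
        }
    }
    buf[i] = '\0';
    return i;
}

/* Hypothetical helper: send `request` into one end of a socketpair and
 * read the first line from the other end the way the server would.
 * Returns get_line's result, or -1 if the plumbing fails. */
int first_line_of(const char *request, char *out, int size)
{
    int sv[2];
    int n;
    ssize_t sent;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;
    sent = write(sv[0], request, strlen(request));
    close(sv[0]);              /* EOF after the request, so recv cannot block */
    n = (sent < 0) ? -1 : get_line(sv[1], out, size);
    close(sv[1]);
    return n;
}
```

Feeding it a request that starts with "GET / HTTP/1.1\r\n" should give back the same line ending in a single \n, which is the form the parsing code in accept_request relies on.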

Once get_line returns, the first line is parsed to determine whether the method is GET or POST; only these two are currently supported. If it is POST, cgi is set to 1, meaning that a CGI program is to be run. The same happens for a GET request whose URL carries a ? query string.
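
The splitting of the request line can be sketched on its own, away from the socket code. The function names below (next_token, split_request_line) are hypothetical and the structure is my own simplification; it mirrors what accept_request does with method, url and query_string, but it is not the author's code.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Copy the next whitespace-delimited token of *p into out (at most
 * outsz - 1 characters) and advance *p past what was copied. */
static void next_token(const char **p, char *out, size_t outsz)
{
    size_t i = 0;

    while (**p != '\0' && isspace((unsigned char)**p))
        (*p)++;
    while (**p != '\0' && !isspace((unsigned char)**p) && i < outsz - 1)
        out[i++] = *(*p)++;
    out[i] = '\0';
}

/* Split a request line such as "GET /a.cgi?x=1 HTTP/1.1\n" into the
 * method and the URL. Returns a pointer into url just past the '?',
 * or NULL when the URL carries no query string. */
char *split_request_line(const char *line, char *method, size_t msz,
                         char *url, size_t usz)
{
    char *q;

    next_token(&line, method, msz);
    next_token(&line, url, usz);
    q = strchr(url, '?');
    if (q != NULL)
        *q++ = '\0';        /* cut the URL at '?', as accept_request does */
    return q;
}
```

A non-NULL return value corresponds to the case where accept_request sets cgi to 1 for a GET request.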

Next, the URL to access is extracted; it may be the very common / or /index.html, and so on. The program treats htdocs as the root directory and index.html as the default file. It also checks whether the requested file has execute permission, and if so, treats it as a CGI program. Finally, depending on the value of the variable cgi, it either reads the static file or runs the CGI program and returns the result.
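
The execute-permission test is just a mask over st_mode. Here is a minimal sketch, assuming a hypothetical helper name has_exec_bit; it performs the same check as the three st_mode tests in accept_request but reports a failed stat separately.

```c
#include <sys/stat.h>

/* Hypothetical helper: returns 1 if path exists and any execute bit
 * (owner, group or other) is set, 0 if it exists without one, and
 * -1 if stat() fails, for example because the file does not exist. */
int has_exec_bit(const char *path)
{
    struct stat st;

    if (stat(path, &st) == -1)
        return -1;
    return (st.st_mode & (S_IXUSR | S_IXGRP | S_IXOTH)) ? 1 : 0;
}
```

One thing this makes easy to check: a directory normally carries execute bits too, and when accept_request appends /index.html to a directory path it is still looking at the directory's st_mode. As far as I can tell that would set cgi to 1 for such a request, which may be worth verifying in a debugger.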

Let's first look at the simplest static file case, calling the function serve_file

void serve_file(int client, const char *filename)
{
 FILE *resource = NULL;
 int numchars = 1;
 char buf[1024];

 buf[0] = 'A'; buf[1] = '\0';
 while ((numchars > 0) && strcmp("\n", buf))  /* read & discard headers */
  numchars = get_line(client, buf, sizeof(buf)); // the client's headers must be read first, otherwise what is sent afterwards does not display properly in the browser

 resource = fopen(filename, "r");
 if (resource == NULL)
  not_found(client);
 else
 {
  headers(client, filename);
  cat(client, resource);
 }
 fclose(resource);
}

It takes the file name as a parameter, first reads and discards the client's headers, and then opens the file as a stream. To form an HTTP response, a header is sent to the client first; it contains at least the following:

HTTP/1.0 200 OK

Server:

Content-Type:

\r\n (a blank line that marks the end of the headers)
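
Since the headers function is not listed, here is a rough sketch of how such a header block could be assembled. It is my own illustration, not tinyhttpd's headers function: format_ok_headers is a hypothetical name, and the server name and content type are parameters chosen for the example.

```c
#include <stdio.h>

/* Hypothetical helper: write a minimal HTTP/1.0 success header block
 * into buf. Returns the header length, or -1 if buf is too small. */
int format_ok_headers(char *buf, size_t size,
                      const char *server, const char *content_type)
{
    int n = snprintf(buf, size,
                     "HTTP/1.0 200 OK\r\n"
                     "Server: %s\r\n"
                     "Content-Type: %s\r\n"
                     "\r\n",               /* blank line ends the headers */
                     server, content_type);

    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The resulting buffer could then be handed to send in one call, followed by the file contents.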

Finally the body, that is, the file contents, is sent by the cat function, which reads the file with fgets and sends what it read, until the end of the file is reached. The headers and cat functions are not listed here. Let's look at a concrete test and then step through it in GDB.

I created a new file, index2.html, in the htdocs root directory with the following content:

<a href="http://10.108.222.96:54205/test.sh">display date</a>

I put a link here. The href concerns CGI and can be ignored for now; we only check whether the link text shows up in the browser.

After compiling, run ./httpd directly; the program prints "httpd running on port 53079".

We then open index2.html in the browser, as shown in the following image:


The text displays correctly. So how do we observe this in GDB?

xiaoqiang@ljq-lenovo:~/chenshi/tinyhttpd-0.1.0$ gdb attach 7029    "find the PID of the httpd process with ps, then attach with gdb"
Attaching to process 7029
Reading symbols from /home/xiaoqiang/chenshi/tinyhttpd-0.1.0/httpd...done.
Reading symbols from /lib/i386-linux-gnu/libpthread.so.0...(no debugging symbols found)...done.
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".
Loaded symbols for /lib/i386-linux-gnu/libpthread.so.0
Reading symbols from /lib/i386-linux-gnu/libc.so.6...(no debugging symbols found)...done.
Loaded symbols for /lib/i386-linux-gnu/libc.so.6
Reading symbols from /lib/ld-linux.so.2...(no debugging symbols found)...done.
Loaded symbols for /lib/ld-linux.so.2
0xb7750424 in __kernel_vsyscall ()
(gdb) bt
#0  0xb7750424 in __kernel_vsyscall ()
#1  0xb772dc08 in accept () from /lib/i386-linux-gnu/libpthread.so.0
#2  0x0804a8d6 in main () at httpd.c:516
(gdb) l accept_request
warning: Source file is more recent than executable.
47   /* A request has caused a call to accept() on the server port to
48    * return.  Process the request appropriately.
49    * Parameters: the socket connected to the client */
50   /**********************************************************************/
51   void accept_request(int client)
52   {
53    char buf[1024];
54    int numchars;
55    char method[255];
56    char url[255];
(gdb) l
57    char path[512];
58    size_t i, j;
59    struct stat st;
60    int cgi = 0;      /* becomes true if server decides this is a CGI
61                       * program */
62    char *query_string = NULL;
63
64    numchars = get_line(client, buf, sizeof(buf)); //reads a line from the socket
65    i = 0; j = 0;
66    while (!ISspace(buf[j]) && (i < sizeof(method) - 1))
(gdb) b 64    "set a breakpoint at line 64 to see what gets read"
Breakpoint 1 at 0x8048b3f: file httpd.c, line 64.
(gdb) c
Continuing.    "nothing more is printed until the browser sends a request"
[New Thread 0xb63feb40 (LWP 7655)]
[Switching to Thread 0xb63feb40 (LWP 7655)]

Breakpoint 1, accept_request (client=4) at httpd.c:64
64    numchars = get_line(client, buf, sizeof(buf)); //reads a line from the socket
(gdb) n
65    i = 0; j = 0;
(gdb) p buf    "print the line that was read"
$1 = "GET /index2.html HTTP/1.1\n", '\000' <repeats 997 times>    "it really is the first line of an HTTP GET request"
(gdb) l
60    int cgi = 0;      /* becomes true if server decides this is a CGI
61                       * program */
62    char *query_string = NULL;
63
64    numchars = get_line(client, buf, sizeof(buf)); //reads a line from the socket
65    i = 0; j = 0;
66    while (!ISspace(buf[j]) && (i < sizeof(method) - 1))
67    {
68     method[i] = buf[j];
69     i++; j++;
(gdb) l
70    }
71    method[i] = '\0'; //the HTTP method has been extracted
72
73    if (strcasecmp(method, "GET") && strcasecmp(method, "POST"))
74    {
75     //case-insensitive comparison
76     unimplemented(client);
77     return; //request method not supported yet, the thread returns
(gdb) l serve_file    "the other details are not demonstrated here; jump straight to serve_file"
412   * Parameters: a pointer to a file structure produced from the socket
413   *              file descriptor
414   *             the name of the file to serve */
415  /**********************************************************************/
416  void serve_file(int client, const char *filename)
417  {
418   FILE *resource = NULL;
419   int numchars = 1;
420   char buf[1024];
421
(gdb) l
422   buf[0] = 'A'; buf[1] = '\0';
423   while ((numchars > 0) && strcmp("\n", buf))  /* read & discard headers */
424    numchars = get_line(client, buf, sizeof(buf));
425
426   resource = fopen(filename, "r");
427   if (resource == NULL)
428    not_found(client);
429   else
430   {
431    headers(client, filename);
(gdb) b 426    "set a breakpoint at line 426"
Breakpoint 2 at 0x804a247: file httpd.c, line 426.
(gdb) c
Continuing.

Breakpoint 2, serve_file (client=4, filename=0xb63fdf4e "htdocs/index2.html") at httpd.c:426
426   resource = fopen(filename, "r");
(gdb) p filename
$2 = 0xb63fdf4e "htdocs/index2.html"
(gdb) n
427   if (resource == NULL)
(gdb) n
431    headers(client, filename);
(gdb) n
432    cat(client, resource);
(gdb) s    "step into cat to have a look"
cat (client=4, resource=0xb6c00468) at httpd.c:170
170  {
(gdb) l
165   * easier just to do something like pipe, fork, and exec("cat").
166   * Parameters: the client socket descriptor
167   *             FILE pointer for the file to cat */
168  /**********************************************************************/
169  void cat(int client, FILE *resource)
170  {
171   char buf[1024];
172
173   fgets(buf, sizeof(buf), resource);
174   while (!feof(resource))
(gdb) n
173   fgets(buf, sizeof(buf), resource);
(gdb) n
174   while (!feof(resource))
(gdb) p buf    "one line of index2.html has been read and is about to be sent"
$3 = "<a href=\"http://10.108.222.96:54205/test.sh\">display date</a>\n", '\000' <repeats 306 times>, ...    [the remaining bytes of the buffer are leftover stack contents and are omitted here]
(gdb) n
176   send(client, buf, strlen(buf), 0);
(gdb) n
177   fgets(buf, sizeof(buf), resource);
(gdb) n
174   while (!feof(resource))
(gdb) n
179  }
(gdb) n
serve_file (client=4, filename=0xb63fdf4e "htdocs/index2.html") at httpd.c:434
434   fclose(resource);
(gdb) bt
#0  serve_file (client=4, filename=0xb63fdf4e "htdocs/index2.html") at httpd.c:434
#1  0x08048f83 in accept_request (client=4) at httpd.c:130
#2  0xb7726d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
#3  0xb7665b8e in clone () from /lib/i386-linux-gnu/libc.so.6
(gdb) n
435  }
(gdb) s
accept_request (client=4) at httpd.c:139
139   close(client);    "only when execution gets here does the browser's request actually finish, that is, the spinning indicator on the tab stops"
(gdb) s
0xb7726d4c in start_thread () from /lib/i386-linux-gnu/libpthread.so.0
(gdb) s
Single stepping until exit from function start_thread, which has no line number information.
[New Thread 0xb5bfdb40 (LWP 7656)]
[Switching to Thread 0xb5bfdb40 (LWP 7656)]

Breakpoint 1, accept_request (client=4) at httpd.c:64
64    numchars = get_line(client, buf, sizeof(buf)); //reads a line from the socket
(gdb) n
[Thread 0xb63feb40 (LWP 7655) exited]
65    i = 0; j = 0;
(gdb) p buf
$4 = "GET /favicon.ico HTTP/1.1\n", '\000' <repeats 997 times>    "another line is read, this time for favicon.ico; I don't understand what is going on here"
(gdb)

The favicon.ico request is one that browsers send on their own to fetch the icon shown on the tab, which is why a second request arrives without any further click.

As already mentioned, tinyhttpd currently handles requests in two ways: a plain GET request is served as a static file, while a GET request with a ? query string or a POST request is handed to a CGI program. The CGI scripts in the htdocs directory of the source are written in Perl. I don't know whether you readers know Perl, but I don't, so I replaced them with a shell script that does what I need. As index2.html shows:

<a href="http://10.108.222.96:54205/test.sh">display date</a>

The test.sh script is as follows:

#!/bin/sh
#echo "Content-type: text/html"
echo
time=`date`
echo "<p>Server time: $time"
echo "</body>"

This is the text that the server sends back to the client; it carries the server time. Note that test.sh must be given execute permission, otherwise it is not treated as a CGI program, and that the port number in the href must be changed to the port your server is actually using; the one here is just an example. Now let's look at how the server responds when "display date" is clicked in the browser:

(gdb) l execute_cgi    "to save space, I have removed irrelevant output below"
warning: Source file is more recent than executable.
214   * Parameters: client socket descriptor
215   *             path to the CGI script */
216  /**********************************************************************/
217  void execute_cgi(int client, const char *path,
218                   const char *method, const char *query_string)
219  {
220   char buf[1024];
229
230   buf[0] = 'A'; buf[1] = '\0';
231   if (strcasecmp(method, "GET") == 0)
(gdb) b 231    "set a breakpoint in execute_cgi"
Breakpoint 1 at 0x8049555: file httpd.c, line 231.
(gdb) c
Continuing.    "when the browser first requests the page, serve_file is called; the breakpoint is in execute_cgi, so nothing stops here until the link is clicked"
[New Thread 0xb7567b40 (LWP 7708)]
[Thread 0xb7567b40 (LWP 7708) exited]
[New Thread 0xb6bffb40 (LWP 7709)]
[Thread 0xb6bffb40 (LWP 7709) exited]
[New Thread 0xb63feb40 (LWP 7710)]
[Switching to Thread 0xb63feb40 (LWP 7710)]

Breakpoint 1, execute_cgi (client=4, path=0xb63fdf4e "htdocs/test.sh", method=0xb63fe14e "GET", query_string=0xb63fe255 "") at httpd.c:231
231   if (strcasecmp(method, "GET") == 0)
(gdb) info args    "view the arguments of this call"
client = 4
path = 0xb63fdf4e "htdocs/test.sh"    "the file is the test.sh script"
method = 0xb63fe14e "GET"
query_string = 0xb63fe255 ""
257
258   if (pipe(cgi_output) < 0) {
259    cannot_execute(client);
260    return;
261   }
262   if (pipe(cgi_input) < 0) {
263    cannot_execute(client);
264    return;
265   }
266
(gdb) b 258    "set a breakpoint where the pipes are created"
Breakpoint 2 at 0x804973e: file httpd.c, line 258.
(gdb) c
Continuing.

Breakpoint 2, execute_cgi (client=4, path=0xb63fdf4e "htdocs/test.sh", method=0xb63fe14e "GET", query_string=0xb63fe255 "") at httpd.c:258
258   if (pipe(cgi_output) < 0) {
(gdb) n
262   if (pipe(cgi_input) < 0) {
(gdb) n
267   if ((pid = fork()) < 0) {
(gdb) l
262   if (pipe(cgi_input) < 0) {
263    cannot_execute(client);
264    return;
265   }
266
267   if ((pid = fork()) < 0) {
268    cannot_execute(client);
269    return;
270   }
271   if (pid == 0)  /* child: CGI script */
(gdb) l
272   {
273    char meth_env[255];
274    char query_env[255];
275    char length_env[255];
276
277    dup2(cgi_output[1], 1);
278    dup2(cgi_input[0], 0);
279    close(cgi_output[0]);
280    close(cgi_input[1]);
281    sprintf(meth_env, "REQUEST_METHOD=%s", method);
(gdb) l
282    putenv(meth_env);
283    if (strcasecmp(method, "GET") == 0) {    "my test is a GET request, but it does not actually need any environment variables"
284     sprintf(query_env, "QUERY_STRING=%s", query_string);
285     putenv(query_env);
286    }
287    else {   /* POST */
288     sprintf(length_env, "CONTENT_LENGTH=%d", content_length);
289     putenv(length_env);
290    }
291    execl(path, path, NULL);    "the child process executes test.sh"
(gdb) l
292    exit(0);
293   }
294
295   else {    /* parent */
296    close(cgi_output[1]);
297    close(cgi_input[0]);
298    if (strcasecmp(method, "POST") == 0)
299     for (i = 0; i < content_length; i++) {
300      recv(client, &c, 1, 0);
301      write(cgi_input[1], &c, 1);
(gdb) b 298    "the child executes test.sh while the parent sends the response to the browser, so first go into the parent and see what it sends"
Breakpoint 3 at 0x80498ec: file httpd.c, line 298.
(gdb) c
Continuing.

Breakpoint 3, execute_cgi (client=4, path=0xb63fdf4e "htdocs/test.sh", method=0xb63fe14e "GET", query_string=0xb63fe255 "") at httpd.c:298
298    if (strcasecmp(method, "POST") == 0)
(gdb) n
304    while (read(cgi_output[0], &c, 1) > 0)
(gdb) l
299     for (i = 0; i < content_length; i++) {    "for POST, the request body is read from the client and written to cgi_input, which is the child's standard input; the child's output comes back through cgi_output[1]"
300      recv(client, &c, 1, 0);
301      write(cgi_input[1], &c, 1);
302     }
303
304    while (read(cgi_output[0], &c, 1) > 0)
305     send(client, &c, 1, 0);
306
307    close(cgi_output[0]);
308    close(cgi_input[1]);
(gdb) s    "single-step: read from cgi_output[0]"
305     send(client, &c, 1, 0);
(gdb) p c
$1 = 10 '\n'
(gdb) s
305     send(client, &c, 1, 0);
(gdb) p c    "the rest is just reading the remaining output of the test script"

The result looks like this:

Of course, I have demonstrated only one case here. A GET request with a query string and a POST request with a body are left for the reader to try; I am being a little lazy.
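
For readers who want to try the CGI path outside the server, the pipe, fork, dup2 and exec sequence seen in execute_cgi can be reduced to a small function. This is a sketch under my own assumptions, not the tinyhttpd code: run_cgi_get is a hypothetical name, it handles only the GET case (no cgi_input pipe), it uses setenv where tinyhttpd uses putenv with local buffers, and it collects the output in a buffer instead of sending it to a client socket.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the program at `path` as a GET-style CGI
 * and capture its standard output in out (always NUL-terminated).
 * Returns the number of bytes stored, or -1 on failure. */
long run_cgi_get(const char *path, const char *query, char *out, size_t size)
{
    int cgi_output[2];
    pid_t pid;
    size_t len = 0;
    char c;
    int status;

    if (size == 0 || pipe(cgi_output) < 0)
        return -1;
    pid = fork();
    if (pid < 0) {
        close(cgi_output[0]);
        close(cgi_output[1]);
        return -1;
    }
    if (pid == 0) {                         /* child: becomes the CGI program */
        dup2(cgi_output[1], STDOUT_FILENO); /* stdout now feeds the pipe */
        close(cgi_output[0]);
        close(cgi_output[1]);
        setenv("REQUEST_METHOD", "GET", 1);
        setenv("QUERY_STRING", query ? query : "", 1);
        execl(path, path, (char *)NULL);
        _exit(127);                         /* reached only if execl failed */
    }
    close(cgi_output[1]);                   /* parent keeps only the read end */
    while (read(cgi_output[0], &c, 1) > 0)  /* byte by byte, as in execute_cgi */
        if (len < size - 1)
            out[len++] = c;
    out[len] = '\0';
    close(cgi_output[0]);
    waitpid(pid, &status, 0);
    return (long)len;
}
```

Pointing it at a program that prints its environment, for example /usr/bin/env if that exists on your system, with the query color=red should leave a listing in out that includes QUERY_STRING=color=red, which is how a CGI script sees the part of the URL after the ?.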

Well, that is about it for this walkthrough. There still seem to be a few details left to cover, and I have more studying to do. In short, this example really taught me a lot more about Linux programming. Thanks to open source, haha.

