Some ways to count the number of lines of source code
This problem can be solved in almost any language, and the solution can be as simple or as complex as you like. In this post we only discuss counting source lines on Linux with shell and C. I also intended to write a Python version, but since I am not very familiar with Python yet, that will come later.
Shell version
This is where the shell's strength in quick one-liners shows. The find command can locate the target files directly, and we can then count the lines of each file it returns. For the counting itself we know to use wc, but take a look at what wc -l prints for a file:
206 ./2014-03-09-jekyll-blog.md
The number at the front is what we want, while the file name that follows is not. How can this be solved? One idea is to save the result to a file, transform it with cut or a similar command, and then do the counting. This is rather cumbersome. The second idea is to use awk to split the result directly. We will write the code along the lines of the second idea.
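As a minimal sketch of the second idea, the fragment below keeps only the first field of the wc -l output. The function name count_file is a hypothetical helper chosen for illustration and is not taken from the original script.

```shell
#!/bin/sh
# count_file FILE  (hypothetical helper name, used only in this sketch)
# wc -l prints "<count> <path>"; awk keeps the first field, i.e. the count.
count_file() {
    wc -l "$1" | awk '{print $1}'
}
```

Called on the file shown above, count_file ./2014-03-09-jekyll-blog.md would print only the number and drop the path.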
Here $1 is the path, $2 is the maximum depth, and $3 is the file extension or a regular expression. I originally planned to take a test pattern from user input and build the regular expression myself, but that felt overly complicated. A project generally uses only a few file suffixes, and we only count the core code, so this part can stay simple.
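The three arguments could be handed to find roughly as follows. This is a sketch under assumptions: the function name find_sources is hypothetical, the third argument is taken to be a bare extension such as c or md, and -maxdepth is an option of GNU and BSD find rather than of POSIX find.

```shell
#!/bin/sh
# find_sources DIR MAXDEPTH EXT  (hypothetical helper for this sketch)
#   $1  directory to search
#   $2  maximum depth to descend
#   $3  file extension without the dot, e.g. c or md (assumed form)
find_sources() {
    find "$1" -maxdepth "$2" -type f -name "*.$3"
}
```

For example, find_sources . 3 c would list the regular files ending in .c at most three levels below the current directory.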
The file here is one of the paths returned by the earlier find, and we put the code above in a loop. This is the simplest use of awk: it prints the first field of a record. Do not confuse this $1 with the $1 of the shell script. They are different: this one is inside the awk command, so there is no need to worry about a conflict.
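Putting the pieces together, the loop might look like the sketch below. The function name count_tree and the variable names file, n, and total are assumptions made for illustration; the complete original script may be organized differently.

```shell
#!/bin/sh
# count_tree DIR MAXDEPTH EXT  (hypothetical name; a sketch, not the original script)
count_tree() {
    # Here $1, $2 and $3 are the shell function's own arguments.
    find "$1" -maxdepth "$2" -type f -name "*.$3" | {
        total=0
        while IFS= read -r file; do
            # This $1 is inside single quotes, so it belongs to awk:
            # the first field of the wc output, i.e. the line count.
            n=$(wc -l "$file" | awk '{print $1}')
            echo "$file: $n"
            total=$((total + n))
        done
        # The braces keep this echo in the same subshell as the loop,
        # so the accumulated total is still visible here.
        echo "total: $total"
    }
}
```

The loop reads one path per line, so file names that contain newlines are not handled by this sketch.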
In order to make the output look better, we add some modifications to the echo output, mainly the color modification:
echo -e "$s: \e[1;32m $t \e[0m"
echo -e "\e[1;31m total: \e[1;32m $w \e[0m"
Now, a script that basically meets the requirements for code statistics is complete. A complete version can be viewed in my gist.
C version
The reason I want to do the counting in C is that on Linux I can solve the problem with a bit of system programming. Standard C provides no directory operations, and I have not looked into how this is done on Windows, so I cannot add details for that platform.
How do we implement the C version of the counting program? First we solve file traversal; once that is done, the rest is not a worry. Fortunately, Linux provides the dirent.h header, which declares the relevant functions. The ones we can use are opendir, readdir, telldir, seekdir, and closedir, which cover almost everything this program needs. The one missing piece is a function for changing the current directory, chdir, which is declared in unistd.h. With these we have basically everything.
readdir returns a pointer to a struct whose d_name member holds the name of the entry. With this name we can look up the entry's attributes, because we need to decide whether it is a file or a directory.
How do we decide? C has no quick test like the shell's if [ -d filename ], but remember that on Linux everything is a file, so we can examine the file's attributes. The lstat() function is declared in sys/stat.h. We call lstat(filename, &statbuf) and then test the result with the macro S_ISDIR(statbuf.st_mode). That settles the file-type check.
Next, how do we traverse a folder? It has to happen in a loop, because a directory usually contains more than one file. But what about subdirectories inside a directory? We use recursion and let the function we wrote call itself, because only one directory is read at a time. readdir returns NULL when it reaches the end of the directory. You may ask how this recursion terminates. After the loop ends, chdir("..") goes back up one level and the traversal continues there. This is the termination condition of the recursion.
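For comparison, the same recursive walk can be sketched in shell with the [ -d filename ] test. This only illustrates the recursion and uses a hypothetical function name, walk; the C program in this post relies on opendir, readdir, and lstat instead.

```shell
#!/bin/sh
# walk DIR: print every non-directory entry below DIR (hypothetical helper).
walk() {
    # The glob never yields . or .. (and skips hidden names),
    # so no explicit check for the two special entries is needed here.
    for entry in "$1"/*; do
        [ -e "$entry" ] || [ -L "$entry" ] || continue   # unmatched glob in an empty directory
        # [ -d ] follows symbolic links; excluding links mimics lstat,
        # which reports on the link itself rather than on its target.
        if [ -d "$entry" ] && [ ! -L "$entry" ]; then
            walk "$entry"       # a directory: recurse into it
        else
            echo "$entry"       # not a directory: a file to count
        fi
    done
}
```

Each call handles exactly one directory, and the recursion stops on its own when a directory contains no further subdirectories.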
Now it is time to count the code. Directories have been filtered out, and what remains are the actual files. To count, we read each file with fgets in a loop and increment a counter for every line. One drawback is that this version does not check the file type: it counts every file in the directory, which is less flexible than the shell version.
The source code follows; the basic idea is as described above. Note that there are two special directory entries, . and .., which we must skip. Otherwise an endless loop will occur.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <dirent.h>     /* directory operations */
#include <sys/stat.h>

#define MAXLINE 200

int linecount = 0;      /* running total of lines */

/* Count the lines of one file; returns 0 if the file cannot be opened. */
int getLine(const char *fname)
{
    FILE *fp;
    char line[MAXLINE];
    int total = 0;

    fp = fopen(fname, "r");
    if (fp == NULL)
        return 0;
    while (fgets(line, MAXLINE, fp) != NULL) {
        /* A line longer than the buffer arrives in pieces; count it once.
           As with wc -l, a last line without a newline is not counted. */
        if (strchr(line, '\n') != NULL)
            total++;
    }
    fclose(fp);
    return total;
}

/* Recurse into the target directory */
void Recur_dir(const char *dir, int depth)
{
    DIR *dp;                /* directory stream */
    struct dirent *entry;   /* one directory entry */
    struct stat statbuf;    /* used to tell directories from files */

    /* open the directory and enter it */
    if ((dp = opendir(dir)) == NULL || chdir(dir) != 0) {
        fprintf(stderr, "cannot open directory %s: %s\n", dir, strerror(errno));
        exit(1);
    }
    while ((entry = readdir(dp)) != NULL) {
        /* get the entry status; skip entries that cannot be examined */
        if (lstat(entry->d_name, &statbuf) != 0)
            continue;
        if (S_ISDIR(statbuf.st_mode)) {
            /* a directory: ignore . and .. */
            if (strcmp(".", entry->d_name) == 0 || strcmp("..", entry->d_name) == 0)
                continue;
            /* printf("%*s%s\n", depth, "", entry->d_name); */
            Recur_dir(entry->d_name, depth + 4);    /* descend into it */
        } else {
            /* not a directory: count its lines */
            linecount += getLine(entry->d_name);
        }
    }
    if (chdir("..") != 0)   /* go back up one level */
        perror("chdir");
    closedir(dp);
}

int main(int argc, char *argv[])
{
    const char *topdir = ".";

    if (argc > 2) {
        printf("Usage: %s dir\n", argv[0]);
        exit(1);
    }
    if (argc == 2)
        topdir = argv[1];
    Recur_dir(topdir, 0);
    printf("Total:%d\n", linecount);
    return 0;
}
If no argument is given, the current directory is traversed; otherwise the lines in the given directory are counted.