Learn the Basic Coding Rules of the C Language from Open Source Projects
Each project has its own style guide: a set of coding conventions for that project. Some managers choose basic coding rules and others prefer very advanced ones, while many projects specify no coding rules at all, so each developer uses their own style.
A large code base is much easier to understand when all of its code follows a consistent style.
There are many resources on good coding rules, and we can learn them in the following ways:
- Reading books or magazines
- Browsing websites
- Talking with colleagues
- Attending training
Another more interesting approach is to learn how developers write code by studying a well-known open source project. For the C language, the Linux kernel is a good choice.
For beginners and even intermediate C developers, it may not be easy to dig into the code of the Linux kernel. Our goal, however, is not necessarily to contribute to its source but to explore how it is implemented.
Let's take a function implementation from the Linux source code as an example:
This code looks very clean. In fact, this function:
- Has only a few lines of code
- Has a well-defined signature
- Is well commented
- Is well indented
- Uses very clear variable names
The same functionality could be implemented by another developer as follows:
Coding style has a huge impact on the readability of source code. Devoting some time to training developers and reviewing the code periodically always pays off, because it keeps the code easy to maintain and evolve.
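To make the contrast concrete, here is a minimal sketch. Both functions are hypothetical, written for this article rather than taken from the kernel tree. They compute the same result, but only the first has the properties listed above.

```c
#include <stddef.h>

/**
 * count_set_flags - count the non-zero entries in an array of flags
 * @flags: array of flag values (hypothetical example data)
 * @len:   number of elements in @flags
 *
 * Return: the number of entries that are non-zero.
 */
size_t count_set_flags(const int *flags, size_t len)
{
	size_t count = 0;
	size_t i;

	for (i = 0; i < len; i++) {
		if (flags[i])
			count++;
	}

	return count;
}

/* Same behaviour, but with cryptic names, no documentation, and a cramped layout. */
size_t cnt(const int *a, size_t n)
{size_t c=0,x;for(x=0;x<n;x++){if(a[x]!=0)c=c+1;}return c;}
```

A compiler treats the two versions identically; the difference only matters to the next person who has to read or change them.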
Let's use CppDepend to explore the Linux kernel's source code and discover some of the coding rules its developers follow.
Modularity
Modularity is a design technique that increases the extent to which software is composed of separate, independent parts. Modular code is easier to manage and maintain.
For procedural languages like C, which have no namespaces, components, or classes, we can use directories and files to modularize the code.
Here are some possible scenarios:
- Put all the code in one directory
- Isolate the files related to a module or submodule in a dedicated directory
The Linux kernel uses directories and subdirectories to modularize its source code:
Encapsulation
Encapsulation means hiding functions and data inside the implementation. In C, encapsulation is achieved with the static keyword. Functions and variables declared this way at file level are visible only within their own source file and are called file-scope functions and variables.
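As a minimal sketch of this technique, consider a hypothetical source file (all names below are invented for illustration and do not come from the kernel). The static variable and the static helper are hidden from every other file; only record_events can be called from outside.

```c
/* counter.c: hypothetical module, used only for illustration */

/* File-scope variable: invisible to other translation units. */
static int event_count;

/* File-scope helper: callable only from within this file. */
static int clamp_to_limit(int value, int limit)
{
	return value > limit ? limit : value;
}

/* Public entry point: the only symbol other files can link against. */
int record_events(int new_events)
{
	event_count = clamp_to_limit(event_count + new_events, 100);
	return event_count;
}
```

Another file that declares `extern int event_count;` or tries to call `clamp_to_limit()` would fail at link time, which is exactly the protection encapsulation is meant to give.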
Let's look at all the static functions by executing the following CQLinq query:
Using the metric view, we can clearly see how many of the functions are static. In the metric view, the code base is represented as a treemap, a way of displaying tree-structured data with nested rectangles. The tree structure used in CppDepend is the usual code hierarchy:
- A project contains directories
- A directory contains files
- A file contains structs, functions, and variables
The treemap view provides a useful way to represent the result of a CQLinq query, so we can see visually which code elements it matches.
As we can see, many functions are declared static.
Now let's query the static variables:
As with functions, many variables are declared static.
In the Linux kernel source code, encapsulation is applied whenever functions and variables need to be private to a file.
Use structs to store your data model
In C programming, functions use variables to do their work. These variables can be:
- Static variables
- Global variables
- Local variables
- Struct member variables
Each project has a data model that many source files may need to use. Global variables can serve this purpose, but they are not a good solution; grouping the data into structs is recommended.
Let's search for global variables that are of a primitive type:
Only a few variables match the query. Some of them could perhaps be grouped into structs, for example elfcorehdr_addr and elfcorehdr_size, or pm_freezing and pm_nosig_freezing.
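A rough sketch of such a grouping for the first pair might look like the following. The struct, its field types, and the helper function are assumptions made for illustration; the kernel does not define them this way.

```c
/*
 * Before (two related globals, shown as comments only):
 *   elfcorehdr_addr
 *   elfcorehdr_size
 */

/* After: hypothetical struct that keeps the related values together. */
struct elfcorehdr_info {
	unsigned long long addr;	/* assumed type: start address of the header */
	unsigned long long size;	/* assumed type: size of the header in bytes */
};

/* Hypothetical helper: functions take the whole group instead of loose globals. */
unsigned long long elfcorehdr_end(const struct elfcorehdr_info *hdr)
{
	return hdr->addr + hdr->size;
}
```

With the struct in place, the relationship between the two values is visible in the type itself, and a function signature shows which data it depends on.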
Keep functions short and sweet
Here is the recommendation on function length from the Linux coding style page:
Functions should be short and sweet, and do just one thing. They should fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24, as we all know), and do one thing and do that well.
The maximum length of a function is inversely proportional to the complexity and indentation level of that function. So, if you have a conceptually simple function that is just one long (but simple) case-statement, where you have to do lots of small things for a lot of different cases, it's OK to have a longer function.
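The exception for a long but simple case statement can be illustrated with a small sketch. The error codes and the function below are hypothetical. The function would keep growing as codes are added, yet it stays trivial to read because every branch does one small thing.

```c
/* Hypothetical error codes, invented for this sketch. */
enum disk_error {
	DISK_OK,
	DISK_NOT_FOUND,
	DISK_READ_ONLY,
	DISK_FULL,
	DISK_IO_FAILURE,
	DISK_TIMEOUT
};

/* Long but conceptually simple: one flat switch, one trivial action per case. */
const char *disk_error_name(enum disk_error err)
{
	switch (err) {
	case DISK_OK:
		return "ok";
	case DISK_NOT_FOUND:
		return "not found";
	case DISK_READ_ONLY:
		return "read-only";
	case DISK_FULL:
		return "disk full";
	case DISK_IO_FAILURE:
		return "I/O failure";
	case DISK_TIMEOUT:
		return "timeout";
	}

	/* Fallback for values outside the enum. */
	return "unknown";
}
```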
Let's look for functions with more than 30 lines:
Only a few functions have more than 30 lines.
Number of function arguments
Functions that have more than 8 parameters can be painful to call, and the extra arguments may also degrade performance. An alternative is to pass a structure dedicated to holding the arguments.
Only 2 functions have more than 8 parameters.
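The alternative of passing a structure dedicated to the arguments can be sketched as follows. The struct and the function are hypothetical examples, not kernel code.

```c
/*
 * Before (hypothetical): nine positional arguments, easy to mix up.
 *   long rect_outer_area(int x, int y, int width, int height,
 *                        int border_width, int fill_color,
 *                        int border_color, int alpha, int z_order);
 */

/* After: the arguments travel together in one named structure. */
struct rect_params {
	int x, y;
	int width, height;
	int border_width;
	int fill_color;
	int border_color;
	int alpha;
	int z_order;
};

/* One pointer argument replaces nine separate ones. */
long rect_outer_area(const struct rect_params *p)
{
	long outer_width = (long)p->width + 2L * p->border_width;
	long outer_height = (long)p->height + 2L * p->border_width;

	return outer_width * outer_height;
}
```

A caller can then fill in only the fields it cares about with designated initializers, for example `struct rect_params p = { .width = 10, .height = 5, .border_width = 1 };`, and the remaining fields default to zero.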
Number of local variables
Functions with more than 8 local variables are hard to understand and maintain. Functions with more than 15 are extremely complex and should be split into smaller functions (unless they are generated automatically by a tool).
Only 5 functions have more than 15 local variables.
Avoid defining complex functions
Many metrics can be used to detect complex functions. The number of lines of code, the number of parameters, and the number of local variables are the basic ones.
There are some other interesting indicators:
- Cyclomatic complexity is a widely used metric for procedural software. It is equal to the number of decisions that can be taken in a procedure.
- Nesting depth is a metric defined on a function that reflects the depth of the most deeply nested scope in its body.
- Max nested loop is equal to the deepest level of loop nesting in a function.
The acceptable maximum for each of these metrics depends mostly on the team's choice; there are no standard reference values.
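To show how these metrics differ, here is a small sketch with two hypothetical functions that behave identically. Both contain the same decision points (one null check, one loop condition, one comparison), so their cyclomatic complexity should be the same, but the second uses an early return to remove one level of nesting. The exact numbers a tool reports depend on how it counts.

```c
#include <stddef.h>

/* Nested version: the inner test sits three scopes deep (if > for > if). */
int sum_positive_nested(const int *values, size_t len)
{
	int sum = 0;
	size_t i;

	if (values) {
		for (i = 0; i < len; i++) {
			if (values[i] > 0)
				sum += values[i];
		}
	}

	return sum;
}

/* Flattened version: a guard clause handles the null case up front. */
int sum_positive_flat(const int *values, size_t len)
{
	int sum = 0;
	size_t i;

	if (!values)
		return 0;

	for (i = 0; i < len; i++) {
		if (values[i] > 0)
			sum += values[i];
	}

	return sum;
}
```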
Let's look for functions that might be candidates for refactoring:
Only a few functions can be considered complex.
Naming conventions
There is no standard for naming conventions, and each project manager can choose what they think works best. What matters is to follow the chosen conventions and to keep naming consistent across the project.
For example, in Linux, struct names must start with a lowercase letter. We can check whether this rule holds across the whole kernel source by executing the following query:
Only 4 structs begin with "_" instead of a lowercase letter.
Indentation
Indentation does a lot to make code easy to read, and the following excerpt from the Linux coding style page explains the motivation behind it.
Rationale: The whole idea behind indentation is to clearly define where a block of control starts and ends. Especially when you've been looking at your screen for 20 straight hours, you'll find it a lot easier to see how the indentation works if you have large indentations.
Now, some people will claim that having 8-character indentations makes the code move too far to the right, and makes it hard to read on a 80-character terminal screen. The answer to that is that if you need more than 3 levels of indentation, you're screwed anyway, and should fix your program.
Conclusion
Exploring well-known open source projects is always a good way to improve your programming skills. There is no need to download and build a project: you can browse its source code online, for example on GitHub.