Reposted from: http://blog.csdn.net/chgaowei/article/details/6658260
What is data-driven programming?
Preface:
I have recently been studying The Art of Unix Programming. I used to think the book was about Unix tools; after reading it carefully, I see that it is about design principles. Its core is the Unix philosophy and the 17 design principles introduced in Chapter 1, which the rest of the book develops. As I have said before, you should pick learning material that suits you, and one way to judge whether it suits you is whether you can follow it. I only wish I had found this book sooner. With 4 to 6 years of work experience, you should be able to read it.
Question:
Among the Unix design principles the book introduces is the Rule of Representation: "Fold knowledge into data so program logic can be stupid and robust." Based on my own experience, this principle resonated strongly with me, so I started by learning about data-driven programming. I am sharing what I learned here and would like to discuss it with you.
Core of data-driven programming
The starting point of data-driven programming is that humans are better at handling data than program logic. Data is easier to manage than program logic, so we should try to move design complexity out of program code and into data.
Is that true? Let's look at an example.
Suppose a program needs to process messages sent by other programs. The message type is a string, and each message type needs its own handler function. A first attempt might look like this:
```c
void msg_proc(const char *msg_type, const char *msg_buf)
{
    if (0 == strcmp(msg_type, "inivite"))
    {
        inivite_fun(msg_buf);
    }
    else if (0 == strcmp(msg_type, "tring_100"))
    {
        tring_fun(msg_buf);
    }
    else if (0 == strcmp(msg_type, "ring_180"))
    {
        ring_180_fun(msg_buf);
    }
    else if (0 == strcmp(msg_type, "ring_181"))
    {
        ring_181_fun(msg_buf);
    }
    else if (0 == strcmp(msg_type, "ring_182"))
    {
        ring_182_fun(msg_buf);
    }
    else if (0 == strcmp(msg_type, "ring_183"))
    {
        ring_183_fun(msg_buf);
    }
    else if (0 == strcmp(msg_type, "ok_200"))
    {
        ok_200_fun(msg_buf);
    }
    ......
    else if (0 == strcmp(msg_type, "fail_486"))
    {
        fail_486_fun(msg_buf);
    }
    else
    {
        log("unknown message type %s\n", msg_type);
    }
}
```
The message types above are borrowed from the SIP protocol (not exactly; SIP itself is modeled on HTTP), and the number of message types may grow. Reading through this chain of branches is tiring, and it is hard to check whether some message in the middle has been handled. In addition, every new message requires a new branch in the flow.
According to the data-driven programming idea, the design may be as follows:
```c
typedef void (*sip_msg_fun)(const char *);

typedef struct _msg_fun_st
{
    const char *msg_type;   // message type
    sip_msg_fun fun_ptr;    // function pointer
} msg_fun_st;

msg_fun_st msg_flow[] =
{
    {"inivite", inivite_fun},
    {"tring_100", tring_fun},
    {"ring_180", ring_180_fun},
    {"ring_181", ring_181_fun},
    {"ring_182", ring_182_fun},
    {"ring_183", ring_183_fun},
    {"ok_200", ok_200_fun},
    ......
    {"fail_486", fail_486_fun}
};

void msg_proc(const char *msg_type, const char *msg_buf)
{
    int type_num = sizeof(msg_flow) / sizeof(msg_fun_st);
    int i = 0;
    for (i = 0; i < type_num; i++)
    {
        if (0 == strcmp(msg_flow[i].msg_type, msg_type))
        {
            msg_flow[i].fun_ptr(msg_buf);
            return;
        }
    }
    log("unknown message type %s\n", msg_type);
}
```
This design has the following advantages:
1. It is more readable: the message-handling flow is clear.
2. It is easier to modify: to add a new message, you only change the data, not the flow.
3. Reuse: in the first solution, the many else-if branches differ only in message type and handler function, while their logic is identical. The second solution factors out that shared logic and gathers the parts that tend to change into the table.
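To make the second advantage concrete, here is a minimal, compilable miniature of the table-driven dispatcher. The stub handlers, the `last_handler` variable, and the integer return value of `msg_proc` are assumptions added for illustration; they are not part of the original code.

```c
#include <stdio.h>
#include <string.h>

typedef void (*sip_msg_fun)(const char *);

/* Hypothetical stub handlers: each one only records that it ran. */
const char *last_handler = "";

void inivite_fun(const char *msg_buf)  { (void)msg_buf; last_handler = "inivite_fun"; }
void ring_180_fun(const char *msg_buf) { (void)msg_buf; last_handler = "ring_180_fun"; }
void fail_486_fun(const char *msg_buf) { (void)msg_buf; last_handler = "fail_486_fun"; }

typedef struct {
    const char *msg_type;   /* message type */
    sip_msg_fun fun_ptr;    /* handler for that type */
} msg_fun_st;

/* The data: one row per supported message. */
const msg_fun_st msg_flow[] = {
    {"inivite",  inivite_fun},
    {"ring_180", ring_180_fun},
    {"fail_486", fail_486_fun},
};

/* The logic: returns 1 if a handler was found and called, 0 otherwise. */
int msg_proc(const char *msg_type, const char *msg_buf)
{
    size_t type_num = sizeof(msg_flow) / sizeof(msg_flow[0]);
    for (size_t i = 0; i < type_num; i++) {
        if (strcmp(msg_flow[i].msg_type, msg_type) == 0) {
            msg_flow[i].fun_ptr(msg_buf);
            return 1;
        }
    }
    fprintf(stderr, "unknown message type %s\n", msg_type);
    return 0;
}
```

Supporting a new message would mean adding one row to `msg_flow` (plus its handler); `msg_proc` itself stays unchanged.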
The ideas behind it:
The principles behind many design ideas are in fact the same. The ideas behind data-driven programming include:
1. Control complexity. By moving the complexity of program logic into data, which humans handle more easily, the complexity is kept under control.
2. Isolate change. In the example above, the logic for handling a message does not change, but the set of messages may. The part that changes easily (the messages) is therefore separated from the part that does not (the logic).
3. Separate mechanism from policy. This is very similar to the second point. The book refers to mechanism and policy in many places. In the example above, as I understand it, the mechanism is the message-dispatch logic and the policy is how each message is handled (I would like to write a separate article on mechanism and policy later).
What data-driven programming can be used for:
As the example above shows, it can be applied to function-level design.
It can also be applied to program-level design, for example to implement a table-driven state machine (to be covered in a later article).
It can also be used in system-level design, for example a DSL (I lack experience here and am not yet certain about this).
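As a rough idea of what a table-driven state machine can look like, here is a minimal sketch. The call states, the events, and the `next_state` function are all hypothetical names chosen for illustration; the transitions live entirely in a two-dimensional table indexed by current state and incoming event.

```c
/* Hypothetical states and events for a simple call; the *_COUNT entries
   give the table dimensions. */
typedef enum { ST_IDLE, ST_RINGING, ST_CONNECTED, ST_COUNT } call_state;
typedef enum { EV_INVITE, EV_ANSWER, EV_HANGUP, EV_COUNT } call_event;

/* transition[current state][event] = next state */
const call_state transition[ST_COUNT][EV_COUNT] = {
    /*                 EV_INVITE      EV_ANSWER      EV_HANGUP */
    /* ST_IDLE      */ { ST_RINGING,   ST_IDLE,       ST_IDLE },
    /* ST_RINGING   */ { ST_RINGING,   ST_CONNECTED,  ST_IDLE },
    /* ST_CONNECTED */ { ST_CONNECTED, ST_CONNECTED,  ST_IDLE },
};

/* Looks up the next state; out-of-range input leaves the state unchanged. */
call_state next_state(call_state s, call_event e)
{
    if ((unsigned)s >= ST_COUNT || (unsigned)e >= EV_COUNT)
        return s;
    return transition[s][e];
}
```

Changing the behavior of the machine then means editing table cells rather than rewriting branches.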
What it is not:
1. It is not a brand-new programming model: it is just a design concept with a long history, and it is widely applied in the Unix/Linux community.
2. It is not the same as data in object-oriented design: "In data-driven programming, data not only represents the state of an object but also defines the control flow of the program. OO focuses on encapsulation; data-driven programming focuses on writing as little code as possible."
Quotes from the book worth thinking about:
Data dominates. If you've chosen the right data structures and organized things well, the algorithms will almost always be self-evident. Data structures, not algorithms, are central to programming. -- Rob Pike
The programmer at wit's end ... can often do best by disentangling himself from his code, rearing back, and contemplating his data. Representation is the essence of programming. -- Fred Brooks
Data is easier to manage than program logic. Moving design complexity from code into data is good practice. -- the author of The Art of Unix Programming
Data-driven programming: the table-driven method
The sample code in this article uses the C language.
I previously introduced data-driven programming in "What is data-driven programming?", which described a simple data-driven approach. Today we go a step further and introduce a more complex and more practical one: the table-driven method.
The table-driven method is mentioned in The Art of Unix Programming. For a more detailed description, see Code Complete, which has a dedicated chapter on it (Chapter 18 in the second edition).
A simple table-driven example:
The article "What is data-driven programming?" contains sample code that can also be regarded as table-driven, although its table is fairly simple: on receiving a message, it uses the message type to decide which handler function to call.
A more complex table-driven example:
Consider a message (event) driven system in which one module communicates with several other modules. When it receives a message, it must handle it differently depending on the sender, the message type, and its own current state. A common approach is to hard-code this as three cascaded levels of switch statements:
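```c
switch (sendmode)               /* which module sent the message */
{
case mode_a:
    switch (msgevent)           /* which message it is */
    {
    case event_a:
        switch (mystatus)       /* which state this module is in */
        {
        case status_1:
            /* handle (mode_a, event_a, status_1) */
            break;
        /* ... one case per status ... */
        }
        break;
    /* ... one case per message ... */
    }
    break;
/* ... one case per sending module ... */
}
```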
Disadvantages of this method:
1. Low readability: to find the code that handles a given message, you have to jump through several layers of code.
2. Too many switch branches, which are really a form of repeated code. They share common features and can be factored out.
3. Poor extensibility: adding a new module state may require changing every message-handling function, which is inconvenient and error-prone.
4. The program lacks a backbone: there is no outline-level skeleton, and the main flow is drowned in a mass of code logic.
Use the table-driven method:
Define a function jump table based on the three enumerated values: module type, message type, and module status:
```c
/* mode_type, event_type, status_type (enums) and event_fun (a function
   pointer type) are assumed to be defined elsewhere. */
typedef struct _event_drive
{
    mode_type mod;        // message sending module
    event_type event;     // message type
    status_type status;   // module status
    event_fun eventfun;   // handler pointer for this combination
} event_drive;

// The driver table. A "table" here need not be a database table; a struct array works too.
event_drive eventdriver[] =
{
    {mode_a, event_a, status_1, fun1},
    {mode_a, event_a, status_2, fun2},
    {mode_a, event_a, status_3, fun3},
    {mode_a, event_b, status_1, fun4},
    {mode_a, event_b, status_2, fun5},
    {mode_b, event_a, status_1, fun6},
    {mode_b, event_a, status_2, fun7},
    {mode_b, event_a, status_3, fun8},
    {mode_b, event_b, status_1, fun9},
    {mode_b, event_b, status_2, fun10}
};

int driversize = sizeof(eventdriver) / sizeof(event_drive); // size of the driver table

// driver table lookup function
event_fun getfunfromdriver(mode_type mod, event_type event, status_type status)
{
    int i = 0;
    for (i = 0; i < driversize; i++)
    {
        if ((eventdriver[i].mod == mod) && (eventdriver[i].event == event) && (eventdriver[i].status == status))
        {
            return eventdriver[i].eventfun;
        }
    }
    return NULL;
}
```
Benefits of this method:
1. Better readability. To see how a message is handled, you only need to look at the driver table.
2. Less repeated code. This method certainly needs less code than the first one. Why? It abstracts away the repetition: the switch branching, which is the common part of the flow (looking up a handler by the three keys), becomes the getfunfromdriver function plus a driver table.
3. Extensibility. Note the function pointer: its definition is in effect a contract, similar to an interface in Java or a pure virtual function in C++. Any function that satisfies it (parameters and return value) can serve as an event handler. This gives the design a plug-in flavor: you can easily replace, add, or remove handlers and thereby change the program's behavior, while the lookup of handlers stays untouched (again, isolating change).
4. The program has a clear backbone.
5. Lower complexity. By moving the complexity of program logic into data, which humans handle more easily, the complexity is kept under control.
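The caller's side of the driver table can be sketched as follows. The enum definitions, the handler signature behind `event_fun`, the stub handlers, and the `dispatch_event` wrapper are all assumptions made so that the example compiles on its own; the original post does not show them. The table is shortened to three rows.

```c
#include <stddef.h>

typedef enum { mode_a, mode_b } mode_type;
typedef enum { event_a, event_b } event_type;
typedef enum { status_1, status_2 } status_type;

/* Assumed handler contract: takes the message body, returns a result code. */
typedef int (*event_fun)(const char *msg);

/* Hypothetical stub handlers that just return their own number. */
int fun1(const char *msg) { (void)msg; return 1; }
int fun2(const char *msg) { (void)msg; return 2; }
int fun6(const char *msg) { (void)msg; return 6; }

typedef struct {
    mode_type   mod;
    event_type  event;
    status_type status;
    event_fun   eventfun;
} event_drive;

const event_drive eventdriver[] = {
    { mode_a, event_a, status_1, fun1 },
    { mode_a, event_a, status_2, fun2 },
    { mode_b, event_a, status_1, fun6 },
};

/* Linear search of the driver table; NULL when no row matches. */
event_fun getfunfromdriver(mode_type mod, event_type event, status_type status)
{
    size_t n = sizeof(eventdriver) / sizeof(eventdriver[0]);
    for (size_t i = 0; i < n; i++) {
        if (eventdriver[i].mod == mod && eventdriver[i].event == event &&
            eventdriver[i].status == status)
            return eventdriver[i].eventfun;
    }
    return NULL;
}

/* The backbone: look up the handler, call it, or report -1 if none exists. */
int dispatch_event(mode_type mod, event_type event, status_type status,
                   const char *msg)
{
    event_fun fun = getfunfromdriver(mod, event, status);
    return fun ? fun(msg) : -1;
}
```

Any function matching the `event_fun` signature can be plugged into a row, which is the contract described in point 3.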
Inheritance and composition
Consider an event-driven module that manages many users, each of whom must handle many events. The driver table we build is then not for the module but for the user: it describes what a user in a given state does on receiving a given event from a given module. Suppose further that users are divided into levels and each level handles events differently.
With an object-oriented approach, we could design a user base class that implements the handling common to all users, and then define a subclass per level that inherits the common handling and implements the level-specific handling. This is the most common idea; call it the inheritance approach.
How would the table-driven approach do it? Design a single user class with no subclasses and no concrete event-handling methods. It has one member, a driver table, and on receiving an event it delegates all handling to that table. Defining a different driver table for each user level lets us assemble different object instances. Call this the composition approach.
Inheritance and composition are also discussed in design patterns. The advantages of composition are extensibility, flexibility, and its emphasis on encapsulation. (For more on the two, see the article on inheritance and composition in object-oriented design.)
In this case, the driver table can still be a struct or an object.
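The composition approach might be sketched in C like this. Everything here is hypothetical: the `user` struct, the two user levels, the events, and the handlers are invented for illustration, and the table is keyed only by event and user status to keep the sketch short.

```c
#include <stddef.h>

typedef enum { event_login, event_logout } event_type;
typedef enum { status_offline, status_online } status_type;
typedef int (*event_fun)(void);

typedef struct {
    event_type  event;
    status_type status;
    event_fun   eventfun;
} event_drive;

/* Hypothetical handlers; the return values only identify which one ran. */
int basic_login(void)   { return 10; }
int vip_login(void)     { return 20; }
int common_logout(void) { return 30; }

/* One driver table per user level; shared handlers appear in both. */
const event_drive basic_table[] = {
    { event_login,  status_offline, basic_login   },
    { event_logout, status_online,  common_logout },
};
const event_drive vip_table[] = {
    { event_login,  status_offline, vip_login     },
    { event_logout, status_online,  common_logout },
};

#define TABLE_SIZE(t) (sizeof(t) / sizeof((t)[0]))

/* A single user type: behavior comes from the table it is assembled with. */
typedef struct {
    status_type        status;
    const event_drive *table;
    size_t             table_size;
} user;

/* Delegates the event to the user's own table; -1 if no row matches. */
int user_handle_event(const user *u, event_type event)
{
    for (size_t i = 0; i < u->table_size; i++) {
        if (u->table[i].event == event && u->table[i].status == u->status)
            return u->table[i].eventfun();
    }
    return -1;
}
```

A basic user and a VIP user are then the same struct initialized with `basic_table` or `vip_table`, so adding a level means adding a table rather than a subclass.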
Performance optimization suggestions for the method above:
If your performance requirements are modest, the method above is sufficient. If they are high, you can optimize: for example, build a multi-dimensional array whose dimensions are the module, the state, and the message. The three enumerated values then index directly to the handler function, with no table search. (This is still a data-driven idea: a data structure is a static algorithm.)
A more advanced and more abstract form of data-driven programming is probably a flow script or a DSL. I once wrote a simple script, based on XML, to describe a flow. I will cover that later.
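The multi-dimensional array idea could look roughly like the sketch below. The `*_count` enumerators, the `int (*)(void)` handler signature, and the stub handlers are assumptions added for illustration. C99 designated initializers fill only the listed cells; every other cell is a null pointer.

```c
#include <stddef.h>

/* The trailing *_count enumerators give the array dimensions. */
typedef enum { mode_a, mode_b, mode_count } mode_type;
typedef enum { event_a, event_b, event_count } event_type;
typedef enum { status_1, status_2, status_3, status_count } status_type;

typedef int (*event_fun)(void);

/* Hypothetical stub handlers. */
int fun1(void)  { return 1; }
int fun2(void)  { return 2; }
int fun10(void) { return 10; }

/* eventdriver[module][message][state] = handler; unlisted cells are NULL. */
const event_fun eventdriver[mode_count][event_count][status_count] = {
    [mode_a][event_a][status_1] = fun1,
    [mode_a][event_a][status_2] = fun2,
    [mode_b][event_b][status_2] = fun10,
};

/* Direct indexing replaces the linear search; NULL means "no handler". */
event_fun getfunfromdriver(mode_type mod, event_type event, status_type status)
{
    if ((unsigned)mod >= mode_count || (unsigned)event >= event_count ||
        (unsigned)status >= status_count)
        return NULL;
    return eventdriver[mod][event][status];
}
```

The trade-off is memory for speed: the array reserves a slot for every combination, including those that have no handler.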
http://blog.csdn.net/chgaowei/article/details/6966857