I recently did a lot of work fetching data from different web pages. After repeating the same kind of code many times, I decided to refactor it. The language used is Java.
1. Previous practices:
Because it was a small functional program, I treated it as a one-off and created no dedicated classes:
    public static void main(String[] args) throws IOException, SQLException {
        fetchData("http://...", "A");
    }

    private static void fetchData(String url, String timePoint) throws IOException, SQLException {
        String content = getHttpContent(url);                         // web page content
        Date dataDate = getDataDate(content);                         // date of the data
        List<MainBoard> boardList = getBoardList(content, dataDate);  // parse the data set
        storeDb(boardList, dataDate, timePoint);                      // write to the database
    }
Some values were also hard-coded in the program:
    Connection dbConn = null;
    dbConn = DriverManager.getConnection("jdbc:sqlserver://192.168.1.1:123;databaseName=aaaa", "12345", "12345");
    Statement stmt = dbConn.createStatement();
The getDataDate() function gets the date of the data. The getBoardList() function extracts the data using regular expressions and traversal comparisons, and returns the corresponding data objects.
The storeDb function is responsible for writing to the database:
    // Remove any rows already stored for this date and time point
    String sql = "Delete from TABLE where datadate='" + simpleDateFormat.format(date)
            + "' and timepoint='" + timePoint + "'";
    stmt.execute(sql);
    // Insert one row per parsed record
    for (MainBoard board : boardList) {
        String ss = "INSERT INTO TABLE values"
                + "('" + board.getCode() + "',"
                + "'" + board.getStockName() + "',"
                + board.getSh() + ","
                + board.getDollar() + ","
                + "'" + timePoint + "',"
                + "'" + simpleDateFormat.format(board.getDataDate()) + "',"
                + "'" + timeFormat.format(board.getUpdateTime()) + "')";
        stmt.executeUpdate(ss);
    }
The initial structure is essentially purely procedural, and the code is not split into separate class files by functionality. If later requirements call for crawling more types of pages, the code will become bloated and confusing.
For each new page format, getDataDate, getBoardList, storeDb, and MainBoard above all have to be replaced, and a new empty data class has to be created.
2. Refactoring, Abstraction:
After a lot of repetition, the code architecture needed to improve. I started with theory, reading "Refactoring" and "Head" side by side.
In "Refactoring" I found this description:
Turn a procedural design into an object design:
1. Turn each record type into a dumb data object that contains only accessors.
2. Move all the procedural code into a single class.
3. Break each long procedure down, then move the extracted functions into the related dumb data classes.
4. Repeat the steps above.
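Step 1 could look like the following minimal sketch. The field names are guesses based on the getters used in the SQL code above, and the field types are assumptions rather than the original class.

```java
import java.util.Date;

// A "dumb" data object: fields plus accessors, no behaviour yet.
// Field names mirror the getters used earlier; the types are assumed.
class MainBoard {
    private String code;
    private String stockName;
    private Date dataDate;

    public String getCode() { return code; }
    public void setCode(String code) { this.code = code; }

    public String getStockName() { return stockName; }
    public void setStockName(String stockName) { this.stockName = stockName; }

    public Date getDataDate() { return dataDate; }
    public void setDataDate(Date dataDate) { this.dataDate = dataDate; }
}
```

Once such a class exists, steps 2 to 4 gradually move behaviour that works on these fields into it.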
So object design starts from a data object: the operations or code related to that data are extracted into functions and placed in the data object.
The way I had been writing programs, by contrast, was procedural.
So the first change in thinking is:
Abstract the program as a data processing plant made up of several modules (departments). Data moves through the plant as a stream, is processed and transformed by different departments, and finally ends up in some storage or display form (that is, a carrier).
Observing how data is processed from a higher level is an important way to improve the structure: abstract the common steps and actions, and leave the specific implementation details to different classes.
The running program can therefore be divided into the data flow and the processing of that flow. Because the interfaces of the modules are defined, each requirement change only means modifying the implementation of the affected module.
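As a rough illustration of the "processing plant" idea, the sketch below defines each department as an interface and lets the plant depend only on those interfaces. All names here (PageParser, RecordStore, LineParser, MemoryStore, ProcessingPlant) are hypothetical and not taken from the original program.

```java
import java.util.ArrayList;
import java.util.List;

// Each "department" of the plant is an interface; names are hypothetical.
interface PageParser {
    List<String> parse(String content);
}

interface RecordStore {
    void store(List<String> records);
}

// One possible parser: one record per non-empty line.
class LineParser implements PageParser {
    @Override
    public List<String> parse(String content) {
        List<String> records = new ArrayList<>();
        for (String line : content.split("\n")) {
            if (!line.trim().isEmpty()) {
                records.add(line.trim());
            }
        }
        return records;
    }
}

// A store that keeps records in memory, standing in for a database.
class MemoryStore implements RecordStore {
    final List<String> saved = new ArrayList<>();

    @Override
    public void store(List<String> records) {
        saved.addAll(records);
    }
}

// The plant only knows the interfaces, so modules can be swapped freely.
class ProcessingPlant {
    private final PageParser parser;
    private final RecordStore store;

    ProcessingPlant(PageParser parser, RecordStore store) {
        this.parser = parser;
        this.store = store;
    }

    void process(String content) {
        store.store(parser.parse(content));
    }
}
```

A new page format would then only need another PageParser implementation; ProcessingPlant itself stays unchanged.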
With this idea of abstraction in mind, I thought again about how the code should be organized.
Consider the dumb data object, that is, a class that contains only data fields and the corresponding getter/setter methods. In practice, a good approach is to move the operations related to the data into that dumb data class.
For example, the storeDb function above receives dumb data objects as a parameter and builds a database statement from them. That function can therefore be moved into the data class:
    class Data {
        // ...
        // timePoint, simpleDateFormat and timeFormat are assumed to be available as fields
        public String generateInsertSql() {
            String ss = "INSERT INTO TABLE values"
                    + "('" + getCode() + "',"
                    + "'" + getStockName() + "',"
                    + getSh() + ","
                    + getDollar() + ","
                    + "'" + timePoint + "',"
                    + "'" + simpleDateFormat.format(getDataDate()) + "',"
                    + "'" + timeFormat.format(getUpdateTime()) + "')";
            return ss;
        }
    }
Why is this organization better than the previous form? One reason is that it is closer to the way people think.
Thinking about program organization in this way leads to a second change in thinking:
A class is data-centric, with the operations on that data attached to it as functions. A program can be seen as an interaction between pieces of data that carry their own behavior. The data stream summarized earlier thus splits into individual entities, and the processing operations are attached to those entities.
Feeling that I had reached the doorway of object orientation, I looked at what other people had written and found https://www.zhihu.com/question/19701980 quite enlightening.
A further understanding of object orientation:
What matters is who completes an action or task; the emphasis is on the "who". The program thus becomes an interaction among a group of "living creatures".
Back to the program: three objects are abstracted from the procedural structure: UrlConstructor, HttpService, and SqlConstructor. They are responsible for generating URL strings, reading web page content, and constructing SQL statements, respectively. UrlConstructor also has a function that extracts the target data from the page content.
The flow now becomes: UrlConstructor constructs the URL; HttpService receives the URL and fetches the page data; UrlConstructor parses it; SqlConstructor produces the SQL statement; finally, the DB object writes to the database.
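The flow described here might be wired together roughly as follows. Only the three class names come from the text; the method names, the placeholder URL, the page markup, and the stubbed HttpService (which returns canned content so the sketch runs without a network) are all assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Builds the URL and knows how to pull the target data out of the page.
class UrlConstructor {
    private static final Pattern CELL = Pattern.compile("<td>([^<]*)</td>");

    String buildUrl(String timePoint) {
        return "http://example.invalid/data?tp=" + timePoint; // placeholder URL
    }

    List<String> parse(String content) {
        List<String> values = new ArrayList<>();
        Matcher m = CELL.matcher(content);
        while (m.find()) {
            values.add(m.group(1));
        }
        return values;
    }
}

// Reads page content; stubbed here so the sketch runs without a network.
class HttpService {
    String get(String url) {
        return "<tr><td>600000</td><td>BankA</td></tr>";
    }
}

// Turns parsed values into an SQL statement (table name is a placeholder).
class SqlConstructor {
    String insertSql(List<String> values) {
        return "insert into TABLE values ('" + String.join("','", values) + "')";
    }
}

class FetchFlow {
    public static void main(String[] args) {
        UrlConstructor urls = new UrlConstructor();
        HttpService http = new HttpService();
        SqlConstructor sqls = new SqlConstructor();

        String url = urls.buildUrl("A");
        String content = http.get(url);
        List<String> values = urls.parse(content);
        String sql = sqls.insertSql(values);
        System.out.println(sql); // a DB object would execute this statement
    }
}
```

The main method is still a short sequence of statements, but each step now belongs to the object responsible for it.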
But then a new puzzle arose: although the structure is more abstract than before, it still seems to need some procedural statements to handle the interaction between objects. How can this part be eliminated? Reading further, I found a discussion at http://bbs.csdn.net/topics/40441744 that answered exactly this doubt.
My thinking changed again:
Object orientation is a way of thinking and is independent of language. Writing code that executes sequentially does not make a program procedural; object orientation is about the kind of thinking used to organize the program.
C can also be used to write object-oriented programs, and a badly organized program written in Java is still procedural. So if the organization is good, sequentially executed code is also object-oriented.
A common example is global variables. Using global variables inside member functions breaks the encapsulation of the class and is not object-oriented programming. The better practice is to pass such values into the member function as parameters, which preserves encapsulation. It also makes unit testing easier and improves clarity.
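A small hypothetical example of the difference (the Globals and Order classes and the rate value are invented for illustration):

```java
// Hypothetical example: a rate kept as a global versus passed in as a parameter.
class Globals {
    static double rate = 0.5; // mutable global state
}

class Order {
    private final double amount;

    Order(double amount) {
        this.amount = amount;
    }

    // Depends on hidden global state: the result can change behind the caller's back.
    double totalUsingGlobal() {
        return amount * (1 + Globals.rate);
    }

    // The dependency is explicit, so a unit test can supply any rate it likes.
    double total(double rate) {
        return amount * (1 + rate);
    }
}
```

A test of total(rate) needs no setup or cleanup of shared state, while a test of totalUsingGlobal() has to set Globals.rate first and restore it afterwards.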
This also explains the three main elements of object orientation: encapsulation, inheritance, and polymorphism. Encapsulation treats data and its operations as a whole and exposes only an interface to the outside. Inheritance and polymorphism make a program easy to extend: callers need not care about implementation details, so the program can respond flexibly to changing requirements. See the discussion at https://www.zhihu.com/question/20275578, especially the answer by "invalid S".
At this point I finally understood dependency inversion, dependency injection, inversion of control, and other concepts I had not grasped before. See https://www.zhihu.com/question/31021366. The core idea is programming to interfaces.
Once object-oriented thinking is clear, design patterns become approachable. Patterns such as decorator, factory, and observer can be understood quickly from an object-oriented perspective. Design patterns follow several principles: 1. Program to interfaces; 2. Open for extension, closed for modification; 3. Favor composition over inheritance. Applying design patterns often raises trade-offs that have to be weighed from all sides before making a decision.
Conversely, object-oriented programming does not have to use design patterns. "Design patterns are a catalogued summary of code design experience that has been used repeatedly and is known to most people." A design pattern generally has a scenario it suits; outside that scenario it is not necessarily effective.
Above design patterns sits framework design, which I will not delve into for now.
Back to refactoring: I think refactoring can be regarded as a way of thinking that needs to be ingrained, to the point where programming == refactoring.
I think refactoring has a few key ideas: 1. Move duplicated code into one place; 2. When a modification is needed, aim to modify only one place; 3. A change affects only one class; 4. A class is affected by only one kind of change. And so on.
Because "code is written first for people, and only then for the computer".
The most important thing is to keep practicing. My understanding of object-oriented thinking is only beginning, and I will certainly come back to think it over and practice again and again. I hope to become more skilled and efficient.
Finally, here are some thoughts on Java regular expression performance that came out of this refactoring.
Regular expressions are often written with wildcards, for example:
    <td .*><a.*>(.*)</a></td>
But using a regular expression does not mean the search is efficient. Wildcards can increase the matching time, so for some simple expressions, implementing the match yourself with functions such as indexOf() can improve performance. In my tests, a hand-written implementation of a simple expression saved a fraction of the time, but compared with the time spent reading the web page the difference is negligible. So how should regular expressions be used? Whether you implement the match yourself or use a complex expression, test before deciding.
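One way to run such a comparison is sketched below. The sample HTML, the class name, and the round count are assumptions, the pattern uses reluctant quantifiers rather than the greedy ones shown above, and the naive timing loop is only a rough indication, not a rigorous benchmark.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LinkTextExtractor {
    // Wildcard-based pattern, with reluctant quantifiers so each part stops at the first match.
    private static final Pattern LINK = Pattern.compile("<td.*?><a.*?>(.*?)</a></td>");

    static String withRegex(String html) {
        Matcher m = LINK.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    // Hand-written extraction of the first link text using only indexOf and substring.
    // It does not check the surrounding <td>, so it matches the regex only on well-formed input.
    static String withIndexOf(String html) {
        int a = html.indexOf("<a");
        if (a < 0) return null;
        int start = html.indexOf('>', a);
        if (start < 0) return null;
        int end = html.indexOf("</a>", start);
        if (end < 0) return null;
        return html.substring(start + 1, end);
    }

    public static void main(String[] args) {
        String html = "<td class=\"x\"><a href=\"/s/1\">BankA</a></td>";
        int rounds = 200_000;

        long t0 = System.nanoTime();
        for (int i = 0; i < rounds; i++) withRegex(html);
        long t1 = System.nanoTime();
        for (int i = 0; i < rounds; i++) withIndexOf(html);
        long t2 = System.nanoTime();

        System.out.println("regex:   " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("indexOf: " + (t2 - t1) / 1_000_000 + " ms");
    }
}
```

Whatever the numbers turn out to be on a given machine, they should be weighed against the time of the HTTP request itself, which is usually far larger.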
One-time refactoring experience