Java-based data collection (2)
In the previous article, "Java-based data collection (1)": http://www.cnblogs.com/lichenwei/p/3904715.html
I described how to read a web page's source code and extract the data we need from it with grouped regular expressions.
This article covers storing that data. The idea is simple: each time a value is read, keep it in a temporary variable, and once a complete record is available, insert it into the database.
Create a table first:
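The table definition itself does not appear in the text. As a minimal sketch, the class below holds a definition that would fit the insert statement used later in CurlMain. The table name and the four column names come from that statement; the column types, the id column and the class name FootballSchema are assumptions, not the author's schema.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

public class FootballSchema {

    // Hypothetical DDL: column names follow the article's insert statement,
    // but the types and the id column are assumptions.
    public static final String CREATE_TABLE_SQL =
            "CREATE TABLE IF NOT EXISTS football ("
            + "id INT AUTO_INCREMENT PRIMARY KEY, "
            + "`date` VARCHAR(10) NOT NULL, "   // kept as matched text, e.g. dd.mm.yyyy
            + "teama VARCHAR(64) NOT NULL, "
            + "teamb VARCHAR(64) NOT NULL, "
            + "score VARCHAR(8) NOT NULL)";

    // Runs the DDL on a connection that the caller has already opened.
    public static void createTable(Connection conn) throws SQLException {
        try (Statement stmt = conn.createStatement()) {
            stmt.executeUpdate(CREATE_TABLE_SQL);
        }
    }
}
```

The date is stored as text here because the program saves the matched string unchanged; converting it to a real DATE column would need an extra parsing step.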
DoMySql.java (database connection class and data insertion method)
package com.lcw.curl;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class DoMySql {

    // define the MySQL driver, database address, database username and password,
    // the Statement used to execute SQL, and the database connection
    public String driver = "com.mysql.jdbc.Driver";
    public String url = "jdbc:mysql://127.0.0.1:3306/football";
    public String user = "root";
    public String password = "";
    public Statement stmt = null;
    public Connection conn = null;

    // data insertion method
    public void datatoMySql(String insertSQl) {

        try {
            try {
                Class.forName(driver).newInstance();
            } catch (Exception e) {
                e.printStackTrace();
            }
            // create the connection
            conn = DriverManager.getConnection(url, user, password);
            // create a Statement object to send the SQL statement to the database
            stmt = conn.createStatement();
        } catch (SQLException e) {
            e.printStackTrace();
        }
        try {
            // execute the SQL insert statement
            stmt.executeUpdate(insertSQl);
        } catch (SQLException e) {
            e.printStackTrace();
        }
        try {
            stmt.close();
            conn.close();
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }

}
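Because datatoMySql receives a finished SQL string, a quote character inside a scraped value would break the statement. A common alternative is a parameterized insert with PreparedStatement. The sketch below assumes the same football table and column names; the class name SafeInsert and the method insertMatch are hypothetical and not part of the original code.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class SafeInsert {

    // One placeholder per column; values are bound separately, never concatenated.
    public static final String INSERT_SQL =
            "INSERT INTO football (`date`, teama, teamb, score) VALUES (?, ?, ?, ?)";

    // Inserts one match and returns the number of rows written.
    // try-with-resources closes the statement and the connection even on failure.
    public static int insertMatch(String url, String user, String password,
                                  String date, String teamA, String teamB, String score)
            throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, password);
             PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setString(1, date);
            ps.setString(2, teamA);
            ps.setString(3, teamB);
            ps.setString(4, score);
            return ps.executeUpdate();
        }
    }
}
```

Opening a connection per row mirrors what datatoMySql does; for larger pages it would probably be worth opening one connection and reusing it for every insert.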
GetData.java (data filtering class)
package com.lcw.curl;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GetData {

    /**
     * Finds the first match of a regular expression in a piece of text.
     *
     * @param regex regular expression
     * @param content text to search
     * @return the first match, or an empty string if there is none
     */
    public String getData(String regex, String content) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE); // compile the regular expression, case-insensitive
        Matcher matcher = pattern.matcher(content);
        if (matcher.find()) {
            return matcher.group();
        } else {
            return "";
        }
    }

}
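getData returns the whole match, so the caller still has to trim the surrounding markup with substring. As a sketch of an alternative, a capture group can return only the wanted text. The class name GroupExtract, the method firstGroup and the sample HTML fragments are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupExtract {

    // Returns the text captured by group 1 of the first match, or "" if nothing matches.
    // The regex passed in must contain at least one capture group.
    public static String firstGroup(String regex, String content) {
        Matcher matcher = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(content);
        return matcher.find() ? matcher.group(1) : "";
    }

    public static void main(String[] args) {
        // Hypothetical lines resembling the markup the article's regexes expect.
        System.out.println(firstGroup(">([^<>]*)</a>", "<td><a href=\"team.php\">Arsenal</a></td>"));
        System.out.println(firstGroup(">(\\d{1,2}-\\d{1,2})</td>", "<TD align=\"center\">2-1</TD>"));
    }
}
```

With these two sample lines the program prints Arsenal and then 2-1, without any substring or indexOf calls.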
CurlMain.java (main program class)
package com.lcw.curl;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class CurlMain {

    /**
     * @param args
     */
    public static void main(String[] args) {

        try {
            String address = "http://www.footballresults.org/league.php?league=EngDiv1";
            URL url = new URL(address);
            InputStreamReader inputStreamReader = new InputStreamReader(
                    url.openStream(), "UTF-8"); // open the address and decode the returned bytes as UTF-8 characters
            BufferedReader bufferedReader = new BufferedReader(
                    inputStreamReader); // buffer the character stream so that lines can be read efficiently

            GetData data = new GetData();
            DoMySql mySql = new DoMySql();
            String content = ""; // holds the line read on each iteration
            int flag = 0; // marker: the team lines follow the date line and both teams match the same regex, so this tells them apart
            String dateRegex = "\\d{1,2}\\.\\d{1,2}\\.\\d{4}"; // regular expression matching the date
            String teamRegex = ">[^<>]*</a>"; // regular expression matching a team
            String scoreRegex = ">(\\d{1,2}-\\d{1,2})</TD>"; // regular expression matching the score
            String tempDate = "";
            String teama = "";
            String teamb = "";
            String score = "";
            int i = 0; // number of records
            String sql = "";

            while ((content = bufferedReader.readLine()) != null) { // read one line at a time
                // get the match date
                String dateInfo = data.getData(dateRegex, content);
                if (!dateInfo.equals("")) {
                    System.out.println("Date: " + dateInfo);
                    tempDate = dateInfo;
                    flag++;
                }
                // get team information; the date must be read first so that flag has been incremented
                String teamInfo = data.getData(teamRegex, content);
                if (!teamInfo.equals("") && flag == 1) {
                    teama = teamInfo.substring(1, teamInfo.indexOf("</a>"));
                    System.out.println("Home team: " + teama);
                    flag++;
                } else if (!teamInfo.equals("") && flag == 2) {
                    teamb = teamInfo.substring(1, teamInfo.indexOf("</a>"));
                    System.out.println("Away team: " + teamb);
                    flag = 0;
                }
                // get score information
                String scoreInfo = data.getData(scoreRegex, content);
                if (!scoreInfo.equals("")) {
                    score = scoreInfo.substring(1, scoreInfo.indexOf("</TD>"));
                    System.out.println("Score: " + score);
                    System.out.println();
                    i++;
                    // column names are quoted with backticks; single quotes would make them string literals
                    sql = "insert into football(`date`,`teama`,`teamb`,`score`) values('"
                            + tempDate + "','" + teama + "','" + teamb + "','" + score + "')";
                    System.out.println(sql);
                    mySql.datatoMySql(sql);
                }

            }
            bufferedReader.close();
            System.out.println("A total of " + i + " records");
        } catch (Exception e) {
            e.printStackTrace();
        }

    }

}
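The flag logic in CurlMain is easier to check when the parsing is separated from the network and database calls. The following is a rough sketch under that assumption: it runs the same three regular expressions (with capture groups added) over lines already held in memory and returns one record per score. It also sets the flag to 1 on each date instead of incrementing it, which is a small deviation from the original. The class name MatchParser, the method parse and the sample lines are hypothetical, and the real page's markup may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class MatchParser {

    private static final Pattern DATE = Pattern.compile("\\d{1,2}\\.\\d{1,2}\\.\\d{4}");
    private static final Pattern TEAM = Pattern.compile(">([^<>]*)</a>", Pattern.CASE_INSENSITIVE);
    private static final Pattern SCORE = Pattern.compile(">(\\d{1,2}-\\d{1,2})</td>", Pattern.CASE_INSENSITIVE);

    // Each record is {date, home team, away team, score}.
    public static List<String[]> parse(List<String> lines) {
        List<String[]> records = new ArrayList<>();
        String date = "", home = "", away = "";
        int flag = 0; // 0 = waiting for a date, 1 = next team is home, 2 = next team is away
        for (String line : lines) {
            Matcher m = DATE.matcher(line);
            if (m.find()) {
                date = m.group();
                flag = 1;
            }
            m = TEAM.matcher(line);
            if (m.find()) {
                if (flag == 1) {
                    home = m.group(1);
                    flag = 2;
                } else if (flag == 2) {
                    away = m.group(1);
                    flag = 0;
                }
            }
            m = SCORE.matcher(line);
            if (m.find()) {
                // a score line completes the record
                records.add(new String[] {date, home, away, m.group(1)});
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // Made-up rows in the shape the regexes expect.
        List<String> lines = Arrays.asList(
                "<td>16.08.2014</td>",
                "<td><a href=\"#\">Arsenal</a></td>",
                "<td><a href=\"#\">Crystal Palace</a></td>",
                "<td>2-1</td>");
        for (String[] r : parse(lines)) {
            System.out.println(String.join(" | ", r));
        }
    }
}
```

Run as is, the main method prints 16.08.2014 | Arsenal | Crystal Palace | 2-1. Each returned record could then be handed to the database insertion method.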
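When the program runs, it prints each date, the home and away teams, the score and the generated SQL statement, and inserts one row into the database for every score it finds.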
Next article: Java-based data collection (3): http://www.cnblogs.com/lichenwei/p/3905370.html