Java-based data collection (2): java data collection

Source: Internet
Author: User

In the previous article, "Java-based data collection (1)" (http://www.cnblogs.com/lichenwei/p/3904715.html), I showed how to read a web page's source code and extract the data we need from it with grouped regular expressions.

This time I want to cover storing the data. The idea is simple: each time a value is read, keep it in a temporary variable, and once a complete record has been read, insert it into the database.

Create a table first:
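The table definition itself is not reproduced here. As a rough sketch only, the following shows one schema that would accept the INSERT statement built in CurlMain. The column names (date, teama, teamb, score) come from that statement; the id column, every column type and the class name CreateFootballTable are assumptions for illustration, not the original table.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class CreateFootballTable {

    // Hypothetical schema: only the four column names are taken from the article.
    // The date is kept as text because the collector inserts the raw "dd.mm.yyyy" string.
    static final String DDL =
            "CREATE TABLE IF NOT EXISTS football ("
            + "id INT AUTO_INCREMENT PRIMARY KEY, "
            + "`date` VARCHAR(10) NOT NULL, "
            + "teama VARCHAR(64) NOT NULL, "
            + "teamb VARCHAR(64) NOT NULL, "
            + "score VARCHAR(8) NOT NULL)";

    // Runs the DDL against the given database; needs a MySQL JDBC driver on the classpath.
    static void createTable(String url, String user, String password) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, password);
             Statement stmt = conn.createStatement()) {
            stmt.executeUpdate(DDL);
        }
    }

    public static void main(String[] args) {
        // Only prints the statement, so it can be reviewed before being run.
        System.out.println(DDL);
    }
}
```

To create the table with the connection settings used in this article, one could call createTable("jdbc:mysql://127.0.0.1:3306/football", "root", "").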

DoMySql.java (database connection class and data insertion method)

package com.lcw.curl;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class DoMySql {

    // MySQL driver, database address, database user name and password,
    // plus the Statement and Connection objects
    public String driver = "com.mysql.jdbc.Driver";
    public String url = "jdbc:mysql://127.0.0.1:3306/football";
    public String user = "root";
    public String password = "";
    public Statement stmt = null;
    public Connection conn = null;

    // Data insertion method
    public void datatoMySql(String insertSQl) {
        try {
            // Load the JDBC driver
            Class.forName(driver);
            // Create the connection
            conn = DriverManager.getConnection(url, user, password);
            // Create a Statement object to send the SQL statement to the database
            stmt = conn.createStatement();
            // Execute the SQL insert statement
            stmt.executeUpdate(insertSQl);
        } catch (ClassNotFoundException | SQLException e) {
            e.printStackTrace();
        } finally {
            // Close whatever was opened; either object is still null if an earlier step failed
            try {
                if (stmt != null) {
                    stmt.close();
                }
                if (conn != null) {
                    conn.close();
                }
            } catch (SQLException e) {
                e.printStackTrace();
            }
        }
    }

}
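DoMySql executes whatever SQL string it is given, so the caller has to build the statement by concatenating values. A team name containing an apostrophe would break such a statement. As a hedged alternative, not the approach used in this article, the sketch below binds the four values through a java.sql.PreparedStatement. The class name MatchDao and its method names are hypothetical.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class MatchDao {

    // One placeholder per column; the driver handles quoting of the bound values.
    static final String INSERT_SQL =
            "INSERT INTO football (`date`, teama, teamb, score) VALUES (?, ?, ?, ?)";

    private final String url;
    private final String user;
    private final String password;

    public MatchDao(String url, String user, String password) {
        this.url = url;
        this.user = user;
        this.password = password;
    }

    // Inserts one match and returns the number of affected rows.
    // try-with-resources closes the statement and the connection even on failure.
    public int insert(String date, String teamA, String teamB, String score) throws SQLException {
        try (Connection conn = DriverManager.getConnection(url, user, password);
             PreparedStatement ps = conn.prepareStatement(INSERT_SQL)) {
            ps.setString(1, date);
            ps.setString(2, teamA);
            ps.setString(3, teamB);
            ps.setString(4, score);
            return ps.executeUpdate();
        }
    }
}
```

With this class, the main program would call something like new MatchDao(url, user, password).insert(tempDate, teama, teamb, score) instead of assembling the SQL text itself.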

 

GetData.java (data filtering class)

package com.lcw.curl;

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GetData {

    /**
     * @param regex   regular expression
     * @param content text to search
     * @return the first match, or an empty string if nothing matches
     */
    public String getData(String regex, String content) {
        // Compile the regular expression, case-insensitive
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        Matcher matcher = pattern.matcher(content);
        if (matcher.find()) {
            return matcher.group();
        } else {
            return "";
        }
    }

}
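getData returns the whole match, so the caller still has to cut off the leading ">" and the closing tag. A possible variation, shown here only as a sketch, is to wrap the interesting part in a capture group and return group(1). The class name GroupDemo, the method name firstGroup and the two sample lines are made up for illustration and are not taken from the real page.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupDemo {

    // Returns the text captured by group 1 of the first match, or "" if nothing matches.
    // The regex passed in must contain at least one capture group.
    static String firstGroup(String regex, String content) {
        Matcher m = Pattern.compile(regex, Pattern.CASE_INSENSITIVE).matcher(content);
        return m.find() ? m.group(1) : "";
    }

    public static void main(String[] args) {
        // Hypothetical sample lines in the general shape the regexes expect
        String teamLine = "<td><a href=\"team.php?id=1\">Arsenal</a></td>";
        String scoreLine = "<TD align=\"center\">2-1</TD>";

        System.out.println(firstGroup(">([^<>]*)</a>", teamLine));            // Arsenal
        System.out.println(firstGroup(">(\\d{1,2}-\\d{1,2})</TD>", scoreLine)); // 2-1
    }
}
```

Because the group already excludes the surrounding markup, no substring and indexOf arithmetic is needed afterwards.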

 

CurlMain.java (main program class)

package com.lcw.curl;

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;

public class CurlMain {

    /**
     * @param args
     */
    public static void main(String[] args) {

        try {
            String address = "http://www.footballresults.org/league.php?league=EngDiv1";
            URL url = new URL(address);
            // Open the address and decode the returned bytes as UTF-8 characters
            InputStreamReader inputStreamReader = new InputStreamReader(
                    url.openStream(), "UTF-8");
            // Buffer the character stream so that it can be read efficiently line by line
            BufferedReader bufferedReader = new BufferedReader(inputStreamReader);

            GetData data = new GetData();
            DoMySql mySql = new DoMySql();
            String content = ""; // holds each line as it is read
            // The team names come right after the date and both teams match the
            // same regular expression, so this flag is used to tell them apart
            int flag = 0;
            String dateRegex = "\\d{1,2}\\.\\d{1,2}\\.\\d{4}"; // regular expression for the date
            String teamRegex = ">[^<>]*</a>"; // regular expression for a team
            String scoreRegex = ">(\\d{1,2}-\\d{1,2})</TD>"; // regular expression for the score
            String tempDate = "";
            String teama = "";
            String teamb = "";
            String score = "";
            int i = 0; // number of records
            String sql = "";

            while ((content = bufferedReader.readLine()) != null) { // read one line at a time
                // Get the match date
                String dateInfo = data.getData(dateRegex, content);
                if (!dateInfo.equals("")) {
                    System.out.println("Date: " + dateInfo);
                    tempDate = dateInfo;
                    flag++;
                }
                // Get the team names; the date must have been read first
                // so that the flag has already been incremented
                String teamInfo = data.getData(teamRegex, content);
                if (!teamInfo.equals("") && flag == 1) {
                    teama = teamInfo.substring(1, teamInfo.indexOf("</a>"));
                    System.out.println("Home team: " + teama);
                    flag++;
                } else if (!teamInfo.equals("") && flag == 2) {
                    teamb = teamInfo.substring(1, teamInfo.indexOf("</a>"));
                    System.out.println("Away team: " + teamb);
                    flag = 0;
                }
                // Get the score
                String scoreInfo = data.getData(scoreRegex, content);
                if (!scoreInfo.equals("")) {
                    score = scoreInfo.substring(1, scoreInfo.indexOf("</TD>"));
                    System.out.println("Score: " + score);
                    System.out.println();
                    i++;
                    // The score is the last field of a match, so the record is complete here
                    sql = "insert into football(`date`,`teama`,`teamb`,`score`) values('"
                            + tempDate + "','" + teama + "','" + teamb + "','" + score + "')";
                    System.out.println(sql);
                    mySql.datatoMySql(sql);
                }

            }
            bufferedReader.close();
            System.out.println("A total of " + i + " records");
        } catch (Exception e) {
            e.printStackTrace();
        }

    }

}
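The flag logic is easier to follow when it is run on a few lines held in memory, without any network or database access. The sketch below makes several assumptions: the class name ParseDemo, the parse method and the sample markup are all hypothetical. It also differs from the main program in two ways. It sets the flag to 1 when a date is seen instead of incrementing it, and it uses capture groups instead of substring.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ParseDemo {

    static final Pattern DATE = Pattern.compile("\\d{1,2}\\.\\d{1,2}\\.\\d{4}");
    static final Pattern TEAM = Pattern.compile(">([^<>]*)</a>", Pattern.CASE_INSENSITIVE);
    static final Pattern SCORE = Pattern.compile(">(\\d{1,2}-\\d{1,2})</td>", Pattern.CASE_INSENSITIVE);

    // Returns one {date, home team, away team, score} array per score line found.
    static List<String[]> parse(List<String> lines) {
        List<String[]> records = new ArrayList<>();
        String date = "";
        String home = "";
        String away = "";
        int flag = 0; // 0 = no team expected, 1 = next team is home, 2 = next team is away
        for (String line : lines) {
            Matcher d = DATE.matcher(line);
            if (d.find()) {
                date = d.group();
                flag = 1;
            }
            Matcher t = TEAM.matcher(line);
            if (t.find()) {
                if (flag == 1) {
                    home = t.group(1);
                    flag = 2;
                } else if (flag == 2) {
                    away = t.group(1);
                    flag = 0;
                }
            }
            Matcher s = SCORE.matcher(line);
            if (s.find()) {
                // The score closes a match, so the temporary values form a full record
                records.add(new String[] {date, home, away, s.group(1)});
            }
        }
        return records;
    }

    public static void main(String[] args) {
        // Hypothetical markup, one field per line
        List<String> lines = Arrays.asList(
                "<TD>16.08.2014</TD>",
                "<TD><a href=\"#\">Arsenal</a></TD>",
                "<TD><a href=\"#\">Crystal Palace</a></TD>",
                "<TD>2-1</TD>");
        for (String[] r : parse(lines)) {
            System.out.println(String.join(" | ", r));
        }
    }
}
```

Keeping the parsing in a method that takes lines and returns records also makes it possible to check the regular expressions against saved copies of the page before touching the database.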

 

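Running the program prints the date, both teams and the score of each match to the console, followed by the generated INSERT statement, and writes one row per match to the football table.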

 

Next article: "Java-based data collection (3)": http://www.cnblogs.com/lichenwei/p/3905370.html

