標籤:
讀寫檔案
背景及木:現有資料1000w單表,為壓力測試準備1億條資料。
步驟:
1.將1000w條記錄,除id外都匯入到多個檔案中:
//DELIMITERDROP PROCEDURE if EXISTS createManyTable;create PROCEDURE createManyTable()BEGINDECLARE i int;DECLARE fileName VARCHAR(30);set i=1;while i<251DOSET fileName=CONCAT(‘f_log_‘,i,‘.txt‘);SET @STMT :=CONCAT("select `xx`,`xx`,`xx`,`xx`,.... into outfile ‘temp/",fileName,"‘ lines terminated by ‘\r\n‘ from `f_log` WHERE id>= ",40000*(i-1)," AND id< ",40000*i);PREPARE STMT FROM @STMT;EXECUTE STMT;set i=i+1;end while;END;//DELIMITERCALL createManyTable();
2. 將上述多個檔案合并到同一個檔案,並且在第一列加入id列:
public static void main(String[] args) throws IOException { int i=10000000; int step=40000; File out=new File("E:/data/f_log_data.txt"); for(int k=1;k<251;k++){ File file=new File("E:/data/temp/f_log_"+k+".txt"); StringBuffer sb=new StringBuffer(); if(file.exists()){ sb=readFile(file,i+step*k); writeFile(out,sb); } } } public static StringBuffer readFile(File file,int start) throws IOException{ StringBuffer sb=new StringBuffer(); BufferedReader reader=new BufferedReader(new FileReader(file)); String line=""; while(line != null){ line = reader.readLine(); if(line == null){ break; } if(line.trim().equalsIgnoreCase("")){ continue; } start++; sb.append(start+"\t"+line.trim()+"\r\n"); } reader.close(); return sb; } public static void writeFile(File file,StringBuffer sb) throws IOException{ BufferedWriter writer = new BufferedWriter(new FileWriter(file, true)); writer.write(sb.toString()); writer.close(); } public void writeFile11() throws IOException{ // TODO Auto-generated method stub BufferedWriter writer = new BufferedWriter(new FileWriter(new File("D:/driver/data.txt"), true)); for(int i=0;i<1000000;i++){ if(i%10==0){ writer.write("趙"+(i/10)+"\t"+ (int)(Math.random()*100)+"\n"); }if(i%10==1){ writer.write("錢"+(i/10)+"\t"+ (int)(Math.random()*100)+"\n"); } if(i%10==2){ writer.write("孫"+(i/10)+"\t"+ (int)(Math.random()*100)+"\n"); }if(i%10==3){ writer.write("李"+(i/10)+"\t"+ (int)(Math.random()*100)+"\n"); } if(i%10==4){ writer.write("鄭"+(i/10)+"\t"+ (int)(Math.random()*100)+"\n"); }if(i%10==5){ writer.write("吳"+(i/10)+"\t"+ (int)(Math.random()*100)+"\n"); } if(i%10==6){ writer.write("周"+(i/10)+"\t"+ (int)(Math.random()*100)+"\n"); }if(i%10==7){ writer.write("王"+(i/10)+"\t"+ (int)(Math.random()*100)+"\n"); } if(i%10==8){ writer.write("張"+(i/10)+"\t"+ (int)(Math.random()*100)+"\n"); }if(i%10==9){ writer.write("劉"+(i/10)+"\t"+ (int)(Math.random()*100)+"\n"); } } writer.close(); }
3. 將合并後的檔案匯入到資料表中:
load data local infile ‘/tmp/finance_log_data.txt‘ into table f_log(`id`,`xx`,
`xx`,.........................
);
注意事項:開始考慮使用預存程序來逐步匯入到資料表中,但load data命令不能在預存程序中使用。
另外,資料的合并也可以以shell指令碼完成,但習慣使用java了,因此以java來完成,顯得比較複雜。不過,可以隨便複習一下java的讀寫檔案,有算不錯的經曆。
Q&A
時間問題:產生1億條資料(在有索引的情況下),用時3個小時。如果使用insert語句,估計會瘋掉!
mysql 造1億條記錄的單表--大資料表