Efficiency test: searching for a string in a large file with Java functions versus calling a shell search

Source: Internet
Author: User
Tags: call, shell

For a log file of about 1 GB, the task is to find every line that contains a specified string, together with the position of the string within that line. There are two methods: one is to use Java functions directly, and the other is to call a Linux shell command to assist in the processing. The following is an example program:

# cat TestIO.java

import java.io.BufferedInputStream;
import java.io.BufferedReader;
import java.io.File;
import java.io.FileInputStream;
import java.io.InputStreamReader;
import java.util.regex.Pattern;

public class TestIO
{
    private int lineNum = 0;
    private String path = "";
    private String searchStr = "";

    public void setPath(String value)
    {
        path = value;
    }

    public String getPath()
    {
        return path;
    }

    public void setSearchStr(String value)
    {
        searchStr = value;
    }

    public String getSearchStr()
    {
        return searchStr;
    }

    /**
     * Java search by index
     */
    public void start()
    {
        if(null == path || path.length()<1)
            return;
        try
        {
            long startMili=System.currentTimeMillis();
            System.out.println("Start search \""+searchStr+"\" in file: "+path);
            File file = new File(path);
            BufferedInputStream fis = new BufferedInputStream(new FileInputStream(file));
            BufferedReader reader = new BufferedReader(new InputStreamReader(fis,"utf-8"));
            String line = "";
            lineNum = 0;
            while((line = reader.readLine()) != null)
            {
                lineNum ++;
                String rs = this.searchStr(line, searchStr);
                if(rs.length()>0)
                {
                //  System.out.println("Find in Line["+lineNum+"], index: "+rs);
                }
            }
            System.out.println("Finished!");
            long endMili=System.currentTimeMillis();
            System.out.println("Total times: "+(endMili-startMili)+" ms");
            System.out.println("");
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
    }

    /**
     * Call shell command to search
     */
    public void startByShell()
    {
        try
        {
            long startMili=System.currentTimeMillis();
            System.out.println("Start search \""+searchStr+"\" in file: "+path+ " by shell");
            String[] cmd = {"/bin/sh", "-c", "grep "+searchStr+" "+path+" -n "};
            Runtime run = Runtime.getRuntime();
            Process p = run.exec(cmd);
            BufferedInputStream in = new BufferedInputStream(p.getInputStream());
            BufferedReader reader = new BufferedReader(new InputStreamReader(in));
            String line = "";
            lineNum = 0;
            while((line = reader.readLine()) != null)
            {
                lineNum ++;
                // grep -n prints "lineNumber:content"; search only the content part
                String rs = this.searchStr(line.substring(line.indexOf(':')+1), searchStr);
                if(rs.length()>0)
                {
                    String linebyshell = line.substring(0, line.indexOf(':'));
                    //System.out.println("Find in Line["+linebyshell+"], index: "+rs);
                }
            }
            System.out.println("Finished!");
            long endMili=System.currentTimeMillis();
            System.out.println("Total times: "+(endMili-startMili)+" ms");
            System.out.println("");
        }
        catch(Exception e)
        {
            e.printStackTrace();
        }
    }

    /**
     * Returns the comma-separated indexes of every occurrence of value in src
     */
    public String searchStr(String src, String value)
    {
        String result = "";
        int index = src.indexOf(value,0);
        while(index>-1)
        {
            result+=index+",";
            index = src.indexOf(value,index+value.length());
        }
        return result;
    }

    public static boolean isNumeric(String str)
    {
        Pattern pattern = Pattern.compile("[0-9]*");
        return pattern.matcher(str).matches();
    }

    /**
     * @param args
     */
    public static void main(String[] args)
    {
        String file = "./testfile.txt";
        TestIO test = new TestIO();
        if(args.length>0)
            test.setPath(args[0]);
        else
            test.setPath(file);
        if(args.length>1)
            test.setSearchStr(args[1]);
        else
            test.setSearchStr("hello");
        test.start();
        test.startByShell();
    }
}
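The shell method above builds its command by concatenating the search string into a /bin/sh -c line, so a keyword containing spaces or shell metacharacters would be split or interpreted by the shell. A minimal sketch of one way around this, assuming a POSIX grep on the PATH, is to pass the arguments to grep directly with ProcessBuilder. The class name GrepSearch and its helper methods are hypothetical and not part of the original program; the -F flag (fixed-string matching) is also an assumption, chosen because the keywords tested here are plain strings.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the original TestIO program.
public class GrepSearch {

    /** Builds the argument list; each element reaches grep verbatim, with no shell parsing. */
    public static List<String> buildCommand(String searchStr, String path) {
        // -n: prefix each matching line with its line number
        // -F: treat the pattern as a fixed string (assumption: no regex needed)
        // --: end of options, so a pattern starting with '-' is not read as a flag
        return Arrays.asList("grep", "-n", "-F", "--", searchStr, path);
    }

    /** Splits one "lineNumber:content" line of grep -n output into {lineNumber, content}. */
    public static String[] parseLine(String grepLine) {
        int colon = grepLine.indexOf(':');
        return new String[] { grepLine.substring(0, colon), grepLine.substring(colon + 1) };
    }

    /** Runs grep and returns its raw output lines. */
    public static List<String> run(String searchStr, String path)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(searchStr, path));
        // Send grep's error messages to this JVM's stderr instead of an unread pipe.
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        Process p = pb.start();
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(p.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        }
        p.waitFor();
        return lines;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: java GrepSearch <file> <keyword>");
            return;
        }
        for (String line : run(args[1], args[0])) {
            String[] parts = parseLine(line);
            System.out.println("Find in Line[" + parts[0] + "]: " + parts[1]);
        }
    }
}
```

Unlike the timing test, this sketch keeps all matching lines in memory, so for keywords that match most of a large file the lines should be processed inside the read loop instead.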


The test file is a 1.4 GB log with millions of records, in which:

The keyword hello appears in fewer than 50 records;

chipkill appears in about 20% of records;

error appears in about 50% of records;

mainbuild166 appears in about 99% of records.
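The original test file is not available, so to repeat the experiment a log with similar keyword ratios has to be produced some other way. The following is a minimal sketch that generates a synthetic file with roughly the ratios listed above. The class name MakeTestLog, the line layout, and the modulo rules are all assumptions made for illustration; the real log looked different, and timings on synthetic data will not match the figures reported here.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical generator, not part of the original test setup.
public class MakeTestLog {

    /** Returns the text of line i (0-based); keyword frequency is set by simple modulo rules. */
    public static String lineFor(long i) {
        StringBuilder sb = new StringBuilder("line ").append(i);
        if (i % 100 != 0) sb.append(" host=mainbuild166"); // about 99% of lines
        if (i % 2 == 0)   sb.append(" level=error");       // about 50% of lines
        if (i % 5 == 0)   sb.append(" chipkill detected"); // about 20% of lines
        if (i < 40)       sb.append(" hello");             // fewer than 50 lines in total
        return sb.toString();
    }

    /** Writes lineCount generated lines to path as UTF-8. */
    public static void write(String path, long lineCount) throws IOException {
        try (BufferedWriter w = Files.newBufferedWriter(Paths.get(path), StandardCharsets.UTF_8)) {
            for (long i = 0; i < lineCount; i++) {
                w.write(lineFor(i));
                w.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: java MakeTestLog <file> <lineCount>");
            return;
        }
        write(args[0], Long.parseLong(args[1]));
    }
}
```

A file of a given size can be approximated by raising the line count, for example java MakeTestLog ./testfile.txt 30000000.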


Test results:

[root@mainbuild166 io]# java TestIO ./testfile.txt hello
Start search "hello" in file: ./testfile.txt
Finished!
Total times: 7825 ms

Start search "hello" in file: ./testfile.txt by shell
Finished!
Total times: 3080 ms


[root@mainbuild166 io]# java TestIO ./testfile.txt chipkill
Start search "chipkill" in file: ./testfile.txt
Finished!
Total times: 8760 ms

Start search "chipkill" in file: ./testfile.txt by shell
Finished!
Total times: 3732 ms


[root@mainbuild166 io]# java TestIO ./testfile.txt error
Start search "error" in file: ./testfile.txt
Finished!
Total times: 11339 ms

Start search "error" in file: ./testfile.txt by shell
Finished!
Total times: 8163 ms


[root@mainbuild166 io]# java TestIO ./testfile.txt mainbuild166
Start search "mainbuild166" in file: ./testfile.txt
Finished!
Total times: 9938 ms

Start search "mainbuild166" in file: ./testfile.txt by shell
Finished!
Total times: 12531 ms


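The test results show that when the result set is much smaller than the data set (hello, chipkill), calling the shell is about 2.3 to 2.5 times faster than using Java functions directly. The advantage shrinks as more records match (about 1.4 times for error), and when nearly every record matches (mainbuild166) the shell method is slower, presumably because almost the whole file then has to be piped back from grep and searched again in Java. This is in line with what one would expect.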
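The comparison can be made concrete by dividing the Java time by the shell time for each keyword. The following is a small sketch that does this arithmetic with the timings reported above; the class name SpeedRatio is hypothetical, and the numbers are simply copied from the test output.

```java
// Hypothetical helper for the arithmetic; timings are copied from the test results above.
public class SpeedRatio {

    /** Ratio of Java time to shell time; a value above 1 means the shell method was faster. */
    public static double ratio(long javaMs, long shellMs) {
        return (double) javaMs / shellMs;
    }

    public static void main(String[] args) {
        String[] keywords = {"hello", "chipkill", "error", "mainbuild166"};
        long[] javaMs  = {7825, 8760, 11339, 9938};
        long[] shellMs = {3080, 3732, 8163, 12531};
        for (int i = 0; i < keywords.length; i++) {
            System.out.printf("%-13s java/shell = %.2f%n",
                    keywords[i], ratio(javaMs[i], shellMs[i]));
        }
    }
}
```

The ratios come out at roughly 2.54, 2.35, 1.39 and 0.79, so the shell method loses its lead only in the mainbuild166 case. These are single runs on one machine, so the exact values should be read as indicative only.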

This article is from the "little he Beibei's technical space" blog. Please keep this source link when reposting: http://babyhe.blog.51cto.com/1104064/1358167
