Use Lucene to split a large document into multiple small documents (runnable)

Source: Internet
Author: User

In this code, I modified the types of the parameters in the method's parentheses. The method I wanted takes a source file and a destination path: it is invoked on the source file and then generates the target files.

The original large file is cut into pieces, and the maximum size of each piece is a limit you set yourself.

The serial numbers in the output file names follow the natural numbers in order (starting from 0 in the code), increasing by one with each piece.
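The splitting rule described above can be sketched without any file I/O: lines accumulate in a buffer, and once the buffer reaches the size limit it is closed off as one piece. This is only an illustrative sketch. The class name ChunkSketch, the methods chunk and pieceName, and the tiny 10-byte limit in main are hypothetical, and the byte count assumes UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ChunkSketch {

    // Groups lines into pieces. A piece is closed as soon as its size
    // in bytes (UTF-8 is an assumption) reaches maxBytes.
    public static List<String> chunk(List<String> lines, int maxBytes) {
        List<String> pieces = new ArrayList<>();
        StringBuilder buffer = new StringBuilder();
        int bufferBytes = 0;
        for (String line : lines) {
            String withBreak = line + "\r\n";
            buffer.append(withBreak);
            bufferBytes += withBreak.getBytes(StandardCharsets.UTF_8).length;
            if (bufferBytes >= maxBytes) {
                pieces.add(buffer.toString());
                buffer.setLength(0);
                bufferBytes = 0;
            }
        }
        // Whatever is left after the last line becomes the final piece.
        if (buffer.length() > 0) {
            pieces.add(buffer.toString());
        }
        return pieces;
    }

    // Builds the file name for the piece with the given serial number.
    public static String pieceName(String prefix, int serial) {
        return prefix + serial + ".txt";
    }

    public static void main(String[] args) {
        List<String> pieces = chunk(Arrays.asList("aaaa", "bbbb", "cc"), 10);
        for (int i = 0; i < pieces.size(); i++) {
            System.out.println(pieceName("post-segmentation", i) + ": "
                    + pieces.get(i).length() + " chars");
        }
    }
}
```

Because a piece is closed only after a whole line has been appended, a piece can exceed the limit by up to one line, so the limit acts as a threshold rather than a hard cap.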

The code is as follows. Before running it, change the paths in main to the directory where your own original files are stored.

package comone;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;


public class FileFenlei {

    public static void splitToSmallFiles(String file, String outputPath) {
        // File counter, used in the output file names
        int filePointer = 0;

        // Maximum length of one output file, in bytes
        final int MAX_SIZE = 10240;

        // String buffer that stores the data read from the large file
        StringBuilder buffer = new StringBuilder();

        // Create the file input stream; try-with-resources closes it automatically
        try (BufferedReader reader = new BufferedReader(new FileReader(file))) {
            String line = reader.readLine();

            // Iterate through each line that was read
            while (line != null) {
                // Add the line to the buffer, followed by a carriage return and line feed
                buffer.append(line).append("\r\n");

                // Check whether the buffer has reached the maximum file length
                if (buffer.toString().getBytes().length >= MAX_SIZE) {
                    // If so, write the buffer to a file; filePointer is part of the file name
                    try (BufferedWriter writer = new BufferedWriter(
                            new FileWriter(outputPath + "post-segmentation" + filePointer + ".txt"))) {
                        writer.write(buffer.toString());
                    }
                    // Increase the file counter by one
                    filePointer++;

                    // Empty the buffer
                    buffer = new StringBuilder();
                }

                // Read the next line
                line = reader.readLine();
            }

            // The large file has been read completely: write whatever is left in the buffer
            // (skipped when the buffer is empty, so no empty file is created)
            if (buffer.length() > 0) {
                try (BufferedWriter writer = new BufferedWriter(
                        new FileWriter(outputPath + "post-segmentation" + filePointer + ".txt"))) {
                    writer.write(buffer.toString());
                }
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        // Change both paths to match your own source file and output folder
        splitToSmallFiles("E:\\lucene project\\How the Steel Was Tempered.txt",
                "E:\\lucene project\\folder after splitting\\");
    }
}
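One thing to watch with the listing above: FileReader and FileWriter use the platform's default character encoding, so non-ASCII text can come out garbled if the source file was saved in a different encoding. The following is a minimal sketch of the same splitting loop with the encoding stated explicitly. It assumes the files are UTF-8, which the original does not say, and the class name Utf8SplitSketch, its method names, and the temporary demo file are hypothetical.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8SplitSketch {

    // Splits source into pieces inside outputDir and returns how many
    // pieces were written. UTF-8 is an assumption, not part of the original.
    public static int split(Path source, Path outputDir, int maxBytes) throws IOException {
        Files.createDirectories(outputDir);
        int serial = 0;
        StringBuilder buffer = new StringBuilder();
        int bufferBytes = 0;
        try (BufferedReader reader = Files.newBufferedReader(source, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                buffer.append(line).append("\r\n");
                // Keep a running byte count instead of re-encoding the whole buffer
                bufferBytes += line.getBytes(StandardCharsets.UTF_8).length + 2;
                if (bufferBytes >= maxBytes) {
                    write(outputDir.resolve("post-segmentation" + serial + ".txt"), buffer);
                    serial++;
                    buffer.setLength(0);
                    bufferBytes = 0;
                }
            }
        }
        // Write the remainder, if any, as the last piece
        if (buffer.length() > 0) {
            write(outputDir.resolve("post-segmentation" + serial + ".txt"), buffer);
            serial++;
        }
        return serial;
    }

    private static void write(Path target, CharSequence text) throws IOException {
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.UTF_8)) {
            writer.append(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // Demo on a small temporary file with a 10-byte limit
        Path dir = Files.createTempDirectory("split-demo");
        Path source = dir.resolve("big.txt");
        Files.write(source, "aaaa\nbbbb\ncc\n".getBytes(StandardCharsets.UTF_8));
        int count = split(source, dir.resolve("out"), 10);
        System.out.println(count + " pieces written");
    }
}
```

If the source file is in another encoding, such as GBK, the charset passed to Files.newBufferedReader would need to change accordingly; with the wrong charset the reader may throw an exception on malformed input.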
