Writing massive amounts of Kafka data to files

Source: Internet
Author: User
Tags: log4j

A recent project uses the Kafka client to receive messages and requires that they be written to files in order.

There are two approaches:

1. Use log4j to write the files. The advantage is that it is stable and reliable, and files are rolled over automatically at the configured size. The disadvantage is that log4j has no built-in way to switch to a new directory automatically once a certain number of files has been written or a certain amount of time has passed. If you rotate through a fixed set of files, for example a maximum of 10, you need a shell script to back them up periodically, which does work. But solving the problem in code is simple:

import java.io.File;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.Calendar;
import java.util.Date;
import org.apache.log4j.Layout;
import org.apache.log4j.Logger;
import org.apache.log4j.PatternLayout;
import org.apache.log4j.RollingFileAppender;

/**
 * @desc: Uses log4j to write Kafka data to files; the directory is named after the current point in time.
 * <p>For example, 2018020422 is the directory generated at 22:00.
 * @since 2018-2-4
 * @author Chaisson
 */
public class MessageHandler {
	private final static Logger logger = Logger.getLogger(MessageHandler.class);
	private static RollingFileAppender newAppender = null;
	// Point in time at which the log4j appender was last switched
	private static int changeAppenderHour = 0;
	private static String filePath = "d:/log/temp/bak/";
	private static String fileName = "Message.log";

	public static void write(String message) throws IOException {
		changeAppender();
		logger.info(message);
	}

	// Change the appender object.
	// NOTE: as written, the switch happens every minute (Calendar.MINUTE, "yyyyMMddHHmm").
	// For the hourly directories described above, use Calendar.HOUR_OF_DAY and "yyyyMMddHH".
	public synchronized static void changeAppender() throws IOException {
		if (newAppender == null) {
			addAppender();
			changeAppenderHour = Calendar.getInstance().get(Calendar.MINUTE);
		}
		if (Calendar.getInstance().get(Calendar.MINUTE) != changeAppenderHour) {
			addAppender();
			changeAppenderHour = Calendar.getInstance().get(Calendar.MINUTE);
		}
	}

	// Create a new appender object that writes into a directory named after the current time
	private static void addAppender() throws IOException {
		Layout layout = new PatternLayout("%m%n");
		String dir = new SimpleDateFormat("yyyyMMddHHmm").format(new Date());
		newAppender = new RollingFileAppender(layout, filePath + File.separator + dir + File.separator + fileName);
		newAppender.setName("KafkaLogAppender");
		newAppender.setMaxFileSize("500MB");
		newAppender.setMaxBackupIndex(1000);
		logger.removeAppender("KafkaLogAppender");
		logger.addAppender(newAppender);
	}
}
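The directory switch in MessageHandler comes down to two questions: what is the directory for a given moment called, and does that name differ from the one currently in use? Here is a minimal sketch of just that logic, using java.time instead of Calendar and the hourly pattern from the description. The class and method names are hypothetical and not part of the original code.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper: isolates the time-based directory naming used for switching.
final class HourlyDirectory {

    // "yyyyMMddHH" gives one directory per hour, e.g. 2018020422.
    private static final DateTimeFormatter HOURLY = DateTimeFormatter.ofPattern("yyyyMMddHH");

    // Directory name currently in use; null before the first message.
    private String current;

    // Name of the directory that a message arriving at `time` belongs to.
    static String nameFor(LocalDateTime time) {
        return time.format(HOURLY);
    }

    // Returns true when the caller should create a new appender (or file)
    // because the directory name for `time` differs from the one in use.
    synchronized boolean switchIfNeeded(LocalDateTime time) {
        String name = nameFor(time);
        if (name.equals(current)) {
            return false;
        }
        current = name;
        return true;
    }

    synchronized String current() {
        return current;
    }
}
```

Comparing the full directory name, rather than only the hour or minute field, also covers the case where the next message arrives on a later day but in the same hour or minute of the day.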
Test code:

import org.apache.log4j.Logger;

/**
 * @desc: Test writing files with log4j
 * @since 2018-2-4
 * @author Chaisson
 */
public class TestLog4j extends Thread {
	
	private final Logger logger = Logger.getLogger(TestLog4j.class);
	
	private int no;
	
	public TestLog4j(int no) {
		this.no = no;
	}
	
	private void testWriteFile(int no) throws Exception {
		//StringHandler handler = new StringHandler(no);
		String s = "xxxxxxxxxx";
		logger.info("Message length: " + s.length());
		long t1 = System.currentTimeMillis();
		for (int i = 1; i < 300000; i++) {
			MessageHandler.write(i + "-->" + s);
			//handler.write(":" + i + "-->" + s);
		}
		long t2 = System.currentTimeMillis();
		logger.info("It cost " + (t2 - t1) + " ms.");
	}
	
	@Override
	public void run() {
		try {
			testWriteFile(no);
		} catch (Exception e) {
			e.printStackTrace();
		}
	}
	
	public static void main(String[] args) {
		TestLog4j t1 = new TestLog4j(1);
		TestLog4j t2 = new TestLog4j(2);
		t1.start();
		t2.start();
	}
}
This solves the problem. Writing files with log4j easily handles 6 million messages per hour (about 3,500 characters per message). In testing, the peak was more than 20 million long messages written per hour (more than 60 GB of files).
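To put those figures in perspective: 6 million messages per hour at 3,500 characters each is roughly 1,667 messages per second and about 21 GB per hour. The small sketch below does that arithmetic. It assumes one byte per character (plain ASCII payloads) and ignores line terminators; the class name is hypothetical.

```java
// Back-of-the-envelope throughput check for the figures quoted above.
final class ThroughputEstimate {

    // Messages per second needed to sustain `messagesPerHour`.
    static double messagesPerSecond(long messagesPerHour) {
        return messagesPerHour / 3600.0;
    }

    // Bytes written per hour, ASSUMING one byte per character (ASCII payloads).
    static long bytesPerHour(long messagesPerHour, int charsPerMessage) {
        return messagesPerHour * (long) charsPerMessage;
    }

    // The same volume in decimal gigabytes (1 GB = 10^9 bytes).
    static double gigabytesPerHour(long messagesPerHour, int charsPerMessage) {
        return bytesPerHour(messagesPerHour, charsPerMessage) / 1e9;
    }
}
```

Under the same assumption, 20 million messages of about 3,000 characters each come to about 60 GB, which is in line with the file size reported for the peak test.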


2. Write the files yourself. The advantage is full control: the file directory and file name can be changed however you like, it is efficient, and you can strictly guarantee that messages are written in order. The disadvantages are that it is less stable than log4j, the code uses more memory, and its reliability has not been tested in practice.

import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.text.SimpleDateFormat;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Date;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import org.apache.commons.io.IOUtils;
import org.apache.log4j.Logger;

/**
 * @desc: Kafka write file
 * @since 2018-2-5
 * @author Chaisson
 */
public class StringHandler {

	private final Logger logger = Logger.getLogger(StringHandler.class);

	private ThreadPoolExecutor threadPool;

	// Defines a bounded blocking queue
	private ArrayBlockingQueue<String> messageQueue = new ArrayBlockingQueue<String>(999);

	// Thread pool queue
	private LinkedBlockingQueue<Runnable> poolQueue = new LinkedBlockingQueue<Runnable>();

	// Start a new file after 300 MB
	public int m_maxFileLength = 1024 * 1024 * 300;

	private int serialNo;

	private String basePath = "d:/log/temp/bak/";

	private String fileName = "Message.log";

	private File file = null;

	public StringHandler(int serialNo) {
		this.serialNo = serialNo;
		this.start(this);
	}

	public void start(final StringHandler object) {
		// Thread pool size is 1: the lowest efficiency, but it ensures that tasks execute in strict order
		threadPool = new ThreadPoolExecutor(1, 1, 0L, TimeUnit.SECONDS, poolQueue);
		Runnable exeout = new Runnable() {
			public void run() {
				while (true) {
					try {
						// Remove the head of the queue, blocking until a message is available
						String msg = messageQueue.take();
						Thread.sleep(100);
						Collection<String> lines = new ArrayList<String>();
						lines.add(msg);
						// Move everything else that is queued into the same batch
						messageQueue.drainTo(lines);
						logger.info("Will write " + lines.size() + " to file.");
						while (poolQueue.size() > 20) {
							// Prevent excessive memory use: when 20 write tasks are pending,
							// stop creating new ones (recommended not to exceed this value)
							Thread.sleep(100);
						}
						Runnable task = new WriterFileThread(object, lines);
						threadPool.execute(task);
					} catch (InterruptedException e) {
						logger.error(e, e);
					}
				}
			}
		};
		Thread t = new Thread(exeout);
		t.start();
	}

	public void write(String msg) {
		try {
			messageQueue.put(msg);
		} catch (InterruptedException e) {
			logger.error(e, e);
		}
	}

	public File getFile() { return file; }

	public void setFile(File file) { this.file = file; }

	public int getSerialNo() { return serialNo; }

	public String getBasePath() { return basePath; }

	public String getFileName() { return fileName; }
}

class WriterFileThread implements Runnable {

	private final Logger logger = Logger.getLogger(WriterFileThread.class);

	private static final ReentrantLock lock = new ReentrantLock();

	private StringHandler object;

	private String msg;

	private Collection<String> lines;

	public WriterFileThread(StringHandler object, Collection<String> lines) {
		this.object = object;
		this.lines = lines;
	}

	public WriterFileThread(StringHandler object, String msg, Collection<String> lines) {
		this.object = object;
		this.msg = msg;
		this.lines = lines;
	}

	public void run() {
		lock.lock();
		FileWriter fw = null;
		BufferedWriter bw = null;
		try {
			// Create the file on first use or when the current one exceeds the size limit
			synchronized (object) {
				if (object.getFile() == null || object.getFile().length() > object.m_maxFileLength) {
					logger.info("serialNo " + object.getSerialNo());
					object.setFile(createNewFile());
				}
			}
			fw = new FileWriter(object.getFile(), true);
			bw = new BufferedWriter(fw);
			if (msg != null) {
				//OutputStream os = new FileOutputStream(file, true);
				IOUtils.write(msg, bw);
			}
			if (lines != null) {
				//OutputStream os = new FileOutputStream(file, true);
				//IOUtils.writeLines(lines, null, os, "UTF-8");
				IOUtils.writeLines(lines, null, bw);
			}
		} catch (IOException e) {
			logger.error(e, e);
		} finally {
			IOUtils.closeQuietly(bw);
			IOUtils.closeQuietly(fw);
			lock.unlock();
		}
	}

	private File createNewFile() {
		String dir1 = "Thread_" + object.getSerialNo();
		String dir2 = new SimpleDateFormat("yyyyMMddHH").format(new Date());
		String namedSuffix = new SimpleDateFormat("yyyyMMddHHmmssSSS").format(new Date());
		File file = new File(object.getBasePath() + File.separator + dir1 + File.separator + dir2 + File.separator + object.getFileName() + "_" + namedSuffix);
		if (!file.getParentFile().exists()) {
			file.getParentFile().mkdirs();
		}
		return file;
	}
}
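The ordering guarantee in this approach rests on two things: drainTo moves queued messages into a batch in FIFO order, and a single writer thread processes the batches in the order they were submitted. The stripped-down sketch below shows only that pattern. An in-memory list stands in for the output file, and the class and method names are hypothetical, not part of the original code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical, stripped-down version of the queue + drainTo + single-writer pattern.
final class OrderedBatcher {

    private final BlockingQueue<String> queue = new ArrayBlockingQueue<>(999);
    // One worker thread: batches are processed strictly in submission order.
    private final ExecutorService writer = Executors.newSingleThreadExecutor();
    // Stands in for the output file so the sketch needs no file I/O.
    private final List<String> sink = new ArrayList<>();

    // Blocks when the queue is full, which throttles the producer.
    void write(String msg) {
        try {
            queue.put(msg);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Moves everything currently queued into one batch and hands it to the writer.
    // Returns the batch size (0 if nothing was queued).
    int flushOnce() {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch); // keeps the queue's FIFO order
        if (!batch.isEmpty()) {
            writer.execute(() -> {
                synchronized (sink) {
                    sink.addAll(batch);
                }
            });
        }
        return batch.size();
    }

    // Waits for pending batches and returns what was "written", in order.
    List<String> shutdownAndGet() {
        writer.shutdown();
        try {
            writer.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        synchronized (sink) {
            return new ArrayList<>(sink);
        }
    }
}
```

With more than one writer thread, two batches could be written concurrently and arrive out of order, which is why the pool size of 1 matters when ordering is required.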
Test code:
	private void testWriteFile(int no) throws Exception {
		StringHandler handler = new StringHandler(no);
		String s = "xxxx";
		logger.info("Message length: " + s.length());
		long t1 = System.currentTimeMillis();
		for (int i = 1; i < 300000; i++) {
			//MessageHandler.write(i + "-->" + s);
			handler.write(":" + i + "-->" + s);
		}
		long t2 = System.currentTimeMillis();
		logger.info("It cost " + (t2 - t1) + " ms.");
	}
In testing, this approach easily handles more than 10 million long messages per hour (more than 3,000 characters each).
If you increase the capacity of the message queue messageQueue (the number of messages it can buffer), the figure rises to more than 20 million long messages.
If messages do not need to be written in order, a larger thread pool raises the maximum to more than 30 million long messages.
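The queue capacity matters because a bounded queue is what throttles the producer: once it is full, put blocks until the writer catches up. A larger capacity absorbs bigger bursts at the cost of memory. The sketch below makes the limit visible by using offer, which returns false instead of blocking. The class name is hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;

// Hypothetical demo of how queue capacity limits what a producer can hand over
// while no consumer is draining the queue.
final class BoundedQueueDemo {

    // Tries to enqueue `count` messages into a queue of the given capacity
    // and returns how many were accepted.
    static int acceptedWithoutConsumer(int capacity, int count) {
        ArrayBlockingQueue<String> queue = new ArrayBlockingQueue<>(capacity);
        int accepted = 0;
        for (int i = 0; i < count; i++) {
            // offer() returns false when the queue is full;
            // put(), as used by StringHandler.write, would block here instead.
            if (queue.offer("msg-" + i)) {
                accepted++;
            }
        }
        return accepted;
    }
}
```

With the capacity of 999 used above, a burst of 2,000 messages gets only 999 in before the producer has to wait; with a capacity of 5,000 the whole burst fits.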






