Using Java to compare the transfer file size of JSON and Protocol Buffers


I trust everyone knows what JSON is; if you don't, you are behind the times, so go Google it. I won't introduce it here.
Protobuffer is something most people have probably rarely heard of, but since it comes from Google, I expect everyone will be interested in trying it; after all, what Google produces tends to be high quality.
Protobuffer is a data exchange format similar to JSON. Strictly speaking it is not a protocol, just a way of transferring data.
So how does it differ from JSON?
It is cross-language, which is one of its advantages. It ships with a compiler, protoc; you only need to compile your definitions with it to get Java, Python, or C++ code (at the time of writing only these three, so don't count on others yet), which can then be used directly without writing any other code. Even the parsing code is generated for you. JSON is of course cross-language too, but its cross-language support depends on you writing the code.
If you want to learn more, see:
https://developers.google.com/protocol-buffers/docs/overview
OK, enough chatter. Let's look directly at why we need to compare Protobuffer (hereinafter GPB) with JSON.
1. JSON has a fixed format and is stored as characters, so there is still room to compress the data. GPB, on the other hand, takes up far less space than JSON for large amounts of data, as the examples later will show.
2. Efficiency differs considerably between JSON libraries; the gap between Jackson and Gson is roughly 5-10 times (I only tested this once, so please correct me if I'm wrong). GPB needs only one library, so there is no such difference between multiple libraries. Of course, this point is only mentioned in passing and can be ignored.
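
To make point 1 concrete, here is a minimal sketch that hand-encodes one student record the way the protobuf wire format lays it out (one tag byte per field, base-128 varints for integers, length-prefixed UTF-8 for strings) and compares it with the equivalent JSON text. It is not the protobuf library and not part of the tests in this article: the class WireSizeSketch and its methods are made up for illustration, the field numbers 1, 2, 3 follow the Student message defined later, and the JSON key order is an assumption.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, written by hand for illustration only.
class WireSizeSketch {

    // Base-128 varint: 7 payload bits per byte, high bit set on every byte but the last.
    // Only non-negative values are handled in this sketch.
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Encodes Student {id = 1, name = 2, age = 3} with all three fields set.
    static byte[] encodeStudent(int id, String name, int age) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] nameBytes = name.getBytes(StandardCharsets.UTF_8);
        writeVarint(out, (1 << 3) | 0);      // field 1, wire type 0 (varint)
        writeVarint(out, id);
        writeVarint(out, (2 << 3) | 2);      // field 2, wire type 2 (length-delimited)
        writeVarint(out, nameBytes.length);
        out.write(nameBytes, 0, nameBytes.length);
        writeVarint(out, (3 << 3) | 0);      // field 3, wire type 0 (varint)
        writeVarint(out, age);
        return out.toByteArray();
    }

    // The same record as compact JSON text (key order assumed).
    static String studentJson(int id, String name, int age) {
        return "{\"id\":" + id + ",\"name\":\"" + name + "\",\"age\":" + age + "}";
    }

    public static void main(String[] args) {
        byte[] proto = encodeStudent(11, "Shun", 25);
        byte[] json = studentJson(11, "Shun", 25).getBytes(StandardCharsets.UTF_8);
        System.out.println("wire-format bytes: " + proto.length);
        System.out.println("JSON bytes:        " + json.length);
    }
}
```

For id 11, name "Shun", and age 25 this reports 10 bytes for the wire format against 32 for the JSON text; most of the difference is that JSON repeats every field name in every record.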

Talk is cheap, show me the code.
In the programming world code is always king, so let's go straight to the code.
Before the code, we first need to download Protobuffer, from here:
https://github.com/google/protobuf

1. First of all, GPB requires a file that resembles a class definition, called a proto file.
Let's use students and teachers as an example.
We have the following two files: student.proto

option java_package = "com.shun";
option java_outer_classname = "StudentProto";

message Student {
  required int32 id = 1;
  optional string name = 2;
  optional int32 age = 3;
}

teacher.proto

import "student.proto";
option java_package = "com.shun";
option java_outer_classname = "TeacherProto";

message Teacher {
  required int32 id = 1;
  optional string name = 2;

  repeated Student student_list = 3;
}

Here we run into a few unfamiliar things:
import, int32, repeated, required, optional, option, and so on.
Let's go through them one by one:
1) import brings in other proto files.
2) required and optional indicate whether a field must be present, which determines what protobuffer does when the field has no value. If a field is marked required but no value is passed for it during processing, an error is raised; if it is marked optional, leaving the value out causes no problem.
3) repeated should be easy to understand: it marks a field that can repeat, similar to a List in Java.
4) message is the equivalent of a class.
5) option sets options: java_package is the package name used when generating Java code, and java_outer_classname is the name of the generated outer class. Note that this class name must not be the same as the name of a message declared below it.
For other options and the available types, please see the official documentation.
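
The difference between required and optional can be pictured with a small hand-written stand-in for a generated class. This is only a sketch of the behavior described above: StudentSketch and everything in it is hypothetical, the real generated code is far longer, and it reports a missing required field with protobuf's own exception type rather than the IllegalStateException used here.

```java
// Hypothetical stand-in for a generated message class; not produced by protoc.
final class StudentSketch {
    private final int id;       // required
    private final String name;  // optional, null when unset
    private final Integer age;  // optional, null when unset

    private StudentSketch(int id, String name, Integer age) {
        this.id = id;
        this.name = name;
        this.age = age;
    }

    static Builder newBuilder() {
        return new Builder();
    }

    int getId() { return id; }
    boolean hasName() { return name != null; }
    String getName() { return name == null ? "" : name; }  // default when unset
    boolean hasAge() { return age != null; }
    int getAge() { return age == null ? 0 : age; }          // default when unset

    static final class Builder {
        private Integer id;
        private String name;
        private Integer age;

        Builder setId(int id) { this.id = id; return this; }
        Builder setName(String name) { this.name = name; return this; }
        Builder setAge(int age) { this.age = age; return this; }

        // A required field without a value makes build() fail;
        // optional fields may simply be left out.
        StudentSketch build() {
            if (id == null) {
                throw new IllegalStateException("missing required field: id");
            }
            return new StudentSketch(id, name, age);
        }
    }

    public static void main(String[] args) {
        StudentSketch ok = StudentSketch.newBuilder().setId(11).build();
        System.out.println("id=" + ok.getId() + ", hasName=" + ok.hasName());
        try {
            StudentSketch.newBuilder().setName("Shun").build();
        } catch (IllegalStateException e) {
            System.out.println("build failed: " + e.getMessage());
        }
    }
}
```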

2. With these files in hand, what can we do?
Remember the compiler we downloaded above? Unzip it and you get protoc.exe. This is of course for Windows; I haven't tried other systems, so interested readers can experiment with those themselves.
Add it to your PATH (optional, it is only for convenience), and then you can use the files above to generate the classes we need.
protoc --java_out=<directory for the generated source code> --proto_path=<directory containing the proto files> <proto files to compile>
--proto_path specifies the folder containing the proto files, not a single file. It is mainly used for looking up imported files and can be omitted.

Suppose I need to put the source code in D:\protobuffervsjson\src, and my proto files are stored in D:\protoFiles.
Then my compile command is:

protoc --java_out=D:\protobuffervsjson\src D:\protoFiles\teacher.proto D:\protoFiles\student.proto

Note the files at the end: we need to list every file that has to be compiled.

After compiling you can see the generated files.
I won't post the code; there is too much of it. You can look at it yourself: it contains a lot of Builder classes, and I'm sure one look is enough to recognize the builder pattern.
Then you can put the code into your project; of course, there will be a lot of errors.

Remember the source code we downloaded earlier? Unzip it, don't hold back. Then find src/main/java and copy that whole pile into your project. Of course, you can also build it with Ant or Maven, but I'm not familiar with either, so I won't embarrass myself; I'm still used to copying it straight into the project.

The code reports errors, haha, that's normal. I don't know why Google had to leave this pit for us.
Go back to the \java directory of Protobuffer and look at README.txt, where you will find this sentence:

It looks a bit odd, as if something is wrong with it. Anyway, I didn't run it as written; my command was:

protoc --java_out=<the same code location as above> <path to the proto file (this is the path to the descriptor.proto file)>

After running it, the errors in the code are gone.

3. Next, of course, comes the test.
Let's start with the GPB write test:

package com.shun.test;

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import com.shun.StudentProto.Student;
import com.shun.TeacherProto.Teacher;

public class ProtoWriteTest {

    public static void main(String[] args) throws IOException {
        Student.Builder stuBuilder = Student.newBuilder();
        stuBuilder.setAge(25);
        stuBuilder.setId(11);
        stuBuilder.setName("Shun");

        // Construct the list
        List<Student> stuBuilderList = new ArrayList<Student>();
        stuBuilderList.add(stuBuilder.build());

        Teacher.Builder teaBuilder = Teacher.newBuilder();
        teaBuilder.setId(1);
        teaBuilder.setName("Testtea");
        teaBuilder.addAllStudentList(stuBuilderList);

        // Write the GPB to the file
        FileOutputStream fos = new FileOutputStream("C:\\users\\shun\\desktop\\test\\test.protoout");
        teaBuilder.build().writeTo(fos);
        fos.close();
    }
}

Go and look for the file; if nothing unexpected happened, it should have been generated.
Having generated it, we naturally need to read it back.

package com.shun.test;

import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

import com.shun.StudentProto.Student;
import com.shun.TeacherProto.Teacher;

public class ProtoReadTest {

    public static void main(String[] args) throws FileNotFoundException, IOException {
        Teacher teacher = Teacher.parseFrom(new FileInputStream("C:\\users\\shun\\desktop\\test\\test.protoout"));
        System.out.println("Teacher ID:" + teacher.getId() + ", Name:" + teacher.getName());
        for (Student stu : teacher.getStudentListList()) {
            System.out.println("Student ID:" + stu.getId() + ", Name:" + stu.getName() + ", Age:" + stu.getAge());
        }
    }
}

The code is simple, because the code that GPB generates does the work for us.
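
To get a feel for what the generated parsing code hides, here is a minimal hand-written decoder for a flat message in the protobuf wire format. It is a sketch under assumptions, not the library's parser: WireDecodeSketch is a made-up name, only wire types 0 (varint) and 2 (length-delimited, treated here as a UTF-8 string) are handled, and the sample bytes are what the wire format gives for a Student with id 11, name "Shun", and age 25.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, written by hand for illustration only.
class WireDecodeSketch {

    // Reads one base-128 varint: 7 payload bits per byte, least significant group first.
    static long readVarint(ByteBuffer buf) {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = buf.get();
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
            shift += 7;
        }
    }

    // Walks the message and renders each field as "fieldNumber=value;".
    static String describe(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        StringBuilder sb = new StringBuilder();
        while (buf.hasRemaining()) {
            long tag = readVarint(buf);
            int fieldNumber = (int) (tag >>> 3);  // upper bits: field number
            int wireType = (int) (tag & 0x7);     // lower 3 bits: wire type
            sb.append(fieldNumber).append('=');
            if (wireType == 0) {
                sb.append(readVarint(buf));
            } else if (wireType == 2) {
                byte[] payload = new byte[(int) readVarint(buf)];
                buf.get(payload);
                sb.append(new String(payload, StandardCharsets.UTF_8));
            } else {
                throw new IllegalArgumentException("unsupported wire type " + wireType);
            }
            sb.append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] student = {0x08, 0x0B, 0x12, 0x04, 'S', 'h', 'u', 'n', 0x18, 0x19};
        System.out.println(describe(student)); // prints 1=11;2=Shun;3=25;
    }
}
```

Note that the bytes carry only field numbers, never field names; the names come from the proto file, which is why both sides need the generated classes.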
That covers basic usage. Our focus is the difference in file size between GPB and JSON output. I won't post the detailed JSON code here; the example will be posted later for anyone interested to download.
Here we use Gson to handle the JSON, and below is only the code that converts the object to JSON and writes it to a file.
The basic definitions of the two classes Student and Teacher are not shown; write them however you like. The code is as follows:

package com.shun.test;

import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import com.google.gson.Gson;
import com.shun.Student;
import com.shun.Teacher;

public class GsonWriteTest {

    public static void main(String[] args) throws IOException {
        Student stu = new Student();
        stu.setAge(25);
        stu.setId(22);
        stu.setName("Shun");

        List<Student> stuList = new ArrayList<Student>();
        stuList.add(stu);

        Teacher teacher = new Teacher();
        teacher.setId(22);
        teacher.setName("Shun");
        teacher.setStuList(stuList);

        // Convert the object to JSON and write it to the file
        String result = new Gson().toJson(teacher);
        FileWriter fw = new FileWriter("C:\\users\\shun\\desktop\\test\\json");
        fw.write(result);
        fw.close();
    }
}

Next we get to the real test code. Above we put just one object in the list; now we test the size of the GPB and JSON output with 100, 1000, 10000, 100000, 1000000, and 5000000 objects in turn.
Modify the earlier GPB code so that it builds lists of different sizes and then writes the file:

package com.shun.test;

import java.io.FileOutputStream;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import com.shun.StudentProto.Student;
import com.shun.TeacherProto.Teacher;

public class ProtoWriteTest {

    public static final int SIZE = 100;

    public static void main(String[] args) throws IOException {
        // Construct the list
        List<Student> stuBuilderList = new ArrayList<Student>();
        for (int i = 0; i < SIZE; i++) {
            Student.Builder stuBuilder = Student.newBuilder();
            stuBuilder.setAge(25);
            stuBuilder.setId(11);
            stuBuilder.setName("Shun");
            stuBuilderList.add(stuBuilder.build());
        }

        Teacher.Builder teaBuilder = Teacher.newBuilder();
        teaBuilder.setId(1);
        teaBuilder.setName("Testtea");
        teaBuilder.addAllStudentList(stuBuilderList);

        // Write the GPB to the file
        FileOutputStream fos = new FileOutputStream("c:\\users\\shun\\desktop\\test\\proto-" + SIZE);
        teaBuilder.build().writeTo(fos);
        fos.close();
    }
}

Change SIZE to each of the test counts mentioned above in turn, and we get the following:

Next, let's look at the JSON test code:

package com.shun.test;

import java.io.FileWriter;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import com.google.gson.Gson;
import com.shun.Student;
import com.shun.Teacher;

public class GsonWriteTest {

    public static final int SIZE = 100;

    public static void main(String[] args) throws IOException {
        List<Student> stuList = new ArrayList<Student>();
        for (int i = 0; i < SIZE; i++) {
            Student stu = new Student();
            stu.setAge(25);
            stu.setId(22);
            stu.setName("Shun");
            stuList.add(stu);
        }

        Teacher teacher = new Teacher();
        teacher.setId(22);
        teacher.setName("Shun");
        teacher.setStuList(stuList);

        // Convert the object to JSON and write it to the file
        String result = new Gson().toJson(teacher);
        FileWriter fw = new FileWriter("C:\\users\\shun\\desktop\\test\\json" + SIZE);
        fw.write(result);
        fw.close();
    }
}

Modify SIZE in the same way and run the corresponding tests.

Clearly, as the amount of data grows, the JSON file size diverges significantly from the GPB file size, with JSON being much larger.

The table above shows this more clearly: GPB has a big advantage with large data. In general, though, a client and a server do not exchange that much data directly; large transfers mainly happen between servers. If you need to send log files of several hundred MB to another server every day, GPB may be a big help to you.
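
The same trend can be estimated on paper, without writing any files. The sketch below computes expected sizes from the layout of each format; these are estimates under assumptions, not the measured results of the tests above. SizeGrowthSketch and its methods are made up for illustration, the JSON key names (id, name, age, stuList) and their order are guesses based on the setters used earlier, and the values are the ones set in the two write tests.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical estimator, written by hand for illustration only.
class SizeGrowthSketch {

    // Number of bytes a non-negative value occupies as a base-128 varint.
    static int varintSize(long value) {
        int size = 1;
        while ((value >>>= 7) != 0) {
            size++;
        }
        return size;
    }

    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Student {id, name, age}: each field costs a 1-byte tag plus its payload.
    static long studentProtoSize(int id, String name, int age) {
        int nameLen = utf8Length(name);
        return (1 + varintSize(id)) + (1 + varintSize(nameLen) + nameLen) + (1 + varintSize(age));
    }

    // Teacher {id, name, repeated Student}: every nested student adds tag + length + body.
    static long teacherProtoSize(int id, String name, long students, long studentSize) {
        int nameLen = utf8Length(name);
        long header = (1 + varintSize(id)) + (1 + varintSize(nameLen) + nameLen);
        return header + students * (1 + varintSize(studentSize) + studentSize);
    }

    static String studentJson(int id, String name, int age) {
        return "{\"id\":" + id + ",\"name\":\"" + name + "\",\"age\":" + age + "}";
    }

    // Compact JSON: prefix, the student objects separated by commas, closing brackets.
    static long teacherJsonSize(int id, String name, long students, String studentJson) {
        String prefix = "{\"id\":" + id + ",\"name\":\"" + name + "\",\"stuList\":[";
        long commas = Math.max(0, students - 1);
        return utf8Length(prefix) + students * utf8Length(studentJson) + commas + utf8Length("]}");
    }

    public static void main(String[] args) {
        long studentProto = studentProtoSize(11, "Shun", 25);
        String studentJson = studentJson(22, "Shun", 25);
        long[] sizes = {100, 1000, 10000, 100000, 1000000, 5000000};
        for (long n : sizes) {
            System.out.println(n + " students: GPB ~" + teacherProtoSize(1, "Testtea", n, studentProto)
                    + " bytes, JSON ~" + teacherJsonSize(22, "Shun", n, studentJson) + " bytes");
        }
    }
}
```

Under these assumptions each additional student costs about 12 bytes in GPB and about 33 characters in JSON, so the gap grows linearly with the list size.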


Although this is billed as an in-depth comparison, the main comparison is really about size. Time is less comparable, and the difference there is not large either.
This article uses the Gson parser; interested readers can choose Jackson, fastjson, or another library. The resulting file size will be the same; only the parsing time will differ.
