I wonder whether you have read the article about Twitter, "See how Twitter responded to the general election: less Ruby, more Java." If you are interested, it is worth searching out.
On the day of the Obama-Romney election, Twitter's servers were processing 327,452 tweets per minute. People posted 31 million election-related tweets that day, and as traffic surged, the rate at one point hit 15,107 tweets per second. In the Internet world the real winner was not Obama but Twitter, because this time Twitter had no downtime.
"As part of migrating Ruby, we reconfigured the server and the mobile client's access would go through the Java Virtual machine stack to avoid parallel with the Ruby stack," says Rawashdeh, "the ability to withstand such loads is thanks to Twitter rewriting Ruby on with Java Railstwitter. At first the company was opposed to Java, supporting Scala, and today, Twitter combines Scala with Java.
Hadoop, the behemoth among open-source big-data frameworks, carries weight that is hard to overstate, and it too is built on Java.
Our BI product line is likewise developed in Java; its foreign competitors are mainly Cognos, BO, and BIEE. Judging from customer selection processes, clients consistently praise two of our strong points: high-performance data computation and data visualization. I built both of these myself, brick by brick, so I have some standing to say: if you are getting ready to process big data in Java, go ahead with confidence.
At work and around the web I often run into people who insist that massive data processing and computation cannot be done in Java, that you must use C or C++, and so on. I can only smile. Most of the time the debate is pointless, because there is no standard answer.
I often see data warehouse products advertising their compression ratios: data squeezed to a tenth of its size or better, 90% of disk space saved, and so on.
In my view, aggressive data compression really matters for the transfers between MPP nodes, where it conserves overall network bandwidth; beyond that, apart from saving some disk, it is not that important.
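As a rough illustration of where compression does pay off, the sketch below compresses a serialized row block before it is shipped to another node. The transport itself and the block format are assumptions made up for the example; the point is only that the savings land on the wire rather than on disk.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPOutputStream;

// A minimal sketch, assuming a row block already serialized to bytes:
// compression is applied to the payload that crosses the network between
// MPP nodes, not to what sits on disk.
public class NodeTransfer {

    // Compress a serialized row block so that less data travels over the wire.
    static byte[] compressForWire(byte[] rowBlock) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(rowBlock);
        }
        return buffer.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] rows = "id,amount\n1,100\n2,200\n".repeat(10_000).getBytes();
        byte[] wire = compressForWire(rows);
        System.out.printf("raw: %d bytes, on the wire: %d bytes%n", rows.length, wire.length);
    }
}
```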
PC disks are now measured in terabytes as standard, so saving disk space is of little use and may even have side effects. The analysis is as follows:
To serve a computation over massive data, the data generally has to be loaded into memory, and if it is stored compressed, it must be expanded in memory before the computation can run. Most developers know that expanding data easily causes frequent memory allocation and release, and that decompression itself is very likely a CPU-intensive process.
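A minimal sketch of that expansion cost, assuming a column block stored as a gzip-compressed byte array: every scan has to inflate the block into fresh buffers before the computation we actually care about can even start.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Sketch of the expansion cost: a compressed block must be inflated into
// memory before it can be scanned, and every query that touches the block
// repeats that allocation and CPU work. The block layout is an assumption
// made for illustration.
public class CompressedBlockScan {

    // Inflate a gzip-compressed block into memory, then scan the expanded copy.
    static long scan(byte[] compressedBlock) throws IOException {
        ByteArrayOutputStream expanded = new ByteArrayOutputStream();
        try (GZIPInputStream gzip =
                     new GZIPInputStream(new ByteArrayInputStream(compressedBlock))) {
            byte[] chunk = new byte[8192];
            int n;
            while ((n = gzip.read(chunk)) != -1) { // CPU spent on decompression
                expanded.write(chunk, 0, n);       // memory spent on the expanded copy
            }
        }
        byte[] data = expanded.toByteArray();      // another full-size allocation
        long checksum = 0;
        for (byte b : data) {
            checksum += b;                         // the computation we actually wanted
        }
        return checksum;
    }

    public static void main(String[] args) throws IOException {
        // Build a compressed block from sample rows, then pay the expansion cost to scan it.
        byte[] rows = "id,amount\n1,100\n2,200\n".repeat(100_000).getBytes();
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(rows);
        }
        System.out.println("checksum: " + scan(buffer.toByteArray()));
    }
}
```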
Therefore, when evaluating a data warehouse or data mart product, disk savings are worth a look, but it is more important to ask whether the product saves memory, saves CPU, and saves time.
A good product has a memory design that omits or optimizes the data-expansion step, so that it does not cause frequent memory allocation and release. For high performance it chooses efficient ways to load data from disk and to discard data from memory, so that it can respond as fast as possible; and on the CPU side it saves as much computation as it can, making real-time calculation over massive data feasible.
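One way a product can sidestep the expansion step is to compute directly on the encoded data. The sketch below assumes a run-length-encoded column, which is purely an illustrative choice and not a claim about any particular product; an aggregate such as SUM never has to materialize the individual rows.

```java
// A minimal sketch, assuming a run-length-encoded column: values[i] holds the
// value of run i and counts[i] how many consecutive rows carry it. The sum is
// computed run by run, with no per-row expansion and no extra allocation.
public class RleSum {

    static long sum(long[] values, long[] counts) {
        long total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i] * counts[i]; // value * run length, still compressed
        }
        return total;
    }

    public static void main(String[] args) {
        long[] values = { 5, 7 };
        long[] counts = { 1_000_000, 2_000_000 };
        System.out.println(sum(values, counts)); // 19000000, computed from 2 runs, not 3 million rows
    }
}
```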
Thus, when evaluating a data warehouse or BI product, I recommend testing it across a range of disk, memory, and CPU configurations.
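As a starting point for such tests, a rough harness like the one below wraps the same query with wall-clock, CPU, and heap measurements, and can simply be rerun on each hardware configuration. The runQuery() method is a placeholder for whatever workload the product under evaluation actually executes.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

// Sketch of a measurement wrapper: record wall time, CPU time, and heap growth
// around one run of the test query. runQuery() is a hypothetical stand-in.
public class WarehouseBenchmark {

    static void runQuery() {
        // placeholder: submit the test query to the product under evaluation
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Runtime runtime = Runtime.getRuntime();

        long heapBefore = runtime.totalMemory() - runtime.freeMemory();
        long cpuBefore = threads.getCurrentThreadCpuTime();
        long wallBefore = System.nanoTime();

        runQuery();

        long wallMs = (System.nanoTime() - wallBefore) / 1_000_000;
        long cpuMs = (threads.getCurrentThreadCpuTime() - cpuBefore) / 1_000_000;
        long heapDeltaMb = (runtime.totalMemory() - runtime.freeMemory() - heapBefore)
                / (1024 * 1024);

        System.out.printf("wall: %d ms, cpu: %d ms, heap delta: %d MB%n",
                wallMs, cpuMs, heapDeltaMb);
    }
}
```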