Scientific research and misuse of big data concepts




Lu Meng, Esri China


As the old verse has it, it was as if a spring breeze had come overnight and thousands of pear trees had burst into bloom. The rise of the "Big Data" trend in 2012 turned "data" from IT jargon into a buzzword across industries. Arguably no other term from the IT field has attracted so much attention or been used so widely. Beyond the traditional IT industry, sectors as varied as catering, real estate, and finance could not wait to announce their own "Big Data" strategies.

Microsoft Research's book The Fourth Paradigm: Data-Intensive Scientific Discovery describes scientific research as having passed through three paradigms (experimental science, theoretical science, and computational simulation) and introduces a fourth: data-intensive science.

 


Inevitably, then, the big data trend spread into the field of scientific research.

Amid this nationwide hype, a group of scientists and engineers stayed calm. Although the term "Big Data" was first proposed in the scientific research community, it is most widely used in the Internet sector. The "Vs" used to characterize big data, whether the original 3 Vs, the later 4 Vs, or the current 11 Vs, all describe the flood of data generated by the Internet. Does the scientific research community really need this?




First, big data emphasizes "fast" data: fast to produce, fast to spread, fast to change, and fast to process. In scientific research, however, much of the data does not move that fast. In many geographic information fields, for example, it is common for data such as land use, soil conditions, and administrative divisions to remain unchanged for years.

Second, there is the question of dimensions. The big data approach is to collect as much data as possible, whether or not it can be used now and whether or not it is the information we currently care about; the worry is never having too much, only too little. (In many cases, companies and researchers end up lost in their data.) The popularity of NoSQL databases in particular has led many researchers to cheer, "Mom no longer has to worry about my data storage paradigm...". In science, however, the first thing to define is the research goal. The goal must be clearly stated, and the data structure must be designed from the start to serve it, so that the work proceeds with a purpose. Without that up-front definition and design, the goal is weakened and eventually lost in the course of the study.


Third, there is the question of the value of data. Internet data, especially at companies such as Twitter, Google, and Facebook, arrives almost effortlessly. Scientific data, by contrast, is hard to come by. Whether it comes from experiments or from field investigation and sampling, each data point may carry a very high cost in labor and time.

Having more data is the ideal. But when every data point is this expensive, reaching Internet-scale data volumes in scientific research is an almost impossible task.

Of course, in the big data era, big data is not only about volume; it also carries the idea of analyzing the complete data set.

In scientific research, too, having complete data to analyze is the ideal. In geographic information, however, samples exist as points. By the definition of geographic features, a point feature has only an (x, y) position and no size, so no matter how much data is collected, the sample points can never cover the entire study area. This is why algorithms that estimate the whole from samples, including spatial sampling and geostatistical analysis, are so important in geographic information.
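To make the idea of estimating the whole from point samples concrete, here is a minimal Python sketch. The article does not name a specific algorithm, so this uses inverse distance weighting (IDW), a simple deterministic interpolator, as a stand-in; it is not geostatistical kriging. The function name idw_estimate, the power parameter, and the sample values are all assumptions made for illustration.

```python
import math


def idw_estimate(samples, x, y, power=2.0):
    """Estimate a value at (x, y) from point samples by inverse distance weighting.

    samples: list of (sx, sy, value) tuples (hypothetical measurements).
    power:   how quickly a sample's influence decays with distance.
    """
    if not samples:
        raise ValueError("at least one sample point is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # The target coincides with a sample point: use the measured value.
            return value
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical measurements at the four corners of a study area: (x, y, value).
samples = [
    (0.0, 0.0, 10.0),
    (10.0, 0.0, 20.0),
    (0.0, 10.0, 30.0),
    (10.0, 10.0, 40.0),
]

center = idw_estimate(samples, 5.0, 5.0)     # unsampled location
at_sample = idw_estimate(samples, 0.0, 0.0)  # location of an existing sample
print(center, at_sample)
```

Four points say nothing directly about the interior of the area, yet the estimator still assigns a value everywhere: a location equidistant from all samples gets their average, and a location on a sample gets the measurement itself. How trustworthy those estimates are depends on the sampling design, which is why the two topics go together.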

Big data is an idea, but it must not be applied as dogma. It does not mean piling up data for the sake of volume; we should apply it only when we truly understand it. As Comrade Xiaoping said: it does not matter whether a cat is black or white; a cat that catches mice is a good cat!


 
