Signs of redundant construction appear: big data development must not go astray

Source: Internet
Author: User
Keywords: big data, data objects, big data development
Tags: analysis, application, application prospects, big data, cloud

Big data has become another hot spot in the information technology field after cloud computing and the Internet of Things, thanks to its seemingly easy-to-understand concept and its great application prospects. But amid attention from all walks of life, the domestic big data field has shown no obvious progress; instead, more or less of a "bubble" has already appeared, and signs of grabbing money and land, or of redundant construction, in the name of big data have emerged. In this situation, we need to deepen our understanding of the content and characteristics of big data. Based on its tracking research on big data, the Sadie think tank believes that understanding should focus on four aspects.

The key to big data development lies in the analysis and application of data objects

Some experts have been heard advising local government leaders that building large-scale data centers and backing up and storing user data amounts to developing the big data industry, and some organizations hold that developing big data must focus on strengthening infrastructure construction. These views deviate from the essence of big data and can lead its development astray.

Fundamentally, "Big data" is not a scientific and rigorous concept, it comes from the phenomenon of explosive growth of data scale. But under the premise of "mass data" and "large scale data", the new concept has to be put forward, because the concept is only focused on the data scale itself, and it cannot fully reflect the data processing and application requirements under the background of data explosion. The concept of "big data" can trigger consensus and become the current hot spot, which lies in the huge realistic demand and specific application demand of large data analysis and utilization in each industry field. Therefore, the content of large data, not only refers to the scale of the traditional technology processing capacity of static data objects, but also contains the dynamic processing of these data objects and application activities.

Looking at the life cycle of data, many links and levels are involved, from data generation, transmission, and collection to data processing, analysis, and application. For big data, however, the focus is not on transmission, collection, or storage, but on analysis and mining, through which useful information that is hard to discover intuitively can be obtained. Only by focusing on data analysis, mining, and application can the real value of big data be realized, and it is precisely this analysis and application that matters most to the development of the big data industry. We therefore believe that services centered on the analysis and processing of big data will be the core of big data development.

For big data objects, the focus is on acquisition and use

Considering only the data objects involved in big data, some hold that the construction of source data collections must be pursued vigorously. This view is reasonable, but not entirely correct. For example, using big data to achieve intelligent transportation requires data on urban road planning, vehicles, parking lots, and so on, most of which are held by government departments, and the corresponding databases do need strengthened construction. However, to manage traffic flexibly and in real time according to actual conditions and keep it running effectively, these data alone are not enough. Road traffic flow data, parking-lot capacity data, weather data, and road accident information must also be tracked and organized, and these data come not only from the relevant departments and from intersections but also through channels such as Weibo (microblogs) and WeChat; indeed, microblog information is often faster than the data held by the managing departments. As is well known, information sources such as microblogs are open; no department can "own" these data, and the only thing to do is to crawl, collect, and collate them as quickly as possible. Analyzing a few similar cases shows that, for emergency-response applications, real-time and dynamically obtained data are more valuable than the static data in conventional databases, and data from public sources such as Weibo and search engines are often more valuable than the internal data available to the departments that use them (such as the managing department in the example above).

Therefore, the data objects involved in big data must be treated in a classified manner. Data that government departments, public service agencies, and enterprises hold and continuously update need strengthened construction as the basis for data applications. At the same time, attention must be paid to emerging data sources such as Weibo, WeChat, social networks, and search engines, and the tracking, capture, collation, and application of their data must be done well.
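To make the "crawl, collect, and collate" idea concrete, the following is a minimal sketch of pulling records from an open data feed and attaching them to a department's static reference records. The URL, endpoint behavior, and field names ("road_id", "items") are illustrative assumptions, not a real API or the article's own example.

```python
# Sketch: collect dynamic reports from a hypothetical open feed and collate them
# with static records a department already holds. All names here are assumed.
import requests

OPEN_FEED_URL = "https://example.org/open-traffic-feed"  # hypothetical public source

def fetch_dynamic_reports(keyword: str) -> list[dict]:
    """Collect recent public reports mentioning a keyword (e.g. a road name)."""
    resp = requests.get(OPEN_FEED_URL, params={"q": keyword}, timeout=10)
    resp.raise_for_status()
    return resp.json().get("items", [])

def collate(static_records: list[dict], dynamic_reports: list[dict]) -> dict:
    """Group real-time reports under the static road records they refer to."""
    by_road = {rec["road_id"]: {"static": rec, "reports": []} for rec in static_records}
    for report in dynamic_reports:
        road_id = report.get("road_id")
        if road_id in by_road:          # keep only reports we can attach to a known road
            by_road[road_id]["reports"].append(report)
    return by_road
```

The point of the sketch is only the division of labor it shows: static departmental data serve as the skeleton, while open-source data are continuously fetched and hung onto it.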

The focus of big data is inspiration and decision support

IBM recently proposed another characteristic of big data: that it is true and accurate (veracity). Objectively speaking, however, this characteristic is open to question. The goal and ideal outcome of a big data application is to discover new knowledge, new rules, and new useful information by analyzing and mining big data objects; but computer-based analysis and processing of big data should not be expected to be exact and precise, and even the source data objects involved cannot be required to be true and accurate.

In terms of source data objects, big data will include microblog data, social network data, search engine data, and so on. For various reasons, these data will inevitably contain erroneous and useless records, and even with certain data cleaning and filtering methods their authenticity and correctness cannot be fully guaranteed. The value of big data, however, lies precisely in finding useful information among complex data objects, in a process of discarding the false and retaining the true. Truth and accuracy are therefore only relative concepts: they should be worked toward, but cannot be demanded.
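As a minimal sketch of the kind of cleaning and filtering mentioned above, the function below drops empty, malformed, and duplicate records from a noisy feed. The field names ("text", "timestamp") are illustrative assumptions; note that such cleaning improves the data but cannot guarantee that what remains is true, which is the point made above.

```python
# Sketch: basic cleaning of noisy public records (assumed field names).
from datetime import datetime

def clean(records: list[dict]) -> list[dict]:
    """Remove records that are empty, malformed, or exact duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        text = (rec.get("text") or "").strip()
        if not text:                      # useless record: no content
            continue
        try:                              # malformed record: unparseable timestamp
            datetime.fromisoformat(rec.get("timestamp", ""))
        except ValueError:
            continue
        if text in seen:                  # duplicate content
            continue
        seen.add(text)
        cleaned.append(rec)
    return cleaned
```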

In terms of analysis results, the big data analysis process only needs to discover knowledge and rules that reflect a certain correlation; it does not need to complete a mathematical derivation or logical proof. In the classic example of diapers and beer, when the association rule was first discovered no one knew why it held. Big data is thus about discovering rules, not proving them. Its value to researchers and decision makers lies in guiding and inspiring the innovative thinking of big data users and in assisting decisions. Simply put, when dealing with a problem a person might ordinarily think of one method, while big data can suggest ten reference methods; even if only three of them turn out to be feasible, the available approaches to solving the problem have been tripled.
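A toy illustration of the diapers-and-beer rule may help: with the few assumed shopping baskets below, one can compute the support and confidence of the rule "diapers imply beer". The data are invented for illustration and are not from the article.

```python
# Sketch: support and confidence of "diapers -> beer" over assumed baskets.
baskets = [
    {"diapers", "beer", "bread"},
    {"diapers", "beer", "milk"},
    {"diapers", "milk"},
    {"bread", "milk"},
    {"diapers", "beer"},
]

def support(itemset: set[str]) -> float:
    """Fraction of baskets containing every item in the itemset."""
    return sum(itemset <= b for b in baskets) / len(baskets)

def confidence(antecedent: set[str], consequent: set[str]) -> float:
    """Among baskets containing the antecedent, the share also containing the consequent."""
    return support(antecedent | consequent) / support(antecedent)

print(support({"diapers", "beer"}))       # 0.6: the pair appears in 3 of 5 baskets
print(confidence({"diapers"}, {"beer"}))  # 0.75: diaper buyers usually also buy beer
```

Such a rule states only a correlation; nothing in the computation explains why the two items co-occur, which is exactly the distinction between discovering rules and proving them.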

The information security problems of big data should not be exaggerated

The application of big data involves information resources and information technology, so it inevitably touches on information security. However, information security issues should not be overstated.

First, do not manufacture so-called security problems. For example, one view holds that all source data sets related to big data must be protected. But data sources such as the Weibo and social network data listed above are inherently open and accessible to everyone; there is no need, and no way, to protect them.

Second, conventional information security problems should not be labeled as big data problems. For example, protecting the data resources of government departments is a universal problem that exists even without big data applications and is not necessarily magnified by them. It therefore cannot be casually claimed that big data brings new information security problems.

In fact, the biggest information security problem of the big data era is how to prevent an adversary from using big data analysis to extract important or even secret information from the multitude of subtle pieces of information that the other party inadvertently discloses. Guarding against this, however, falls outside the scope of big data applications themselves.

