Big Data as I Understand It

Source: Internet
Author: User

The term "big data" needs little introduction: over the past two years it has been a fixture of nearly every Internet-related event or conference. The recently concluded 13th China Internet Conference also devoted a dedicated forum to big data.

Whether a seasoned practitioner or a newcomer, almost everyone shares the same reaction: big data is useful! But how do we actually use it?

The vast number of books and articles on big data seem to share one message: more and more industries and people are paying attention to big data and exploring its applications in practice. Together we are sketching a blueprint of how useful big data could be, but in practice we are still at an early stage.

Big data is built on the Internet, and the development of data warehousing, data mining, cloud computing, and other Internet technologies has laid the foundation for applying it. Practical application, however, is still being explored. As someone who is also still exploring and learning, I would like to share and discuss four questions from my own perspective: What is big data? What can big data do? What has big data actually done so far? How should we go about doing big data?

First, what is big data?

Here are three commonly cited definitions of big data:

(1) Massive, fast-growing, and diverse information assets that require new processing modes in order to deliver stronger decision-making power, insight discovery, and process optimization.

--Gartner

(2) Massive data scale (Volume), fast data flow and dynamic data systems (Velocity), diverse data types (Variety), and great data value (Value).

--IDC

(3) Big data, also called huge-volume data, massive data, or mass information, refers to data of such magnitude that it cannot be captured, managed, processed, and organized into human-readable information by manual means within a reasonable time.

--Wikipedia

Other definitions of big data are broadly similar, and we can characterize big data with a few keywords.

First, "large scale". Scale can be measured along two dimensions: one is the volume of data accumulated over time, and the other is depth, that is, how fine-grained the data is.

Second, "diversity". The data may come in different formats, such as text, images, and video; in different categories, such as population data and economic data; and from different sources, such as the Internet and sensors.

Third, "dynamic". The data changes constantly: large amounts of data can be added quickly over time, and the data can also vary across space.

These three keywords outline what big data looks like.

But one more key capability is required: "fast processing". If data this large, diverse, and dynamic takes a long time to process and analyze, it does not count as big data. Seen from another angle, processing such data quickly is certainly impossible by hand, so it has to be done with the help of machines.

Finally, the entire system in which machines rapidly process and analyze this data to obtain the desired information or applications can be called big data.

We can use the following diagram to define big data:

Now that the concept of big data has been defined, what can big data do?

Applying big data roughly follows this process.

First we need data sources; then we collect and store the data; on that basis we analyze and apply it to form our products and services. Those products and services in turn generate new data, which is fed back into the process.

When this whole cycle becomes an intelligent system that machines can run automatically, it may become a new paradigm, whether commercial or otherwise.
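
The cycle described above can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions: the class `DataCycle`, its in-memory `store`, and the frequency-count "analysis" are hypothetical stand-ins, not a description of any real big data system.

```python
from dataclasses import dataclass, field


@dataclass
class DataCycle:
    """Toy model of the source -> collect/store -> analyze -> apply loop."""
    store: list = field(default_factory=list)  # hypothetical in-memory "warehouse"

    def collect(self, records):
        # Collect and store raw records coming from a data source.
        self.store.extend(records)

    def analyze(self):
        # Placeholder analysis: count how often each item appears.
        counts = {}
        for item in self.store:
            counts[item] = counts.get(item, 0) + 1
        return counts

    def apply(self, counts):
        # Placeholder "product": recommend the most frequent item.
        return max(counts, key=counts.get)

    def run_once(self, source_records):
        self.collect(source_records)
        recommendation = self.apply(self.analyze())
        # Using the product generates new data, which re-enters the cycle.
        self.collect([recommendation])
        return recommendation


cycle = DataCycle()
first = cycle.run_once(["tea", "coffee", "tea"])
second = cycle.run_once(["coffee", "coffee", "coffee"])
print(first, second, len(cycle.store))
```

The point of the sketch is the last step of `run_once`: the output of the product is stored alongside the original data, so each pass through the loop works on more data than the previous one.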

Turning to concrete applications, I think the practical uses of big data can be summed up in two directions: one is precise customization, the other is prediction.

First, precise customization.

This mainly addresses the two sides of supply and demand: capture the personalized needs of the demand side, help the supply side position its targets precisely, then provide products according to demand, and ultimately achieve the best match between supply and demand.

Specific applications fall into three categories.

The first is personalized products, such as intelligent search engines, where different people searching for the same content get different results, as well as customized news services or online games.

The second is precision marketing: the now common Internet marketing, such as Baidu's promotion and Taobao's web promotion, or location-based push, where arriving at a place automatically triggers a push about nearby consumer facilities.

The third is site selection, including choosing locations for retail outlets or siting public infrastructure.

All of these analyze user needs through big data and then provide a relatively customized service.

The second direction of application is prediction.

Prediction centers on a target object: by analyzing its past data and the factors relevant to its future, it enables early warning or real-time dynamic optimization.

Specific applications again fall into three categories.

The first is decision support, ranging from enterprise operating decisions and securities investment decisions to clinical diagnosis and treatment support in the medical profession, as well as e-government.

The second is risk early warning, such as epidemic forecasting, disease prediction in everyday health management, operation and maintenance of equipment and facilities, public safety, and credit risk management in the financial industry.

The third is real-time optimization, such as intelligent route planning and real-time pricing.

The above are ideas, drawn from a variety of literature, about what big data could be used for. In fact, what big data can do may extend to every aspect of life.

However, looking at reality, how far has the actual application of big data come?

I think that so far big data has achieved truly commercialized application in only one area: Internet marketing.

The other directions listed earlier have some initial applications, but they are basically still at the exploratory stage. For example, for outbreak forecasting and unsecured credit loans, the accuracy, precision, and scalability have yet to be proven.

I think the main reason for the gap between the actual application of big data and the envisioned blueprint is the problem of data sources.

You have to get the data before you can apply the data.

Therefore, data availability becomes an important dimension for evaluating big data applications in a specific industry.

Data availability can be measured along several dimensions: the standardization, openness, and concentration of the data.

At the same time, once the data has been acquired, its application can be measured by the potential value of the big data application, including efficiency improvement, cost reduction, or the creation of new models.

In addition, it can be measured by how replicable and scalable the big data application is, not only within an industry but also across industries.

Along these three dimensions, I have personally positioned the potential of big data applications in various industries. This positioning is very qualitative and rough, though; the specifics will need further exploration by more practitioners of big data applications.
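
As a rough illustration of how such a three-dimension positioning could be made a little less qualitative, here is a minimal Python sketch. Everything in it is an assumption for illustration: the 1 to 5 scale, the equal weighting, the class `IndustryAssessment`, and the placeholder entries `industry_a` and `industry_b` with made-up scores. None of it comes from real industry data.

```python
from dataclasses import dataclass


@dataclass
class IndustryAssessment:
    name: str
    # Data availability sub-dimensions, each scored 1 (low) to 5 (high).
    standardization: int
    openness: int
    concentration: int
    # The other two dimensions, scored on the same assumed scale.
    potential_value: int
    replicability: int

    def availability(self) -> float:
        # Assumption: the three availability sub-dimensions weigh equally.
        return (self.standardization + self.openness + self.concentration) / 3

    def position(self) -> tuple:
        # The three dimensions: availability, potential value, replicability.
        return (self.availability(), self.potential_value, self.replicability)


def rank(assessments):
    # Sort by the unweighted sum of the three dimensions, highest first.
    return sorted(assessments, key=lambda a: sum(a.position()), reverse=True)


# Placeholder entries with invented scores, purely to show the mechanics.
examples = [
    IndustryAssessment("industry_b", 2, 1, 3, 5, 2),
    IndustryAssessment("industry_a", 4, 4, 4, 3, 3),
]
ranked = [a.name for a in rank(examples)]
print(ranked)
```

A simple sum hides trade-offs, for example high potential value with very poor data availability, so in practice it may be more useful to look at the `position()` tuple itself than at a single ranking.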

For businesses that specialize in big data applications, how should they go about big data?

I think this can be developed along two dimensions. First of all, a key task is to accumulate data. Building on two resources, the company's own Internet data and big data technology, start from a few niche applications, for example first from the perspective of a single enterprise, and then extend to an industry or even a cross-industry perspective. Ship the first products from those niche applications; they will become the gateway to more data, as well as to broader applications of big data.

One more point: when a platform-type Internet company decides which enterprises' or industries' data to combine with for big data applications, it can apply some filter conditions. For example, does the application make use of the platform's attributes? And is the application replicable and scalable, so that it is not limited to a single enterprise but can at least be applied throughout the industry?

The above are my personal reflections on big data. I hope to discuss and study the practical application of big data with more friends.

Original link: http://www.36dsj.com/archives/12536
