Five trends that will reshape big data technologies over the next five years

Source: Internet
Author: User

Let's not dwell on how much data a disk can hold or whether a company will use Hadoop. The real questions about big data are how business users will use Hadoop, how far our systems can go down the path of intelligence, and how we are going to make sure it all stays under control.

Over the past few years, big data has come a long way, from a hopeful buzzword to a headache that people love to hate, and the focus has shifted from sheer volume to variety and speed. "Big data" and its related technologies have been through intense attention, close scrutiny and reinvention, and the actual results may differ greatly from what we expected. Today, however, we are standing at an important turning point, and the controversies that preceded it have led to some definitive conclusions.

Now that automation and intelligence have become the new direction of work worldwide, the trend is to streamline data mining and build intelligent features into everything from mobile applications to transport systems. The "big" in big data is not the ultimate goal; the new processing models that have emerged aim to turn ever-larger volumes of data into intelligent results. Classification is not the ultimate goal either; its value lies in helping us quantify data at large scale while understanding the world around us more deeply.

Against this backdrop, we will dig into the details at our Businessesflat-out data conference, which opens in New York on the 19th of this month and runs for a week. The world's technology giants, well-known enterprises and some of the smartest emerging companies will send representatives to share their insights. They will explore topics related to big data, from the fight against trafficking to the future direction of Hadoop and even the cutting edge of artificial intelligence.

Below are the five trends I have been following, which may help you anticipate the topics the speakers will discuss and the direction of their presentations. If you are interested in attending the conference, I hope this article proves useful.

1. Hadoop is steadily becoming a real platform

At its core, Apache Hadoop may still be just a distributed file system, and MapReduce will continue to serve as an execution framework, but Hadoop is not standing still. Thanks to the versatility of YARN, a Hadoop cluster can now run any number of different execution frameworks for any number of different workloads, all sharing the same underlying storage infrastructure. For example, a MapReduce cluster used for ETL jobs can now also serve as a Spark cluster for machine learning, a Storm cluster for stream processing, and a Tez cluster for interactive SQL.

In essence, Hadoop has shifted from a task-specific utility to a real platform that supports a variety of applications. Early adopters such as Airbnb and Twitter have gained a competitive advantage from this new way of using it. Hadoop vendors such as Cloudera, Hortonworks and MapR are adding new features to their products and, in some cases, supporting the new frameworks that mainstream Hadoop users need. Emerging companies such as Continuuity, Mortar Data and WibiData are accelerating this evolution by simplifying big data applications, while also open-sourcing parts of their technology to give more developers tools to work with.

Of course, developers are not the only ones affected by Hadoop's transformation into a platform; many software vendors feel the tide turning as well. Traditional data warehouse, database and even statistical software vendors must accept the fact that Hadoop can now help their customers store more data at lower cost while analyzing it in many different ways.

2. Artificial intelligence is on the rise

We have the computing power, we have the data, and we have the algorithms, so we now have the technical basis for building artificial intelligence. Please don't get me wrong: artificial intelligence is not as scary as it is in science fiction, and it will not really replace humans, but the technology is becoming a reality. Thanks to the continuous improvement of machine learning, we already have smartphones that recognize voice commands, media services that predict user preferences, software that identifies relationships among billions of data points, and applications that are good at uncovering hidden value.

IBM's Watson system is nearly able to provide chefs with an accurate list of recipe ingredients.

Looking ahead, deep learning will help our AI systems become more practical and more powerful. Given complex datasets, these models can extract and identify patterns that could not be captured by explicit programming. Without supervision, deep learning projects have already learned what specific objects look like, how to map vocabulary between different languages, and even how to play video games. Almost overnight, many tasks that once seemed impossible now appear to have workable solutions, such as labeling content to make it searchable, or predicting with excellent accuracy the words and expressions a user will enter next.

Applied to new kinds of content in new areas, these approaches are likely to deliver even greater value. What characterizes clusters of a particular type of cancer cell? Can we help nurses understand information that only doctors can access today? Which hard-to-measure factors might reveal the causes of teenage suicide? How do we bring self-driving cars and unmanned aircraft into commercial use? Artificial intelligence is certainly not a savior, but it does show us bright and vast possibilities.

3. Bringing analytics to the people

Compared with building truly demanding infrastructure, standardizing data analysis and making it easy to use may not look like a great achievement, but this trend is still likely to bring significant changes to society. Simply giving the general public a new way to examine the data around them opens the door to countless possibilities in our lives.

Yesterday, for example, I built a network graph of my iTunes library using free software, and compared some of the words Snowden used in a recent interview with those used by the director of the NSA, Keith Alexander. I was not using data science or deep learning techniques, but I was still able to carry out simple analysis tasks and then look at the interesting data I found. Before that, I had mapped my Twitter followers, analyzed the headlines published by authors on the Gigaom website, and even tallied my food intake and exercise intensity. Perhaps encouraging young people to look at and analyze their own data in interesting ways can help motivate data technologists to bring these tools further into everyday life. Who is to say?
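The kind of simple word comparison described above needs no data science toolkit. The following is only a minimal sketch in Python, not the tool the author used: the two transcripts are made-up placeholder strings rather than real quotations, and the helper names `word_counts` and `compare_usage` are hypothetical.

```python
import re
from collections import Counter


def word_counts(text):
    """Lowercase the text, pull out runs of letters/apostrophes, and count them."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def compare_usage(text_a, text_b, words):
    """Return {word: (count_in_a, count_in_b)} for each word of interest."""
    counts_a = word_counts(text_a)
    counts_b = word_counts(text_b)
    # Counter returns 0 for words it has never seen, so no KeyError here.
    return {w: (counts_a[w], counts_b[w]) for w in words}


# Placeholder transcripts (assumptions for illustration, not real interviews).
speaker_a = "Privacy matters. The public deserves privacy and oversight."
speaker_b = "Security matters. Oversight keeps security programs lawful."

result = compare_usage(speaker_a, speaker_b, ["privacy", "security", "oversight"])
for word, (a, b) in result.items():
    print(f"{word:10s} speaker A: {a}   speaker B: {b}")
```

Swapping the placeholder strings for the full text of two real interviews would give a rough side-by-side view of which terms each speaker leans on.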

As the tools available to ordinary people grow more sophisticated and the volume of data we collect keeps growing (including data from fitness devices, connected cars and the Internet of Things), this kind of quantified self-analysis will become increasingly important. For all sorts of purposes, we are gradually becoming an important part of the data-input and algorithm-output process. Our personal data will have all sorts of ramifications, from the ads we see to the job offers we receive, so it matters that each user can learn at least a little of the information that is available to businesses, organizations and government departments.

4. Cloud Computing

As I said three years ago, the paths of cloud computing and big data were bound to converge and collide, and that prediction has come true, only with a wider impact than I expected. In fact, the biggest impact of this convergence has little to do with running Hadoop, business intelligence suites or other analytics software as a service. True, those offerings have made it easier for start-ups and mature companies to move new workloads into the cloud, but for me the biggest implication of the cloud is that it democratizes computer science that used to be hard.

As I have already stressed, some of these technologies are already being delivered "as a service" (mainly through APIs), and that camp is still growing. If you are a developer and want to learn how to use Hadoop and Elastic MapReduce, there are already options available. If you want to use services such as IBM's Watson or the MindMeld APIs to layer someone else's AI algorithms over your own data, there are plenty of options too. With the support of companies from Google and Pinterest to Netflix, most of these technologies will end up embedded in the services we use.

If these solutions really work and can bring real intelligence to developers ("intelligent" here does not mean generic recommendations, which are more an unavoidable plague than an advantage), even mundane tasks can deliver better results for consumers than they expect. I suspect many people looking at a specific item on their grocery list would also like to know what the benefits of that ingredient are, what the alternatives are if it is temporarily out of stock, or where to buy something similar at a lower price. Backed by the processing power and data capacity of smartphones and other computing devices, well-designed applications can turn the signals we pick up from AT&T cell towers into actual revenue.

5. Laws and regulations

Finally, the legal system will also shape the development of big data; how large the effect is depends on your point of view. For now, judges, lawmakers, regulators and even the president are trying to figure out what this huge collection of data really means and to sketch out some sort of order. Of course, it is not easy to find solid footing in this turbulence, and it is even harder to exploit every competitive advantage along the way.

The hardest regulatory problem is how to properly protect consumer privacy. This information has huge potential to improve the consumer experience, but it also carries a great risk of violating personal privacy. In addition, a lot of advertising money has started pouring into this emerging area. We want to buy groceries or new clothes at the best price, and we want to be able to take part in a 99-dollar DNA mapping project. But we also need to make sure that the potentially sensitive information we provide is not leaked to others or shown where it should not be, such as in a scrolling advertisement on a public computer.

This is a huge challenge for the lawmakers and other practitioners who draft the legal frameworks, regulations and case law meant to ensure that consumers avoid privacy breaches while still gaining legitimate benefits. Frankly, I am not sure they can arrive at a workable solution without understanding big data technology and where it is heading, and I do not believe we will be satisfied with the result.

Of course, we don't want companies like Facebook, Google and Geico analyzing all of our data, but neither do we want to go back to the miserable years when websites were clumsily designed, taxis kept us waiting, work was extremely inefficient and life was not personalized.
