LSI President and CEO Abhi Talwalkar: Big data torrent triggers data center revolution


The Fifth China Cloud Computing Conference was held June 5-7, 2013 at the Beijing National Convention Center. Taking an international perspective, the conference examined global cloud computing trends and, from an application standpoint, explored topics such as cloud computing and big data, cloud computing and the mobile Internet, cloud security, and industry applications of cloud computing. In addition, the conference set up a cloud computing services exhibition area to exchange the latest international research results, showcase the achievements of China's cloud computing pilot cities, share development experience, and promote global cooperation in cloud computing innovation.

LSI President and CEO Abhi Talwalkar

On the second day of the conference, LSI President and CEO Abhi Talwalkar delivered a speech titled "Big Data Torrent Triggers a Data Center Revolution." He argued that big data is impacting and changing our world in many ways, and that we therefore need to innovate in tools, industry ecosystems, and infrastructure to better manage, analyze, and derive value from big data.

The following is a transcript of the speech:

Abhi Talwalkar: Good morning, everyone! I am very happy to be here at the cloud computing conference, and I want to thank the Cloud Computing Committee for inviting me to share my views on cloud computing and big data.

Our company provides cloud computing and networking solutions and has revenue of several billion dollars. We work with a wide variety of data technologies, from mobile devices and mobile network computing to caching techniques. I am very glad that this conference is devoted to big data, because big data will change our lives. I have heard many examples of big data already, and I will give you a few more to illustrate the point.

Let me start with the changes in data flows and in computing architecture, the big data work Silicon Valley is doing now, and the impact all of this will have on our lives.

Big data will certainly drive developments in hardware, software, and services, and our daily lives will be affected as well. This is the age of the data flood, and we use that term to describe the changes we are seeing in society. Society now generates huge amounts of data and huge data flows, and there is also a great deal of data convergence, for example in the global mobile network.

About 2 billion handsets were sold last year, and about 30% of them were smartphones. These smartphones require very sophisticated mobile services, which naturally generate a great deal of data. Mobile data still makes up only 15-25% of Internet traffic, but mobile devices are a big trend in China, where more people connect to the Internet with smartphones than with computers.

Video is another example: we use it more and more, and we often share and communicate through it. In the United States, YouTube receives roughly 100 uploads every minute, and China's Youku also carries a great deal of traffic. Then there are the commerce and social platforms. Alibaba's Taobao and Tmall platforms can complete 250 million transactions a day and processed close to 1 trillion yuan last year, more than Amazon and eBay combined. Weibo and WeChat are now used very widely and have become mainstream social networks, and Facebook has reached 1.1 billion registered users, generating enormous network traffic.

There are now about 17 billion connected devices, and by 2020 there will be about 50 billion. These connected devices will generate a great deal of data; smartphones, smart terminals, smart networks, and the smart city concept are all part of this data flood.

We have so much data; how should we handle it? Having captured so much data, we know that processing it has to be combined with technology, and with good technology we can do some very unusual things. In big data analytics, large amounts of data are captured in real time, predictions can be made from that data, and people can be evacuated in time to avoid hurricane damage. Last October the United States was hit by a very large hurricane and storm. Different sensors collected a great deal of data, such as wind speed and temperature readings, and satellites provided much more. All of that data was fed into computers for analysis, and every six hours the data and the forecast were updated. In this way we can draw important results from data analysis, which experts and government agencies used to predict when the hurricane would reach the United States. That predictive capability can save many lives.

Let me give you an example from the medical field. We all talk about the human genome, the blueprint of human identity. A human genome has about 3 billion base pairs. Over the past 50 years the United States has done a great deal of work on mapping the human genome, and China and other countries have also made great contributions. We now have a complete sequencing of the human genome; it took billions of dollars and 13 years, and that work is essentially complete. With big data processing capabilities we can sequence a genome quickly, and the cost is no longer billions of dollars but only a tiny fraction of that.

Once we can sequence and analyze a human genome, we can tell which genes are responsible for which human diseases. We can then go further and analyze genomes and genome sequences in depth, and the cost of doing so is relatively low. As the technology develops and our understanding of the human genome and its sequence deepens, we can reduce hospital visits and predict which diseases a given gene corresponds to; in this way big data can truly change people's lives.

Three main characteristics of big data

I would like to give you a few more specific examples. Video surveillance is a hot topic right now. China has 13 million surveillance cameras, and London alone has 5 million; all of that real-time video can be fed back for very powerful analysis, so we can monitor public conditions. Another example is so-called intelligent shopping: when you shop on Tmall or Taobao, a recommendation can pop up immediately, generated from your previous purchase records. I often fly to China on a Boeing 787; in fact, the 787 can also be fitted with cameras and sensors so that we can get data in real time, understand the aircraft's performance as it operates, and tailor the corresponding maintenance strategy. This is the convenience that data and data processing bring.

One more point: processing big data also means dealing with its characteristics. The first is variety: information of different types from different places is often stored in the same data center; Alibaba is an example. Some data describes consumers and consumer behavior, while other data is collected by sensors installed on aircraft. The data sources differ, and that is what we mean by different data types. The second characteristic is velocity: data is produced very quickly, and a huge amount can be generated in a very short time. Facebook, for example, can produce more than 3.9 million videos every few minutes; the speed of big data is hard to imagine. The third is volume: the quantity of data produced is enormous, much of it unstructured, and it is orders of magnitude more than we dealt with ten years ago; this is the effect of the Internet's popularity. I believe that by 2020 the data generated worldwide could reach 40 zettabytes.

The three characteristics of big data are variety, velocity, and volume. If you consider how these three relate, you can certainly create value: you can save lives, predict natural disasters so that people can evacuate quickly, help detect disease so that people can prevent it and take protective measures, and use video surveillance to keep citizens safe. These are tangible benefits.

And, of course, I believe big data can drive innovation in industry. Big data really does drive innovation, and it drives it in every field. Now I want to talk about innovation in terms of how chips and silicon play their part in big data.

Big data contains a great deal of value, and we need to tap into it. To extract that value, several problems deserve attention.

The first problem is data capture. Data comes from many sources: sensors, online transactions, consumer behavior, smartphones and other mobile devices. Data from different sources arrives in different formats, and we have to capture it efficiently. The second problem is holding, or storing, the data. As I said, the volume is very large; the data must be properly stored and protected, it must remain highly accessible, and all of it needs to be available in real time. The third and most important point is data analysis. Only through sound analysis can we extract more information from the data in real time, obtain valuable insights, and turn data into information.

Those are the three questions we have to consider. Many speakers have already touched on this, and it matters greatly to the big data industry, so I want to stress again that tools, open source, and infrastructure frameworks are all essential. Open source here includes both open source software and open source hardware; these three pillars are very important.

Open source is very important to cloud computing

You have been hearing about Hadoop for the past two days. Hadoop has become the new mainstream paradigm; the software development paradigms of more than ten years ago no longer apply. Those earlier paradigms were suited only to structured data, whereas Hadoop has become the mainstream for unstructured data. Hadoop provides an effective framework for processing unstructured data, especially distributed data, and it comes with tools that support processing and analysis. These tools matter, and they keep improving; I believe that as the tools improve, even more value can be mined from big data.
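To make the Hadoop paradigm concrete, here is a minimal sketch of the classic word-count job written as a Hadoop Streaming style mapper and reducer in Python. The file name and the local pipeline shown in the comments are illustrative assumptions, not anything described in the speech.

    #!/usr/bin/env python
    """Minimal MapReduce word count in the Hadoop Streaming style.

    Local simulation of the map -> shuffle/sort -> reduce pipeline (illustrative):
        cat input.txt | python wordcount.py map | sort | python wordcount.py reduce
    """
    import sys
    from itertools import groupby


    def mapper():
        # Map phase: emit one "word<TAB>1" record per token read from stdin.
        for line in sys.stdin:
            for word in line.split():
                print(f"{word.lower()}\t1")


    def reducer():
        # Reduce phase: records arrive sorted by key, so consecutive lines
        # with the same word can be summed with groupby.
        def key(line):
            return line.split("\t", 1)[0]

        for word, records in groupby(sys.stdin, key=key):
            total = sum(int(r.split("\t", 1)[1]) for r in records)
            print(f"{word}\t{total}")


    if __name__ == "__main__":
        reducer() if len(sys.argv) > 1 and sys.argv[1] == "reduce" else mapper()

On a real cluster the same two phases would be handed to the Hadoop Streaming jar as the mapper and reducer commands; the point is simply that the framework, not the application, handles distributing the work across machines.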

Another point is open source software and open source hardware. On the software side, which everyone has been discussing, the most important problem for cloud computing and big data is making efficient use of the cloud data center. That means good storage facilities and network architecture, and efficient access to these dynamic resources, because these are the cloud resources customers need to share. We also need data center management capabilities, and this is where the OpenStack concept comes in, giving the software the management capabilities it needs.

Virtualization matters too: network virtualization will let us dig deeper into the business value of the network at lower cost. There is also a lot to do in open source hardware, such as the open hardware initiative in the United States and a similar effort in China, both aimed mainly at standardizing data center hardware. Developers and users care a great deal about hardware standardization because it improves efficiency, increases compatibility, and reduces costs. A great deal of work has gone into open source hardware and software, and it is worth the effort.

The other point is the cloud. The concept of the cloud has led to a revolution in the computer industry. We believe the business value of the cloud will reach $20 billion in the next few years, and we believe the cloud industry as a whole is worth $3 trillion.

Cloud architectures have become very powerful. The top 20 Internet companies in the world, including Alibaba, Facebook, and Amazon, account for about 30% of the server market, so cloud services matter enormously to them.

Data Flow Architecture

Here I want to share a conceptual data flow architecture, concerned mainly with analyzing big data itself and moving the data around. Seen from the cloud's perspective, the cloud is made up of three parts. There are the terminals, which are mobile devices and mainly access services. There is the carrier, which may be a mobile network. And there is the data center, which hosts all the services. We believe these three elements of cloud services and architecture, together with big data, form the framework of the data flow.

Let me explain this cloud architecture, the data flow architecture, a little further. What we did ten years ago was localized, for example on a local PC or in a local data center. Today everything is about the data itself: moving it, securing it, getting it where it needs to go in real time, and processing it quickly. The architecture must now satisfy both data processing requirements and data usage requirements, and that is why we gave it a new name, the data flow architecture.

Consider the data flow itself. First you need to capture the data in these streams. It can come from many different devices, and it has to be acquired in some way and stored; the data centers that hold it are also very large, perhaps 100,000 servers or 1 million hard drives. Of course, some data sits on only a few data servers, but those servers are very powerful, matched to the complexity and volume of the data. We need to build the data flow architecture around the characteristics of the data, and that depends mainly on the analysis requirements, for example whether or not you need real-time analysis.

The first element is smart networks. We have to recognize that the more data we capture, the more we need to judge it: how important it is, and sometimes how valuable it is. On a network, the type of data can be judged from the traffic itself, which means we need more intelligent networks that can make real-time judgments about the data.

At the same time, we need to handle many data formats, and other companies are studying this problem as well. At LSI we have developed multi-core processors, and beyond the cores themselves we give the hardware processing capabilities that can judge the format of the data: whether it is video, whether it must be used in real time, or whether it is simply video chat traffic. In other words, the hardware is intelligent; it classifies the identified data into a number of types in real time and makes a preliminary intelligent judgment. In the coming years two large network companies will adopt our technology, and I believe that with intelligent hardware, network traffic can be reduced by 50%, saving a great deal of bandwidth.
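The speech describes this classification as happening in hardware. Purely to illustrate the idea in software, here is a tiny sketch that tags a flow using a crude content-type heuristic; the record fields, the ports used as hints, and the categories are assumptions made for the example, not LSI's actual classification scheme.

    from dataclasses import dataclass


    @dataclass
    class Flow:
        # Hypothetical flow record; a real classifier would inspect packet
        # headers and payload signatures in silicon, not fields like these.
        dst_port: int
        content_type: str     # e.g. an HTTP Content-Type value, if visible
        bytes_per_sec: float


    def classify(flow: Flow) -> str:
        """Rough, illustrative classification used to decide how to treat a flow."""
        if flow.content_type.startswith("video/") or flow.dst_port in (554, 1935):
            # 554 (RTSP) and 1935 (RTMP) are common streaming ports.
            return "video-stream"
        if flow.bytes_per_sec > 1_000_000:
            return "bulk-transfer"
        return "interactive"


    if __name__ == "__main__":
        sample = Flow(dst_port=1935, content_type="video/mp4", bytes_per_sec=2.5e6)
        print(classify(sample))   # -> video-stream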

Another point is resilience. The amount of data we handle is very large and sits in very large data centers with many hard drives, and hard drives fail, so the storage has to be very resilient. The headache for data centers today is keeping a certain number of drives in service without letting failures affect performance.

Today we use a special storage technology that lets us locate data by which server it lives on, whereas the traditional approach relies on the physical location of the storage. Now we can pool data distributed across different servers at the rack level, which improves the data's backup capability.
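The talk contrasts addressing data by its physical location with locating it logically across the servers in a rack. One common, generic technique for that kind of location-independent placement is consistent hashing; the sketch below illustrates the idea under that assumption and does not describe LSI's actual rack-scale storage.

    import bisect
    import hashlib


    class HashRing:
        """Map object keys to servers so lookups are independent of physical
        placement; adding or removing a server moves only a fraction of keys."""

        def __init__(self, servers, vnodes=64):
            # Each server gets many virtual points on the ring for balance.
            self._ring = sorted(
                (self._hash(f"{server}#{i}"), server)
                for server in servers
                for i in range(vnodes)
            )

        @staticmethod
        def _hash(key: str) -> int:
            return int(hashlib.md5(key.encode()).hexdigest(), 16)

        def server_for(self, key: str) -> str:
            # Walk clockwise from the key's hash to the next server point.
            idx = bisect.bisect(self._ring, (self._hash(key),)) % len(self._ring)
            return self._ring[idx][1]


    if __name__ == "__main__":
        ring = HashRing([f"rack1-node{i}" for i in range(4)])
        for obj in ("genome-sample-17", "video-chunk-42"):
            print(obj, "->", ring.server_for(obj))

Because keys map onto a ring of hash points rather than onto fixed disks, adding or removing a server relocates only a small share of the data, which is what makes recovering from drive failures tractable at this scale.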

The final challenge is efficiency. Big data is constrained by how quickly you can set up your data and get results, and that is tied to your IT budget: the bigger the budget, the more servers you can buy. But IT budgets are not unlimited, so this is a real challenge.

How do we deal with this challenge? We now use flash technology so that data achieves better results during application and analysis. Applications can be accelerated; speed can be increased by 15%. Using this technology makes the IT architecture cheaper while maintaining the same performance. We use flash to bridge the latency gap between the CPU and traditional hard disk storage, and we now use a great deal of flash across the infrastructure and very large data sets. Our focus now is on networks and architectures, mainly on capturing and profiling information and data. We are also very pleased to see that mobile networks will develop greatly along with cloud computing.
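As a rough illustration of the idea of putting a fast tier between the processor and slow disks, here is a minimal read-through LRU cache sketch in Python. The capacity, the stand-in backing store, and the toy access pattern are assumptions for the example; they do not describe LSI's flash products or their measured speedups.

    from collections import OrderedDict


    class ReadThroughCache:
        """Keep hot blocks in a small fast tier (standing in for flash) in front
        of a slow backing store (standing in for spinning disks)."""

        def __init__(self, backing_store, capacity=1024):
            self._store = backing_store      # callable: block_id -> bytes
            self._capacity = capacity
            self._cache = OrderedDict()      # block_id -> bytes, in LRU order
            self.hits = self.misses = 0

        def read(self, block_id):
            if block_id in self._cache:
                self._cache.move_to_end(block_id)   # mark as most recently used
                self.hits += 1
                return self._cache[block_id]
            self.misses += 1
            data = self._store(block_id)            # slow path: go to disk
            self._cache[block_id] = data
            if len(self._cache) > self._capacity:
                self._cache.popitem(last=False)     # evict least recently used
            return data


    if __name__ == "__main__":
        slow_disk = lambda block_id: b"x" * 4096    # stand-in for a disk read
        cache = ReadThroughCache(slow_disk, capacity=2)
        for block in (1, 2, 1, 3, 1):
            cache.read(block)
        print(f"hits={cache.hits} misses={cache.misses}")   # hits=2 misses=3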

Let me conclude by summarizing where innovation and society stand. In the 1960s and 1970s, it can be said, we began to become an innovation-driven society. Since then the pace of innovation, from Silicon Valley to semiconductor processors, has brought a high degree of integration and rapid economic development, all driven by IT. There have been great advances in personal computers, mobile networks, and mobile devices, and we are now entering the data center era, in which data is the new currency.

The important idea I have been describing is that big data is changing our society and the world, and we can derive more value from it. Silicon Valley has always been an important platform for making greater progress on big data. Our lives are changing every day: the way we consume and the way we shop online will become much safer over the next five years, and security in the places we live will improve greatly.

Computing architecture is shifting toward data flow architecture, which ensures fault tolerance and delivers the desired results at the right time. Silicon Valley will play a very important role in this.

Thank you for listening.
