SDN can turn big data into information assets

Source: Internet
Author: User
Over the past few years, companies have recognized that valuable information is hidden in big data, which has driven vendors to develop technologies (such as Hadoop, MapReduce, Dryad, Spark, and HBase) that turn big data into information assets efficiently. The advent of software-defined networking (SDN) will help this process along.


Most of the data that makes up big data is unstructured. Structured data can be handled with traditional database schemas, but unstructured data is much harder to deal with. Take video storage as an example. The file type, file size, and source IP address of a video are structured data, but the video content itself is unstructured and has no fixed-length fields. Most of the value gained from big data analysis now comes from the ability to search and query unstructured data, for example by using a facial recognition algorithm to find one person among thousands of people in video footage.


To achieve this search capability, we need to run parallel computations across clusters of thousands of servers connected by high-speed Ethernet. Mining intelligence from big data involves three steps: (1) split the data across multiple server nodes, (2) analyze each block of data in parallel, and (3) merge the results.


These operations are repeated until the entire dataset has been analyzed.
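The three steps above can be sketched in a few lines of Python. This is only an illustration, assuming a word count as the analysis and a thread pool standing in for the server nodes; the function names `split`, `analyze`, `merge`, and `run_job` are hypothetical and do not come from Hadoop or any other framework.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split(records, num_nodes):
    """Step 1: divide the records into one block per (simulated) node."""
    return [records[i::num_nodes] for i in range(num_nodes)]


def analyze(block):
    """Step 2: analyze one block on its own (here: count words)."""
    counts = Counter()
    for line in block:
        counts.update(line.split())
    return counts


def merge(partials):
    """Step 3: consolidate the per-block results into a single result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total


def run_job(records, num_nodes=4):
    """Run split -> parallel analyze -> merge over the records."""
    blocks = split(records, num_nodes)
    # Threads stand in for server nodes; a real cluster would ship each
    # block over the network, which is where the traffic comes from.
    with ThreadPoolExecutor(max_workers=num_nodes) as pool:
        partials = list(pool.map(analyze, blocks))
    return merge(partials)


if __name__ == "__main__":
    logs = ["error disk full", "ok", "error net down", "ok ok"]
    print(run_job(logs).most_common(2))
```

In a real deployment both the split and the merge move data between machines, so every round of this loop crosses the network twice.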


Because of the split-merge nature of these parallel computations, big data analysis can place a heavy load on the underlying network. Even with the fastest servers in the world, data can only be processed as fast as the network moves it between servers. For example, a study of Facebook's clusters found that data transfers between successive stages accounted for 33% of total running time, and in many cases the communication phase accounted for more than 50% of it.


Removing this bottleneck can speed up big data analysis significantly, which has two implications: (1) better cluster utilization lowers total cost of ownership (TCO) for cloud providers that manage the infrastructure, and (2) faster job completion gives customers who lease that infrastructure analysis results in real time.
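To get a rough sense of what is at stake, the sketch below estimates how much faster a job finishes when only its communication phase is accelerated. It assumes, purely for illustration, that computation and communication do not overlap, so the job time is simply their sum; `job_speedup` and its parameters are hypothetical names, and the 33% and 50% inputs are the communication shares quoted in the text.

```python
def job_speedup(comm_fraction, network_speedup):
    """Overall job speed-up when only the communication phase gets faster.

    comm_fraction:   share of running time spent moving data (0.0 to 1.0)
    network_speedup: factor by which data transfers are accelerated (> 0)

    Assumption: compute and communication phases run back to back.
    """
    if not 0.0 <= comm_fraction <= 1.0:
        raise ValueError("comm_fraction must be between 0 and 1")
    if network_speedup <= 0:
        raise ValueError("network_speedup must be positive")
    # Compute time is unchanged; transfer time shrinks by network_speedup.
    remaining = (1.0 - comm_fraction) + comm_fraction / network_speedup
    return 1.0 / remaining


if __name__ == "__main__":
    for fraction in (0.33, 0.50):
        gain = job_speedup(fraction, network_speedup=2.0)
        print(f"communication share {fraction:.0%}: job runs {gain:.2f}x faster")
```

The larger the share of time a job spends in data transfer, the more a faster network pays off, which is why communication-heavy jobs gain the most.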


What we need is an intelligent network that adapts to the data transfer requirements of the split and merge phases at each stage of the computation, improving not only speed but also utilization.

The role of SDN


SDN has great potential for building this intelligent, adaptive network for big data analysis. Because it separates the control plane from the data plane, SDN provides a well-defined programming interface through which software intelligence can program the network. The resulting networks are highly customizable, scalable, and flexible enough to meet the needs of big data.
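The separation of the two planes can be pictured with a toy model. The sketch below is not the OpenFlow API or any real controller library; `Switch`, `Controller`, `install_rule`, and `program_route` are hypothetical names chosen for illustration. The switch only looks up rules (data plane), while the controller decides what those rules are (control plane).

```python
class Switch:
    """Data plane: forwards traffic purely by looking up installed rules."""

    def __init__(self, name):
        self.name = name
        self.flow_table = {}  # destination address -> output port

    def install_rule(self, dst, out_port):
        self.flow_table[dst] = out_port

    def forward(self, dst):
        # None means "no rule yet": the switch has no logic of its own
        # and would have to ask the controller what to do.
        return self.flow_table.get(dst)


class Controller:
    """Control plane: decides the rules and pushes them to the switches."""

    def __init__(self, switches):
        self.switches = {switch.name: switch for switch in switches}

    def program_route(self, dst, hops):
        """Install a route; hops is a list of (switch_name, out_port)."""
        for switch_name, out_port in hops:
            self.switches[switch_name].install_rule(dst, out_port)


if __name__ == "__main__":
    s1, s2 = Switch("s1"), Switch("s2")
    controller = Controller([s1, s2])
    controller.program_route("10.0.0.2", [("s1", 2), ("s2", 1)])
    print(s1.forward("10.0.0.2"), s2.forward("10.0.0.2"))
```

Because all decisions live in the controller, changing how traffic flows means changing software in one place rather than reconfiguring each switch by hand.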


SDN can configure the network on demand, giving it the right size and shape for the compute virtual machines to communicate with one another. This directly addresses the biggest challenge facing big data as a massively parallel application: slow processing. Processing is slow because most compute virtual machines in a big data application spend much of their time waiting for large amounts of data to arrive during scatter-gather operations. With SDN, the network can create secure paths and scale capacity on demand during those scatter-gather operations, which significantly reduces waiting time and therefore shortens overall processing time.


Software intelligence of this kind, which understands what the application needs from the network, can serve big data applications with great precision and efficiency. There are two main reasons: (1) the computation and communication patterns are well defined, as in Hadoop's split-merge or MapReduce paradigm, and (2) the centralized management structure lets us draw on application-level information from components such as the Hadoop scheduler or the HBase master.


By using an SDN controller, which has a global view of the underlying network (its state, utilization, and so on), this software intelligence can translate application requirements accurately into network programming.
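One way to picture how a global view helps: given the utilization of every link, a controller can steer a bulk transfer onto the path whose busiest link is least loaded. The sketch below is a minimal illustration under that assumption; `pick_path`, the switch names, and the utilization figures are all hypothetical, and a real controller would weigh many more factors.

```python
def pick_path(candidate_paths, link_utilization):
    """Return the candidate path whose busiest link is least utilized.

    candidate_paths:  list of paths, each a list of switch names
    link_utilization: maps frozenset({a, b}) -> utilization of link a-b
    """
    def bottleneck(path):
        # Pair each switch with its successor to enumerate the links.
        links = zip(path, path[1:])
        return max(link_utilization[frozenset(link)] for link in links)

    return min(candidate_paths, key=bottleneck)


if __name__ == "__main__":
    # Made-up utilization figures standing in for the controller's view.
    utilization = {
        frozenset(("s1", "s2")): 0.9,
        frozenset(("s2", "s4")): 0.2,
        frozenset(("s1", "s3")): 0.4,
        frozenset(("s3", "s4")): 0.5,
    }
    paths = [["s1", "s2", "s4"], ["s1", "s3", "s4"]]
    print(pick_path(paths, utilization))
```

A switch that sees only its own ports cannot make this choice, because the congested link may be several hops away; the decision depends on the network-wide view.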


SDN also provides additional functionality to assist in managing, consolidating, and analyzing big data. Newer SDN-oriented technologies, including the OpenFlow protocol and the OpenStack cloud platform, aim to make network management simpler, more flexible, and more automated. OpenStack reduces the manual effort needed to set up and configure network elements, while OpenFlow helps automate the network, giving it the flexibility to support data center automation, BYOD, security, and application acceleration.


In addition, SDN plays a key role in building network infrastructure for big data. It simplifies the management of thousands of switches and promotes interoperability among vendors, laying the groundwork for faster network build-out and application development. OpenFlow makes this interoperability possible, helping businesses avoid being locked into proprietary solutions as they work to turn big data into information assets.


As the influence of big data grows and awareness of its potential spreads, companies must make sure their networks can scale to meet these emerging needs if they are to succeed in the long term. Clearly, a successful solution will take advantage of two key elements: the patterns found in big data applications and the network programmability that SDN provides. From this perspective, SDN will play an important role in helping networks adapt faster and scale further, and in accelerating the pace of knowledge and innovation.