How to manage big data overload

Source: Internet
Author: User
Keywords: storage virtualization, big data overload, big data

Storage administrators are frustrated by growing complexity and the endless demand for storage capacity. Here are some ways to cope with the torrent of data.

In the past, only researchers and internet and social media giants such as Amazon, Twitter, Facebook and Shutterfly faced such problems. Now more and more companies are mining the big data they hold for valuable information that can give them a competitive advantage. Companies such as Wal-Mart, Campbell's Soup, Pfizer, Merck and Wawa are making ambitious plans for their big data.

Many companies are investing in big data analytics in order to respond to customers more quickly, track customer information more effectively, or bring new products to market faster.

"For any company in the internet age, if they don't do it, their competitors will," said Ashish Nadkarni, a storage analyst at market research firm IDC.

Organizations of all kinds are being flooded with data from both internal and external sources. Much of it arrives in real time, and much of it becomes obsolete within minutes, hours, or days.

Market research firm Aberdeen Group says the resulting growth in storage demand is particularly hard on large companies, whose storage requirements for structured and unstructured data grew by an average of 44% from 2010 to 2011. For companies of all sizes, data storage requirements double every 2.5 years. Furthermore, optimizing the storage of video, spreadsheets, formatted databases and purely unstructured data requires different tools for each.
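To put the two growth figures above on the same scale, the following minimal sketch converts between an annual growth rate and a doubling time. It assumes constant compound growth, which the Aberdeen figures do not state explicitly, and the function names are illustrative. Under that assumption, 44% annual growth corresponds to doubling roughly every 1.9 years, while doubling every 2.5 years corresponds to roughly 32% annual growth.

```python
import math


def doubling_time_years(annual_growth_rate: float) -> float:
    """Years needed for capacity to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_growth_rate)


def annual_growth_for_doubling(years_to_double: float) -> float:
    """Constant annual growth rate that doubles capacity in the given time."""
    return 2 ** (1 / years_to_double) - 1


def projected_capacity(current_tb: float, annual_growth_rate: float, years: float) -> float:
    """Capacity after `years` of compound growth, in the same unit as `current_tb`."""
    return current_tb * (1 + annual_growth_rate) ** years


if __name__ == "__main__":
    print(f"{doubling_time_years(0.44):.2f} years to double at 44% per year")
    print(f"{annual_growth_for_doubling(2.5):.1%} per year to double in 2.5 years")
```

A planner could use `projected_capacity` with the company's own current capacity and growth rate to estimate how much storage to budget for.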

Aberdeen Group virtualization and storage analyst Dick Csaplar said: "It is a challenge to keep storage costs from growing as storage requirements grow." According to Csaplar, technologies that can help mainstream big data users avoid this vicious cycle include storage virtualization, deduplication and storage tiering. For heavy users of big data such as researchers, social media sites and simulation developers, object-based storage and relational database storage are good choices.

Systems designed to store petabytes (or more) of data in an accessible format are more complex than everyday storage platforms. Here are some suggestions from experts for managing and storing big data.

What type of data are you analyzing?

The type of storage you need depends on the type and amount of data you are analyzing. All data has a useful lifetime. For example, a stock quote matters only for a minute or two, until the price changes. A baseball score is of interest for 24 hours or until the next game. Data of this kind should be kept in primary storage while it is most needed and then moved to less expensive storage. Years of experience have shown that data kept for long periods does not usually need to sit on easily accessible primary drives.
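The age-based placement described above can be sketched as a simple rule: data stays in primary storage while it is within its useful lifetime and moves to cheaper storage afterwards. This is only an illustration; the `Record` class, the tier names and the lifetimes are assumptions chosen to mirror the stock quote and baseball score examples.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Record:
    name: str
    created: datetime
    hot_period: timedelta  # how long the data is needed in primary storage


def choose_tier(record: Record, now: datetime) -> str:
    """Return "primary" while the record is still hot, otherwise "archive"."""
    age = now - record.created
    return "primary" if age <= record.hot_period else "archive"


if __name__ == "__main__":
    now = datetime(2012, 6, 1, 12, 0)
    quote = Record("stock quote", now - timedelta(minutes=5), timedelta(minutes=2))
    score = Record("baseball score", now - timedelta(hours=3), timedelta(hours=24))
    for record in (quote, score):
        print(record.name, "->", choose_tier(record, now))
```

In practice the lifetime would come from a retention policy per data type rather than being set on each record by hand.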

How much storage capacity do you actually need?

The amount and type of storage you need for big data depends on how much data you have to store and how long that data remains useful.

Three kinds of data are involved in big data analytics. The first is streaming data. "Data can stream in from multiple sources every second, and your time slice may be only minutes before the data is out of date," Nadkarni said. Such data includes weather, traffic, trends on social networks and tweets about global events.

Big data also includes dormant data (data at rest) and data that the company generates and controls for its own use.

Streaming data requires fast capture and analysis capabilities. "Once you analyze it, you don't need it anymore," Nadkarni said. "But dormant data, or data that is controlled by the company, you should store."
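The "analyze it, then discard it" handling of streaming data can be illustrated with a sliding time window that keeps only recent events in memory. This is a minimal sketch, not a production stream processor: the `SlidingWindow` class is hypothetical, timestamps are plain seconds, and events are assumed to arrive in time order.

```python
from collections import deque
from typing import Optional


class SlidingWindow:
    """Keeps only the events that fall inside the most recent time slice."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self._events = deque()  # (timestamp, value) pairs, oldest first

    def add(self, timestamp: float, value: float) -> None:
        """Record a new event and drop everything older than the window."""
        self._events.append((timestamp, value))
        cutoff = timestamp - self.window_seconds
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()

    def __len__(self) -> int:
        return len(self._events)

    def average(self) -> Optional[float]:
        """Average of the values currently in the window, or None if empty."""
        if not self._events:
            return None
        return sum(value for _, value in self._events) / len(self._events)


if __name__ == "__main__":
    window = SlidingWindow(window_seconds=60)
    window.add(0, 10.0)
    window.add(30, 20.0)
    window.add(90, 30.0)  # the event at t=0 is now out of date and is dropped
    print(len(window), window.average())
```

Only the analysis result (here, the average) would need to be stored long term; the raw events age out of the window on their own.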

What kind of storage tools are more appropriate?

For companies that are just getting started with big data storage and analytics, industry watchers recommend storage virtualization to bring all storage under one umbrella, data compression technology to reduce the volume of stored data, and tiered storage to ensure that the most valuable data is kept on the most accessible systems.
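One common way to reduce the volume of stored data is deduplication, in which identical blocks are stored only once. The sketch below assumes fixed-size blocks identified by their SHA-256 digest; the `DedupStore` class and its block size are illustrative and do not describe any particular vendor's product.

```python
import hashlib


class DedupStore:
    """Toy block store that keeps each distinct block only once."""

    def __init__(self, block_size: int = 4096):
        self.block_size = block_size
        self.blocks = {}  # digest -> block bytes
        self.files = {}   # file name -> ordered list of block digests

    def put(self, name: str, data: bytes) -> None:
        """Split data into fixed-size blocks and store only the new ones."""
        digests = []
        for start in range(0, len(data), self.block_size):
            block = data[start:start + self.block_size]
            digest = hashlib.sha256(block).hexdigest()
            self.blocks.setdefault(digest, block)
            digests.append(digest)
        self.files[name] = digests

    def get(self, name: str) -> bytes:
        """Reassemble a file from its block digests."""
        return b"".join(self.blocks[digest] for digest in self.files[name])

    def stored_bytes(self) -> int:
        """Physical bytes held after deduplication."""
        return sum(len(block) for block in self.blocks.values())


if __name__ == "__main__":
    store = DedupStore(block_size=4)
    store.put("a", b"AAAABBBBAAAA")
    store.put("b", b"BBBBCCCC")
    print(store.stored_bytes())  # 20 logical bytes are held in 12 physical bytes
```

The savings depend entirely on how repetitive the data is; data with no repeated blocks gains nothing from this approach.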

Storage virtualization provides a software abstraction layer that hides the physical devices from users and allows all devices to be managed as a single pool. Although server virtualization has become a mature part of today's IT infrastructure, storage virtualization is still not widely adopted.
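The idea of an abstraction layer over a single pool can be sketched as follows: callers ask the pool for a volume of a given size and never choose a physical device themselves. The `Device` and `StoragePool` classes and the placement rule (pick the device with the most free space) are assumptions made for illustration; real products can also span a volume across several devices, which this sketch does not do.

```python
from dataclasses import dataclass


@dataclass
class Device:
    name: str
    capacity_gb: int
    used_gb: int = 0

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - self.used_gb


class StoragePool:
    """Presents several physical devices as one pool of capacity."""

    def __init__(self, devices):
        self.devices = list(devices)
        self.volumes = {}  # volume name -> (device name, size in GB)

    @property
    def total_free_gb(self) -> int:
        return sum(device.free_gb for device in self.devices)

    def allocate(self, volume: str, size_gb: int) -> None:
        """Place a volume on the device with the most free space."""
        device = max(self.devices, key=lambda d: d.free_gb)
        if device.free_gb < size_gb:
            raise ValueError(f"no single device has {size_gb} GB free")
        device.used_gb += size_gb
        self.volumes[volume] = (device.name, size_gb)


if __name__ == "__main__":
    pool = StoragePool([Device("array-1", 100), Device("array-2", 50)])
    pool.allocate("logs", 60)
    pool.allocate("db", 45)
    print(pool.volumes, pool.total_free_gb)
```

The caller only ever sees volume names and pool-wide free capacity, which is the point of managing all devices through one layer.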

In February 2012, Aberdeen surveyed 106 large companies. Only 20% of respondents said they had a single storage management application. On average, respondents used 3 management applications for 3.2 storage devices.

However, many storage vendors are unwilling to let their equipment be managed by other manufacturers' products. "Storage virtualization is very complex and extremely time-consuming," Csaplar said. It is therefore not as widely adopted as server virtualization. "Instead, many storage administrators are looking at cloud solutions for third- or fourth-tier storage, because the cloud makes it easier to move data between different infrastructures while reducing storage costs." He added: "Many companies have done so and seen good results, but the results do not always match expectations."

Csaplar expects the use of cloud storage and other cloud-based computing resources to grow in the near future as network connectivity improves, costs drop, and the encryption and decryption of data in transit improve. "With the cloud, you can pay monthly bills from the operating budget without needing a separate capital budget," he said.

(Responsible editor: The good of the Legacy)
