Jeff Kubacki, chief information officer at Kroll, a risk management consulting firm, has set a goal of reducing the company's storage costs by 25% over the next three years. The company's stored data has already reached the petabyte (PB) range, and Kubacki plans to tackle the problem through a combination of tiered storage, changes to business processes, and new technologies, including cloud storage.
Although cloud storage is still in its infancy, its elasticity, transparent pricing, multiple storage locations, and the ability to pull data directly from storage make it look attractive. What remains unclear is how large amounts of data can be sent to the cloud over Ethernet.
"We're still talking to the vendors: if we try cloud storage, what might it bring us? We're still studying whether it's right for us," Kubacki said.
Kroll's IT architects will investigate how to migrate about 25% of the company's data to the cloud through its Internet "pipe." Kubacki explained that only 25% would be migrated because most of the company's data consists of legal documents that are too sensitive to store in the cloud. While the storage capacity of the cloud keeps expanding, it is limited by the connection to the cloud, so moving petabytes of data between the enterprise and the cloud becomes a major challenge.
Businesses will ask whether their pipes are big enough to migrate their stored data to the cloud, and the answer is usually no. "Latency is a major impediment to cloud storage adoption," says Adam Couture, a Gartner analyst. "So far, we've found that companies mostly limit their use to archiving, backup, and perhaps some collaboration."
But most cloud vendors claim to have a simple solution: physically shipping the data to the data center for the first full backup.
Rob Walters, general manager of the Dallas division of cloud hosting provider The Planet, said that hosting and transferring large amounts of data is relatively easy from a day-to-day, user-level perspective, but that migrating a 20TB to 25TB chunk of data at once would still "baffle" existing systems. "Our current network is not good at that. It is a weak point, and we are working on solving it now," Walters said.
For businesses, the initial full backup can be replicated to the cloud over a WAN or LAN link, but Couture warns that "the time the initial full backup takes depends on how much data you have on your servers, and it can last for weeks."
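To see why an initial full backup can run for weeks, a rough back-of-the-envelope estimate helps. The sketch below is only illustrative: the function names, the 2TB data set, the 10Mbit/sec link, and the idea of restricting uploads to a fixed number of hours per day are assumptions chosen for the example, and real transfers will be slower because of protocol overhead and shared links.

```python
import math

BITS_PER_BYTE = 8

def transfer_seconds(data_bytes, link_bits_per_sec, efficiency=1.0):
    """Seconds needed to push data_bytes over a link.

    efficiency is the usable fraction of the nominal link rate (assumed value).
    """
    if link_bits_per_sec <= 0 or not 0 < efficiency <= 1:
        raise ValueError("link rate must be positive and efficiency in (0, 1]")
    return data_bytes * BITS_PER_BYTE / (link_bits_per_sec * efficiency)

def calendar_days(data_bytes, link_bits_per_sec, hours_per_day=24.0, efficiency=1.0):
    """Whole days needed when the link is used only hours_per_day each day."""
    seconds = transfer_seconds(data_bytes, link_bits_per_sec, efficiency)
    return math.ceil(seconds / (hours_per_day * 3600))

# Hypothetical example: 2 TB (decimal) over a 10 Mbit/s link.
full_time = calendar_days(2e12, 10e6)                      # link busy around the clock
nights_only = calendar_days(2e12, 10e6, hours_per_day=10)  # 10-hour nightly window
print(full_time, nights_only)
```

Under these assumptions the upload takes 19 days with the link saturated around the clock, and 45 days if it is limited to a 10-hour window each night.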
Kevin Ellis, chief executive of Nuvolus, a private cloud storage service provider in Arvada, Colorado, said doctors' offices use the service to preserve sensitive medical data that cannot be copied and physically taken out of their offices. As a result, the company asks its healthcare customers to have an "appropriate Internet connection," usually up to 10Mbit/sec, to transfer backup data.
"Every doctor's office is different, and so are the upload times. We can see how long the uploads take," Ellis said. "You can upload in the evening. We try to make sure we don't affect the doctors' work in the office during the day."
Some cloud storage vendors also offer a private connection from the enterprise to one of the vendor's storage nodes. According to Nirvanix, a cloud storage vendor in San Diego, this approach is ideal for companies whose initial full backup is between 2TB and 75TB, or fewer than 750M files, and whose transfer is time-sensitive. It also suits one-time and ongoing data migrations that need high throughput and have latency requirements.
Another common approach is the "sneakernet": the data is copied at the customer's site onto disk, tape, or a device supplied by the cloud storage vendor, and then carried to the data center for the initial backup.
"Some of our customers have sold their storage arrays," said Jon Greaves, chief technology officer at Carpathia Hosting, a private cloud hosting firm in Virginia. "In those cases, the customer pulls the disks straight out of the machine once the mirroring is finished."
Nirvanix configures a storage server with dual Gigabit Ethernet interfaces for its customers to copy their data onto. Once the data has been copied, Nirvanix retrieves the server and migrates the data to the cloud.
Amazon Web Services supports the use of portable storage devices to move large amounts of data into and out of the cloud. The company transfers customer data directly to and from the devices over its high-speed internal network, bypassing the Internet.
Greaves says big companies move their data differently depending on the situation, sometimes over the Internet and sometimes by sneakernet.
Carpathia uses Parascale-based technology to build private clouds for its corporate clients. "It depends on how quickly they need to see the data, run it, and use it," Greaves explains. "If the customer is doing long-term archiving, we usually migrate the data step by step. If it's video files that they need right away, which typically run to hundreds of terabytes, we start looking for alternatives."
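The choice between the Internet and sneakernet comes down to comparing two elapsed times. The sketch below is a simplified model, not any vendor's actual procedure: the function names, the two-day shipping time, and the assumption that data is copied onto and off the transport device at a nominal 2Gbit/sec (two Gigabit Ethernet ports at line rate) are all hypothetical.

```python
SECONDS_PER_DAY = 86400

def network_days(data_tb, link_mbit_per_sec):
    """Days to send data_tb terabytes (decimal) over a WAN link at full rate."""
    bits = data_tb * 1e12 * 8
    return bits / (link_mbit_per_sec * 1e6) / SECONDS_PER_DAY

def sneakernet_days(data_tb, copy_gbit_per_sec=2.0, shipping_days=2.0):
    """Days to copy data onto a device, ship it, and copy it off again.

    copy_gbit_per_sec and shipping_days are assumed values for illustration.
    """
    bits = data_tb * 1e12 * 8
    one_copy = bits / (copy_gbit_per_sec * 1e9) / SECONDS_PER_DAY
    return 2 * one_copy + shipping_days  # copy on + transit + copy off

def faster_method(data_tb, link_mbit_per_sec):
    """Return whichever option finishes sooner under the assumptions above."""
    if network_days(data_tb, link_mbit_per_sec) <= sneakernet_days(data_tb):
        return "network"
    return "sneakernet"

# Hypothetical cases on a 100 Mbit/s link.
print(faster_method(0.1, 100))  # small data set
print(faster_method(20, 100))   # 20 TB chunk
```

With these numbers, 0.1TB is quicker over the wire, while 20TB would take about 18.5 days over a 100Mbit/sec link against under four days by courier.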
After the initial backup, the pressure on network bandwidth eases, because only incremental backups need to be made from then on.
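An incremental backup sends only what has changed since the previous run. As a minimal sketch of the idea, the hypothetical helper below selects files by modification time; real backup products often use block-level or hash-based change detection instead, so treat this as an illustration rather than how any vendor named here works.

```python
import os

def files_changed_since(root, last_backup_ts):
    """Return sorted paths under root modified after last_backup_ts.

    last_backup_ts is a Unix timestamp (seconds) of the previous backup.
    """
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Only files touched after the last backup need to be uploaded.
            if os.path.getmtime(path) > last_backup_ts:
                changed.append(path)
    return sorted(changed)

def total_bytes(paths):
    """Size of the incremental upload in bytes."""
    return sum(os.path.getsize(p) for p in paths)
```

Calling `total_bytes(files_changed_since(root, last_backup_ts))` gives the size of the next upload, which is normally a small fraction of the full data set.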
Walters says the cloud does not have infinite scalability or infinite capacity. Planning capacity, and always ensuring there is enough storage to meet users' needs, is the responsibility of the cloud storage vendor. "If someone wants to upload more than 10TB of data, you have to be prepared beforehand. This is work that takes careful planning," he said.
Storage vendors use sophisticated methods for capacity planning. Carpathia, for example, continues to grow its network traffic, raising it from 450Gbit/sec to 500Gbit/sec, and plans to use algorithms from the telecom industry to manage capacity changes.
"You have a T1 line, and you have to figure out how many call minutes you can squeeze out of that T1 line. It really is an overprovisioning problem," Greaves explains.
Carriers use a unit of measurement known as the erlang to help determine circuit load. One erlang is the traffic load that keeps a single circuit busy for one hour, that is, 3,600 call-seconds per hour on the same circuit. "We use the same approach on the cloud," Greaves said. "We can figure out that we're at around 1.2 and predict that at 2 we will be under capacity pressure, so we order more hardware when we're approaching 1.2."
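The erlang arithmetic itself is simple: divide the total busy time by the length of the observation period. The sketch below shows that calculation plus a threshold check; the function names and the threshold parameter are hypothetical, and the article does not say which resource Carpathia measures in erlangs or what trigger value it uses.

```python
def offered_load_erlangs(busy_seconds, period_seconds=3600.0):
    """Traffic load in erlangs: total busy time divided by the period."""
    if period_seconds <= 0:
        raise ValueError("period_seconds must be positive")
    return busy_seconds / period_seconds

def should_order_hardware(load_erlangs, order_threshold):
    """True once the measured load reaches the chosen ordering threshold.

    order_threshold is an assumed planning parameter, not a published figure.
    """
    return load_erlangs >= order_threshold

# One circuit busy for a full hour is exactly 1 erlang.
one_circuit = offered_load_erlangs(3600)
# Two circuits busy 36 minutes each in the same hour: 4,320 busy seconds.
two_circuits = offered_load_erlangs(2 * 36 * 60)
print(one_circuit, two_circuits)
```

Here two circuits that are each busy for 36 minutes of the hour add up to 1.2 erlangs, the kind of figure a planner would compare against an ordering threshold.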
Kroll has decided not to use cloud storage until 2010. "I never like being the guinea pig who takes the risk, but I don't mind being at the forefront," Kubacki said.
But he added that cloud storage would remain an attractive option next year. "I think one of the benefits of migrating to the cloud is the whole idea that it is more of an operating expense than a capital expense," Kubacki said. "Right now I have a large capital budget: I buy disks that depreciate year after year. So I'd be glad to see storing some data in the cloud make the company's profit and loss statement look better, because I'm not actually buying storage, I'm just renting it."