Given that cloud computing is still a fuzzy concept for most of us, when someone describes their big data strategy as "storing all the data in a cloud service," you cannot easily tell whether that strategy is a visionary solution or simply a repetition of what they heard from experts at an industry conference.
The overlap between the big data and cloud computing paradigms is so extensive that you could claim your organization already runs cloud-based big data simply because it uses an existing, internally deployed Hadoop, NoSQL, or enterprise data warehouse environment. But it's important to remember that cloud computing is now understood more broadly: the term covers "private cloud" deployments as well as public cloud computing, SaaS, and multi-tenant hosting environments.
However, if you limit your working definition of "cloud computing" to public subscription services, you arrive at the core of the problem: you have to determine which big data applications are better suited to public cloud/SaaS deployments and which are more appropriate for internal deployments (such as hardware appliances or virtual server clusters that involve advanced optimizations).
In other words: when can big data managed by an external service provider be scalable, resilient, high-performance, cost-effective, reliable, and manageable? The following are some cases in which managing big data in the public cloud makes sense.
Enterprise applications are already hosted in cloud services: If your enterprise, like many others (especially small and medium-sized enterprises), already uses cloud-based applications from an external service provider, most of your transactional data sources are already in a public cloud. And if your enterprise has a deep history on that cloud platform, it may already have accumulated a large amount of data there. To the extent that the service provider or its partners offer value-added analytic services, such as loss analysis, marketing optimization, or offsite backup and archiving of customer data, it may make more sense to host the big data in the cloud than on hosts inside the enterprise.
High-volume external data sources require considerable preprocessing: For example, if you are monitoring customer sentiment based on social media data, you may not need to tie up the servers, storage devices, and bandwidth within your enterprise. This is an obvious case for using the social media filtering services offered by public cloud big data providers.
Application requirements exceed the big data processing capabilities of your enterprise's internal platforms: Suppose you have an on-premises big data platform dedicated to a single application (such as a dedicated Hadoop cluster that handles ETL for high-volume unstructured data sources). When a new application arrives that the current platform is not suited to (for example, multi-channel marketing, social media analytics, geospatial analytics, accessible archiving, or an elastic data science sandbox), adopting a public cloud may be the right solution, especially where on-demand services are more cost-effective. In fact, a public cloud solution might be the only viable option if you need to handle petabyte-scale, streaming, multi-structured data as quickly as possible.
Elastic provisioning of very large, short-lived sandboxes: If you have a short-lived data science project that requires an exploratory dataset (also known as a sandbox) an order of magnitude larger than usual, the public cloud may be your only viable or economical option. You can quickly bring cloud-based storage and processing capacity to bear on the project, then scale that capacity back when the project ends. I call this the "bubble set" deployment model, and it is specifically tailored to the cloud.
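The four scenarios above can be read as a rough checklist. The following is a minimal sketch of that reading, not a method from the original article: the `Workload` fields and the `public_cloud_reasons` function are hypothetical names introduced only for illustration, and each flag is a yes/no judgment you would have to make yourself.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Workload:
    """A big data application being evaluated (all fields are illustrative assumptions)."""
    name: str
    source_apps_already_in_cloud: bool    # scenario 1: enterprise apps already hosted externally
    needs_heavy_external_preprocessing: bool  # scenario 2: e.g. social media sentiment feeds
    exceeds_internal_platform: bool       # scenario 3: current on-premises platform is unsuited
    short_lived_large_sandbox: bool       # scenario 4: temporary, unusually large sandbox


def public_cloud_reasons(w: Workload) -> List[str]:
    """Return the scenarios from the article that apply to this workload."""
    reasons = []
    if w.source_apps_already_in_cloud:
        reasons.append("source applications are already hosted in the cloud")
    if w.needs_heavy_external_preprocessing:
        reasons.append("high-volume external data needs heavy preprocessing")
    if w.exceeds_internal_platform:
        reasons.append("requirements exceed the internal platform")
    if w.short_lived_large_sandbox:
        reasons.append("very large, short-lived sandbox")
    return reasons


if __name__ == "__main__":
    sentiment = Workload("social sentiment pilot", False, True, False, True)
    for reason in public_cloud_reasons(sentiment):
        print(f"{sentiment.name}: {reason}")
```

An empty list does not mean a workload must stay in-house. It only means none of the four scenarios described here applies, so the decision rests on other factors.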
If you are already doing any of these, then the strategic question about cloud-based big data is not where your project begins. Given the growing maturity, price/performance, scalability, flexibility, and manageability of cloud-based big data services, the question is where your project will end up. By the end of the decade, as more and more applications and data move to the public cloud, the idea of building and running your own big data deployment will seem as impractical as designing your own servers does today.