The open source big data framework Apache Hadoop has become the de facto standard for big data processing, and it is almost synonymous with big data itself, although that view is somewhat biased.
According to Gartner, the current market for the Hadoop ecosystem is around $77 million and is expected to grow rapidly to $813 million by 2016.
But swimming in the fast-expanding blue ocean of Hadoop is not easy. Big data infrastructure products are hard to build and hard to sell, and this is especially true of infrastructure tools such as Hadoop, NoSQL databases, and stream processing systems. Customers need a lot of training and education, and paying customers need extensive support and timely follow-up in product development. Dealing with enterprise customers is often not a strength of a startup team. In addition, big data infrastructure startups usually need larger amounts of venture capital.
Despite the difficulties, Hadoop startups are springing up. Beyond the already successful Hadoop startups Cloudera, Datameer, DataStax, and MapR, CIO magazine recently ranked the ten most noteworthy Hadoop startups of 2014. Understanding these companies' products and business models is of great value to enterprise data technology entrepreneurs and big data application users:
I. Platfora
Business: A big data analytics solution that converts raw data in Hadoop into interactive, in-memory business intelligence.
Introduction: Founded in 2011; has raised $65 million so far.
Reason for inclusion: Platfora's goal is to simplify the complexity of Hadoop and drive its adoption in the enterprise market. Its approach is to streamline data collection and analysis, automatically converting raw data in Hadoop into interactive business intelligence without the need for ETL or a data warehouse. (Reference reading: Hadoop is just poor ETL)
II. Alpine Data Labs
Business: Provides a Hadoop-based data analytics platform.
Introduction: Founded in 2010; has raised a cumulative $23.5 million so far.
Reason for inclusion: Complex advanced analytics and machine learning applications often require expertise in scripting and code development, which further raises the technical barrier for data scientists. In practice, big data executives and IT managers have neither the time nor the interest to learn programming or the intricacies of Hadoop. Alpine Data Labs dramatically lowers the barrier to predictive analytics by delivering it as a SaaS service.
III. Altiscale
Business: Provides Hadoop as a Service (HaaS).
Introduction: Founded in March 2012; has raised $12 million so far.
Reason for inclusion: Big data is creating a talent shortage, and delivering Hadoop-related services through cloud computing is undoubtedly a shortcut to making Hadoop widely accessible. According to TechNavio estimates, the HaaS market will reach $19 billion by 2016, which is a very large pie. But the HaaS market is becoming increasingly competitive. It already includes heavyweight players such as Amazon EMR, Microsoft's Hadoop on Azure, and Rackspace's Hortonworks cloud service, and Altiscale also faces direct competition from Hortonworks, Cloudera, Mortar Data, Qubole, and Xplenty.
IV. Trifacta
Business: Provides a platform to help users transform complex raw data into clean, structured formats for analysis.
Introduction: Founded in 2012; has raised $16.3 million so far.
Reason for inclusion: A major bottleneck between big data platforms and analysis tools is that data analysts must spend a great deal of time and effort transforming data, and business data analysts often lack the technical skills to do data transformation work on their own. To solve this problem, Trifacta developed a "predictive interaction" technology that makes data operations visual, while its machine learning algorithms observe both user behavior and data attributes, predict the user's intent, and automatically offer suggestions. Trifacta's rivals are Paxata, Informatica, and CirroHow.
V. Splice Machine
Business: A Hadoop-based, SQL-compatible database for big data applications.
Introduction: Founded in 2012; has raised $19 million so far.
Reason for inclusion: New data technologies have made it possible to bring popular features of traditional relational databases, such as ACID compliance, transactional consistency, and the standard SQL query language, to cheap, scalable Hadoop. Splice Machine retains the benefits of a NoSQL database, such as auto-sharding, fault tolerance, and scalability, while preserving SQL.
VI. DataTorrent
Business: Provides a real-time stream processing platform built on Hadoop.
Introduction: Founded in 2012; received $8 million in Series A financing in June 2013.