1. Hadoop
It would be impossible to talk about open source data analytics without mentioning Hadoop. This Apache Foundation project has become nearly synonymous with big data, and it enables large-scale distributed processing of extremely large data sets. A survey conducted by TDWI and SAS found that a sizable percentage of enterprises expected to have Hadoop clusters in production by the end of 2016.
However, it should be noted that Hadoop on its own doesn't enable data analytics. It's usually part of a larger solution for gathering insights from big data.
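The distributed processing Hadoop is known for is commonly expressed in the MapReduce style. As a rough illustration of that idea only, the sketch below counts words in a single Python process. It does not use any Hadoop API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical names chosen for this example.

```python
from collections import defaultdict


def map_phase(records):
    """Emit a (word, 1) pair for every word in every input record."""
    for record in records:
        for word in record.lower().split():
            yield word, 1


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(records):
    """Run map, shuffle and reduce in sequence on an in-memory list."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    print(word_count(["big data", "big clusters"]))
```

In a real cluster the map and reduce steps would run in parallel across many machines. Here they run one after another purely to show how the data moves between the phases.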
2. Spark
Also an Apache project, Spark promises fast big data processing. In fact, it claims to "run programs up to 100x faster than Hadoop MapReduce in memory, or 10x faster on disk." As a result of this fast performance, it's often used to analyze streaming data or in applications that require interactive analysis capabilities. Companies frequently use it alongside Hadoop or Mesos, although it can also run on its own. It has recently experienced a dramatic rise in popularity, and a survey conducted by Syncsort found that a large percentage of enterprise big data staffers surveyed were interested in Spark.
3. Talend
Unlike the first two projects in this slideshow, Talend is managed by a for-profit company rather than a foundation. As a result, paid support is available. Talend offers a mix of free and paid products. Its free, open source solution is called Talend Open Studio, and it has been downloaded more than 2 million times.
Market analyst firm Gartner recently named Talend a "Leader" in data integration. The company boasts that it can help enterprises analyze their big data five times faster and at one-fifth the cost compared to competing solutions.
4. Jaspersoft
Like Talend, Jaspersoft comes in multiple editions, both free and paid. Its Community edition is free and open source, while the Reporting, AWS, Professional and Enterprise editions require a fee but come with support included.
Jaspersoft is an open source business intelligence tool that aims to allow business users to self-serve their own needs. The company claims its technology powers more than 130,000 apps with embedded BI capabilities.
5. Pentaho
Pentaho describes itself as a "comprehensive data integration and business analytics platform." The company primarily promotes the commercial versions of its software, which are based on the open source Community version. Companies can use it alongside tools like Hadoop and Spark to enable reporting and visualizations for their big data. This software boasts a long list of well-known customers that includes BT, Caterpillar, Nasdaq, the U.S. Dept Security, NOAA, the New York Times, EMC and many others.
6. RapidMiner
RapidMiner claims to be the "#1 open source data science platform," and Gartner named it a leader in its Magic Quadrant report for advanced analytics. It enables self-service predictive analytics and promises lightning-fast performance. Its users include BMW, Lufthansa, Domino's Pizza, Sony, Ford, Salesforce, Amnesty International and GE. The complete RapidMiner platform includes three separate pieces: RapidMiner Studio, RapidMiner Server and RapidMiner Radoop. All three are available under open source or commercial licenses, and commercial prices depend on the number of users.
7. Storm
Used by companies like Yahoo!, Twitter, Spotify, Yelp, Flipboard and Groupon, Apache Storm is a real-time big data processing engine. Its website explains, "Storm makes it easy to reliably process unbounded streams of data, doing for real-time processing what Hadoop did for batch processing." Customers can use it with any database and any programming language. It's scalable, fault-tolerant and easy to deploy. Users should note, however, that Storm has not yet reached the 1.0 release level.
8. H2O
Used by more than 60,000 data scientists at more than 7,000 organizations, H2O claims to be "the world's leading open source machine learning platform." Thanks to its in-memory technology, it offers extremely fast performance. It also integrates with many other open source data analytics tools like Hadoop and Spark, and it supports all of the popular databases. Paid support is available.
In addition to the standard version of H2O, the company also offers Sparkling Water, a version that incorporates Spark, and Steam, an end-to-end artificial intelligence application engine.
9. Lumify
Created by a company called Altamira Technologies, Lumify describes itself as an "open source big data analysis and visualization platform." It makes it easy to create 2D or 3D graphs that show the relationships between entities or to overlay data on maps. For those who are interested in learning more about how it works, the website offers several videos that show Lumify in action, and it also has a demo site that allows users to upload their own data and try out the software.
10. Drill
Apache Drill allows users to run SQL queries against non-relational data storage systems. It supports a range of NoSQL and cloud-based data storage systems, including HBase, MongoDB, MapR-DB, HDFS, MapR-FS, Amazon S3, Azure Blob Storage, Google Cloud Storage and Swift. It also allows users to search through multiple datasets stored with different technologies using a single query. In addition, it supports many popular BI tools.
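To make the single-query idea concrete, here is a minimal sketch that only builds the text of a SQL statement joining a JSON file with a MongoDB collection. It does not connect to Drill. It assumes storage plugins registered under the names `dfs` and `mongo`, and the helper name, file path, database, collection and join key are all hypothetical values chosen for illustration.

```python
def build_cross_store_query(json_path, mongo_db, mongo_collection, join_key):
    """Return SQL text that joins a JSON file with a MongoDB collection.

    Assumes storage plugins named `dfs` (file system) and `mongo`
    (MongoDB); all arguments are illustrative, trusted values.
    """
    return (
        "SELECT * "
        f"FROM dfs.`{json_path}` AS f "
        f"JOIN mongo.`{mongo_db}`.`{mongo_collection}` AS m "
        f"ON f.`{join_key}` = m.`{join_key}`"
    )


if __name__ == "__main__":
    # Hypothetical path, database, collection and key names.
    print(build_cross_store_query(
        "/data/orders.json", "shop", "customers", "customer_id"
    ))
```

The point of the sketch is that one statement names two different storage systems. Submitting it to a real cluster, and validating any user-supplied names before interpolating them, is left out here.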
11. MongoDB
One of the best-known NoSQL databases, MongoDB is an open source non-relational data storage solution. Its customers include MetLife, the city of Chicago, Expedia, Google, the Weather Channel, BuzzFeed and Facebook. In addition to the free open source version, the company also offers a paid Enterprise version and MongoDB Atlas, a cloud-hosted version. Forrester has named MongoDB a "Leader" for Big Data NoSQL.
12. SpagoBI
SpagoBI is an open source business intelligence and big data analytics platform. The software is completely free, but paid user support, maintenance, consulting and training are available for purchase. It includes tools for reporting, multidimensional analysis (OLAP), charts, location intelligence, data mining, ETL and more. It also integrates with popular in-memory processing engines and enables real-time processing.
Top Open Source Data Analytics Apps