Hadoop Big Data Analysis Gains Native R Language Support
With interest in big data analysis growing, software vendor Revolution Analytics has upgraded its flagship R-language statistics package so that it can run on the Hadoop data processing platform.
The new release, Revolution R Enterprise 7 (RRE 7), also enables R to run inside the Teradata database.
The R language provides a way to run common statistical techniques (such as linear and non-linear modeling, time series analysis, classification, and clustering) against a set of data, usually presenting the results in graphical form.
R is increasingly used for complex data analysis that is beyond the scope of standard business intelligence packages. Revolution Analytics estimates that more than 2 million people worldwide use R.
RRE 7 contains a library of R algorithms that can run in parallel across multiple nodes, in the same way that Hadoop itself manages large datasets. RRE 7 can be added to the Cloudera CDH3 and CDH4 Hadoop distributions, as well as to Hortonworks Data Platform 1.3.
This new R library covers the most commonly used statistical and predictive analysis algorithms, spanning data processing, data sampling, descriptive statistics, statistical testing, data visualization, simulation, machine learning, and predictive modeling.
R-based analysis can run faster when the data is analyzed on the nodes where it is stored rather than moved to another location. This approach also allows an entire dataset to be analyzed, rather than a subset or a summary of the data, which is how an enterprise data warehouse (EDW) is typically used.
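The idea of analyzing data where it lives can be illustrated with a minimal sketch. It is written in Python for illustration, not in R, and it is not the RRE 7 API; the names `Summary`, `summarize`, `merge`, and `finalize` are hypothetical. Each node reduces its own chunk to a small summary, and only those summaries are combined, so the raw rows never leave the node while the statistics still cover the entire dataset.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class Summary:
    """Partial statistics computed locally on one node (hypothetical type)."""
    n: int
    total: float
    total_sq: float


def summarize(chunk):
    """Runs on the node that holds the chunk; only three numbers leave it."""
    return Summary(len(chunk), sum(chunk), sum(x * x for x in chunk))


def merge(a, b):
    """Combine two partial summaries; order of merging does not matter."""
    return Summary(a.n + b.n, a.total + b.total, a.total_sq + b.total_sq)


def finalize(s):
    """Turn the merged summary into the mean and sample variance."""
    mean = s.total / s.n
    variance = (s.total_sq - s.n * mean * mean) / (s.n - 1)
    return mean, variance


# Three made-up chunks standing in for data stored on three nodes.
node_chunks = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
mean, variance = finalize(reduce(merge, map(summarize, node_chunks)))
print(mean, variance)
```

The sum-of-squares formula is used here only because it is short; it can lose precision on large or poorly scaled data, so a production implementation would more likely merge running means and squared deviations instead.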
Revolution Analytics is bringing R to Hadoop and the Teradata database to broaden the use of the language. The company has also designed a new workflow interface that does not require users to know how to deploy specific R algorithms. This removes the chore of reimplementing R programs in Java or other languages so that they can run on the Hadoop platform.
In addition to supporting these new platforms, RRE 7 adds new algorithms and procedures. One is decision forests, an ensemble modeling method and machine learning technique for predicting future outcomes. A new stepwise regression feature helps automate the selection of the most important variables for a predictive model. A new decision tree visualization provides a graphical way to depict the complex relationships and dependencies within a dataset.
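To make the stepwise regression idea concrete, here is a minimal sketch of forward stepwise selection, written in Python for illustration. The article does not describe how RRE 7 implements the feature, so everything here is an assumption: the function names are hypothetical, the data is made up, and the stopping rule (a new variable must cut the residual sum of squares by more than a fixed fraction of the total) stands in for the AIC or significance tests that statistical packages commonly use.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def rss(columns, y):
    """Residual sum of squares of a least-squares fit of y on an
    intercept plus the given predictor columns (normal equations)."""
    design = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(design[0])
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)]
           for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(design, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return sum((yi - sum(b * v for b, v in zip(beta, row))) ** 2
               for row, yi in zip(design, y))


def forward_stepwise(predictors, y, min_fraction=0.05):
    """Greedily add the predictor that lowers the RSS the most; stop when
    the best gain is at most min_fraction of the total (assumed rule)."""
    total = rss([], y)          # intercept-only model
    best, selected = total, []
    while len(selected) < len(predictors):
        candidates = {
            name: rss([predictors[p] for p in selected + [name]], y)
            for name in predictors if name not in selected
        }
        name = min(candidates, key=candidates.get)
        if best - candidates[name] <= min_fraction * total:
            break
        selected.append(name)
        best = candidates[name]
    return selected


# Made-up data: y is roughly 2 * x1, and x2 adds little once x1 is in.
predictors = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    "x2": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0],
}
y = [2.1, 3.9, 6.2, 7.8, 10.1, 11.9]
selected = forward_stepwise(predictors, y)
print(selected)
```

The sketch assumes the predictor columns are not collinear; the simple solver would fail on a singular system, which a real implementation would need to guard against.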