Big data processing technology: an analysis of the R system
Source: Internet
Author: User
As more enterprises adopt big data, the languages used to write and deploy data processing and analysis algorithms have attracted wide attention. Almost without notice, the open source statistical language R has become a basic technology for data scientists and developers, and its popularity has soared among programming languages and tools.
The following is a translation:
By integrating with big data processing tools, R provides in-depth statistical capabilities for large datasets, including statistical analysis and data-driven visualization. R is heavily used in industries such as finance, pharmaceuticals, media, and sales, where decisions can be made directly from data.
According to the Rexer Analytics 2013 survey of data mining professionals, R has become the most popular statistical analysis tool, with at least 70% of respondents saying they use the R language. R is also popular in the enterprise market, where multiple companies and projects use R and offer it to data scientists and business users, including Microsoft's cloud-based Azure Machine Learning, IBM's Big R, Teradata Aster R, Oracle R Enterprise, Pivotal's R release for big data (PivotalR), SAP's R integration for HANA, and so on. A brief analysis of each follows:
Azure Machine Learning. Azure Machine Learning comes with R built in. Microsoft provides an R language API and templates in Azure ML and supports more than 300 R packages, so users do not have to start from scratch: Azure ML lets developers assemble models that fit their needs from existing parts. This undoubtedly lowers the barrier to entry for machine learning and makes it usable by data scientists of all backgrounds.
IBM InfoSphere BigInsights Big R. Big R is a set of function libraries that provide end-to-end integration between R and InfoSphere BigInsights. Big R can be used for comprehensive data analysis on InfoSphere BigInsights servers. It reduces the complexity of writing MapReduce jobs and lets users return to familiar R syntax and paradigms.
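To make the "complexity of writing MapReduce jobs" concrete, here is a minimal, framework-free sketch in Python of a per-group mean written as explicit map, shuffle, and reduce phases. It is not Big R's API or IBM code. The sample records and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`) are assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical sample records: (region, sale amount).
records = [("east", 10.0), ("west", 4.0), ("east", 20.0), ("west", 6.0)]


def map_phase(rows):
    """Emit one (key, value) pair per input record."""
    for region, amount in rows:
        yield region, amount


def shuffle_phase(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Collapse each group of values into a single mean."""
    return {key: sum(values) / len(values) for key, values in groups.items()}


mapreduce_result = reduce_phase(shuffle_phase(map_phase(records)))
print(mapreduce_result)
```

In plain R, the same per-group mean is a single call such as `tapply(amount, region, mean)`. That gap between hand-written phases and one familiar call is the kind of convenience the paragraph above describes.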
Teradata Aster R. Teradata Aster R extends the analysis capabilities of the open source R language by lifting its memory and processing power constraints. For R analysts, Aster R keeps the R language and tools they are familiar with while providing powerful processing capabilities and rich analytical methods. It is divided into 3 components: the "Aster R Library", with 100 preset R functions; the "Aster R Parallel Constructor", which offers more than 5,500 R analysis packages; and "Aster SNAP Framework Integration", which fully integrates the open source R engine into the Teradata Aster Seamless Network Analytic Processing (SNAP) framework.
Oracle R Enterprise. Oracle R Enterprise primarily provides in-database analysis capabilities for the company's RDBMS and Exadata devices.
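The general idea behind in-database analysis can be sketched as follows. This is a generic illustration in Python using the standard library's `sqlite3` module with an in-memory database. It is not Oracle R Enterprise's interface, and the `sales` table and its contents are assumptions invented for the example.

```python
import sqlite3

# Hypothetical in-memory table standing in for a large database table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10.0), ("west", 4.0), ("east", 20.0), ("west", 6.0)],
)

# Client-side analysis: every row is transferred, then aggregated in the client.
rows = conn.execute("SELECT region, amount FROM sales").fetchall()
totals, counts = {}, {}
for region, amount in rows:
    totals[region] = totals.get(region, 0.0) + amount
    counts[region] = counts.get(region, 0) + 1
client_side = {region: totals[region] / counts[region] for region in totals}

# In-database analysis: the database computes the aggregate and returns
# only one row per group.
in_database = dict(
    conn.execute("SELECT region, AVG(amount) FROM sales GROUP BY region")
)
conn.close()

print(client_side == in_database)
```

Both approaches give the same answer. The difference is where the work happens and how much data has to leave the database, which matters once the table no longer fits in the client's memory.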
PivotalR. PivotalR is a package that lets R users interact with the Pivotal (Greenplum) Database and Pivotal HD (for big data processing and analysis), giving data scientists in-database and in-Hadoop computation through an R-like interface. HAWQ is the core of Pivotal HD's Hadoop technology, providing dynamic pipelining, a world-class query optimizer, vertical scaling, SQL compliance, interactive queries, in-depth analysis, and support for commonly used Hadoop formats, along with support for the R language.
SAP's combination of R and HANA. SAP integrates the R language with its in-memory database HANA to form a new platform for mobile, analytics, data services, and converged services. SAP implements this integration through Rserve (a server that communicates with R). Because HANA uses column storage, it can exchange data with R efficiently. SAP also simplifies user operations by packaging rapid deployment solutions.