Precog recently announced its big data warehousing and analytics service, which handles data capture, transformation, analysis, and visualization, as well as the infrastructure the service runs on. The service also leaves various access points open through RESTful APIs, so that developers and data scientists can control the entire process.
Precog can ingest data from a variety of sources, including SQL databases, Amazon S3, Hadoop, MongoDB, client-side web applications, and back-end servers. The RESTful API lets developers ingest data from external sources such as Twitter or Facebook, from CSV files, or from mobile devices. The ingested data is stored in a custom database called PrecogDB, where it can be enriched with demographics, attitudes, locations, and other information.
The data can then be analyzed in several ways: through the API, through a client library (JavaScript, PHP), or with Labcoat, an IDE for data analysis with Quirrel, a declarative query language. Developers can create their own data capture, enrichment, and analysis modules, which can even be sold on the market.
Precog can run the entire process on top of different cloud vendors, such as Amazon EC2 and SoftLayer, to increase system resiliency and uptime.
In an interview with InfoQ, Precog's CEO and founder, John A. De Goes, explained:
"The architecture is somewhat similar to that of analytical databases (for example, both use column-oriented storage), but it differs in that it supports completely heterogeneous, non-normalized data. It also supports Quirrel, which is something like 'R for big data' and makes it easy to perform far more advanced computations than analysis with an RDBMS allows."
PrecogDB, the core of the platform, is a column-oriented database written in Scala that runs on the JVM and is optimized for data capture and analysis. According to De Goes, PrecogDB can store "measured data such as clicks, purchases, dimensions, Twitter data, or log information collected from various other activities." He added: "Precog is not yet able to store large chunks of unstructured data, although there is real demand for this in bioinformatics and other application areas. However, this capability is already on our roadmap."
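To make the idea of column-oriented storage for heterogeneous records more concrete, here is a minimal Python sketch. It is not PrecogDB's actual design or API: the `ToyColumnStore` class, its methods, and the sample events are all hypothetical. It only illustrates the general technique of keeping each field in its own column, so that records with different shapes can coexist and an aggregate reads only the column it needs.

```python
from collections import defaultdict


class ToyColumnStore:
    """Illustrative column-oriented store for heterogeneous records.

    Hypothetical, not PrecogDB: every field name gets its own column,
    stored as {row id: value}, so records need not share a schema.
    """

    def __init__(self):
        self.columns = defaultdict(dict)  # field name -> {row id: value}
        self.row_count = 0

    def ingest(self, record):
        """Store one flat dict; fields the record lacks are simply absent."""
        row_id = self.row_count
        for field, value in record.items():
            self.columns[field][row_id] = value
        self.row_count += 1
        return row_id

    def column(self, field):
        """Return the values of one column without touching any other column."""
        if field not in self.columns:
            return []
        return list(self.columns[field].values())

    def mean(self, field):
        """Average the numeric values in a column, skipping non-numeric ones."""
        values = [
            v for v in self.column(field)
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        return sum(values) / len(values) if values else None


# Hypothetical events with different shapes, as in clickstream or purchase logs.
store = ToyColumnStore()
store.ingest({"event": "click", "page": "/home"})
store.ingest({"event": "purchase", "amount": 30.0, "currency": "USD"})
store.ingest({"event": "purchase", "amount": 50.0})
store.ingest({"event": "tweet", "text": "hello", "followers": 120})

average_purchase = store.mean("amount")
print(average_purchase)
```

Only the two purchase events contribute to the `amount` column, so the average is computed without scanning the click or tweet records at all.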
As for Quirrel, the statistical query language implemented by Precog, De Goes said: "Quirrel is similar to the R programming language in many respects. Like R, Quirrel is designed for advanced analysis and statistics. But unlike R, Quirrel is not a Turing-complete language; it is purely declarative, which makes it easier to distribute Quirrel queries efficiently across large clusters of machines (and also makes Quirrel easier to learn than R)."
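The link between declarative queries and easy distribution can be illustrated with a small Python sketch. This is not Quirrel and says nothing about how Precog actually executes queries; the function names and the sample partitions are assumptions made for illustration. The point is that when a query only states what to compute (here, a mean), the engine is free to compute partial results on each partition and merge them in any order.

```python
from functools import reduce


def partial_mean(partition):
    """Runs on one node: reduce a partition to a (sum, count) pair."""
    return (sum(partition), len(partition))


def merge(left, right):
    """Combine two partial results; associative, so merge order does not matter."""
    return (left[0] + right[0], left[1] + right[1])


def distributed_mean(partitions):
    """Evaluate a 'mean' request by merging per-partition partial results."""
    total, count = reduce(merge, map(partial_mean, partitions), (0, 0))
    return total / count if count else None


# Hypothetical data already split across three machines.
partitions = [[10, 20], [30], [40, 50, 60]]
result = distributed_mean(partitions)
print(result)
```

An imperative program with arbitrary loops and shared state gives the engine no such freedom, which is one way to read the trade-off De Goes describes.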
PrecogDB has "built-in routines for common analysis and statistical calculations" and also provides a "fine-grained, capability-based security model that enables applications on mobile devices or the web to access its capabilities directly through REST."
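A capability-based model of this kind can be sketched in a few lines of Python. The sketch below is a generic illustration and does not reflect Precog's real tokens, paths, or permission names: `Capability`, `CapabilityRegistry`, and the `/apps/mobile/` path are all hypothetical. The idea is that a client holds an unguessable token that grants specific permissions on a specific part of the data, so it can be embedded in a mobile or web application without exposing anything else.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    """An unguessable token plus exactly what it permits (hypothetical model)."""
    token: str
    path_prefix: str
    permissions: frozenset


class CapabilityRegistry:
    """Issues capabilities and checks requests against them."""

    def __init__(self):
        self._by_token = {}

    def grant(self, path_prefix, permissions):
        """Create a capability limited to one path prefix and permission set."""
        cap = Capability(secrets.token_urlsafe(16), path_prefix, frozenset(permissions))
        self._by_token[cap.token] = cap
        return cap

    def allows(self, token, path, permission):
        """Allow a request only if the token exists and covers both path and action."""
        cap = self._by_token.get(token)
        return (
            cap is not None
            and permission in cap.permissions
            and path.startswith(cap.path_prefix)
        )


registry = CapabilityRegistry()
# A mobile app gets write-only access under its own path (trailing slash keeps
# the prefix from matching sibling paths such as "/apps/mobile-other/").
mobile = registry.grant("/apps/mobile/", {"write"})

can_write = registry.allows(mobile.token, "/apps/mobile/events", "write")
can_read = registry.allows(mobile.token, "/apps/mobile/events", "read")
can_write_elsewhere = registry.allows(mobile.token, "/apps/web/events", "write")
print(can_write, can_read, can_write_elsewhere)
```

Because the token itself carries the grant, the server needs no per-user session to decide whether a REST request is allowed.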
Translator: Shu Tao. View the English original: Precog: Big Data Analytics as a Service