In the era of artificial intelligence, enterprises want to improve efficiency through big data analysis and mining, but they are held back by the technical barriers of large-scale data analysis and machine-learning-based mining. A data analysis and mining product is needed to bridge this gap, and Jarvis was created in this context. Jarvis is a tool and platform that supports the development of big data analysis and mining applications. It sits between enterprise developers and big data analysis and mining technologies and provides visual, interactive support, so that these technologies can be quickly turned into specific products for enterprise application scenarios.
The Jarvis technology stack is reportedly layered vertically and tiered horizontally. The vertical layering lets the platform cover data processing, computing resources, operator and algorithm support, environment deployment, and the other needs of the data analysis and mining process. The horizontal tiering grades functionality by user type to maximize extensibility. The aim is a product that benefits all kinds of development users, including data scientists, business developers, data analysts, product managers, and decision analysts.
Visual management of the entire process of data science
A classic data mining and analysis application process includes data acquisition, data preprocessing, feature extraction, model development, prediction deployment, and application. The Jarvis team studied the development scenarios that developers and implementers face, along with efficient and convenient ways of working, and abstracted them into the following design:
Data connection: supports access to structured and unstructured data of many types, supports private data access, and supports reading and flexible mounting of multiple data sources such as cloud BOS, distributed HDFS, and relational databases.
Data preparation: provides interactive data cleaning and preprocessing tools that support text and image data, for efficient data preparation.
Data analysis: supports PB-level interactive SQL query analysis and Spark processing, and provides a rich set of visual data exploration tools that help developers obtain valid, high-value samples.
Mining and modeling: includes a rich set of built-in basic operators and algorithms for efficient development, along with preset solutions for classic vertical industries that can be applied to matching scenarios efficiently and at low cost.
Model deployment: built models can be published and deployed directly, with support for dynamic hot loading. Monitoring of common model evaluation metrics can be enabled with one click and can be freely extended.
Process monitoring: the developer's entire workflow is tracked automatically, and new data can automatically trigger a re-run of the whole process.
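The workflow described above, in which stages run in order, each run is tracked, and new data triggers a full re-run, can be illustrated with a minimal sketch. Every name here (`Pipeline`, `on_new_data`, the toy stage functions) is hypothetical and chosen only for illustration; this is not Jarvis's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical sketch only: these names are assumptions, not Jarvis's API.

@dataclass
class Pipeline:
    # Ordered (name, function) pairs; each function maps a list to a list.
    stages: List[Tuple[str, Callable[[list], list]]]
    # Tracking log: records every stage executed and every trigger event.
    history: List[str] = field(default_factory=list)

    def run(self, data: list) -> list:
        for name, fn in self.stages:
            data = fn(data)
            self.history.append(name)
        return data

    def on_new_data(self, data: list) -> list:
        # Arrival of new data re-runs the whole workflow from the first stage.
        self.history.append("trigger:new_data")
        return self.run(data)


def clean(rows: list) -> list:
    # Data preparation: drop missing values.
    return [r for r in rows if r is not None]


def extract_features(rows: list) -> list:
    # Feature extraction: pair each value with its square.
    return [(r, r * r) for r in rows]


def fit_mean_model(feats: list) -> list:
    # "Modeling": a toy model that is just the mean of the squared feature.
    return [sum(f[1] for f in feats) / len(feats)]


pipeline = Pipeline(stages=[
    ("prepare", clean),
    ("features", extract_features),
    ("model", fit_mean_model),
])

first = pipeline.run([1, None, 2, 3])
second = pipeline.on_new_data([2, 4])
```

After both calls, `pipeline.history` holds the stage names of the first run, the trigger marker, and the stage names of the re-run, which is the kind of trace an automatic tracking feature could expose.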
Cloud native service
In data analysis and mining, requirements for environments and resources vary widely across scenarios, data sets, processing stages, and developers. The platform's resource management (including development environment resources) therefore has to be flexible, elastic, and easy to scale, while ensuring stability and efficient resource utilization. For this reason, Jarvis is implemented on a cloud native service architecture.
Automated Machine Learning (AutoML)
Developers of strategy models spend a great deal of time selecting feature data, trying different algorithms, and tuning parameters before they arrive at an effective model. In theory, AutoML can automatically try multiple data features and multiple algorithms, test entirely different model architectures, and then evaluate the candidates against the target to produce a final solution to the problem.
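The search that AutoML automates can be sketched as an exhaustive loop over feature subsets and candidate algorithms, scored on held-out data. This is a toy illustration under stated assumptions, not Jarvis's AutoML implementation: the dataset, the two "algorithms" (`fit_mean`, `fit_ratio`), and `auto_search` are all hypothetical, and real systems use far smarter search strategies than brute force.

```python
from itertools import combinations

# Toy data (assumed for illustration): rows are (x0, x1, y) with y = 2 * x0;
# x1 is an irrelevant feature.
train = [(1, 5, 2), (2, 1, 4), (3, 4, 6), (4, 2, 8)]
valid = [(5, 3, 10), (6, 0, 12)]


def fit_mean(rows, feats):
    # Baseline algorithm: always predict the mean target, ignoring features.
    mean_y = sum(r[-1] for r in rows) / len(rows)
    return lambda r: mean_y


def fit_ratio(rows, feats):
    # Least-squares line through the origin on the sum of selected features.
    xs = [sum(r[i] for i in feats) for r in rows]
    ys = [r[-1] for r in rows]
    k = sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)
    return lambda r: k * sum(r[i] for i in feats)


def mse(model, rows):
    # Mean squared error of a fitted model on the given rows.
    return sum((model(r) - r[-1]) ** 2 for r in rows) / len(rows)


def auto_search(train, valid, n_features, algorithms):
    # Try every non-empty feature subset with every algorithm and
    # keep the combination with the lowest validation error.
    best = None
    for size in range(1, n_features + 1):
        for feats in combinations(range(n_features), size):
            for name, fit in algorithms.items():
                score = mse(fit(train, feats), valid)
                if best is None or score < best[0]:
                    best = (score, name, feats)
    return best


best_score, best_algo, best_feats = auto_search(
    train, valid, n_features=2,
    algorithms={"mean": fit_mean, "ratio": fit_ratio},
)
```

On this toy data the search selects the `ratio` algorithm using only feature `x0`, because that combination reproduces the validation targets exactly.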
Industry Solutions
Different enterprises in the same industry often share common data analysis and mining scenarios, for example electricity consumption forecasting in the power industry, or equipment fault detection and fault prediction in the industrial Internet of Things. The problems to be solved in such scenarios are similar, and so is the data to be analyzed, which makes it possible to abstract them into general industry solutions that can be reused in similar scenarios and put into application quickly. For developers doing in-depth data mining, there are also many common algorithms and operator libraries that can be reused to improve development efficiency. Jarvis provides layered built-in capabilities, from basic algorithms through general-purpose models to vertical solutions, and it continues to expand and integrate them so that developers in different scenarios can reuse them efficiently.
During the Baidu Developer Conference, Jarvis invited its first users to try the enhanced basic development environment (with a rich built-in algorithm library and Baidu AI open interfaces) through the Dianshi Big Data Intelligence Platform (dianshi.baidu.com, DataLab), and it was well received by users.