15 major frameworks for machine learning

Source: Internet
Author: User
Tags: data mining, deep learning, machine learning, data stream mining, machine learning algorithm

Machine learning engineers are part of the team that develops products, builds algorithms, and ensures that they work reliably, quickly, and at scale. They work closely with data scientists to understand both the theory and its industry applications. The main differences between data scientists and machine learning engineers are:

  • Machine learning engineers build, develop, and maintain machine learning systems and the products based on them.

  • Data scientists conduct research to generate ideas for machine learning projects, and then perform analysis to understand the impact of machine learning systems on metrics.


The following is an introduction to 15 machine learning frameworks:

  1. Apache SINGA is a general-purpose distributed deep learning platform for training deep learning models on large data sets. It is designed around a simple programming model based on the layer abstraction. It supports a variety of popular deep learning models, including feed-forward models (such as convolutional neural networks, CNN), energy models (such as restricted Boltzmann machines, RBM), and recurrent neural networks (RNN), and it provides users with many built-in layers.

  2. Amazon Machine Learning (AML) is a service that makes machine learning technology easy to use for developers of all skill levels. It provides visual tools and wizards that guide you through building machine learning models without having to learn complex machine learning algorithms and techniques.

  3. Azure ML Studio allows Microsoft Azure users to create and train models, which are then turned into APIs that other services can consume. Each account gets up to 10 GB of storage for model data, although you can connect your own Azure storage to the service for larger models. A wide range of algorithms is available in Azure, provided by Microsoft and some third parties. You don't even need to sign up for an account to try it: you can log in anonymously and use the Azure ML Studio service for up to 8 hours.

  4. Caffe is a deep learning framework developed by the Berkeley Vision and Learning Center (BVLC) and community contributors, released under the BSD 2-Clause license and built around the principles of "expression, speed, and modularity". Models and optimization are defined through configuration rather than hard coding, and users can switch between CPU and GPU processing as needed. Caffe's efficiency makes it well suited to both experimental research and industrial deployment: a single NVIDIA K40 GPU can process more than 60 million images per day.

  5. H2O makes it easy to apply math and predictive analytics to today's challenging business problems. It combines features not currently found together in other machine learning platforms: best-of-breed open source technology, an easy-to-use web UI with familiar interfaces, and support for common databases and different file types. With H2O, you can keep using your existing languages and tools. It also extends seamlessly into Hadoop environments.

  6. Massive Online Analysis (MOA) is currently the most popular open source framework for data stream mining and has a very active community. It contains a range of machine learning algorithms (classification, regression, clustering, outlier detection, concept drift detection, and recommender systems) as well as evaluation tools. Like the related WEKA project, MOA is written in Java, but it scales to more demanding problems.

  7. MLlib (Spark) is the machine learning library of Apache Spark. It is designed to make machine learning scalable and easy to use. It consists of common learning algorithms and utilities, including classification, regression, clustering, collaborative filtering, and dimensionality reduction, as well as lower-level optimization primitives and a higher-level pipeline API.

  8. mlpack is a C++-based machine learning library, first introduced in 2011. According to its developers, it is designed for "scalability, efficiency and ease of use". There are two ways to use mlpack: through a set of command-line executables for quick, simple "black box" operations, or through the C++ API for more complex work. These command-line programs and C++ classes can be integrated into larger machine learning solutions.

  9. Pattern is a web mining module for the Python programming language. It includes tools for data mining (Google, Twitter, and Wikipedia APIs, a web crawler, an HTML DOM parser), natural language processing (part-of-speech tagging, n-gram search, sentiment analysis, a WordNet interface), machine learning (vector space model, clustering, support vector machines), network analysis, and <canvas> visualization.

  10. Scikit-Learn builds on several existing Python packages for math and science work (NumPy, SciPy, and matplotlib). The resulting library can be used in interactive workbench applications or embedded in other software and reused. The toolkit is released under the BSD license, so it is completely free, open source, and reusable. Scikit-Learn contains a variety of tools for machine learning tasks such as clustering, classification, and regression. It is developed by a large community of developers and machine learning experts, so cutting-edge techniques tend to be added to Scikit-Learn quickly.

  11. Shogun is one of the earliest machine learning libraries. It was created in 1999 and is developed in C++, but it is not limited to C++ environments. Through the SWIG library, Shogun is available in a variety of languages and environments, such as Java, Python, C#, Ruby, R, Lua, Octave, and Matlab. Shogun aims to provide unified, large-scale learning (such as classification, regression, or exploratory data analysis) for a wide range of feature types and learning settings.

  12. TensorFlow is an open source software library that uses data flow graphs for numerical computation. In a data flow graph, batches of data ("tensors") are processed by a series of algorithms described by the graph. The movement of data through the system is called a "flow", hence the name. Graphs can be assembled in C++ or Python and run on CPU or GPU devices.

  13. Theano is a Python library, released under the BSD license, that lets you define, optimize, and evaluate mathematical expressions. For problems involving large amounts of data, Theano can reach speeds comparable to hand-written C implementations, and it supports efficient machine learning algorithms.

  14. Torch is a scientific computing framework with broad support for machine learning algorithms that puts GPUs first. It is easy to use and efficient thanks to a simple and fast scripting language, LuaJIT, and an underlying C/CUDA implementation. Torch's goal is to let you build your own scientific algorithms with maximum flexibility and speed while keeping the process extremely simple. Torch is based on Lua and has a large ecosystem of community-driven packages for machine learning, computer vision, signal processing, parallel processing, image, video, audio, and networking.

  15. Veles is a distributed platform for deep learning applications. It is written in C++, but it uses Python for automation and coordination between nodes. Data sets can be analyzed and automatically normalized before being fed to the cluster, and a REST API allows each trained model to be put into production immediately. Veles focuses on performance and flexibility. It has few hard-coded entities and can train all widely recognized network topologies, such as fully connected networks, convolutional neural networks, and recurrent neural networks.
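
The data flow graph idea described for TensorFlow in item 12 can be illustrated with a minimal sketch in plain Python. This is not TensorFlow's API or implementation: `Node`, `constant`, `add`, and `multiply` are hypothetical names invented for this example, and the sketch works on single numbers rather than tensors. It only aims to show the separation between describing a computation as a graph and running it.

```python
# Illustrative only: a tiny data flow graph evaluator in plain Python.
# Node, constant, add, and multiply are hypothetical names, not TensorFlow APIs.
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Node:
    """One vertex of the graph: an operation plus the nodes that feed it."""
    op: Callable[..., float]
    inputs: Tuple["Node", ...] = ()

    def evaluate(self) -> float:
        # Evaluate the upstream nodes first, then apply this node's operation.
        return self.op(*(node.evaluate() for node in self.inputs))


def constant(value: float) -> Node:
    # A source node with no inputs that always yields the same value.
    return Node(op=lambda: value)


def add(a: Node, b: Node) -> Node:
    return Node(op=lambda x, y: x + y, inputs=(a, b))


def multiply(a: Node, b: Node) -> Node:
    return Node(op=lambda x, y: x * y, inputs=(a, b))


# Building the graph performs no arithmetic; it only records the structure.
graph = multiply(add(constant(2.0), constant(3.0)), constant(4.0))

# Values flow through the graph only when it is evaluated.
result = graph.evaluate()
print(result)
```

Because the graph is only a description until `evaluate()` is called, a real framework is free to inspect it, optimize it, or place parts of it on different devices before running it. This sketch does none of that and simply walks the graph recursively.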
