A very complete list: Java-based machine learning projects, environments, libraries ...

Source: Internet
Author: User

https://yq.aliyun.com/articles/278837?utm_source=tuicool&utm_medium=referral

Summary: Are you a Java programmer who wants to get started with machine learning? Writing programs that use machine learning is the best way to learn it. You can write the algorithms from scratch, but you will make faster progress with existing open source libraries. This article introduces the main platforms and open source machine learning libraries for Java.

The platforms and open source libraries introduced below are all ones you can use in your own projects.

Environment

This section describes Java environments, or workbenches, for machine learning. They provide a graphical user interface for performing machine learning tasks, as well as Java APIs for developing your own applications.

Weka

Waikato Environment for Knowledge Analysis (Weka) (https://www.cs.waikato.ac.nz/ml/weka/) is a machine learning platform developed by the University of Waikato, New Zealand. It provides a Java graphical user interface, a command-line interface, and a Java API. It is probably the most popular Java machine learning library and a good place to start or practice machine learning.

KNIME

Konstanz Information Miner (KNIME) (https://www.knime.com/) is an analytics and reporting platform developed by the University of Konstanz, Germany. Its original focus was pharmaceutical research, but it has expanded to general business intelligence. It provides a graphical user interface (based on Eclipse) and a Java API.

RapidMiner

RapidMiner (https://rapidminer.com/) was developed at the Technical University of Dortmund, Germany. It provides a GUI and a Java API for developing your own applications. It also provides data handling, visualization, and modeling with machine learning algorithms.

ELKI

Environment for Developing KDD-Applications Supported by Index-Structures (ELKI) (https://elki-project.github.io/) is a data mining platform written in Java and developed by Ludwig Maximilian University of Munich, Germany. It focuses on processing data in relational databases with methods based on distance functions, such as outlier detection and classification. It provides a mini GUI, a command-line interface, and a Java API.
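
To make "outlier detection based on distance functions" concrete, here is a minimal sketch in plain Java. It is not ELKI's API and uses no index structure: it is a brute-force illustration that scores each point by the distance to its k-th nearest neighbour, so isolated points get large scores. The class name, method names, and sample data are all hypothetical.

```java
import java.util.Arrays;

// Hypothetical illustration only; this is NOT ELKI code or ELKI's API.
final class KnnOutlierSketch {

    /** Euclidean distance between two points of equal dimension. */
    static double distance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double d = a[i] - b[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    /**
     * Outlier score of each point: the distance to its k-th nearest
     * neighbour, excluding the point itself. Larger means more isolated.
     * Brute force, O(n^2) distance computations.
     */
    static double[] knnScores(double[][] points, int k) {
        int n = points.length;
        if (k < 1 || k >= n) {
            throw new IllegalArgumentException("need 1 <= k < number of points");
        }
        double[] scores = new double[n];
        for (int i = 0; i < n; i++) {
            double[] dists = new double[n - 1];
            int idx = 0;
            for (int j = 0; j < n; j++) {
                if (j != i) {
                    dists[idx++] = distance(points[i], points[j]);
                }
            }
            Arrays.sort(dists);
            scores[i] = dists[k - 1];
        }
        return scores;
    }

    /** Index of the point with the highest outlier score. */
    static int mostOutlying(double[][] points, int k) {
        double[] scores = knnScores(points, k);
        int best = 0;
        for (int i = 1; i < scores.length; i++) {
            if (scores[i] > scores[best]) {
                best = i;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Four points close together and one far away (made-up data).
        double[][] data = {
            {1.0, 1.0}, {1.2, 0.9}, {0.9, 1.1}, {1.1, 1.0}, {8.0, 8.0}
        };
        System.out.println(Arrays.toString(knnScores(data, 2)));
        System.out.println("most outlying index: " + mostOutlying(data, 2));
    }
}
```

A platform such as ELKI exists largely because this brute-force approach does not scale; the sketch only shows the idea that a distance function alone is enough to rank points by how isolated they are.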


Libraries

In fact, every project listed in this article offers a Java API. The projects in this section, however, provide only a Java API; in the narrow sense, they are machine learning libraries.

Java-ML

The Java Machine Learning Library (Java-ML) (http://java-ml.sourceforge.net/) provides a collection of machine learning algorithms implemented in Java. It provides a standard interface for each algorithm, but no UI (user interface) and no references to the relevant scientific literature for further reading. It includes methods for data manipulation, clustering, feature selection, and classification. Notably, as of this writing, its latest release dates from 2012.

JSAT

The Java Statistical Analysis Tool (JSAT) (https://github.com/EdwardRaff/JSAT/tree/master) provides standard machine learning algorithms implemented in pure Java for medium-sized problems. JSAT's author says he developed the library partly for self-education and partly to get the job done. Still, the list of algorithms is impressive: it includes classification, regression, ensemble, clustering, and feature selection methods.

Java Big Data Projects

This section lists Java projects suited to big data, for example workloads that run on clusters of machines.

Mahout (Hadoop)

Apache Mahout (https://mahout.apache.org/) provides implementations of machine learning algorithms for the Apache Hadoop platform (distributed MapReduce). The project focuses on clustering and classification algorithms, and a popular application that has driven its development is collaborative filtering for recommender systems. It also includes reference implementations that run the algorithms on a single node.

MLlib (Spark)

Apache Spark's Machine Learning Library (MLlib) (http://spark.apache.org/mllib/) provides implementations of machine learning algorithms for the Apache Spark platform (which uses HDFS but not MapReduce). Although listed here among Java libraries, MLlib and the Spark platform offer Java, Scala, and Python bindings. At the time of writing, the library is new and its list of algorithms is short, but it is growing fast.

MOA

Massive Online Analysis (MOA) (https://moa.cms.waikato.ac.nz/) is an open source platform for data stream mining, developed at the University of Waikato, New Zealand. Like Weka (developed at the same place), it provides a GUI, a command-line interface, and a Java API. It offers a long list of algorithms, with a focus on classification and support for outlier detection under concept drift. MOA uses the Advanced Data mining And Machine learning System (ADAMS) (https://adams.cms.waikato.ac.nz/) to manage workflows; ADAMS is also developed at the same place.

SAMOA

Scalable Advanced Massive Online Analysis (SAMOA) (http://samoa-project.net/) is a distributed streaming machine learning framework developed by Yahoo. It is designed to run on Apache Storm and Apache S4. The system can use algorithms provided by the MOA project for classification tasks.

Natural Language Processing

This section is dedicated to Java libraries and projects that address a subfield of machine learning called natural language processing (NLP).

Natural language processing is not my field, so I will just point out the key libraries.

OpenNLP: Apache OpenNLP (http://opennlp.apache.org/) is a toolkit for processing natural language text. It provides methods for NLP tasks such as tokenization, segmentation, and entity extraction.
LingPipe: LingPipe (http://alias-i.com/lingpipe/) is a toolkit for computational linguistics that includes methods for topic classification, entity extraction, clustering, and sentiment analysis.
GATE: General Architecture for Text Engineering (GATE) (http://gate.ac.uk/) is an open source library for text processing. It provides an array of examples for different kinds of projects.
MALLET: MAchine Learning for LanguagE Toolkit (MALLET) (http://mallet.cs.umass.edu/) is a Java toolkit for statistical natural language processing, document classification, clustering, topic modeling, and information extraction.
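
The toolkits above package tasks such as sentence segmentation and tokenization behind their own APIs. As a rough sketch of what those two tasks mean, the example below uses only the JDK's rule-based java.text.BreakIterator. It is not OpenNLP, LingPipe, GATE, or MALLET code, and those libraries generally use trained models rather than fixed rules. The class and method names are hypothetical.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only; uses the JDK, not any NLP toolkit's API.
final class TextSegmentationSketch {

    /** Splits text into sentences using the JDK's rule-based boundaries. */
    static List<String> sentences(String text) {
        return split(text, BreakIterator.getSentenceInstance(Locale.ENGLISH), false);
    }

    /** Splits text into word tokens, dropping whitespace and punctuation. */
    static List<String> words(String text) {
        return split(text, BreakIterator.getWordInstance(Locale.ENGLISH), true);
    }

    private static List<String> split(String text, BreakIterator it, boolean wordsOnly) {
        List<String> parts = new ArrayList<>();
        it.setText(text);
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String piece = text.substring(start, end).trim();
            if (piece.isEmpty()) {
                continue; // pure whitespace between tokens
            }
            if (wordsOnly && !Character.isLetterOrDigit(piece.codePointAt(0))) {
                continue; // punctuation-only segment
            }
            parts.add(piece);
        }
        return parts;
    }

    public static void main(String[] args) {
        String text = "Java is fun. Learn it!";
        System.out.println(sentences(text));
        System.out.println(words(text));
    }
}
```

Fixed rules like these break down on abbreviations, names, and languages without spaces between words, which is one reason to reach for a dedicated NLP library.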
Computer Vision

This section lists libraries for the subfield of machine learning called computer vision (CV).

Computer vision is not a field I am familiar with, so I will just point out the key library.

BoofCV: BoofCV (http://boofcv.org/index.php?title=Main_Page) is an open source library for computer vision and robotics applications. It supports image processing, features, geometric vision, calibration, recognition, and image data I/O.
Deep Learning

With the rapid development of deep learning methods and hardware, neural networks have become popular again. This section lists the key Java libraries for neural networks and deep learning.

Encog: Encog (http://www.heatonresearch.com/encog) is a machine learning library that provides algorithms such as SVMs, classical neural networks, genetic programming, Bayesian networks, hidden Markov models (HMMs), and genetic algorithms.
Deeplearning4j: Deeplearning4j (http://deeplearning4j.org/) is billed as a commercial-grade deep learning library written in Java. It is described as compatible with Hadoop and provides algorithms that include restricted Boltzmann machines, deep belief networks, and stacked denoising autoencoders.
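
For readers new to the topic, the simplest "classical neural network" is a single perceptron. The sketch below, in plain Java, trains one with the perceptron learning rule on the logical AND function. It is only an illustration of the idea, not Encog's or Deeplearning4j's API; the class name, learning rate, and epoch limit are assumptions chosen for the example.

```java
// Hypothetical illustration only; NOT Encog or Deeplearning4j code.
final class PerceptronSketch {
    private final double[] weights;
    private double bias;
    private final double learningRate;

    PerceptronSketch(int inputs, double learningRate) {
        this.weights = new double[inputs]; // all weights start at zero
        this.bias = 0.0;
        this.learningRate = learningRate;
    }

    /** Step activation: outputs 1 if the weighted sum is positive, else 0. */
    int predict(double[] x) {
        double sum = bias;
        for (int i = 0; i < weights.length; i++) {
            sum += weights[i] * x[i];
        }
        return sum > 0 ? 1 : 0;
    }

    /**
     * Perceptron learning rule: for every misclassified sample, move the
     * weights toward (or away from) the input. Stops early once a full
     * pass over the data produces no errors.
     */
    void train(double[][] xs, int[] ys, int maxEpochs) {
        for (int epoch = 0; epoch < maxEpochs; epoch++) {
            int errors = 0;
            for (int n = 0; n < xs.length; n++) {
                int error = ys[n] - predict(xs[n]);
                if (error != 0) {
                    for (int i = 0; i < weights.length; i++) {
                        weights[i] += learningRate * error * xs[n][i];
                    }
                    bias += learningRate * error;
                    errors++;
                }
            }
            if (errors == 0) {
                return; // converged on the training data
            }
        }
    }

    public static void main(String[] args) {
        double[][] inputs = { {0, 0}, {0, 1}, {1, 0}, {1, 1} };
        int[] and = { 0, 0, 0, 1 }; // logical AND, linearly separable
        PerceptronSketch p = new PerceptronSketch(2, 1.0);
        p.train(inputs, and, 100);
        for (double[] x : inputs) {
            System.out.println((int) x[0] + " AND " + (int) x[1] + " -> " + p.predict(x));
        }
    }
}
```

A single perceptron can only learn linearly separable functions; the deep learning libraries above stack many layers of such units and train them with far more capable algorithms.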
Summary

This article has covered the big-name options for choosing a machine learning library or platform in Java. These are popular projects, but they are not the only ones. For example, see the Java page on mloss.org (http://mloss.org/software/language/java/), which lists 71 Java-based open source machine learning projects. That is a lot of work, and I am sure there are even more projects on GitHub and SourceForge.

The key is to think carefully about your project and its requirements. Work out what you need from a library or platform, then choose and learn the one that suits you best.
