PHP Machine Learning Library PHP-ML Example Tutorial

Source: Internet
Author: User
PHP-ML is a machine learning library written in PHP. Python and C++ certainly offer more machine learning libraries, but most of them are fairly complex, and their configuration alone drives many beginners to despair. PHP-ML is not particularly advanced in its algorithms, but it covers the basics of machine learning, such as classification, and that is enough for a small company doing simple data analysis and prediction.

In a project, we should aim for cost-effectiveness rather than maximum efficiency and precision. Some algorithms and libraries look impressive, but if we need to go live quickly and our engineers have no machine learning experience, complex code and configuration will only slow the project down. For a simple machine learning application, the cost of learning a complex library and its algorithms is clearly too high. And if the project runs into a strange problem, can we solve it? What if the requirements change? I believe everyone has had this experience: the program suddenly throws an error, you cannot work out why, and a search on Google or Baidu turns up exactly one matching question, asked five or ten years ago, with zero replies.

So it makes sense to choose the simplest, most efficient, and most cost-effective approach. PHP-ML is not slow (switch to PHP 7 if you have not already), and its accuracy is good; after all, the algorithms are the same, and PHP itself is implemented in C. What this blogger dislikes most is people comparing Python, Java, and PHP on performance and range of application. If you really want performance, develop in C. If you really want the widest range of application, use C as well, or even assembly ...

To use this library, we first need to download it. The library is available on GitHub (https://github.com/php-ai/php-ml). That said, the recommended way is to use Composer, which downloads the library and configures autoloading automatically.

Once the download is complete, we can take a look at the library's documentation. It consists of simple, easy-to-understand examples, and we can create a file to try them out. Next, let's test the library on real data. One dataset is the iris flower dataset; the description of the other has been lost, so I don't know what its data represents ...

Part of the iris data, which contains three different classes:

The unknown dataset uses a comma as the decimal separator, so the values have to be converted before any calculation:
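The decimal-comma conversion can be sketched as a small helper. This is only an illustration: the function name parseDataLine is made up here, and it assumes each line of data.txt holds two whitespace-separated columns (an integer x and a y value written with a decimal comma), which the article does not state explicitly.

```php
<?php
// Hypothetical helper (not part of PHP-ML): parse one line into [x, y].
// Assumption: two whitespace-separated columns, y uses a decimal comma.
function parseDataLine(string $line): ?array
{
    $parts = preg_split('/\s+/', trim($line));
    if ($parts === false || count($parts) < 2) {
        return null; // blank or malformed line
    }
    $x = (int) $parts[0];
    // Replace the decimal comma with a decimal point before casting.
    $y = (float) str_replace(',', '.', $parts[1]);
    return [$x, $y];
}

// Made-up sample line, not taken from the real dataset.
print_r(parseDataLine("12 3,75"));
```

Returning null for blank lines lets the caller skip a trailing empty line at the end of the file instead of storing a bogus zero.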

We'll deal with the unknown dataset first. It is stored in a file called data.txt. The data can be plotted as an x-y line chart, so we start by drawing the raw data as a line chart. Because the x-axis is long, we only need to look at its approximate shape:

We draw it with the PHP JpGraph library, using the following code:

<?php
include_once './src/jpgraph.php';
include_once './src/jpgraph_line.php';

// JpGraph drawing setup
$g = new Graph(1920, 1080);
$g->SetScale('textint');
$g->title->Set('data');

// File processing
$file = fopen('data.txt', 'r');
$labels = array();
while (($line = fgets($file)) !== false) {
    // Assumption: the two columns are separated by a single space;
    // adjust the delimiter to match your copy of data.txt.
    $data = explode(' ', trim($line));
    if (count($data) < 2) {
        continue; // skip blank or malformed lines
    }
    // Data processing: replace the decimal comma with a decimal point
    $data[1] = str_replace(',', '.', $data[1]);
    // Store the data keyed by x so that we can sort by key
    $labels[(int) $data[0]] = (float) $data[1];
}
fclose($file);

ksort($labels); // sort by key (x value)

$x = array(); // x-axis data
$y = array(); // y-axis data
foreach ($labels as $key => $value) {
    array_push($x, $key);
    array_push($y, $value);
}

$linePlot = new LinePlot($y);
$g->xaxis->SetTickLabels($x);
$linePlot->SetLegend('data');
$g->Add($linePlot);
$g->Stroke();

With this original chart as the reference for comparison, we move on to training. We use LeastSquares from PHP-ML. The predictions need to be stored in a file so that we can draw a comparison chart. The training code is as follows:

<?php
require 'vendor/autoload.php';

use Phpml\Regression\LeastSquares;

$file = fopen('data.txt', 'r');
$samples = array();
$labels = array();
$i = 0;
while (($line = fgets($file)) !== false) {
    // Assumption: the two columns are separated by a single space.
    $data = explode(' ', trim($line));
    if (count($data) < 2) {
        continue; // skip blank or malformed lines
    }
    $samples[$i][0] = (int) $data[0];
    $data[1] = str_replace(',', '.', $data[1]);
    $labels[$i] = (float) $data[1];
    $i++;
}
fclose($file);

$regression = new LeastSquares();
$regression->train($samples, $labels);

// $a holds the x-values taken from the original data and is used as the test input.
$a = [0,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,22,23,24,25,26,27,29,30,31,37,40,41,45,48,53,55,57,60,61,108,124];
for ($i = 0; $i < count($a); $i++) {
    // Append each prediction to the file
    file_put_contents('putput.txt', $regression->predict([$a[$i]]) . "\n", FILE_APPEND);
}
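To make it clearer what a least squares fit does with a single input column, here is a plain-PHP sketch of fitting a straight line y = intercept + slope * x. It is not PHP-ML's implementation, and the names fitLine and predictLine are made up for this illustration; the sample numbers are invented as well.

```php
<?php
// Illustrative single-feature least squares fit (not PHP-ML's code).
// Returns the slope and intercept that minimise the squared error.
function fitLine(array $xs, array $ys): array
{
    $n = count($xs);
    if ($n < 2 || $n !== count($ys)) {
        throw new InvalidArgumentException('Need at least two matching (x, y) pairs.');
    }
    $meanX = array_sum($xs) / $n;
    $meanY = array_sum($ys) / $n;

    $sxy = 0.0; // sum of (x - meanX) * (y - meanY)
    $sxx = 0.0; // sum of (x - meanX)^2
    for ($i = 0; $i < $n; $i++) {
        $dx = $xs[$i] - $meanX;
        $sxy += $dx * ($ys[$i] - $meanY);
        $sxx += $dx * $dx;
    }
    if ($sxx == 0.0) {
        throw new InvalidArgumentException('All x values are identical.');
    }
    $slope = $sxy / $sxx;
    return ['slope' => $slope, 'intercept' => $meanY - $slope * $meanX];
}

function predictLine(array $model, float $x): float
{
    return $model['intercept'] + $model['slope'] * $x;
}

// Invented points lying exactly on y = 1 + 2x.
$model = fitLine([0, 1, 2, 3], [1.0, 3.0, 5.0, 7.0]);
echo predictLine($model, 4.0), "\n";
```

Because the model is a straight line, it can only capture the overall trend of the data, not sharp local variations.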

After that, we read the data stored in the file and draw a chart. Here is the final result first:

The code is as follows:


<?php
include_once './src/jpgraph.php';
include_once './src/jpgraph_line.php';

$g = new Graph(1920, 1080);
$g->SetScale('textint');
$g->title->Set('data');

// Read the predictions written by the training script
$file = fopen('putput.txt', 'r');
$y = array();
while (($line = fgets($file)) !== false) {
    if (trim($line) === '') {
        continue; // skip blank lines
    }
    $y[] = (float) trim($line);
}
fclose($file);

$x = [0,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,22,23,24,25,26,27,29,30,31,37,40,41,45,48,53,55,57,60,61,108,124];

$linePlot = new LinePlot($y);
$g->xaxis->SetTickLabels($x);
$linePlot->SetLegend('data');
$g->Add($linePlot);
$g->Stroke();

You can see that the predicted chart differs considerably from the original, especially in the more jagged segments. However, this is only 40 data points after all, and the overall trend is roughly consistent. With most libraries of this kind, accuracy is very low when the amount of training data is small. High accuracy takes a lot of data, and a large volume of it is a necessity. If that requirement is not met, using any library is in vain. Therefore, in the practice of machine learning, the real difficulty is not technical issues such as low precision or complex configuration, but that there is not enough data or that its quality is too low (too much useless data in the set). Pre-processing the data is also necessary before any machine learning is done.
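Instead of judging the difference between the two charts by eye, the gap can be put into a number. The sketch below computes the mean absolute error between actual and predicted values in plain PHP. The function name meanAbsoluteError is made up for this illustration, and the numbers are invented, not taken from data.txt.

```php
<?php
// Mean absolute error: the average of |actual - predicted|.
function meanAbsoluteError(array $actual, array $predicted): float
{
    $actual = array_values($actual);
    $predicted = array_values($predicted);
    $n = count($actual);
    if ($n === 0 || $n !== count($predicted)) {
        throw new InvalidArgumentException('Arrays must be non-empty and of equal length.');
    }
    $sum = 0.0;
    for ($i = 0; $i < $n; $i++) {
        $sum += abs($actual[$i] - $predicted[$i]);
    }
    return $sum / $n;
}

// Invented values purely for illustration.
echo meanAbsoluteError([1.0, 2.0, 4.0], [1.5, 2.0, 3.0]), "\n";
```

A lower value means the predictions sit closer to the original data, which makes it easier to compare runs with different amounts of training data.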

Next, we test the iris data. There are three classes. Because we downloaded the data as CSV, we can use the CSV handling that PHP-ML provides out of the box. This is a classification problem, so we choose the SVC algorithm provided by the library. We name the iris data file Iris.csv; the code is as follows:

<?php
require 'vendor/autoload.php';

use Phpml\Classification\SVC;
use Phpml\SupportVectorMachine\Kernel;
use Phpml\Dataset\CsvDataset;

// 4 feature columns, no heading row
$dataset = new CsvDataset('Iris.csv', 4, false);
$classifier = new SVC(Kernel::LINEAR, $cost = 1000);
$classifier->train($dataset->getSamples(), $dataset->getTargets());

// $argv holds the command-line arguments; a program like this is easier to debug from the command line
echo $classifier->predict([$argv[1], $argv[2], $argv[3], $argv[4]]);
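Since the script reads its four feature values straight from the command line, a small validation step can catch typos before they reach the classifier. The following is a hedged sketch in plain PHP; parseFeatures is a hypothetical helper, and the script name iris.php in the example is made up.

```php
<?php
// Hypothetical helper: turn command-line arguments into numeric features.
function parseFeatures(array $argv, int $expected = 4): array
{
    $args = array_slice($argv, 1); // drop the script name
    if (count($args) !== $expected) {
        throw new InvalidArgumentException("Expected $expected feature values.");
    }
    $features = [];
    foreach ($args as $arg) {
        if (!is_numeric($arg)) {
            throw new InvalidArgumentException("Not a number: $arg");
        }
        $features[] = (float) $arg;
    }
    return $features;
}

// Simulates running: php iris.php 5 3.3 1.4 0.2
print_r(parseFeatures(['iris.php', '5', '3.3', '1.4', '0.2']));
```

The returned array could then be passed to the classifier's predict method in place of the raw $argv entries.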

Isn't it simple? About a dozen lines of code and we are done. Next, let's test it. According to the sample data shown above, when we enter 5 3.3 1.4 0.2, the output should be Iris-setosa. Let's take a look:

Look, at least when we enter a sample that exists in the dataset, we get the correct result. But what if we enter data that is not in the original dataset? Let's test two samples:

Judging from the two data excerpts shown earlier, the samples we entered do not exist in the dataset, yet the classification is reasonable based on our initial observations.
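Checking a couple of samples by hand only goes so far. A more systematic check is to hold back some labelled rows, predict them, and count how many predictions match. The sketch below shows only the counting step in plain PHP; the function name accuracy is made up here, and the labels are invented for illustration rather than real classifier output.

```php
<?php
// Fraction of predictions that exactly match the true labels.
function accuracy(array $actual, array $predicted): float
{
    $actual = array_values($actual);
    $predicted = array_values($predicted);
    $n = count($actual);
    if ($n === 0 || $n !== count($predicted)) {
        throw new InvalidArgumentException('Arrays must be non-empty and of equal length.');
    }
    $correct = 0;
    for ($i = 0; $i < $n; $i++) {
        if ($actual[$i] === $predicted[$i]) {
            $correct++;
        }
    }
    return $correct / $n;
}

// Invented labels: three of the four predictions are correct.
$actual    = ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica', 'Iris-setosa'];
$predicted = ['Iris-setosa', 'Iris-versicolor', 'Iris-versicolor', 'Iris-setosa'];
echo accuracy($actual, $predicted), "\n";
```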

So this machine learning library is enough for most people. Most of those who look down on this library or that library and talk endlessly about performance are not real experts. The real experts are busy with their own work or with academic research. Most of us would do better to master the algorithms and understand the principles and subtleties behind them, rather than argue. Of course, this library is not recommended for large projects, only for small or personal projects.


JpGraph depends only on the GD library, so it can be used as soon as it is downloaded and included. Much of the code above goes into drawing and initial data processing; the training code itself is not complicated, thanks to the library's excellent encapsulation. Readers who need the full code or the test datasets can leave a comment or send a private message, and I will provide the complete code, ready to use after unzipping.
