PHP-ML is a machine learning library written in PHP. Python and C++ offer far more machine learning libraries, but most of them are fairly complex, and their configuration drives many novices to despair. PHP-ML does not offer particularly advanced algorithms, but it covers the basics of machine learning, classification and so on, which is enough for a small company doing simple data analysis and prediction. In our projects we should pursue cost-effectiveness rather than maximum efficiency and precision.
Some algorithms and libraries look great, but if we need to go live quickly and our engineers have no machine learning experience, complex code and configuration will actually slow the project down. For a simple machine learning application, the cost of learning a complex library and its algorithms is clearly too high. And when the project hits a strange problem, can we solve it? What if the requirements change? I believe we have all had this experience: the program suddenly throws an error, we have no idea why, and a search on Google or Baidu turns up exactly one matching question, asked five or ten years ago, with 0 replies.
So it makes sense to choose the simplest, most efficient and most cost-effective approach. PHP-ML is not slow (switch to PHP 7 already), and its accuracy is good; after all, the algorithms are the same, and PHP itself is implemented in C. What annoys me most is people comparing Python, Java and PHP on performance and on scope of application. If you really want performance, develop in C. If you really want the widest scope of application, also use C, or even something lower-level ...
First, to use the library we need to download it. It is available on GitHub (https://github.com/php-ai/php-ml). The recommended way, however, is to install it with Composer (composer require php-ai/php-ml), which downloads the library and sets up autoloading automatically.
Once the download is finished, take a look at the library's documentation. It consists of simple, easy-to-understand examples, and we can create a file to try them out. Next, let's test the library with real data. The first data set is the iris flower data set; the record of what the second one describes has been lost, so I don't know what its data is about ...
Part of the iris data; there are three different classes:
The unknown data set; its decimal separator is a comma, so the values have to be converted before any calculation:
We'll deal with the unknown data set first. Its file is named data.txt. The data can be plotted as an x-y line chart, so we first draw the raw data as a line chart. Because the x-axis is long, we only need to look at the overall shape:
The chart is drawn with the PHP JpGraph library, using the following code:
<?php
include_once './src/jpgraph.php';
include_once './src/jpgraph_line.php';

$g = new Graph(1920, 1080); // JpGraph drawing setup
$g->SetScale('textint');
$g->title->Set('Data');

// Read the file
$file = fopen('data.txt', 'r');
$labels = array();
while (($line = fgets($file)) !== false) {
    $data = explode(' ', trim($line));
    if (count($data) < 2) {
        continue; // skip blank or malformed lines
    }
    $data[1] = str_replace(',', '.', $data[1]); // turn the decimal comma in the data into a decimal point
    $labels[(int)$data[0]] = (float)$data[1]; // store the data keyed by x so that we can sort by key
}
fclose($file);

ksort($labels); // sort by key

$x = array(); // x-axis data
$y = array(); // y-axis data
foreach ($labels as $key => $value) {
    array_push($x, $key);
    array_push($y, $value);
}

$linePlot = new LinePlot($y);
$g->xaxis->SetTickLabels($x);
$linePlot->SetLegend('Data');
$g->Add($linePlot);
$g->Stroke();
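The comma handling in the script above is the step that is easiest to get wrong, so here is a minimal sketch of it in isolation. It assumes each line of data.txt looks like "12 3,5" (an integer x, whitespace, then y with a decimal comma). parseRecord is a hypothetical helper name, not something provided by JpGraph or PHP-ML, and the sample line is made up.

```php
<?php
// Hypothetical helper: parse one line of the form "<int x> <y with decimal comma>".
// Returns [x, y], or null for a blank or malformed line.
function parseRecord(string $line): ?array
{
    $parts = preg_split('/\s+/', trim($line));
    if ($parts === false || count($parts) < 2) {
        return null; // nothing usable on this line
    }
    $x = (int) $parts[0];
    $y = (float) str_replace(',', '.', $parts[1]); // "1,25" becomes 1.25
    return [$x, $y];
}

// Usage with a made-up line:
[$x, $y] = parseRecord("3 1,25\n");
```

Returning null for unusable lines keeps the reading loop simple: it can skip them instead of storing a bogus 0 value, which is what a plain cast of an empty string would produce.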
With this original chart as a baseline for comparison, we move on to the learning step. We use the LeastSquares class in PHP-ML. The predicted output needs to be stored in a file so that we can draw a comparison chart. The learning code is as follows:
<?php
require 'vendor/autoload.php';

use Phpml\Regression\LeastSquares;

$file = fopen('data.txt', 'r');
$samples = array();
$labels = array();
while (($line = fgets($file)) !== false) {
    $data = explode(' ', trim($line));
    if (count($data) < 2) {
        continue; // skip blank or malformed lines
    }
    $samples[] = [(int)$data[0]];
    $labels[] = (float)str_replace(',', '.', $data[1]);
}
fclose($file);

$regression = new LeastSquares();
$regression->train($samples, $labels);

// $a holds the x values taken from the processed original data; it is used for testing.
$a = [0,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,22,23,24,25,26,27,29,30,31,37,40,41,45,48,53,55,57,60,61,108,124];
for ($i = 0; $i < count($a); $i++) {
    // Append each prediction to the file; delete putput.txt before re-running.
    file_put_contents('putput.txt', $regression->predict([$a[$i]]) . "\n", FILE_APPEND);
}
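To make clear what kind of model is being trained here, the following is a rough pure-PHP sketch of ordinary least squares for a single input variable. It is not PHP-ML's implementation; fitLine and predictLine are hypothetical names, the arrays are assumed to be 0-indexed lists, and the numbers are made up. With one feature, a least-squares model is a straight line, so it can follow the overall trend of the data but not its jagged parts.

```php
<?php
// Hypothetical helper: fit y = intercept + slope * x by ordinary least squares.
function fitLine(array $x, array $y): array
{
    $n = count($x);
    if ($n < 2 || $n !== count($y)) {
        throw new InvalidArgumentException('Need at least two (x, y) pairs.');
    }
    $meanX = array_sum($x) / $n;
    $meanY = array_sum($y) / $n;

    $sxy = 0.0; // sum of (x - meanX) * (y - meanY)
    $sxx = 0.0; // sum of (x - meanX)^2
    for ($i = 0; $i < $n; $i++) {
        $sxy += ($x[$i] - $meanX) * ($y[$i] - $meanY);
        $sxx += ($x[$i] - $meanX) ** 2;
    }
    if ($sxx == 0.0) {
        throw new InvalidArgumentException('All x values are identical.');
    }
    $slope = $sxy / $sxx;
    return ['intercept' => $meanY - $slope * $meanX, 'slope' => $slope];
}

// Hypothetical helper: evaluate the fitted line at one x value.
function predictLine(array $model, float $x): float
{
    return $model['intercept'] + $model['slope'] * $x;
}

// Made-up points lying exactly on y = 1 + 2x.
$model = fitLine([0, 1, 2], [1, 3, 5]);
echo predictLine($model, 3), PHP_EOL; // prints 7
```

In the real script the library does this work for us; the sketch is only meant to show why the predicted curve comes out so much smoother than the raw data.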
Next, we read the data stored in the file and draw a chart. Here is the final result first:
The code is as follows:
<?php
include_once './src/jpgraph.php';
include_once './src/jpgraph_line.php';

$g = new Graph(1920, 1080);
$g->SetScale('textint');
$g->title->Set('Data');

$file = fopen('putput.txt', 'r');
$y = array();
while (($line = fgets($file)) !== false) {
    if (trim($line) === '') {
        continue; // skip blank lines so no bogus 0 is plotted
    }
    $y[] = (float)$line;
}
fclose($file);

$x = [0,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,20,22,23,24,25,26,27,29,30,31,37,40,41,45,48,53,55,57,60,61,108,124];

$linePlot = new LinePlot($y);
$g->xaxis->SetTickLabels($x);
$linePlot->SetLegend('Data');
$g->Add($linePlot);
$g->Stroke();
Comparing the two charts shows that they differ considerably, especially in the more jagged sections. Still, there are only 40 data points, and the overall trend of the two charts is consistent. With libraries like this, a small amount of data means low accuracy. High precision requires a large volume of data, and if that requirement is not met, no library will help. So in machine learning practice the real difficulty is not technical issues such as low precision or complex configuration, but data that is insufficient or of poor quality (a data set containing too much useless data). Data also needs to be preprocessed before any learning is done.
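Comparing two charts by eye is subjective, so one possible way to put a number on the difference is the root-mean-square error (RMSE) between the original y values and the predicted ones. The sketch below is an addition, not part of the original experiment; rmse is a hypothetical helper and the values are made up.

```php
<?php
// Hypothetical helper: root-mean-square error between two equally long lists.
function rmse(array $actual, array $predicted): float
{
    $n = count($actual);
    if ($n === 0 || $n !== count($predicted)) {
        throw new InvalidArgumentException('Arrays must be non-empty and of equal length.');
    }
    $sum = 0.0;
    foreach ($actual as $i => $value) {
        $sum += ($value - $predicted[$i]) ** 2; // squared difference per point
    }
    return sqrt($sum / $n);
}

// Made-up values: the closer the result is to 0, the closer the two curves are.
$error = rmse([1.0, 2.0, 3.0], [1.5, 2.0, 2.5]);
```

Fed with the y values from data.txt and the predictions stored in putput.txt, a single number like this would make it easier to tell whether more data or better preprocessing actually improves the fit.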
Next, we test the iris data. It has three classes. Because we downloaded the data as CSV, we can use the CsvDataset class that PHP-ML provides for reading CSV files. This is a classification problem, so we choose the SVC algorithm provided by the library. We name the iris data file Iris.csv; the code is as follows:
<?php
require 'vendor/autoload.php';

use Phpml\Classification\SVC;
use Phpml\SupportVectorMachine\Kernel;
use Phpml\Dataset\CsvDataset;

$dataset = new CsvDataset('Iris.csv', 4, false); // 4 feature columns, no header row
$classifier = new SVC(Kernel::LINEAR, $cost = 1000);
$classifier->train($dataset->getSamples(), $dataset->getTargets());

echo $classifier->predict([$argv[1], $argv[2], $argv[3], $argv[4]]); // $argv holds the command-line arguments; running from the command line makes debugging easier
Simple, isn't it? Just 12 lines of code and we're done. Now let's test it. According to the data shown above, when we enter 5 3.3 1.4 0.2 the output should be Iris-setosa. Let's take a look:
See, at least when we enter a sample that exists in the data set, we get the right result. But what if we enter data that is not in the original data set? Let's test two groups:
Judging from the two data images posted earlier, the values we entered do not exist in the data set, yet the classification is reasonable according to our initial observations.
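Spot-checking a few inputs by hand is a good start. A more systematic check, not done in the original experiment, would be to hold back some labelled rows, predict them, and compute the share of correct answers. Below is a minimal sketch of that last step; accuracy is a hypothetical helper and the label lists are made up for illustration. In practice the predicted labels would come from the classifier.

```php
<?php
// Hypothetical helper: fraction of predictions that match the expected labels.
function accuracy(array $expected, array $predicted): float
{
    $n = count($expected);
    if ($n === 0 || $n !== count($predicted)) {
        throw new InvalidArgumentException('Arrays must be non-empty and of equal length.');
    }
    $hits = 0;
    foreach ($expected as $i => $label) {
        if ($label === $predicted[$i]) {
            $hits++;
        }
    }
    return $hits / $n;
}

// Made-up labels: three of the four predictions are correct.
$expected  = ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica', 'Iris-setosa'];
$predicted = ['Iris-setosa', 'Iris-versicolor', 'Iris-versicolor', 'Iris-setosa'];
printf("%.2f\n", accuracy($expected, $predicted)); // prints 0.75
```

The rows used for this check should be left out of training; otherwise the score mostly shows that the model remembers data it has already seen.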
So this machine learning library is enough for most people. Most of those who look down on this library or that library and talk endlessly about performance are not really experts. Real experts are busy with their own work or with academic research. The rest of us should focus on mastering the algorithms and understanding the principles behind them, not on rhetoric. Of course, this library is not recommended for large projects, only for small or personal ones.
JpGraph depends only on the GD library, so it can be used right after downloading and including it. Much of the code above goes into drawing and initial data processing; the learning code itself is not complicated, thanks to the library's excellent encapsulation. Readers who need the full code or the test data sets can leave a comment or send me a private message, and I will provide the complete code, which works as soon as it is unzipped (the blog's storage space is too small for uploading files). I am still learning too, so let's make progress together.
Simple testing and use of PHP machine learning Library PHP-ML