Preface
GraphLab Create is a machine learning library, and its SFrame is a very powerful data management tool. It lets you read data directly from the hard disk without loading all of it into memory, which makes processing large datasets possible. This is also its biggest advantage over scikit-learn, which can only work with data held in memory.
GraphLab Create is a good machine learning library. It integrates with the Jupyter notebook, an open-source web application used in machine learning, data statistics, analysis, modeling, and other fields. Notebook files end in .ipynb.
GraphLab Create: https://turi.com/
Install
--
To use this library, we first have to install it. The installation process is not very troublesome, so I will not cover it here. If you do not know how, you can search Baidu directly; there are a lot of tutorials.
After the installation is complete, click the software icon on the desktop. Then, select
It will jump directly to Jupyter (formerly IPython Notebook)
Create a workspace
Modify the workspace name
In this way, we can start the operation.
1. Before use, we must first import this package
import graphlab
Read a dataset
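As a minimal sketch of this step, a CSV file can be read into an SFrame with `graphlab.SFrame.read_csv`. The file name `people-example.csv` and the helper name `load_people` are assumptions for illustration; the original does not name the dataset file.

```python
def load_people(path='people-example.csv'):
    """Read a CSV file into an SFrame.

    Hypothetical helper: the default path is only a placeholder file name.
    """
    # Imported inside the function so that defining the helper does not
    # require GraphLab Create to be installed.
    import graphlab
    return graphlab.SFrame.read_csv(path)
```

In a notebook cell, `sf = load_people()` would then produce the SFrame used in the following steps.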
Tip: to view the first or last few rows of the data, we use
sf.head()  # view the first few rows
sf.tail()  # view the last few rows
Manipulate column data
These are some basic operations: you only need to select a column, and then you can operate on it much as you would on an array. Try it yourself.
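A rough sketch of such column operations follows. It assumes the dataset has a numeric column named `age`; that column name and the helper `describe_column` are hypothetical, since the original does not list the columns.

```python
def describe_column(sf, column='age'):
    """Return a few array-style results for one numeric column.

    `sf` is expected to be an SFrame; `column` is a hypothetical column name.
    """
    values = sf[column]          # selecting a column gives an SArray
    return {
        'mean': values.mean(),   # aggregate over the whole column
        'max': values.max(),
        'plus_two': values + 2,  # element-wise arithmetic, as with arrays
    }
```

The arithmetic result is itself a column, so it can be inspected or stored back into the SFrame.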
Add a new column
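One way this might look, assuming the dataset has string columns named `First Name` and `Last Name` (hypothetical names, as is the helper `add_full_name`): assigning to a column name that does not exist yet creates the new column.

```python
def add_full_name(sf):
    """Add a 'Full Name' column built from two existing string columns.

    The column names here are assumptions for illustration only.
    """
    # Adding string columns concatenates them element by element;
    # assigning to a new key creates the column.
    sf['Full Name'] = sf['First Name'] + ' ' + sf['Last Name']
    return sf
```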
2. Simple use of GraphLab Canvas
Canvas is a graphical tool.
We have stored the personal information dataset in sf.
Directly call sf.show()
It will automatically open in another browser tab
You can click around to try it
We want to display it in the current notebook page instead of another page. How can we do this? Here, you just need to redirect the Canvas output
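A minimal sketch of the redirect, using `graphlab.canvas.set_target('ipynb')` to send Canvas output to the notebook itself. The helper name `show_inline` is an assumption for illustration.

```python
def show_inline(sf):
    """Render the Canvas view of `sf` inside the current notebook page."""
    # Imported inside the function so that defining the helper does not
    # require GraphLab Create to be installed.
    import graphlab
    # Redirect Canvas from a separate browser tab to the notebook output cell.
    graphlab.canvas.set_target('ipynb')
    sf.show()
```

The target only needs to be set once per session; later calls to `show()` should then stay in the notebook.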
Next, we will solve a scenario problem.
One problem in our personnel information table is that, in the country column, USA and United States refer to the same country but are written differently. If we do not unify the data, a machine learning model built on it may be less accurate, because the machine will treat these two forms as two different countries.
Solution
Before we can build a machine learning model on this dataset, we need to make some changes to it.
We use the apply function to convert the data.
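The conversion could be sketched as below. The column name `Country`, the helper names, and the choice of `United States` as the unified spelling are assumptions; the original only says that the two forms should be made the same.

```python
def transform_country(country):
    """Map the 'USA' spelling onto 'United States'; leave other values unchanged."""
    if country == 'USA':
        return 'United States'
    return country


def unify_countries(sf, column='Country'):
    """Rewrite the column so that both spellings collapse to one value."""
    # apply() calls the function once per element and returns a new column.
    sf[column] = sf[column].apply(transform_country)
    return sf
```

Keeping the mapping in its own small function makes it easy to check on single values before running it over the whole column.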
Okay. That covers the common operations of GraphLab Create. Later, we will introduce how to process data in some practical scenarios.
Basic use of GraphLab Create