Prediction of Human Motion State: A Case Analysis


Background information:

• The growing popularity of wearable devices makes it easy to use on-board sensors to collect human activity data, and even physiological data.

• Once the sensors have collected a large amount of data, we can analyze and model it, infer the user's state from the feature values, and then provide more accurate and convenient services based on that state.

Data introduction:

• We have collected wearable-device sensor data from five users: A, B, C, D, and E. Each user's dataset contains a feature file (e.g. A.feature) and a label file (e.g. A.label).

• Each row of the feature file holds all sensor readings at one moment, and the corresponding row of the label file holds the user's posture at that moment. The two files have the same number of rows and correspond line by line.
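As a sanity check on this format, the following sketch loads one feature/label pair and verifies that the row counts match (the paths A/A.feature and A/A.label and the '?' missing-value marker follow the full program later in this article; adjust them to your own layout):

import pandas as pd

# One row per moment, one column per sensor; '?' marks a missing reading.
features = pd.read_table('A/A.feature', delimiter=',', na_values='?', header=None)

# One posture label per row, aligned with the feature file.
labels = pd.read_table('A/A.label', header=None)

# The two files correspond line by line, so the row counts must match.
assert len(features) == len(labels)
print(features.shape, labels.shape)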

Task Introduction

Suppose a new user appears, but we only have the data collected by the sensors. How can we determine this new user's posture?

Or, for an existing user, if the sensors collect new data, how do we judge from that new data what posture the user is currently in?

Having established that this is a classification problem, we can select a classification model (or algorithm), train it on the training data, and then output a classification result for each test sample.

There are many machine learning classification algorithms. In what follows we introduce the principles and implementation of three classic ones: K nearest neighbors, decision trees, and naive Bayes.

Basic Classification Model:

K Nearest Neighbor classifier (KNN)

KNN: compute the distance from the point to be classified to every data point in the existing dataset, take the k points with the smallest distances, and, following the "majority rules" principle, assign the point to the category that occurs most frequently among those k neighbors.
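To make the voting rule concrete, here is a minimal from-scratch sketch in NumPy (the function knn_predict and the toy data are illustrative, not part of the article):

import numpy as np
from collections import Counter

def knn_predict(x, X_train, y_train, k=5):
    # Euclidean distance from the query point to every training point.
    distances = np.linalg.norm(X_train - x, axis=1)
    # Indices of the k nearest training points.
    nearest = np.argsort(distances)[:k]
    # "Majority rules": the most frequent label among the k neighbors wins.
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array(['still', 'still', 'walking', 'walking'])
print(knn_predict(np.array([0.2, 0.1]), X_train, y_train, k=3))  # -> 'still'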

The K nearest neighbor classifier in sklearn

In the sklearn library, you can use sklearn.neighbors.KNeighborsClassifier to create a K nearest neighbor classifier. Its main parameters are:

n_neighbors: specifies the k used by the classifier (default is 5; note that this is different from the k in K-means).

weights: sets how the selected k points are weighted in the classification. The default is uniform weighting ('uniform'); you can choose 'distance', where closer points receive higher weight, or pass in your own function that computes weights from distances.

algorithm: sets the method used to find the nearest neighbors. When the dataset is large, computing the distance from the current point to all other points is time-consuming, so sklearn offers 'ball_tree', 'kd_tree', and 'brute', representing different neighbor-search optimizations. The default is 'auto', which chooses automatically based on the training data.
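A minimal sketch of creating a classifier with these parameters (the toy data is illustrative only, not from the article):

from sklearn.neighbors import KNeighborsClassifier

# k=3 neighbors, closer neighbors weighted more heavily,
# neighbor-search method chosen automatically from the data.
knn = KNeighborsClassifier(n_neighbors=3, weights='distance', algorithm='auto')

X = [[0, 0], [0, 1], [1, 0], [1, 1]]  # toy feature vectors
y = [0, 0, 1, 1]                      # toy posture labels
knn.fit(X, y)
print(knn.predict([[0.9, 0.9]]))      # -> [1]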

Practical experience with KNN

In actual use, we can take all the training data to form the feature matrix X and label vector y, and train with the fit() function. For classification, we build a test set, or feed in one sample at a time, and obtain the classification result. About the value of k:

• If k is larger, it is equivalent to predicting with training examples from a larger neighborhood, which reduces estimation error; but distant samples then also influence the prediction, which can introduce prediction error.

• Conversely, if k is smaller, it is equivalent to predicting with a smaller neighborhood; if those few neighbors happen to be noisy points, the prediction will be wrong.

• In general, k tends to be set to a smaller value, and cross-validation is used to select the optimal k, as in the sketch below.
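A minimal sketch of selecting k by cross-validation (it uses the current sklearn.model_selection module rather than the older sklearn.cross_validation seen in the program below, and synthetic stand-in data):

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# Synthetic stand-in for the sensor data.
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Score several small values of k with 5-fold cross-validation
# and keep the one with the best mean accuracy.
scores = {}
for k in [1, 3, 5, 7, 9]:
    knn = KNeighborsClassifier(n_neighbors=k)
    scores[k] = cross_val_score(knn, X, y, cv=5).mean()

best_k = max(scores, key=scores.get)
print(scores, 'best k =', best_k)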

Decision Tree

A decision tree is a tree-structured classifier that determines the final category of a sample by querying its attributes one by one. A decision tree is usually constructed based on the information gain of the features, or on other metrics. At classification time, we simply follow the judgments at the nodes of the decision tree in order to obtain the sample's category.

For example, according to a classification decision tree built for credit-card repayment, a person who owns no real estate, is single, and has an income of 55K would be classified into the "unable to repay the credit card" category.

The decision tree in sklearn

In the sklearn library, you can use sklearn.tree.DecisionTreeClassifier to create a decision tree for classification. Its main parameters are:

criterion: the criterion for selecting attributes; you can pass 'gini' for the Gini coefficient or 'entropy' for information gain.

max_features: the number of features considered when searching for the best split at a decision tree node. You can set a fixed number, a percentage, or other criteria. The default is to consider all features.

A decision tree essentially searches for a partition of the feature space, aiming to build a tree that fits the training data well while remaining reasonably simple.

In actual use, the parameters passed to the DecisionTreeClassifier class need to be adjusted according to the data, for instance choosing a suitable criterion, setting the random state, and so on, as in the sketch below.
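A minimal sketch of these parameters in use (the toy attributes and labels are illustrative, loosely echoing the credit-card example above):

from sklearn.tree import DecisionTreeClassifier

# Split on information gain, consider all features at each split,
# and fix random_state so the result is reproducible.
dt = DecisionTreeClassifier(criterion='entropy', max_features=None, random_state=0)

X = [[1, 0, 125], [0, 1, 100], [0, 0, 70], [1, 1, 120]]  # toy: [has_estate, married, income_k]
y = [1, 1, 0, 1]                                         # toy: 1 = can repay, 0 = cannot
dt.fit(X, y)
print(dt.predict([[0, 0, 55]]))  # -> [0], i.e. unable to repay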

Naive Bayes

A naive Bayes classifier is a multi-class classifier based on Bayes' theorem.

For given data, it first learns the joint probability distribution of input and output under the assumption that the features are conditionally independent; then, for a given input x, it uses Bayes' theorem to find the output y with the maximum posterior probability, i.e. y* = argmax_y P(y) ∏_i P(x_i | y).

Naive Bayes in sklearn

In the sklearn library, you can use sklearn.naive_bayes.GaussianNB to create a Gaussian naive Bayes classifier. Its main parameter is:

priors: the prior probability of each class. If empty, the priors are estimated from the actual training data; if priors are given, they are not changed during training.
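A minimal sketch of fitting a Gaussian naive Bayes classifier with fixed priors (toy data, illustrative only):

from sklearn.naive_bayes import GaussianNB

# Fix the class priors instead of estimating them from the training data.
gnb = GaussianNB(priors=[0.5, 0.5])

X = [[1.0, 2.0], [1.2, 1.9], [3.0, 4.0], [3.1, 4.2]]  # toy feature vectors
y = [0, 0, 1, 1]                                      # toy class labels
gnb.fit(X, y)
print(gnb.predict([[1.1, 2.1]]))        # -> [0]
print(gnb.predict_proba([[1.1, 2.1]]))  # posterior probability of each class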

Naive Bayes is a typical generative learning method: it learns the joint probability distribution from the training data and derives the posterior probability distribution from it.

Naive Bayes generally performs well on small-scale data and is suitable for multi-class tasks.

Writing the program

#!/usr/bin/env python3
# -*- coding: utf-8 -*-
"""
Created on Sat May 07:52:08 2017

@author: Xiaolian
"""
import pandas as pd
import numpy as np
# Note: Imputer and cross_validation are the old (pre-0.20) sklearn modules
# this article was written against; in current sklearn the equivalents are
# sklearn.impute.SimpleImputer and sklearn.model_selection.
from sklearn.preprocessing import Imputer
from sklearn.cross_validation import train_test_split
from sklearn.metrics import classification_report
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import GaussianNB


def load_datasets(feature_paths, label_paths):
    # 41 sensor columns per row in this dataset; adjust to match the
    # number of columns in your feature files.
    feature = np.ndarray(shape=(0, 41))
    label = np.ndarray(shape=(0, 1))
    for file in feature_paths:
        # '?' marks missing sensor readings.
        df = pd.read_table(file, delimiter=',', na_values='?', header=None)
        # Fill missing values with the column mean.
        imp = Imputer(missing_values='NaN', strategy='mean', axis=0)
        imp.fit(df)
        df = imp.transform(df)
        feature = np.concatenate((feature, df))
    for file in label_paths:
        df = pd.read_table(file, header=None)
        label = np.concatenate((label, df))
    label = np.ravel(label)
    return feature, label


if __name__ == '__main__':
    # Data paths
    feature_paths = ['A/A.feature', 'B/B.feature', 'C/C.feature',
                     'D/D.feature', 'E/E.feature']
    label_paths = ['A/A.label', 'B/B.label', 'C/C.label',
                   'D/D.label', 'E/E.label']

    # Load data: users A-D for training, user E for testing.
    x_train, y_train = load_datasets(feature_paths[:4], label_paths[:4])
    x_test, y_test = load_datasets(feature_paths[4:], label_paths[4:])
    # test_size=0.0 keeps all samples for training; the call just shuffles them.
    x_train, x_, y_train, y_ = train_test_split(x_train, y_train, test_size=0.0)

    print('Start training KNN')
    knn = KNeighborsClassifier().fit(x_train, y_train)
    print('Training done')
    answer_knn = knn.predict(x_test)
    print('Prediction done')

    print('Start training DT')
    dt = DecisionTreeClassifier().fit(x_train, y_train)
    print('Training done')
    answer_dt = dt.predict(x_test)
    print('Prediction done')

    print('Start training Bayes')
    gnb = GaussianNB().fit(x_train, y_train)
    print('Training done')
    answer_gnb = gnb.predict(x_test)
    print('Prediction done')

    print('\n\nThe classification report for KNN:')
    print(classification_report(y_test, answer_knn))
    print('\n\nThe classification report for DT:')
    print(classification_report(y_test, answer_dt))
    print('\n\nThe classification report for Bayes:')
    print(classification_report(y_test, answer_gnb))
