A brief analysis of the use of XGBoost

Source: Internet
Author: User
Tags: wrapper, xgboost, python

Preface: During my internship at Alibaba, we used the GBDT implementation in MLlib to train models. However, since that version of MLlib was not open source, it could not be used outside the company. Later, while taking part in Kaggle competitions, I came across XGBoost, a very useful GBDT tool, and studied it carefully.


GitHub address: https://github.com/dmlc/xgboost

Usage is actually covered in the project's own documentation; the notes below focus on how to use XGBoost in a Python environment on Windows.


1. Follow the official instructions to download and compile the 64-bit release build, then run "python setup.py install".

2. In your Python code, add the wrapper directory to the module search path: sys.path.append('C:\\.........\\xgboost-master\\wrapper') (note: this is the path to the wrapper folder inside the extracted download).

3. Now try "import xgboost as xgb"; if the import succeeds, everything is set up. (Note: XGBoost depends on NumPy and SciPy, and their bitness must match your Python installation; 64-bit builds of numpy and scipy can be downloaded here, while the 32-bit versions are available from the official website.)
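To tell which bitness of NumPy/SciPy you need, you can ask the interpreter directly rather than guessing from the OS. A minimal standard-library-only sketch:

```python
import struct
import platform

# the pointer size of the running interpreter tells you whether you
# need 32-bit or 64-bit builds of numpy/scipy (the OS bitness does not)
bits = struct.calcsize('P') * 8
print('Python %s is %d-bit' % (platform.python_version(), bits))
```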


The training code and prediction code are shown below:

Train

#!/usr/bin/python
import sys, os
sys.path.append('c:\\xgboost-master\\wrapper')

import numpy as np
import scipy.sparse
import xgboost as xgb

## simple example
# load file from text file, also binary buffer generated by xgboost
dtrain = xgb.DMatrix('c:\\predictahe_trainset_libsvmformat.txt')
dtest = xgb.DMatrix('c:\\predictahe_testset_libsvmformat.txt')

# specify parameters via map, definition is same as C++ version
param = {'max_depth': 6, 'eta': 0.3, 'silent': 1, 'objective': 'binary:logistic'}

# specify validations set to watch performance
watchlist = [(dtest, 'eval'), (dtrain, 'train')]
num_round = 10  # the original value was lost; 10 is a placeholder
bst = xgb.train(param, dtrain, num_round, watchlist)

# this is prediction
preds = bst.predict(dtest)
labels = dtest.get_label()
print('error=%f' % (sum(1 for i in range(len(preds)) if int(preds[i] > 0.5) != labels[i]) / float(len(preds))))
bst.save_model('c:\\xgb.model')
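The error computation in the training script can also be written in vectorized NumPy. The arrays below are made-up illustrative values, not the output of the trained model:

```python
import numpy as np

# made-up probabilities and ground-truth labels for illustration
preds = np.array([0.9, 0.2, 0.6, 0.4])
labels = np.array([1, 0, 0, 0])

# threshold at 0.5, then take the fraction of mismatched labels
err = np.mean((preds > 0.5).astype(int) != labels)
print('error=%f' % err)  # error=0.250000
```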


Predict

#!/usr/bin/python
import sys, os
sys.path.append('c:\\xgboost-master\\wrapper')

import numpy as np
import scipy.sparse
import xgboost as xgb

## simple example
# load file from text file, also binary buffer generated by xgboost
dtest2 = xgb.DMatrix('c:\\predictahe_temp_libsvmformat.txt')

# load model and data in
bst2 = xgb.Booster(model_file='c:\\xgb.model')
preds2 = bst2.predict(dtest2)

# this is prediction
outing = open('c:\\predictahe_temp_result.txt', 'w')
outing.write(str(int(preds2[0] > 0.5)))  # only the first prediction is written out
outing.close()
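The script above writes only the first prediction. If the test file contains several rows, one 0/1 label per line can be written instead; preds2 here is a made-up stand-in for the predict() output:

```python
# made-up probabilities standing in for bst2.predict(dtest2)
preds2 = [0.91, 0.08, 0.67]

# threshold each probability and emit one 0/1 label per line
lines = [str(int(p > 0.5)) for p in preds2]
print('\n'.join(lines))
```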

P.S. The input files are in LIBSVM format.
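For reference, each line of a LIBSVM file is a label followed by sparse index:value pairs. A minimal sketch that parses one such line (the line itself is an invented example):

```python
# one LIBSVM line: <label> <index>:<value> <index>:<value> ...
line = '1 3:0.5 7:1.2 12:0.33'

parts = line.split()
label = float(parts[0])
# sparse features as a dict mapping feature index to value
features = {int(i): float(v) for i, v in (p.split(':') for p in parts[1:])}
print(label, features)
```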



