Spark 2: Model Selection and Tuning

Source: Internet
Author: User

Model Selection

  An important task in ML is model selection: using data to find the best model or parameters for a given task. This is also called tuning. Tuning can be done for an individual Estimator such as LogisticRegression, or for an entire Pipeline that includes multiple algorithms, featurization, and other steps. Users can tune an entire Pipeline at once rather than tuning each element of the Pipeline separately.

MLlib supports model selection using tools such as CrossValidator and TrainValidationSplit. These tools require the following items:
Estimator: the algorithm or Pipeline to tune
Set of ParamMaps: the parameters to choose from, sometimes called a "parameter grid" to search over
Evaluator: the metric that measures how well a fitted Model does on held-out test data

At a high level, these model selection tools work as follows:

They split the input data into separate training and test datasets.
For each (training, test) pair, they iterate through the set of ParamMaps:
  For each ParamMap, they fit the Estimator using those parameters, get the fitted Model, and evaluate the Model's performance using the Evaluator.
They select the Model produced by the best-performing set of parameters.
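The steps above can be sketched in plain Python, without Spark. This is only an illustration of the procedure, not MLlib's implementation: the toy estimator `fit`, the `shrink` parameter, the k-fold splitter, and the sample data are all hypothetical, and RMSE is assumed as the metric (smaller is better).

```python
import math
from statistics import mean


def k_fold_pairs(rows, k):
    """Yield (training, test) pairs; each fold is the test set exactly once."""
    folds = [rows[i::k] for i in range(k)]
    for i in range(k):
        train = [r for j, fold in enumerate(folds) if j != i for r in fold]
        yield train, folds[i]


def fit(train, params):
    """Hypothetical toy estimator: predicts the training mean times 'shrink'."""
    prediction = params["shrink"] * mean(y for _, y in train)
    return lambda x: prediction


def evaluate_rmse(model, test):
    """Toy evaluator: root mean squared error on held-out rows."""
    return math.sqrt(mean((model(x) - y) ** 2 for x, y in test))


def select_best(rows, param_maps, k=3):
    totals = [0.0] * len(param_maps)
    # For each (training, test) pair, iterate through the set of parameter maps.
    for train, test in k_fold_pairs(rows, k):
        for i, params in enumerate(param_maps):
            model = fit(train, params)
            totals[i] += evaluate_rmse(model, test)
    avg_scores = [total / k for total in totals]
    # RMSE is an error, so the best parameter map has the smallest average.
    best_i = min(range(len(param_maps)), key=lambda i: avg_scores[i])
    # Assumption: refit on all rows with the winning parameters.
    best_model = fit(rows, param_maps[best_i])
    return param_maps[best_i], best_model, avg_scores


# Illustrative data: labels cycle through 9, 10, 11.
rows = [(float(x), 10.0 + (x % 3) - 1) for x in range(9)]
param_maps = [{"shrink": 0.0}, {"shrink": 0.5}, {"shrink": 1.0}]
best_params, best_model, avg_scores = select_best(rows, param_maps, k=3)
print(best_params, avg_scores)
```

Each parameter map is scored on every fold and the scores are averaged, so a single lucky split cannot decide the winner on its own.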

The Evaluator can be a RegressionEvaluator for regression problems, a BinaryClassificationEvaluator for binary data, or a MulticlassClassificationEvaluator for multiclass problems. Each evaluator has a default metric for choosing the best ParamMap; it can be overridden with the evaluator's setMetricName method.
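The choice of metric also determines what "best" means: an error such as RMSE should be minimized, while a score such as accuracy should be maximized. The following plain-Python sketch illustrates that idea only; the `METRICS` table and `best_index` helper are hypothetical and are not part of the Spark API.

```python
import math


def rmse(pairs):
    """Root mean squared error over (prediction, label) pairs."""
    return math.sqrt(sum((p - y) ** 2 for p, y in pairs) / len(pairs))


def accuracy(pairs):
    """Fraction of (prediction, label) pairs that match exactly."""
    return sum(1 for p, y in pairs if p == y) / len(pairs)


# Hypothetical lookup: metric name -> (function, whether larger is better).
METRICS = {
    "rmse": (rmse, False),
    "accuracy": (accuracy, True),
}


def best_index(scores, metric_name):
    """Index of the best score under the named metric's direction."""
    _, larger_is_better = METRICS[metric_name]
    pick = max if larger_is_better else min
    return pick(range(len(scores)), key=lambda i: scores[i])


print(best_index([0.7, 0.9, 0.8], "accuracy"))  # highest score wins
print(best_index([2.0, 0.5, 1.0], "rmse"))      # lowest error wins
```

Changing the metric name can therefore change which parameter set is selected, even when the fitted models stay the same.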

To help construct the parameter grid, users can use the ParamGridBuilder utility.
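Conceptually, a parameter grid is the Cartesian product of the candidate values for each parameter. The sketch below shows that idea in plain Python; `build_param_grid` is a hypothetical helper, not the ParamGridBuilder API, and the parameter names and values are only illustrative.

```python
from itertools import product


def build_param_grid(grid):
    """Expand {name: [candidate values]} into a list of parameter maps."""
    names = list(grid)
    value_lists = [grid[name] for name in names]
    # One parameter map per combination of candidate values.
    return [dict(zip(names, combo)) for combo in product(*value_lists)]


# Illustrative parameter names and values.
param_maps = build_param_grid({
    "regParam": [0.1, 0.01],
    "maxIter": [10, 50, 100],
})

print(len(param_maps))
for params in param_maps:
    print(params)
```

The grid size is the product of the list lengths (2 x 3 here), so every added parameter multiplies the number of models that must be fitted and evaluated.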
