Machine Learning - Ensemble Learning - Advanced Ensemble Techniques Tutorial
- Stacking
Stacking (stacked generalization) is an ensemble technique that uses meta-learning to generate predictions. A meta-model produces the final prediction based on the predictions of the sub-models (base models). This final model is said to be stacked on top of the others, hence the name stacking.
- In this approach, the original data is first split into train and test sets.
- The training set is then split into n folds; in each round, (n-1) folds are used for fitting and the remaining (nth) fold acts as the validation part.
- First the meta-model is trained using the base models' out-of-fold predictions; then the base models are retrained on the full training data.
Suppose n is 4 and we have 3 base models: LR, DT and KNN.
Let's take the LR model first.
The (n-1) data parts are fitted into the base model, which then predicts the remaining (nth) validation part.
i.e., the first 3 parts of the data are fitted into the LR model, and the 4th part is predicted.
Next we fit on the 1st, 2nd and 4th parts of the data and make predictions on the 3rd part.
Then we fit on the 1st, 3rd and 4th parts of the data and make predictions on the 2nd part.
Finally we fit on the 2nd, 3rd and 4th parts of the data and make predictions on the 1st part.
In this way we obtain predictions for every part of the training data.
So far step iii has only been done for the LR model; we need to perform the same steps for the remaining models, i.e., DT and KNN.
This gives us new columns LR_pred, DT_pred and KNN_pred. We can use these columns as input and the actual target column as output.
- Based on the data obtained in step iii, we fit/train the meta-model on that data.
Let's say the meta-model is RF, in which LR_pred, DT_pred and KNN_pred are the input columns and the actual target column is the output.
In this way the meta-model is trained on the training dataset.
But the base models still need to be trained on the full training data, which we will see in step v.
- Forget the folds for a moment: now train LR on the full training data, then train DT on the same data, and then KNN.
Now we have trained all 4 models (3 base models: LR, DT, KNN, and 1 meta-model: RF).
We can use these fitted model objects to make predictions on the test data.
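A minimal sketch of the whole procedure in scikit-learn, assuming X_train, y_train and X_test already exist as NumPy arrays (all variable names here are illustrative):

```python
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import KFold
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier

# Assumed to exist already: X_train, y_train, X_test (NumPy arrays)
base_models = {"LR": LogisticRegression(max_iter=1000),
               "DT": DecisionTreeClassifier(random_state=42),
               "KNN": KNeighborsClassifier()}

# Step iii: out-of-fold predictions -> the LR_pred, DT_pred, KNN_pred columns
kf = KFold(n_splits=4, shuffle=True, random_state=42)
oof = np.zeros((len(X_train), len(base_models)))
for j, model in enumerate(base_models.values()):
    for fit_idx, val_idx in kf.split(X_train):
        m = clone(model)
        m.fit(X_train[fit_idx], y_train[fit_idx])      # fit on the (n-1) parts
        oof[val_idx, j] = m.predict(X_train[val_idx])  # predict the held-out part

# Step iv: train the meta-model (RF) on the prediction columns vs. the actual target
meta_model = RandomForestClassifier(random_state=42).fit(oof, y_train)

# Step v: retrain every base model on the full training data
for model in base_models.values():
    model.fit(X_train, y_train)

# Test-time prediction: base-model predictions feed the meta-model
level1_test = np.column_stack([m.predict(X_test) for m in base_models.values()])
final_pred = meta_model.predict(level1_test)
```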
Original Data - The original training data, which is split into n folds
Base Models - The Level 1 individual models
Level 1 Predictions - The out-of-fold predictions generated by the base models on the original data
Level 2 Model - The meta-learner, i.e., the model which combines the Level 1 predictions to generate the final prediction
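scikit-learn also wraps this Level 1 / Level 2 workflow in StackingClassifier, which generates the out-of-fold Level 1 predictions and refits the base models internally. A sketch, again assuming X_train, y_train and X_test exist:

```python
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

stack = StackingClassifier(
    estimators=[                      # Level 1: base models
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(random_state=42)),
        ("knn", KNeighborsClassifier()),
    ],
    final_estimator=RandomForestClassifier(random_state=42),  # Level 2: meta-learner
    cv=4,                             # internal K-fold, mirroring n = 4 in the walkthrough above
)
stack.fit(X_train, y_train)
y_pred = stack.predict(X_test)
```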
Blending – Stacking using the hold-out method is called blending.
- In this, the original data is split into train and test sets (say in an 80:20 ratio).
- Then the training set is split into a training part and a validation part, say D_train and D_validation (again in an 80:20 ratio).
- Suppose we have 3 base models, i.e., LR, DT and KNN, and RF as the meta-model.
First we train LR on D_train and make predictions on D_validation; call the output LR_pred.
Secondly we train DT on D_train and make predictions on D_validation; call the output DT_pred.
Lastly we train KNN on D_train and make predictions on D_validation; call the output KNN_pred.
- Now we use LR_pred, DT_pred and KNN_pred as input columns and the actual target column as output to train the meta-model, i.e., RF.
- Thus the meta-model is trained in step iv, and all 3 base models were already trained in step iii.
- Now we can use all 4 models to make predictions on the test dataset that was split off in step i.
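A minimal sketch of this blending workflow, assuming X and y hold the original features and target (names are illustrative):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier

# Step i: original data -> train / test (80:20)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Step ii: train -> D_train / D_validation (80:20)
X_dtr, X_val, y_dtr, y_val = train_test_split(X_train, y_train, test_size=0.2, random_state=42)

# Step iii: fit each base model on D_train and predict D_validation
base_models = [LogisticRegression(max_iter=1000),
               DecisionTreeClassifier(random_state=42),
               KNeighborsClassifier()]
val_preds = np.column_stack([m.fit(X_dtr, y_dtr).predict(X_val) for m in base_models])

# Step iv: train the meta-model (RF) on LR_pred, DT_pred, KNN_pred vs. the actual target
meta_model = RandomForestClassifier(random_state=42).fit(val_preds, y_val)

# Step v: predict the test set through both levels
test_preds = np.column_stack([m.predict(X_test) for m in base_models])
final_pred = meta_model.predict(test_preds)
```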
Stacking – Stacking proper uses the K-fold method only (as described above).
Bagging
Bagging, also known as bootstrap aggregating, is an ensemble technique that aggregates multiple versions of a predictive model, each trained on a sample obtained by bootstrapping. Bootstrapping is the method of randomly creating samples from the original data with replacement (meaning that individual data points can be chosen more than once) to achieve randomness. Each model is trained individually using a homogeneous machine learning algorithm, and the models are combined using an averaging process to yield a more accurate estimate. The primary focus of bagging is to deal with the bias-variance trade-off: it reduces variance and helps avoid overfitting.
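A tiny sketch of what one bootstrap sample looks like (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(42)
data = np.arange(10)                           # original rows 0..9

# One bootstrap sample: same size as the original, drawn WITH replacement,
# so some rows repeat and some never appear (those are the "out of bag" rows)
sample_idx = rng.choice(len(data), size=len(data), replace=True)
print(data[sample_idx])                        # sample containing duplicates
print(np.setdiff1d(data, data[sample_idx]))    # the out-of-bag rows
```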
If the dependent variable is continuous we use BaggingRegressor; otherwise, for classification problems, we use BaggingClassifier. Both work the same way; only the aggregation method changes.
In a regression problem the mean/average of the predictions is calculated, while in classification the mode (majority vote) is used in the aggregation step.
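In scikit-learn these correspond to BaggingClassifier and BaggingRegressor (both default to decision trees as the base estimator); a quick sketch on toy data:

```python
from sklearn.ensemble import BaggingClassifier, BaggingRegressor
from sklearn.datasets import make_classification, make_regression

# Classification: aggregation is a majority vote (mode) over the bootstrapped trees
Xc, yc = make_classification(n_samples=500, random_state=42)
clf = BaggingClassifier(n_estimators=100, random_state=42).fit(Xc, yc)

# Regression: aggregation is the mean of the individual predictions
Xr, yr = make_regression(n_samples=500, random_state=42)
reg = BaggingRegressor(n_estimators=100, random_state=42).fit(Xr, yr)

print(clf.predict(Xc[:5]), reg.predict(Xr[:5]))
```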
Advantages of Bagging in Machine Learning
- Bagging reduces overfitting
- It improves the model's accuracy
- It handles higher-dimensional data efficiently
What is the difference between bagging with decision trees and random forest?
The difference between bagging and random forest is that in bagging we can use any homogeneous algorithm, not just decision trees, to create the ensemble (e.g., for 3 models we could use SVM, SVM, SVM or LR, LR, LR).
In random forest we use only decision trees as the homogeneous algorithm to create the ensemble.
The other difference is that in bagging with decision trees there is tree-level sampling (the column and row samples are decided before each tree is built), whereas in random forest there is node-level sampling (the column subset is decided at the time each node is split). Hence there is more randomness in random forest than in bagging with decision trees (see video 93).
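The two setups can be written side by side; a sketch (in scikit-learn, max_features on the forest is what drives the node-level column sampling, and the bagging estimator parameter was called base_estimator before version 1.2):

```python
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC

# Bagging: any homogeneous base estimator; sampling is fixed per model/tree
bag_svm = BaggingClassifier(estimator=SVC(), n_estimators=10, random_state=42)
bag_tree = BaggingClassifier(estimator=DecisionTreeClassifier(),
                             n_estimators=100,
                             max_features=0.7,       # tree-level column sampling
                             random_state=42)

# Random forest: decision trees only; column sampling happens at every node split
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=42)
```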
Three Types of Bagging –
i] Pasting – same as bagging; the only difference is that the row sampling is done without replacement.
ii] Random Subspaces – a bagging technique with column sampling (with or without replacement) and no row sampling; only columns are sampled.
iii] Random Patches – a bagging technique with both row and column sampling.
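All three variants can be expressed through BaggingClassifier's sampling flags; a sketch with illustrative parameter values:

```python
from sklearn.ensemble import BaggingClassifier

# Pasting: row sampling WITHOUT replacement
pasting = BaggingClassifier(n_estimators=100, max_samples=0.7,
                            bootstrap=False, random_state=42)

# Random Subspaces: column sampling only (every model sees all rows)
subspaces = BaggingClassifier(n_estimators=100, max_samples=1.0, bootstrap=False,
                              max_features=0.6, bootstrap_features=True,
                              random_state=42)

# Random Patches: both row and column sampling
patches = BaggingClassifier(n_estimators=100, max_samples=0.7, bootstrap=True,
                            max_features=0.6, bootstrap_features=True,
                            random_state=42)
```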
OOB score – Out-of-bag score. It is statistically proven that when we draw a bootstrap sample from the original data, only about 63% of the rows are chosen (because sampling is done with replacement); the remaining ~37% of the data stays out of the bag, i.e., it remains unseen by that model. Using this unseen data for testing and measuring accuracy on it gives the OOB score, and this process is called OOB evaluation.
The OOB score is computed as the fraction of correctly predicted rows in the out-of-bag sample.
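In scikit-learn, OOB evaluation is a single flag; a sketch on a toy dataset:

```python
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=1000, random_state=42)

bag = BaggingClassifier(n_estimators=100, oob_score=True, bootstrap=True,
                        random_state=42).fit(X, y)
rf = RandomForestClassifier(n_estimators=100, oob_score=True,
                            random_state=42).fit(X, y)

# Fraction of training rows predicted correctly using only the estimators
# that did NOT see them in their bootstrap sample
print(bag.oob_score_, rf.oob_score_)
```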