- Remove features with zero variance: they add no information
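A minimal sketch of the zero-variance check with scikit-learn's `VarianceThreshold` (toy matrix assumed; the constant middle column is dropped):

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Toy matrix: the second column is constant (zero variance).
X = np.array([[1.0, 5.0, 0.1],
              [2.0, 5.0, 0.4],
              [3.0, 5.0, 0.3]])

selector = VarianceThreshold(threshold=0.0)  # threshold=0 drops only constant columns
X_reduced = selector.fit_transform(X)
kept = selector.get_support()  # boolean mask of retained columns
```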
- Remove features with extreme cardinality: constant columns (one unique value) add no information, and ID-like columns (a unique value per row) add noise
- Remove irrelevant columns: they add noise to the model
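A cardinality screen can be sketched with pandas `nunique` (toy DataFrame assumed; column names are illustrative):

```python
import pandas as pd

df = pd.DataFrame({
    "user_id": [101, 102, 103, 104],      # unique per row: ID-like, no signal
    "country": ["US", "US", "US", "US"],  # single value: constant column
    "plan": ["free", "pro", "free", "pro"],
})

n_rows = len(df)
cardinality = df.nunique()
# Drop constant columns and columns with one unique value per row.
drop_cols = cardinality[(cardinality <= 1) | (cardinality == n_rows)].index.tolist()
df_clean = df.drop(columns=drop_cols)
```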
- Inspect features with a seaborn pairplot to spot duplicate columns
- Drop highly correlated features if you are confident they only duplicate information and may bias the model
- Look at both the pairplot and the heatmap, and at Pearson's correlation coefficient
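One way to sketch the correlation-based drop with pandas (synthetic data assumed; `a_dup` is a near-copy of `a`; a heatmap of `corr` could be drawn with `seaborn.heatmap`):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
a = rng.normal(size=200)
df = pd.DataFrame({
    "a": a,
    "a_dup": a * 2 + 0.01 * rng.normal(size=200),  # nearly duplicates "a"
    "b": rng.normal(size=200),
})

corr = df.corr().abs()  # absolute Pearson correlation matrix
# Keep only the upper triangle so each pair is considered once.
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.95).any()]
df_reduced = df.drop(columns=to_drop)
```

The 0.95 cutoff is a common rule of thumb, not a fixed rule; tune it per dataset.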
- Drop features whose scaled variance falls below a threshold: very low variance often contributes more noise than signal
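A sketch of thresholding scaled variance, assuming min-max scaling so one threshold is comparable across features (toy data; the near-constant second column falls below the cutoff):

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold
from sklearn.preprocessing import MinMaxScaler

rng = np.random.default_rng(42)
X = np.column_stack([
    rng.uniform(0, 10, size=100),   # varies widely
    np.r_[np.ones(99), 0.0],        # near-constant: 99 ones and a single zero
])

# Scale to [0, 1] so the variances of differently scaled features are comparable.
X_scaled = MinMaxScaler().fit_transform(X)
selector = VarianceThreshold(threshold=0.01)  # 0.01 is an illustrative cutoff
X_kept = selector.fit_transform(X_scaled)
```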
- Drop columns with missing values beyond a threshold (commonly 30%)
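The missing-value cutoff can be sketched in pandas (toy DataFrame assumed; `income` is 80% missing and gets dropped):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "age": [25, 30, np.nan, 40, 35],
    "income": [np.nan, np.nan, np.nan, 50000, np.nan],  # 80% missing
    "city": ["NY", "LA", "SF", None, "NY"],
})

threshold = 0.30  # drop columns missing more than 30% of their values
missing_frac = df.isna().mean()  # fraction of NaNs per column
df_clean = df.drop(columns=missing_frac[missing_frac > threshold].index)
```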
- Extract a combined feature from seemingly redundant, correlated features:
- Use PCA
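A minimal PCA sketch, assuming two nearly identical measurements (e.g. left/right arm length) that collapse into one component:

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
left_arm = rng.normal(70, 5, size=200)
right_arm = left_arm + rng.normal(0, 0.5, size=200)  # nearly the same measurement
X = np.column_stack([left_arm, right_arm])

pca = PCA(n_components=2).fit(X)
ratio = pca.explained_variance_ratio_          # share of variance per component
X_1d = PCA(n_components=1).fit_transform(X)    # one component captures almost all of it
```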
- Visualize the contribution of features with t-SNE
- Run t-SNE on the numeric features and visualize them in 2D
- Use a categorical feature as the `hue` of a scatterplot of the t-SNE embedding to identify driver features
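The t-SNE step can be sketched with scikit-learn (the Iris dataset is used as a stand-in; the seaborn call is shown as a comment):

```python
from sklearn.datasets import load_iris
from sklearn.manifold import TSNE

iris = load_iris(as_frame=True)
X = iris.data          # numeric features only
species = iris.target  # categorical label, used as the hue

# perplexity must be smaller than the number of samples (150 here)
emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)

# With seaborn: sns.scatterplot(x=emb[:, 0], y=emb[:, 1], hue=species)
# Clusters that separate cleanly by hue suggest that variable drives the structure.
```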
- Discard a model's less important features by filtering out those whose coefficient falls below a threshold:
- Recursive feature elimination (RFE):
- train the model, drop the feature with the lowest coefficient
- retrain the model, drop the next lowest-coefficient feature
- continue until the desired number of features remains
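The loop above is what scikit-learn's `RFE` implements; a sketch on synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = make_classification(n_samples=300, n_features=8, n_informative=3,
                           n_redundant=0, random_state=0)

# RFE repeatedly fits the estimator and drops the lowest-|coefficient| feature.
rfe = RFE(estimator=LogisticRegression(max_iter=1000),
          n_features_to_select=3, step=1)
rfe.fit(StandardScaler().fit_transform(X), y)

selected = rfe.support_  # boolean mask of surviving features
ranking = rfe.ranking_   # 1 = selected; larger numbers were eliminated earlier
```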
- Voting from many models:
- Perform RFE on many models.
- Tally the votes each feature receives across all models
- The features that survive most often are the ones to keep
- Note: make sure the features are standardized and the models are regularized and cross-validated
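The voting scheme can be sketched by running RFE under several estimators and keeping majority-vote features (synthetic data; the three model choices are illustrative):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=10, n_informative=3,
                           n_redundant=0, random_state=1)
X = StandardScaler().fit_transform(X)  # standardize before coefficient-based RFE

models = [LogisticRegression(max_iter=1000),
          LinearSVC(max_iter=5000),
          DecisionTreeClassifier(random_state=1)]

votes = np.zeros(X.shape[1], dtype=int)
for model in models:
    rfe = RFE(estimator=model, n_features_to_select=4).fit(X, y)
    votes += rfe.support_.astype(int)  # one vote per surviving feature

keep = votes >= 2  # keep features selected by a majority of the models
```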
- Use tree-based models' feature importances to find important features
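A minimal sketch with a random forest's `feature_importances_` (synthetic data assumed):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=300, n_features=6, n_informative=2,
                           n_redundant=0, random_state=0)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
importances = forest.feature_importances_  # impurity-based, sums to 1.0
order = np.argsort(importances)[::-1]      # feature indices, most important first
```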
- Generate new features from existing features:
- e.g. an average arm length column from the left-arm and right-arm columns
- e.g. a BMI column from the weight and height columns
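Both examples above can be sketched in pandas (toy DataFrame; column names are assumptions):

```python
import pandas as pd

df = pd.DataFrame({
    "weight_kg": [70.0, 85.0, 60.0],
    "height_m": [1.75, 1.80, 1.65],
    "left_arm_cm": [72.0, 76.0, 68.0],
    "right_arm_cm": [73.0, 75.0, 68.5],
})

# BMI = weight (kg) / height (m) squared
df["bmi"] = df["weight_kg"] / df["height_m"] ** 2
# Collapse two nearly duplicate measurements into a single feature
df["arm_cm"] = (df["left_arm_cm"] + df["right_arm_cm"]) / 2
df = df.drop(columns=["left_arm_cm", "right_arm_cm"])
```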