Does the number of variables used in PCA have an impact on the amount of variance explained?


If I perform PCA on 100 variables, the first component explains 30% of the variance, while when I use only 40 of these variables, it explains 48% of the variance.

Can I say it is more relevant to work with these 40 variables because the first component explains 48% of the variance, or is this due to a "variable-size" effect (more variables, more noise...)?

Thanks!

Almost by definition, the more principal components you keep, the more of the training variance you explain. Usually, though, the point is something else, e.g., explaining the test variance. In many realistic settings, explaining more of the training variance explains more of the test variance only up to a point: initially, adding more variables helps, and then it starts to cause damage. Hence, the fact that adding the remaining variables decreased the share of training variance explained says very little about the test variance.
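As a minimal sketch of why training variance alone is a weak guide, the snippet below runs PCA (via an SVD of the centered data) on synthetic data that is pure noise. The data, the sample size, and the variable count are all assumptions made up for illustration; they are not taken from the question. The cumulative share of training variance can only grow as components are added, and it reaches 100% even though there is no structure to find.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples of 10 independent noise variables.
X = rng.normal(size=(200, 10))
Xc = X - X.mean(axis=0)  # PCA works on centered data

# The squared singular values of the centered data are proportional
# to the variance captured along each principal component.
s = np.linalg.svd(Xc, compute_uv=False)
explained = s**2 / np.sum(s**2)
cumulative = np.cumsum(explained)

for k, share in enumerate(cumulative, start=1):
    print(f"{k} components: {share:.1%} of training variance")
```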

To find the number of variables that optimizes the test variance, you can use a number of techniques, e.g., estimating it through cross-validation.
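One way this could look in practice is sketched below with plain NumPy. Everything here is an assumption for illustration: the synthetic data (40 variables sharing one latent factor plus 60 pure-noise variables), the 5-fold split, and the helper names `first_pc_test_share` and `cv_first_pc_share`, which are hypothetical and not from any library. The score is the share of held-out variance captured by the first component fitted on the training folds.

```python
import numpy as np

def first_pc_test_share(X_train, X_test):
    """Share of held-out variance captured by the first training component."""
    mean = X_train.mean(axis=0)
    # First right singular vector of the centered training data
    # is the first principal direction (unit length).
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    w = vt[0]
    Xt = X_test - mean  # center test data with the training mean
    return np.sum((Xt @ w) ** 2) / np.sum(Xt ** 2)

def cv_first_pc_share(X, k=5, seed=0):
    """Average the held-out share over k cross-validation folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X)), k)
    scores = []
    for i in range(k):
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        scores.append(first_pc_test_share(X[train], X[folds[i]]))
    return float(np.mean(scores))

# Hypothetical data: 40 variables driven by one shared factor,
# followed by 60 variables of pure noise.
rng = np.random.default_rng(1)
n = 500
z = rng.normal(size=(n, 1))
signal = z + rng.normal(size=(n, 40))
noise = rng.normal(size=(n, 60))
X = np.hstack([signal, noise])

share_all = cv_first_pc_share(X)          # all 100 variables
share_40 = cv_first_pc_share(X[:, :40])   # only the 40 signal variables

print(f"100 variables: {share_all:.1%} of held-out variance")
print(f" 40 variables: {share_40:.1%} of held-out variance")
```

The same loop can be repeated over several candidate variable subsets, keeping the one with the best held-out score. Note that a higher share with fewer variables partly reflects a smaller total variance in the denominator, so the comparison is most meaningful when the subsets are judged on the same downstream goal.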

