anova - Centering Variables in R
Do centered variables have to stay in matrix form when using them in a regression equation?
I have centered a few variables using the scale function with center=T, scale=F. I then converted those variables to numeric so that I can manipulate the data frame for other purposes. However, when I run an ANOVA, I get different F values for that variable; everything else is the same.
Edit:
What's the difference between these two:
scale(df$a, center=TRUE, scale=FALSE)
which will embed a matrix within the data.frame
and
scale(df$a, center=TRUE, scale=FALSE)
df$a = as.numeric(df$a)
which makes the variable numeric, and removes the matrix notation within the variable?
Here is an example of what I am trying to do, although this example doesn't cause the problem I am having:
    library(car)
    library(MASS)
    mtcars$wt_c <- scale(mtcars$wt, center=TRUE, scale=FALSE)
    mtcars$gear <- as.factor(mtcars$gear)
    mtcars1 <- as.data.frame(mtcars)
    # Part 1
    rlm.mpg <- rlm(mpg~wt_c+gear+wt_c*gear, data=mtcars1)
    anova.mpg <- Anova(rlm.mpg, type="III")
    # Part 2
    # Make wt_c numeric
    mtcars1$wt_c <- as.numeric(mtcars1$wt_c)
    rlm.mpg2 <- rlm(mpg~wt_c+gear+wt_c*gear, mtcars1)
    anova.mpg2 <- Anova(rlm.mpg2, type="III")
I'll attempt to answer both of your questions.
Do centered variables have to stay in matrix form when using them in a regression equation?
I'm not sure what you mean by this, but you can strip the center and scale attributes that scale() adds, if that is what you are referring to. You can see in the example below that you get the same answer whether the variable is in 'matrix form' or not.
What's the difference between these two:
scale(a, center=TRUE, scale=FALSE)
which will embed a matrix within the data.frame
and
scale(df$a, center=TRUE, scale=FALSE)
df$a = as.numeric(df$a)
From the help file for scale() we see that it returns,
"For scale.default, the centered, scaled matrix."
You are getting a matrix back, with attributes recording the centering and scaling that were applied. as.numeric(aa) strips off those attributes, which is the difference between your first and second method. c(aa) does the same thing. I guess as.numeric() either calls c() (through as.double()) or uses the same method it does.
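To make the attribute stripping concrete, here is a minimal sketch on a toy vector. The vector `x` and the names `centered`, `v1`, and `v2` are made up for illustration and do not come from the question.

```r
# Toy vector; the values are arbitrary and only for illustration.
x <- c(2, 4, 6, 8)

# scale() returns a one-column matrix, even for a plain vector.
centered <- scale(x, center = TRUE, scale = FALSE)

is.matrix(centered)    # the result carries a dim attribute
attributes(centered)   # dim plus "scaled:center"

# Both calls drop the attributes and return a plain numeric vector.
v1 <- as.numeric(centered)
v2 <- c(centered)

is.null(attributes(v1))
identical(v1, v2)

# Centering by hand gives the same values without creating a matrix.
all.equal(v1, x - mean(x))
```

If the matrix form is unwanted from the start, subtracting the mean directly, as in the last line, avoids the conversion step altogether.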
    set.seed(1234)
    test <- data.frame(matrix(runif(10*5),10,5))
    head(test)
             X1        X2         X3        X4        X5
    1 0.1137034 0.6935913 0.31661245 0.4560915 0.5533336
    2 0.6222994 0.5449748 0.30269337 0.2651867 0.6464061
    3 0.6092747 0.2827336 0.15904600 0.3046722 0.3118243
    4 0.6233794 0.9234335 0.03999592 0.5073069 0.6218192
    5 0.8609154 0.2923158 0.21879954 0.1810962 0.3297702
    6 0.6403106 0.8372956 0.81059855 0.7596706 0.5019975
    # center and scale
    testvar <- scale(test[,1])
    testvar
                 [,1]
     [1,] -1.36612292
     [2,]  0.48410899
     [3,]  0.43672627
     [4,]  0.48803808
     [5,]  1.35217501
     [6,]  0.54963231
     [7,] -1.74522210
     [8,] -0.93376661
     [9,]  0.64339300
    [10,]  0.09103797
    attr(,"scaled:center")
    [1] 0.4892264
    attr(,"scaled:scale")
    [1] 0.2748823
    # put testvar back with its friends
    bindvar <- cbind(testvar,test[,2:5])
    # run a regression with the 'matrix form' y var
    testlm1 <- lm(testvar~.,data=bindvar)
    # strip the non-name attributes
    testvar <- as.numeric(testvar)
    # rebind and regress
    bindvar <- cbind(testvar,test[,2:5])
    testlm2 <- lm(testvar~.,data=bindvar)
    # check equality
    all.equal(testlm1, testlm2)
    [1] TRUE
lm() seems to return the same thing in both cases, so it appears that the two forms are equivalent.
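The example above uses the scaled variable as the response, whereas the question uses it as a predictor with an interaction. A rough sketch of that case follows, assuming plain lm() on a copy of mtcars; the names `mt`, `fit_matrix`, and `fit_numeric` are hypothetical. It deliberately does not use rlm() or Anova(), so it only checks the fitted coefficients and does not by itself explain differing F values.

```r
# Work on a copy so the built-in dataset is left untouched.
mt <- mtcars
mt$gear <- as.factor(mt$gear)

# Predictor stored as a one-column matrix inside the data frame.
mt$wt_c <- scale(mt$wt, center = TRUE, scale = FALSE)
fit_matrix <- lm(mpg ~ wt_c * gear, data = mt)

# Same predictor stored as a plain numeric vector.
mt$wt_c <- as.numeric(mt$wt_c)
fit_numeric <- lm(mpg ~ wt_c * gear, data = mt)

# Compare the estimated coefficients, ignoring their names.
all.equal(unname(coef(fit_matrix)), unname(coef(fit_numeric)))
```

If this comparison holds but the ANOVA tables from the real data still differ, the difference is more likely to come from something other than the matrix versus numeric storage of the centered variable.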