One of the disappointing problems in SAS (which I need because some of my analyses require PROC MIXED) is recoding categorical variables to have a particular reference category. In R, my usual tool, this is easy both to set and to modify using the `relevel` function available in base R (in the stats package). My understanding is that this is actually easy in SAS for GLM, PHREG and some others, but not in PROC MIXED. (Once again I face my pet peeve about the inconsistencies within a leading commercial product and market “leader” like SAS.) The easiest way to deal with this, I believe, is to create the dummy variables by hand using IF-THEN/ELSE statements and use them in the model in place of the categorical variables themselves. If most of the covariates are not categorical, this isn’t too burdensome.
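
As a rough sketch of the hand-coded approach, something like the following should work. All names here are made up for illustration: a data set `trial` with a three-level character variable `treatment` ("placebo", "low", "high"), a subject identifier `id`, an outcome `y` and a continuous covariate `time`. The reference category is whichever level gets no indicator of its own, here "placebo".

```sas
/* Hypothetical data set and variable names throughout. */
data trial_dummies;
  set trial;
  /* Keep missing treatment values missing rather than coding them as 0. */
  if missing(treatment) then do;
    trt_low  = .;
    trt_high = .;
  end;
  else do;
    /* "placebo" is the reference: both indicators are 0 for it. */
    if treatment = "low" then trt_low = 1;
    else trt_low = 0;
    if treatment = "high" then trt_high = 1;
    else trt_high = 0;
  end;
run;

proc mixed data=trial_dummies;
  class id;                                   /* treatment is no longer a CLASS variable */
  model y = trt_low trt_high time / solution; /* indicators enter as ordinary covariates */
  random intercept / subject=id;
run;
```

To change the reference category, build the indicators for a different pair of levels (for example `trt_placebo` and `trt_high` to make "low" the reference) and swap them into the MODEL statement.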

I’m sure some SAS guru will comment on the elegant or “right” solution to this problem.


We cover this here: http://sas-and-r.blogspot.com/2010/09/example-86-changing-reference-category.html

My understanding is that different procedures are the responsibilities of different groups at SAS, not unlike the way that some very important methods in R are developed by different groups. In any sufficiently large enterprise, it’s probably impossible to ensure a uniform approach.

Ken,

First of all, kudos for your and Nick’s wonderful blog.

That is quite a valid point, especially in a huge enterprise like SAS. Even R has its idiosyncrasies, as you well know. However, manipulating categorical variables is a pretty fundamental data management task. PROC MIXED itself is good and popular, even among the non-SAS-philes, so why such a fundamental data manipulation would be ignored in a very popular PROC befuddles me.

Thanks for those kind words. I completely agree with you. There are a few procs which have a sensible (and fairly broad) set of options for parameterizing categorical variables, and this ought to be adopted by all procs, IMO. OTOH, it wasn’t too long ago that there was no class statement for logistic regression: all categorical variables had to be recoded by hand. So progress may be slow, especially when the code is not written by volunteers, but it does come eventually.