Forest plots are most commonly used in reporting meta-analyses, but they can also be used profitably to summarise the results of a fitted model. They essentially display the estimates for model parameters and their corresponding confidence intervals.

Matt Shotwell just posted a message to the R-help mailing list with his *lattice*-based solution to the problem of creating forest plots in R. I just figured out how to create a forest plot for a consulting report using ggplot2. The availability of the `geom_pointrange` layer makes this process very easy!

    library(ggplot2)

    credplot.gg <- function(d) {
      # d is a data frame with 4 columns
      # d$x   gives variable names
      # d$y   gives center point
      # d$ylo gives lower limits
      # d$yhi gives upper limits
      p <- ggplot(d, aes(x = x, y = y, ymin = ylo, ymax = yhi)) +
        geom_pointrange() +
        # dashed reference line at zero (vertical after coord_flip)
        geom_hline(yintercept = 0, linetype = 2) +
        coord_flip() +
        xlab('Variable')
      return(p)
    }

If we start with some dummy data, like

    d <- data.frame(x = toupper(letters[1:10]), y = rnorm(10, 0, 0.1))
    d <- transform(d, ylo = y - 1/10, yhi = y + 1/10)
    credplot.gg(d)
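Since the point of the plot is to summarise a fitted model, here is a rough sketch of how the same four-column data frame might be built from a model object rather than dummy data. The `mtcars` data set, the model formula, and the name `d_fit` are illustrative assumptions, not part of the original example; it uses only base R's `coef()` and `confint()`.

```r
# Illustrative model only: mtcars and this formula are assumptions
fit <- lm(mpg ~ wt + hp + qsec, data = mtcars)

est <- coef(fit)      # point estimates, named by term
ci  <- confint(fit)   # two-column matrix of 95% limits (the default level)

# Arrange the results in the x / y / ylo / yhi layout that credplot.gg() expects
d_fit <- data.frame(
  x   = names(est),
  y   = unname(est),
  ylo = unname(ci[, 1]),
  yhi = unname(ci[, 2])
)

# Drop the intercept, which is usually on a different scale from the slopes
d_fit <- d_fit[d_fit$x != "(Intercept)", ]

# credplot.gg(d_fit)  # uncomment once credplot.gg() is defined
```

Coefficients measured on very different scales will still be hard to compare on one axis, so standardising the predictors first may be worth considering.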

Abhijit,

Thanks for the mention and nice post. I think the ggplot2 solution to this is really simpler than anything we can do with lattice. Unfortunately, I’m still procrastinating about learning ggplot2 well enough for everyday use. Examples like this make it easier, I think.

-Matt

Matt,

I learned a lot from our useR meetup talk this month by Harlan Harris (available at the DC useR meetup site, and also on Harlan’s blog). I learned lattice first and was partial to it for a long time, mainly due to its quickness compared to ggplot. However, once I realized the “logic” behind creating plots in ggplot, it seemed very easy and flexible. The speed thing still irks me, but for smallish data sets it’s my preferred graphics platform in R now.

Abhijit

Abhijit,

Thanks for the excellent post. This can be especially useful in genetics. Would you care if I slightly modified the code and reposted to my own blog, citing this post of course?

Stephen.

Stephen,

Feel free to modify and post this code. It’s pretty rough, and not production quality. It’s really proof-of-principle code, and very easy code at that.

Abhijit

Looks great, but which is easier for someone who’s not versed in either package?

Well, if you use either Matt’s function or mine (mine would need some more polish, and the Getting Genetics Done blog might have a more polished version soon), it is equally easy right now, since both are presented as generic R functions, and the inner workings would be hidden to the user :)

Both graphics engines have a learning curve. `lattice` is closer in syntactical spirit to base graphics and a lot of R’s formula and conditioning syntax in general, so it might have a more familiar feel for a newcomer. `ggplot2` has syntax that is not quite so straightforward (and abuses some conventions), but it has made some sounder graphical choices and makes it easier to build customizable graphs once you understand the logic of building layers to make up a graph (in other words, the grammar). I used `lattice` for many years and resisted moving to `ggplot2`, but now I tend to use `ggplot2` more for my quick graphing needs. If something doesn’t work in `ggplot2`, I will go back to `lattice` or base graphics.

Deepayan Sarkar’s book on `lattice` is a pretty good starting point. Hadley Wickham’s book and website on `ggplot2` are good references, but I’d look for online tutorials (like the DC, Bay Area and New York R meetup sites, or the Learn R blog) to get an initial start.

Hi Abiji,

I’m wondering if I can add a table with the betas at the side of this graph. Is there any link you can provide that would help with adding more detail to a forest plot for publication?

thanks,

SD

Hi everyone,

The forest plot is very nice. However, I have a question: when I tried it, it displayed at most 24 variable names on the forest plot. Would you please tell me how to display more than 24 variable names on a forest plot?

Thanks a lot!

Liming