This post shows how to produce a plot involving three categorical variables and one continuous variable using ggplot2 in R.
The code below is also available as a gist on GitHub.
1. Create Data
First, let's load ggplot2 and create some data to work with:
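library(ggplot2)

# Fix the seed so the random noise is reproducible
set.seed(4444)

# All combinations of the three categorical variables (5 x 3 x 5 = 75 rows)
Data <- expand.grid(
  group = c("Apples", "Bananas", "Carrots", "Durians", "Eggplants"),
  year = c("2000", "2001", "2002"),
  quality = c("Grade A", "Grade B", "Grade C", "Grade D", "Grade E"))

# Effect of each group and each quality grade on the underlying score
Group.Weight <- data.frame(
  group = c("Apples", "Bananas", "Carrots", "Durians", "Eggplants"),
  group.weight = c(1, 1, -1, 0.5, 0))
Quality.Weight <- data.frame(
  quality = c("Grade A", "Grade B", "Grade C", "Grade D", "Grade E"),
  quality.weight = c(1, 0.5, 0, -0.5, -1))

# Attach the weights to every row
Data <- merge(Data, Group.Weight)
Data <- merge(Data, Quality.Weight)

# Score = group effect + quality effect + Gaussian noise (sd = 0.2)
Data$score <- Data$group.weight + Data$quality.weight +
  rnorm(nrow(Data), 0, 0.2)

# The inverse logit maps the score to a proportion between 0 and 1
Data$proportion.tasty <- exp(Data$score) / (1 + exp(Data$score))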
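The last line of the data step is an inverse-logit transform, which is what keeps proportion.tasty between 0 and 1. As a quick sanity check, here is a small sketch comparing the hand-written formula with base R's plogis(). The score values are made up for illustration and are not taken from the data above.

```r
# Hypothetical scores, chosen only to illustrate the transform
score <- c(-2, -0.5, 0, 0.5, 2)

# Hand-written inverse logit, in the same form as the data step
manual <- exp(score) / (1 + exp(score))

# Base R's logistic distribution function computes the same quantity
builtin <- plogis(score)

# TRUE if the two agree up to floating-point tolerance
print(all.equal(manual, builtin))

# Every transformed value lies strictly between 0 and 1
print(all(manual > 0 & manual < 1))
```

Using plogis() in the data step would be an equivalent, slightly shorter way to write the same thing.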
2. Produce Plot
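And here's the code to produce the plot. Year goes on the x-axis, proportion.tasty on the y-axis, group is mapped to both colour and point shape, and each quality grade gets its own facet: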
ggplot(data = Data,
       aes(x = factor(year), y = proportion.tasty,
           group = group,
           shape = group,
           color = group)) +
  geom_line() +
  geom_point() +
  # ggtitle() replaces opts(title = ...), which is defunct in current ggplot2
  ggtitle("Proportion Tasty by Year, Quality, and Group") +
  scale_x_discrete("Year") +
  scale_y_continuous("Proportion Tasty") +
  # One panel per quality grade, laid out in a single row
  facet_grid(. ~ quality)
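Which of the three categorical variables goes on the x-axis, which is mapped to colour and shape, and which becomes the facet is a design choice. As a rough sketch of one alternative, the example below facets by group and colours by quality instead. The toy data frame and its evenly spaced values are invented for illustration, so that the snippet runs on its own.

```r
library(ggplot2)

# Hypothetical toy data: 2 groups x 3 years x 2 grades = 12 rows
toy <- expand.grid(
  group = c("Apples", "Bananas"),
  year = c("2000", "2001", "2002"),
  quality = c("Grade A", "Grade B"))

# Made-up proportions, evenly spaced; not real measurements
toy$proportion.tasty <- seq(0.2, 0.8, length.out = nrow(toy))

# Same recipe as above, with the roles of group and quality swapped
p <- ggplot(toy,
            aes(x = year, y = proportion.tasty,
                group = quality, shape = quality, color = quality)) +
  geom_line() +
  geom_point() +
  labs(title = "Proportion Tasty by Year, Group, and Quality",
       x = "Year", y = "Proportion Tasty") +
  facet_grid(. ~ group)

print(p)
```

Faceting by the variable with the most levels tends to keep the number of lines per panel manageable, but the best arrangement depends on which comparison matters most.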
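The result is a single row of five panels, one per quality grade. Within each panel, year runs along the x-axis and each group gets its own line, colour, and point shape.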