
Stacked Area Makeover


The first in a series to improve upon the data visualization techniques at Quantcast.

Various expert articles detail the reasons why a stacked area chart is bad, bad, bad form. And just this week, the masters of our field are discussing similar problems with stream graphs, a type of stacked area chart.


So why does Quantcast use the stacked area chart for Site Traffic data?

Let's Make it Over

Below is an example of the area chart to be improved.

First, Generate Some Data

  
### gen_data function: 60 days of random daily click counts per channel
gen_data<-function(m_apps,m_web,online){
  date<-seq(as.Date("2014-04-01"), as.Date("2014-05-30"), by="days")
  n<-length(date)  # 60 days
  # rnorm() takes the number of draws first; sd is a quarter of each mean
  clicks_m_apps<-as.integer(abs(rnorm(n,mean=m_apps,sd=m_apps/4)))
  clicks_m_web<-as.integer(abs(rnorm(n,mean=m_web,sd=m_web/4)))
  clicks_online<-as.integer(abs(rnorm(n,mean=online,sd=online/4)))
  df<-data.frame(date,clicks_m_apps,clicks_m_web,clicks_online)
  return(df)
}

### generate random variables (means for mobile apps, mobile web, online)
df<-gen_data(200,250,450)

### write to a CSV
write.csv(df,file="data.csv",row.names=FALSE)

From our generated data, this is what the stacked area chart looks like. This was sexy in Excel (circa 1995):

Transparency is Nice

As Dan Murray recently pointed out, the use of transparency is nice, and all comparative plots should start from zero.

Twitter / DGM885: I like the use of transparency ...

With only two measures, this transparency effect is actually quite easy to achieve in Tableau. It is done with a dual-axis, and plots for both measures do begin from zero.
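For readers without Tableau, the same two-measure idea can be roughed out in base R. This is only a sketch under my own assumptions: the seed, the colors, the 0.4 alpha, and the helper name draw_area are all made up for illustration, and the data is regenerated inline so the snippet runs by itself.

```r
# Sketch only: two semi-transparent areas, both drawn from a zero baseline.
set.seed(1)  # assumption: fixed seed so the picture is repeatable
date   <- seq(as.Date("2014-04-01"), as.Date("2014-05-30"), by = "days")
n      <- length(date)
m_web  <- as.integer(abs(rnorm(n, mean = 250, sd = 250 / 4)))
online <- as.integer(abs(rnorm(n, mean = 450, sd = 450 / 4)))

# Hypothetical helper: close the series down to y = 0 and fill it.
draw_area <- function(x, y, col) {
  x <- as.numeric(x)
  polygon(c(x[1], x, x[length(x)]), c(0, y, 0), col = col, border = NA)
}

# Illustrative colors; alpha.f = 0.4 lets the overlap show through.
fill_online <- adjustcolor("steelblue",  alpha.f = 0.4)
fill_m_web  <- adjustcolor("darkorange", alpha.f = 0.4)

# Empty frame first, with the y-axis anchored at zero.
plot(date, online, type = "n", ylim = c(0, max(online, m_web)),
     xlab = "Date", ylab = "Clicks")
draw_area(date, online, fill_online)
draw_area(date, m_web,  fill_m_web)
legend("topleft", legend = c("online", "mobile web"),
       fill = c(fill_online, fill_m_web), bty = "n")
```

Because neither area sits on top of the other, each can be read directly against the axis. A third or fourth layer would quickly turn the overlap into mud.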

Beyond 2 Measures: Forget Area Fill

However, from the third measure onward, Tableau begins to stack the areas, which is exactly what we want to avoid. Nor is it possible to synchronize the axes in Tableau beyond two measures. In other tools, most charts simply become too busy beyond two layers.

The Case for Something New

Back in 2012, Andy Kriebel suggested augmenting line graphs with a calculated total, and he was on to something.

VizWiz

Plotting a calculated total together with a line for each category does present both the parts and the whole. Yet the chart can be improved even further. One important lesson from Jeff Leek is: when you have data measured over space, distance, or time, you should smooth.

10 things statistics taught us about big data analysis

A Solution

So time series lines need a smoothing agent, for example a moving average or a trend-line. Building on Andy Kriebel's approach, I would also plot the totals as a muted grey background.

This improved chart presents everything its predecessors did, and more:

  1. Component lines each base from zero
  2. Sum of the parts is preserved
  3. Trend-lines smooth, and highlight subtle divergences
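The three points above can be roughed out in base R. Treat this as a sketch rather than the exact recipe behind the chart: the 7-day centered window, the fixed seed, the colors, and the helper name moving_avg are my assumptions, and the data is rebuilt inline (same means as gen_data(200,250,450)) so the snippet runs by itself.

```r
# Sketch only: grey totals in the background, raw lines, and smoothed lines.
set.seed(42)  # assumption: fixed seed so the picture is repeatable
date <- seq(as.Date("2014-04-01"), as.Date("2014-05-30"), by = "days")
n    <- length(date)
df   <- data.frame(
  date,
  clicks_m_apps = as.integer(abs(rnorm(n, mean = 200, sd = 200 / 4))),
  clicks_m_web  = as.integer(abs(rnorm(n, mean = 250, sd = 250 / 4))),
  clicks_online = as.integer(abs(rnorm(n, mean = 450, sd = 450 / 4)))
)
measures <- c("clicks_m_apps", "clicks_m_web", "clicks_online")

# Sum of the parts, kept as its own column.
df$total <- rowSums(df[, measures])

# Hypothetical helper: centered k-day moving average (NA at both ends).
moving_avg <- function(x, k = 7) {
  as.numeric(stats::filter(x, rep(1 / k, k), sides = 2))
}
smoothed <- as.data.frame(lapply(df[measures], moving_avg))

# Totals as muted grey bars; every series shares the zero baseline.
cols <- c("steelblue", "darkorange", "forestgreen")
plot(df$date, df$total, type = "h", col = "grey85", lwd = 4, lend = 1,
     ylim = c(0, max(df$total)), xlab = "Date", ylab = "Clicks")
for (i in seq_along(measures)) {
  # Faint raw line, then the bold smoothed line on top.
  lines(df$date, df[[measures[i]]], col = adjustcolor(cols[i], alpha.f = 0.4))
  lines(df$date, smoothed[[measures[i]]], col = cols[i], lwd = 2)
}
legend("topleft", legend = c("total", measures),
       col = c("grey85", cols), lwd = 2, bty = "n")
```

A moving average is the simplest choice here. A loess curve or a straight trend-line would fit the same slot, and the window length is worth tuning to how noisy the daily counts are.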

New Insights

This new chart gives access to previously hidden insights:

  1. Mobile web traffic briefly overtook online on four occasions
    • imperceptible from a stacked area graph
  2. Mobile web shows a slight trending gain over mobile apps
    • imperceptible without the smoothing lines
  3. Online traffic has a noticeable upward trend
    • imperceptible without the smoothing lines
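An observation like the first one can also be checked in code rather than by eye. Here is a small sketch that lists every run of consecutive days where one series sits above another. The helper name runs_above and the eight-day toy series are made up for illustration; they are not the generated data from this post.

```r
# Hypothetical helper: runs of consecutive days where `a` exceeds `b`.
runs_above <- function(date, a, b) {
  above <- a > b
  above[is.na(above)] <- FALSE   # treat missing comparisons as "not above"
  r      <- rle(above)
  ends   <- cumsum(r$lengths)
  starts <- ends - r$lengths + 1
  keep   <- r$values
  data.frame(start = date[starts[keep]],
             end   = date[ends[keep]],
             days  = r$lengths[keep])
}

# Tiny made-up series so the result is easy to verify by hand.
date   <- seq(as.Date("2014-04-01"), by = "days", length.out = 8)
m_web  <- c(250, 300, 260, 240, 410, 420, 255, 430)
online <- c(450, 280, 440, 460, 400, 415, 450, 425)

# One row per occasion on which mobile web overtook online.
occasions <- runs_above(date, m_web, online)
print(occasions)
nrow(occasions)  # three occasions in this toy data, lasting 1, 2 and 1 days
```

Pointing the same helper at the real columns (for example df$clicks_m_web and df$clicks_online) would count the occasions in the generated data. That number changes with every random draw unless a seed is set.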

Let me know your thoughts

Any and all thoughts or further improvements are welcome!


References

  1. "I hate stacked area charts", Dr. Drang, November 22, 2011:
    http://www.leancrew.com/all-this/2011/11/i-hate-stacked-area-charts/
  2. "Quantitative Displays for Combining Time-Series and Part-to-Whole Relationships", Stephen Few, Perceptual Edge Visual Business Intelligence Newsletter January, February, and March 2011:
    http://www.perceptualedge.com/articles/visualbusinessintelligence/displaysforcombiningtime-seriesand_part-to-whole.pdf
  3. "Stacked area chart vs. Line chart – The great debate", Andy Kriebel, October 12, 2012:
    http://vizwiz.blogspot.com/2012/10/stacked-area-chart-vs-line-chart-great.html
  4. Tweet, Moritz Stefaner, May 21, 2014:
    https://twitter.com/moritz_stefaner/statuses/469141421623377920
  5. "Reading Our Reports", Quantcast.com:
    https://www.quantcast.com/help/how-to-read-our-reports/
  6. Tweet, Daniel Murray, May 13, 2014:
    https://twitter.com/DGM885/status/466184897607266304
  7. "Stacked area chart vs. Line chart – The great debate", Andy Kriebel, October 12, 2012:
    http://vizwiz.blogspot.com/2012/10/stacked-area-chart-vs-line-chart-great.html
  8. "10 things statistics taught us about big data analysis", Jeff Leek, May 22, 2014:
    http://simplystatistics.org/2014/05/22/10-things-statistics-taught-us-about-big-data-analysis/