Cohort analysis is a powerful tool for understanding seasonality, the customer lifecycle, and the long-term health of your business. This tutorial explains what cohorts and cohort analyses are and what you can do with them.
What are Cohorts and Cohort Analyses?
A cohort is a group of users sharing a particular characteristic. Strictly speaking it can be any characteristic, but typically the term cohort refers to a time-dependent grouping. For example, a typical cohort groups users by the week or month when they were first acquired. When speaking of groupings that are not time-dependent, the term segment is typically used instead of cohort.
A cohort analysis refers to tracking and investigating the performance of cohorts over time.
For example, if you want to see whether the users you’re acquiring now are more or less valuable than the users you acquired in the past, you can define cohorts by the month in which users were first acquired. You can then run a cohort analysis to compare year-over-year revenue performance.
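As a rough illustration of defining cohorts by acquisition month, the sketch below assigns each user to the month of their first order and sums the revenue each cohort generated in its own acquisition month. The order records, the function name, and the assumption that a user's first order marks their acquisition are all hypothetical.

```python
from collections import defaultdict
from datetime import date


def month_key(day):
    """Label a date with its year and month, e.g. '2017-01'."""
    return day.strftime("%Y-%m")


def acquisition_month_revenue(orders):
    """Sum each cohort's revenue within its own acquisition month.

    orders: iterable of (user_id, order_date, amount).
    Assumption: a user's earliest order date is their acquisition date.
    """
    first_order = {}
    for user, day, _ in orders:
        if user not in first_order or day < first_order[user]:
            first_order[user] = day

    revenue = defaultdict(float)
    for user, day, amount in orders:
        cohort = month_key(first_order[user])
        if month_key(day) == cohort:  # keep only acquisition-month orders
            revenue[cohort] += amount
    return dict(revenue)


# Hypothetical orders: u1 and u2 were acquired in January 2017, u3 in January 2018.
orders = [
    ("u1", date(2017, 1, 5), 40.0),
    ("u1", date(2017, 2, 3), 10.0),
    ("u2", date(2017, 1, 20), 25.0),
    ("u3", date(2018, 1, 9), 30.0),
    ("u1", date(2018, 1, 15), 20.0),  # returning user, not part of the 2018-01 cohort
]
revenue_by_cohort = acquisition_month_revenue(orders)
print(revenue_by_cohort)
```

Note that u1's January 2018 order counts toward total revenue for that month but not toward the January 2018 cohort, which is exactly the distinction a cohort analysis draws.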
In this type of cohort analysis, you can see how much revenue the users acquired in January 2018 generated in their acquisition month (January 2018) relative to what the January 2017 cohort generated in its acquisition month (January 2017). This is different from just looking at total revenue in each month. To illustrate, consider this hypothetical revenue chart:
The yellow line shows the total revenue generated each month. It shows a great trend, with revenue growing month over month. The blue bars, however, represent the revenue from the cohort of users acquired in each month (that month’s new users), and those bars don’t tell a positive story. In January 2017 we made $1,000 in total, of which $800 came from new users. But in January 2018, even though we made more money in total ($1,650), only $300 came from the cohort of users acquired that month. The revenue growth from older cohorts is masking the fact that the value of newer cohorts has been decreasing since the summer (either because we are acquiring fewer new users or because we are making less per new user).
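To make the comparison concrete, here is a small sketch that computes the share of each month's revenue coming from that month's new cohort, using only the two data points quoted above. The function name is an illustrative assumption.

```python
def new_cohort_share(total, new_cohort):
    """Fraction of a month's total revenue that came from users acquired that month."""
    return new_cohort / total


# Figures from the hypothetical chart: total revenue and new-cohort revenue.
months = {
    "January 2017": (1000, 800),
    "January 2018": (1650, 300),
}

for label, (total, new_cohort) in months.items():
    share = new_cohort_share(total, new_cohort)
    older = total - new_cohort  # revenue from previously acquired cohorts
    print(f"{label}: {share:.0%} from the new cohort, ${older} from older cohorts")
```

Total revenue grew between the two months, yet the new cohort's share fell from 80% to roughly 18%, and revenue from older cohorts rose from $200 to $1,350.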
What Can You Do with a Cohort Analysis?
The previous example highlighted one use case for a cohort analysis, but there are many others. Cohort analyses are among the most insightful analyses you can run. They can answer questions like:
Are the new cohorts you’re acquiring more (or less) valuable than previous cohorts?
Have changes you’ve made to your site impacted users who are new to your site?
Are there seasonal differences between users you acquire? Perhaps users acquired during big retail moments like Black Friday behave differently than those acquired at other times.
What is your users’ retention rate?
What is the long-term value of your users?
When do users start to churn?
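As one example of how such a question turns into a calculation, the sketch below computes a cohort's retention rate, taken here to mean the share of the cohort's users who were active in each period since acquisition. That definition, the function name, and the user sets are assumptions for illustration; your own definition of "active" may differ.

```python
def retention(cohort_users, active_by_period):
    """Share of a cohort's users who were active in each period.

    cohort_users: set of user ids in the cohort.
    active_by_period: list of sets, one per period since acquisition,
        each holding the ids of users active in that period.
    """
    size = len(cohort_users)
    return [len(cohort_users & active) / size for active in active_by_period]


# Hypothetical cohort of four users tracked over three periods.
cohort = {"a", "b", "c", "d"}
activity = [{"a", "b", "c", "d"}, {"a", "b"}, {"a"}]

rates = retention(cohort, activity)
churned = [1 - rate for rate in rates]  # share of the cohort no longer active
print(rates, churned)
```

Reading the rates period by period shows when users start to drop off; in this toy data, half of the cohort is gone by the second period.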
Making a habit of checking cohort performance across a variety of metrics helps you:
Identify issues sooner: you may see disturbing trends in newer cohorts that would otherwise be masked by the rest of the user base until they affect enough cohorts for you to notice in top-line metrics
Build more accurate forecasts: knowing seasonal behaviors, you can incorporate those into your expectations of future performance from that set of users
How to Read a Cohort Chart
Cohort charts can be intimidating to look at for the first time, but they are very useful visualizations packed with a lot of information.
Here’s an example of a cohort chart plotting weekly revenue per user. We’re defining cohorts as users acquired in a given week. Assume we’re looking at the data in the last week of the year, around Dec 27.
The cohorts run along the vertical axis, with the oldest cohorts on the top and the newest ones at the bottom. In this example we have weekly cohorts with the oldest being the week of Nov 19.
Across the horizontal axis are the time periods since the start of the cohort. In this example, they range from week 0 (the week of acquisition) through week 4 (four weeks after the week of acquisition).
The cells in the middle have the corresponding values for the metric you’re plotting - in this case the weekly revenue per user. From the chart we can see that on average, users acquired the week of Nov 19 spent $3.70 in their week of acquisition (week 0). The following week, the same set of users (those acquired the week of Nov 19) on average spent $1.09, the week after that (week 2) they spent $0.73, and so on.
The oldest cohorts have been with us the longest, so they have more data. Users in the Nov 19 cohort have had 4 weeks since they were acquired, but users from last week’s cohort haven’t had any full weeks since they were acquired. This results in the typical triangle shape of cohort charts, which is why they are sometimes referred to as triangle charts.
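The structure of such a chart can be sketched in a few lines. The example below builds a small revenue-per-user table from hypothetical data, keeping only the weeks that have fully elapsed as of a reporting date, which is what produces the triangle. The data, the dates (including the year), and the function name are assumptions and do not reproduce the figures in the chart above.

```python
from collections import defaultdict
from datetime import date


def cohort_table(cohort_of, purchases, as_of):
    """Return {cohort_start: [revenue per user for week 0, week 1, ...]}.

    cohort_of: {user_id: start date of the user's acquisition week}.
    purchases: iterable of (user_id, purchase_date, amount).
    as_of: reporting date; only fully elapsed weeks are included.
    """
    sizes = defaultdict(int)
    for start in cohort_of.values():
        sizes[start] += 1

    revenue = defaultdict(float)  # (cohort_start, weeks since start) -> revenue
    for user, day, amount in purchases:
        start = cohort_of[user]
        revenue[(start, (day - start).days // 7)] += amount

    table = {}
    for start in sorted(sizes):  # oldest cohort first, as in the chart
        complete_weeks = (as_of - start).days // 7
        table[start] = [
            round(revenue[(start, week)] / sizes[start], 2)
            for week in range(complete_weeks)
        ]
    return table


# Hypothetical data: two users acquired the week of Nov 19, one the week of Nov 26.
cohort_of = {"a": date(2017, 11, 19), "b": date(2017, 11, 19), "c": date(2017, 11, 26)}
purchases = [
    ("a", date(2017, 11, 20), 5.0),
    ("b", date(2017, 11, 22), 3.0),
    ("a", date(2017, 11, 28), 2.0),
    ("c", date(2017, 11, 27), 4.0),
]
table = cohort_table(cohort_of, purchases, as_of=date(2017, 12, 10))
for start, row in table.items():
    print(start.strftime("%b %d"), row)
```

The older cohort gets three columns and the newer one only two, so each later row is one cell shorter than the row above it.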
To better visualize the trends in the data, many cohort charts use color shading. For example, below is the same data, but with shading. In this chart, the darker the color, the higher the revenue per user.
The shading makes it easier to see that our cohorts’ value decays over time (they spend more in their first weeks than in later weeks). It also makes it easier to see anomalies - like the relatively low week 0 value of our Dec 10 cohort ($2.12).
While not universally true, the steep drop from time period 0 to 1 and the slow decay thereafter is typical of most cohort metrics. In general, users are most active right when they’re acquired and then quickly fade over time.
One note of caution - when looking at cohort charts, be mindful of the size of your cohorts. If you only have a handful of users per week, the metrics will likely be highly variable and what could look concerning may be just noise. In that case, consider looking at biweekly or monthly cohorts instead.
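A quick way to act on this is to count users per cohort at different granularities and flag the cohorts that fall below a minimum size. The sketch below does that for weekly and monthly groupings; the threshold of 5 users, the acquisition dates, and the function names are placeholders chosen for illustration, not recommendations.

```python
from collections import Counter
from datetime import date


def cohort_sizes(acquired_on, key):
    """Count users per cohort; key maps an acquisition date to a cohort label."""
    return Counter(key(day) for day in acquired_on)


def too_small(sizes, min_users):
    """Return the labels of cohorts with fewer than min_users users."""
    return sorted(label for label, count in sizes.items() if count < min_users)


def by_week(day):
    return tuple(day.isocalendar()[:2])  # (ISO year, ISO week number)


def by_month(day):
    return (day.year, day.month)


# Hypothetical acquisition dates for nine users spread over three weeks.
acquired_on = [date(2017, 12, 4)] * 3 + [date(2017, 12, 12)] * 2 + [date(2017, 12, 20)] * 4

weekly_flags = too_small(cohort_sizes(acquired_on, by_week), min_users=5)
monthly_flags = too_small(cohort_sizes(acquired_on, by_month), min_users=5)
print(weekly_flags, monthly_flags)
```

Here all three weekly cohorts fall below the placeholder threshold, while the single monthly cohort does not, which suggests switching to monthly cohorts for this data.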
Cohort analyses are complex, and looking through multiple metrics takes time, but the results are very useful for understanding your customers, your seasonality, and the effects of changes to your business.
Ready to start running a cohort analysis? Check out this tutorial on how to run cohort analyses in Google Analytics.