This is part of an ongoing series, focusing on how to use numbers to analyze and run your company more efficiently and effectively. You can find more at the contents page.

If you run a site, app or service that tracks individual users, whether through sign-ups, downloads or some other mechanism, you are almost certainly aware of the regular metrics: active users, retention, viral coefficient and so on. A very good way to gain further insight into your business is cohort analysis.

The traditional cohort definition is pretty self-explanatory:

A type of multiple cross-sectional design where the population of interest is a cohort whose members have all experienced the same event in the same time period (eg birth).


Replace birth with sign-up, for example, and the definition applies to your business: a cohort is the group of users who signed up in the same period.
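As a rough illustration of that definition, here is a minimal sketch that groups users into sign-up cohorts and counts how many of each cohort are active in each period. The event log, its field layout and the `cohort_table` helper are all made up for this example; your own tracking data will look different.

```python
from collections import defaultdict

# Hypothetical activity log: (user_id, signup_period, active_period).
events = [
    ("u1", 1, 1), ("u1", 1, 2), ("u1", 1, 3),
    ("u2", 1, 1),
    ("u3", 2, 2), ("u3", 2, 3),
    ("u4", 2, 2),
]

def cohort_table(events):
    """Map sign-up period -> {active period -> number of distinct active users}."""
    users = defaultdict(lambda: defaultdict(set))
    for user_id, signup_period, active_period in events:
        # A set ensures a user is counted once per period, however active they were.
        users[signup_period][active_period].add(user_id)
    return {
        cohort: {period: len(ids) for period, ids in sorted(by_period.items())}
        for cohort, by_period in sorted(users.items())
    }

table = cohort_table(events)
for cohort, counts in table.items():
    print(f"cohort {cohort}: {counts}")
```

Each row of the resulting table is one cohort, and reading along a row shows how that group shrinks (or holds) over time.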

Here’s what you want to get out of such an analysis:

  • Analyze and present user behavior by showing similar behavior over different time periods.
  • Explain how your current user base (for the given criteria) is made up of user groups from previous time periods.
  • Estimate what percentage of your revenue you can expect, a couple of periods out, from the sign-ups you are generating now.
  • Become more predictable in both user growth and revenue, easing the growth process and thus opening the door to planning.


What you want to prove is that you can predict future behavior by showing that users act the same over different periods. In other words, you want to show a recurring pattern.
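One way to check for such a pattern is to express every cohort as a fraction of its original sign-ups and compare the cohorts at the same age (periods since sign-up). The sketch below does that; the counts are invented for illustration, and `retention_curves` and `max_spread_by_age` are hypothetical helper names, not part of any library.

```python
# Hypothetical data: sign-up period -> active users in each period since sign-up.
# Index 0 is the sign-up period itself, index 1 is one period later, and so on.
cohorts = {
    1: [200, 100, 60, 50, 50],
    2: [220, 110, 68, 55],
    3: [240, 118, 70],
    4: [210, 105],
}

def retention_curves(cohorts):
    """Express each cohort's counts as a fraction of its original sign-ups."""
    return {c: [n / counts[0] for n in counts] for c, counts in cohorts.items()}

def max_spread_by_age(curves):
    """For each age, the largest gap between any two cohorts' retention."""
    longest = max(len(curve) for curve in curves.values())
    spreads = []
    for age in range(longest):
        values = [curve[age] for curve in curves.values() if len(curve) > age]
        spreads.append(max(values) - min(values))
    return spreads

curves = retention_curves(cohorts)
spreads = max_spread_by_age(curves)
for age, spread in enumerate(spreads):
    print(f"age {age}: cohorts differ by at most {spread:.1%}")
```

If the spread stays small at every age, the cohorts behave alike, and an older cohort's curve becomes a reasonable first guess for how a newer one will develop.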

Presentation (Example)
This sort of analysis is best presented in a graph as shown below:

What you can see immediately is that the area on the right (Period 5) stacks up our current status from the users who signed up in Period 1 through Period 4. The really interesting piece of the puzzle comes into play when you consider what exactly your users represent: active users, subscribers, etc. So here is what we can infer from the chart:

  • The height of the chart at Period 5 (at 280) is the number of users currently using (or paying for) our system/app.
  • The individual stacks have a drop-off. As we can see, the drop-off is high in the beginning and then starts to level out but does not go down to zero. Since this is homogeneous across all periods, we can infer that there is something we are doing right: user behavior becomes predictable.
  • New users signed up in each of Periods 1 to 4, and the users from Period 1 make up about 17.9% (50 out of 280) of the users in Period 5.
  • The drop-off from one period to the next shrinks in later periods, with each cohort leveling out at about 25% of its original sign-ups after 3 periods.
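The composition reading of the chart can be reproduced with a few lines of arithmetic. In this sketch only the total of 280 and the 50 users from Period 1 come from the example above; the split across Periods 2 to 4 is an assumption chosen so that the stack adds up.

```python
# Users active in Period 5, grouped by the period in which they signed up.
# Only Period 1 (50) and the total (280) are from the example; the rest is assumed.
current_users = {
    "Period 1": 50,
    "Period 2": 55,
    "Period 3": 70,
    "Period 4": 105,
}

total = sum(current_users.values())
shares = {cohort: count / total for cohort, count in current_users.items()}

print(f"total users in Period 5: {total}")
for cohort, count in current_users.items():
    print(f"{cohort}: {count} users, {shares[cohort]:.1%} of the current base")
```

The same calculation works for revenue instead of user counts if each cohort's contribution is measured in money rather than heads.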

Cohort analysis offers great insight into what is happening in your user base. Collecting and digesting the data takes a lot of time, especially once your site becomes more popular, but it is well worth the effort for the insight you gain into your users' behavior and the development of your site/app. After all, cohort analysis is not too complex, and its results are easy to grasp graphically.

UPDATE: LSVP posted an interesting article on this as well.