Wrapping Your Head Around “Relative Sizing” in Agile
“Relative sizing with story points.” You’ve probably heard the phrase, but it can be hard to wrap your head around the idea.
It can’t be as simple as it sounds, can it? It seems vague. Imprecise. Arbitrary. Well, yes, in a way it is. That’s what makes relative sizing useful! Here’s an analogy that may help you understand how and why we size user stories with relative story points rather than absolute time estimates.
Basically, humans are a lot better at comparing one thing to another than estimating the value of a single thing on its own. Relative sizing takes advantage of this ability, which is why we like to use it in Agile.
Ideally, when doing relative sizing, we size a new story by comparing its overall “heft” or weight with user stories we have previously completed. If the new story “feels” the same as the previous story, we give it the same point value. If it’s heavier, we give it a higher value; lighter, we give it a lower value. We try not to think about the sizes in terms of days or hours, because we’re more likely to get those wrong. It’s more useful to be imprecise and correct than precise and wrong.
Since it can be hard to think of numbers as imprecise, it may help to temporarily replace the story point numbers with words instead.
1 = Extra Small
2 = Small
3 = Medium
5 = Large
8 = Extra Large
13 = Huge
When you compare user stories, think of each number as a multiple of the smallest one. A small (or 2) is twice the size of an extra small (or 1). A medium is 3 times the size of an extra small.
As a rule, when in doubt, always go a size higher, even if the work only feels slightly above a given size. If you think a user story is a 4, then mark it as large (5). Set 6 and 7 to extra large, and 9 to 12 as huge. We jump in value like this to account for the greater uncertainty of larger estimates and the greater chance of surprises over the course of a longer task.*
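The round-up rule above can be sketched in a few lines of Python. This is only an illustration: the function name round_up_to_scale is made up for this example, and returning None for anything above 13 is my own assumption for "too big for the scale, break it down."

```python
from bisect import bisect_left
from typing import Optional

# Scale values and labels come from the list above.
SCALE = [1, 2, 3, 5, 8, 13]
LABELS = {1: "Extra Small", 2: "Small", 3: "Medium",
          5: "Large", 8: "Extra Large", 13: "Huge"}


def round_up_to_scale(gut_feel: float) -> Optional[int]:
    """Return the smallest scale value >= gut_feel.

    Returns None when the guess is larger than the biggest size
    (assumption: such a story should be broken down instead).
    """
    if gut_feel <= 0:
        raise ValueError("size must be positive")
    # bisect_left finds the first scale value that is not smaller than the guess.
    i = bisect_left(SCALE, gut_feel)
    return SCALE[i] if i < len(SCALE) else None


for guess in (4, 6, 7, 9, 12):
    size = round_up_to_scale(guess)
    print(f"{guess} -> {size} ({LABELS[size]})")
```

A guess that already sits on the scale stays where it is (a 5 stays a 5); only the in-between values get bumped up.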
People frequently call these the “t-shirt sizes” since they share names with the size labels on t-shirts (XS, S, M, L, XL). But I don’t want you to think in terms of t-shirts. A small t-shirt is not twice the size of an extra small t-shirt; real t-shirts increase by only about 1/16 of the total area as they go up each size. Imagine much bigger differences when you compare user stories.
Here’s the analogy I prefer over t-shirt sizes. I think of the scale of relative sizes more like this:
1 = Tiny / Extra Small = a rodent (mouse, rat, chipmunk, etc.)**
2 = Small = a house cat
3 = Medium = a dog
5 = Large = a lion
8 = Extra Large = a horse
13 = Huge = an elephant
20+ = Gargantuan = a whale (too big! Break it down)
This is a lot closer to the differences in size we intend for user stories. Each of these categories is predictable in size, but still a range: cats are usually larger than rats, but there is a variety of sizes and shapes among both. If I saw a random cat, I couldn’t tell you how much it weighed, though I’d know it’s likely between 5 and 20 pounds and could guess even closer after holding it.
And, like work items, these ranges also overlap. The smallest dog is smaller than the largest cat, and someone’s “horse” could be a pony, which is much smaller than a lion. In cases like this (estimating a kind of work we’re not familiar with or doing estimates with a new group), we have to ask questions to know for sure where to fit things.
Let’s say your team’s gotten good at sizing your “cats” and “dogs” and “lions.” You understand that work. Now someone brings in some new type of work, something that they call a “bear.” You’re not a “bear” expert. You’ve heard about them, and what comes to mind is a brown bear, about the size of a lion. So your reflex is to throw it into your “large” category. Is that sufficient? What if you’re wrong, and the bear is a Kodiak? Those are the size of a horse (extra large). What if it’s a koala? That’s closer to a cat (small). So, in reality, the bear could be anywhere from small to extra large. We need more information.
At this point we start asking our Product Owner questions to help us estimate this new work we don’t know much about. If they don’t know, we might create a spike user story to investigate it this sprint, then use what we learn to size and implement our “bear” story in the next sprint.
Every team has outliers (ponies, Kodiaks, and mastiffs). One of the advantages of using relative sizing is that over the course of several sprints, the high outliers tend to balance out the low ones. Relative sizing works better with the way human minds work. Overall, it’s a faster technique than time estimates. And it reminds us that story points are just estimates in a way that calling something “an hour” or “a day” does not.
*Wherever possible, you should break up user stories that are 8 or 13 points (i.e. extra large or larger) into smaller stories. For most teams, setting a story to 13 or more points says that it will take longer than a sprint to complete. Huge/13-point stories are risky and usually require multiple people to complete them. You lower your risk by breaking them up into smaller chunks.
**Personally, I use monsters from Dungeons & Dragons for my sizes: stirge, goblin, human, ogre, hill giant, dragon, and leviathan. But I’m a gamer. As long as the scale fits, use whatever works for you.