Should the team reduce future estimates after becoming competent at a new skill, because estimates were increased while learning?

Question

I have been pushing unit testing lately. This is a new skill for my team. I have 10+ years of experience writing unit tests, but I am basically the only person on the team with any experience with this at all. I have been struggling lately with how to budget for learning these skills. Forcing people (me included) to learn all new skills outside work hours doesn't work. We have families. Work at work. Home at home. We are all allotted training hours each quarter, which is great. However, blog posts, YouTube videos and PluralSight tutorials only get you so far.

I got this harebrained idea to increase story points for stories where unit tests are required. This effectively reduces the amount of functionality we can deliver per story point. At the time it felt fine, since we are increasing the total effort. In my mind this increase was justified by the "unknowns" of writing unit tests. I also expect story point estimates to come back down after our team members have become competent at unit testing.

I originally got this harebrained idea from another harebrained idea to increase story point estimates for stories that required writing automated end-to-end tests with Selenium. This resulted in features that used to be 1 story exploding into 6+ stories. Story #1 included development and writing a single automated test. This usually turned out to be a 13 point story. As a general rule the team feels comfortable delivering an 8 point story in a 3 week sprint. Anything higher and our confidence goes down exponentially. A 13 point story is worrisome. A 20 point story in one sprint? Yeah, and while we're at it I'd like a pony too.

So that first story would be 13 points, then we would have 4-5 stories estimated at 3 to 5 points each. The smaller stories were literally the effort required to write the automated test, including the addition of any test infrastructure code, like Selenium page models. These tests all verified distinct, testable end user behavior.

Team velocity initially suffered, but eventually went up. Story point estimates never came back down. We continued our story breakdown of a single 13 point story and then a bunch of 3 to 5 point stories to write automated tests.

Now we fast forward to my current situation of learning unit testing. The team estimated a story at 13+ story points again, and there is no way to break this story down into anything smaller. For our team, a "story" is basically something an end user can interact with. Pretty general, but if an end user cannot see or interact with it, then it is not a user story.

I requested we do unit tests that require mocking a single method on an interface used to send an e-mail. We create and send the e-mail using the Postal NuGet package, which makes sending an e-mail no more complicated than rendering a web page with a view model and Razor template (our team has extensive experience with ASP.NET MVC).

The unit tests would cover a "service" class invoked when removing people from a business customer account. Anyone who is removed should get an e-mail notification. The new unit tests should cover the fact that e-mails get sent to each person who is removed. They do not need to assert the contents of the e-mail, just that the e-mail gets sent. This involves mocking the IEmailService.Send(Email) method.
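
For illustration only, a test of this shape might look like the sketch below. It is written in Python with the standard library's `unittest.mock` rather than the team's C#, and `AccountService`, `remove_people` and the fields on `Email` are hypothetical names invented for the example, not the real classes. Only the idea of mocking a single `send` method comes from the description above.

```python
from dataclasses import dataclass
from unittest.mock import Mock


@dataclass(frozen=True)
class Email:
    # Hypothetical stand-in for the e-mail model; fields are assumptions.
    to: str
    subject: str


class AccountService:
    """Hypothetical service class; the e-mail collaborator is injected."""

    def __init__(self, email_service):
        self._email_service = email_service

    def remove_people(self, account_name, addresses):
        for address in addresses:
            # Real code would also remove the person from the account here.
            self._email_service.send(
                Email(to=address, subject=f"You were removed from {account_name}")
            )


def test_sends_one_email_per_removed_person():
    email_service = Mock(spec=["send"])  # a mock that exposes only send()
    service = AccountService(email_service)

    service.remove_people("Acme", ["ann@example.com", "bob@example.com"])

    # Assert that an e-mail went to each removed person, not what it says.
    recipients = [call.args[0].to for call in email_service.send.call_args_list]
    assert recipients == ["ann@example.com", "bob@example.com"]
```

The test checks only who was e-mailed, which matches the stated goal of not asserting on the contents of the message.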

This 13 point story makes me nervous. We are halfway through our 3 week sprint and I am still getting basic questions about unit testing fundamentals. I'm afraid we are going to miss our goal this sprint, which is why the story got a 13 point estimate. Each time I tried introducing unit tests, even in smaller, simpler stories, the team always gave me a 13+ point estimate. I'm afraid no story is small enough for a single sprint anymore once you factor in development, automated tests and unit tests. This is simply too much for the speed and skill level of this team, a trend I have noticed the entire 4 years I've led this project. I'm just simply hitting a brick wall.

We do not adjust story points based on who gets assigned the story. To be honest, no single person works on a story anyhow. I've read "Where does learning new skills fit into Agile?", but at some point you must utilize the new skill, and this is my conundrum. Since I am the team lead, scrum master, business analyst, graphic designer, BDD practitioner and architect of this project, I frequently do not have time to pair program with every person on the team. This large number of responsibilities is not changing any time soon, either.

It seems we must deal with a reduced velocity, or increase the estimates. I've chosen the latter of the two.

After increasing story point estimates in order to learn unit testing, should the team reduce future story point estimates for similar work based on the assumption that the "unknowns" of learning to write unit tests are no longer unknown?

Logically, if you are inflating the points to allow for learning time, then you will stop inflating the points once the learning is thought to be complete and no further learning needs to be accounted for. — Steve, May 27 '20 at 22:27
@Steve: that was my logic as well, but the story points never came down when I did this to introduce automated tests. — Greg Burghardt, May 28 '20 at 00:17
How long a period have you allowed for the points to come down? Also, it is possible that some other factor is increasing - for example, as the learning curve eases, are they spending the released time on developing more quality which at first was skipped? Have they had enough learning and digestion time to start systemising and performing as quickly as you would expect (on top of and in the context of all other responsibilities), or just enough learning to engage them in a constant inefficient muddle? Are they reaching the point of exhaustion in some way? Some things to think about anyway. — Steve, May 28 '20 at 01:20
"This resulted in features that used to be 1 story exploding into 6+ stories" did you really mean "6+ stories" or should that have been "6+ points"? — Bryan Oakley, May 28 '20 at 19:09
"The team estimated a story at 13+ story points again, and there is no way to break this story down into anything smaller." - that is almost never really true. Things can always be broken down into smaller chunks. According to your post that represents about 5 weeks of work. I find it hard to believe that 5 weeks of work for an entire team can't be broken down into smaller chunks. — Bryan Oakley, May 28 '20 at 19:16

Robert Harvey · Accepted Answer · 2020-05-27T22:23:31.323

I see several potential problems here.

The whole point of using story points and velocity is to hand-wave away hourly estimates, but ultimately story points must eventually correlate in some way to how long it takes your team to get things done. If your team can complete 30 story points in each three week sprint (without working any overtime), that means each story point takes roughly 4 hours to complete.
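
The arithmetic behind that figure can be written out as a small Python sketch. The 40-hour week is an assumption (it matches "without working any overtime"), and the team size passed to the second function is purely illustrative.

```python
HOURS_PER_WEEK = 40  # assumption: a standard 40-hour work week, no overtime


def calendar_hours_per_point(points_per_sprint, sprint_weeks):
    """Calendar hours of sprint time that elapse per completed story point."""
    return sprint_weeks * HOURS_PER_WEEK / points_per_sprint


def person_hours_per_point(points_per_sprint, sprint_weeks, team_size):
    """The same figure scaled by head count (team_size is illustrative)."""
    return calendar_hours_per_point(points_per_sprint, sprint_weeks) * team_size


print(calendar_hours_per_point(30, 3))   # 4.0
print(person_hours_per_point(30, 3, 4))  # 16.0
```

Either number is only a rough conversion; its value is as a sanity check on whether estimates and delivered work still line up.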

In my opinion, the story points and velocity should inform your estimation process, not the other way around. Simply increasing the estimates is not going to work; your team has to figure out how to get things done in a more timely fashion so that the story points and velocity eventually normalize.

If the team estimates 30 story points for a task, but completes it in Week 1 of the sprint and has time to complete 10 more story points in other priorities before the end of the sprint, that's a good problem to have. That isn't, however, the problem you appear to have.

So here are my thoughts, in no particular order.

Unit testing with mocks is difficult and expensive. In my experience, it's better to engineer your API so that it doesn't require mocks to test, and you get a better design in the bargain. Consider writing your tests first, so that they inform your API's design and serve as a partial "definition of done."
Find a way to increase the granularity of your tasks. Smaller tasks that are easier to complete are also easier to estimate. Twenty story points for a task is too large for a team that's only capable of 30 points per sprint, unless your team is especially disciplined.
Let the velocity and story points of the team speak for themselves. If the team consistently estimates more story points per task than it takes, gradually dial the estimates back and make sure there's plenty of work in the backlog to fill the void. If it's taking longer than it should to complete things, stretch the estimates and work on the root causes of the work slowdowns.
Pragmatism rules. If the team was consistently producing reliable software before unit testing was introduced, it might be time to re-evaluate your approach. Check your staffing levels; you may need more developers to accommodate the increased workload.
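
As one possible reading of the first point, the sketch below separates deciding which notifications are needed from actually sending them, so the decision can be tested without any mock. It is written in Python rather than C#, and `Notification` and `removal_notifications` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Notification:
    # Hypothetical value object describing an e-mail that should be sent.
    to: str
    subject: str


def removal_notifications(account_name, removed_addresses):
    """Pure function: decide which e-mails to send, without sending them."""
    return [
        Notification(to=address, subject=f"You were removed from {account_name}")
        for address in removed_addresses
    ]


def test_one_notification_per_removed_person():
    notes = removal_notifications("Acme", ["ann@example.com", "bob@example.com"])
    # Plain assertion on a return value; no mocking framework involved.
    assert [n.to for n in notes] == ["ann@example.com", "bob@example.com"]
```

A thin caller would then hand each `Notification` to the real mail sender. That remaining piece is small enough to cover with an existing end-to-end test, though whether this trade-off suits a given codebase is a judgment call.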

Your velocity and story points are telling you there's a problem. Don't try to re-engineer those metrics; work on the root causes.

True story: A former boss once told me that the story point system and software development process had become so institutionalized and corrupt at one of his jobs that simple changes like adding a dropdown to a form took three months to complete, because the development team had hijacked the estimation process. Don't let this happen to you.

"Don't try to re-engineer those metrics; work on the root causes." The sad irony is we were trying to address the root causes by budgeting time in our sprints to develop new skills, and in the meantime it appears we've reengineered the metrics. This is what sprint retrospectives are for... — Greg Burghardt, May 28 '20 at 09:55
"If your team can complete 30 story points in each three week sprint (without working any overtime), that means each story point takes roughly 4 hours to complete." Small correction: each story point requires 4 hours times the number of team members. For a 4 man dev team, that's actually 16 hours per point. — Flater, May 28 '20 at 13:54
@Flater: I was referring to calendar hours, not "man-hours," based on a 40 hour work week. — Robert Harvey, May 28 '20 at 19:30

Ewan · Answer 2 · 2020-05-27T22:13:51.137

It's 2020; you would have thought all developers were on board the unit test train by now.

In terms of points and estimates, I would say you are getting hung up on details. You know that unit tests will speed up development in the long term and have accepted it as a requirement.

Have a "definition of done" that includes unit tests and let the devs estimate the tasks. Don't challenge or worry about their estimates; just keep track of velocity and use that to predict end dates. I'd wager that your stress over estimates pushes them up and eats up time in meetings.
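
A minimal sketch of that kind of tracking, in Python: average the points completed over the last few sprints and divide the remaining backlog by it. The three-sprint window, the function name and the sample numbers are all assumptions made for the example.

```python
import math
from statistics import mean


def sprints_remaining(completed_points_per_sprint, backlog_points, window=3):
    """Forecast sprints left using the mean velocity of the last `window` sprints."""
    recent = completed_points_per_sprint[-window:]
    velocity = mean(recent)
    if velocity <= 0:
        raise ValueError("velocity must be positive to forecast")
    return math.ceil(backlog_points / velocity)


# Made-up history: points completed in each past sprint.
history = [21, 26, 30, 28, 32]
print(sprints_remaining(history, backlog_points=120))  # 4
```

Because the forecast uses whatever scale the team actually estimates in, it keeps working even if the size of a point drifts, as long as the drift is gradual.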

I'd also say the points seem a bit big if 8pts = 3weeks. I would recommend 1 week sprints and estimate in days. Let the team set their own targets.

The definition of story might also be part of the problem. "Make the button green on mouse over" can be a story.

edited May 27 '20 at 22:13

answered May 27 '20 at 22:07

"you would have thought all developers were onboard the unit test train by now" - it still comes down to the resources that an organisation has, both to reproduce the skills (or buy them pre-made from the market) and to pay for the necessary development time. – Steve May 27 '20 at 22:37
We have dramatically reduced the size of stories. It is now to the point where in order to get a smaller story we would be bending the rules to absolute silliness. A task to write a method that is not yet being called by the application is not a user story, but our team is seriously at that point. I'm legitimately afraid that requiring the devs to write unit tests for a method will take longer than 3 weeks, but might take a work day without unit tests. – Greg Burghardt May 28 '20 at 10:03
@Steve what I mean is I would expect all software engineers on the market now to be supportive of, and have experience writing, unit tests of various kinds. Just like I would expect all C# devs to know about async – Ewan May 28 '20 at 10:24
@GregBurghardt I would give up on the idea of stories and just have tasks sized by the engineers to whatever they think they can get done in 2-3 days (with unit tests). Before I gave up on unit tests I would give up on scrum – Ewan May 28 '20 at 10:29
This leads to bigger institutional problems. Project Management is its own behemoth where I work. Scrum teams are expected to keep a consistent velocity. When points go down, red flags get raised and I get called to the carpet. This was another reason I opted to increase story points. Sometimes it feels like as long as we deliver X number of points each sprint, management could care less whether we deliver a green button on mouseover or an entire search page. Maybe Planning Poker should be like "Whose Line Is It Anyway?" - a game where anything goes and the points don't count. – Greg Burghardt May 28 '20 at 11:19
just normalise the estimates to a total of X points per sprint and be sure to add 10% each month – Ewan May 28 '20 at 11:24
@Ewan, in an ideal world yes, but in my impression most experienced software developers are still produced in small firms where speed and variety of delivery is valued, and long-term robustness is not valued. Most C# devs may know about async, but whether they are all proficient in designing systems with asynchronous code is likely another question! – Steve May 28 '20 at 13:08
I guess that's the kind of argument that makes me roll my eyes. There is no downside to unit tests – Ewan May 28 '20 at 13:10
@Ewan, no downside except the paid working time it takes to learn the approach and to implement them to a high standard! It's the same with documentation. I agree the payoffs of unit tests arrive at a very modest level of complexity, but the swingeing costs of software development also arrive at a very modest level of complexity. Faced with recognition that they cannot afford more complex development done at a high standard, many businesses or developers do not desist, instead they plough on with a lower standard and more erratic software performance, often until breaking point. (1/2) – Steve May 28 '20 at 17:36
There may be long-term costs to this approach if the software is stable to begin with and lasts a long time, but many software systems are as badly conceived and written as they are badly tested, so they are regularly overhauled, and some entire businesses fold or are absorbed in the long-term and the software is then simply jettisoned. Also, firms that try to make software long-lived, often then face a problem in recruiting and retaining maintenance programmers for old platforms and frameworks, so it's actually cheaper (if also more labour intensive) to constantly rewrite. (2/2) – Steve May 28 '20 at 17:38
I would argue that the definition of done also includes end-to-end selenium tests. – Bryan Oakley May 28 '20 at 19:13
@GregBurghardt: "I'm legitimately afraid that requiring the devs to write unit tests for a method will take longer than 3 weeks, but might take a work day without unit tests." - if you don't write tests, how would you know if that feature is working? And how will you know that the feature continues to work as the project progresses? – Bryan Oakley May 28 '20 at 19:34
@BryanOakley: I completely agree. In fact, they are part of our definition of done. Unit tests are new. – Greg Burghardt May 28 '20 at 19:35
@Steve: I've seen (and worked for) companies that constantly had to rewrite poorly written software. Each one of them floundered and failed as a direct result. Competitors who went slower, but delivered better quality software did better over time. And frequently bought out the companies that didn't. – Greg Burghardt May 28 '20 at 19:41
@GregBurghardt, I've certainly seen firms suffer from their poor quality software. But engaging in such hopeless competition with those who are doing it better, can still have functions. It functions when it creates management and ownership positions for those in a low-road firm, who as you say may even be bought out in due course. It functions when high-road competitors are sometimes forced out of business in the short run by a more agile or low-cost competitor, and the low-road firm seizes market share. (1/2) – Steve May 28 '20 at 20:20
And it functions when such low-road competition also acts as a fulcrum for the high-road business to attack the pay and conditions of all their workforce - they will point to the competitive threat, falling market share, and lower wages of the low-road competitor to discipline the workforce - so the industry bankers can also have an interest in bankrolling such hopeless competition. The effect is that the high-road firms continue to operate, they retain their high-road workforce, but at substantially higher profitability and lower wages than if there was no competition. (2/2) – Steve May 28 '20 at 20:23
yeah steve, your whole premise that unit tests cost money is wrong. but we are veering off the point. – Ewan May 28 '20 at 20:40
(3/3!) And to clarify my point, good software does cost a lot of money. It costs money not just because of the time it takes to do it properly, but because of the amount of skill and experience it takes for design, and the amount of time and money it takes to reproduce that skill and experience (which is then vested in the employee not the business). Most firms simply cannot afford to do it at that standard - but they can afford to hire self-taught computer nerds who can do it at a low standard given little or no training, and these firms are the main producers of experienced software devs. – Steve May 28 '20 at 20:50
@Ewan, clearly unit tests cost time up front, and therefore they cost a business money up front. I'm not disputing that they become valuable at a relatively low level of software complexity. But quality documentation is also useful. And good design is worth its weight in gold. But these things all have reproduction costs in terms of producing the workers who are able to do these tasks to a good standard, and most businesses cannot afford that level of quality. What they can afford is badly designed, untested, and short-lived software, and that really is where most developers come from. – Steve May 28 '20 at 20:57
I think what I should clarify as well, is that if you have something so badly designed that it will soon need to be modified in a global fashion, and you write unit tests for this pig's ear too, then soon you will have to not just rewrite the core code but also rejig a lot of unit tests too because the modifications touch so many things. So you've paid for unit tests that got thrown away - they had zero value. So although unit tests support good design, they impose additional taxes on bad design. If you can't afford good design, then the best thing is to pay the least tax on bad design. – Steve May 28 '20 at 21:13
no, it's quicker to write code + tests than just code. costs less – Ewan May 28 '20 at 23:19

score 1 · Answer 3 · answered May 28 '20 at 19:32

"I'm afraid no story is small enough for a single sprint anymore once you factor in development, automated tests and unit tests."

I think that's the root of the problem - your stories are simply too large. I find it hard to believe that you can't break a 13 point story -- which represents roughly 5 weeks from your entire team -- into three or four smaller stories.

My recommendation would be to challenge the team to write better, smaller stories. With smaller stories will come more accurate estimates. Based on the numbers you gave in your post, I would suggest requiring that no story be larger than 4 points, including time for all testing for that story. If it's bigger than that, break it into two stories.
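
The rough numbers above can be reproduced with a short Python sketch. It assumes that the 8 points the team is comfortable delivering in a 3 week sprint (as stated in the question) stands in for one sprint's worth of work. That is an interpretation, not a measured velocity.

```python
def weeks_for_story(story_points, comfortable_points_per_sprint, sprint_weeks):
    """Scale sprint length by how far a story exceeds one sprint's comfortable load."""
    return story_points / comfortable_points_per_sprint * sprint_weeks


print(weeks_for_story(13, 8, 3))  # 4.875, i.e. roughly 5 weeks
print(weeks_for_story(4, 8, 3))   # 1.5, the suggested 4 point cap
```

On those assumptions a 4 point cap keeps every story to about half a sprint, which leaves room for the testing work inside the same sprint.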

"After increasing story point estimates in order to learn unit testing, should the team reduce future story point estimates for similar work based on the assumption that the "unknowns" of learning to write unit tests are no longer unknown?"

The team shouldn't artificially reduce story points. If you were artificially adding extra story points, however, you should stop doing that.

The story points should naturally fall as the team develops skills. Story points should reflect the honest opinion of the team for fully finishing a story, including all testing, documentation, etc. As they become more proficient in testing, the time will naturally decrease.

We have been utilizing Example Mapping sessions with the 3 Amigos. This has been a huge help with knowing what to build. This has also helped us cut things into much smaller stories, but the problem I am seeing now is that we have been inflating story points to account for training and learning. These "13 point stories" in my opinion should be 3 or 5 pointers. Eight points tops. But the team consistently estimates them at 13+ due to fears of learning new skills along with writing automated tests. — Greg Burghardt, May 28 '20 at 19:37
@GregBurghardt: if it really takes your team a solid week to write a couple of unit tests, something is very wrong. I can understand a day, maybe even two, but beyond that there's something fundamentally wrong. Perhaps you can bring that up in a retrospective and try to find the root cause. Maybe the solution is to invest more time in building out your testing infrastructure, such as creating a library of mocks so they don't have to reinvent the wheel for every test. — Bryan Oakley, May 28 '20 at 19:41
@GregBurghardt: To add to what Bryan said in the comment - automated tests will have a higher story point if they haven't generated a framework to build them out of (On top of the imported framework, a series of classes that they import and/or extend to do the setup work of a test would help, but itself takes time to develop and standardize), because they aren't copying and pasting for individual unit test writing sessions - if each test is unique, they could have a higher story point still. Though there could be other root causes, this stands out as a possible root issue. — Alexander The 1st, Jan 21 '23 at 06:39

score 1 · Answer 4 · answered Jun 03 '20 at 16:37

You write in a comment: "I'm legitimately afraid that requiring the devs to write unit tests for a method will take longer than 3 weeks, but might take a work day without unit tests."

Sounds like the team does not want to do unit tests, and thus arbitrarily increases the estimated effort. Use the training budget to do unit test workshops. Convince them to want unit tests before forcing them.

Creating unit tests will increase the complexity of a story. So, the number of points will increase, and fewer features will be done per sprint. In the beginning the effect will be bigger (although not as big as what you are experiencing).

Because of the unit tests, future refactorings will become easier. Release efforts may drop or disappear. Fewer bugs may show up in the release, reducing bug fixing efforts. The time saved can be spent on realizing features, thus increasing velocity.
