85 points
by: SiempreViernes
10 days ago
32 comments
tmnvdb
10 days ago
I've never encountered cycle time recommended as a metric for evaluating individual developer productivity, making the central premise of this article rather misguided.
The primary value of measuring cycle time is precisely that it captures end-to-end process inefficiencies, variability, and bottlenecks, rather than individual effort. This systemic perspective is fundamental in Kanban methodology, where cycle time and its variance are commonly used to forecast delivery timelines.
octo888
10 days ago
> The primary value of measuring cycle time is precisely that it captures end-to-end process inefficiencies, variability, and bottlenecks, rather than individual effort
Yes! Waiting for responses from colleagues, slow CI pipelines, inefficient local dev processes, other teams constantly breaking things and affecting you, someone changing JIRA yet again, someone's calendar being full, stakeholders not available to clear up questions around requirements, poor internal documentation, spiraling testing complexity due to microservices etc. The list is endless
It's borderline cruel to take cycle time and use it to measure and judge the developer alone.
dagmx
9 days ago
Imho cycle time can perhaps only be taken as a comparison across people who are doing similar things (likely teammates), or against recurring estimates if those are consistently off.
But generally when I’m evaluating cycle efficiency, it’s much better to look at everything around the teams instead. It’s a good way to improve things for everyone across the space as well, because it helps other people too.
to11mtm
9 days ago
YES ALL OF THIS.
- Dev gets a bug report.
- Dev finds problem and identifies fix.
- Dev has to get people to review the PR. Oh, and BTW, CI takes 5-10 minutes just to tell them whether their change passes everything, despite the fact that tests are only being written for new code and overall coverage is only 20-30%.
- Dev has to fill out a document to deploy to even Test Environment, get it approved, wait for a deployment window.
- Dev has to fill out another document to deploy to QA Environment, get it approved, wait for a deployment window.
- Dev has to fill out another document for Prod, get it approved....
- Dev may have to go to a meeting to get approval for PROD.
That's the -happy- path, mind you...
... And then the Devs are told they are slow rather than the org acknowledging their processes are inefficient.
theplatman
9 days ago
This is crazy. I have worked at some large orgs and never experienced that level of bureaucracy to deploy to test/qa environments.
I’ve seen sensitive apps that need an approval to go to prod, but it was async and didn’t require a meeting!
tmnvdb
8 days ago
It's not just cruel, it's stupid. Not only is cycle time a very poor measure of individual productivity, but using it to measure individuals will also create very bad incentives that make your team perform significantly worse!
vasco
10 days ago
If everything rolled into cycle time - as you correctly say - feeds the forecast for delivery timelines, and one developer, over a large enough period of time working on the same codebase, has half the cycle time of another, does that really tell you nothing?
Assume you're in a team where work is distributed uniformly, and that it's not a case of the faster person only picking up small items.
Etheryte
10 days ago
No, it doesn't tell you anything. Someone is consistently delivering half the tickets compared to another person. Are they slow, lazy, or something else? Or are they working on difficult tickets that the other person wouldn't even be able to tackle? Cycle time doesn't tell you anything about what's behind the number.
vasco
9 days ago
> Someone is consistently delivering half the tickets compared to another person
So it does tell you something. You also nicely avoided the condition I gave you, which is that the team picks up similar tickets and one person doesn't just pick up the easy ones. Assume there's a team lead who isn't blind.
hobs
9 days ago
Work is never distributed uniformly, that's a silly assumption.
tmnvdb
8 days ago
You've misunderstood: cycle time is neither a forecast nor a measure of individual productivity.
Cycle time measures how long it takes for a unit of work (usually a ticket) to move from initiation to completion within a team's workflow. It is a property of the team / process, not individuals. It can be used to generate statistical forecasts for when a number of tasks are likely to be completed by the team process.
For most teams, actual programming or development tasks usually represent only a small portion—often less than 20%—of the total cycle time. The bulk of cycle time typically results from process inefficiencies like waiting periods, bottlenecks, handoffs between team members, external dependencies (such as waiting for stakeholder approval or code review), and other friction points within the workflow. Because of this, many Kanban-based forecasting methods don't even attempt to estimate technical complexity. They focus instead on historical cycle time data.
For example, consider a development task estimated to take a developer only two days of actual programming. If the developer has to wait on code reviews, deal with shifting priorities, or coordinate with external teams, the total cycle time from task initiation to completion might end up taking two weeks. Here, focusing on the individual’s performance misses the bigger issue: the structural inefficiencies embedded within the workflow itself.
Even if tasks were perfectly and uniformly distributed across all developers—a scenario both unlikely and probably undesirable—this fact would remain. The purpose of measuring cycle time is to identify and address overall process problems, not to evaluate individual contributions.
If you're using cycle time as an individual performance metric, you're missing the fundamental point of what cycle time actually measures.
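To make the forecasting point concrete, here is a minimal sketch of that kind of statistical forecast, assuming nothing more than a list of historical cycle times in days. The numbers and the crude one-ticket-at-a-time batch model are made up for illustration:

    import random

    # Hypothetical historical cycle times (days) for recently completed tickets.
    historical_cycle_times = [2, 3, 3, 4, 5, 5, 6, 8, 9, 13, 21]

    def percentile(samples, pct):
        # Empirical percentile: value below which roughly pct% of samples fall.
        ordered = sorted(samples)
        return ordered[min(len(ordered) - 1, int(pct / 100 * len(ordered)))]

    # Single-item forecast: "85% of tickets finish within N days of starting."
    print("85th percentile cycle time:", percentile(historical_cycle_times, 85))

    # Batch forecast via Monte Carlo: resample past cycle times to estimate how
    # long 20 tickets take to flow through a crude single-WIP process.
    def simulate_batch(n_items, trials=10_000):
        totals = [sum(random.choice(historical_cycle_times) for _ in range(n_items))
                  for _ in range(trials)]
        return percentile(totals, 85)

    print("85% confidence for 20 tickets:", simulate_batch(20), "days")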
CSMastermind
9 days ago
Making an efficient software team is literally all about reducing communication overhead.
dgfitz
10 days ago
My current org can have a cycle time on the order of a year. Embedded dev work on a limited release cadence, where the Jira (et al.) workflow is sub-optimal and tickets don’t get reassigned, only tested, destroys metrics of this nature.
If this research is aimed at web-dev, sure I get it. I only read the intro. Software happens outside of webdev a lot, like a whole lot.
resource_waste
10 days ago
A thank you to HN who told me to multiply my estimates by Pi.
To be serious with the recipient, I actually multiply by 3.
What I can't understand is why my intuitive guess is always wrong. Even when I break down the parts, GUI is 3 hours, Algorithm is 20 hours, getting some important value is 5 hours... why does it end up taking 75 hours?
Sometimes I finish within ~1.5x my original intuitive time, but that is rare.
I even had a large project which I threw around the 3x number, not entirely being serious that it would take that long... and it did.
jyounker
10 days ago
Because "GUI" and "Algorithm" are too big. You have to further decompose into small tasks which you can actually estimate. An estimable composition for a GUI task might be something like:
* Research scrollbar implementation options. (note, time box to x hours).
* Determine number of lines in document.
* Add scrollbar to primary pane.
* Determine number of lines presentable based on current window size.
* Determine number of lines in document currently visible.
* Hide scrollbar when number of displayed lines < document size.
* Verify behavior when we reach the end of the document.
* Verify behavior when we scroll to the top.
When you decompose a task it's also important to figure out which steps you don't understand well enough to estimate. The unpredictability of these steps is what blows your estimate, and the more of them there are, the less reliable your estimate will be.
If it's really important to produce an accurate estimate, then you have to figure out the details of these unknowns before you begin the project.
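One way to see why the unknowns dominate: give each subtask a point estimate plus an uncertainty, let the poorly understood ones vary a lot (lognormal, so overruns are much bigger than underruns), and look at the tail of the simulated total. The numbers below are made up for illustration:

    import math, random

    # (task, point estimate in hours, sigma of a lognormal multiplier on it)
    subtasks = [
        ("GUI",             3.0, 0.3),   # well understood
        ("Algorithm",      20.0, 0.8),   # poorly understood
        ("Important value", 5.0, 0.6),   # somewhat understood
    ]

    def simulate_totals(trials=20_000):
        totals = []
        for _ in range(trials):
            # exp(gauss(0, sigma)) has median 1 but a long right tail.
            totals.append(sum(est * math.exp(random.gauss(0.0, sigma))
                              for _name, est, sigma in subtasks))
        return sorted(totals)

    totals = simulate_totals()
    print("naive sum of estimates:", sum(e for _n, e, _s in subtasks))   # 28 h
    print("median simulated total:", round(totals[len(totals) // 2], 1))
    print("85th percentile total:", round(totals[int(0.85 * len(totals))], 1))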
tmnvdb
8 days ago
If it's really important to have an accurate estimate for a large work package, you are in trouble: there is no such thing.
pieterr
10 days ago
Hofstadter's law. :-)
SiempreViernes
10 days ago
> We analyze cycle time, a widely-used metric measuring time from ticket creation to completion, using a dataset of over 55,000 observations across 216 organizations. [...] We find precise but modest associations between cycle time and factors including coding days per week, number of merged pull requests, and degree of collaboration. However, these effects are set against considerable unexplained variation both between and within individuals.
rk06
10 days ago
We also need more info on what tasks are being performed and whether they are of significance.
sethammons
10 days ago
Eh, starting the clock at ticket creation is likely less useful than starting when the ticket is moved to an in-progress state. Lots of reasons a ticket can sit in a backlog.
wry_durian
10 days ago
Cycle time is important, but there are three problems with it. First, it (like many other factors) is just a proxy variable in the total cost equation. Second, cycle time is a lagging indicator, so it gives you limited foresight into the systemic control levers at your disposal. And third, queue size plays a larger causal role in downstream economic problems with products. This is why you should always consider your queue size before your cycle time.
I didn't see these talked about much in the paper at a glance. Highly recommend Reinertsen's The Principles of Product Development Flow here instead.
tmnvdb
8 days ago
It is precisely to reduce cycle time that we control queue size. It's also not entirely true that cycle time is purely lagging: every day an item ages in your queue, you know its cycle time has increased by one day. Hence the advice to track item age to control cycle time.
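The link between the two is Little's Law: average cycle time = average WIP / average throughput, so capping the queue directly caps average cycle time. A toy illustration in Python, with made-up numbers:

    # Little's Law: average cycle time = average WIP / average throughput.
    def average_cycle_time(avg_wip_items, throughput_items_per_week):
        return avg_wip_items / throughput_items_per_week

    print(average_cycle_time(10, 5))  # 2.0 weeks
    print(average_cycle_time(25, 5))  # 5.0 weeks: same throughput, bigger queue

    # The aging point: an in-progress item that is already 12 days old has a
    # cycle time of at least 12 days, whatever happens next.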
discreteevent
10 days ago
How can something like "The Principles of Product Development Flow" be applied to software development when every item has a different size and quality from every other item?
wry_durian
10 days ago
The book has a chapter about how to optimize variability in the product development process. The key idea is that variability is neither inherently good nor bad; we just care about its economic cost. There are lots of asymmetries in the payoff functions in the software context, so the area is ripe for optimization, and that means sometimes you'll want to increase variability to increase profit. But if we're mostly concerned that software development is too variable, there are lots of ways to decrease it, like pooling demand-variable roles, cross-training employees on sequentially adjacent parts of the product lifecycle, implementing single high-capacity queues, etc.
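As a concrete illustration of the single-queue point, the textbook queueing formulas already show the effect of pooling: compare two separate single-server queues with one shared two-server queue at the same utilisation. The rates below are made up:

    from math import factorial

    arrival_per_server = 0.8   # items per hour arriving at each specialist
    service_rate = 1.0         # items per hour each specialist can handle

    # Separate queues: M/M/1 mean time in system W = 1 / (mu - lambda).
    w_separate = 1 / (service_rate - arrival_per_server)

    # Pooled queue: M/M/c with the combined arrival stream (Erlang C formula).
    def mmc_time_in_system(lam, mu, c):
        a = lam / mu                    # offered load
        rho = a / c                     # utilisation per server
        p_wait = (a**c / factorial(c)) / (
            (1 - rho) * sum(a**k / factorial(k) for k in range(c))
            + a**c / factorial(c))      # probability an arrival has to wait
        return p_wait / (c * mu - lam) + 1 / mu   # waiting time + service time

    w_pooled = mmc_time_in_system(2 * arrival_per_server, service_rate, 2)
    print(f"separate: {w_separate:.1f} h, pooled: {w_pooled:.1f} h")  # 5.0 vs ~2.8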
reval
9 days ago
You seem knowledgeable in this area. I've been slightly obsessed with this stuff ever since reading The Goal. Where can I learn more? For example, how can we use variability to create more profit? What are high-capacity queues?
wry_durian
8 days ago
One intuition for the variability argument comes from binary search, where you learn the most when you eliminate half the possibilities. You can apply the same logic to your product development or testing strategy by intending to fail more frequently. (In the testing context, this could look like testing at a higher level of integration.) This adds variability to the process but you will learn more and faster, which typically results in economic upside.
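The binary-search intuition in numbers: treat each test as a yes/no experiment with some failure probability p; the expected information per test is the binary entropy H(p), which peaks at p = 0.5, so a suite that almost always passes tells you very little per run. A quick sketch:

    import math

    def entropy_bits(p):
        # Expected information (bits) from a test that fails with probability p.
        if p in (0.0, 1.0):
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    for p in (0.05, 0.25, 0.5, 0.75, 0.95):
        print(f"failure probability {p:.2f}: {entropy_bits(p):.2f} bits per test")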
In terms of resources, Will Larson's An Elegant Puzzle hits on some of these themes and is very readable. However, he doesn't show much of his work, as it were. It's more like a series of blog posts, whereas Reinertsen's book is more like a textbook. You could also just read a queuing theory textbook and try to generalize from it (and that's where you'll read plenty about high-capacity queues, for example).
reval
8 days ago
Thank you for this response, really helpful. Generalizing from queuing theory makes a lot of sense.
tmnvdb
8 days ago
PPDF is a great book but hard to apply. I recommend looking at some Kanban literature. A classic in this space is Actionable Agile Metrics for Predictability.
duncanfwalker
9 days ago
> Comments per PR [...] served as a measure to gauge the depth of collaboration exhibited during the development and review process.
That sounds like a particularly poor measure - it might even be negatively correlated. I've worked on teams that are highly aligned on principles, style and understanding of the problem domain - they got there by deep collaboration - and have few comments on PRs. I've also seen junior devs go without support and be faced with a deluge of feedback come review time.
tangotaylor
9 days ago
> I've worked on teams that are highly aligned on principles, style and understanding of the problem domain - they got there by deep collaboration - and have few comments on PRs. I've also seen junior devs go without support and be faced with a deluge of feedback come review time.
The results actually make sense then. If you look at Fig 7: the cycle time explodes as the number of PR comments goes up. This seems like a symptom that the developers weren't aligned before the PR.
It gels with my personal experience where controversial changes in a PR gum up the works and trigger long comment threads and meetings.
duncanfwalker
9 days ago
100% - for me, the fact that this hasn't been picked up in the research undermines the credibility of the rest of the paper.
tangotaylor
9 days ago
My favorite findings:
* Fig 2b: the cycle time drops slightly around June and July. I have no idea why this is but it's amusing.
* Fig 3: more coding days have strongly diminishing returns on cycle time. E.g. from eyeballing the graph, a 3x increase in the number of days per week spent coding (from 2 days to 6 days) only improves cycle time by ~25%.
* Fig 7: more comments on a PR means vastly slower cycle time. I can personally attest to this as some controversial PRs that I've participated in triggered a chain reaction of meetings and soul searching.