Gaming Testing Metrics

I read Larry Osterman’s post “Measuring Testers by Test Metrics Doesn’t” via The Best Software Writing Vol I. It reminded me of a failed experiment from about a year ago on a death march project.

The scenario was simple enough. Our large project had been underway for 2.5 years and had been in defect resolution for at least 1.5 of them (hence the death march). We had well over 1,000 defects and were constantly falling behind on getting them resolved.

Given this, one of our senior developers suggested that using an oft-forgotten field in our Mercury TestDirector bug tracker might motivate faster bug fixing. The field was

Estimated Fix Time

and it wasn’t required, so it was generally left blank. The suggestion was to bring it up at our daily stand-ups and explain that, going forward, all developers should fill it out on their new defects with real estimates.

Not a bad idea in principle. Developers have to estimate the time for their assigned bugs, and then they’ll naturally want to fix the defects within that timeframe. Guilt and professional pride would help re-motivate the developers on this great death march. So here’s what happened:

  • Developers started reluctantly putting in estimated fix times.
  • Most of those times were in multiples of 8 hours. It was easier to estimate in days, and historically many of the defects required a lot of negotiation because they were really requirement changes or clarifications.
  • This was a death march project, so no one felt all that guilty if they didn’t manage to hit an estimated fix time.
  • Everyone stopped filling out the field again within a few weeks because it wasn’t required.

Once again, this reinforced the lesson that trying to drive behavior with metrics is likely to fail. I still like metrics, but mostly as a source of feedback. Negative reinforcement with a single statistic tends to fail or lead to gaming the system.