Wednesday, 14 November 2007

Silver bullet for code metrics

Yes, I know there is no such thing as a "one true metric" for assessing quality on a (Java) project. There's no way we can generate a score for a project or assign it a letter grade like "B". This isn't school.

But what if we wanted it to be?

It's always dangerous to create a metric to judge code on, because the interested parties will just work to maximize that metric, regardless of anything else. I was thinking about this on the bus home today (yes, I took the bus. :-( ) with the phrase "this isn't school" in my head. Well, let's freaking make it like school!

* Code coverage is worth 10 points. So, 75% coverage would give you 7.5 points.
* The percentage of passed builds is worth 10 points.
* 10 points minus 1 point for every major FindBugs bug
* etc.

It would be a great idea to work in Checkstyle and complexity analysis tools as well... I just don't know how you'd score those right now.

Add up the points, figure out the percentage, and give it a grade. Just like in school, the grade doesn't really *mean* anything on its own, but it's a good indicator of quality. The only catch is that management would have to care about those metrics.
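
Just to make the arithmetic concrete, here's a minimal sketch of the grading idea in Java. The weights, cutoffs, and metric values are all made up for illustration; in practice you'd feed in the real numbers from Cobertura, your CI history, and the FindBugs report.

```java
// A minimal sketch of the grading idea, with made-up weights and
// metric values; nothing here reads from a real tool's output.
public class ProjectGrader {

    public static void main(String[] args) {
        double points = 0;
        double maxPoints = 0;

        // Code coverage: 10 points, scaled by the coverage percentage.
        double lineCoverage = 0.75;   // e.g. pulled from a Cobertura report
        points += 10 * lineCoverage;
        maxPoints += 10;

        // Passed builds: 10 points, scaled by the pass rate.
        double buildPassRate = 0.84;  // e.g. pulled from CI history
        points += 10 * buildPassRate;
        maxPoints += 10;

        // FindBugs: 10 points minus 1 per major bug, floored at zero.
        int majorFindBugsBugs = 3;
        points += Math.max(0, 10 - majorFindBugsBugs);
        maxPoints += 10;

        double percent = 100 * points / maxPoints;
        System.out.printf("Score: %.1f / %.1f (%.0f%%) -- grade %s%n",
                points, maxPoints, percent, toLetterGrade(percent));
    }

    // Plain old report-card cutoffs.
    private static String toLetterGrade(double percent) {
        if (percent >= 80) return "A";
        if (percent >= 70) return "B";
        if (percent >= 60) return "C";
        if (percent >= 50) return "D";
        return "F";
    }
}
```

With those invented numbers it comes out to 22.9/30, about 76%: a B. Not that the letter *means* anything, but that's kind of the point.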

It's all well and good to be proactive and provide metrics that you think are valuable, but if no one else thinks they are, it's useless. It might be better to just keep this in mind in case someone asks for it. :-/

5 comments:

  1. How many minus points for such a bad post? :P
    What code coverage metrics are you using? Branch, statement, function...? Generally speaking, once you hit 100% branch coverage you know you're in good shape. Hit 100% statement, who knows? Function, even worse.
    If the build fails, that does not mean the product is bad; someone might have added a third-party lib to the project but forgotten to add it to the build script.
    In general, as long as you know where you're going and how to get there, use anything you want. I agree that the more tools you use, the easier getting there becomes. Save the silver bullet for when things go really bad, so you can shoot yourself in style.
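
    To make that concrete with a made-up example: the single call below executes every statement in apply(), so statement coverage reports 100%, but branch coverage would flag that the false path of the if was never taken.

    ```java
    // Hypothetical code under test: one if, no else.
    public class Discount {

        public static double apply(double price, boolean isMember) {
            double total = price;
            if (isMember) {
                total = price * 0.9;  // the only conditional statement
            }
            return total;
        }

        public static void main(String[] args) {
            // This single call executes every statement (the if body
            // included), so statement coverage reports 100%...
            System.out.println(apply(100.0, true));  // 90.0

            // ...but branch coverage would flag that the isMember == false
            // path was never exercised:
            // System.out.println(apply(100.0, false));  // 100.0
        }
    }
    ```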

  2. hahaha... I don't bother scoring posts because I assume that they are all bad. :-P
    For coverage, I'm talking about line coverage. Tools like EMMA can go to a finer level of detail, but I'm using what I've got (Cobertura).
    Yes, failing builds don't really *mean* anything, but like all the metrics, they give an indication. If you've got 84% of your builds passing, that shows you care a bit more than someone who has 20%.
    That's why you can't use just one or two metrics; you have to use as many as you can so that people don't spend all their effort maximizing one.

  3. 1 - Half my broken builds are because of broken SNAPSHOT Maven plug-ins... :P
    2 - Line/branch coverage is a good start. Those who test with integrity and have a high % in this category should get free cookies.
    3 - I realize FindBugs is a report that runs on our project... but I've never looked at it. We keep Checkstyle warnings to a minimum (the ones we have are unavoidable), and the other warnings Eclipse throws in (making sure Lists, Maps, etc... are type-safe) we try to keep at zero.
    You KNOW I like testing. Our project has DECENT (not stellar) coverage at around 84%, and our build usually breaks because of broken plug-ins... but I think you might offend some people if you tell them they have to repeat the last iteration because they didn't make the grade :)
    I think you're looking at this from the wrong angle. Do you make a good hockey player by shaming them into being a good hockey player? No, they have to want to play hockey.
    You need developers who are willing to test and willing to buy into whatever metrics the organization decides are important. So... I guess it really comes down to good hiring (and firing) :)
    Like most solutions to software problems (and I think I've heard you say this before), it comes down to the people. People can either create and maintain a great test suite, or they can write a bunch of tests with no "assert" statements that give them great line/branch/statement coverage but where no actual "testing" is done. People can care about a build or not care about a build... people can write good code or crappy code... You get my drift :)
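
    For example (made-up JUnit 4 code, names invented), both of these tests run the same code and earn the same coverage, but only one of them tests anything:

    ```java
    import org.junit.Test;
    import static org.junit.Assert.assertEquals;

    public class CalculatorTest {

        @Test
        public void coverageOnlyTest() {
            // Executes add(), so line coverage goes up, but asserts
            // nothing -- this "passes" even if add() is broken.
            new Calculator().add(2, 2);
        }

        @Test
        public void realTest() {
            // The same call, actually verified.
            assertEquals(4, new Calculator().add(2, 2));
        }
    }

    class Calculator {
        int add(int a, int b) {
            return a + b;
        }
    }
    ```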

  4. Ah, I don't think that I explained myself very well. This grade isn't for the developers; it's for management to understand the *approximate* quality of a project. This isn't about going back and fixing stuff, just knowing where it stands. Right now the only metrics that most PMs have are 1) number of user stories left, and 2) number of bugs opened / closed.
    To use your hockey analogy: if you were watching teams, you'd use some metrics to get an approximate idea of how good they are. Goals against, goals for, etc. You might use these to calculate which team is the "best". It might be the Leafs. Does that mean the Leafs are going to beat the Sens in the next game? No, but it gives you an idea of whether they're going to make the playoffs or not.
    This is just a thought experiment to try and grade the ungradeable. ;-)

  5. To comment again, this isn't something that I'm going to push for at work... it would probably make a better Ph.D. thesis to figure out a "good" grading scheme.
    btw, if your build keeps on breaking 'cause of snapshots, you should have words with the guy running the Maven repo. :-P
