Data from the Coverity Scan Open Source Report found a 16 percent improvement in the quality of open source projects actively participating in the scan. This is exactly the type of data that open source vendors and proponents want in their back pockets. But is it accurate? Hold your tomatoes and let me explain.
Coverity used the number of lines of code (LOC) per defect found to evaluate code quality, and improvements in code quality, in a normalized fashion. There are, however, two approaches to doing this calculation.
The first approach is to calculate the total LOC scanned across all projects and divide by the total number of defects found across all projects. The result is the average number of LOC before a defect is identified. This one-step approach is best for talking about the code quality of the overall code base, across projects. It does not, however, let us determine which projects are producing higher quality code than others, or track results on a project-by-project basis over time.
The second approach is a two-step process. First, calculate the LOC per defect on a project-by-project basis. Then, average those results to arrive at the LOC per defect across all the projects scanned. The first step of this approach is good for understanding which projects are producing higher quality code and for measuring a project’s progress in the quality arena over time. However, averaging the LOC per defect across projects in step two disregards the size of each project’s code base.
Before we move on, keep in mind that a higher LOC per defect number is preferable regardless of which approach we use. For example, a result of 999 LOC per defect is better than a result of 100 LOC per defect.
As I’m sure you’d expect, or I wouldn’t be blogging about this, the two approaches provide different results because of how the code base size of a given project is weighted. In approach one, projects with larger code bases are weighted more heavily than projects with smaller code bases. In approach two, every project counts equally regardless of its size, so small projects have far more influence on the result than their share of the code would suggest.
Here’s an example:
Project A: 2,000,000 LOC & 2,400 defects
Project B: 20,000 LOC & 20 defects
Project C: 21,000 LOC & 15 defects
Using the first approach, the LOC per defect would be: 838
[2,000,000 + 20,000 + 21,000] / [2,400 + 20 + 15] = [2,041,000] / [2,435] = 838
Using the second approach, the LOC per defect would be: 1,078
([2,000,000 / 2,400] + [20,000 / 20] + [21,000 / 15]) / 3 = ([833.33] + [1,000.00] + [1,400.00]) / 3 = 1,078
Notice the impact of the much smaller projects B and C on the overall results.
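If you want to check the arithmetic yourself, here is a minimal Python sketch of both approaches applied to the three example projects. The function names and the dictionary layout are my own illustrative choices; they are not anything Coverity publishes.

```python
# Example data from the text above: project -> (lines of code, defects found).
projects = {
    "A": (2_000_000, 2_400),
    "B": (20_000, 20),
    "C": (21_000, 15),
}


def pooled_loc_per_defect(data):
    """Approach one: total LOC divided by total defects across all projects."""
    total_loc = sum(loc for loc, _ in data.values())
    total_defects = sum(defects for _, defects in data.values())
    return total_loc / total_defects


def mean_loc_per_defect(data):
    """Approach two: average of each project's own LOC-per-defect figure."""
    ratios = [loc / defects for loc, defects in data.values()]
    return sum(ratios) / len(ratios)


pooled = pooled_loc_per_defect(projects)
averaged = mean_loc_per_defect(projects)
print(round(pooled), round(averaged))  # prints: 838 1078
```

Dropping project A from the dictionary barely moves the second number's inputs but changes the first one dramatically, which is the weighting difference in action.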
The Coverity report used approach two. When I first saw the data I instinctively used approach one. Both approaches are statistically valid. Which approach you use comes down to what you’re testing for. Coverity is interested in determining code quality at the individual project level. The project leads who have submitted code to Coverity care much more about their individual project’s results. I was more interested in determining code quality improvements across the whole set of projects scanned.
Coverity reported that open source code quality of the projects scanned had improved from LOC per defect of 3,333 in 2006 to approximately 4,000 in 2009. This led Coverity to claim a 16 percent overall improvement in the quality of open source projects actively participating in the scan.
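A side note on the arithmetic: moving from 3,333 to 4,000 LOC per defect is a 20 percent gain in LOC per defect, not 16 percent. My guess, and it is only a guess, is that the 16 percent refers to the drop in defect density (defects per 1,000 LOC) instead. The short Python sketch below shows both calculations; the variable names are mine, and 4,000 is the approximate figure quoted above.

```python
# LOC-per-defect figures quoted above; the 2009 value is approximate.
lpd_2006 = 3_333
lpd_2009 = 4_000

# Percentage gain measured in LOC per defect (higher is better).
gain_in_loc_per_defect = (lpd_2009 - lpd_2006) / lpd_2006 * 100

# The same change expressed as defect density, in defects per 1,000 LOC
# (lower is better). This reading of the 16 percent is an assumption.
density_2006 = 1_000 / lpd_2006
density_2009 = 1_000 / lpd_2009
drop_in_density = (density_2006 - density_2009) / density_2006 * 100

print(round(gain_in_loc_per_defect, 1), round(drop_in_density, 1))
```

Under that assumption the two percentages describe the same change from two directions, so neither one is wrong; they just answer slightly different questions.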
Using approach one, I found that LOC per defect worsened from 1,982 in 2008 to 1,560 in 2009. This represents a 21 percent decline in the quality of the open source code base included in the scan.
2008: 55,000,000 LOC / 27,752 total defects found = 1,982 LOC per defect
2009: 60,000,000 LOC / 38,453 total defects found = 1,560 LOC per defect
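For anyone who wants to reproduce the 21 percent figure, here is a small Python sketch using the totals above. The helper functions are hypothetical names I made up for illustration.

```python
def loc_per_defect(total_loc, total_defects):
    """Approach one: pooled LOC divided by pooled defects."""
    return total_loc / total_defects


def percent_change(old, new):
    """Relative change from old to new, as a percentage (negative = decline)."""
    return (new - old) / old * 100


lpd_2008 = loc_per_defect(55_000_000, 27_752)
lpd_2009 = loc_per_defect(60_000_000, 38_453)
change = percent_change(lpd_2008, lpd_2009)

print(round(lpd_2008), round(lpd_2009), round(change))  # prints: 1982 1560 -21
```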
Note that 2006 data was not included in the 2009 report, or else I would have calculated 2006 versus 2009.
Readers can decide which of these two figures, a 16 percent improvement or 21 percent decline, to use for their purposes. Both are valid interpretations of the data.
Follow me on twitter at: SavioRodrigues
PS: I should state: “The postings on this site are my own and don’t necessarily represent IBM’s positions, strategies or opinions.”