Are there any studies that aggregate data over a wide population of contributed code and establish a correlation between the amount of code written in a commit and the number of bugs discovered in that code? It'd be hard to do on GitHub without knowing whether a change was due to new functionality or a bug fix, but you could determine a relation between lines of code per commit and how much thrashing eventually goes on in that code.
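Just to make "thrashing" concrete, something along these lines is roughly what I have in mind (a Python sketch against a local clone; treating commits whose messages mention "fix" or "bug" as a very naive proxy for bug fixes):

    import subprocess
    from collections import defaultdict

    def lines_added_per_commit(repo):
        """Map each commit hash to the number of lines it added (via `git log --numstat`)."""
        out = subprocess.run(
            ["git", "-C", repo, "log", "--numstat", "--pretty=format:@%H"],
            capture_output=True, text=True, check=True,
        ).stdout
        added = defaultdict(int)
        current = None
        for line in out.splitlines():
            if line.startswith("@"):
                current = line[1:]                    # a new commit starts here
            elif line.strip():
                count = line.split("\t")[0]
                if count.isdigit():                   # numstat prints '-' for binary files
                    added[current] += int(count)
        return added

    def likely_fix_commits(repo):
        """Very naive proxy for bug fixes: commit messages containing 'fix' or 'bug'."""
        out = subprocess.run(
            ["git", "-C", repo, "log", "--pretty=format:%H %s"],
            capture_output=True, text=True, check=True,
        ).stdout
        return {line.split()[0] for line in out.splitlines()
                if any(word in line.lower() for word in ("fix", "bug"))}

Relating those fix commits back to the original commits that introduced the lines would still need something like git blame, and the keyword heuristic obviously misclassifies plenty of changes.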
Far, far more likely, it's proportional to the skill of the person who coded it, adjusted for the complexity of what that code is intending to do. – Izkata Jan 24 '13 at 19:02
Not all lines of code are created equal, so I don't know if such a proportion would even make sense. – FrustratedWithFormsDesigner Jan 24 '13 at 20:16
3 Answers
It just depends.
If all your program does is Console.WriteLine over and over, chances are it won't have any bugs no matter how big it gets. If you're writing the next great document database, chances are you'll have a lot of bugs.
You couldn't scrape this information from GitHub because you don't know how hard the problems are that people are trying to solve. If most projects on GitHub are the complexity of a tic-tac-toe game, again, you probably won't see a ton of bugs. Your analysis could fool you into saying "Wow, codebases can expand with relatively few bugs or none at all!"
Bugs are more related to complexity, is what I'm getting at.

The only metric that I'm familiar with that tries to relate possible defects to program size is one of Halstead's complexity measures. The figure used is B = (E^(2/3))/3000
or B = V/3000
where B is the number of delivered bugs, E is the amount of effort, and V is the program volume. If you expand these down to the counted values, they become either B = (((n1/2) * (N2/n2) * (N1 + N2) * log2(n1 + n2))^(2/3))/3000
or B = ((N1 + N2) * log2(n1 + n2))/3000
where n1 is the number of distinct operators, n2 is the number of distinct operands, N1 is the total number of operators, and N2 is the total number of operands.
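To make the arithmetic concrete, here is a minimal sketch in Python (the counts fed in are made up; in practice n1, n2, N1, and N2 would come from a lexer for the language being measured):

    import math

    def halstead_bug_estimates(n1, n2, N1, N2):
        """Halstead's two delivered-bug estimates.

        n1 = distinct operators, n2 = distinct operands,
        N1 = total operators,    N2 = total operands.
        """
        volume = (N1 + N2) * math.log2(n1 + n2)         # V
        difficulty = (n1 / 2) * (N2 / n2)               # D
        effort = difficulty * volume                    # E = D * V
        return effort ** (2 / 3) / 3000, volume / 3000  # B = E^(2/3)/3000 and B = V/3000

    # Hypothetical counts for a small function:
    print(halstead_bug_estimates(n1=12, n2=7, N1=27, N2=15))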
Your number of bugs per commit may then be related to the delta between the estimated bugs before the commit and after it.
However, the validity of Halstead's metrics has been questioned - if you search for academic studies, you'll find papers that support their validity as well as papers that find little to none. To the best of my knowledge, they are not widely accepted, nor is there overwhelming evidence that they are empirically valid.

Saying that "the validity of Halstead's metrics has been questioned" does not begin to tell the tale. Halstead's metrics have all been shown to be strongly correlated with raw SLOC (source lines of code). The implications are obvious. – John R. Strohm Jan 24 '13 at 19:06
It's proportional to the number of functions/methods not covered by unit tests.
Bugs = K + M * <functions that are not tested> - N * <Integration Test Coverage>
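As a rough illustration only (K, M, and N below are completely made-up placeholders; fitting them per project is the hard part, and the result is clamped at zero because a negative bug count is meaningless):

    def estimated_bugs(untested_functions, integration_coverage,
                       K=2.0, M=0.5, N=10.0):
        """Proposed relation, clamped at zero:
        Bugs = K + M * <functions that are not tested> - N * <integration test coverage>.
        """
        return max(0.0, K + M * untested_functions - N * integration_coverage)

    print(estimated_bugs(untested_functions=40, integration_coverage=0.3))  # 19.0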

I'm quite familiar with various metrics, but I have never seen anything like this before. Can you cite a source? – Thomas Owens Jan 24 '13 at 18:37
@ThomasOwens: Sorry, yes. Loki said it on Stack Overflow on 24/January/2013. Remember the date, it will be famous one day. – Martin York Jan 24 '13 at 18:48
I have to downvote this because it's useless. An equation with absolutely no empirical or even anecdotal backing is not a metric and doesn't answer the question. – Thomas Owens Jan 24 '13 at 18:50
@ThomasOwens: It definitely has anecdotal evidence. Code with tests, in my experience, has significantly fewer bugs than code without. The more integration tests, the fewer bugs. As such it does answer the question, so the equation is essentially correct. The real problem is defining K/M/N (where M/N may not be constants but potentially functions). PS: It's a stupid downvote. – Martin York Jan 24 '13 at 18:52
@LokiAstari I, on the other hand, have seen the opposite enough times to know that code with tests isn't necessarily less buggy. – Izkata Jan 24 '13 at 19:00
@Izkata: So you have seen code that has become more buggy because of the addition of unit tests? Tests don't guarantee fewer bugs in general. They indicate fewer bugs than the same code without tests, and they won't grow the bug count, so you will have the same number or fewer. If the code is already bug-free then all they can do is prevent bugs being introduced. – Martin York Jan 24 '13 at 19:02
@LokiAstari Semantically identical code. For example, we had a Search module that was tested up the wazoo, but still very buggy and causing problems every week. I rewrote it in about a month and cut the number of tests roughly in half (after the rewrite was done; I don't like TDD; admittedly most weren't necessary, duplicating the same type of edge case), and it's been a good 4 months or so with zero issues. – Izkata Jan 24 '13 at 19:04
@Izkata: Tests don't guarantee bug-free code; they guarantee what you test. But your old buggy search was probably a lot better because of the tests than it would have been without them. Reducing the complexity will reduce the bugs; you halved the tests because you probably quartered the complexity. But you can't say that the half of the tests you left in did not help validate what you did test. – Martin York Jan 24 '13 at 19:07
@Izkata: But back to your first statement. You have seen code that has more bugs because it has tests? – Martin York Jan 24 '13 at 19:10
@LokiAstari They didn't, because the internals changed (they were badly written tests that checked what they shouldn't have). I left the concept behind the tests in, but completely rewrote them. And no, not worse because it has tests (except perhaps in a false sense of security) - but this made-up metric does not take into account the quality of anything. Just quantity. – Izkata Jan 24 '13 at 19:11
@Izkata: Yes, agreed, it does not take quality into account. But neither does any other metric you can invent. – Martin York Jan 24 '13 at 19:20
@LokiAstari - please consider deleting this answer or significantly revising it. I respect your opinion that bug counts are related to having test cases. However, the way this has been phrased is not helpful. I think you may have a valid answer building upon your responses within these comments, but currently the answer doesn't stand up on its own. – Jan 24 '13 at 19:21
-1. Your equation implies that a sufficiently high integration test coverage will result in a negative bug count, so it can't possibly be correct. It also completely ignores test quality, code complexity, and whether the author of the code knew what they were doing, all of which strike me as major variables. – Michael Shaw Jan 25 '13 at 20:47