Is it Possible to Write Bug-Free Code? A Case for Thorough Specifications
When I started as a server developer at Vena in 2013, we didn’t have test-driven development. Sure, we designed our codebase to be testable by using dependency injection from the start, with the expectation of writing unit tests with mocks. But we needed to build out quickly, and unit tests were quickly forgotten. We had one automation engineer to write black-box tests for 3-4 developers. As one would expect, that didn’t scale. After the team doubled in size, we mandated that all developers also write automated tests. Since then, we’ve been steadily increasing our test coverage: writing black-box (API) tests, unit tests, stress tests, and now UI tests. We’ve replaced our perpetually delayed biweekly QA+release cycles with daily releases by implementing pre-merge testing (automated and manual QA) and code reviews using Github Enterprise. We put in place a staggered release process from development to canary to production environments. All this helps us release faster, with fewer bugs.
By all accounts, it’s been a success. But it’s easy to get lost in the race for higher test coverage. Testing itself is not the end goal — code correctness is! All the tests and processes above are in place to try to catch bugs in our code. But can one write code in a way that doesn’t contain bugs in the first place? Can you prove that code is correct without testing it?
When Code Coverage Isn’t Enough
Let’s say you want to write a log parser, where each line of the log is in this format "[time] message"
. You start by writing a Java function that takes a line of input and produces a Record
object.
static Record parseRecord(String input) {
int i = input.indexOf('[');
int j = input.indexOf(']');
String time = input.substring(i+1, j);
String message = input.substring(j+1);
return new Record(time, message.trim());
}
You also write a simple unit test for it. It passes, the code coverage is 100%, and you’re all good right?
@Test
void testParseRecord() {
Record r = parseRecord("[15:30] foo");
assertEquals(r.getTime(), "15:30");
assertEquals(r.getMessage(), "foo");
}
Well, the first thing a sharp QA mind would try is: What about null
input? Or input that doesn’t contain any square brackets? It throws an exception! At this point, the developer might protest, “Well this function is only meant to read lines of a specific format. They should never run on such bad input!”
There are two problems here:
- We’ve not defined what kind of inputs this function is designed to accept. It is under-specified.
- Code coverage was 100% because the one unit test written,
testParseRecord
, exercises all 5 lines of the functionparseRecord
. If we accept that the invalid input handling is a bug despite the coverage, then how do we know when we’ve sufficiently tested all cases?
In my opinion, the solution to both is better code specifications. OK, I hear your groans already. But I’m not talking about the high-level feature requirements document that you have to wait for a product manager to write up. I’m talking about the short, quick documentation that any developer can write for every low-level unit of code (i.e. function, method, class).
Problem #1 can be solved directly by specifying the kinds of inputs parseRecord()
should accept (we’ll get back to this later). What about problem #2? Well, we can use the well-specified behaviour of the methods that our code calls – like String.indexOf()
and String.substring()
– to do a code analysis.
Using Assertions to Verify Code
Disclaimer: I am not an expert in formal methods for verifying software. I will however show you one way to do it. The method is to write assertions in between each line of code. The assertions are logical statements (not code that runs), and they must always be true at that point of the code’s execution.
The first assertion we write will be at the entrance of the function, before any lines of code are executed. Thanks to Java being a strictly-typed language, we know there is only one local variable input
, and it is of type String
. We’ll now write the most obviously true statement ever:
static Record parseRecord(String input) {
// ASSERT: input is null OR input is not null
The goal of code analysis is to narrow down the state space as much as possible. Already we can see that input
being null
is a problematic state. This guides us to writing some extra bit of real code, after which we can write some more useful assertions:
if (input == null) {
throw new IllegalArgumentException("input may not be null");
}
// ASSERT: input is not null
// ASSERT: 0 <= input.length() <= Integer.MAX_VALUE
I took the liberty of asserting the value of input.length()
based on what we know about Java Strings. They can never be shorter than 0 or longer than the max value of a signed integer (231 – 1). A lot of assertions will state obvious things. Get used to that.
For the next line of code, let’s read the JavaDoc for String.indexOf(int ch):
Returns: the index of the first occurrence of the character in the character sequence represented by this object, or -1 if the character does not occur.
We know something about String indexes: they can range from 0 to the length of the string minus 1, inclusive. Combining this information with the assertion about input.length()
above, we can write the next two assertions:
int i = input.indexOf('[');
// ASSERT: -1 <= i < input.length() <= Integer.MAX_VALUE
int j = input.indexOf(']');
// ASSERT: -1 <= j < input.length() <= Integer.MAX_VALUE
So far, so good.
Working Backwards
For the next two lines, we’ll work backwards. Why? Reading the documentation, we find that the two substring
methods we use can throw IndexOutOfBoundsException
!
substring(int beginIndex, int endIndex):
Throws:
IndexOutOfBoundsException
– if thebeginIndex
is negative, orendIndex
is larger than the length of this String object, orbeginIndex
is larger thanendIndex
Throws:
IndexOutOfBoundsException
– ifbeginIndex
is negative or larger than the length of this String object.
Let’s say that we don’t want to throw IndexOutOfBoundsException
from our function. (That’s not to say exceptions can’t be thrown, but we only want to throw when necessary, and even then, only throw specific exceptions that tell the caller exactly what went wrong.)
So how do we ensure that input.substring()
does not throw that exception? We can write a requirement by substituting in our values for beginIndex
and endIndex
parameters:
// REQUIRE: 0 <= i+1 <= j <= input.length()
String time = input.substring(i+1, j);
// REQUIRE: 0 <= j+1 <= input.length()
String message = input.substring(j+1);
Let’s break that first requirement down into 3 parts:
- REQUIRE:
0 <= i+1
is SATISFIED by ASSERT-1 <= i
- REQUIRE:
i+1 <= j
is NOT SATISFIED - REQUIRE:
j <= input.length()
is SATISFIED by ASSERTj < input.length()
Similarly, breaking down the second requirement:
- REQUIRE:
0 <= j+1
is SATISFIED by ASSERT-1 <= j
- REQUIRE:
j+1 <= input.length()
is SATISFIED by ASSERTj < input.length()
The requirement i+1 <= j
is not satisfied, which means some inputs could throw IndexOutOfBoundsException
from the code input.substring(i+1, j)
. Example: when input
is "x][foo"
, then i
is 2, i+1
is 3, and j
is 1.
How to solve?
Once again, the failed requirement guides us to the solution. We need to make i+1 <= j
, which means we should rewrite the code that assigns j
.
Let’s use int indexOf(int ch, int fromIndex):
Returns:
the index of the first occurrence of the character in the character sequence represented by this object that is greater than or equal tofromIndex
, or -1 if the character does not occur.
So if we pass i+1
as fromIndex
, then we get:
int j = input.indexOf(']', i+1);
// ASSERT: i+1 <= j OR j == -1
Well, that almost works. If j
is -1, then it’s always going to be smaller than i+1
because we know -1 <= i
from our assertion before, and j = -1 <= i < i+1
, i.e. j < i+1
, which fails our requirement.
Let’s add another check.
int j = input.indexOf(']', i+1);
// ASSERT: i+1 <= j OR j == -1
if (j == -1) {
throw new InvalidFormatException("] not found");
}
// ASSERT: i+1 <= j
Now we’ve got what we need!
Putting it all together, the final annotated code (up to before the return
statement) might look like this:
static Record parseRecord(String input) {
// ASSERT: input is null OR input is not null
if (input == null) {
throw new IllegalArgumentException("input may not be null");
}
// ASSERT: input is not null
// ASSERT: 0 <= input.length() <= Integer.MAX_VALUE
int i = input.indexOf('[');
// ASSERT: -1 <= i < input.length()
int j = input.indexOf(']', i+1);
// ASSERT: j < input.length()
// ASSERT: i+1 <= j OR j == -1
if (j == -1) {
throw new InvalidFormatException("] not found");
}
// ASSERT: i+1 <= j < input.length()
String time = input.substring(i+1, j);
// ASSERT: No exception thrown by above line
// ASSERT: time is not null AND time.length() == j - (i+1)
// ASSERT: j+1 <= input.length()
String message = input.substring(j+1);
// ASSERT: No exception thrown by above line
// ASSERT: message is not null AND message.length() == input.length() - (j+1)
return new Record(time, message);
}
We’ve now modified the code and proved that it will never throw any IndexOutOfBoundsException
from the calls to substring()
. But is it correct? Astute readers will notice that we never added a check for i == -1
. This means input such as "7:00] hello"
would be accepted, even though it doesn’t match the expected format.
Why did our analysis miss this? The reason is that we only tried to prove a certain exception wouldn’t get thrown. We didn’t try to prove anything else. Actually, we didn’t know what to prove yet, because we didn’t specify what this function is supposed to do. That leads us back to problem #1 at the top of this page…
Specifying the Behaviour
During our exercise, we already started to make decisions on what exceptions the function should throw. Specifically, it should never throw IndexOutOfBoundsException
but it could throw InvalidFormatException
or IllegalArgumentException
. This should be written into the function specification.
As we start to define what the function should do in non-error cases, we find that the original function parseRecord()
did two separate things: it did some string parsing to find two substrings, and then it created a Java object out of them. It’s easier to write a spec when the function only does one thing. So let’s extract the first string parsing part into its own function findInBrackets()
.
That specification might look something like this (in Javadoc style):
/**
* Looks for the first occurrence of [.*] in the string, and returns the
* substring between the [ and ] characters.
* @param input - a string to parse
* @throws IllegalArgumentException if input is null
* @throws InvalidFormatException if the '[' character is not found or
* the ']' character is not found anywhere to the right of the
* first '['.
* @returns a substring of input, specifically the string that is
* in between the first '[' symbol and the first ']' symbol
* that follows that '[' in the input string.
*/
String findInBrackets(String input);
This is now a very specific specification! It tells us exactly which inputs it accepts, and it tells us what happens if you feed it something it doesn’t accept. It also tells us exactly what it returns. With this specification, it’s easy to write the correct code.
It’s now clear that we should add code to follow the specification “throws InvalidFormatException if the ‘[‘ character is not found,” which was missing. It looks like this:
if (i == -1) {
throw new InvalidFormatException("[ not found");
}
The spec “throws InvalidFormatException if … the ‘]’ character is not found anywhere to the right of ‘[‘” is already handled by the if (j == -1)
check.
Now we have a provably correct function, and we did it all without writing or running another test. (Fun fact: while writing this post, I never once compiled or ran the code!) Of course, you should still write tests. A nice bonus is that the spec tells us exactly what the expected result of each test should be. Here are some cases that I came up with in 30 seconds:
"abc [ foo] bar"
->" foo"
"[a[b[c[d]e]f]g]"
->"a[b[c[d"
"]x["
-> InvalidFormatException"][]"
->""
"abc (xyz)"
-> InvalidFormatException- null -> IllegalArgumentException
Addressing Questions
Writing assertions looks like a lot of work. Can’t a tool automate this for me?
Probably! Static analysis tools do exactly this. They can analyze your code to narrow down all the possible return values and alert you if you end up trying to dereference a null pointer for example. But keep in mind it has the same limitations shown above in Using Assertions to Verify Code. A tool doesn’t know what your intentions are, and it wouldn’t catch the i == -1
case either. You still need a human to design the specifications (which I think is a good thing!)
Personally, I mainly use assertions in a particularly long, complex, or critical piece of code.
As an experienced coder, I already know how to avoid those traps with indexOf and substring.
Great! It’s probably easy because you know those methods well. It’s made easier because those String library methods are well-documented and well-defined! How many times have you run into methods in your own development work that aren’t so well-defined? Here’s an example inspired by true events:
buildQuery(Map<Key, Collection<Value>> filters)
This function returns a query for data that matches the provided filters. What kind of query do you get when filters
is an empty map? Some callers might assume the query returns no data (nothing matches), while others might think it returns all data (nothing is filtered out). Without clear specifications, it’s anybody’s guess as to how the buildQuery()
method should handle an empty input. Or a null input. Or an input where a key maps to an empty set vs a null set. Pass the result of this to a method like deleteByQuery(buildQuery(filters))
and there’s a nasty bug waiting to happen.
What ends up happening is that a developer who wants to use this method has to go to great lengths to read the actual code to find out what it does, or spend time testing code that they didn’t write, or both. (So do the code reviewers.) All this wasted time could be avoided with a few short sentences in the method’s documentation…
Takeaways
- Don’t let code coverage fool you into thinking your tests are complete; you want tests that cover all possible states, not just all codepaths.
- Writing the specification first can help you write better code.
- A good specification makes writing test cases trivial.
- A good specification makes the lives of developers who use your code easier.
- Assertions can help you reason about what cases can or cannot occur, (hopefully) leading to bug-free code!
Happy coding!