Test-Driven Open Source Contributions

In my last post, I discussed implementing and testing the Bulk Insert/Save Pattern in MongoDB. At the end of that post, I alluded to an issue we encountered when I first wrote the test for the bulk insert/save. In this post, I'll describe the issue we encountered and how we solved it.

To recap:

Fongo is an in-memory java implementation of MongoDB. It intercepts calls to the standard mongo-java-driver for finds, updates, inserts, removes and other methods. The primary use is for lightweight unit testing where you don't want to spin up a mongod process.

Now, confession time: in my last post, I linked to some sample code with some unit tests. That test...doesn't actually work. If you actually run the unit test, you'll see a bunch of debug logging and then the test fails with this exception:

com.mongodb.InsertManyWriteConcernException  
    at com.mongodb.FongoBulkWriteCombiner.throwOnError(FongoBulkWriteCombiner.java:76)
    at com.mongodb.FongoDBCollection.executeBulkWriteOperation(FongoDBCollection.java:1218)
    at com.mongodb.DBCollection.executeBulkWriteOperation(DBCollection.java:2188)
    at com.mongodb.BulkWriteOperation.execute(BulkWriteOperation.java:121)
    at com.vena.PersonDAO.bulkInsertPerson(PersonDAO.java:51)
    at com.vena.PersonDAOFongoTest.saveDuplicates(PersonDAOFongoTest.java:39)

Reading that stack trace, it looks like some things are working correctly; remember that we are expecting an error from MongoDB at this point. However, we weren't expecting an InsertManyWriteConcernException! What's going on?

Unfortunately the MongoDB docs don't mention InsertManyWriteConcernException at all-- what gives? It turns out that this is actually an internal Fongo exception. This seems to suggest that Fongo is throwing an internal exception instead of the expected BulkWriteException. So, how do we verify this hypothesis?

Simple: we modify our test code to actually talk to MongoDB and confirm that our test code actually does what we think it does.

Actually Testing Against Mongo

Our unit test contains a single JUnit setup method (and corresponding clean up):

@BeforeClass
public static void setupFongo() {  
    fongo = new Fongo("fongo mock server");
    mongoClient = fongo.getMongo();
    morphia = new Morphia();
    dataStore = new DatastoreImpl(morphia, mongoClient, "mydb");
}

Very clean, and this means we can simply change the mongoClient line to read:

mongoClient = new MongoClient();  

and now our unit tests run against the mongodb instance running at the default url/port: localhost:27017

Perfect. I ran the exact same unit test, and lo and behold: It passed!

So now we know Fongo has a bug in it: Fongo isn't matching the behaviour of MongoDB, which is, y'know, what it's supposed to do. I opened an issue on the Fongo GitHub repo, but, like most issues on most GitHub projects, this one was mostly ignored.

So the natural next step was to write a patch myself and submit a pull request.

Writing a Patch for Fongo

I had never contributed to MongoDB or Fongo before this. I've only ever interacted with their external APIs, and even then, only lightly.

However, that meant that I was familiar enough with the external Fongo API that I could immediately write a unit test reproducing my issue. It looked a little like this:

@Test
public void test_bulk_update_with_dupes() {  
    // Given
    DBCollection collection = newCollection();
    collection.insert(new BasicDBObject("_id", 100).append("hi", 1));

    // When
    BulkWriteOperation bulkWriteOperation = collection.initializeUnorderedBulkOperation();
    // these are all safe
    bulkWriteOperation.insert(new BasicDBObject("_id", 101).append("hi", 1));
    bulkWriteOperation.insert(new BasicDBObject("_id", 102).append("hi", 1));
    bulkWriteOperation.insert(new BasicDBObject("_id", 103).append("hi", 1));
    bulkWriteOperation.insert(new BasicDBObject("_id", 104).append("hi", 1));
    // this one should break it
    bulkWriteOperation.insert(new BasicDBObject("_id", 100).append("hi", 1));

    // Then
    exception.expect(BulkWriteException.class);
    bulkWriteOperation.execute().isAcknowledged();
}

This is essentially the same test case I had in my application code, but more generic.

Once I had that test, I could breakpoint-debug the Fongo code, which was entirely unfamiliar to me, and use that test as an entry point into the code base to start figuring my way around.

This is pure Test Driven Development: I had a failing test case and I needed to write code to make that test case pass.

Eventually, I did that, and then an entirely different test in Fongo's test suite failed! It's like they say:

99 bugs in the code, take one down, patch it around, 127 bugs in the code

This failing test lead me to a strange old corner of the MongDB API where, well, let's just say things could be cleaner.

Your Daily WTF

Ok, so as I pointed out earlier, Mongo was throwing an exception called BulkWriteException to signify that something had gone wrong in the bulk write. Seems reasonable enough.

When I wrote my original patch, I noticed that Fongo was throwing something called InsertManyWriteConcernException, which as I mentioned above, is an internal Fongo class. It turns out that the calling methods would catch this and convert it to the appropriate exception.

My original patch simply swapped out that InsertManyWriteConcernException for a BulkWriteException, and that's when I learned that Mongo's public API defines both a BulkWriteException and a MongoBulkWriteException.

What, what?

Yeah, you read that right. BulkWriteException and MongoBulkWriteException are two different exceptions that Mongo can throw. In different scenarios. I went and looked both exceptions up in the documentation and realized that they are basically the same thing!

Both have methods defined with the following signature: List<BulkWriteError> getWriteErrors(), which are really all I care about in this case.

But wait! The plot thickens. The BulkWriteError in the list returned by MongoBulkWriteException is different from the one returned by BulkWriteException. The former is in the com.mongodb.bulk package, whereas the latter is simply contained in the com.mongodb package. In fact, the com.mongodb.bulk package actually includes duplicates of a whole host of classes in the com.mongodb package!

Waht.

Mongo Why?

Looking more closely at the documentation reveals that the com.mongodb.bulk package, along with the MongoBulkWriteException, was added in Java driver version 3.0, but that GitHub page doesn't contain any useful information about this new class hierarchy. The linked "What's New" page also doesn't have anything useful for this case, nor does the listed incompatibilities page.

I was able to find a public issue referring to the new hierarchy, where it seems like that change was made to bring the exception hierarchy more in line with the .NET driver. This sounds reasonable on paper, but leads me to wonder why the old Exception hierarchy was left alone. At the very least, I would expect them to be marked as deprecated. Additionally, since this change was on a brand new major release, that is the perfect time to introduce breaking changes such as "this method now throws a different Exception class than it used to."

The comments on that issue also point to a chain of git commits...but from what I can tell on Github, they were committed directly to master, not through a Pull Request. All of this means that I have no idea why this change was made. Another public issue refers to the old classes as a legacy API, which implies that they were left there for backwards compatibility, but the entire reason you'd roll out a new major version is for breaking changes, right?

If anyone knows why this design decision was made, feel free to leave a comment on HackerNews or Twitter and enlighten the rest of us. 😀

Dealing With the Weird API

So back to the Fongo patch: it turns out that Fongo has defined a whole host of methods called translate*ToNew which accept an object of the old type and translate it to the new type. Then, my patch simply had to include the following method:

(Please excuse the mess; I'm using the full package names to avoid any ambiguity)

public static List<com.mongodb.bulk.BulkWriteError> translateWriteErrorsToNew(final List<BulkWriteError> errors) {  
  List<com.mongodb.bulk.BulkWriteError> retVal = new ArrayList<com.mongodb.bulk.BulkWriteError>(errors.size());
  for (BulkWriteError cur : errors) {
    com.mongodb.bulk.BulkWriteError e = new com.mongodb.bulk.BulkWriteError(cur.getCode(), cur.getMessage(), new BsonDocument(), cur.getIndex());
    retVal.add(e);
  }
  return retVal;
}

and then we use that when we catch the BulkWriteError to turn it into a MongoBulkWriteError.

Wrapping Up

When I started on this bug, I knew nothing about the MongoDB and Fongo internals. Sure, I could use their public APIs to get my work done, but I knew nothing about how they worked. When I realized I'd be needing to dive into the Fongo code base, setting up a unit test enabled me to quickly determine the code path I needed to investigate. In addition, that unit test validated my work and gave me a target to aim for: when that test passed, I knew my work was done.

Well, almost! It turns out unit tests are really good at being automated and catching regressions, so when an unrelated test started failing, it meant that my "fix", while introducing my desired behavior, had broken something else. So more correctly: when all of the unit tests passed, I knew my work was done.

Comment on HackerNews or Twitter.