Using Jackson for your REST service? Don't initialize your entities

Using a Java framework like Dropwizard and an ORM like Hibernate, it's quite easy to write a REST server in a completely object-oriented way. By defining entity classes, you can use the same objects from the incoming HTTP request, through the business logic, all the way to the persistence layer. We fully embraced this approach when we started our server rewrite in 2013, and it's been in use ever since. However, we learned about some anti-patterns along the way. This article covers one of them.

The Problem

Say we have an entity class A defined like so:

@Entity
class A {  
  @Id
  Id id;

  @Column
  String name;  

  @OneToMany
  Set<B> bees = new HashSet<>();

  // getters and setters
}

This definition is convenient because you know bees is never null, so you can happily do bees.size() and bees.add(b) while avoiding null checks in your code. But it's problematic as an entity. Why?

Say that there is some record A100 with { id = 100, bees = [ B1, B2, B3 ] } persisted in the database.

Now we receive a PATCH request to update that record's name (and no other fields*):

PATCH /a/100  
{
   "name": "new name"
}

*We use PATCH for partial object updates, and the PUT method for full object updates (replacement).

In our framework, Jackson will take the JSON body and create a Java object by calling the default constructor, followed by setters. It actually uses reflection to do this, but that's equivalent to:

A a = new A();  
a.setName("new name");  

Note that it only calls setters for properties that are present in the JSON. It calls neither a.setId() nor a.setBees() because those properties are absent.
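To make the "default constructor, then setters" sequence concrete, here is a minimal sketch of a reflection-based binder that uses only the JDK. It is a rough stand-in and not Jackson's actual implementation: SketchA, TinyBinder, and bind are hypothetical names, bees are simplified to strings, and the property map stands in for the parsed JSON body.

```java
import java.lang.reflect.Method;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical entity mirroring class A (JPA annotations omitted, bees simplified to strings).
class SketchA {
  private Long id;
  private String name;
  private Set<String> bees = new HashSet<>(); // initialized at declaration, as in the definition above

  SketchA() {}

  public Long getId() { return id; }
  public void setId(Long id) { this.id = id; }
  public String getName() { return name; }
  public void setName(String name) { this.name = name; }
  public Set<String> getBees() { return bees; }
  public void setBees(Set<String> bees) { this.bees = bees; }
}

// Hypothetical stand-in for a data binder; this is an assumption, not how Jackson is written.
class TinyBinder {
  // Creates the object with its no-arg constructor, then calls one setter per present property.
  static <T> T bind(Class<T> type, Map<String, ?> presentProperties) {
    try {
      T target = type.getDeclaredConstructor().newInstance();
      for (Map.Entry<String, ?> entry : presentProperties.entrySet()) {
        String property = entry.getKey();
        String setterName =
            "set" + Character.toUpperCase(property.charAt(0)) + property.substring(1);
        for (Method method : type.getMethods()) {
          if (method.getName().equals(setterName) && method.getParameterCount() == 1) {
            method.invoke(target, entry.getValue());
          }
        }
      }
      return target;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Could not bind " + type.getName(), e);
    }
  }

  public static void main(String[] args) {
    // Only "name" is present, so only setName() runs.
    SketchA a = bind(SketchA.class, Map.of("name", "new name"));
    System.out.println(a.getName()); // prints "new name"
    System.out.println(a.getId());   // prints "null": setId() was never called
    System.out.println(a.getBees()); // prints "[]": the field initializer ran, setBees() did not
  }
}
```

Because no setter runs for id or bees, id stays null while bees keeps the empty set created by its field initializer.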

Additionally, our resource method will look something like this to handle that request and update the object in the database:

@PATCH
@Path("/a/{id}")
void updateA(@PathParam("id") Id id, A jsonA) {

  A dbA = aDAO.findById(id); // fetches object from database

  mapper.map(jsonA, dbA); // copies all non-null fields from jsonA to dbA

  aDAO.update(dbA);
}

But the field jsonA.bees is initialized as new HashSet<>() in the class definition, so it's non-null. Because it's not null, the mapper overwrites whatever was in dbA.bees with the empty jsonA.bees. When persisting it into the database, the ORM will now execute a delete statement to clear the bees collection**!

**This delete behaviour depends on the ORM. Hibernate will not delete anything if the collection is mapped as @OneToMany(mappedBy="a"), but it absolutely does if it's declared with plain @OneToMany or @ManyToMany, i.e. if it's a join table, not a join column.

You could argue that this is the intended behaviour for an update method that always overwrites all fields (e.g. PUT), but we specifically used PATCH to avoid having to specify all fields! More generally, we should be able to differentiate between when the JSON request actually contained the empty set { "bees": [] } vs. not containing "bees" at all, and the null value is the way we do that.

To do this, we should leave all entity properties uninitialized in their declarations (or at least any property that needs to support PATCH):

@Entity
class A {

  @Id
  Id id;

  @Column
  String name;  

  @OneToMany
  Set<B> bees;
}

Now, when bees isn't specified in the JSON, it will remain null, so a PATCH method would not modify it.
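The difference between the two declarations can be sketched without any framework. The example below assumes the mapper's rule is simply "copy every non-null field". Hive, NullSkippingMerge, merge, and storedRecord are hypothetical names, bees are simplified to strings, and the two request objects are built by hand to mimic what deserialization would produce in each case.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified stand-in for entity A (bees are plain strings here).
class Hive {
  String name;
  Set<String> bees;

  Hive(String name, Set<String> bees) {
    this.name = name;
    this.bees = bees;
  }
}

class NullSkippingMerge {
  // Assumed mapper rule: copy each non-null field of source onto target.
  static void merge(Hive source, Hive target) {
    if (source.name != null) target.name = source.name;
    if (source.bees != null) target.bees = source.bees;
  }

  // Simulates the persisted record A100 with three bees.
  static Hive storedRecord() {
    return new Hive("old name", new HashSet<>(Set.of("B1", "B2", "B3")));
  }

  public static void main(String[] args) {
    // Request object as it looks when bees is initialized at its declaration.
    Hive initializedRequest = new Hive("new name", new HashSet<>());
    Hive first = storedRecord();
    merge(initializedRequest, first);
    System.out.println(first.bees.size()); // prints 0: the stored bees were wiped

    // Request object as it looks when bees is left uninitialized.
    Hive uninitializedRequest = new Hive("new name", null);
    Hive second = storedRecord();
    merge(uninitializedRequest, second);
    System.out.println(second.bees.size()); // prints 3: the stored bees survive
  }
}
```

In the first case the merge cannot tell an omitted "bees" from an explicit { "bees": [] }, which is exactly the ambiguity that the null value removes in the second case.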

If this isn't convincing, consider why we don't initialize everything like so:

@Entity
class A {

  @Id
  Id id = new Id(0);

  @Column
  String name = "";

  @OneToMany
  Set<B> bees = new HashSet<>();
}

Answer: If we did this, a PATCH request would never work. A request that omits the id or name field would overwrite that property with an empty value as well!

But leaving these fields uninitialized is inconvenient because now we have to do null checks before using them. We might have to litter the code with a lot of

  if (a.getBees() == null) a.setBees(new HashSet<B>());

which isn't ideal.

The Solution

What we can do is create non-default constructors that do initialize the collections. These constructors won't be used by Jackson (or Hibernate, or Morphia, etc.). Additionally, we deprecate the default constructors so developers don't use them by accident.

The class then becomes:

@Entity
class A {

  @Id
  Id id;

  @Column
  String name;  

  @OneToMany
  Set<B> bees;

  /**
   * Default constructor for reflection libraries only
   * Don't use this to create new objects!
   */
  @Deprecated
  A() {
  }

  /**
   * Use this to create new objects.
   */
  A(String name) {
    this.id = new Id();
    this.name = name;
    this.bees = new HashSet<>();
  }

  // getters and setters
}

So if the developer's intention is to create a truly new object A that doesn't exist in the database yet, then they call new A("some name") and we're guaranteed that the collections (and other fields) have been initialized and are non-null!

Now for any entity, we have 3 kinds of objects:

  1. Objects created by application code - properties are initialized in a non-default constructor.
  2. Objects created by Hibernate (from database) - collections are never null - they are always Hibernate persistent collections. Other types may be null depending on the column definition.
  3. Objects created by Jackson (deserialized from JSON request) - any property can be null.

This is manageable because it limits the null checks to just objects coming from a JSON request, which by definition is an object from the outside world that you shouldn't trust. We always validate those objects anyway.

Caveats

  • All fields that support PATCH need to be nullable. So primitives boolean, int, and long need to be replaced by Boolean, Integer, and Long. Like with collections, you can avoid null checks by initializing them in your non-default constructor(s). You can supplement this with setters that accept non-null values only.

  • The PATCH method I've described works by using the null value as a sentinel for "no value". If null is your intended value, then the PATCH method won't be able to set it. With plain fields and setters, the deserialized object can't tell you whether the JSON body omitted the field or set it to null, i.e. both {} and {"field": null} get deserialized into identical Java objects. In practice, we've rarely needed this functionality, but if needed, you can always use PUT and/or define a new API at /a/{id}/field just for that purpose.

  • Default constructors for entities should not be used in your application code. That means you have to define another constructor with at least one parameter. Sometimes there is no sensible parameter, and you might have to just create a dummy one. Either that, or use factories.
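As a rough illustration of the first and last caveats, the sketch below uses boxed types, a setter that rejects null, and a static factory in place of a constructor with a dummy parameter. Colony, create, and the field names are hypothetical and the JPA annotations are omitted, so treat it as one possible shape rather than the code we actually run.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical entity used only for illustration (JPA annotations omitted).
class Colony {
  private Boolean active;    // boxed, so an omitted JSON property can stay null
  private Integer capacity;  // boxed for the same reason
  private Set<String> bees;

  /** Default constructor for reflection libraries only. */
  @Deprecated
  Colony() {}

  /** Factory for application code; avoids inventing a dummy constructor parameter. */
  static Colony create() {
    Colony colony = new Colony();
    colony.active = Boolean.FALSE;
    colony.capacity = 0;
    colony.bees = new HashSet<>();
    return colony;
  }

  public Boolean getActive() { return active; }
  public Integer getCapacity() { return capacity; }
  public Set<String> getBees() { return bees; }

  // Setter that accepts non-null values only, so application code cannot reintroduce null.
  public void setCapacity(Integer capacity) {
    this.capacity = Objects.requireNonNull(capacity, "capacity must not be null");
  }

  public static void main(String[] args) {
    Colony colony = Colony.create();
    colony.setCapacity(5);
    System.out.println(colony.getCapacity()); // prints 5
    System.out.println(colony.getBees());     // prints "[]": initialized by the factory
  }
}
```

One trade-off to keep in mind: if the deserializer also goes through these setters, a request that sends an explicit null for the property will fail on the requireNonNull check instead of being silently ignored.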

Following this pattern, we can support PATCH methods in our REST server with minimal fuss and without needing to add null checks everywhere. Tony Hoare's billion dollar mistake is overstated. ;)

