A short while ago, we were building a distributed backend service where multiple instances of the service would be sending requests to each other to coordinate work splitting and availability. We were concerned about a number of potential roadblocks, including overwhelming the receivers and congesting our network traffic. We also realized that, in this case, immediate consistency wasn't a top priority-- we could tolerate a few seconds of delay, and as long as we were eventually consistent, the system would behave as we expected.
We took inspiration for this problem from debounce mechanisms in web development. On the web, this technique is used for features like autocomplete or suggestions in search bars. Generally you don't actually want these components to make requests on every keystroke, especially if the user knows what they're typing and types a lot, so the solution is to debounce the requests.
What is Debouncing?
The name "debounce" actually comes from hardware designers who needed an accurate way to tell if someone pressed a button, because the physical button will have a little bounce to it, so it looks like they pushed the button, released it, and pushed it again really quickly...but I digress.
Fundamentally, we're rate-limiting the number of requests that are being made. But there's a lot of variations to what "rate-limiting" can mean: does the first request get made right away? Do the subsequent requests get queued, or simply thrown away? If there is a request currently queued, and another one comes in, does it get dropped? Also queued? Does the timeout get extended?
If you search for the meaning "debouncing" you'll typically find any combination of these.
For our purposes, we settled on the following behavior:
When we make a request:
- If it hasn't been made within the past
delayMillismilliseconds (and there isn't a call queued), then the request gets made immediately
- If there IS a recent call, then we check to see if one is queued
- If there is a call queued, we drop the current call
- If there is not, we queue up the current call to be called after
If you imagine requests to be coming in on a timeline, where our delay is 10 ms, it would look a little like this:
This has the nice effect that the first request always gets made immediately, and we still remain within our rate limits.
Requests in Java
In Java, everything is an object, even function calls. We knew that the method we were calling to send our API request wasn't going to accept any parameters, which made things a little easier; it meant we could use the
Runnable interface was originally designed for multithreaded computations: you'd create an object with a
run() method on it, and you could give that object to a separate thread and have the
run() method get called in that second thread. For our purposes, though, it's an effective interface to represent a function call that accepts no arguments, does some stuff (ie. has a side effect) and returns no values.
So we're going to build a
DebouncedRunnable. It accepts a
Runnable and also implements the
Runnable interface itself, meaning it can be a drop-in replacement for the
Runnable we give it.
Enough theory, let's show some code. The full class can be found at this gist.
It might look like a lot, but there's that big comment in the middle taking up a ton of room, haha. Let's go chunk by chunk through it.
At the top, we've got some instance and class fields:
The top line defines a logger, so we can control log levels and redirect logging on a class-by-class basis.
The next few lines just define some constants. We've got a
ScheduledExecutorService, which is comes from the
java.util.concurrent package, and lets us schedule a specific
Runnable to be run in a certain amount of time. Then we've got our
operation to run, a
name of the operation so we can keep track of it for logging, and the
delayMillis, which controls the amount of time we need to wait between calls. These are
final because, once we get them in the constructor, they'll never change.
Then we've got a couple other fields that are mutable. We'll use them for tracking our scheduling.
We've also got the constructor, which sets those fields, and has a nice juicy JavaDoc describing the behavior of this code. Always comment your code kids!
The actual meat of this class is in the
This method is
synchronized, meaning that only one thread can call it at a time. This helps us avoid race conditions if multiple threads call
run() for the very first time at the same time.
The behaviour is what we described above: if we've got a call queued up, we ignore the current invocation and do nothing. Otherwise, we check if we should run it now or schedule it to be run later (on a background thread), and then we do exactly that, keeping track of
Another thing to note is the funky syntax on the call to the scheduler. If you haven't seen Java 8 before, the
this::scheduledRun syntax will look odd to you. It basically means "the current object's
scheduledRun method". Let's take a look at that method:
This is very similar to the case above, where we just called
run() directly, but log message differs slightly and we're clearing the
isQueued flag as well.
The remaining methods in the class are all small:
The first is our check to see if we're allowed to run at the current time, and the other 2 are simple wrappers around our scheduler and the time interface so that we can properly unit test this class.
Before we talk about testing, though, let's talk about how we actually used this
Our calling code had an operation called
volunteer(). The details here aren't important-- it boiled down to making an API request, and like I mentioned at the beginning, we were ok with it not firing off immediately 100% of the time and instead being rate limited.
Previously, this class would have looked like:
We had to inflate this service class a little, and change it look a little like this:
So, instead of just sticking a method on a class, we need to take that method, wrap it in
DebouncedRunnable, and then keep a reference to it so we can call it later. We also hide the actual API call in a deprecated method with an intentionally ugly name
volunteer_yesIKnowWhatImDoing so that it jumps out in code reviews and is immediately obvious that it's not just any other method. Then our
volunteer() method calls
run() on the runnable, which does its debounced thing: calls
volunteer_yesIKnowWhatImDoing directly, schedules a call to it for a short time in the future, or ignores us entirely.
Testing Time and Space
So, now that we've seen what our calling code looks like, we need to answer one more question: how do we test this? Code that interfaces with time is notoriously difficult to test, since unit tests aren't guaranteed to take the exact same amount of time from run to run. On top of that, code that interfaces with separate threads is typically harder to test as well.
And here we've got both.
To start, let's take a look at the instance variables at the top.
We're using Java's
AtomicInteger as a simple container object. It represents an integer and defines an
incrementAndGet() method on it that will increment the value of the integer, so the value of the integer will correspond to exactly the number of times this method was called.
We also have our
DebouncedRunnable, around the
incrementAndGet() method. It's the normal
DebouncedRunnable, but we're also going to use
Mockito to spy on it, meaning we can intercept method calls as we wish.
And lastly, a
List<Runnable> to store all of our queued operations.
Our class setup (annotated with
@Before in JUnit) is also interesting. Keen-eyed readers would have noticed that I defined two single-line wrapper functions above that I said would help us unit test, and this is how:
doAnswer() is a method from
Mockito that takes a lambda to run and a
Spy. Basically, since we're spying on our
DebouncedRunnable, this is how we specify that we want to intercept the
schedule() method on. Instead of calling the real
schedule() method (which is a one-liner that sticks things on a separate thread), we're tracking the calls and storing them in our array.
There's one last helper method that makes our tests a lot more readable:
We're using the same
doReturn() as above, but it's operating on the
getCurrentTimeMillis() method. Ordinarily, this goes straight to the system, but we're intercepting it and manually specifying our return value, effectively lettings us freeze time and manipulate time as we wish.
Now that we've got all of that, we can write tests like this:
or like this, where we verify that even if there is a scheduling delay and enough time has passed between our last run and now, the fact that there is a call queued is enough for us to drop a call:
The full JUnit test, with a few more tests cases I didn't go over, is available in a gist here.
Writing descriptive unit tests like this is really satisfying. Once you've found the right level of abstraction, and can describe the required setup, the actions you want to take, and the outcomes you're expecting, it becomes much easier to write tests and build confidence that your code is doing what you want it to be doing.