A Day in the Life of a Vena Engineer

Hi, my name is Mustafa Haddara, and I'm a Software Developer at Vena. We build a multidimensional database (aka an OLAP Cube; the Wikipedia article is pretty good at elaborating on the subject). I work mainly on the compiler and runtime for Vena Calcs, our custom programming language for defining relationships between members in a dimensional hierarchy in the Cube.

9AM: Empty Office

Most of our developers (and other employees, really) trickle in around 10, which means that when I normally walk in around 9, the office is mostly empty. I boot up my laptop and grab a cup of coffee from the espresso machine in the kitchen.

At any given time I might have some open issues to fix or a new feature to add. Today I'm working on adding more validation to our Calcs.

As I'm working, two new developers who have just started are shown in. They start getting their development environment set up. Our stack includes Java 8, MySQL, MongoDB, and a variety of other smaller services. In the past, we'd have to install all of those individually and tweak configuration options and compare version numbers, but recently our Infrastructure team has been working on packaging most of it into a Vagrant box. That Vagrant setup is mostly done, so the new hires get to beta test it.

10AM: Undocumented Characters

As I'm working on the extra validation, a question comes up in our #Vena-Calcs channel on Slack about the allowed characters in variable names and identifiers: someone wants to know if $ characters are allowed in member names. Usually, my standard response is to link to our docs, but in this case the relevant section is a little unclear.

I'm pretty sure that our validation will fail on a $ character, but I quickly jump into my already-running local instance of our server code to validate that. I rename one of the existing members in a dimension hierarchy to contain a $ in that name, and then try to save the Calc. As I expected, the save fails due to the $. I respond to the question on Slack, and quickly update our documentation to be more clear.

As I do so, I notice that the documentation claims that we support both < and > characters, but in reality, the > character fails our validation. That's odd.

11AM: Coffee and the Angled Bracket Tax

I pull up our grammar file to confirm that we disallow > characters, and sure enough, we missed that one. It's an odd character to disallow, especially since < characters are allowed. In this case, the fix is simple: add the >, regenerate the parser, and we're home free. Now it's time to test this out.
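The real rule lives in our grammar file, but the shape of the one-character fix can be sketched with a plain regex (the patterns and class names here are hypothetical, not our actual grammar):

```java
import java.util.regex.Pattern;

public class MemberNameValidator {
    // Hypothetical simplification of the grammar rule: the original
    // character class allowed '<' but accidentally omitted '>'.
    static final Pattern BEFORE_FIX = Pattern.compile("[\\w<]+");

    // The fix is one character: add '>' to the allowed set.
    static final Pattern AFTER_FIX = Pattern.compile("[\\w<>]+");

    static boolean isValid(Pattern rule, String name) {
        return rule.matcher(name).matches();
    }
}
```

Before the fix, a name like `Jan>abc` fails validation while `Jan<abc` passes; after the fix, both pass.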

I point my browser to localhost and log in to my local instance with the fix. To test this, I'm going to need a member in a dimension with a > in its name. To begin, though, I rename a member to contain a <, to validate that everything works as I expect; the < character is a "known good" character.

I find the Jan member in my Period dimension and rename it to Jan<abc.

attempted-rename

To my surprise, all I see is:

failed-rename

And the output from the server shows that it did actually receive a rename request. I open up the MySQL database, and sure enough, the member name did get changed.

That's...odd. Must be a front end bug. Let's look at the front end code.

12PM: Down the JavaScript rabbit hole

Looking at the front end code, it's not immediately obvious where the bug is, but eventually I realize that the table widget we're using interprets member names as HTML. Sure enough:

html result

That clears it up. So we escape the characters in the member name and, as expected:

it works

It works. So I push that change to GitHub and open a pull request.
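The actual fix lives in our JavaScript front end, but the escaping logic itself is simple enough to sketch in Java (the class and method names are hypothetical): replace the HTML-significant characters before handing the name to the widget, being careful to escape & first so we don't double-escape.

```java
public class HtmlEscaper {
    // Minimal sketch of HTML escaping. Order matters: '&' must be
    // replaced first, or the '&' in "&lt;" would itself get escaped.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
    }
}
```

With this in place, a member named `Jan<abc` renders as the literal text `Jan<abc` instead of being swallowed as an unclosed HTML tag.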

All of this started with the > fix. Testing it by hand on our front end shows that it does work correctly, fortunately. I write an automated test to prevent regressions, commit it alongside my server-side change, push that to GitHub, and open a pull request. We have an automated build server (Jenkins) that watches our server repository and runs any code submitted in pull requests through a suite of automated tests.

1PM: Lunch

Once my tests are running, I go out for lunch with the two new hires. We grab some food from the food court in the business park across the street and we talk about our past experiences and what brought us to Vena.

2PM: Product Demo

Every couple of weeks we have a demo where anyone who has been working on an exciting new feature can show it off to the rest of the team. I don't have anything to show off this week, but a few of the others do.

All of the developers and a few of the Customer Success people and some of the Consultants gather around a projector. First, one of the developers on the Infrastructure team shows off an internal tool we can use to check on the status of all of our AWS instances: what service they're running, what kind of instance they are, etc.

Then we get a demo of an upcoming feature getting added to our product. The feature looks great and everyone is excited about it. This is the time when the people working on it are able to get feedback from the rest of the development team, and as always, there are some discussions about different aspects of the feature.

3PM: Customer Issues

After the demo, a Consultant approaches me with a Calcs problem. He made a change to reduce the scope of his calculations (and thereby reduce the amount of data he was processing), but instead of completing in 8 minutes as before, his calcs ran out of memory and crashed.

He shows me the Calc and everything looks alright. I can't immediately tell why one version ran out of memory while the version that does more processing completed in 8 minutes, so my next step is to get a backup of the databases and restore them to my local instance so I can analyze the execution more closely.

4PM: Out of Memory

After restoring the customer database, I take a look at the difference between the consultant's two calculation runs, and right away I notice a problem. The first run only detected 35 thousand intersections (source data points) and generated 1.3 million outputs, but the later run found more than 75 thousand source data points and generated more than 3 million outputs.

That seems odd, because the consultant said that the only change he made was to reduce the amount of data being processed. The fact that the reduction suddenly prompted an increase in the amount of data being pulled in seems suspect.

Next, I compile the consultant's calculations, pop open the debugger, and check that the compiler generated the correct compiled form (which it does). Interestingly, the compiled form has the correct bindings to the data that it should be pulling out of the database.

So there were supposed to be 75K values from the beginning. That's good to know.

Now I'm thinking that perhaps the initial run was actually the one that hit some bug, and it ran much quicker than the correct version. That's possible, I suppose, but it makes for particularly bad news to give back to the user.

Either way, to be sure, I have to run their calcs and figure out why they went out of memory the second time. So I hit run and sit back, watching the memory graph on my profiler go up and up and up.

memory-usage

Pop quiz: what's the difference between a memory leak and big data processing? Answer: in big data processing, the huge memory usage is planned.

In our case, because any calculated intersection could potentially trigger another calculation, we keep the results cached instead of going back and forth to Mongo to read out results we just calculated. And the longer we spend processing a set of Calcs, well, the more results we expect to have and therefore the more memory we expect to consume.

However, that doesn't mean we require the cache— we could always drop the cache and fall back to reading from the database, at the expense of slower execution. That's been a feature we've wanted to implement for a while but never really got around to it, and it looks like it just became a higher priority issue, so I guess I'll implement that now.
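A minimal sketch of what that fallback could look like (all names here are hypothetical, and the database read is a stand-in for a real MongoDB query): cache results only up to a cap, and let anything beyond the cap fall through to the database on lookup.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a bounded result cache with a database fallback:
// fast in-memory hits while we're under the cap, slower but safe database
// reads once we'd otherwise run out of memory.
public class ResultCache {
    private final Map<String, Double> cache = new HashMap<>();
    private final int maxEntries;

    public ResultCache(int maxEntries) {
        this.maxEntries = maxEntries;
    }

    // Only cache while under the cap; past it, the value lives only in
    // the database (which is assumed to already hold every result).
    public void put(String key, double value) {
        if (cache.size() < maxEntries) {
            cache.put(key, value);
        }
    }

    public double get(String key) {
        Double cached = cache.get(key);
        return cached != null ? cached : readFromDatabase(key);
    }

    // Stand-in for the MongoDB read; returns NaN here for illustration.
    protected double readFromDatabase(String key) {
        return Double.NaN;
    }
}
```

The trade-off is exactly the one described above: dropping entries from the cache costs extra round-trips to the database, but it bounds memory usage.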

5PM: Sets, Maps, and Space

We store our intersection values in MongoDB, and we were passing around these large caches of results to avoid going back to Mongo. The type of these caches?

Map<DenseIntersection, Intersection> resultsCache;  

So for every Entry in this cache, we need to have a DenseIntersection and an Intersection object. Let's take a look at these objects, starting with the Intersection object.

I'm not going to show you the full class, or even all of the properties that an Intersection object would have. It contains a great many objects, including a Map<Integer, Member> to hold that intersection's "address", multiple String and Id properties, a Set of other objects, and so on. All told, an Intersection object contains almost 40 other object references— meaning we're looking at 480 bytes of pure object headers! The sad part in all of this is that the Calc engine uses almost none of the attributes: we use the address and the value, and we don't care at all about the other properties.

Guess what the DenseIntersection object looks like?

public class DenseIntersection {
    Long[] members;  // the intersection's "address" in the cube
    String value;    // the calculated value at that address

    ...
}

Two object references in total. That's it. That's all we care about. We don't care about anything contained in the Intersection object that is not contained within the DenseIntersection object.

So why keep the Intersection objects around?

Good question. Let's dump them. So, all over our Calc Engine, where we would pass a Map<DenseIntersection, Intersection>, we just use a Set<DenseIntersection> instead.

Let's do the math here: 480 bytes for the headers of the properties on an Intersection plus 24 bytes for the headers of the properties on a DenseIntersection object means we're looking at roughly 500 bytes, purely for headers, for a single entry in that Map. We were allocating more than half a KB per data point!
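One subtlety of the swap: a HashSet only deduplicates correctly if DenseIntersection defines equals and hashCode over its address. A sketch of what that could look like, reusing the field names from the class above (the constructor and methods are assumptions, not the real implementation):

```java
import java.util.Arrays;

// Sketch: identity is the intersection's address only, so two results at
// the same coordinates collapse into one Set entry.
class DenseIntersection {
    final Long[] members;  // the intersection's "address" in the cube
    final String value;    // the calculated value at that address

    DenseIntersection(Long[] members, String value) {
        this.members = members;
        this.value = value;
    }

    @Override public boolean equals(Object o) {
        return o instanceof DenseIntersection
            && Arrays.equals(members, ((DenseIntersection) o).members);
    }

    @Override public int hashCode() {
        return Arrays.hashCode(members);
    }
}
```

Note the use of Arrays.equals and Arrays.hashCode: plain array equality in Java is reference equality, which would make every entry look unique.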

Switching to a Set<DenseIntersection> saves us a ton of memory (the graphs auto-scale; compare the values on the y-axis to the graph above):

less memory

We went from a steady increase in memory to far more constant memory usage. Of course, our memory usage still rises, but it rises much more slowly. This is a big win, and I haven't even implemented the fallback where we drop the cache yet. That can wait until tomorrow.

6PM: Separation of Concerns

We do a weekly hacknight in the office where we work on personal projects -- no work allowed. It's totally optional, and we generally don't get a huge turnout, but I still enjoy going.

I plan on going tonight, but I like setting a boundary between work and personal projects, and the hacknight doesn't start until 7pm. I've made some good progress on the memory optimizations, so I commit my code for the night, push it to GitHub for our automated tests to run, and then I pack up my computer and head down to the coffee shop across the street for some Netflix.

7PM: Hacknight

I'm back in the office now -- most people have gone home for the night, and only a few developers remain. Those of us staying for the hacknight get together and decide what we want on our pizza, then we order online. A few weeks ago, for his hacknight project, one of our developers wrote a small extension for Google Chrome that scrapes Domino's Pizza's online order status tracker and pumps updates into our Slack channel:

slack integration

The pizza arrives, we put up the week's Last Week Tonight on a projector, and when it's over we dive into our projects. This week, one of the developers didn't really have a project, so he ended up looking at some questions from Google's Code Jam and passing one around. We all thought it was a decent challenge, and on the spur of the moment we decided to race to complete it (sadly, I didn't win).

On other nights, we do video game nights: we book out a conference room and get Super Smash Bros or Mario Kart or Halo up on a projector.

Conclusion

This isn't representative of every engineer's day at Vena; I jumped around from back-end validation to front-end UI fixing to writing documentation to worrying about memory usage (in Java! Oh the irony!). We do all of these things, and more— I didn't do any infrastructure coding, I didn't do much in the way of QA, nor have I ever touched the C# codebase for our Excel Addin. But if any of these things excite you, if you're a talented Java or C# or JavaScript developer, well, we're hiring, and we'd love to hear from you!

Discuss on Hacker News