The Stardog kernel API is morphing into a system based asynchronous, reactive streams; in this post we discuss motivations and design goals.
Background
If you’ve programmed against Stardog, you’ve had to use at least a bit of the Stardog Native API for the RDF Language (SNARL API). Probably the most oft-used bit is the Connection
class. Patterned after JDBC connections, Connection
is the means of interacting with a Stardog database within an application. It’s very stable: only 28 non-javadoc changes to the interface since it was introduced before Stardog 1.0; none in the last 18 months.
But SNARL is showing its age. The pleasant world of lambdas, try-with-resources, or just using String
in a switch statement is now the actual world. We’ve been thinking about where Stardog is going in 2017; you might have read about it recently as we recapped an exciting 2016. We unify all the data: Structured, unstructured, and everything in between. As we reconsider our design, and plan for future growth, I thought it was time to take a new look at SNARL.
Why?
Let’s set aside the issues of changing a public API first. It’s not something we decided easily; we take compatibility seriously. But Stardog is undergoing fundamental changes, and some of those changes will be directly reflected in API like SNARL.
At the very least, there’s a lot to be gained from leveraging Stream
and lambda. We weakly embraced this the last time we changed the API. But a lot has changed in the Stardog world too. When the Connection
class and the rest of the SNARL API was originally created, we were months from our first public release.
SNARL was created very early on, before stuff like disk-based databases, the reasoner, CLI, security, even transactions, existed. So there it was, the public face of Stardog for years to come, built when the system was not much more than a query engine and memory-based indexes that were simple, sorted arrays. And we’ve hardly changed it since.
Going Reactive
I first heard about reactive programming not from the manifesto but from some binge watching/listening of Erik Meijer videos. If you haven’t watched any of Erik Meijer’s talks, you should. You’ll learn something, and you’ll definitely laugh. This talk about reactive a few years ago was my introduction.
I’ve seen it referred to as the “Observer pattern done right”, which is fair. It is basically Observer Pattern. The primary difference is that it adds error and completion semantics. You can also just think about it as Iterable
. The only difference is it’s push vs pull, and instead of the barebones Iterable
API, you get Stream
plus a whole lot more.
But for me, it was easier to just think about it as a way to work with streams of data. Consider a click stream from a mouse. These streams are a simple sequence of events: Value
, Error
, and Complete
. Error
and Complete
are actually mutually exclusive. Things worked, or they didn’t, but not both. It’s not rocket science. You can get zero or more Value
s before the termination event (i.e. Error
or Complete
). This is a great explanation of reactive principles in terms of a click stream.
Basic usage of a reactive API would look very similar to the Java Stream API:
getResults()
.skip(10)
.take(5)
.map(Object::toString)
.subscribe(System.err::println)
Apply an offset and limit, turn the Value
s into String
s and print them out; subscribe
in this case is akin to forEach
. Add scheduling, and backpressure, and it’s easy to see how this approach can provide the building blocks for a powerful and flexible API.
Here are a few other links that I found useful when trying to grok this stuff:
- Functional Reactive Programming with RxJava
- Reactive Streams 1.0.0 interview
- Reactive Programming at scale
- Comparison of Reactive-Streams implementations
Why Reactive?
This is really two questions:
- Why RxJava 2 instead of
Future
?, - Why is reactive a good fit for Stardog?
Why not Future
?
One downside to going with the reactive pattern is that it’s not as widely known as, say, Future
. That won’t help the learning curve. And we have always tried to maximize developer happiness and productivity because after all, we’re developers too, and we look after our peeps!
Java 8 improves Future
. Before we switched from Java 6 to Java 8, I would have told you that Future
is not easily composable and that doing error handling is cumbersome. The changes introduced in Java 8 go a long way to improving this. Future
also gives us some of the lazy/asynchronous joy we can get out of reactive, especially if we’re just waiting on a single result. For example, did dropping the database work or not? Future<Void>
is perfect.
However, it’s easy to kill some of that joy by calling Future#get
too early. If you do that, you’re stuck waiting. You can use callbacks to help; Guava has some nice support for this, but callbacks can get out of control quickly.
There’s also the problem that Future
is single-valued. So do you use Future<Collection<T>>
or Collection<Future<V>>
? Probably the latter, but if the first one in the collection we call Future#get
on isn’t done, we’re blocked from getting results from anything else in the collection.
Scheduling is pretty straight-forward, but there’s no notion of backpressure: you’ve got to roll your own. And what about enhancing the Future
, like attaching a progress listener to the asynchronous task? You’ll need to bake that into anything you want to keep tabs on.
My first attempts to sketch out some ideas based on Future
just felt cumbersome. For what I had in mind, a reactive design just felt more intuitive and natural to use.
Why use it for SNARL?
Distributed systems are hard. One of the hardest things to internalize, and then design for is failure. Instead of running one node, you run three so there’s no single point of failure, but now there are a million more things that can go wrong. Fault tolerance, including finding ways to be partially available, must be a fundamental part of the system. And you need to keep a close eye on things, so when failures do occur, and they will, you can manually intervene if need be; replace failing nodes, add more capacity, etc.
This is a big change from a single node system. We’ve been building in, around, and on top of our existing API as the cluster has matured and as we’ve begun the shift from a vertically scaling system to a horizontally scalable one. One thing I felt was clear was that a lot of these concerns needed to be built into the system, rather than on top of it.
We’ve been using Apache Curator for the cluster, which is something that grew out of the Netflix Open Source treasure trove. So I knew about Hystrix, and more generally the Circuit Breaker pattern.
According to the Hystrix wiki, Hystrix is designed to do the following:
- “Give protection from and control over latency and failure from dependencies accessed (typically over the network) via third-party client libraries.”
- “Stop cascading failures in a complex distributed system.”
- “Fail fast and rapidly recover.”
- “Fallback and gracefully degrade when possible.”
- “Enable near real-time monitoring, alerting, and operational control.”
Do we want any of that in Stardog? Check, check, and check. That’s a good place to start. It’s even got steps toward monitoring. This can be part of the glue between the different parts of Stardog, but it’s almost an implementation detail rather than something that’s part of the API. And it plays very nicely with RxJava. So now a couple different ideas all seem to be coming together to provide the building blocks for the next generation of the SNARL API.
The switch in design isn’t without drawbacks. I’ve already touched on the familiarity of Future
versus the relative unfamiliarity of reactive APIs. Other areas of concern are that extending Flowable
is cumbersome, but doable, while Observable
is not. The distinction between hot and cold observables is rarely obvious, you really have to pick one style and stick with it, because let’s face it developers aren’t going to read the @implNote
in the docs to find out which is which until after there’s a bug. And debugging can be an exercise in interpreting more convoluted stack traces. Finally, understanding what errors an exception “throws” is less clear. There’s no throws
declaration to rely on as exceptions are reported to subscribers directly. You have to read the javadocs to know what kinds of exceptions can be thrown.
But overall, it’s a very positive step forward for Stardog and something I’m personally very excited about.
Coming Up Next
This is the first in a two part series discussing the upcoming changes to the SNARL API. In the final installment we’ll be taking a closer look at the API itself.
Download Stardog today to start your free 30-day evaluation.
Mike Grove
17 January 2017