Simple Binary Encoding (mechanical-sympathy.blogspot.com)
48 points by dekayed on May 5, 2014 | 15 comments


Curious to see how this stacks up against other generic binary formats. Off the top of my head, Protocol Buffers can be made to have this kind of wire format (root fields in front, variable fields in the back), but I don't know if partial or streaming decoding is possible, and it definitely requires some user effort to set it up this way.


> ~25X greater throughput than Google Protocol Buffers (GPB) with very low and predictable latency

Seems like it might take more than structure tweaking to overcome the quoted delta. I'd still like to see a general comparison too though, especially to Cap'n Proto on the one side and HDF5 on the other.


Looking at their benchmarks -- specifically, in C++ -- I immediately see a coding error that probably makes the protobufs version run half as fast as it should. Namely, they make multiple redundant calls to message.ByteSize(). ByteSize() is not free -- it has to walk the whole message to compute the size, doing many of the same expensive branches that would be needed when actually serializing. Had they profiled their benchmark, they would have noticed this.
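To make the cost concrete, here is a toy stand-in for the pattern being criticized (hypothetical sketch with an instrumented `ToyMessage` class, not the actual benchmark source or the real protobuf generated code): `ByteSize()` walks every field each time it is called, so calling it more than once per serialization is pure wasted work.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a protobuf message. Like the real ByteSize(), this
// walks all fields on every call; we count calls to make the waste visible.
struct ToyMessage {
    std::vector<std::string> fields;
    mutable int byte_size_calls = 0;

    int ByteSize() const {
        ++byte_size_calls;  // track how often we pay for the full walk
        int total = 0;
        for (const auto& f : fields) total += static_cast<int>(f.size());
        return total;
    }
};

// The pattern the comment criticizes: ByteSize() evaluated twice per message.
int serialize_redundant(const ToyMessage& msg, std::vector<char>& out) {
    out.resize(msg.ByteSize());  // first walk over all fields
    return msg.ByteSize();       // second walk, same answer
}

// The fix: compute the size once and reuse the cached value.
int serialize_cached(const ToyMessage& msg, std::vector<char>& out) {
    int size = msg.ByteSize();   // single walk
    out.resize(size);
    return size;
}
```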

This is just from one minute of looking. There are probably more problems. Unfortunately, I find this is typical of benchmarks that claim something is massively faster than Protobufs. Admittedly, this may be indicative of the Protobuf interface being unintuitive.

Not that I doubt that SBE is faster; the design should obviously make it so. But it's probably not 25x faster.

Update: Bug filed: https://github.com/real-logic/simple-binary-encoding/issues/...


Good point, although it's not like the stock protobuf implementation (rather than protocol) is the best that can be had. For example, this implementation of protobuf deserialization is quite a bit faster:

https://github.com/haberman/upb

http://blog.reverberate.org/2011/04/upb-status-and-prelimina...


Hi there, upb author here, thanks for the shout-out!

The stock protobuf implementation is pretty close to optimal given what it does. Where upb is faster, it is faster by doing less.

For example, I can beat protobuf if you don't write the parsed data into a tree structure (or only write a few fields). I can beat it if your input has unknown fields and you don't care about preserving them. But if what you need is a tree structure that contains 100% of the input data, protobuf is hard to beat speed-wise (though there are still a few tricks, like arena allocation, that can beat it given some usage patterns).


Interesting. Well I hope that if they are indeed faster in the general case, protocol buffers can learn something from them.


The design principles behind SBE and Cap'n Proto are very similar, so in some part the lessons have already been learned.

I haven't benchmarked either of those two against each other, but I would not be surprised if they were very close in their C++ implementations. That said, SBE does provide a non-JNI Java implementation which, as far as I know, is not yet available for Cap'n Proto. SBE also provides a C# implementation which Cap'n Proto doesn't. That in and of itself may be enough for many people (Cap'n Proto seems to have a lot of languages under development that SBE doesn't, fwiw).


This sounds very similar in concept to Cap'n Proto, and given how much hard-earned experience Kenton has and how much time he's put into it, I'd put my bets on Cap'n Proto if you're looking for something in this design space.

I haven't looked at SBE in detail, but given the high-level description I expect it would have a similar set of trade-offs vs. Protocol Buffers as Cap'n Proto does: it will have unparalleled serialize/deserialize performance (since they are basically no-ops), but sparse messages (where only a few of many defined fields are set) will be arbitrarily larger than protobufs, so a separate compression step will be necessary to be size-competitive, at which point some of the speed advantage is lost.


Kenton Varda deserves the respect you are giving him when it comes to serialization libraries. Cap'n Proto is a great project that I'd love to work with anytime. That said, Martin Thompson whose blog this is a link to is also very well respected in his particular domain (low latency messaging systems especially on the JVM and JVM/C++ hybrid space).

All that's to say, the Cap'n Proto and SBE libraries have exceedingly similar design philosophies and pedigrees. I expect them to have near performance parity in the shared language of C++. But if you are looking for a JVM/C# solution I think you can pretty reasonably choose to start working with SBE today as it provides those language bindings out of the gate and there is unlikely to be such a performance mismatch between SBE and Cap'n Proto in the future for it to matter long term.


I can totally believe that Martin Thompson is highly capable given what I read in the article. I guess my bias is against a proliferation of similar-but-incompatible messaging formats. But I suppose that if Cap'n Proto doesn't support JVM/C#, then that doesn't leave someone on those platforms a lot of options.


There's been a lot of false starts on getting Java support implemented (people saying they'd do it, and then not), and one or two for C# as well. I wish I had time to just sit down and crank out a Java implementation myself -- I think it would take a week or two -- but right now I really need to be putting everything into Sandstorm.io. :/

It will happen eventually, though.


I think one problem with this is that a JNI solution would be "trivial" but is wholly unsatisfying (JNI being somewhat incompatible with the performance goals). The Unsafe and/or ByteBuffer solution on the other hand would be a lot of work to create and maintain (so much so that I wouldn't sign up for the job).

SBE on the other hand is targeting a space dominated by C++/Java/C#. They can basically ignore every other language binding without losing any of their target audience. It also doesn't hurt that they have a patron with a financial motivation to back them...


I would be interested to see a binary size comparison along with the speed comparison. Is there a trade-off of speed for byte size?


Compared to Protobuf, there is definitely going to be a trade-off. Protobuf uses variable-width integer encoding whereas SBE uses only fixed-width integers which means there will be a lot of zero-valued high-order bytes and padding taking up space. But variable-width encoding is costly because it involves a lot of unpredictable branching -- this is exactly where SBE's speed advantage comes from. The exact amount of overhead will vary depending on usage; it could be about the same size, or it could be many times larger. (This is a general problem with benchmarks of serialization systems: the difference is highly dependent on use case, so you should never assume your own application will see the same kind of performance as any particular benchmark.)

Cap'n Proto also uses fixed-width encoding, but offers an optional "packing" step that is basically a really fast compression pass that aims only to remove zeros. I've found that this usually achieves similar sizes to Protobufs while still being faster, because the packing algorithm can be implemented in a tight loop whereas a Protobuf encoder usually involves sprawling generated code that does all kinds of things. SBE could pretty easily adopt Cap'n Proto's packing algorithm.

SBE is probably slightly leaner than Cap'n Proto because it doesn't use pointers. Instead, SBE outputs the message tree in preorder. Only fixed-width elements are found at fixed offsets; the rest have to be found by traversing the tree. The downside of this is that if you want to find one element in a list, you must first traverse all of the elements before it (and all of their transitive children). Cap'n Proto's approach using pointers allows you to process input in the order that is most convenient for you (rather than in the exact order in which it was written) and makes it easy to seek directly to one particular element (which is awesome when you're mmap()ing a huge database), but does have a slight cost in terms of message size.
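The traversal cost can be sketched with an assumed layout (length-prefixed elements written back-to-back; this is an illustration, not SBE's actual wire format): reaching element i in a preorder stream means skipping over every element before it, whereas a pointer/offset scheme reads the target's position directly.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Assumed layout for illustration: variable-length elements stored
// back-to-back, each prefixed by a 2-byte little-endian length.
// Finding element `index` requires walking past everything before it,
// so random access is O(n) in the number of preceding elements.
size_t offset_of_element(const uint8_t* buf, size_t index) {
    size_t pos = 0;
    for (size_t i = 0; i < index; ++i) {
        uint16_t len;
        std::memcpy(&len, buf + pos, sizeof(len));
        pos += sizeof(len) + len;  // skip length prefix + payload
    }
    return pos;  // offset of element `index`'s length prefix
}
// A pointer-based format instead stores each element's offset explicitly,
// making the same lookup a single read at the cost of extra bytes.
```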


I'd add that SBE is targeting the FIX community, where there are many, but bounded, message types, and where the incidence of lists (or lists of lists) is rare and already segmented off to the sections of systems that people associate with lower performance requirements.

Cap'n Proto does not seem to have this focus (forgive me for assumptions about intentions) and therefore is more designed around supporting the general data structure case.

I'd say that in the cases where the performance differences between these two systems really matter, you would be unable to make a fair judgement without implementing your real-world problem in both and testing the differences. That said, I had an argument today with a colleague about stop-bit encoding vs fixed-width encoding, and without testing real-world solutions I was hesitant to make any claims, even though my experience and bias would be toward fixed-width encoding.

As for Protobuf, it has been a while since I've worked with it, but when I did I had two specific performance bottlenecks besides the variability (on the JVM): 1) encoding non-primitive fields, and 2) not being able to do allocation-free programming with it.



