Now that I’ve finally gotten around to releasing a protocol built on top of the RpcLibrary, I thought it would be fun to re-run some benchmarks. The protocol has been sitting unreleased for almost a year, waiting on some required changes in protobuf-csharp-port that have finally been published in a release. The protocol I’m speaking of, Google.ProtocolBuffers.Rpc, uses protobuffer services and the RpcLibrary to provide a full-featured RPC client/server implementation.

Benchmark Process
For the comparison benchmarks I wanted to demonstrate both the power of the RpcLibrary and the efficiency of this protobuffer library. So I built an RPC test harness that spawns a server process and multiple client processes (5 of them), each using multiple threads (3 each). These 15 threads wait on a global signal until everyone is ready and then run 3 times in succession for a fixed duration of 5 seconds. The total number of calls made is then divided by the thread’s running duration to produce a calls-per-second figure. Since this happens 3 times on 15 threads, there are 45 results per test, which are then used to produce a worst, average, and best calls-per-second.
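The measurement loop described above can be sketched roughly as follows. This is not the actual test harness: it is a minimal Python illustration that uses threads in a single process rather than 5 client processes, with a `threading.Barrier` standing in for the global ready signal. The names `run_benchmark` and `summarize` are hypothetical, and `call` is whatever RPC invocation is being measured.

```python
import threading
import time
from statistics import mean


def run_benchmark(call, threads=15, rounds=3, duration=5.0):
    """Return one calls-per-second figure per thread per round."""
    # Assumption: a barrier stands in for the harness's global "everyone ready" signal.
    barrier = threading.Barrier(threads)
    rates = []
    lock = threading.Lock()

    def worker():
        for _ in range(rounds):
            barrier.wait()  # block until every thread is ready for this round
            calls = 0
            start = time.perf_counter()
            deadline = start + duration
            while time.perf_counter() < deadline:
                call()
                calls += 1
            elapsed = time.perf_counter() - start
            with lock:
                # total calls divided by this thread's own running duration
                rates.append(calls / elapsed)

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return rates


def summarize(rates):
    """Reduce the per-run rates to (worst, average, best) calls-per-second."""
    return min(rates), mean(rates), max(rates)
```

With the defaults, `summarize(run_benchmark(some_call))` reduces 15 × 3 = 45 results to the three numbers plotted for each test.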

Benchmark Test Data
Each call passes a single object in and returns a single object as a result. The object is a collection of a ‘sample data’ class that has the following data: 32 bytes of random data, those 32 bytes as a base-64 encoded string, a sequentially incremented integer, a double, and a date-time value. The graph title below indicates the number of these records passed in and out for each call. Zero is used for an empty collection/message to test the transport and dispatch speed while only serializing an empty collection. The test was completed for all transport/protocols for message sizes of 0, 10, and 100 records. At 1000 records the test duration was increased to 50 seconds and only ProtoBuf_LRPC and Wcf_TCP were executed.
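To make the payload concrete, here is a rough Python sketch of a record with the five fields listed above. The class name, the field names, and the way the double is derived are assumptions made for illustration; the real test uses a protobuffer message and WCF data contracts, not this class.

```python
import base64
import os
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class SampleData:
    # All field names here are hypothetical; only the field types come from the post.
    raw: bytes           # 32 bytes of random data
    text: str            # the same 32 bytes as a base-64 encoded string
    sequence: int        # sequentially incremented integer
    value: float         # a double
    timestamp: datetime  # a date-time value


def make_records(count):
    """Build the collection sent in (and returned from) a single call."""
    records = []
    for i in range(count):
        raw = os.urandom(32)
        records.append(SampleData(
            raw=raw,
            text=base64.b64encode(raw).decode("ascii"),
            sequence=i,
            value=i * 1.5,  # assumption: the post does not say how the double is chosen
            timestamp=datetime.now(timezone.utc),
        ))
    return records
```

Calling `make_records(0)` gives the empty collection used for the transport-only test, while `make_records(10)` and `make_records(100)` correspond to the larger message sizes.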

Transports & Protocols
Most of the transports and protocols used should be obvious enough from the abbreviated names in the charts below; however, there are a few worth calling attention to. All the Wcf_xxx tests are in-process hosted WCF listeners, including the Wcf_Http test. All the ProtoBuf_xxx tests used the RpcLibrary except the ProtoBuf_Wcf test. The ProtoBuf_Wcf test used protobuffer protocol serialization on a WCF/TCP transport. The WCF service was basically just a Stream argument, Stream return prototype. I see a lot of people packing protobuffers over WCF without realizing that serialization is not WCF’s only problem. All the tests ending with “_Auth” are fully secured and authenticated connections.

Results with Empty Messages
The top of the blue bar is the worst run’s average calls-per-second, the top of the red bar is the average of all 45 runs, and the top of the green bar is the best calls-per-second of all 45 runs.

Clearly there is almost no comparison here: WCF is running at 1/20th of the speed of the RpcLibrary! That, remember, is with a completely empty message. I was actually pleased to see that WCF even completed these tests; the last time I tried, in .NET 2.0 (WCF 3.0), it dropped client connections and caused them to time out. This time around, running WCF with the full 3.5 SP1 framework, it ran flawlessly, even if slowly.

It Only Gets Worse for WTF, err WCF
The next two benchmarks only widen the gap between the RpcLibrary and WCF. With just 10 ‘records’ RpcLibrary + Protobuffers outpaces WCF by 25 times, and at just 100 records that jumps to more like 40 times faster! After that it seems to start to level off: 1000 records averaged 40.8 calls-per-second on protobuffers+Rpc, and WCF averaged only 0.91 calls-per-second. That is still a 45-fold increase in performance and, IMHO, it’s far more acceptable to execute a round-trip in 24 milliseconds instead of 1.1 seconds.
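For anyone who wants to check the arithmetic at 1000 records, it works out as below. This assumes each figure is a per-thread rate of back-to-back calls, so the average round-trip time is simply the reciprocal of calls-per-second; the function name is just for illustration.

```python
def round_trip_seconds(calls_per_second):
    """Average time per call, assuming calls run back-to-back on one thread."""
    return 1.0 / calls_per_second


rpc_rate = 40.8   # protobuffers + RpcLibrary, 1000 records (from the results above)
wcf_rate = 0.91   # WCF, 1000 records (from the results above)

speedup = rpc_rate / wcf_rate
rpc_ms = round_trip_seconds(rpc_rate) * 1000
wcf_s = round_trip_seconds(wcf_rate)

print(f"speedup: {speedup:.1f}x")                 # speedup: 44.8x
print(f"RpcLibrary round-trip: {rpc_ms:.1f} ms")  # RpcLibrary round-trip: 24.5 ms
print(f"WCF round-trip: {wcf_s:.2f} s")           # WCF round-trip: 1.10 s
```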

So the next time you’re looking to stand up an internal service to talk to on the local machine or within the local intranet, you should take a close look at alternatives to WCF. The RpcLibrary is wicked fast, and I’ve never seen RPC simply ‘hang’ with a broken connection.