Shahzad Bhatti Welcome to my ramblings and rants!

September 5, 2007

Performance Testing with C++, Java, Ruby and Erlang

Filed under: Computing — admin @ 4:22 pm

I came across an interesting blog on classical distributed Byzantine Generals Problem from Mark Nelsons’ blog. He showed C++ implementation of Leslie Lamport algorithm. It seemed like a natural fit Erlang, but instead of writing directly in Erlang, I first wrote the algorithm into Java and Ruby (with slight redesign). Unfortunately, the original C++ source, and my Java and Ruby versions are not really truly distributed or concurrent. I just wanted to translate the original algorithm without changing a whole a lot, but Erlang gave you both distributing features and concurrency for free. You can learn about the Byzantine General problem from Mark Nelson’s post or from this brief summary, or the Wikipedia. The algorithm is heavily based on message communication where a general sends a message value (0 or 1), to his lieutenants and each lieutenant sends the value to all other lieutenants including himself. After some specific number of rounds a consensus is reached. The original C++ code defines a Node structure to store input/output values of each process, a Traits class to store configuration and a method to come up with new values. Since, some processes can be faulty (or traiters), the Traits class determines the value that a process will return. The most of the fun happens in the Process class that performs all message communication. You can download the original C++ source code.

I slightly changed the structure and got rid of static attributes and methods. In my design, the application consists of following modules:

  • Configuration – for storing configuration for simulation such as number of processes, rounds, source (general) process.
  • Strategy – that determines the value that should be returned by the process.
  • Repository – that stores hierarchical paths for messages that were communicated by the processes.
  • Process – GeneralProcesses classification that send or receive messages.
  • Engine – drives the algorithm.
  • Test – that actually invokes engine with different parameters and stores timings.

Here is the Java implementation (I am showing multiple java classes here, but you can download the complete source code from here. I kept most of the algorithm as it was, which turned out to be really memory hog when you increase number of processes. In fact, I could not run the Java application for more than 10 processes with 512MB memory. Here is the source code for the Java version:

Java
Java
Java
Java
Java
Java
Java
Java
Java
Java
Java

 Here is the ruby version, which is pretty much same as the Java version (except a bit smaller). The operator overloading and power of collections give Ruby the terseness over Java version. Again, the application is consisted of classes for process, strategy, repository, engine, node, and config. The benchmarks are kicked off from ByzGeneralTest, which calls Engine, which in turn creates Repository, Strategy, Generals and LProcesses (lieutenants).

Ruby
Ruby
Ruby
Ruby
Ruby
Ruby

Finally here is the Erlang source code (Again I am showing multiple modules here). I tried to break the structure same as my Java and Ruby version and the application consisted of process, strategy, repository, engine, node, and config modules. The benchmarks were kicked off from the byz_general_test, which invoked engine and engine created processes for repository, generals and lieutenants. Another distinction is difference between active objects and regular functions. In this design, repository and processes are active objects that can receive messages via Erlang’s message communication primitives. Also, I added message counter just to keep track of number of messages for the simulation (though it is not requirement for the algorithm). I used ets APIs in repository to store paths of messages.

Erlang
Erlang
Erlang
Erlang
Erlang
Erlang
Erlang

I merged the results with following script:

Ruby

And here are the results of benchmark that I ran on my Dell notebook (Note I used time function in C++ which only returns timing in seconds, so C++ timings are actually not precise.):

Conclusion:

This was interesting problem that tested limits of these languages. For example, I found when using more than 10 processes things got really nasty. The Java program gave up with OutOfMemory. The Erlang program dumped crashed, though I would have used OTP’s supervisors if I was writing more fault tolerant application. Ruby program became too slow, so I had to kill it. Java turned out to be the performance leader in this case and I was a bit surprised when Erlang’s response time became really high with 9 processes. As, I mentioned earlier, the C++, Java and Ruby versions are not really concurrent and their message passing is really method invocation. As far as mapping algorithm to the language, I found that Erlang fit very nicely for distributed algorithms that involve a lot of communications. I left the C++, Java and Ruby version very simple and didn’t try to implement truly independent processes and communication because that would have required a lot more effort.

Resources:

Source Code

No Comments

No comments yet.

RSS feed for comments on this post. TrackBack URL

Sorry, the comment form is closed at this time.

Powered by WordPress