Concurrency Constructs « Shahzad Bhatti

October 18, 2008

Concurrency Constructs

Filed under: Computing — admin @ 11:43 am

With the proliferation of multi-cores, concurrency has been a hot topic for past few years. I described a few ways to tackle concurrency in my blog earlier this year. As I mentioned in my blog, this problem is nothing for scientific and research community. Also, many languages have had devised numerous constructs to tackle these abstractions and other languages are adding support for concurency. I will go over some of these abstractions here:

Processes

All modern operating systems support preemptive multitasking and processes have been simple way to add concurrency. For example, this technique is used by Apache web server that launches multiple processes and serve thousands of clients at a time. There are also a number of system primitives for inter-process communication such as shared memory, sockets, pipes and other forms of middlewares and peer to peer technology. This technique is specially suitable for many scripting languages that don’t have concurrency support such as Perl, Ruby, CPython, etc. Despite their scripting nature, these languages are highly productive. For example, much of Amazon’s retail website is built using Perl/Mason and uses multiple processes served by FastCGI for handling concurrent requests. Though, Ruby and Python support green threads, but are not designed for concurrency. For example, Global Interpreter Lock (GIL) in Python only allows one thread to execute at a time. Similarly, Ruby has Global VM lock that also allows one thread to execute at a time. These limitations have also effected kind of applications you can built. For example, one of criticism of Ruby on Rails framework has been its lack of concurrency. Though, there have been a number of solutions that tackle this problem by multi-processing such as clusters of mongrel, FastCGI, etc.

This model is also used by Fork/Join, Map/Reduce, Master-Slave and Leader/Follower style of parallel programming, where the master process creates work and distributes work to workers, who work on it. There are a plenty of libraries and frameworks available for this such as Globus, Hadoop, MPI/MPL, PVM and Fork/Join in Java 7.

Kernel Threads

Many languages support kernel threads that are mapped to lightweight processes (LWP) and scheduled by the operating systems. The operating system context switches between threads using time-slicing that give perception of concurrency. Posix compliant operating systems offer a number of solutions for inter-thread communication and synchronization such as mutex, semaphores, condition variables. Many languages offer support for high level synchronization such as concurrent library in Java.

Green Threads/Fibers

As opposed to native threads, green threads run in user space and scheduled by using yield to allow other thread/fiber to execute. As user threads or fibers rely on collaboration, they are easily subject to starvation. Some languages encapsulate blocking I/O with nonblocking I/O and automatically yield fiber when they make blocking call. Effective use of fibers can help applications scale more than multi-processing or multi-threading, which are generally limited by the system resources. For example, in a performance comparison between Apache and Yaws, where latter was built in Erlang that uses user processes was able to scale upto 80,000 concurrent connections whereas Apache died at around 4000 connections. Ruby 1.9 has just added support for fibers, which can be used to build streams, e.g.

 fib = Fiber.new do  
    x, y = 0, 1 
    loop do  
     Fiber.yield y 
     x,y = y,x+y 
    end 
   end 
   20.times { puts fib.resume }

Streams

Streams are sort of iterators over infinite lists and often used in functional languages, though many imperative programming languages now support them to create infinite lists. You can find a number of examples of Strams in languages like Groovy, Scala, Haskell, etc. Python offers similar capability via generators that contain yield keyword, which causes compiler to suspend generator until next() method of generator is called. For example,

 def generate_ints(N):
     for i in range(N):
         yield i
 
 >>> gen = generate_ints(2)
 >>> gen.next()
 >>> gen.next()

Coroutines

Coroutines are similar user threads and unlike subroutines, they have multiple entry and exit points. They are often built using contiuations or generators. In C, setjmp and longjmp can be used to implement coroutine. Ruby 1.9 added support for fibers that are more accurately semi-coroutines [pp 167].

Continuation

Continuation work like saving game at a checkpoint and resuming it. Traditionaly goto were used to implement continuations, though a number of languages have native support for continuations such as Scheme, Smalltalk, Ruby (callcc), Scala, C (setjmp/longjmp), etc. Though, they are not without problems but Seaside web framework based on Smalltalk has shown innovative way to write web applications using continuation passing style.

Actor model

I was introduced to actor model in mid mid 90s when working on graduate and post-graduate work. An actor has its own thread of execution with a mailbox. It communicates to other actors by sending messages that are delivered to its mailbox. Erlang, a language designed for concurrency, is based on actor model and more recently Scala has adopted actor model as well. When I worked for high performance computing (HPC) area in 90s, I used SGI systems that were built using NUMA based shared memory and messaging passing systems/libraries such as MPI, MPL, etc. I found message passing systems were much more scalable than shared memory, so I believe this style of programming will have best chance of succeeding. Though, I admit I found converting algorithms into message passing style is not easy.

Reactive or Event driven style

In reactive or event driven style, systems communicate by sending or listening on events. It’s sort of flip-style of threading, instead of creating new thread for each new concurrent task, it uses a fixed number of threads that cooperate. It is more scalable than multi-threading (kernel) as native threads are limited by available resources and context switching is expensive. For example, event-driven Mongrel server is much more scalable than Mogrel server. A number of J2EE application servers such as Glassfish, Jetty and Tomcat 6 have adopted Grizzly framework that uses asynchronous I/O to implement reactive style servers.

Software Transactional Memory

It’s difficult to write correct program using shared memory and as it requires proper locking of shared data and special attention to thread starvation and deadlock issues. A number of languages like Clojure, Haskell and Fortress have added support for software transactional memory that provide implicit locking through transactions. I have not used this style of programming but it seems to make writing concurrent applications easy. Though, I believe scalability with shared memory may only help with systems upto a few cores or processors.

TupleSpace based model

In tuple space, processes communicate to each other by storing messages in tuple spaces. The tuple space provides simple APIs like get, put, read and eval. Where get/read are blocking operations to implement dataflow based application and eval creates a new process to execute a task and store result in the tuple space. I built a similar system called JavaNOW, though there are a number of open source and commercial frameworks availeble such as JavaSpaces, GigaSpaces, Blitz.

Declarative concurrency

I heard recently Anders Hejlsberg and Guy Steele talked about concurrency and Anders suggested declarative concurrency where the programmer indicates parts of the applications that can be run concurrently. In a way, it is similar to compiler based parallelism used in High performance Fortran or Parallel Fortran that used data flow analysis to create parallel applications.

In nutshell, I find that more and more languages are adding constructs and frameworks to address concurrency and we will have to get used to these constructs to write programs that are future proof.

Shahzad Bhatti Welcome to my ramblings and rants!

October 18, 2008