
Overview of concurrency in .NET Framework 3.5

There is a lot of information on the concurrent primitives and concepts exposed by the .NET Framework 3.5 available on MSDN, blogs, and other websites. The goal of this post is to distill the information into an easy-to-digest high-level summary: what are the different pieces, where they differ and how they relate. If you want to know the difference between a Thread and a BackgroundWorker, or what is the point of interlocked operations, you are reading the right article.

For each construct, I will give a motivating usage example and explain how it relates to other concurrent constructs. I will not attempt to cover everything there is to say about concurrent programming on the .NET Framework, but wherever possible, I will link to more in-depth resources on each topic. Entire books have been written on concurrent programming for different platforms, so I certainly cannot fit everything into a single blog post. But I will walk you through the concurrent constructs and primitives so that you understand what is out there and know where to look for more information if you need it.

There are three key concepts that .NET concurrency primitives relate to: concurrent execution, synchronization and memory sharing. Let’s go over them one by one.

Concurrent execution

In order to move from sequential to concurrent programming, we need some way to do multiple things at the same time. In the .NET Framework, threads are the main mechanism for concurrent execution.

For example, this program reads integers from the console and factors them into primes, executing each factorization on a separate thread:

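// Requires: using System; using System.Collections.Generic;
//           using System.Linq; using System.Threading;

static void Main()
{
    while (true)
    {
        string line = Console.ReadLine();
        long x = long.Parse(line);
        // Run each factorization on its own thread so Main keeps reading input.
        new Thread(() => Factor(x)).Start();
    }
}

static void Factor(long x) {
    long original = x;  // x is divided down below, so remember the input for printing
    var factors = new List<long>();
    for (long i = 2; i * i <= Math.Abs(x); i++) {
        while (x % i == 0) {
            x /= i;
            factors.Add(i);
        }
    }

    // Whatever remains is the last prime factor (or the input had no factors at all).
    if (x != 1 || factors.Count == 0) factors.Add(x);

    Console.WriteLine(
        "{0} = {1}",
        original,
        string.Join(" x ", factors.Select(i => i.ToString()).ToArray()));
}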

The nice thing about this program is that even if some integer takes a long time to factor, the program will remain responsive. To see this, run the above program, and enter 100000000000000019. While the program is busy attempting to factor the large prime, you can continue to type in integers, and each answer pops up as soon as it is computed.

A disadvantage of the above approach is that if you use this program to factor many small integers, you will end up creating a lot of threads. Creation and destruction of threads is fairly expensive, so you will pay a significant performance cost. A thread pool is a construct to mitigate that cost. Instead of creating a new thread each time we want to perform some work asynchronously, we just enqueue the work on the thread pool. The thread pool keeps recycling the same threads to execute more and more work items, and thus avoids the cost of creating and destroying threads frequently. A thread pool is exposed in .NET via the ThreadPool class. We can rewrite the Main method from the previous example to use ThreadPool instead of directly creating threads:

static void Main()
{
    while (true)
    {
        string line = Console.ReadLine();
        long x = long.Parse(line);
        ThreadPool.QueueUserWorkItem((o) => Factor(x));
    }
}

Applications with a graphical user interface are one class of programs that benefit from concurrency. A frozen interface is a terrible user experience, and concurrent programming is one way to solve that problem. BackgroundWorker is a class designed to make it easier to integrate asynchronous execution with applications that use Windows Forms. BackgroundWorker provides the RunWorkerCompleted event, which fires once all work is done, as well as the ProgressChanged event, which reports periodically how much of the work has already been completed.

The main feature of the BackgroundWorker, and arguably the reason for its existence, is that events RunWorkerCompleted and ProgressChanged fire on the UI thread rather than on the thread on which the worker is executing. This is very important for integration with Windows Forms, because UI elements may only be modified from the UI thread. BackgroundWorker accomplishes this by posting messages from the worker thread, which get picked up by the event loop on the UI thread.
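To make the event flow concrete, here is a minimal sketch of wiring up a BackgroundWorker. The class and method names (BackgroundWorkerSketch, SumInBackground) are made up for illustration, and the sketch assumes it runs outside Windows Forms, for example in a console program. In that setting there is no UI thread, so the events fire on thread pool threads, and the caller blocks on a ManualResetEvent until the work completes. In a real Windows Forms application you would not block the UI thread like this; you would simply update the controls from the two event handlers.

```csharp
using System;
using System.ComponentModel;
using System.Threading;

static class BackgroundWorkerSketch
{
    // Hypothetical helper: sums 1..n on a background thread and
    // blocks the caller until the result is ready.
    public static long SumInBackground(int n)
    {
        long result = 0;
        var done = new ManualResetEvent(false);
        var worker = new BackgroundWorker();
        worker.WorkerReportsProgress = true;  // required before calling ReportProgress

        // DoWork runs on a background thread.
        worker.DoWork += (sender, e) =>
        {
            long sum = 0;
            for (int i = 1; i <= n; i++)
            {
                sum += i;
                if (i % 25 == 0) worker.ReportProgress(i * 100 / n);
            }
            e.Result = sum;  // handed to RunWorkerCompleted
        };

        // In a Windows Forms app these two handlers run on the UI thread.
        worker.ProgressChanged += (sender, e) =>
            Console.WriteLine("{0}% done", e.ProgressPercentage);
        worker.RunWorkerCompleted += (sender, e) =>
        {
            result = (long)e.Result;
            done.Set();
        };

        worker.RunWorkerAsync();
        done.WaitOne();  // console-only: wait for the completion handler
        return result;
    }
}
```

Calling BackgroundWorkerSketch.SumInBackground(100) returns 5050 after reporting progress a few times along the way.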

Synchronization

In most real-world usages, threads are not fully independent. Instead, they coordinate their progress and wait on each other at predetermined places. The mechanism for thread coordination is called synchronization.

The most important example of synchronization is mutual exclusion. Often, it is important to be able to ensure that two threads will not execute in some section of code at the same time. If one thread is executing in the critical section, and another thread attempts to enter that section, the second thread will have to wait until the first thread exits the critical section.

Monitor is the simplest primitive for mutual exclusion provided by the .NET Framework. It allows any .NET object to be treated as a lock. Only one thread at a time can be inside a region protected by the monitor associated with a particular object. However, note that two different threads can be inside two monitors that are associated with different objects. Here is an example that uses Monitor to implement a thread-safe Account class:

class Account {
    private int balance;
    private object myLock = new object();
    
    public void Deposit(int amount) {
        Monitor.Enter(this.myLock);
        try {
            this.balance += amount;
        }
        finally {
            Monitor.Exit(this.myLock);
        }
    }
    
    public void Withdraw(int amount) {
        Monitor.Enter(this.myLock);
        try { 
            if (this.balance < amount) {
                throw new InvalidOperationException("Balance too low.");
            } 
            this.balance -= amount; 
        } 
        finally {
            Monitor.Exit(this.myLock);
        }
    }
}

Since mutual exclusion using Monitor is such a common pattern, some .NET languages have a special language construct for it. C# is one of those languages, so we can rewrite the Withdraw() method from the Account class as follows:

public void Withdraw(int amount) {
    lock (myLock) {
        if (this.balance < amount) {
            throw new InvalidOperationException("Balance too low.");
        }
        this.balance -= amount;
    }
}

Mutual exclusion imposes a constraint that at most one thread executes in a particular critical section at a time. However, a looser synchronization pattern is often useful, where we distinguish between a read and write access to a critical section. Many readers can access the same critical section at the same time, but if a writer is in the critical section, all other readers and writers must stay out. As a result, throughput is improved if there are many more readers than writers, particularly on a multi-core machine.

This pattern of usage is supported by the ReaderWriterLock class in the .NET Framework. However, it turns out that the implementation of ReaderWriterLock has performance issues, and as of version 3.5, .NET includes a new class, ReaderWriterLockSlim. Long story short, in new code you should generally use ReaderWriterLockSlim rather than ReaderWriterLock.

As an example, let’s implement another class to represent a bank account, this time with reader-writer locks:

class Account2 {
    private int balance;
    private int overdraftLimit;
    ReaderWriterLockSlim rwLock = new ReaderWriterLockSlim();
    
    public Account2(int balance, int overdraftLimit) { ... }
    
    public void SetOverdraftLimit(int overdraftLimit) { ... }
    
    public void Deposit(int amount) { ... }
    
    public void Withdraw(int amount) {
        this.rwLock.EnterWriteLock();
        try {
            if (this.balance + this.overdraftLimit < amount) {
                throw new InvalidOperationException("Balance too low.");
            }
            this.balance -= amount;
        }
        finally {
            this.rwLock.ExitWriteLock();
        }
    }
    
    public int GetBalance() {
        this.rwLock.EnterReadLock();
        try {
            return this.balance;
        }
        finally {
            this.rwLock.ExitReadLock();
        }
    }
}

If calls to GetBalance() are much more frequent than calls to Deposit(), Withdraw() and SetOverdraftLimit(), we should observe a significant performance improvement from using ReaderWriterLockSlim over Monitor.

Reader-writer lock is one possible generalization of the mutual exclusion concept. Another possible generalization is to limit the number of threads executing in a critical section to a number larger than one. This problem certainly does not arise as frequently as the critical section problem, but it is frequent enough to be addressed by a classic computer science construct known as a semaphore. In .NET, semaphores are exposed via the Semaphore class. This code sample implements method ThrottledQueryDB which limits the number of threads that can concurrently call QueryDB() to 5:

class SemaphoreSample {
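    // Initial count 5, maximum count 5: all five slots start out free.
    // (An initial count of 0 would make every WaitOne call block forever.)
    private Semaphore semaphore = new Semaphore(5, 5);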
    
    public void ThrottledQueryDB() {
        semaphore.WaitOne();
        try {
            QueryDB();
        }
        finally {
            semaphore.Release();
        }
    }
}

However, there are many synchronization scenarios that are more complicated than mutual exclusion. For example, we may want one thread to wait until another thread computes a result, and only proceed once the result is ready. Monitor’s methods Wait, Pulse and PulseAll are often the easiest way to implement this scenario. The nice thing about using Monitor’s methods is that Pulse and Wait are called while holding a lock. If we use events (described below) rather than Monitor signaling, it becomes much harder to avoid race conditions. Here is a simple implementation of a producer-consumer exchange data structure that has a capacity for one element, and blocks both consumers and producers as appropriate:

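class ProducerConsumer<T> {
    private T value;
    private bool isEmpty = true;

    public void Produce(T t) {
        lock (this) {
            // Re-check the condition after every wake-up.
            while (!isEmpty) {
                Monitor.Wait(this);
            }
            this.value = t;
            isEmpty = false;
            // Producers and consumers wait on the same monitor, so wake all of
            // them; a single Pulse could wake another producer instead of a consumer.
            Monitor.PulseAll(this);
        }
    }
    
    public T Consume() {
        lock (this) {
            while (isEmpty) {
                Monitor.Wait(this);
            }

            isEmpty = true;
            Monitor.PulseAll(this);
            return this.value;
        }
    }
}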

An even more powerful construct than Monitor signaling is an event. In .NET, events are represented via the ManualResetEvent and AutoResetEvent classes. The main two operations on an event are WaitOne and Set. Various threads can "wait" on the event by calling WaitOne. Each call to WaitOne will block until some other thread calls Set. The difference between the two classes is what happens next. A ManualResetEvent stays in the "set" state, releasing all threads that have been waiting on it (and any that arrive later), until some thread resets it manually by calling the Reset method. An AutoResetEvent releases a single waiting thread and then automatically resets itself into the "unset" state.

For a usage example of events, see for example the MSDN article, How to: Synchronize a Producer and a Consumer Thread.
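As a smaller illustration of WaitOne and Set, the following sketch starts several threads that all block on one ManualResetEvent and then releases them with a single call to Set. The names EventSketch and ReleaseAll are hypothetical, chosen only for this example.

```csharp
using System.Threading;

static class EventSketch
{
    // Starts `waiters` threads that all block on one ManualResetEvent,
    // then sets the event and returns how many threads were released.
    public static int ReleaseAll(int waiters)
    {
        var gate = new ManualResetEvent(false);  // false = starts in the unset state
        int released = 0;
        var threads = new Thread[waiters];

        for (int i = 0; i < waiters; i++)
        {
            threads[i] = new Thread(() =>
            {
                gate.WaitOne();                       // block until the event is set
                Interlocked.Increment(ref released);  // count this thread as released
            });
            threads[i].Start();
        }

        gate.Set();  // a ManualResetEvent stays set, so every waiter gets through
        foreach (Thread t in threads) t.Join();
        return released;
    }
}
```

EventSketch.ReleaseAll(3) returns 3. If the gate were an AutoResetEvent instead, a single Set would release only one of the waiting threads, and the Join calls would never return.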

Memory sharing

I talked about how to run code on different threads, and also how to synchronize them. One thing that I did not talk about is how to access shared memory from different threads.

First, the nice part: if all reads and writes to a particular memory location are protected by the same lock, each read will see the most up-to-date value written to that memory location. Reads and writes will behave exactly as you would expect. That is pretty nice and simple, and in 99% of cases, all that you need.

But once we leave the safety of locks and attempt to access the same memory from multiple threads, things get much, much more complicated. I won’t get into details here, but in summary: many properties that we take for granted in single-threaded programming (and get for free in multi-threaded programming with proper locking) break down when we try to share memory among threads without locking. Reads and writes may get reordered in strange ways. For example, if one thread updates value X and then updates value Y, and another thread reads value Y and then value X, the reader thread may see the new value of Y, but not the new value of X. Or, one thread may set a flag, and another thread may loop forever checking the flag, never finding out that the flag has in fact already been set.

If you want to use low-lock techniques, you need to precisely understand the guarantees that the CLR makes about memory operations in situations where multiple threads access the same memory without locking. This is not for the faint of heart, but if you are interested, you can read up on the CLR memory model in Vance Morrison’s article.

As I said, low-lock techniques are hard. Despite that, it is useful to know what is out there. So, let’s move on to describe the constructs offered by the framework.

One technique to avoid locking is to take advantage of atomic operations exposed as static methods on the Interlocked class. These atomic operations are available:

Add: given two integers, replaces the first one with their sum, and returns the sum.
CompareExchange: if a variable is equal to a comparand, replaces it with another value; either way, returns the variable's original value.
Decrement: decrements an integer and returns the new value.
Exchange: sets a variable to some value and returns the original value.
Increment: increments an integer and returns the new value.
Read: reads a 64-bit value atomically (a plain 64-bit read is not guaranteed to be atomic on 32-bit platforms).

Now, the operations on the above list may seem pretty uninteresting. The reason why they are useful is that they are atomic. The entire operation either takes place fully or not at all, and no other thread may observe the intermediate invalid state. That is a very nice property, typically difficult to achieve in low-lock situations.

For example, consider this Counter class:

class Counter {
    int count = 0;

    public void Increment() {
        Interlocked.Increment(ref count);
    }
    
    public int Count {
        get { return count; }
    }
}

You could implement the same method by replacing Interlocked.Increment(ref count) with count++. But that would only work if all calls to the Increment method are externally synchronized. The ++ increment operator is a notorious example of a simple operation that is not atomic: if multiple threads attempt to increment the same field at the same time, some increments may get lost, and an incorrect count will get computed. Interlocked.Increment provides an atomic increment operation that does not require locks, which can be very useful.
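CompareExchange is the most general operation on the Interlocked list, because it lets you build other atomic updates out of a retry loop. As a rough sketch (the MaxTracker class and its members are invented for this example), here is a lock-free "remember the largest value seen so far":

```csharp
using System.Threading;

class MaxTracker
{
    private int max = int.MinValue;

    // Atomically raises max to candidate if candidate is larger.
    public void Report(int candidate)
    {
        while (true)
        {
            int seen = max;                 // snapshot the current maximum
            if (candidate <= seen) return;  // nothing to update

            // Install candidate only if max still equals the snapshot.
            // CompareExchange returns the value it found in max; if that
            // differs from the snapshot, another thread got in first, so retry.
            if (Interlocked.CompareExchange(ref max, candidate, seen) == seen) return;
        }
    }

    public int Max
    {
        get { return max; }
    }
}
```

The loop never blocks: a thread either finds that its candidate is no longer the largest, or it eventually wins the compare-and-swap. As with the Counter class, reading Max while other threads are still reporting gives you a value that may already be out of date.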

Another low-lock technique is the use of volatile fields. Some of the strange cases I mentioned earlier (such as a thread seeing a value written later, but not a value written earlier) go away when you mark the shared memory as volatile. Volatile fields prevent some kinds of reordering from happening: no reads or writes can move before a volatile read, and no reads or writes can move after a volatile write. Understanding precisely what these restrictions allow you to assume in practice is unfortunately quite hard, but Vance Morrison’s article helps.

If a field is volatile, then all reads and writes of that field will be volatile. To make a particular read of a non-volatile field volatile, use the VolatileRead static method on the Thread class. A VolatileWrite method is available as well, but due to some bizarre consequences of the .NET memory model, there is no difference between a volatile write and a regular write. At least, that is the case to the best of my knowledge in .NET 3.5.

As I said earlier, volatile reads and writes prevent other reads and writes from moving across them in one direction. The MemoryBarrier method on the Thread class creates a fence that cannot be crossed in either direction by reads or writes.
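The "thread loops forever on a flag" problem from earlier is the classic use case for a volatile field. Below is a minimal sketch, with a made-up StoppableWorker class, of a worker loop that another thread can stop. Marking the flag volatile means that each loop iteration performs a real read of the field instead of reusing a value cached in a register.

```csharp
using System.Threading;

class StoppableWorker
{
    // volatile: every check in Run re-reads the field, so a write made by
    // another thread in Stop is eventually observed.
    private volatile bool stopRequested;
    private long iterations;

    // Intended to run on its own thread; spins until Stop is called.
    public void Run()
    {
        while (!stopRequested)
        {
            iterations++;  // stand-in for a unit of real work
        }
    }

    // Safe to call from any thread.
    public void Stop()
    {
        stopRequested = true;
    }
}
```

A typical use is to start Run with new Thread(worker.Run).Start(), call worker.Stop() from another thread some time later, and then Join the worker thread.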

Finally, one construct that is quite often useful is a thread-local static field. A thread-local static field differs from regular static fields in that it can hold multiple values at the same time. Specifically, each thread has its own copy of the field that is completely independent from the copies that other threads see. Each thread can read and write into its own copy. To make a static field thread-local, mark it with the ThreadStatic attribute.

This static class uses a thread-local field to track how many times each thread called the Register() method:

static class ThreadAccessCounter {
    [ThreadStatic]
    private static int count;

    public static void Register() {
        count++;
        Console.WriteLine("This thread has called Register() {0} times", count);
    }
}
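To see that each thread really does get its own copy, here is a small self-contained sketch. PerThreadCounter and ThreadStaticSketch are hypothetical names used only for this illustration. Two threads call the same static method a different number of times, and each one observes only its own count.

```csharp
using System;
using System.Threading;

static class PerThreadCounter
{
    [ThreadStatic]
    private static int count;  // one independent copy per thread, each starting at 0

    // Increments the calling thread's copy and returns its new value.
    public static int Next()
    {
        return ++count;
    }
}

static class ThreadStaticSketch
{
    // Runs two threads and returns the final count that each one observed.
    public static int[] CountOnTwoThreads()
    {
        int[] totals = new int[2];
        var first = new Thread(() =>
        {
            for (int i = 0; i < 3; i++) totals[0] = PerThreadCounter.Next();
        });
        var second = new Thread(() =>
        {
            for (int i = 0; i < 5; i++) totals[1] = PerThreadCounter.Next();
        });

        first.Start();
        second.Start();
        first.Join();
        second.Join();
        return totals;
    }
}
```

CountOnTwoThreads returns { 3, 5 }. With a regular static field, the two threads would share one counter and the values would depend on how their increments interleave. One thing to watch out for: an initializer on a ThreadStatic field runs only once, on the first thread that touches the class, so other threads see the default value of the type.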

Comments and Conclusion

These are the most important concurrency constructs and primitives in .NET 3.5… quite a list! And that list does not even include Parallel Extensions, the community technology preview of which was released two weeks ago. (For those unaware, I am a developer on the Microsoft team developing Parallel Extensions.)

As is usual, all code samples in this posting are provided as-is, with no guarantees.

Related

Managed Threading Best Practices [msdn.microsoft.com] covers concurrency gotchas and how to avoid them, which is something I did not get to in this article.
C# 3.0 in a Nutshell [book] contains an in-depth chapter on concurrency. The chapter is also available online.
Concurrent Programming on Windows Vista [book] is available for pre-order, and will be an excellent resource for concurrency on Windows and .NET.

Tags: Concurrency

Posted by Igor Ostrovsky

20 Comments to “Overview of concurrency in .NET Framework 3.5”

Matt says:

June 22, 2008 at 11:26 am

Excellent post! Very informative and detailed.

Another option that a lot of people don’t know about is the Concurrency and Coordination Runtime (CCR) that ships with the Microsoft Robotics Studio. It’s not directly related to robotics, but robotics are a good application for it.

I wrote a little bit about it over here:
http://thevalerios.net/matt/20.....cy-in-net/
It’s definitely worth checking out.
Lee says:

July 9, 2008 at 8:43 pm

This is some very useful information.

I have a question that goes beyond concurrent thread processing. Is there any way to share mapped memory between two or more Windows processes (not threads) using .NET 3.5?
Igor Ostrovsky says:

July 21, 2008 at 1:46 am

Lee: the closest thing to memory sharing between processes in .NET is remoting, as far as I know. If you search the web for the phrase .NET remoting, you will find lots of information on how it works and how to use it.
Phil Hackett says:

July 21, 2008 at 8:12 am

Lee: You might also be able to write yourself some unmanaged code to handle memory sharing, and call into that from .net.
James says:

August 2, 2008 at 3:40 am

Hey Igor, nice article. Your very first code snippet seems to have the main method missing though.
Lixin says:

March 6, 2009 at 8:29 pm

Good article, Simple and Clear. Thanks!
Eduard S. says:

March 9, 2009 at 2:24 pm

Good subject overview!

But [ThreadStatic] makes sense only on static fields, so ThreadAccessCounter example should have:

[ThreadStatic]
private static int count;

otherwise all threads will be seeing the same instance “of count”..

Cheers,
Eduard
Igor Ostrovsky says:

March 10, 2009 at 1:37 am

Thanks, Eduard S. Of course, you are entirely right. I corrected the article.
Bekir says:

August 20, 2009 at 7:01 am

Great article !

As James mentioned earlier, your very first source code lacks a main method, and the threading related part, by the way.

Thanks again for this introductory article !
R.P.Kumar says:

May 7, 2010 at 9:43 pm

thanks its really helpful
developer
Richonet Technologies Pvt. Ltd
http://www.richonet.com
Andrew Borodin says:

June 9, 2010 at 12:39 am

Hello.
I can’t find a name for this structure.

I’m building a tree-like index. Its pages are locked for read during the downward walk and for write during the upward walk, so it’s safe to insert records from multiple threads.
So, data is divided into small buckets of 64k records, which are enqueued in a task queue. My thread pool creates (3.14/2)*(processor count) worker threads.
If there are many concurrent loads, the workers need to be divided among them, so that lots of workers do not get stuck around the root page of one index. Each task should get an equal number of workers.

This is a mix of thread pool and semaphore, or a balancing semaphore.

I didn’t find a standard implementation of such a structure, because I didn’t know what it is called. Currently I’m using my own implementation, but it’d be very good to switch to a common component.

Cheers, Andrey.
raviteja says:

March 10, 2013 at 11:41 pm

i want to know the exact usage of synchronise & synchronise()?
Anonymous says:

January 2, 2017 at 3:12 pm

The Vance Morrison article link is broken, but is now available at https://blogs.msdn.microsoft.com/vancem/2016/02/27/encore-presentation-what-every-dev-must-know-about-multithreaded-apps/