Performance of parallel programs is an interesting – but also tricky – issue. I put together an article for our team blog that talks about the most common reasons why a parallel program may not scale as desired:
Developers ask why one program shows a parallel speedup but another one does not, or how to modify a program so that it scales better on multi-core machines.
The answers tend to vary. In some cases, the problem observed is a clear issue in our code base that is relatively easy to fix. In other cases, the problem is that Parallel Extension does not adapt well to a particular workload. This may be harder to fix on our side, but understanding real-world workloads is a first step to do that, so please continue to send us your feedback. In yet another class of issues, the problem is in the way the program uses Parallel Extensions, and there is little we can do on our side to resolve the issue.
Interestingly, it turns out that there are common patterns that underlie most performance issues, regardless of whether the source of the problem is in our code or the user code. In this blog posting, we will walk though the common performance issues to take into consideration while developing applications using Parallel Extensions.