Uncategorized

How to write a self-printing program

As promised at the end of my recent post, I am going to explain how to implement a program that prints itself, in addition to doing other things (like playing Game of Life).

A self-printing program - also called a quine - is a program that prints out its own source code. I will describe one simple way to implement a quine that can be adapted to just about any programming language. The technique does not depend on any unusual language features, but also does not necessarily yield the shortest possible quine in a particular language.

The main idea behind this quine implementation is simple. The quine will consist of two parts: a definition of a string and the program core. The string will contain the source code of the program core. And, what will the program core do? It will print the string twice: once to print the string definition, and again to print the program core.

Continue reading »

Self-printing Game of Life in C#

Conway’s Game of Life has fascinated computer scientists for decades. Even though its rules are ridiculously simple, Conway’s universe gives rise to a variety of gliders, spaceships, oscillators, glider guns, and other forms of “life”. Self-printing programs are similarly curious, and - rather surprisingly - have an important place in the theory of computation.

What happens when you combine the two? You are about to find out, but one thing is for sure: the geekiness factor should be pretty high.

I wrote a little C# program that contains a Game-of-Life grid. The program advances the game grid to the next generation and prints out a copy of itself, with the grid updated. You can take the output, compile it with a C# compiler, run it, and you’ll get the next generation of the game. You can iterate the process, or change the initial grid state manually.

Continue reading »

Numbers that cannot be computed

Did you know that there are numbers that cannot be computed by any computer program? It is weird, but true.

And by number, I mean just an ordinary real number. As a perhaps unnecessarily simple example, the result of the division 1/7 looks like this:

        0.1428571428571428571428571428571414285714285714285714285714285714…

We can easily implement a program that prints this number. The decimal expansion of 1/7 is infinite, so the program will have to run in an infinite loop to print the “whole” number. Here is a C# implementation:

    static void Main()
    {
        Console.Write(“0.”);
        while (true) Console.Write(“142857″);
    }

Continue reading »

One LINQ operator to rule them all

SelectMany is a fascinating operator in LINQ to Objects. For one thing, it is not as intuitive as most other LINQ operators. MSDN says that SelectMany “projects each element of a sequence to an IEnumerable(T) and flattens the resulting sequences into one sequence.” I still remember reading this description of SelectMany for the first time, and wondering why that would that be useful.

Of course, SelectMany is not only incredibly useful, but also surprisingly powerful. In fact, a variety of LINQ operators are really just constrained versions SelectMany. Select, Concat, Where, Take, Skip, TakeWhile, SkipWhile and Distinct can all be easily rewritten using a single SelectMany.

Continue reading »

Another LINQ puzzle

I was discussing the little LINQ puzzle with Stephen Toub, and he brought up an idea which lead to another puzzle. I like this one even more than the previous one.

Why does the last line throw StackOverflowException?

IEnumerable<int> q = new int[] { 1, 2 };
q = from x in new int[] { 1, 2 }
    from y in q
    select x + y;
q.ToArray();

And, how come the code sample runs just fine if you switch the order of the from clauses?

Little LINQ puzzle

Why does the last line hang?

IEnumerable<int> empty = Enumerable.Empty<int>();
for (int i = 0; i < 40; i++)
{
    empty = empty.Concat(empty);
}
int[] emptyArray = empty.ToArray();

Answer in the comments section. ;-)

For a slightly harder challenge, check out the next puzzle.

Data structure zoo: ordered set

This article is the first one in a series titled Data structure zoo. Each article will give you a “working knowledge” of data structures that solve a particular problem. You won’t necessarily know how to implement each one, but you will have a good idea of the main characteristics of each solution and how to pick among them to solve your particular problem.

Today, I am writing about data structures to represent an ordered set. An ordered set is a common data structure that supports O(log N) lookups, insertions and removals. Ordered set is also sometimes used as an alternative to a hash map, for example in STL’s map.

Continue reading »

Most Common Performance Issues in Parallel Programs

Performance of parallel programs is an interesting - but also tricky - issue. I put together an article for our team blog that talks about the most common reasons why a parallel program may not scale as desired:

Developers ask why one program shows a parallel speedup but another one does not, or how to modify a program so that it scales better on multi-core machines.

The answers tend to vary. In some cases, the problem observed is a clear issue in our code base that is relatively easy to fix. In other cases, the problem is that Parallel Extension does not adapt well to a particular workload. This may be harder to fix on our side, but understanding real-world workloads is a first step to do that, so please continue to send us your feedback. In yet another class of issues, the problem is in the way the program uses Parallel Extensions, and there is little we can do on our side to resolve the issue.

Interestingly, it turns out that there are common patterns that underlie most performance issues, regardless of whether the source of the problem is in our code or the user code. In this blog posting, we will walk though the common performance issues to take into consideration while developing applications using Parallel Extensions.

Read the rest of the article here.

Big Oh in the parallel world

Big-Oh notation is a simple and powerful way to express how running time of a particular algorithm depends on the size of the input. When you say that a particular algorithm runs in O(N2) time, you mean that the number of steps the algorithm takes is proportional to the input size squared. Or, in mathematical terms, there is some fixed constant C, such that to process input of size N, the algorithm needs at most C x N2 steps. One interesting question is how to define a “step”. The beauty of the Big-Oh notation is that any somewhat reasonable definition of a step will do. The step could be a clock cycle, a hardware instruction, or an expression in source code. If an algorithm takes O(N) steps according to one definition of a “step”, it takes O(N) according to all reasonable definitions.

While this is great, you already probably heard it a thousand times. But, what about the complexity of parallel programs? A major departure from the sequential case is that the number of steps an algorithm takes and the actual real-world time may not be proportional. In the parallel case, we may have potentially many cores executing the computation steps!

Continue reading »

Skip lists are fascinating!

Skip lists are a fascinating data structure: very simple, and yet have the same asymptotic efficiency as much more complicated AVL trees and red-black trees. While many standard libraries for various programming languages provide a sorted set data structure, there are numerous problems that require more control over the internal data structure than a sorted set exposes. In this article, I will discuss the asymptotic efficiency of operations on skip lists, the ideas that make them work, and their interesting use cases. And, of course, I will give you the source code for a skip list in C#.

Continue reading »

Next »