In response to last week’s post, several readers mentioned LINQ-like operators they had implemented themselves. I also had ideas for operators that would lead to neat solutions for some problems, so I decided to give it some thought and collect the most useful operators into a reusable library.

My goal was to include operators that are simple to use but applicable to a broad range of problems. I left out operators that I thought were either too complicated to use or too specific to a particular problem domain.

You can download the full source code of the library here (rename the file to ExtendedEnumerable.cs). Read on to find out what it contains.

ReadLinesFrom, WriteLinesTo - I/O in LINQ queries

LINQ is a great programming model for simple file-processing tasks. By treating a file as an enumerable of lines, we can filter, transform and analyze it using various LINQ operators. To support this use case, my library includes several operators that convert between streams and line enumerables. The two most general ones are ReadLinesFrom and WriteLinesTo, which have the following signatures:

public static IEnumerable<string> ReadLinesFrom(TextReader reader)
public static void WriteLinesTo(this IEnumerable<string> lines, TextWriter writer)

However, in most cases you will want to use one of the more specific variants: ReadLinesFromConsole, ReadLinesFromFile, WriteLinesToConsole and WriteLinesToFile. For example, the Grep method below reads a file, keeps only the lines that contain a particular substring, and writes the results to another file:

static void Grep(string inputFile, string outputFile, string substring)
{
    ExtendedEnumerable.ReadLinesFromFile(inputFile)
        .Where(line => line.Contains(substring))
        .WriteLinesToFile(outputFile);
}

Isn’t that neat?
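To give a feel for how little code such operators need, here is a minimal sketch of the reader and writer pair. It is an assumption about one possible implementation, not the code from the downloadable library, and the class name LineIoSketch is made up for illustration.

```csharp
using System.Collections.Generic;
using System.IO;

// Hypothetical sketch; the real library may be implemented differently.
public static class LineIoSketch
{
    // Lazily yields lines until the reader returns null (end of input).
    public static IEnumerable<string> ReadLinesFrom(TextReader reader)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            yield return line;
        }
    }

    // Enumerates the sequence and writes each element on its own line.
    public static void WriteLinesTo(this IEnumerable<string> lines, TextWriter writer)
    {
        foreach (string line in lines)
        {
            writer.WriteLine(line);
        }
    }

    // File variant: the file is opened on first enumeration and
    // closed when enumeration finishes or the enumerator is disposed.
    public static IEnumerable<string> ReadLinesFromFile(string path)
    {
        using (TextReader reader = new StreamReader(path))
        {
            foreach (string line in ReadLinesFrom(reader))
            {
                yield return line;
            }
        }
    }

    // File variant: creates or overwrites the file at the given path.
    public static void WriteLinesToFile(this IEnumerable<string> lines, string path)
    {
        using (TextWriter writer = new StreamWriter(path))
        {
            lines.WriteLinesTo(writer);
        }
    }
}
```

Because the readers are iterator blocks, nothing is read until the query is enumerated, so a pipeline like Grep streams the file line by line instead of loading it into memory first.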

Generate - generate a sequence from a user delegate

In C# 2, generating arbitrary sequences became much more convenient than it was in C# 1. Instead of implementing two classes, one for IEnumerable<T> and one for IEnumerator<T>, you can implement a single method that yields items using the iterator block syntax (i.e. yield statements).

However, I still try to avoid creating a method just to generate a simple sequence, particularly if I use that sequence only in one place in my program. The Generate operator below accepts a delegate which generates the sequence element by element. To signal the end of the sequence, the generator returns null.

Since value types cannot be null, we need one overload for reference types, and another overload that uses a nullable wrapper to handle value types:

public static IEnumerable<T> Generate<T>(Func<T> generator)
    where T : class
public static IEnumerable<T> Generate<T>(Func<Nullable<T>> generator)
    where T : struct

To give a usage example, the ReadLinesFromConsole operator I mentioned above could be implemented as follows:

public static IEnumerable<string> ReadLinesFromConsole()
{
    return ExtendedEnumerable.Generate(() => Console.ReadLine());
}

As another example, this code sample generates an infinite sequence of random integers:

Random rand = new Random();
var randomSeq = ExtendedEnumerable.Generate(() => (int?)rand.Next());

This Generate operator has two disadvantages. First, it cannot be used to generate sequences that contain null values, because null is the terminator of the sequence. Second, it is a bit annoying to have to use the cast in the value-type overload (see the cast to int? in the random-sequence example). These are minor disadvantages, though, and I much prefer using the Generate operator over implementing a new method each time I need to generate a simple sequence.
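The two overloads can be sketched as a pair of iterator blocks. This is only an illustration of the null-terminated approach described above; the class name GenerateSketch is hypothetical, and the library's actual code may differ.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not the library's actual implementation.
public static class GenerateSketch
{
    // Reference types: a null result ends the sequence.
    public static IEnumerable<T> Generate<T>(Func<T> generator)
        where T : class
    {
        T item;
        while ((item = generator()) != null)
        {
            yield return item;
        }
    }

    // Value types: a Nullable<T> without a value ends the sequence.
    public static IEnumerable<T> Generate<T>(Func<Nullable<T>> generator)
        where T : struct
    {
        Nullable<T> item;
        while ((item = generator()).HasValue)
        {
            yield return item.Value;
        }
    }
}
```

The generator is called once per MoveNext, so an infinite sequence such as the random one is safe as long as the consumer stops pulling, for example with Take(10).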

As a side note, apparently Jon Skeet also looked at the problem of generating a sequence from a user’s delegate, and came up with a similar but slightly different solution, which you can find here.

ForEach - execute an action for each element in the sequence

As has been suggested by Magnus Martensson in a comment to my previous posting, as well as by others elsewhere, it is often neat to be able to specify an action at the end of the query using a ForEach operator, rather than having to iterate over the query in a foreach statement.

So, instead of this:

foreach (int x in Enumerable.Range(0,10).Where(i => (i % 2 == 0)).Take(5))
{
    Console.WriteLine(x);
}

You can write this:

Enumerable.Range(0,10).Where(i => (i % 2 == 0)).Take(5)
    .ForEach(i => Console.WriteLine(i));
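An operator like this takes only a few lines. The following is a sketch under the assumption that ForEach is a plain extension method over IEnumerable<T>; the class name ForEachSketch is made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not the library's actual implementation.
public static class ForEachSketch
{
    // Runs the action once for every element, in sequence order.
    public static void ForEach<T>(this IEnumerable<T> source, Action<T> action)
    {
        if (source == null) throw new ArgumentNullException("source");
        if (action == null) throw new ArgumentNullException("action");

        foreach (T item in source)
        {
            action(item);
        }
    }
}
```

Unlike most LINQ operators, this one returns void and enumerates the query immediately, so it only makes sense as the last step of a chain.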

Do - execute side effects in the middle of the query

Sometimes it is useful to add side effects in the middle of a query, rather than at the end. For example, we can log which elements have been processed at a particular stage of the query. The Do operator provides this functionality:

Enumerable.Range(0,10)
    .Do((e) => Console.WriteLine("Processing {0}", e))
    .Select(x => x*2).ToArray();
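A possible shape for such an operator is an iterator block that runs the action and then passes the element through unchanged. This is a sketch, not the library's code, and the class name DoSketch is hypothetical.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not the library's actual implementation.
public static class DoSketch
{
    // Runs the action on each element as it flows past, then yields it unchanged.
    public static IEnumerable<T> Do<T>(this IEnumerable<T> source, Action<T> action)
    {
        foreach (T item in source)
        {
            action(item);
            yield return item;
        }
    }
}
```

With this implementation the side effects are deferred: nothing is logged until the query is enumerated, which is why the example above ends with ToArray().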

Combine - combine two sequences

The Combine operator exists in various functional languages including F#, sometimes under the name Zip or ZipWith. It accepts two sequences as inputs, and combines their elements into a single sequence. So, the first element in sequence 1 and the first element in sequence 2 will be combined to produce the first element in the output sequence, and so forth. The function which combines an element from one sequence with an element from the other sequence is provided by the user. If one of the sequences is longer, the remaining elements in the longer sequence will be ignored.

To compute the pairwise sum between elements in seq1 and seq2, use the Combine operator like this:

IEnumerable<int> sumSeq = seq1.Combine(seq2, (a, b) => a + b);

As another example, to check whether a sequence of integers seq is increasing, use this query:

bool isIncreasing = seq.Combine(seq.Skip(1), (a, b) => a < b).All(x => x);
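The pairing logic can be sketched by walking both enumerators in lockstep. This is an illustrative version under my own naming (CombineSketch is hypothetical), not necessarily how the library does it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not the library's actual implementation.
public static class CombineSketch
{
    // Pairs up elements by position and stops as soon as either input runs out.
    public static IEnumerable<TResult> Combine<TFirst, TSecond, TResult>(
        this IEnumerable<TFirst> first,
        IEnumerable<TSecond> second,
        Func<TFirst, TSecond, TResult> combiner)
    {
        using (IEnumerator<TFirst> e1 = first.GetEnumerator())
        using (IEnumerator<TSecond> e2 = second.GetEnumerator())
        {
            while (e1.MoveNext() && e2.MoveNext())
            {
                yield return combiner(e1.Current, e2.Current);
            }
        }
    }
}
```

Note that the isIncreasing query enumerates seq twice, once directly and once through Skip(1), so it assumes the source can be enumerated repeatedly. On .NET 4 and later, the built-in Enumerable.Zip has the same shape.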

ToStringPretty - convert a sequence to a delimited string

Converting a sequence to a nicely-formatted string is a bit of a pain. The String.Join method definitely helps, but unfortunately it accepts an array of strings, so it does not compose with LINQ very nicely.

My library includes several overloads of the ToStringPretty operator that hide the uninteresting code. Here is an example of use:

Console.WriteLine(Enumerable.Range(0, 10).ToStringPretty("From 0 to 9: [", ",", "]"));

The output of this program is:

From 0 to 9: [0,1,2,3,4,5,6,7,8,9]
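The three-argument overload used above could look roughly like this. It is a sketch that assumes the arguments are a prefix, a delimiter and a suffix, in that order; the class name ToStringPrettySketch and the parameter names are made up.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical sketch; not the library's actual implementation.
public static class ToStringPrettySketch
{
    // Builds "before + item1 + delimiter + item2 + ... + after".
    public static string ToStringPretty<T>(
        this IEnumerable<T> source, string before, string delimiter, string after)
    {
        StringBuilder builder = new StringBuilder(before);
        bool first = true;
        foreach (T item in source)
        {
            if (!first)
            {
                builder.Append(delimiter);
            }
            builder.Append(item); // uses item.ToString(); a null item appends nothing
            first = false;
        }
        return builder.Append(after).ToString();
    }
}
```

Tracking the first element with a flag avoids both a leading and a trailing delimiter without having to buffer the sequence into an array.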

FromEnumerator - convert an enumerator to an enumerable

Several times I have found myself in a situation where I had an enumerator but really needed an enumerable instead. There does not seem to be a simple way to do the conversion in .NET. Hence, my library of operators includes FromEnumerator, which accepts an enumerator and returns an enumerable.

This sample converts enumerator e1 into an enumerable and then iterates over it in a foreach statement:

foreach (int x in ExtendedEnumerable.FromEnumerator(e1)) { … }

And this sample converts enumerator e2 into an enumerable to use it as a data source in a LINQ query:

var query = from x in ExtendedEnumerable.FromEnumerator(e2)
            where x % 2 == 0
            select x;
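One simple way to write such a wrapper is an iterator block that drains the enumerator. The sketch below is an assumption about a possible implementation, with a made-up class name FromEnumeratorSketch.

```csharp
using System.Collections.Generic;

// Hypothetical sketch; not the library's actual implementation.
public static class FromEnumeratorSketch
{
    // Yields whatever elements the enumerator still has left.
    public static IEnumerable<T> FromEnumerator<T>(IEnumerator<T> enumerator)
    {
        while (enumerator.MoveNext())
        {
            yield return enumerator.Current;
        }
    }
}
```

A wrapper written this way shares the single underlying enumerator, so the resulting enumerable is effectively one-shot: a second pass over it finds the enumerator already exhausted and yields nothing.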

Single - convert an item to an enumerable

As I mentioned in my previous posting, I have found converting a single item to an enumerable to be a fairly frequent operation. So, my library includes an operator for the conversion:

IEnumerable<int> e = ExtendedEnumerable.Single(5);
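This one is about as small as an operator can get. A sketch, assuming a plain static method rather than an extension method (the class name SingleSketch is hypothetical):

```csharp
using System.Collections.Generic;

// Hypothetical sketch; not the library's actual implementation.
public static class SingleSketch
{
    // Wraps one item in a sequence of length one.
    public static IEnumerable<T> Single<T>(T item)
    {
        yield return item;
    }
}
```

Writing new[] { item } gives a similar result; the named method mainly reads better inside a longer query.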

Shuffle - randomly shuffle a sequence

I find myself regularly re-implementing the Shuffle operator when I am testing my code. The Shuffle operator accepts a sequence and returns the same elements, randomly rearranged.

This example prints digits 0..9 in a random order:

Enumerable.Range(0, 10).Shuffle().WriteLinesToConsole();
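A common way to implement a shuffle is to buffer the sequence into an array and apply the Fisher-Yates algorithm. The sketch below takes that approach; it is not necessarily what the library does, and the class name ShuffleSketch and the overload taking a Random are assumptions added for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sketch; not the library's actual implementation.
public static class ShuffleSketch
{
    // Shared generator for the convenience overload; Random is not thread-safe.
    private static readonly Random SharedRandom = new Random();

    public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source)
    {
        return Shuffle(source, SharedRandom);
    }

    // Overload with an explicit Random, handy for reproducible tests.
    public static IEnumerable<T> Shuffle<T>(this IEnumerable<T> source, Random random)
    {
        T[] buffer = source.ToArray();

        // Fisher-Yates: swap each position with a random position at or before it.
        for (int i = buffer.Length - 1; i > 0; i--)
        {
            int j = random.Next(i + 1); // 0 <= j <= i
            T temp = buffer[i];
            buffer[i] = buffer[j];
            buffer[j] = temp;
        }
        return buffer;
    }
}
```

Unlike most of the other operators, this version is eager: it has to read the whole input before it can return the first element, so it cannot be applied to infinite sequences.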

Comments and Conclusion

Again, the source code is available for download here. If there are operators that I haven’t included but that you think would be useful, let me know in the comments!
