Tuesday, June 8, 2010

The .Net HashSet

I’m relatively new to .Net, coming from the world of Java. Now, the documentation of the HashSet class says, “The HashSet(Of T) class provides high performance set operations. A set is a collection that contains no duplicate elements, and whose elements are in no particular order.” Fair enough, except that in practice a HashSet often enumerates its elements in insertion order (an implementation detail rather than a documented guarantee). Furthermore, set1.Equals(set2) returns false for two distinct instances even when they contain exactly the same elements, because HashSet does not override Equals and the comparison is by reference.

This doesn’t sound like set semantics to me. Fortunately, you can remedy this with a simple extension method and LINQ:

[sourcecode language="csharp"]
public static bool SetEquals<T>(this HashSet<T> me, HashSet<T> other)
{
    return me.All(other.Contains) && other.All(me.Contains);
}
[/sourcecode]

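As you can see, the LINQ expression in the return statement is just the definition of set equality: every element of each set is contained in the other. Of course, the parameters to this method could be of type ICollection<T> for more generality. Two caveats: an extension method has to be declared inside a static class, and HashSet<T> already has an instance method SetEquals(IEnumerable<T>). Because instance methods take precedence over extension methods, a call such as set1.SetEquals(set2) binds to the built-in one, so give the extension a different name if you want it to be called.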
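
Here is a minimal sketch of the more generic variant, assuming ICollection<T> parameters. The class name CollectionExtensions and the method name HasSameElementsAs are hypothetical, picked only for this illustration; the different method name avoids clashing with the SetEquals method that HashSet<T> already provides.

[sourcecode language="csharp"]
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical container class; extension methods must live in a static class.
public static class CollectionExtensions
{
    // True when each collection contains every element of the other.
    // Duplicates and ordering are ignored, so this is set equality.
    public static bool HasSameElementsAs<T>(this ICollection<T> me, ICollection<T> other)
    {
        return me.All(other.Contains) && other.All(me.Contains);
    }
}

public static class Demo
{
    public static void Main()
    {
        var set1 = new HashSet<int> { 1, 2, 3 };
        var set2 = new HashSet<int> { 3, 2, 1 };

        Console.WriteLine(set1.Equals(set2));            // False: reference comparison
        Console.WriteLine(set1.HasSameElementsAs(set2)); // True: same elements
        Console.WriteLine(set1.SetEquals(set2));         // True: built-in HashSet<T> method
    }
}
[/sourcecode]

Keep in mind that Contains is a linear scan on collections such as List<T>, so this approach is only cheap when both arguments are hash-based.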