HashSet<ReadOnlyCollection<int>> test1 = new HashSet<ReadOnlyCollection<int>> ();
for (int i = 0; i < 10; i++) {
    List<int> temp = new List<int> ();
    for (int j = 1; j < 2; j++) {
        temp.Add (i);
        temp.Add (j);
    }
    test1.Add (temp.AsReadOnly ());
}

Here test1 is {[0,1], [1,1], [2,1], [3,1], [4,1], [5,1], [6,1], [7,1], [8,1], [9,1]}

HashSet<ReadOnlyCollection<int>> test2 = new HashSet<ReadOnlyCollection<int>> ();
for (int i = 5; i < 10; i++) {
    List<int> temp = new List<int> ();
    for (int j = 1; j < 2; j++) {
        temp.Add (i);
        temp.Add (j);
    }
    test2.Add (temp.AsReadOnly ());
}

Here test2 is {[5,1], [6,1], [7,1], [8,1], [9,1]}

test1.ExceptWith(test2);

After doing this, I want test1 to be {[0,1], [1,1], [2,1], [3,1], [4,1]}, but it gives me the original test1.
How can I fix this problem? Or is there another way to do the same thing? Thank you!

2
Short answer: you could define a custom EqualityComparer for your collections. Currently, they are compared by instance (as objects), so every new HashSet or List is different, even if they contain the same elements. - Pac0
Do you understand WHY it is acting like that? - mjwills
@Pac0 Do you mean I should define a class to contain the values and override its Equals and GetHashCode methods? Thank you! - lynkewsw
@mjwills My guess is that when C# hashes a ReadOnlyCollection<int>, say [1,2], it returns different hash values for collections that hold the same values, but I don't know how to fix that. Thank you for your reply! - lynkewsw
If this is the data you actually want to work with, make it a Tuple<int,int> instead. - gnud
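A minimal sketch of the tuple suggestion from the last comment, assuming each element really is just a pair of ints. It uses C# value tuples, `(int, int)`, rather than `Tuple<int,int>`; both compare by value. The `TupleSetDemo` class and `Build` helper are hypothetical names introduced only for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class TupleSetDemo
{
    // Builds the set {(from, 1), ..., (to - 1, 1)}.
    // Value tuples implement Equals and GetHashCode structurally.
    public static HashSet<(int, int)> Build(int from, int to)
    {
        var set = new HashSet<(int, int)>();
        for (int i = from; i < to; i++)
        {
            set.Add((i, 1));
        }
        return set;
    }

    public static void Main()
    {
        var test1 = Build(0, 10);
        var test2 = Build(5, 10);

        test1.ExceptWith(test2);

        Console.WriteLine(test1.Count);            // 5
        Console.WriteLine(test1.Contains((0, 1))); // True
        Console.WriteLine(test1.Contains((5, 1))); // False
    }
}
```

No custom comparer is needed here because the default comparer for tuples already looks at the contained values.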

2 Answers

2
votes

Instances of reference types (classes) in C# are usually compared by reference, not by value. This means that new object() != new object(). In the same way, new List<int>() { 1 } != new List<int>() { 1 }. Structs and primitives, on the other hand, are compared by value, not by reference.

Some objects override their equality method to compare values instead. For example strings: new string(new[] { 'a', 'b', 'c'}) == "abc", even if object.ReferenceEquals(new string(new[] { 'a', 'b', 'c'}), "abc") == false.
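The following sketch puts the statements above side by side. The class and method names (`EqualityDemo` and so on) are made up for this example; the calls themselves are standard .NET.

```csharp
using System;
using System.Collections.Generic;

public static class EqualityDemo
{
    // Two lists with the same content: List<T> does not override Equals,
    // so this falls back to reference equality.
    public static bool ListsEqual()
    {
        var a = new List<int> { 1 };
        var b = new List<int> { 1 };
        return a.Equals(b);
    }

    // Structs such as DateTime compare by value.
    public static bool StructsEqual()
    {
        return new DateTime(2020, 1, 1) == new DateTime(2020, 1, 1);
    }

    // string overrides equality to compare characters.
    public static bool StringsEqual()
    {
        string s = new string(new[] { 'a', 'b', 'c' });
        return s == "abc";
    }

    // The two strings are still distinct objects.
    public static bool StringsSameReference()
    {
        string s = new string(new[] { 'a', 'b', 'c' });
        return object.ReferenceEquals(s, "abc");
    }

    public static void Main()
    {
        Console.WriteLine(ListsEqual());           // False
        Console.WriteLine(StructsEqual());         // True
        Console.WriteLine(StringsEqual());         // True
        Console.WriteLine(StringsSameReference()); // False
    }
}
```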

But collections, lists, arrays etc. do not. For good reason - when comparing two lists of ints, what do you want to compare? The exact elements, regardless of order? The exact elements in order? The sum of elements? There's not one answer that fits everything. And often you might actually want to check if you have the same object.
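To make the ambiguity concrete, here is a small sketch of three of those notions of "equal" applied to the same lists. The helper names are hypothetical; note that the set-based check ignores duplicates as well as order, which is yet another choice someone has to make.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ListEqualityNotions
{
    // Same elements in the same order.
    public static bool SameElementsInOrder(List<int> a, List<int> b)
        => a.SequenceEqual(b);

    // Same distinct elements, ignoring order and duplicates.
    public static bool SameDistinctElements(List<int> a, List<int> b)
        => new HashSet<int>(a).SetEquals(b);

    // Same sum of elements.
    public static bool SameSum(List<int> a, List<int> b)
        => a.Sum() == b.Sum();

    public static void Main()
    {
        var a = new List<int> { 1, 2, 3 };
        var b = new List<int> { 3, 2, 1 };
        var c = new List<int> { 2, 4 };

        Console.WriteLine(SameElementsInOrder(a, b));  // False
        Console.WriteLine(SameDistinctElements(a, b)); // True
        Console.WriteLine(SameSum(a, b));              // True
        Console.WriteLine(SameDistinctElements(a, c)); // False
        Console.WriteLine(SameSum(a, c));              // True
    }
}
```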

When working with collections or LINQ, you can often specify a custom 'comparer' that handles comparisons the way you want. The collection methods then use this 'comparer' whenever they need to compare two elements.
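As a quick illustration of passing a comparer, the sketch below uses the built-in `StringComparer` instances with LINQ's `Distinct` and with the `HashSet<T>` constructor. The `ComparerParameterDemo` class and `CountDistinct` helper are names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ComparerParameterDemo
{
    // Counts distinct words according to whatever comparer the caller supplies.
    public static int CountDistinct(IEnumerable<string> words, IEqualityComparer<string> comparer)
        => words.Distinct(comparer).Count();

    public static void Main()
    {
        var words = new[] { "apple", "Apple", "APPLE", "pear" };

        Console.WriteLine(CountDistinct(words, StringComparer.Ordinal));           // 4
        Console.WriteLine(CountDistinct(words, StringComparer.OrdinalIgnoreCase)); // 2

        // The same comparer can be given to a HashSet; ExceptWith then uses it too.
        var set = new HashSet<string>(words, StringComparer.OrdinalIgnoreCase);
        set.ExceptWith(new[] { "PEAR" });
        Console.WriteLine(set.Count); // 1
    }
}
```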

A very simple comparer that works on any IReadOnlyCollection<T> (which includes ReadOnlyCollection<T>, List<T> and arrays) might look like this:

using System.Collections.Generic;
using System.Linq;

class ROCollectionComparer<T> : IEqualityComparer<IReadOnlyCollection<T>>
{
    // Comparer used for the individual elements of each collection.
    private readonly IEqualityComparer<T> elementComparer;

    public ROCollectionComparer() : this(EqualityComparer<T>.Default) {}
    public ROCollectionComparer(IEqualityComparer<T> elementComparer) {
        this.elementComparer = elementComparer;
    }

    public bool Equals(IReadOnlyCollection<T> x, IReadOnlyCollection<T> y)
    {
        if (x == null && y == null) return true;
        if (x == null || y == null) return false;
        if (object.ReferenceEquals(x, y)) return true;

        // Equal when both have the same length and the same elements in the same order.
        return x.Count == y.Count &&
            x.SequenceEqual(y, elementComparer);
    }

    public int GetHashCode(IReadOnlyCollection<T> obj)
    {
        if (obj == null) return 0;

        // Simplistic implementation: hashes only the count and the first element.
        // Collisions are resolved by Equals, so it should be OK-ish when just looking for equality.
        return (obj.Count, obj.Count == 0 ? 0 : elementComparer.GetHashCode(obj.First())).GetHashCode();
    }
}

And then you can compare the behavior of the default equality check, and your custom one:

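// Default equality: arrays are compared by reference, so nothing is removed.
var std = new HashSet<int[]>(new[] { new[] { 1, 2 }, new[] { 2, 2 } });
std.ExceptWith(new[] { new[] { 2, 2 } });
std.Dump(); // Dump() is an output helper from tools such as LINQPad; std still holds both arrays

// Custom comparer: arrays are compared element by element, so { 2, 2 } is removed.
var custom = new HashSet<int[]>(new[] { new[] { 1, 2 }, new[] { 2, 2 } }, new ROCollectionComparer<int>());
custom.ExceptWith(new[] { new[] { 2, 2 } });
custom.ExceptWith(new[] { new int[] { } }); // the set holds no empty array, so this removes nothing
custom.Dump(); // only { 1, 2 } remains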

You can test the whole thing in this fiddle.
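Applied to the sets from the question, the same idea might look like the sketch below. To keep it self-contained it uses a deliberately simplified, hypothetical comparer (`IntSequenceComparer`) written just for `ReadOnlyCollection<int>`; the generic comparer above could be passed to the `HashSet` constructor in the same way.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;
using System.Linq;

// Hypothetical comparer for this sketch: two collections are equal
// when they contain the same ints in the same order.
public sealed class IntSequenceComparer : IEqualityComparer<ReadOnlyCollection<int>>
{
    public bool Equals(ReadOnlyCollection<int> x, ReadOnlyCollection<int> y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(ReadOnlyCollection<int> obj)
    {
        if (obj == null) return 0;
        int hash = 17;
        foreach (int item in obj)
        {
            hash = unchecked(hash * 31 + item);
        }
        return hash;
    }
}

public static class QuestionDemo
{
    // Builds {[from,1], ..., [to-1,1]} in a set that compares by content.
    public static HashSet<ReadOnlyCollection<int>> Build(int from, int to)
    {
        var set = new HashSet<ReadOnlyCollection<int>>(new IntSequenceComparer());
        for (int i = from; i < to; i++)
        {
            set.Add(new List<int> { i, 1 }.AsReadOnly());
        }
        return set;
    }

    public static void Main()
    {
        var test1 = Build(0, 10);
        var test2 = Build(5, 10);

        test1.ExceptWith(test2);

        Console.WriteLine(test1.Count); // 5
        foreach (var item in test1.OrderBy(c => c[0]))
        {
            Console.WriteLine("[" + string.Join(",", item) + "]");
        }
    }
}
```

The important part is that the comparer is passed to the constructor of the set on which ExceptWith is called, since that set's comparer decides what counts as a match.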

0
votes

Here is the implementation of ExceptWith:

https://github.com/microsoft/referencesource/blob/3b1eaf5203992df69de44c783a3eda37d3d4cd10/System.Core/System/Collections/Generic/HashSet.cs#L532

What it actually does is:

 // remove every element in other from this
 foreach (T element in other) {
    Remove(element);
 }

And Remove implementation:

https://github.com/microsoft/referencesource/blob/3b1eaf5203992df69de44c783a3eda37d3d4cd10/System.Core/System/Collections/Generic/HashSet.cs#L287

 if (m_slots[i].hashCode == hashCode && m_comparer.Equals(m_slots[i].value, item)) {

So if the hash code is not the same (or the comparer's Equals returns false), Remove will do nothing.

A small test to prove that hashcode is not the same:

    List<int> temp = new List<int> ();
    temp.Add (1);
    temp.Add (2);

    HashSet<ReadOnlyCollection<int>> test1 = new HashSet<ReadOnlyCollection<int>> ();
    HashSet<ReadOnlyCollection<int>> test2 = new HashSet<ReadOnlyCollection<int>> ();
    // Each AsReadOnly () call creates a new wrapper object around the same list.
    test1.Add (temp.AsReadOnly ());
    test2.Add (temp.AsReadOnly ());

    // Compares the default (object-based) hash codes of the two wrappers.
    Console.WriteLine(test1.First().GetHashCode() == test2.First().GetHashCode());
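The effect on Remove itself can be checked directly. This is a small sketch with made-up names (`RemoveDemo` and its two methods): removing a different wrapper with the same content fails, while removing the exact instance that was added succeeds.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class RemoveDemo
{
    // Tries to remove a second wrapper with the same content.
    // With the default comparer, the two wrappers are never equal.
    public static bool RemoveEqualContentWrapper()
    {
        var temp = new List<int> { 1, 2 };
        var set = new HashSet<ReadOnlyCollection<int>>();
        set.Add(temp.AsReadOnly());
        return set.Remove(temp.AsReadOnly());
    }

    // Removes the very same instance that was added.
    public static bool RemoveSameInstance()
    {
        var temp = new List<int> { 1, 2 };
        ReadOnlyCollection<int> wrapper = temp.AsReadOnly();
        var set = new HashSet<ReadOnlyCollection<int>>();
        set.Add(wrapper);
        return set.Remove(wrapper);
    }

    public static void Main()
    {
        Console.WriteLine(RemoveEqualContentWrapper()); // False
        Console.WriteLine(RemoveSameInstance());        // True
    }
}
```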