3
votes

After some amount of stress, I created the following generic function:

func removeDupes<T : Hashable > (inout inputCollection : [T] ) -> [T] {
  var hashMap = [T : Bool]()
  var output = 0
  for thing in inputCollection {
    if !hashMap[thing] {
      hashMap[thing] = true
      inputCollection[output++] = thing
    }
  }
  while (inputCollection.count > output) {
    inputCollection.removeLast()
  }
  return inputCollection
}

So when you do:

var names = ["Bob", "Carol", "Bob", "Bob", "Carol", "Ted", "Ted", "Alice", "Ted", "Alice"]
removeDupes(&names)

names will contain: ["Bob", "Carol","Ted", "Alice"]

Now I'd like to generically add "removeDupes" as an extension method on Array and I'm floundering with the syntax, since I have to constrain it to be an array of Hashable items.

I was hoping I could declare it like this:

extension Array {
  func removeDupes< T : Hashable> () -> [T] {
    return removeDupes(&self)
  }
}

But I get the error:

Array is not convertible to '@lvalue inout $T5'

I suspect the answer is either "you idiot, do this..." or "you can't do it"

Which will it be? :-D

4

4 Answers

3
votes

Array class is declared like this:

public struct Array<Element>

Since Swift 2.0 you can create extension methods only for instances where the generic Element parameter conforms to a protocol:

extension Array where Element: Hashable {
    @warn_unused_result
    func removeDupes() -> [Element] {
        var hashMap = [Element : Bool]()
        var result = [Element]()

        for thing in self {
            if hashMap[thing] == nil {
                hashMap[thing] = true
                result.append(thing)
            }
        }
        return result
    }
}

Notice the where statement in the extension declaration. This way, the removeDupes method is only declared for arrays of Hashable elements:

let names = ["Bob", "Carol", "Bob", "Bob", "Carol", "Ted", "Ted", "Alice", "Ted", "Alice"]
let uniqueNames = names.removeDupes()   // returns ["Bob", "Carol", "Ted", "Alice"]
2
votes

You can't define a templated method for a generic that is more restrictive than the generic itself (only works with some forms of the generic).

Also, when you are defining T in your extension method, you are defining a new T unrelated to the T defined in the Array. There is no need to redefine T and ultimately that is the problem you are running into. It is fine do define new templates within a generic, but there is no way to relate those templates to both T and a protocol.

Your original function is the best way to implement that, although it would potentially be better to return a copy of the array without the duplicates instead of modifying the original. This makes the code more clear and is better for multithreading (which may be important for a method that does this much processing).

2
votes

Array is defined as Array<T>, which means that an Array can hold any type T, and T does not necessarily conform to the Hashable protocol. Therefore, you cannot apply methods that require this constraint. Unfortunately, you can also not check against or downcast to this protocol, because Hashable was not declared with the @objc attribute.

It is possible to remove duplicates by using a Dictionary, but you cannot use T for keys, for the same reason mentioned above. However, if T is uniquely representable as a String, this should work:

extension Array {
    func unique()->Array<T> {
        var buffer = Dictionary<String,T>()
        for it in self {
            buffer["\(it)"] = it
        }
        return Array(buffer.values)
    } 
}
0
votes

Here's another way to do it, although as drewag suggests, this one returns a copy. The global function takes any SequenceType, but always returns an Array because SequenceType is a very general type that does not allow appending values.

public func unique<H: Hashable, S: SequenceType where S.Generator.Element == H>(sequence: S) -> [H] {
    var hashMap = [H: Bool]()
    var result = [H]()
    for elem in sequence {
        let key = hashMap[elem]
        if key == nil {
            hashMap[elem] = true
            result.append(elem)
        }
    }
    return result
}

Then, in your Array extension:

public func unique<H: Hashable>() -> [H] {
    return Yeah.unique(map({ $0 as H }))
}

I'm not crazy about this solution because the array is traversed twice: once to map it to Hashable, and then again to iterate it to remove the duplicates. The advantage of it is that by delegating to a global function, you can very quickly add a unique method to any class that adopts SequenceType. I also don't like the fact that you are forced to return an Array. The way to solve that is probably to have an overload of the global unique with a closure that knows how to append a value. Food for thought.