24
votes

Is there any efficient way to get intersection of two slices in Go?

I want to avoid nested for loop like solution
slice1 := []string{"foo", "bar","hello"}
slice2 := []string{"foo", "bar"}

intersection(slice1, slice2)
=> ["foo", "bar"]
order of string does not matter
5
It looks like you are looking for this set library: github.com/deckarep/golang-setAkavall

5 Answers

36
votes

How do I get the intersection between two arrays as a new array?

  • Simple Intersection: Compare each element in A to each in B (O(n^2))
  • Hash Intersection: Put them into a hash table (O(n))
  • Sorted Intersection: Sort A and do an optimized intersection (O(n*log(n)))

All of which are implemented here

https://github.com/juliangruber/go-intersect

2
votes

It's a best method for intersection two slice. Time complexity is too low.

Time Complexity : O(m+n)

m = length of first slice.

n = length of second slice.

func intersection(s1, s2 []string) (inter []string) {
    hash := make(map[string]bool)
    for _, e := range s1 {
        hash[e] = true
    }
    for _, e := range s2 {
        // If elements present in the hashmap then append intersection list.
        if hash[e] {
            inter = append(inter, e)
        }
    }
    //Remove dups from slice.
    inter = removeDups(inter)
    return
}

//Remove dups from slice.
func removeDups(elements []string)(nodups []string) {
    encountered := make(map[string]bool)
    for _, element := range elements {
        if !encountered[element] {
            nodups = append(nodups, element)
            encountered[element] = true
        }
    }
    return
}
1
votes

if there exists no blank in your []string, maybe you need this simple code:

func filter(src []string) (res []string) {
    for _, s := range src {
        newStr := strings.Join(res, " ")
        if !strings.Contains(newStr, s) {
            res = append(res, s)
        }
    }
    return
}

func intersections(section1, section2 []string) (intersection []string) {
    str1 := strings.Join(filter(section1), " ")
    for _, s := range filter(section2) {
        if strings.Contains(str1, s) {
            intersection = append(intersection, s)
        }
    }
    return
}
1
votes

https://github.com/viant/toolbox/blob/master/collections.go

Another O(m+n) Time Complexity solution that uses a hashmap. It has two differences compared to the other solutions discussed here.

  • Passing the target slice as a parameter instead of new slice returned
  • Faster to use for commonly used types like string/int instead of reflection for all
-5
votes

Yes there are a few different ways to go about it.. Here's an example that can be optimized.

package main

import "fmt"

func intersection(a []string, b []string) (inter []string) {
    // interacting on the smallest list first can potentailly be faster...but not by much, worse case is the same
    low, high := a, b
    if len(a) > len(b) {
        low = b
        high = a
    }

    done := false
    for i, l := range low {
        for j, h := range high {
            // get future index values
            f1 := i + 1
            f2 := j + 1
            if l == h {
                inter = append(inter, h)
                if f1 < len(low) && f2 < len(high) {
                    // if the future values aren't the same then that's the end of the intersection
                    if low[f1] != high[f2] {
                        done = true
                    }
                }
                // we don't want to interate on the entire list everytime, so remove the parts we already looped on will make it faster each pass
                high = high[:j+copy(high[j:], high[j+1:])]
                break
            }
        }
        // nothing in the future so we are done
        if done {
            break
        }
    }
    return
}

func main() {
    slice1 := []string{"foo", "bar", "hello", "bar"}
    slice2 := []string{"foo", "bar"}
    fmt.Printf("%+v\n", intersection(slice1, slice2))
}

Now the intersection method defined above will only operate on slices of strings, like your example.. You can in theory create a definition that looks like this func intersection(a []interface, b []interface) (inter []interface), however you would be relying on reflection and type casting so that you can compare, which will add latency and make your code harder to read. It's probably easier to maintain and read to write a separate function for each type you care about.

func intersectionString(a []string, b []string) (inter []string),

func intersectionInt(a []int, b []int) (inter []int),

func intersectionFloat64(a []Float64, b []Float64) (inter []Float64), ..ect

You can then create your own package and reuse once you settle how you want to implement it.

package intersection

func String(a []string, b []string) (inter []string)

func Int(a []int, b []int) (inter []int)

func Float64(a []Float64, b []Float64) (inter []Float64)