0
votes

The following example is taken from the Donovan/Kernighan book:

func makeThumbnails6(filenames <-chan string) int64 {
    sizes := make(chan int64)
    var wg sync.WaitGroup // number of working goroutines
    for f := range filenames {
        wg.Add(1)
        // worker
        go func(f string) {
            defer wg.Done()
            thumb, err := thumbnail.ImageFile(f)
            if err != nil {
                log.Println(err)
                return
            }
            info, _ := os.Stat(thumb) // OK to ignore error
            sizes <- info.Size()
        }(f)
    }
    // closer
    go func() {
        wg.Wait()
        close(sizes)
    }()
    var total int64
    for size := range sizes {
        total += size
    }
    return total
}

And the book states:

"These two operations, wait and close, must be concurrent with the loop over sizes. Consider the alternatives: if the wait operation were placed in the main goroutine before the loop, it would never end"

This is what I do not understand - if they are not put in separate goroutine, then wg.Wait will block the main goroutine, so close(sizes) will happen when all other goroutines finished. Closing the sizes channel will still allow the loop to read all the already sent messages/ from channel, right?

1

1 Answers

1
votes

Closing the sizes channel will still allow the loop to read all the already sent messages/ from channel, right?

Yes, but that's not the problem. All goroutines are waiting to read from channels (and therefore, nobody will ever write to them). So the process will deadlock if sizes is unbuffered. For the workers to complete, something needs to read from it. For wg.Wait() to complete, the workers need to complete.

But also, the range sizes can't complete (eg find an empty, closed channel) until close(sizes) happens, which can't complete until the workers are complete (because they're the ones writing to sizes).

So it's wg.Wait() and close(sizes) both have to complete before range sizes that happen concurrently.