I have a very large file and I need to process each line (the lines are independent of each other). How can I use goroutines (or should I not use them?) to read the file in the fastest way?
3 Answers
As long as your hard disk is orders of magnitude slower than your CPU, which is still quite a common situation, you cannot make reading a file from a single disk any faster by throwing more CPU cycles at it. (This assumes cold file caches and/or a file much bigger than all available file-cache memory.)
Because disk I/O, not CPU time, is the limiting factor in pretty much all such cases, you will not gain pure read throughput by using goroutines.
Instead, check whether you can use concurrency one step later, after a line has been read. If processing a line takes some computation or waiting (maybe you analyse it, or send it somewhere else), concurrency may be useful: pass the line to one or more other goroutines so that reading can continue in this goroutine.
Also, try to read data in memory-page-sized blocks to maximize throughput (reading two half pages is slower than reading one full page). The page size depends on your OS/kernel configuration.