Consider the following simple Haskell program, which reads a file as a bytestring and writes that bytestring to the file tmp.tmp:

module Main
  where
import System.Environment
import qualified Data.ByteString.Lazy as B

main :: IO ()
main = do
  [file] <- getArgs
  bs <- B.readFile file
  B.writeFile "tmp.tmp" bs
  putStrLn "done"

It is compiled to an executable named tmptmp.

My computer has two drives: the C drive and the U drive. The U drive is a network drive, and it is currently offline.

Now, let's try tmptmp.

When I run it from C, there's no problem; I run it two times below, the first time with a file on C and the second time with a file on U:

C:\HaskellProjects\imagelength> tmptmp LICENSE
done

C:\HaskellProjects\imagelength> tmptmp U:\Data\ztemp\test.xlsx
done

Now I run it from U, with a file on the C drive, no problem:

U:\Data\ztemp> tmptmp C:\HaskellProjects\imagelength\LICENSE
done

The problem occurs when I run it from U with a file on the U drive:

U:\Data\ztemp> tmptmp test.xlsx
tmptmp: tmp.tmp: openBinaryFile: resource busy (file is locked)

If in my program I use strict bytestrings instead of lazy bytestrings (by replacing Data.ByteString.Lazy with Data.ByteString), this problem does not occur anymore.

I'd like to understand why this happens. Is there an explanation? In particular, I would like to know how to solve this issue while still using lazy bytestrings.

EDIT

To be perhaps more precise, the problem still occurs with this program:

import System.Environment (getArgs)
import qualified Data.ByteString as SB
import qualified Data.ByteString.Lazy as LB

main :: IO ()
main = do
  [file] <- getArgs
  bs <- LB.readFile file
  SB.writeFile "tmp.tmp" (LB.toStrict bs)
  putStrLn "done"

while the problem disappears with:

  bs <- SB.readFile file
  LB.writeFile "tmp.tmp" (LB.fromStrict bs)

So it looks like the laziness of readFile is what causes the problem.

1. Does it work if you give it an absolute path (i.e. cd U:/ ; tmptmp U:/<..>/test.xlsx)? Who knows, this could be it; Windows is weird sometimes. 2. What do you mean by "this network drive is offline"? I'd like to try to reproduce this, but I'm not sure how one accesses a network drive which is offline (clearly I misunderstand the meaning of "offline" here!). 3. Why do you need to use lazy BS? It seems you've discovered that strict is the right tool for the job. 4. Does it work if you force the input (i.e. evaluate (length bs)) before the write? - user2407038
Hi @user2407038. 1) No. 2) This is my work laptop and I'm not connected to the domain. In Windows Explorer there is a "Work offline / Work online" button. Click on "Work offline" if you want to reproduce. 3) This is just a minimal reproducible example. In real life, I'm using the xlsx library, which deals with lazy bytestrings. 4) I didn't know the evaluate function, I'll try. - Stéphane Laurent
2) Or simply disconnect your computer from Internet. - Stéphane Laurent
I've just solved my real-life issue by using the strategy from the last point of my edit, with SB.readFile then LB.fromStrict. But obviously that does not provide an explanation. - Stéphane Laurent
Unfortunately, I can't reproduce this (on Windows 7). I think it is because I don't have an actual remote location which I can access this way, but Windows allowed me to "Map network drive" with a local (shared) folder. With this setup, there is no "Work offline" button, and it worked just fine with the lazy ByteString. - user2407038

1 Answer

As per the most recent Data.ByteString.Lazy docs:

Using lazy I/O functions like readFile or hGetContents means that the order of operations such as closing the file handle is left at the discretion of the RTS.

The example given with the offline network drive presumably leads to the RTS continuing from readFile without closing the file. The docs, which have an almost identical example, say that

When writeFile is executed next, [tmp.tmp] is still open for reading and the RTS takes care to avoid simultaneously opening it for writing, instead returning the error.
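
To see the behaviour the docs describe in isolation, here is a minimal sketch (my own, not the example from the docs). It lazily reads a scratch file and immediately tries to write the same path. The helper name rewriteInPlace and the file name lazy-demo.txt are made up for illustration. With GHC's usual file locking I would expect the write to fail with a "resource busy (file is locked)" error, but this has not been checked against the offline network drive setup from the question.

```haskell
module Main where

import Control.Exception (IOException, try)
import qualified Data.ByteString.Lazy as LB

-- | Hypothetical helper: lazily read a file, then try to write the
-- contents back to the same path, returning any IOException raised.
rewriteInPlace :: FilePath -> IO (Either IOException ())
rewriteInPlace path = do
  -- Lazy read: the handle is opened here, but nothing has been consumed
  -- yet, so it has not been closed when readFile returns.
  bs <- LB.readFile path
  -- Opening the same path for writing while the read handle is still
  -- open is what the RTS refuses to do.
  try (LB.writeFile path bs)

main :: IO ()
main = do
  let path = "lazy-demo.txt" -- scratch file name, an assumption
  writeFile path "hello"
  result <- rewriteInPlace path
  case result of
    Left err -> putStrLn ("write failed: " ++ show err)
    Right () -> putStrLn "write succeeded"
```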

As far as I am aware, there is no solution to this within Data.ByteString.Lazy itself. Both your solution (using the strict read) and other packages are suggested in the docs. Reading and writing the same file can sometimes work, but there is no guarantee.
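
For reference, the strict-read workaround could look roughly like the sketch below. The helper names readFileStrictly and copyStrictly are hypothetical, chosen only for this example. The idea is that Data.ByteString.readFile reads the whole file and closes the handle before returning, and LB.fromStrict then gives you a lazy ByteString for libraries that expect one.

```haskell
module Main where

import System.Environment (getArgs)
import qualified Data.ByteString as SB
import qualified Data.ByteString.Lazy as LB

-- | Hypothetical helper: read the whole file strictly (the handle is
-- closed once SB.readFile returns) and present it as a lazy ByteString.
readFileStrictly :: FilePath -> IO LB.ByteString
readFileStrictly path = LB.fromStrict <$> SB.readFile path

-- | Hypothetical helper: copy src to dst via a strict read.
copyStrictly :: FilePath -> FilePath -> IO ()
copyStrictly src dst = do
  bs <- readFileStrictly src
  LB.writeFile dst bs

main :: IO ()
main = do
  [file] <- getArgs
  copyStrictly file "tmp.tmp"
  putStrLn "done"
```

The trade-off is that the whole file is held in memory at once, which gives up the streaming behaviour that lazy I/O would otherwise provide.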