2
votes

I have multiple threads which are serializing my 'Data' objects to files. The filename is based on two fields of the object:

  class Data {
    org.joda.time.DateTime time;
    String title;

    public String getFilename() {
      return time.toString() + '_' + title + ".xml";
    }
  }

It is possible that 2 Data objects will have the same 'time' and 'title', and so the same filename.

This is acceptable, and I'm happy for either to be saved. (They're probably the same Data object anyway if those are the same)

My problem is that two (or more) threads are writing to a file AT THE SAME TIME, causing malformed XML.

I had a look at java.nio.channels.FileLock, but its locks are held on behalf of the whole JVM, so it is specifically NOT suitable for coordinating multiple threads within the same VM.

I could synchronize on DataIO.class (but that will cause a HUGE overhead, since I really only want to synchronize on the individual File).

Synchronizing on the File object will be useless, as multiple File objects can represent the same System-File.
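
To illustrate that last point, here is a minimal sketch (the class name FileIdentityDemo, the helper sameMonitor, and the file name are made up for this example): two File instances built from the same path are equal according to equals(), yet they are separate objects, so each has its own monitor and synchronizing on them does not exclude anything.

```java
import java.io.File;

class FileIdentityDemo {
    // True only when both references point at the very same object,
    // which is what matters for synchronized (they would share one monitor).
    static boolean sameMonitor(Object a, Object b) {
        return a == b;
    }

    public static void main(String[] args) {
        File first = new File("2012-01-01_title.xml");   // hypothetical file name
        File second = new File("2012-01-01_title.xml");

        System.out.println(first.equals(second));        // true: same path
        System.out.println(sameMonitor(first, second));  // false: two separate objects
    }
}
```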

Code Follows:

class DataIO {
  public void writeDataToFile(Data data, String filename, boolean overwrite) throws IOException {
    File file = new File(filename);
    writeDataToFile(data, file, overwrite);
  }

  public void writeDataToFile(Data data, File file, boolean overwrite) throws IOException {
    if (file.exists()) {
      if (overwrite) {
        if (!file.delete()) {
          throw new IOException("Failed to delete the file, for overwriting: " + file);
        }
      } else {
        throw new IOException("File " + file + " already exists, and overwrite flag is set to false.");
      }
    }

    File parentFile = file.getParentFile();
    if (parentFile != null) {
      parentFile.mkdirs();
    }

    file.createNewFile();

    if (!file.canWrite()) {
      throw new IOException("You do not have permission to write to the file: " + file);
    }

    FileOutputStream fos = new FileOutputStream(file, false);
    try {
      writeDataToStream(data, fos);
      logger.debug("Successfully wrote Data to file: " + file.getAbsolutePath());
    } finally {
      fos.close();
    }
  }
}

3 Answers

1
votes

You could intern() the string that is the filename. Then synchronise on the interned string.

class DataIO {
  public void writeDataToFile(Data data, String filename, boolean overwrite) throws IOException {
    // Equal filenames intern to the same String instance, so they share one monitor.
    synchronized (filename.intern()) {
      File file = new File(filename);
      writeDataToFile(data, file, overwrite);
    }
  }
}
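
Why this works can be shown with a small sketch (InternLockDemo and lockFor are hypothetical names, and the filename is invented): two equal strings built at runtime are different objects, but intern() maps both to one canonical instance, which is what the synchronized block then locks on.

```java
class InternLockDemo {
    // Returns the canonical String instance to lock on for a given filename.
    static Object lockFor(String filename) {
        return filename.intern();
    }

    public static void main(String[] args) {
        // Built at runtime so the two strings start out as distinct objects.
        String a = new StringBuilder("2012-01-01").append("_title.xml").toString();
        String b = new StringBuilder("2012-01-01").append("_title.xml").toString();

        System.out.println(a == b);                    // false: different objects
        System.out.println(lockFor(a) == lockFor(b));  // true: same canonical instance
    }
}
```

One caveat: interned strings are shared across the whole JVM, so any unrelated code that happens to synchronize on an equal interned string (a string literal, for instance) would contend for the same lock.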
2
votes

If I am reading this correctly you have a Data object that represents a single file.

You can consider keeping one lock per Data object, so that only writers of the same file block each other. Possibly with a ConcurrentHashMap of locks:

ConcurrentMap<Data,Lock> lockMap = new ConcurrentHashMap<Data,Lock>();

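Now, when you want to write this object, you can do:

// get() alone would return null the first time, so register a lock atomically.
Lock newLock = new ReentrantLock();   // java.util.concurrent.locks.ReentrantLock
Lock lock = lockMap.putIfAbsent(someMyDataObject, newLock);
if (lock == null) {
   lock = newLock;   // no lock was registered before; ours is now the shared one
}
lock.lock();
try{
   //write object here
}finally{
   lock.unlock();
}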

Keep in mind you would have to write the hashCode and equals methods of Data based on the title and DateTime, so that two Data objects with the same filename map to the same lock.
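
A self-contained sketch of the same idea follows. As an assumption made to keep it runnable without Joda-Time, it is keyed on the filename string rather than on the Data object; FileLocks, lockFor and withLock are hypothetical names, not part of any library.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

class FileLocks {
    // One lock per filename; entries are never removed in this sketch.
    private static final ConcurrentMap<String, Lock> LOCKS =
            new ConcurrentHashMap<String, Lock>();

    // Returns the single lock shared by every caller that passes an equal filename.
    static Lock lockFor(String filename) {
        Lock fresh = new ReentrantLock();
        Lock existing = LOCKS.putIfAbsent(filename, fresh);
        return existing != null ? existing : fresh;
    }

    // Runs the given write while holding the lock for that filename.
    static void withLock(String filename, Runnable write) {
        Lock lock = lockFor(filename);
        lock.lock();
        try {
            write.run();
        } finally {
            lock.unlock();
        }
    }
}
```

A caller would wrap the actual file write in the Runnable passed to withLock. Writers of different filenames never block each other, while writers of the same filename take turns.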

0
votes

I agree that using synchronization is the technique you should use. What you need is a distinct object for each file permutation, and more importantly the same object each time. One option might be to create a class called FileLock:

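import java.util.Hashtable;
import org.joda.time.DateTime;

public class FileLock {
    // Both fields are assumed to be non-null.
    private final DateTime time;
    private final String title;

    public FileLock(DateTime time, String title) {
        this.time = time;
        this.title = title;
    }

    // Two locks are equal when they describe the same file (same time and title).
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof FileLock)) return false;
        FileLock other = (FileLock) o;
        return time.equals(other.time) && title.equals(other.title);
    }

    @Override
    public int hashCode() {
        return 31 * time.hashCode() + title.hashCode();
    }

    static Hashtable<FileLock, FileLock> uniqueLocks = new Hashtable<FileLock, FileLock>();
    static Object lockObject = new Object();

    // Always returns the same FileLock instance for an equal time/title pair.
    public static FileLock getLock(DateTime time, String title) {
        synchronized (lockObject) {
            FileLock lock = new FileLock(time, title);
            if (uniqueLocks.containsKey(lock)) {
                return uniqueLocks.get(lock);
            }
            else {
                uniqueLocks.put(lock, lock);
                return lock;
            }
        }
    }
}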

Then callers would use it like:

synchronized (FileLock.getLock(time, title)) {
    ...
}

Bear in mind this has a memory leak since the Hashtable keeps growing with new file/time permutations. If you need to, you could modify this technique so that callers of getLock also invoke a releaseLock method that you use to keep the Hashtable clean.
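
One way the releaseLock idea might look is sketched below, using a simple reference count. Everything here is an assumption for illustration: the class name CountedLocks, keying on the filename string instead of time and title, and the size() helper that exists only to show the table shrinking.

```java
import java.util.HashMap;
import java.util.Map;

class CountedLocks {
    // A monitor object plus the number of callers that have not released it yet.
    private static final class Entry {
        final Object monitor = new Object();
        int users;
    }

    private static final Map<String, Entry> ENTRIES = new HashMap<String, Entry>();

    // Returns the shared monitor for this key and records one more user.
    static synchronized Object getLock(String key) {
        Entry e = ENTRIES.get(key);
        if (e == null) {
            e = new Entry();
            ENTRIES.put(key, e);
        }
        e.users++;
        return e.monitor;
    }

    // Records one fewer user and drops the entry once nobody needs it.
    static synchronized void releaseLock(String key) {
        Entry e = ENTRIES.get(key);
        if (e != null && --e.users == 0) {
            ENTRIES.remove(key);
        }
    }

    // Number of keys currently tracked (for illustration only).
    static synchronized int size() {
        return ENTRIES.size();
    }
}
```

Callers would obtain the monitor with getLock, synchronize on it around the write, and call releaseLock in a finally block after leaving the synchronized section. An entry is only removed when no caller still holds a reference, so two threads working on the same key always see the same monitor.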