6
votes

I have a huge file with a list of objects written by ObjectOutputStream, one after another.

for (Object obj : currentList){
    oos.writeUnshared(obj);
}

Now I want to read this file using ObjectInputStream. However, I need to read multiple files at the same time, so I can't read the entire file into memory. However, using ObjectInputStream causes a Heap Out Of Memory Error. From what I read, this is caused because ObjectInputStream has a memory leak and maintains references to the read objects even after returning them.

How can I ask ObjectInputStream to not maintain a reference of whatever its reading?

2
I closed the stream while writing it. While reading it, I have around 100 files that I'm reading simultaneously and all those streams are open together. I only close them once the respective files have been read. However, the memory leak is definitely happening in ObjectInputStream since there should be only 100 objects stored in memory at a given time. - copperhead
Why don't you try to read the files in batches. E.g for 100 files, you can make 4 batches, such that initially batch1 will be read which contain 25 file, then batch2 will be read.. and so on.. - Gaurav Gupta

2 Answers

7
votes

A possible solution is to call the method reset() on your ObjectOutputStream: “This will disregard the state of any objects already written to the stream. The state is reset to be the same as a new ObjectOutputStream. The current point in the stream is marked as reset so the corresponding ObjectInputStream will be reset at the same point.” (extracted from the java documentation) Doing a reset on your ObjectOutputStream also resets the ObjectInputStream state.

I assume you can control also your ObjectOutputStreams?

2
votes

When you are using writeUnshared on the writing side you have already done one half of the job. If you now also use readUnshared on the input side rather than readObject, the ObjectInputStream will not maintain references to the objects.

You can use the following program to verify the behavior:

package lib.io;

import java.awt.Button;
import java.io.*;
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.concurrent.ConcurrentHashMap;

public class ObjectInputStreamReferences {
  public static void main(String[] args)
  throws IOException, ClassNotFoundException {
    final int numObjects=1000;
    Serializable s=new Button();
    ByteArrayOutputStream os=new ByteArrayOutputStream();
    try( ObjectOutputStream oos=new ObjectOutputStream(os) ) {
      for(int i=0; i<numObjects; i++) oos.writeUnshared(s);
    }
    final ConcurrentHashMap<WeakReference<?>, Object> map
                                                  =new ConcurrentHashMap<>();
    final ReferenceQueue<Object> q=new ReferenceQueue<>();
    new Thread(new Runnable() {
      public void run() {
        reportCollections(map, q);
      }
    }).start();
    try(ObjectInputStream ois=
        new ObjectInputStream(new ByteArrayInputStream(os.toByteArray()))) {
      for(int i=0; i<numObjects; i++) {
        Object o=ois.readUnshared();
        map.put(new WeakReference<>(o,q), "");
        o=null;
        System.gc();Thread.yield();
      }
    }
    System.exit(0);
  }

  static void reportCollections(
      ConcurrentHashMap<WeakReference<?>, Object> map, ReferenceQueue<?> q) {
    for(;;) try {
      Reference<?> removed = q.remove();
      System.out.println("one object collected");
      map.remove(removed);
    } catch(InterruptedException ex){}
  }
}