25
votes

I am trying to get a handle on proper memory usage and garbage collection in Java. I'm not a novice programmer by any means, but it always seems to me that once Java touches some memory, it will never be released for other applications to use. In that case, you have to make sure your peak memory is never too high, or your application will continually use whatever the peak memory usage was.

I wrote a small sample program trying to demonstrate this. It basically has 4 buttons...

  1. Fill class scope variable BigList = new ArrayList<String>() with about 25,000,000 long string items.
  2. Call BigList.clear()
  3. Reallocate the list - BigList = new ArrayList<String>() again (to shrink the list size)
  4. A call to System.gc() - Yes, I know this doesn't mean that GC will really run, but it's what we have.
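
A minimal console sketch of the four steps above, for anyone who wants to reproduce them without a GUI. The class name `HeapDemo`, the helper methods, and the scaled-down item count are assumptions made for illustration, not the original program. It prints the JVM's own view of the heap via `Runtime`, which is not necessarily what a task monitor reports.

```java
import java.util.ArrayList;
import java.util.List;

public class HeapDemo {
    // Class-scope list, mirroring the BigList field described above.
    static List<String> bigList = new ArrayList<>();

    /** Heap currently occupied by objects (live or not yet collected), in MB. */
    static long usedMb() {
        Runtime rt = Runtime.getRuntime();
        return (rt.totalMemory() - rt.freeMemory()) / (1024 * 1024);
    }

    /** Heap the JVM currently holds for present and future objects, in MB. */
    static long committedMb() {
        return Runtime.getRuntime().totalMemory() / (1024 * 1024);
    }

    static void report(String step) {
        System.out.printf("%-10s used=%d MB, committed=%d MB%n",
                step, usedMb(), committedMb());
    }

    /** Step 1: fill the list with long-ish strings. */
    static void fill(int count) {
        for (int i = 0; i < count; i++) {
            bigList.add("a fairly long string item number " + i);
        }
    }

    public static void main(String[] args) {
        int count = 1_000_000; // assumption: scaled down from 25,000,000
        report("start");
        fill(count);
        report("filled");

        bigList.clear();             // step 2: strings become unreachable,
        System.gc();                 // but the backing array keeps its capacity
        report("clear+gc");

        bigList = new ArrayList<>(); // step 3: drops the large backing array
        System.gc();                 // step 4: only a hint to the JVM
        report("new+gc");
    }
}
```

Comparing the `used` and `committed` columns with the figure in the task monitor shows whether freed heap stays inside the JVM or goes back to the OS.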

So next I did some testing on Windows, Linux, and Mac OS, using each system's default task monitor to check the process's reported memory usage. Here is what I found...

  • Windows - Pumping the list, calling clear, and then calling GC several times will not reduce memory usage at all. However, reallocating the list using new and then calling GC several times will reduce the memory usage back to starting levels. IMO, this is acceptable.
  • Linux (I used Mint 11 distro with Sun JVM) - Same results as Windows.
  • Mac OS - I followed the same steps as above, but even when reinitializing the list, calls to GC seemingly have no effect. The program will sit using hundreds of MB of RAM even though I have nothing in memory.

Can anyone explain this to me? Some people have told me some stuff about "heap" memory, but I still don't fully understand it and I'm not sure it applies here. From what I have heard about it, I shouldn't be seeing the behavior I am on Windows and Linux anyways.

Is this just a difference in the way Mac OS's Activity Monitor measures memory usage or is there something else going on? I would prefer to not have my program idling with tons of RAM usage. Thanks for your insight.

8
It is unclear what you mean by "memory" and "released". RAM, virtual memory? - Raedwald
Let's define it as "The amount of physical memory immediately available to the operating system for use in other applications". Does that work? - jocull

8 Answers

20
votes

The Sun/Oracle JVM does not return unneeded memory to the system. If you give it a large maximum heap size, and you actually use that heap space at some point, the JVM won't give it back to the OS for other uses. Other JVMs will do that (JRockit used to, but I don't think it does any more).

So, for Oracle's JVM you need to tune your app and your system for peak usage; that's just how it works. If the memory that you're using can be managed with byte arrays (such as working with images or something), then you can use mapped byte buffers instead of Java byte arrays. Mapped byte buffers are taken straight from the system, and are not part of the heap. When you free up these objects (AND they are GC'd, I believe, but I'm not sure), the memory will be returned to the system. You'll likely have to play with that one, assuming it's even applicable at all.
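
A minimal sketch of the mapped-buffer idea, assuming a temporary file is an acceptable backing store. The class name `MappedBufferDemo`, the helper `mapTempFile`, and the 16 MB size are made up for illustration; only the standard `java.nio` calls are real API.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class MappedBufferDemo {
    /** Maps a fresh temporary file of the given size; the bytes live outside the Java heap. */
    static MappedByteBuffer mapTempFile(int sizeBytes) {
        try {
            Path tmp = Files.createTempFile("mapped-demo", ".bin");
            tmp.toFile().deleteOnExit(); // best effort; may not succeed while mapped
            try (FileChannel ch = FileChannel.open(tmp,
                    StandardOpenOption.READ, StandardOpenOption.WRITE)) {
                // READ_WRITE mapping grows the file to sizeBytes if needed.
                // The mapping stays valid after the channel is closed.
                return ch.map(FileChannel.MapMode.READ_WRITE, 0, sizeBytes);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        MappedByteBuffer buf = mapTempFile(16 * 1024 * 1024);
        buf.put(0, (byte) 42); // absolute write, used like a byte[] index
        System.out.println("direct=" + buf.isDirect()
                + ", capacity=" + buf.capacity()
                + ", first byte=" + buf.get(0));
    }
}
```

There is no public unmap call: per the `MappedByteBuffer` documentation the mapping remains valid until the buffer itself is garbage collected, so dropping all references and waiting for a GC is the only portable way to release it.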

6
votes

... but it always seems to me that once Java touches some memory, it's gone forever. You will never get it back.

It depends on what you mean by "gone forever".

I've also heard it said that some JVMs do give memory back to the OS when they are ready and able to. Unfortunately, given the way that the low-level memory APIs typically work, the JVM has to give back entire segments, and it tends to be complicated to "evacuate" a segment so that it can be given back.

But I wouldn't rely on that ... because there are various things that could prevent the memory being given back. The chances are that the JVM won't give the memory back to the OS. But it is not "gone forever" in the sense that the JVM will continue to make use of it. Even if the JVM never approaches the peak usage again, all of that memory will help to make the garbage collector run more efficiently.

In that case, you have to make sure your peak memory is never too high, or your application will continually eat up hundreds of MB of RAM.

That is not true. Assuming that you are adopting the strategy of starting with a small heap and letting it grow, the JVM won't ask for significantly more memory than the peak memory. The JVM won't continually eat up more memory ... unless your application has a memory leak and (as a result) its peak memory requirement has no bound.

(The OP's comments below indicate that this is not what he was trying to say. Even so, it is what he did say.)


On the topic of garbage collection efficiency, we can model the cost of a run of an efficient garbage collector as:

cost ~= (amount_of_live_data * W1) + (amount_of_garbage * W2)

where W1 and W2 are (we assume) constants that depend on the collector. (Actually, this is an over-simplification. The first part is not a linear function of the number of live objects. However, I claim that it doesn't matter for the following.)

The efficiency of the collector can then be stated as:

efficiency = cost / amount_of_garbage_collected

which (if we assume that the GC collects all of the garbage) expands to

efficiency ~= (amount_of_live_data * W1) / amount_of_garbage + W2.

When the GC runs,

heap_size ~= amount_of_live_data + amount_of_garbage

so

efficiency ~= W1 * (amount_of_live_data / (heap_size - amount_of_live_data) )
              + W2.

In other words:

  • as you increase the heap size, the efficiency tends to a constant (W2), but
  • you need a large ratio of heap_size to amount_of_live_data for this to happen.

The other point is that for an efficient copying collector, W2 covers just the cost of zeroing the space occupied by the garbage objects in 'from space'. The rest (tracing, copying of live objects to 'to space', and zeroing the 'from space' that they occupied) is part of the first term of the initial equation; i.e. covered by W1. What this means is that W2 is likely to be considerably smaller than W1 ... and that the first term of the final equation is significant for longer.

Now obviously this is a theoretical analysis, and the cost model is a simplification of how real garbage collectors really work. (And it doesn't take account of the "real" work that the application is doing, or the system-level effects of tying down too much memory.) However, the maths tells me that from the standpoint of GC efficiency, a big heap really does help a lot.
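
To put rough numbers on the final equation, here is a small sketch. The values chosen for W1, W2 and the amount of live data are invented purely for illustration, and the class and method names are hypothetical. Note that "efficiency" as defined above is a cost per unit of garbage reclaimed, so lower is better.

```java
public class GcCostModel {
    /**
     * Cost per unit of garbage reclaimed, per the model above:
     * W1 * live / (heap - live) + W2.
     */
    static double costPerUnitReclaimed(double live, double heap, double w1, double w2) {
        if (heap <= live) {
            throw new IllegalArgumentException("heap must be larger than the live data");
        }
        return w1 * live / (heap - live) + w2;
    }

    public static void main(String[] args) {
        double live = 100.0;        // assumed amount of live data (arbitrary units)
        double w1 = 1.0, w2 = 0.1;  // assumed weights, with W2 much smaller than W1
        double[] heaps = {150, 200, 400, 1100, 10_100};
        for (double heap : heaps) {
            System.out.printf("heap=%8.0f  cost per unit reclaimed=%.3f%n",
                    heap, costPerUnitReclaimed(live, heap, w1, w2));
        }
    }
}
```

With these made-up weights, growing the heap from 2x to 11x the live data takes the modelled cost from about 1.1 down to about 0.2, and it only creeps towards W2 = 0.1 as the ratio gets very large.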

3
votes

Some JVMs do not, or cannot, release previously acquired memory back to the host OS when it is no longer needed, because doing so is a costly and complex task. The garbage collector only operates on the heap memory inside the Java virtual machine; it does not give memory back to the OS (free() in C terms). For example, if a big object isn't used any more, the GC marks its memory as free within the JVM's heap, but does not release it to the OS.

However, the situation is changing. For example, ZGC will return memory to the operating system.
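
Whether this applies to you depends on which collector your JVM is actually running. A small sketch for checking that from inside the process with the standard `java.lang.management` API; the class name `WhichGc` is just for illustration, and the names it prints vary between JVM versions and collectors. On JDK builds that include ZGC, it can be selected with `-XX:+UseZGC`.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

public class WhichGc {
    /** Names of the garbage collector beans registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Run with different -XX:+Use...GC flags to see the names change.
        System.out.println("Active collectors: " + collectorNames());
    }
}
```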

2
votes

Once the program terminates, does the memory usage go down in Task Manager on Windows? I think the memory is being released, but the default task monitors of the OS you are using don't show it as released. Have a look at this C++ question: Problem with deallocating vector of pointers

2
votes

A common misconception is that Java uses up memory as it runs, and therefore it should be able to return memory to the OS. Actually, the Oracle/Sun JVM reserves its virtual memory as one contiguous block as soon as it starts. If there isn't enough contiguous virtual memory available, it fails on startup, even if the program isn't going to use that much.

What then happens is that the OS is smart enough not to allocate physical memory to the program until it is used. The OS cannot easily reclaim that memory, but it can swap it to disk if it needs to and the memory hasn't been used for a while. Java doesn't handle having parts of the heap swapped to disk very well, so this should be avoided.
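
You can see the gap between what is reserved and what is in use from inside the JVM. A sketch using the standard `MemoryMXBean`; reading `max` as roughly the reserved address space and `committed` as the memory the JVM has actually claimed is a simplification on my part, and the class name `HeapUsageReport` is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapUsageReport {
    /** Snapshot of the heap figures the JVM tracks for itself. */
    static MemoryUsage heap() {
        return ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    }

    static String mb(long bytes) {
        // getMax() and getInit() return -1 when the value is undefined.
        return bytes < 0 ? "undefined" : (bytes / (1024 * 1024)) + " MB";
    }

    public static void main(String[] args) {
        MemoryUsage u = heap();
        System.out.println("init      = " + mb(u.getInit()));
        System.out.println("used      = " + mb(u.getUsed()));
        System.out.println("committed = " + mb(u.getCommitted()));
        System.out.println("max       = " + mb(u.getMax()));
    }
}
```

Running it with different `-Xms` and `-Xmx` settings shows `max` changing while `used` stays small for a program that allocates little.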

1
votes

Java allocates memory only for objects. There is no explicit allocation of memory. In fact, Java even treats array types as objects. Each time an object is created, it is placed on the heap.

The Java runtime employs a garbage collector that reclaims the memory occupied by an object once it determines that the object is no longer accessible. This is an automatic process.

Calling System.gc() may not collect garbage at the time you call it; that's why your memory is not reduced. In general, it is better to let the system decide when it needs to collect the heap, and whether or not to do a full collection.

System.gc() doesn't even force a garbage collection; it's simply a hint to the JVM that "now may be a good time to clean up a bit".

Java memory explained here link2

1
votes

There are some great documents produced by Sun/Oracle describing Java's garbage collection. A quick search on "Java Garbage Collection Tuning" yields results such as http://www.oracle.com/technetwork/java/gc-tuning-5-138395.html and http://java.sun.com/docs/hotspot/gc1.4.2/

The introduction of the Oracle doc states:

The Java™ 2 Platform Standard Edition (J2SE™ platform) is used for a wide variety of applications from small applets on desktops to web services on large servers. In the J2SE platform version 1.4.2 there were four garbage collectors from which to choose but without an explicit choice by the user the serial garbage collector was always chosen. In version 5.0 the choice of the collector is based on the class of the machine on which the application is started.

This “smarter choice” of the garbage collector is generally better but is not always the best. For the user who wants to make their own choice of garbage collectors, this document will provide information on which to base that choice. This will first include the general features of the garbage collections and tuning options to take the best advantage of those features. The examples are given in the context of the serial, stop-the-world collector. Then specific features of the other collectors will be discussed along with factors that should be considered when choosing one of the other collectors.

They describe the various types of collectors available and the situations in which they should be used. I remember using this alongside JConsole to monitor how the application performed when started with various different options.

These docs will give you a bit more insight into how collection occurs depending on the parameters you are using.

1
votes

I ran into this problem on Windows and have found a solution, so I'm posting it as an answer in case it can help others.

A lot of answers on here suggest that Java's behavior is 1. good and/or 2. an unavoidable consequence of garbage collecting. These are both false.

The Problem:

If, like me, you want to use Java to write small applications for a workstation, or even to run multiple smaller processes on a server, then Oracle's JVM memory allocation behavior makes it almost completely useless. Even when running with -client, every JVM process hoards memory once allocated and never gives it back. This behavior cannot be disabled. As the OP notices: each JVM process holds on to its unused memory indefinitely, even if it will never use it again and even while other JVM processes are starving. This inexplicable behavior makes Oracle's JVM a useless implementation for all but monolithic, single-application scenarios.

Also: this is NOT a consequence of garbage collection. Witness .Net applications which run on Windows, use garbage collection, and do not suffer from this problem at all.

The Solution:

The solution I found was to use the IKVM.NET JVM, which works as a drop-in replacement for java.exe on Windows. It compiles Java bytecode to .Net IL code and runs as a .Net process. It also contains utilities to convert .jar files into .Net .dll and .exe assemblies. The performance is often better than Oracle's JVM, and after a GC, memory is instantly returned to the OS. (Note: this also works on Linux with Mono.)

To be clear, I still rely on Oracle's JVM for everything except my small applications, and also to debug those small applications. Once they are stable, I use IKVM to run them as if they were native Windows applications, and this works so well that I've been amazed. It has numerous beneficial side effects. Once compiled, DLLs shared between processes are loaded only once, and applications show up in the task manager as .exe instead of all showing as javaw.exe.

Unfortunately, not everyone can use IKVM to solve this problem, but I hope this helps those in my situation.