I am writing a micro-benchmark to compare String concatenation using the + operator with StringBuilder. To this end, I created a JMH benchmark class, based on an OpenJDK example, that uses the batchSize parameter:

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@Measurement(batchSize = 10000, iterations = 10)
@Warmup(batchSize = 10000, iterations = 10)
@Fork(1)
public class StringConcatenationBenchmark {

    private String string;

    private StringBuilder stringBuilder;

    @Setup(Level.Iteration)
    public void setup() {
        string = "";
        stringBuilder = new StringBuilder();
    }

    @Benchmark
    public void stringConcatenation() {
        string += "some more data";
    }

    @Benchmark
    public void stringBuilderConcatenation() {
        stringBuilder.append("some more data");
    }

}

When I run the benchmark, I get the following error for the stringBuilderConcatenation method:

java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Arrays.java:3332)
    at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:137)
    at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:121)
    at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:421)
    at java.lang.StringBuilder.append(StringBuilder.java:136)
    at link.pellegrino.string_concatenation.StringConcatenationBenchmark.stringBuilderConcatenation(StringConcatenationBenchmark.java:29)
    at link.pellegrino.string_concatenation.generated.StringConcatenationBenchmark_stringBuilderConcatenation.stringBuilderConcatenation_avgt_jmhStub(StringConcatenationBenchmark_stringBuilderConcatenation.java:165)
    at link.pellegrino.string_concatenation.generated.StringConcatenationBenchmark_stringBuilderConcatenation.stringBuilderConcatenation_AverageTime(StringConcatenationBenchmark_stringBuilderConcatenation.java:130)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:430)
    at org.openjdk.jmh.runner.BenchmarkHandler$BenchmarkTask.call(BenchmarkHandler.java:412)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)

I assumed that the default JVM heap size had to be increased, so I tried allowing up to 10 GB by passing -Xmx10G through the -jvmArgs option provided by JMH. Unfortunately, I still get the error.

I then tried reducing the batchSize parameter to 1, but I still get an OutOfMemoryError.

The only workaround I have found is to set the benchmark mode to Mode.SingleShotTime. Since this mode seems to treat a batch as a single shot (even though s/op is displayed in the Units column), it appears to give me the metric I want: the average time to perform one batch of operations. However, I still don't understand why it does not work with Mode.AverageTime.

Please also note that the benchmark for the stringConcatenation method works as expected whichever benchmark mode is used. The issue only occurs with the stringBuilderConcatenation method, which uses StringBuilder.

Any help understanding why the example above does not work with the benchmark mode set to Mode.AverageTime is welcome.

The JMH version I used is 1.10.4.

1 Answer

You're right that Mode.SingleShotTime is what you need: it measures the time for a single batch. With Mode.AverageTime, each iteration keeps running until the iteration time elapses (1 second by default). It still reports the time per batch (only batches that fully completed within the iteration time are counted), so the reported results differ, but the execution time is the same.
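
To make the difference concrete, here is a rough plain-Java sketch of the two measurement styles. It is not JMH's actual implementation; the class ModeSketch and both method names are made up for illustration. It only shows that a single-shot run executes exactly one batch, while a time-bound run keeps executing whole batches until the deadline passes.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical sketch, not JMH code: contrasts one-batch and time-bound runs.
final class ModeSketch {

    /** Single-shot style: run exactly one batch and report the invocation count. */
    static long singleShotInvocations(Runnable op, int batchSize) {
        for (int i = 0; i < batchSize; i++) {
            op.run();
        }
        return batchSize;
    }

    /** Time-bound style: keep running whole batches until the deadline passes. */
    static long timeBoundInvocations(Runnable op, int batchSize, long durationMillis) {
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(durationMillis);
        long invocations = 0;
        do {
            for (int i = 0; i < batchSize; i++) {
                op.run();
            }
            invocations += batchSize;
        } while (System.nanoTime() < deadline);
        return invocations;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        System.out.println(singleShotInvocations(() -> sb.append('x'), 1_000));
        // The second count depends on the machine; it is a multiple of the batch size.
        System.out.println(timeBoundInvocations(() -> sb.append('x'), 1_000, 50));
    }
}
```

In the time-bound case the number of appends depends on how fast the operation is, not on the batch size alone.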

Another problem is that @Setup(Level.Iteration) runs the setup method before every iteration, not before every batch. Thus your strings are not actually bounded by the batch size. The String version avoids the OutOfMemoryError only because it is much slower than StringBuilder, so within 1 second it builds a much shorter string.
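
The effect of resetting once per iteration rather than once per batch can be simulated without JMH. In the sketch below, the class IterationStateSketch and the method lengthAfterIteration are hypothetical names, and the batch counts are arbitrary small values chosen so that it runs instantly.

```java
// Hypothetical sketch: state that is reset per iteration keeps growing across batches.
final class IterationStateSketch {

    static final String PIECE = "some more data"; // 14 characters

    /**
     * Simulates one iteration that runs the given number of batches.
     * The builder is created once up front, like @Setup(Level.Iteration);
     * with resetPerBatch it is also replaced before every batch.
     */
    static int lengthAfterIteration(int batches, int batchSize, boolean resetPerBatch) {
        StringBuilder sb = new StringBuilder();
        for (int b = 0; b < batches; b++) {
            if (resetPerBatch) {
                sb = new StringBuilder();
            }
            for (int i = 0; i < batchSize; i++) {
                sb.append(PIECE);
            }
        }
        return sb.length();
    }

    public static void main(String[] args) {
        System.out.println(lengthAfterIteration(5, 1_000, false)); // 70000
        System.out.println(lengthAfterIteration(5, 1_000, true));  // 14000
    }
}
```

Without a per-batch reset, the final length scales with the number of batches that fit into the iteration, which is exactly what the faster StringBuilder benchmark suffers from.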

A not very elegant way to fix your benchmark (while still using the average time mode and the batchSize parameter) is to reset the string/stringBuilder manually:

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.*;

@State(Scope.Thread)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
@Measurement(batchSize = 10000, iterations = 10)
@Warmup(batchSize = 10000, iterations = 10)
@Fork(1)
public class StringConcatenationBenchmark {
    private static final String S = "some more data";
    // Length reached after exactly one batch of 10000 appends.
    private static final int MAX_LEN = S.length() * 10000;

    private String string;

    private StringBuilder stringBuilder;

    @Setup(Level.Iteration)
    public void setup() {
        string = "";
        stringBuilder = new StringBuilder();
    }

    @Benchmark
    public void stringConcatenation() {
        // Start over once a full batch worth of data has been accumulated.
        if (string.length() >= MAX_LEN) string = "";
        string += S;
    }

    @Benchmark
    public void stringBuilderConcatenation() {
        // Start over once a full batch worth of data has been accumulated.
        if (stringBuilder.length() >= MAX_LEN) stringBuilder = new StringBuilder();
        stringBuilder.append(S);
    }
}

Here are the results on my box (i5 3340, 4 GB RAM, 64-bit Win7, JDK 1.8.0_45):

Benchmark                   Mode  Cnt       Score       Error  Units
stringBuilderConcatenation  avgt   10     145.997 ±     2.301  us/op
stringConcatenation         avgt   10  324878.341 ± 39824.738  us/op

So you can see that only about 3 batches fit into one second for stringConcatenation (1e6/324878), while for stringBuilderConcatenation thousands of batches can be executed, producing an enormous string that leads to the OutOfMemoryError.
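
The arithmetic behind that estimate can be written out as below. This is only a back-of-the-envelope sketch: the class and method names are hypothetical, it plugs in the scores from the table above (measured with the per-batch reset), and it assumes the default 1-second iteration, so the figures for the original, unreset benchmark are approximate.

```java
// Hypothetical sketch: estimates how much data one 1-second iteration accumulates.
final class BatchEstimateSketch {

    static final int BATCH_SIZE = 10_000;
    static final int PIECE_LENGTH = "some more data".length(); // 14

    /** Whole batches that fit into one second, given the score in us per batch. */
    static long batchesPerSecond(double microsPerBatch) {
        return (long) (1_000_000.0 / microsPerBatch);
    }

    /** Characters appended in one second if nothing is reset between batches. */
    static long charsPerIteration(double microsPerBatch) {
        return batchesPerSecond(microsPerBatch) * BATCH_SIZE * PIECE_LENGTH;
    }

    public static void main(String[] args) {
        System.out.println(batchesPerSecond(324878.341));  // 3
        System.out.println(batchesPerSecond(145.997));     // 6849
        System.out.println(charsPerIteration(324878.341)); // 420000
        System.out.println(charsPerIteration(145.997));    // 958860000
    }
}
```

On JDK 8, where StringBuilder stores its content in a char[] with 2 bytes per character, roughly 959 million characters correspond to about 1.9 GB for the backing array alone, before counting the copy made each time the array grows.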

I don't know why adding more memory doesn't work for you; for me, -Xmx4G is enough to run the StringBuilder test of your original benchmark. Your box is probably faster, so the resulting string is even longer. Note that a very big string can hit the array size limit (about 2 billion elements) even if you have enough memory. Check the exception stack trace after adding the memory: is it the same? If you hit the array size limit, it will still be an OutOfMemoryError, but the stack trace will be slightly different. In any case, even with enough memory the results of your original benchmark will be incorrect (for both String and StringBuilder).
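
As a rough illustration of the array size limit, the sketch below computes how many appends fit below Integer.MAX_VALUE characters and how fast a batch would have to be for a 1-second iteration to get there. The class and method names are hypothetical. It treats Integer.MAX_VALUE as the upper bound and ignores how StringBuilder grows its capacity, so the real failure point would come somewhat earlier.

```java
// Hypothetical sketch: how close the unreset benchmark can get to the array limit.
final class ArrayLimitSketch {

    static final int BATCH_SIZE = 10_000;
    static final int PIECE_LENGTH = "some more data".length(); // 14

    /** Appends whose total length still fits into an int-indexed array. */
    static int maxAppends() {
        return Integer.MAX_VALUE / PIECE_LENGTH;
    }

    /** Complete batches that fit below the limit. */
    static int maxFullBatches() {
        return maxAppends() / BATCH_SIZE;
    }

    /** Batch time (in us) at which one second is just enough to run that many batches. */
    static double microsPerBatchToReachLimit() {
        return 1_000_000.0 / maxFullBatches();
    }

    public static void main(String[] args) {
        System.out.println(maxAppends());     // 153391689
        System.out.println(maxFullBatches()); // 15339
        System.out.println(microsPerBatchToReachLimit());
    }
}
```

By this estimate, a machine that completes a batch in about 65 us or less (a bit more than twice as fast as the 146 us measured above) could reach the limit within a single iteration regardless of the heap size.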