10
votes

As I am studying Java, I have learned that the proper way to compare two Strings is to use equals and not "==". This code

static String s1 = "a";
static String s2 = "a";
System.out.println(s1 == s2);  

will output true because the JVM seems to have optimized this code so that both variables actually point to the same address. I tried to prove this using a great post I found here

http://javapapers.com/core-java/address-of-a-java-object/

but the addresses don't seem to be the same. What am I missing?

import sun.misc.Unsafe;
import java.lang.reflect.Field;
public class SomeClass {
    static String s1 = "a";
    static String s2 = "a";
    public static void main (String args[]) throws Exception {
        System.out.println(s1 == s2); //true

        Unsafe unsafe = getUnsafeInstance();
        Field s1Field = SomeClass.class.getDeclaredField("s1");
        System.out.println(unsafe.staticFieldOffset(s1Field)); //600

        Field s2Field = SomeClass.class.getDeclaredField("s2");
        System.out.println(unsafe.staticFieldOffset(s2Field)); //604

    }

    private static Unsafe getUnsafeInstance() throws SecurityException, 
        NoSuchFieldException, IllegalArgumentException, IllegalAccessException {
        Field theUnsafeInstance = Unsafe.class.getDeclaredField("theUnsafe");
        theUnsafeInstance.setAccessible(true);
        return (Unsafe) theUnsafeInstance.get(Unsafe.class);
    }
}

4
READ THE QUESTION. He says that he knows how to compare strings. This is a question of memory management and the JVM, not string comparison. - feathj
Yes, please read the question. - user584583

4 Answers

3
votes

You aren't missing anything. The Unsafe library is reporting what is actually happening.

Bytecode:

static {};
  Code:
   0:   ldc #11; //String a
   2:   putstatic   #13; //Field s1:Ljava/lang/String;
   5:   ldc #11; //String a
   7:   putstatic   #15; //Field s2:Ljava/lang/String;
   10:  return

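Notice that the two putstatic instructions store into different fields, s1 and s2 (constant-pool entries #13 and #15), while both ldc instructions load the same String constant, #11. The numbers 13 and 15 are constant-pool indexes, not memory addresses, but they do show that there are two separate storage locations for the two variables.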

There is a difference between where the variables are stored in memory, which requires a separate address for each, and whether a new object is put on the heap. In this case, the JVM assigns two separate addresses for the two variables, but it does not need to create a second String object because it recognizes the same String literal. So both variables reference the same String at this point.
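To make the distinction concrete, here is a minimal sketch that uses only standard reflection and reference comparison (no Unsafe). The class name TwoFieldsOneString and its helper methods are made up for illustration; the fields mirror the ones in the question.

```java
import java.lang.reflect.Field;

public class TwoFieldsOneString {
    static String s1 = "a";
    static String s2 = "a";

    // Look up a declared field, wrapping the checked exception for brevity.
    static Field field(String name) {
        try {
            return TwoFieldsOneString.class.getDeclaredField(name);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // True when both variables hold a reference to the very same String object.
    static boolean sameObject() {
        return s1 == s2;
    }

    // True when s1 and s2 are two distinct fields (two storage slots) of the class.
    static boolean distinctFields() {
        return !field("s1").equals(field("s2"));
    }

    public static void main(String[] args) {
        System.out.println(sameObject());     // true: one String object
        System.out.println(distinctFields()); // true: two separate variables
        // Same object, so the identity hash codes must match as well.
        System.out.println(System.identityHashCode(s1) == System.identityHashCode(s2)); // true
    }
}
```

Two variables, one object: the fields differ, but the references stored in them are identical.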

If you want to get the address, you can use the answer found in this question: How can I get the memory location of a object in java?. Make sure you read the caveats before using it, but I did a quick test and it seems to work.

11
votes

I think you're confused about what staticFieldOffset is returning. It's returning the offset of the pointer to the String instance, not the address of the String itself. Because there are two fields, they have different offsets: i.e., two pointers, which happen to have the same value.

A close reading of the Unsafe javadoc shows this:

Report the location of a given field in the storage allocation of its class. Do not expect to perform any sort of arithmetic on this offset; it is just a cookie which is passed to the unsafe heap memory accessors.

In other words, if you know where the actual Class instance is in memory, then you could add the offset returned by this method to that base address, and the result would be the location in memory where you could find the value of the pointer to the String.
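As a rough sketch of the "cookie" idea, the offset can be handed back to an Unsafe accessor together with the base object from staticFieldBase, and the reference read that way can then be compared. The class name StaticFieldCookieDemo and its helper methods are hypothetical. This assumes a HotSpot JDK where sun.misc.Unsafe still exposes staticFieldBase, staticFieldOffset and getObject; these methods are unsupported and deprecated, so newer JDKs may print warnings or stop allowing them.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public class StaticFieldCookieDemo {
    static String s1 = "a";
    static String s2 = "a";

    // Obtain the Unsafe singleton reflectively, as in the question's code.
    static Unsafe unsafe() {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            return (Unsafe) f.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not accessible", e);
        }
    }

    // Look up one of this class's static fields by name.
    static Field field(String name) {
        try {
            return StaticFieldCookieDemo.class.getDeclaredField(name);
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    // Read the reference stored in a static field by passing the
    // (base, offset) pair to an Unsafe accessor.
    static Object readViaUnsafe(String name) {
        Unsafe u = unsafe();
        Field f = field(name);
        return u.getObject(u.staticFieldBase(f), u.staticFieldOffset(f));
    }

    // The two fields occupy different slots, so their offsets differ.
    static boolean offsetsDiffer() {
        Unsafe u = unsafe();
        return u.staticFieldOffset(field("s1")) != u.staticFieldOffset(field("s2"));
    }

    // The values stored in those two slots are the same reference.
    static boolean sameReference() {
        return readViaUnsafe("s1") == readViaUnsafe("s2");
    }

    public static void main(String[] args) {
        System.out.println(offsetsDiffer());
        System.out.println(sameReference());
    }
}
```

On a JVM where these calls are available, both lines should print true: the offsets identify two different slots, and the pointer stored in each slot refers to the same String.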

3
votes

In the code above you are not comparing the addresses of the strings, but the "location of a given field in the storage allocation", i.e. the locations of the variables holding a reference to (the same) string.

-3
votes

String literals declared in Java code are automatically interned.

So the result is the same as if you had called String.intern() manually.

    String a = "aa";
    String b = new String(a);
    System.out.println("aa" == "aa");
    System.out.println(a == b);
    System.out.println(a.equals(b));
    System.out.println(a.intern() == b.intern());

Output:

true
false
true
true