An Analysis of String.intern() Through Test Cases

String.intern() test cases

Section 1: Test code

String s1 = new String("a") + new String("b");
s1.intern();
String s2 = "ab";
printAddresses("s1",s1);
printAddresses("s1.intern()",s1.intern());
printAddresses("s2",s2);
System.out.println(s1 == s2);

String s3 = new String("c") + new String("d");
String s4 = "cd";
s3.intern();
printAddresses("s3",s3);
printAddresses("s4",s4);
printAddresses("s3.intern()",s3.intern());
System.out.println(s3 == s4);
String s5 = "e" + "f";
String s6 = "ef";
s5.intern();
printAddresses("s5",s5);
printAddresses("s6",s6);
printAddresses("s5.intern()",s5.intern());
System.out.println(s5 == s6);
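
The listing calls a printAddresses helper that the article does not show. The sketch below is a hypothetical stand-in, not the author's helper: it prints System.identityHashCode, which is an identity tag and not a memory address, so it can suggest whether two references denote the same object but will not reproduce hexadecimal heap addresses. The class name IdentityPrinter and the method describe are assumptions made for illustration.

```java
public class IdentityPrinter {
    // Hypothetical stand-in for the article's printAddresses helper.
    // identityHashCode is NOT an address; it only tags object identity.
    static String describe(String label, Object o) {
        return label + ": 0x" + Integer.toHexString(System.identityHashCode(o));
    }

    static void printAddresses(String label, Object o) {
        System.out.println(describe(label, o));
    }

    public static void main(String[] args) {
        String s1 = new String("a") + new String("b");
        printAddresses("s1", s1);
        printAddresses("s1.intern()", s1.intern());
    }
}
```

The same object always yields the same tag, so equal tags on two lines are a hint (not proof, since hash values can collide) that the references are identical; the == comparison remains the authoritative check.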

The output is as follows.

s1: 0x46353a600
s1.intern(): 0x46353a600
s2: 0x46353a600
true
s3: 0x463742d80
s4: 0x463742e40
s3.intern(): 0x463742e40
false
s5: 0x463745140
s6: 0x463745140
s5.intern(): 0x463745140
true

Question: why does the result for concatenated strings differ from the following example?

String s1 = new String("ab");
s1.intern();
String s2 = "ab";
System.out.println(s1 == s2); // prints false

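Section 2: Background

The Java Language Specification and the Java Virtual Machine Specification contain the following relevant passages:

2.1 The Java Language Specification

3.10.5 String Literals

A string literal consists of zero or more characters enclosed in double quotes; characters may be represented by escape sequences. A string literal always has type String. A string literal is a reference to an instance of class String, and it always refers to the same instance of class String. This is because string literals, or more generally strings that are the values of constant expressions, are "interned" using the method String.intern so that they share unique instances.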
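
The rule quoted from JLS 3.10.5 can be checked directly. The sketch below is an illustration, with the class and method names (LiteralIdentity, sameLiteralTwice, foldedConstantExpression) chosen for this example only. It compares references with ==, which is exactly what the rule is about.

```java
public class LiteralIdentity {
    // A static final String initialized with a literal is a constant variable,
    // so PREFIX + "b" is itself a compile-time constant expression.
    static final String PREFIX = "a";

    // Two occurrences of the same literal refer to one String instance.
    static boolean sameLiteralTwice() {
        String x = "ab";
        String y = "ab";
        return x == y;
    }

    // Constant expressions are folded by the compiler and interned like literals.
    static boolean foldedConstantExpression() {
        return ("a" + "b") == "ab" && (PREFIX + "b") == "ab";
    }

    public static void main(String[] args) {
        System.out.println(sameLiteralTwice());
        System.out.println(foldedConstantExpression());
    }
}
```

Both methods return true on any conforming implementation, because the guarantee comes from the language specification and not from a particular JVM.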

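4.3.1 Objects

If the string concatenation operator + is used in an expression that is not a constant expression, a new class instance is implicitly created, producing a new object of type String.

4.3.3 The Class String

Instances of class String represent sequences of Unicode code points. Every String object has a constant (unmodifiable) value. String literals are references to instances of class String. When the result of the string concatenation operator is not a compile-time constant expression, the operator implicitly creates a new String object.

15.18.1 String Concatenation Operator +

The result of string concatenation is a reference to a String object. That object is newly created unless the expression is a compile-time constant expression.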
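
To make the "newly created unless constant" rule concrete, here is a minimal sketch. The class ConcatCreatesNewObject and its methods are hypothetical names for this example. Because prefix is a method parameter, prefix + "b" is not a constant expression, so the concatenation yields a fresh object.

```java
public class ConcatCreatesNewObject {
    // prefix is a parameter, so prefix + "b" is evaluated at run time
    // and produces a newly created String object.
    static boolean runtimeConcatIsLiteral(String prefix) {
        String joined = prefix + "b";
        return joined == "ab";
    }

    // Interning the run-time result yields the same instance as the literal.
    static boolean internedConcatIsLiteral(String prefix) {
        return (prefix + "b").intern() == "ab";
    }

    public static void main(String[] args) {
        System.out.println(runtimeConcatIsLiteral("a"));
        System.out.println(internedConcatIsLiteral("a"));
    }
}
```

Called with "a", the first method returns false and the second returns true: the content is equal in both cases, but only the interned reference is identical to the literal.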

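2.2 The Java Virtual Machine Specification

3.4 Accessing the Run-Time Constant Pool

The JVM accesses the run-time constant pool through instructions such as ldc and ldc_w. The values loaded this way include constants of primitive types and instances of String.

4.4.3 The CONSTANT_String_info Structure

This structure represents constant objects of type String. Its format is: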

CONSTANT_String_info {
u1 tag;
u2 string_index;
}

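The tag is CONSTANT_String (8). string_index must be a valid index into the constant pool table, and the entry at that index must be a CONSTANT_Utf8_info structure. That structure represents the sequence of Unicode code points to which the String object is eventually initialized.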

CONSTANT_Utf8_info {
u1 tag; // CONSTANT_Utf8 (1)
u2 length; // number of bytes in bytes[], not the length of the resulting String
u1 bytes[length]; // the bytes of the string value, in modified UTF-8
}
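
The difference between the length field and String.length() can be observed without parsing a class file. The sketch below relies on the fact that DataOutputStream.writeUTF also writes a two-byte length followed by modified UTF-8 bytes, which mirrors the u2 length and bytes[] layout above. It does not read a real constant pool, and the class name Utf8Length and the method encodedLength are assumptions for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class Utf8Length {
    // Returns the two-byte length prefix that writeUTF emits, i.e. the number
    // of modified UTF-8 bytes needed for s (analogous to CONSTANT_Utf8_info.length).
    static int encodedLength(String s) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeUTF(s);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        byte[] bytes = buffer.toByteArray();
        return ((bytes[0] & 0xFF) << 8) | (bytes[1] & 0xFF);
    }

    public static void main(String[] args) {
        String ascii = "ab";
        String cjk = "\u4e2d\u6587"; // two CJK characters, three bytes each
        System.out.println(ascii.length() + " chars, " + encodedLength(ascii) + " bytes");
        System.out.println(cjk.length() + " chars, " + encodedLength(cjk) + " bytes");
    }
}
```

For "ab" the two numbers agree (2 and 2), while the two CJK characters have String length 2 but an encoded length of 6.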

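5.1 The Run-Time Constant Pool

When a class or interface is created, the constant pool table in its binary representation is used to construct the run-time constant pool. All references in the run-time constant pool are initially symbolic.

A string constant is a reference to an instance of class String. The Java language requires that identical string constants (constants containing the same sequence of code points) refer to the same String instance. In addition, if String.intern is called on any string, the result must be a reference to the same instance that would be returned if that string appeared as a literal.

That is, ("a" + "b" + "c").intern() == "abc" must be true.

To derive a string constant, the JVM examines the sequence of code points given by the CONSTANT_String_info structure. If String.intern has previously been called on a String instance containing an identical sequence of code points, the string constant is a reference to that same instance. Otherwise, the JVM creates a new String instance containing the code points given by the CONSTANT_String_info structure; the string constant is a reference to the new instance, and the JVM then invokes intern on that new instance.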
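
The guarantee also holds for strings assembled entirely at run time. The sketch below is an illustration, with InternMatchesLiteral and its methods being names invented for this example. It builds "abc" from a char array, so the compiler cannot fold anything.

```java
public class InternMatchesLiteral {
    // new String(char[]) always allocates a fresh object at run time.
    static String build(char... chars) {
        return new String(chars);
    }

    // intern() must return the same instance that the literal denotes.
    static boolean internEqualsLiteral() {
        return build('a', 'b', 'c').intern() == "abc";
    }

    // Without intern(), the freshly built object is a different instance.
    static boolean rawObjectEqualsLiteral() {
        String literal = "abc";
        String built = build('a', 'b', 'c');
        return built == literal;
    }

    public static void main(String[] args) {
        System.out.println(internEqualsLiteral());
        System.out.println(rawObjectEqualsLiteral());
    }
}
```

internEqualsLiteral returns true and rawObjectEqualsLiteral returns false, which is the JVMS 5.1 rule seen from the Java side.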

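The life cycle of a class in the JVM consists of creation and loading, linking, and initialization.

Linking covers:

  • linking the class or interface once its binary representation has been derived from the class file, which also involves its superclass and so on,
  • verification, which ensures that the binary representation is structurally correct,
  • preparation, which creates the static fields of the class or interface,
  • resolution, which turns the symbolic references in the run-time constant pool into concrete values as the instructions that use them require,
  • access control and method overriding.

A class or interface must pass through these stages before it is initialized.

For more detail, see the JVM class loading process, and study it together with the related JVM material.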

2.3 StringBuilder

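StringBuilder.toString works by copying memory and ends up calling new String to initialize the resulting string.

The JDK source looks roughly like this. @HotSpotIntrinsicCandidate indicates that HotSpot may replace the Java source with an efficient implementation based on CPU instructions in order to improve performance.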

@Override
@HotSpotIntrinsicCandidate
public String toString() {
// Create a copy, don't share the array
return isLatin1() ? StringLatin1.newString(value, 0, count)
: StringUTF16.newString(value, 0, count);
}

public static String newString(byte[] val, int index, int len) {
return new String(Arrays.copyOfRange(val, index, index + len),
LATIN1);
}

String(byte[] value, byte coder) {
this.value = value;
this.coder = coder;
}
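
A short check of what the source above implies: every toString call copies the buffer into a new String, so the result is never shared with the builder or with an earlier result. ToStringCopies and its method names are hypothetical, chosen for this sketch.

```java
public class ToStringCopies {
    // Two toString() calls on the same builder give equal but distinct objects.
    static boolean toStringReturnsSameObject() {
        StringBuilder sb = new StringBuilder().append("a").append("b");
        String first = sb.toString();
        String second = sb.toString();
        return first == second;
    }

    // Because the array is copied, a later append does not affect an earlier result.
    static boolean laterAppendLeavesResultUntouched() {
        StringBuilder sb = new StringBuilder("ab");
        String snapshot = sb.toString();
        sb.append("c");
        return snapshot.equals("ab") && sb.toString().equals("abc");
    }

    public static void main(String[] args) {
        System.out.println(toStringReturnsSameObject());
        System.out.println(laterAppendLeavesResultUntouched());
    }
}
```

The first method returns false and the second returns true, consistent with the "Create a copy, don't share the array" comment in the JDK source.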

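So ("a" + "b" + "c").intern() == "abc" explains example 3. When the class file is resolved, each string constant gets a corresponding String instance that resides in the string constant pool, so initially the string constant pool should already hold references for "ab", "cd", and "ef". Examples 1 and 2 differ only in the relative position of the intern call and the string literal, and a String that is not a string constant should only be created as a new instance at run time. So why is s1 == s2 in example 1?

new String("a") + new String("b") is equivalent to new StringBuilder().append(new String("a")).append(new String("b")).toString(). StringBuilder.toString should still end up calling new String() to create a new String object, yet the results of the first two examples are clearly not equivalent to new String("ab").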


Section 3: Analysis

With that background in place, compile the code above into a class file:

cd ..
javac -encoding utf-8 -d . Test.java
cd string
javap -v Test.class

This yields the bytecode of the test class.

The constant pool excerpt is as follows:

Constant pool:
#1 = Methodref #19.#32 // java/lang/Object."<init>":()V
#2 = Class #33 // java/lang/StringBuilder
#3 = Methodref #2.#32 // java/lang/StringBuilder."<init>":()V
#4 = Class #34 // java/lang/String
#5 = String #35 // a
#6 = Methodref #4.#36 // java/lang/String."<init>":(Ljava/lang/String;)V
#7 = Methodref #2.#37 // java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#8 = String #38 // b
#9 = Methodref #2.#39 // java/lang/StringBuilder.toString:()Ljava/lang/String;
#10 = Methodref #4.#40 // java/lang/String.intern:()Ljava/lang/String;
#11 = String #41 // ab
#12 = Fieldref #42.#43 // java/lang/System.out:Ljava/io/PrintStream;
#13 = Methodref #44.#45 // java/io/PrintStream.println:(Z)V
#14 = String #46 // c
#15 = String #47 // d
#16 = String #48 // cd
#17 = String #49 // ef
#18 = Class #50 // string/Test
#19 = Class #51 // java/lang/Object
#20 = Utf8 <init>
#21 = Utf8 ()V
#22 = Utf8 Code
#23 = Utf8 LineNumberTable
#24 = Utf8 main
#25 = Utf8 ([Ljava/lang/String;)V
#26 = Utf8 StackMapTable
#27 = Class #52 // "[Ljava/lang/String;"
#28 = Class #34 // java/lang/String
#29 = Class #53 // java/io/PrintStream
#30 = Utf8 SourceFile
#31 = Utf8 Test.java
#32 = NameAndType #20:#21 // "<init>":()V
#33 = Utf8 java/lang/StringBuilder
#34 = Utf8 java/lang/String
#35 = Utf8 a
#36 = NameAndType #20:#54 // "<init>":(Ljava/lang/String;)V
#37 = NameAndType #55:#56 // append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
#38 = Utf8 b
#39 = NameAndType #57:#58 // toString:()Ljava/lang/String;
#40 = NameAndType #59:#58 // intern:()Ljava/lang/String;
#41 = Utf8 ab
#42 = Class #60 // java/lang/System
#43 = NameAndType #61:#62 // out:Ljava/io/PrintStream;
#44 = Class #53 // java/io/PrintStream
#45 = NameAndType #63:#64 // println:(Z)V
#46 = Utf8 c
#47 = Utf8 d
#48 = Utf8 cd
#49 = Utf8 ef
#50 = Utf8 string/Test
#51 = Utf8 java/lang/Object
#52 = Utf8 [Ljava/lang/String;
#53 = Utf8 java/io/PrintStream
#54 = Utf8 (Ljava/lang/String;)V
#55 = Utf8 append
#56 = Utf8 (Ljava/lang/String;)Ljava/lang/StringBuilder;
#57 = Utf8 toString
#58 = Utf8 ()Ljava/lang/String;
#59 = Utf8 intern
#60 = Utf8 java/lang/System
#61 = Utf8 out
#62 = Utf8 Ljava/io/PrintStream;
#63 = Utf8 println
#64 = Utf8 (Z)V

An annotated excerpt of the bytecode follows; a reference of common bytecode instructions may help when reading it.

------------------------------------String s1 = new String("a") + new String("b");

0: new #2 // class java/lang/StringBuilder
{create a new StringBuilder instance}
3: dup
{duplicate the top word of the stack}
4: invokespecial #3 // Method java/lang/StringBuilder."<init>":()V
{invoke the StringBuilder constructor}

7: new #4 // class java/lang/String
{create a new String instance}
10: dup
{duplicate the top word of the stack}
11: ldc #5 // String a
{push the reference to the String object "a" from run-time constant pool entry #5}
13: invokespecial #6 // Method java/lang/String."<init>":(Ljava/lang/String;)V
{invoke the String constructor}
16: invokevirtual #7 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
{invoke StringBuilder.append}

19: new #4 // class java/lang/String
{create a new String instance}
22: dup
{duplicate the top word of the stack}
23: ldc #8 // String b
{push the reference to the String object "b" from run-time constant pool entry #8}
25: invokespecial #6 // Method java/lang/String."<init>":(Ljava/lang/String;)V
{invoke the String constructor}
28: invokevirtual #7 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
{invoke StringBuilder.append}

31: invokevirtual #9 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
{invoke StringBuilder.toString}
34: astore_1
{pop the reference on top of the stack and store it in local variable 1}

------------------------------------s1.intern();
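35: aload_1
{push the reference held in local variable 1}
36: invokevirtual #10 // Method java/lang/String.intern:()Ljava/lang/String;
{invoke String.intern}
39: pop
{pop the top word of the operand stack}

------------------------------------String s2 = "ab";
40: ldc #11 // String ab
{push the reference to the String object "ab" from run-time constant pool entry #11}
42: astore_2
{pop the reference on top of the stack and store it in local variable 2}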

------------------------------------System.out.println(s1 == s2);
43: getstatic #12 // Field java/lang/System.out:Ljava/io/PrintStream;
{read the static field}
46: aload_1
{push the reference held in local variable 1}
47: aload_2
{push the reference held in local variable 2}
48: if_acmpne 55
{jump to 55 if the two references differ} here s1 == s2, so execution continues at 51
51: iconst_1
{push int 1}
52: goto 56
{jump to 56}
55: iconst_0
{push int 0}
56: invokevirtual #13 // Method java/io/PrintStream.println:(Z)V
{invoke PrintStream.println}

------------------------------------String s3 = new String("c") + new String("d");
59: new #2 // class java/lang/StringBuilder
{create a new StringBuilder instance}
62: dup
{duplicate the top word of the stack}
63: invokespecial #3 // Method java/lang/StringBuilder."<init>":()V
{invoke the StringBuilder constructor}

66: new #4 // class java/lang/String
{create a new String instance}
69: dup
{duplicate the top word of the stack}
70: ldc #14 // String c
{push the reference to the String object "c" from run-time constant pool entry #14}
72: invokespecial #6 // Method java/lang/String."<init>":(Ljava/lang/String;)V
{invoke the String constructor}
75: invokevirtual #7 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
{invoke StringBuilder.append}

78: new #4 // class java/lang/String
{create a new String instance}
81: dup
{duplicate the top word of the stack}
82: ldc #15 // String d
{push the reference to the String object "d" from run-time constant pool entry #15}
84: invokespecial #6 // Method java/lang/String."<init>":(Ljava/lang/String;)V
{invoke the String constructor}
87: invokevirtual #7 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
{invoke StringBuilder.append}

90: invokevirtual #9 // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
{invoke StringBuilder.toString}
93: astore_3
{pop the reference on top of the stack and store it in local variable 3}

------------------------------------String s4 = "cd";
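94: ldc #16 // String cd
{push the reference to the String object "cd" from run-time constant pool entry #16}
96: astore 4
{pop the reference on top of the stack and store it in local variable 4}

------------------------------------s3.intern();
98: aload_3
{push the reference held in local variable 3}
99: invokevirtual #10 // Method java/lang/String.intern:()Ljava/lang/String;
{invoke String.intern}
102: pop
{pop the top word of the operand stack}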

------------------------------------System.out.println(s3 == s4);
103: getstatic #12 // Field java/lang/System.out:Ljava/io/PrintStream;
{read the static field}
106: aload_3
{push the reference held in local variable 3}
107: aload 4
{push the reference held in local variable 4}
109: if_acmpne 116
{jump to 116 if the two references differ} here s3 != s4, so execution jumps to 116
112: iconst_1
{push int 1}
113: goto 117
{jump to 117}
116: iconst_0
{push int 0}
117: invokevirtual #13 // Method java/io/PrintStream.println:(Z)V
{invoke PrintStream.println}


------------------------------------String s5 = "e" + "f";
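120: ldc #17 // String ef
{push the reference to the String object "ef" from run-time constant pool entry #17}
122: astore 5
{pop the reference on top of the stack and store it in local variable 5}

------------------------------------String s6 = "ef";
124: ldc #17 // String ef
{push the reference to the String object "ef" from run-time constant pool entry #17}
126: astore 6
{pop the reference on top of the stack and store it in local variable 6}

------------------------------------s5.intern();
128: aload 5
{push the reference held in local variable 5}
130: invokevirtual #10 // Method java/lang/String.intern:()Ljava/lang/String;
{invoke String.intern}
133: pop
{pop the top word of the operand stack}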

------------------------------------System.out.println(s5 == s6);
134: getstatic #12 // Field java/lang/System.out:Ljava/io/PrintStream;
{read the static field}
137: aload 5
{push the reference held in local variable 5}
139: aload 6
{push the reference held in local variable 6}
141: if_acmpne 148
{jump to 148 if the two references differ}
144: iconst_1
{push int 1}
145: goto 149
{jump to 149}
148: iconst_0
{push int 0}
149: invokevirtual #13 // Method java/io/PrintStream.println:(Z)V
{invoke PrintStream.println}
152: return

Upgrade the JDK from 1.8 to 10 and recompile.

------------------------------------
public class a {
public static void main(String[] args) {
String s1 = new String("a") + new String("b");
s1.intern();
String s2 = "ab";
}
}
------------------------------------
0: new #2 // class java/lang/String
3: dup
4: ldc #3 // String a
6: invokespecial #4 // Method java/lang/String."<init>":(Ljava/lang/String;)V
9: new #2 // class java/lang/String
12: dup
13: ldc #5 // String b
15: invokespecial #4 // Method java/lang/String."<init>":(Ljava/lang/String;)V
18: invokedynamic #6, 0 // InvokeDynamic #0:makeConcatWithConstants:(Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String;
23: astore_1

24: aload_1
25: invokevirtual #7 // Method java/lang/String.intern:()Ljava/lang/String;
28: pop

29: ldc #8 // String ab
31: astore_2
------------------------------------

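Since Java 9, string concatenation with the + operator is no longer compiled into implicit StringBuilder calls. Instead, an invokedynamic instruction is emitted whose bootstrap method is StringConcatFactory.makeConcatWithConstants.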

/**
* Facilitates the creation of optimized String concatenation methods, that
* can be used to efficiently concatenate a known number of arguments of
* known types, possibly after type adaptation and partial evaluation of
* arguments. Typically used as a <em>bootstrap method</em> for {@code
* invokedynamic} call sites, to support the <em>string concatenation</em>
* feature of the Java Programming Language.
*
* <p>When the target of the {@code CallSite} returned from this method is
* invoked, it returns the result of String concatenation, taking all
* function arguments and constants passed to the linkage method as inputs for
* concatenation. The target signature is given by {@code concatType}, and
* does not include constants.
* For a target accepting:
* <ul>
* <li>zero inputs, concatenation results in an empty string;</li>
* <li>one input, concatenation results in the single
* input converted as per JLS 5.1.11 "String Conversion"; otherwise</li>
* <li>two or more inputs, the inputs are concatenated as per
* requirements stated in JLS 15.18.1 "String Concatenation Operator +".
* The inputs are converted as per JLS 5.1.11 "String Conversion",
* and combined from left to right.</li>
* </ul>
*
* <p>The concatenation <em>recipe</em> is a String description for the way to
* construct a concatenated String from the arguments and constants. The
* recipe is processed from left to right, and each character represents an
* input to concatenation. Recipe characters mean:
*
* <ul>
*
* <li><em>\1 (Unicode point 0001)</em>: an ordinary argument. This
* input is passed through dynamic argument, and is provided during the
* concatenation method invocation. This input can be null.</li>
*
* <li><em>\2 (Unicode point 0002):</em> a constant. This input passed
* through static bootstrap argument. This constant can be any value
* representable in constant pool. If necessary, the factory would call
* {@code toString} to perform a one-time String conversion.</li>
*
* <li><em>Any other char value:</em> a single character constant.</li>
* </ul>
*
* <p>Assume the linkage arguments are as follows:
*
* <ul>
* <li>{@code concatType}, describing the {@code CallSite} signature</li>
* <li>{@code recipe}, describing the String recipe</li>
* <li>{@code constants}, the vararg array of constants</li>
* </ul>
*
* <p>Then the following linkage invariants must hold:
*
* <ul>
* <li>The number of parameter slots in {@code concatType} is less than
* or equal to 200</li>
*
* <li>The parameter count in {@code concatType} equals to number of \1 tags
* in {@code recipe}</li>
*
* <li>The return type in {@code concatType} is assignable
* from {@link java.lang.String}, and matches the return type of the
* returned {@link MethodHandle}</li>
*
* <li>The number of elements in {@code constants} equals to number of \2
* tags in {@code recipe}</li>
* </ul>
*
* @param lookup Represents a lookup context with the accessibility
* privileges of the caller. Specifically, the lookup
* context must have
* <a href="MethodHandles.Lookup.html#privacc">private access</a>
* privileges.
* When used with {@code invokedynamic}, this is stacked
* automatically by the VM.
* @param name The name of the method to implement. This name is
* arbitrary, and has no meaning for this linkage method.
* When used with {@code invokedynamic}, this is provided
* by the {@code NameAndType} of the {@code InvokeDynamic}
* structure and is stacked automatically by the VM.
* @param concatType The expected signature of the {@code CallSite}. The
* parameter types represent the types of dynamic concatenation
* arguments; the return type is always assignable from {@link
* java.lang.String}. When used with {@code
* invokedynamic}, this is provided by the {@code
* NameAndType} of the {@code InvokeDynamic} structure and
* is stacked automatically by the VM.
* @param recipe Concatenation recipe, described above.
* @param constants A vararg parameter representing the constants passed to
* the linkage method.
* @return a CallSite whose target can be used to perform String
* concatenation, with dynamic concatenation arguments described by the given
* {@code concatType}.
* @throws StringConcatException If any of the linkage invariants described
* here are violated, or the lookup context
* does not have private access privileges.
* @throws NullPointerException If any of the incoming arguments is null, or
* any constant in {@code recipe} is null.
* This will never happen when a bootstrap method
* is called with invokedynamic.
* @apiNote Code generators have three distinct ways to process a constant
* string operand S in a string concatenation expression. First, S can be
* materialized as a reference (using ldc) and passed as an ordinary argument
* (recipe '\1'). Or, S can be stored in the constant pool and passed as a
* constant (recipe '\2') . Finally, if S contains neither of the recipe
* tag characters ('\1', '\2') then S can be interpolated into the recipe
* itself, causing its characters to be inserted into the result.
*
* @jls 5.1.11 String Conversion
* @jls 15.18.1 String Concatenation Operator +
*/
public static CallSite makeConcatWithConstants(MethodHandles.Lookup lookup,
String name,
MethodType concatType,
String recipe,
Object... constants) throws StringConcatException {
if (DEBUG) {
System.out.println("StringConcatFactory " + STRATEGY + " is here for " + concatType + ", {" + recipe + "}, " + Arrays.toString(constants));
}

return doStringConcat(lookup, name, concatType, false, recipe, constants);
}

private static CallSite doStringConcat(MethodHandles.Lookup lookup,
String name,
MethodType concatType,
boolean generateRecipe,
String recipe,
Object... constants) throws StringConcatException {
Objects.requireNonNull(lookup, "Lookup is null");
Objects.requireNonNull(name, "Name is null");
Objects.requireNonNull(concatType, "Concat type is null");
Objects.requireNonNull(constants, "Constants are null");

for (Object o : constants) {
Objects.requireNonNull(o, "Cannot accept null constants");
}

if ((lookup.lookupModes() & MethodHandles.Lookup.PRIVATE) == 0) {
throw new StringConcatException("Invalid caller: " +
lookup.lookupClass().getName());
}

int cCount = 0;
int oCount = 0;
if (generateRecipe) {
// Mock the recipe to reuse the concat generator code
char[] value = new char[concatType.parameterCount()];
Arrays.fill(value, TAG_ARG);
recipe = new String(value);
oCount = concatType.parameterCount();
} else {
Objects.requireNonNull(recipe, "Recipe is null");

for (int i = 0; i < recipe.length(); i++) {
char c = recipe.charAt(i);
if (c == TAG_CONST) cCount++;
if (c == TAG_ARG) oCount++;
}
}

if (oCount != concatType.parameterCount()) {
throw new StringConcatException(
"Mismatched number of concat arguments: recipe wants " +
oCount +
" arguments, but signature provides " +
concatType.parameterCount());
}

if (cCount != constants.length) {
throw new StringConcatException(
"Mismatched number of concat constants: recipe wants " +
cCount +
" constants, but only " +
constants.length +
" are passed");
}

if (!concatType.returnType().isAssignableFrom(String.class)) {
throw new StringConcatException(
"The return type should be compatible with String, but it is " +
concatType.returnType());
}

if (concatType.parameterSlotCount() > MAX_INDY_CONCAT_ARG_SLOTS) {
throw new StringConcatException("Too many concat argument slots: " +
concatType.parameterSlotCount() +
", can only accept " +
MAX_INDY_CONCAT_ARG_SLOTS);
}

String className = getClassName(lookup.lookupClass());
MethodType mt = adaptType(concatType);
Recipe rec = new Recipe(recipe, constants);

MethodHandle mh;
if (CACHE_ENABLE) {
Key key = new Key(className, mt, rec);
mh = CACHE.get(key);
if (mh == null) {
mh = generate(lookup, className, mt, rec);
CACHE.put(key, mh);
}
} else {
mh = generate(lookup, className, mt, rec);
}
return new ConstantCallSite(mh.asType(concatType));
}
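
The bootstrap method can also be called by hand, which makes the recipe format described in the Javadoc easier to see. The following is a minimal sketch that assumes Java 9 or later; IndyConcatDemo and concatWithRecipe are names invented for this example, and javac generates the equivalent linkage automatically for the + operator. Each \u0001 in the recipe stands for one dynamic argument, and any other character is copied into the result as a constant.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.lang.invoke.StringConcatFactory;

public class IndyConcatDemo {
    // Links a concatenation call site for two String arguments using the
    // given recipe, then invokes its target once.
    static String concatWithRecipe(String recipe, String left, String right) {
        try {
            CallSite site = StringConcatFactory.makeConcatWithConstants(
                    MethodHandles.lookup(),   // lookup with private access to this class
                    "concat",                 // arbitrary name, ignored by the factory
                    MethodType.methodType(String.class, String.class, String.class),
                    recipe);                  // no \u0002 tags, so no extra constants
            MethodHandle target = site.getTarget();
            return (String) target.invokeExact(left, right);
        } catch (Throwable t) {
            throw new IllegalStateException(t);
        }
    }

    public static void main(String[] args) {
        System.out.println(concatWithRecipe("\u0001\u0001", "a", "b"));
        System.out.println(concatWithRecipe("\u0001-\u0001", "a", "b"));
    }
}
```

The first call produces "ab", matching the makeConcatWithConstants:(Ljava/lang/String;Ljava/lang/String;)Ljava/lang/String; call site in the JDK 10 bytecode, and the second produces "a-b" because the dash is part of the recipe.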

Section 4: Summary

First, split the code into four cases:

Case A

String s1 = new String("ab");
s1.intern();
String s2 = "ab";
System.out.println(s1 == s2); // prints false

Case B

String s1 = new String("a") + new String("b");
s1.intern();
String s2 = "ab";
printAddresses("s1",s1);
printAddresses("s1.intern()",s1.intern());
printAddresses("s2",s2);
System.out.println(s1 == s2); // prints true

Case C

String s3 = new String("c") + new String("d");
String s4 = "cd";
s3.intern();
printAddresses("s3",s3);
printAddresses("s4",s4);
printAddresses("s3.intern()",s3.intern());
System.out.println(s3 == s4); // prints false

Case D

String s5 = "e" + "f";
String s6 = "ef";
s5.intern();
printAddresses("s5",s5);
printAddresses("s6",s6);
printAddresses("s5.intern()",s5.intern());
System.out.println(s5 == s6); // prints true

Tracing the execution

When a .java file is compiled into a .class file, string literals are stored in the constant pool table like any other constant.

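Case A

Step 1, class loading. When a .class file is loaded (note that loading happens before initialization), the JVM looks for the string literals in it. For each literal, it checks whether an equal string has already been placed in the string constant pool (as a reference to an object on the heap). If not, the JVM creates a new object on the heap and stores its reference in the table, and every later occurrence of an equal string literal shares that instance. (HotSpot in practice defers this work until the literal is first loaded by ldc; for this case the outcome is the same.)

StringTable      Heap                                Local variables
'ab' : 0x001     [ 0x001 - String - 'ab' - etc ]

Step 2, execute String s1 = new String("ab");. new creates a new String object, ldc pushes the reference to the interned String object from the string constant pool, and the constructor initializes the new object, whose reference is then stored in s1.

StringTable      Heap                                Local variables
'ab' : 0x001     [ 0x001 - String - 'ab' - etc ]     s1 : 0x002
                 [ 0x002 - String - 'ab' - etc ]

Step 3, execute s1.intern();. An equal string, the literal "ab", already resides in the string constant pool, so its reference is returned.

s1.intern() -> 0x001

Step 4, execute String s2 = "ab";. The lookup in the string constant pool finds that the equal literal "ab" is already interned, so its reference is returned.

StringTable      Heap                                Local variables
'ab' : 0x001     [ 0x001 - String - 'ab' - etc ]     s1 : 0x002
                 [ 0x002 - String - 'ab' - etc ]     s2 : 0x001

Final result: s1 != s2.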
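
The trace of Case A can be confirmed with a few lines of code. This is a sketch for illustration; the class name CaseA and the array-returning run method are assumptions made so that the three comparisons can be inspected together.

```java
public class CaseA {
    // Returns { s1 == s2, s1.intern() == s2, s1.equals(s2) }.
    static boolean[] run() {
        String s1 = new String("ab");   // a second object next to the literal's instance
        String interned = s1.intern();  // the literal's instance, not s1
        String s2 = "ab";               // the literal's instance again
        return new boolean[] { s1 == s2, interned == s2, s1.equals(s2) };
    }

    public static void main(String[] args) {
        boolean[] r = run();
        System.out.println(r[0] + " " + r[1] + " " + r[2]);
    }
}
```

The result is false, true, true: s1 is a separate object with equal content, while intern returns the same reference as the literal.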

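Case B

Step 1, class loading. The constant pool of the class contains entries for "a", "b", and "ab". The load-time model used in Case A would place "ab" in the string table at this point, but that cannot explain the observed result, because StringBuilder.toString() always returns a new object. HotSpot actually resolves a string literal lazily, the first time an ldc instruction loads it, so this trace assumes that "ab" is not in the string table yet.

StringTable      Heap                                        Local variables
(no entry for 'ab' yet)

Step 2, execute String s1 = new String("a") + new String("b");

The two ldc instructions resolve the literals "a" and "b", and the two new instructions create two more String objects. The + operator is compiled into append calls on an implicitly created StringBuilder, and StringBuilder.toString() finally creates a new String instance "ab", whose reference is stored in s1. No literal "ab" has been resolved so far.

StringTable      Heap                                        Local variables
'a' : 0x001      [ 0x001 - String - 'a' - etc ]              s1 : 0x006
'b' : 0x002      [ 0x002 - String - 'b' - etc ]
                 [ 0x003 - String - 'a' - etc ]
                 [ 0x004 - String - 'b' - etc ]
                 [ 0x005 - StringBuilder - 'ab' - etc ]
                 [ 0x006 - String - 'ab' - etc ]

Step 3, execute s1.intern();. The string table holds no string equal to "ab". Since JDK 7, HotSpot then records a reference to the receiver itself instead of copying it, so the new entry points at the object s1 refers to, and intern returns that reference.

s1.intern() -> 0x006

Step 4, execute String s2 = "ab";. This is the first ldc of the literal "ab". Resolution finds an equal string already in the string table and returns its reference, which is the object s1 refers to.

StringTable      Heap                                        Local variables
'a' : 0x001      [ 0x001 - String - 'a' - etc ]              s1 : 0x006
'b' : 0x002      [ 0x002 - String - 'b' - etc ]              s2 : 0x006
'ab' : 0x006     [ 0x003 - String - 'a' - etc ]
                 [ 0x004 - String - 'b' - etc ]
                 [ 0x005 - StringBuilder - 'ab' - etc ]
                 [ 0x006 - String - 'ab' - etc ]

Final result: s1 == s2.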

Case C

Step 1, class loading, using the same simplified model as in Case A.

StringTable      Heap                                        Local variables
'c' : 0x001      [ 0x001 - String - 'c' - etc ]
'd' : 0x002      [ 0x002 - String - 'd' - etc ]
'cd' : 0x003     [ 0x003 - String - 'cd' - etc ]

Step 2, execute String s3 = new String("c") + new String("d");

The two new instructions create two String objects. The + operator is compiled into append calls on an implicitly created StringBuilder, and StringBuilder.toString() finally creates a new String instance "cd" and returns its reference.

StringTable      Heap                                        Local variables
'c' : 0x001      [ 0x001 - String - 'c' - etc ]              s3 : 0x007
'd' : 0x002      [ 0x002 - String - 'd' - etc ]
'cd' : 0x003     [ 0x003 - String - 'cd' - etc ]
                 [ 0x004 - String - 'c' - etc ]
                 [ 0x005 - String - 'd' - etc ]
                 [ 0x006 - StringBuilder - 'cd' - etc ]
                 [ 0x007 - String - 'cd' - etc ]

Step 3, execute String s4 = "cd";. The lookup in the string constant pool finds that the equal literal "cd" is already interned, so its reference is returned.

StringTable      Heap                                        Local variables
'c' : 0x001      [ 0x001 - String - 'c' - etc ]              s3 : 0x007
'd' : 0x002      [ 0x002 - String - 'd' - etc ]              s4 : 0x003
'cd' : 0x003     [ 0x003 - String - 'cd' - etc ]
                 [ 0x004 - String - 'c' - etc ]
                 [ 0x005 - String - 'd' - etc ]
                 [ 0x006 - StringBuilder - 'cd' - etc ]
                 [ 0x007 - String - 'cd' - etc ]

Step 4, execute s3.intern();. The equal literal "cd" already resides in the string constant pool, so its reference is returned.

s3.intern() -> 0x003

Final result: s3 != s4.
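
Cases B and C differ only in whether intern runs before or after the literal is first used, and the sketch below isolates that ordering. InternOrdering and its method names are hypothetical. The literalFirst result follows from the specification, whereas the internFirst result depends on the JVM: on HotSpot since JDK 7 it is expected to be true only if no equal string has been interned earlier in the same JVM, so a second call would typically return false.

```java
public class InternOrdering {
    // Mirrors new String(x) + new String(y) from the test code.
    static String concat(String left, String right) {
        return new String(left) + new String(right);
    }

    // Case C order: the literal is used before intern() is called.
    // s3 is a fresh object, so it can never be the literal's instance.
    static boolean literalFirst() {
        String s3 = concat("c", "d");
        String s4 = "cd";
        s3.intern();
        return s3 == s4;
    }

    // Case B order: intern() runs before the literal is used.
    // The outcome is implementation- and state-dependent (see the lead-in).
    static boolean internFirst() {
        String s1 = concat("a", "b");
        s1.intern();
        String s2 = "ab";
        return s1 == s2;
    }

    public static void main(String[] args) {
        System.out.println("literal first: " + literalFirst());
        System.out.println("intern first:  " + internFirst());
    }
}
```

Only the literalFirst result (false) is safe to rely on; code that needs the canonical instance should use the return value of intern() instead of the receiver.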

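Case D

Step 1, class loading.

StringTable      Heap                                Local variables
'ef' : 0x001     [ 0x001 - String - 'ef' - etc ]

Step 2, execute String s5 = "e" + "f";

"e" + "f" is a compile-time constant expression, so the compiler folds it into the single string constant "ef" (the bytecode contains only ldc "ef"). s5 receives the reference to the String instance registered in the string table.

StringTable      Heap                                Local variables
'ef' : 0x001     [ 0x001 - String - 'ef' - etc ]     s5 : 0x001

Step 3, execute String s6 = "ef";. The lookup in the string constant pool finds that the equal literal "ef" is already interned, so its reference is returned.

StringTable      Heap                                Local variables
'ef' : 0x001     [ 0x001 - String - 'ef' - etc ]     s5 : 0x001
                                                     s6 : 0x001

Step 4, execute s5.intern();. The equal literal "ef" already resides in the string constant pool, so its reference is returned.

s5.intern() -> 0x001

Final result: s5 == s6.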


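Putting these traces together: a string literal obtains its object as if String.intern() had been called implicitly, whereas new always returns a reference to a fresh object.

This gives two rules for strings with the same content:

  1. string literal == any other string literal == String.intern()
  2. new String() != any other new String() != string literal != String.intern()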