深入理解 Java HashCode

1. hashcode

1.Object的hashcode

默认调用object的hashcode的话,会调用native方法,与本地相关。
1
public native int hashCode();

来自 : https://zhuanlan.zhihu.com/p/33915892
下载完整的jdk呗openJDK1.8。找到Object.c文件,查看上面的方法映射表发现,hashCode被映射到了一个叫JVM_IHashCode上去了。

1
2
3
4
5
6
7
static JNINativeMethod methods[] = {
{"hashCode", "()I", (void *)&JVM_IHashCode},
{"wait", "(J)V", (void *)&JVM_MonitorWait},
{"notify", "()V", (void *)&JVM_MonitorNotify},
{"notifyAll", "()V", (void *)&JVM_MonitorNotifyAll},
{"clone", "()Ljava/lang/Object;", (void *)&JVM_Clone},
};

顺藤摸瓜去看看JVM_IHashCode到底干了什么?熟悉的味道,在jvm.h里面有方法声明,那实现一定在jvm.cpp里面。不过jvm.cpp对于JVM_IHashCode的实现调用的是ObjectSynchronizer::FastHashCode的方法。

1
2
3
4
5
JVM_ENTRY(jint, JVM_IHashCode(JNIEnv* env, jobject handle))
JVMWrapper("JVM_IHashCode");
// as implemented in the classic virtual machine; return 0 if object is NULL
return handle == NULL ? 0 : ObjectSynchronizer::FastHashCode (THREAD, JNIHandles::resolve_non_null(handle)) ;
JVM_END

发现声明在synchronizer.hpp 实现在这里synchronizer.cpp

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
// hashCode() generation :
//
// Possibilities:
// * MD5Digest of {obj,stwRandom}
// * CRC32 of {obj,stwRandom} or any linear-feedback shift register function.
// * A DES- or AES-style SBox[] mechanism
// * One of the Phi-based schemes, such as:
// 2654435761 = 2^32 * Phi (golden ratio)
// HashCodeValue = ((uintptr_t(obj) >> 3) * 2654435761) ^ GVars.stwRandom ;
// * A variation of Marsaglia's shift-xor RNG scheme.
// * (obj ^ stwRandom) is appealing, but can result
// in undesirable regularity in the hashCode values of adjacent objects
// (objects allocated back-to-back, in particular). This could potentially
// result in hashtable collisions and reduced hashtable efficiency.
// There are simple ways to "diffuse" the middle address bits over the
// generated hashCode values:

static inline intptr_t get_next_hash(Thread * Self, oop obj) {
intptr_t value = 0;
if (hashCode == 0) {
// This form uses global Park-Miller RNG.
// On MP system we'll have lots of RW access to a global, so the
// mechanism induces lots of coherency traffic.
value = os::random();
} else if (hashCode == 1) {
// This variation has the property of being stable (idempotent)
// between STW operations. This can be useful in some of the 1-0
// synchronization schemes.
intptr_t addrBits = cast_from_oop<intptr_t>(obj) >> 3;
value = addrBits ^ (addrBits >> 5) ^ GVars.stwRandom;
} else if (hashCode == 2) {
value = 1; // for sensitivity testing
} else if (hashCode == 3) {
value = ++GVars.hcSequence;
} else if (hashCode == 4) {
value = cast_from_oop<intptr_t>(obj);
} else {
// Marsaglia's xor-shift scheme with thread-specific state
// This is probably the best overall implementation -- we'll
// likely make this the default in future releases.
unsigned t = Self->_hashStateX;
t ^= (t << 11);
Self->_hashStateX = Self->_hashStateY;
Self->_hashStateY = Self->_hashStateZ;
Self->_hashStateZ = Self->_hashStateW;
unsigned v = Self->_hashStateW;
v = (v ^ (v >> 19)) ^ (t ^ (t >> 8));
Self->_hashStateW = v;
value = v;
}

value &= markOopDesc::hash_mask;
if (value == 0) value = 0xBAD;
assert(value != markOopDesc::no_hash, "invariant");
TEVENT(hashCode: GENERATE);
return value;
}


intptr_t ObjectSynchronizer::FastHashCode(Thread * Self, oop obj) {
if (UseBiasedLocking) {
// NOTE: many places throughout the JVM do not expect a safepoint
// to be taken here, in particular most operations on perm gen
// objects. However, we only ever bias Java instances and all of
// the call sites of identity_hash that might revoke biases have
// been checked to make sure they can handle a safepoint. The
// added check of the bias pattern is to avoid useless calls to
// thread-local storage.
if (obj->mark()->has_bias_pattern()) {
// Handle for oop obj in case of STW safepoint
Handle hobj(Self, obj);
// Relaxing assertion for bug 6320749.
assert(Universe::verify_in_progress() ||
!SafepointSynchronize::is_at_safepoint(),
"biases should not be seen by VM thread here");
BiasedLocking::revoke_and_rebias(hobj, false, JavaThread::current());
obj = hobj();
assert(!obj->mark()->has_bias_pattern(), "biases should be revoked by now");
}
}

// hashCode() is a heap mutator ...
// Relaxing assertion for bug 6320749.
assert(Universe::verify_in_progress() || DumpSharedSpaces ||
!SafepointSynchronize::is_at_safepoint(), "invariant");
assert(Universe::verify_in_progress() || DumpSharedSpaces ||
Self->is_Java_thread() , "invariant");
assert(Universe::verify_in_progress() || DumpSharedSpaces ||
((JavaThread *)Self)->thread_state() != _thread_blocked, "invariant");

ObjectMonitor* monitor = NULL;
markOop temp, test;
intptr_t hash;
markOop mark = ReadStableMark(obj);

// object should remain ineligible for biased locking
assert(!mark->has_bias_pattern(), "invariant");

if (mark->is_neutral()) {
hash = mark->hash(); // this is a normal header
if (hash) { // if it has hash, just return it
return hash;
}
hash = get_next_hash(Self, obj); // allocate a new hash code
temp = mark->copy_set_hash(hash); // merge the hash code into header
// use (machine word version) atomic operation to install the hash
test = obj->cas_set_mark(temp, mark);
if (test == mark) {
return hash;
}
// If atomic operation failed, we must inflate the header
// into heavy weight monitor. We could add more code here
// for fast path, but it does not worth the complexity.
} else if (mark->has_monitor()) {
monitor = mark->monitor();
temp = monitor->header();
assert(temp->is_neutral(), "invariant");
hash = temp->hash();
if (hash) {
return hash;
}
// Skip to the following code to reduce code size
} else if (Self->is_lock_owned((address)mark->locker())) {
temp = mark->displaced_mark_helper(); // this is a lightweight monitor owned
assert(temp->is_neutral(), "invariant");
hash = temp->hash(); // by current thread, check if the displaced
if (hash) { // header contains hash code
return hash;
}
// WARNING:
// The displaced header is strictly immutable.
// It can NOT be changed in ANY cases. So we have
// to inflate the header into heavyweight monitor
// even the current thread owns the lock. The reason
// is the BasicLock (stack slot) will be asynchronously
// read by other threads during the inflate() function.
// Any change to stack may not propagate to other threads
// correctly.
}

// Inflate the monitor to set hash code
monitor = ObjectSynchronizer::inflate(Self, obj, inflate_cause_hash_code);
// Load displaced header and check it has hash code
mark = monitor->header();
assert(mark->is_neutral(), "invariant");
hash = mark->hash();
if (hash == 0) {
hash = get_next_hash(Self, obj);
temp = mark->copy_set_hash(hash); // merge hash code into header
assert(temp->is_neutral(), "invariant");
test = Atomic::cmpxchg(temp, monitor->header_addr(), mark);
if (test != mark) {
// The only update to the header in the monitor (outside GC)
// is install the hash code. If someone add new usage of
// displaced header, please update this code
hash = test->hash();
assert(test->is_neutral(), "invariant");
assert(hash != 0, "Trivial unexpected object/monitor header usage.");
}
}
// We finally get the hash
return hash;
}

2.Java Object.hashCode()返回的是对象内存地址?不是!

OpenJDK8 默认hashCode的计算方法是通过和当前线程有关的一个随机数+三个确定值,运用Marsaglia’s xorshift scheme随机数算法得到的一个随机数。和对象内存地址无关
当然也可以自己实现hashcode方法,关于hashcode() 与 equals()。主要是利用hashcode 可以判断2个对象的不等,hashcode相等,对象不一定相等,但hashcode不等,可以肯定2个对象不相等。在很多地方判断对象等不等。如果equals 定义了2个对象是相等的,需要注意hashcode还是不是相等的。

3.String类的hasCode()

1
2
3
4
5
6
7
8
9
10
11
12
public int hashCode() {
int h = hash;
if (h == 0 && value.length > 0) {
char val[] = value;

for (int i = 0; i < value.length; i++) {
h = 31 * h + val[i];
}
hash = h;
}
return h;
}

2.使用JOL查看对象信息

maven 依赖

1
2
3
4
5
<dependency>
<groupId>org.openjdk.jol</groupId>
<artifactId>jol-core</artifactId>
<version>0.9</version>
</dependency>

为了内存比较“整齐”,关闭压缩指针,启动参数加上-XX:-UseCompressedOops

1.测试对象

1
2
3
class AAA{
private int number;
}

2.测试方式1

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
public static void main(String[] args) {
AAA aaa = new AAA();
System.out.println("---------before invoke hascode()-----------------");
System.out.println(ClassLayout.parseInstance(aaa).toPrintable());

/*invoke hashcode() 转换成16进制*/
System.out.println(Integer.toHexString(aaa.hashCode()));
System.out.println("------------after invoke hascode()-----------------");
System.out.println(ClassLayout.parseInstance(aaa).toPrintable());

synchronized (aaa){
System.out.println("---------in synchronized() func--------------");
System.out.println(Integer.toHexString(aaa.hashCode()));
System.out.println(ClassLayout.parseInstance(aaa).toPrintable());
System.out.println(Integer.toHexString(aaa.hashCode()));
}

System.out.println("---------after invoke synchronized()--------------");
System.out.println(Integer.toHexString(aaa.hashCode()));
System.out.println(ClassLayout.parseInstance(aaa).toPrintable());

}

输出

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
---------before invoke hascode()-----------------
AAA object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 01 00 00 00 (00000001 00000000 00000000 00000000) (1)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) a8 35 85 1c (10101000 00110101 10000101 00011100) (478492072)
12 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
16 4 int AAA.number 0
20 4 (loss due to the next object alignment)
Instance size: 24 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

7e0b37bc
------------after invoke hascode()-----------------
AAA object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 01 bc 37 0b (00000001 10111100 00110111 00001011) (188201985)
4 4 (object header) 7e 00 00 00 (01111110 00000000 00000000 00000000) (126)
8 4 (object header) a8 35 85 1c (10101000 00110101 10000101 00011100) (478492072)
12 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
16 4 int AAA.number 0
20 4 (loss due to the next object alignment)
Instance size: 24 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

---------in synchronized() func--------------
7e0b37bc
AAA object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) b0 f4 34 03 (10110000 11110100 00110100 00000011) (53802160)
4 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
8 4 (object header) a8 35 85 1c (10101000 00110101 10000101 00011100) (478492072)
12 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
16 4 int AAA.number 0
20 4 (loss due to the next object alignment)
Instance size: 24 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

7e0b37bc
---------after invoke synchronized()--------------
7e0b37bc
AAA object internals:
OFFSET SIZE TYPE DESCRIPTION VALUE
0 4 (object header) 01 bc 37 0b (00000001 10111100 00110111 00001011) (188201985)
4 4 (object header) 7e 00 00 00 (01111110 00000000 00000000 00000000) (126)
8 4 (object header) a8 35 85 1c (10101000 00110101 10000101 00011100) (478492072)
12 4 (object header) 00 00 00 00 (00000000 00000000 00000000 00000000) (0)
16 4 int AAA.number 0
20 4 (loss due to the next object alignment)
Instance size: 24 bytes
Space losses: 0 bytes internal + 4 bytes external = 4 bytes total

可以看到,在调用一次 hashcode之后,就会在object header 中生成hashcode。注意区分大小端模式。
搜狗截图20191011144038.png

object header 在有锁的情况下会发生变化,但是不会改变hashcode的值。

为什么有锁的状态下,头部的hashcode会变。无锁后 又变回来了?

这是因为header会随着锁的状态发生变化。只有在无锁的情况下,才是这些字段。
unused:25 | identity_hashcode:31 | unused:1 | age:4 | biased_lock:1 | lock:2

另外,调用了一次hashcode 就会在栈帧中存在hashcode,但是与 偏向锁的字段发生冲突,因此jvm采用调用过hashcode的,不会存在偏向锁。为什么其它锁 可以呢?因为保存了指向栈中锁记录的指针,可以记录在里面。
搜狗截图20191011162436.png