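volatile literally means "changeable" or "unstable". In Java it does two things: it guarantees visibility between threads, and it prevents instruction reordering.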
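As we all know, every thread has its own private area, and all threads share a common area, the heap. In this simplified picture, when a thread uses an object on the heap, the object is copied into the thread's private area. Any change to the object first updates the private copy, and is written back to shared memory (the heap) only afterwards.

This creates a visibility problem. Suppose thread 1 must modify the object before thread 2 processes it, but thread 2 fetches the object from shared memory before thread 1 has written its changes back. Thread 2 then works on stale data.

So we use the volatile keyword to guarantee visibility. When a thread changes a volatile variable, the new value is flushed to main memory immediately, and the copies cached by other threads are invalidated, so those threads must re-read the latest value from main memory. That is how visibility is guaranteed.

Underneath, this relies on the CPU's cache coherence protocol, which is the mechanism a multi-core machine uses to keep the caches of its CPUs consistent with each other.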
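Note ⚠️: if you are not sure about volatile, do not use it. The simpler the value it modifies, the better, and try not to use it on references. For example:

`volatile ArrayList<Object> mylist = new ArrayList<>();`

Here volatile modifies a reference, so it guarantees visibility of the reference only. Other threads will see the change when mylist is pointed at a different object, but when the contents of the list change, other threads are not guaranteed to see it.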
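To make the caveat concrete, here is a minimal sketch of one common way to live with a volatile reference: never mutate the list in place, and instead replace it with a new immutable copy, so that every change is a write to the volatile reference itself. The names `VolatileReferenceDemo`, `publish`, and `snapshot` are illustrative assumptions, not from the original code. The sketch assumes a single writer thread and Java 10 or later (for `List.copyOf`).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class; all names here are illustrative only.
final class VolatileReferenceDemo {
    // volatile covers the reference: readers are guaranteed to see a newly assigned list.
    private static volatile List<String> items = List.of();

    // Copy the current list, modify the copy, then publish it with one volatile write.
    // Assumes a single writer; concurrent writers would need a lock or a CAS loop.
    static void publish(String value) {
        List<String> copy = new ArrayList<>(items);
        copy.add(value);
        items = List.copyOf(copy);
    }

    // One volatile read returns an immutable snapshot.
    static List<String> snapshot() {
        return items;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 3; i++) {
                publish("item-" + i);
            }
        }, "writer");
        writer.start();
        writer.join();
        System.out.println(snapshot()); // [item-0, item-1, item-2]
    }
}
```

Calling `mylist.add(...)` on the original `ArrayList` would not go through the volatile field at all, which is why such a change carries no visibility guarantee.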
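```java
package com.mashibing.testvolatile;

public class T01_ThreadVisibility {

    // Without volatile, the "server" thread may never observe flag = false.
    private static volatile boolean flag = true;

    public static void main(String[] args) throws InterruptedException {
        new Thread(() -> {
            while (flag) {
                // do sth
            }
            System.out.println("end");
        }, "server").start();

        Thread.sleep(1000);

        // The volatile write makes the new value visible to the "server" thread.
        flag = false;
    }
}
```

As we know, CPUs process instructions in a pipeline for efficiency. To take full advantage of this, the compiler reorders the compiled code.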
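Some instructions, however, must run in order and must not be reordered, so we mark the variable as volatile.

We cannot forbid the CPU from reordering instructions, because that is the CPU's own strategy for improving efficiency and it happens at the CPU level. What we can do is forbid reordering at the virtual machine level.

Digging deeper, reordering is prevented with a load barrier (LoadFence) and a store barrier (StoreFence). These are CPU primitives that the CPU supports directly. LoadFence requires every read before the barrier to complete before any operation after the barrier runs, and StoreFence does the same for writes.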
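A lazily initialized singleton whose getInstance() method has no synchronization is certainly broken, because it is not thread-safe. We can add `synchronized` to the getInstance method, and that certainly solves the problem.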
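A minimal sketch of the `synchronized` fix just described might look like this. The class name `SyncSingleton` is hypothetical; `INSTANCE` and `getInstance()` follow the names used in the text.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class name, used only for illustration.
final class SyncSingleton {
    private static SyncSingleton INSTANCE;

    private SyncSingleton() { }

    // Without `synchronized`, two threads could both see INSTANCE == null
    // and each create its own object. Locking the whole method rules that out,
    // at the cost of acquiring the lock on every call.
    static synchronized SyncSingleton getInstance() {
        if (INSTANCE == null) {
            INSTANCE = new SyncSingleton();
        }
        return INSTANCE;
    }

    public static void main(String[] args) throws InterruptedException {
        // Collect every instance the threads observe; identity-based, since equals() is not overridden.
        Set<SyncSingleton> seen = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> seen.add(getInstance()));
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(seen.size()); // 1
    }
}
```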
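But now there is another problem: we have bluntly turned getInstance() into a synchronized method, and we would like the lock to be finer-grained. The end result is DCL (Double Check Lock). It looks perfect and in practice it very rarely goes wrong, but it can still fail.

The flaw lies in the possible reordering of `INSTANCE = new MyObject();`. In the JVM, creating an object with new takes three steps:

1. Allocate memory for the object
2. Initialize the object's member variables
3. Point the reference at that memory

Suppose these three steps are reordered, for example to 1, 3, 2. One thread starts creating the object in the order 1, 3, 2. When it has executed step 3, the reference already points at the memory, but the memory still holds default values because initialization has not run yet. A second thread now arrives, checks the reference, finds that it already points at some memory (that is, it is no longer null), and walks away with an object that has not been initialized.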
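This may never show up even in a highly concurrent environment, but under extremely high concurrency it can happen.

To rule it out, we declare the INSTANCE field volatile, which prevents the reordering around the assignment.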
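Putting the pieces together, a DCL singleton with a volatile field could be sketched as follows. The class name `DclSingleton` and the `value` field are hypothetical, added only so there is some state that must be initialized before the object is published.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical class name, used only for illustration.
final class DclSingleton {
    // volatile prevents the reference from being published before the constructor has run.
    private static volatile DclSingleton INSTANCE;

    private final int value;

    private DclSingleton() {
        value = 42; // stands in for real initialization work
    }

    static DclSingleton getInstance() {
        if (INSTANCE == null) {                      // first check, no lock
            synchronized (DclSingleton.class) {
                if (INSTANCE == null) {              // second check, under the lock
                    INSTANCE = new DclSingleton();
                }
            }
        }
        return INSTANCE;
    }

    int value() {
        return value;
    }

    public static void main(String[] args) throws InterruptedException {
        Set<DclSingleton> seen = ConcurrentHashMap.newKeySet();
        Thread[] threads = new Thread[8];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> seen.add(getInstance()));
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println(seen.size() + " " + getInstance().value()); // 1 42
    }
}
```

After the first initialization, callers take only the unlocked fast path, which is the finer lock granularity DCL is after.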
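volatile cannot replace synchronized. volatile guarantees visibility only, not atomicity. Take an increment statement such as count++: it executes in at least three steps (read the value, add one, write it back), and another thread can easily cut in between those steps. So volatile cannot prevent the inconsistencies that arise when several threads update shared data.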
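The lost-update effect can be observed with a small experiment. This is a sketch under stated assumptions: the class `VolatileCounterDemo` and its `run` method are hypothetical names, and `AtomicInteger` is used here as one atomic alternative for comparison (a `synchronized` block would work as well).

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class; names are illustrative only.
final class VolatileCounterDemo {
    private static volatile int volatileCount = 0;
    private static final AtomicInteger atomicCount = new AtomicInteger();

    // Starts `threads` threads that each increment both counters `perThread` times.
    // Returns { volatile total, atomic total }.
    static int[] run(int threads, int perThread) {
        volatileCount = 0;
        atomicCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    volatileCount++;               // read, add, write: not atomic
                    atomicCount.incrementAndGet(); // a single atomic operation
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return new int[] { volatileCount, atomicCount.get() };
    }

    public static void main(String[] args) {
        int[] totals = run(10, 100_000);
        System.out.println("volatile: " + totals[0] + " (at most 1000000, usually less)");
        System.out.println("atomic:   " + totals[1]);
    }
}
```

The atomic counter always ends at threads × perThread, while the volatile counter can end lower because concurrent increments overwrite each other.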
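Cache line alignment: a cache line, 64 bytes, is the basic unit the CPU synchronizes. Keeping hot fields on separate cache lines (cache line isolation) is more efficient than letting them suffer false sharing. Disruptor is a well-known example of this technique.

Note that JDK 8 introduced the @sun.misc.Contended annotation to achieve cache line isolation. To use this annotation in your own classes, you must lift the restriction with the JVM flag -XX:-RestrictContended.

In addition, the Java compiler or the JIT compiler may remove unused fields, so padding fields must be declared volatile.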
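```java
package com.mashibing.juc.c_028_FalseSharing;

public class T02_CacheLinePadding {

    // Seven longs (56 bytes) of padding, so that the x fields of two T objects
    // are unlikely to end up on the same 64-byte cache line.
    private static class Padding {
        public volatile long p1, p2, p3, p4, p5, p6, p7;
    }

    private static class T extends Padding {
        public volatile long x = 0L;
    }

    public static T[] arr = new T[2];

    static {
        arr[0] = new T();
        arr[1] = new T();
    }

    public static void main(String[] args) throws Exception {
        // Each thread hammers its own element, so any slowdown comes from cache line contention.
        Thread t1 = new Thread(() -> {
            for (long i = 0; i < 10_000_000L; i++) {
                arr[0].x = i;
            }
        });

        Thread t2 = new Thread(() -> {
            for (long i = 0; i < 10_000_000L; i++) {
                arr[1].x = i;
            }
        });

        final long start = System.nanoTime();
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // Elapsed time in milliseconds.
        System.out.println((System.nanoTime() - start) / 1_000_000);
    }
}
```

MESI

False sharing

Write combining: a small buffer inside the CPU. The original note describes it as a 4-byte buffer; it is more commonly described as four write-combining buffers on some Intel CPUs, which is what the 6-array versus 3 + 3-array comparison below is designed to exercise.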
```java
package com.mashibing.juc.c_029_WriteCombining;

public final class WriteCombining {

    private static final int ITERATIONS = Integer.MAX_VALUE;
    private static final int ITEMS = 1 << 24;
    private static final int MASK = ITEMS - 1;

    private static final byte[] arrayA = new byte[ITEMS];
    private static final byte[] arrayB = new byte[ITEMS];
    private static final byte[] arrayC = new byte[ITEMS];
    private static final byte[] arrayD = new byte[ITEMS];
    private static final byte[] arrayE = new byte[ITEMS];
    private static final byte[] arrayF = new byte[ITEMS];

    public static void main(final String[] args) {
        for (int i = 1; i <= 3; i++) {
            System.out.println(i + " SingleLoop duration (ns) = " + runCaseOne());
            System.out.println(i + " SplitLoop duration (ns) = " + runCaseTwo());
        }
    }

    // Case one: a single loop that writes to all six arrays per iteration.
    public static long runCaseOne() {
        long start = System.nanoTime();
        int i = ITERATIONS;

        while (--i != 0) {
            int slot = i & MASK;
            byte b = (byte) i;
            arrayA[slot] = b;
            arrayB[slot] = b;
            arrayC[slot] = b;
            arrayD[slot] = b;
            arrayE[slot] = b;
            arrayF[slot] = b;
        }
        return System.nanoTime() - start;
    }

    // Case two: the same work split into two loops of three arrays each.
    public static long runCaseTwo() {
        long start = System.nanoTime();
        int i = ITERATIONS;
        while (--i != 0) {
            int slot = i & MASK;
            byte b = (byte) i;
            arrayA[slot] = b;
            arrayB[slot] = b;
            arrayC[slot] = b;
        }
        i = ITERATIONS;
        while (--i != 0) {
            int slot = i & MASK;
            byte b = (byte) i;
            arrayD[slot] = b;
            arrayE[slot] = b;
            arrayF[slot] = b;
        }
        return System.nanoTime() - start;
    }
}
```

Instruction reordering
```java
package com.mashibing.jvm.c3_jmm;

public class T04_Disorder {
    private static int x = 0, y = 0;
    private static int a = 0, b = 0;

    public static void main(String[] args) throws InterruptedException {
        int i = 0;
        for (;;) {
            i++;
            x = 0; y = 0;
            a = 0; b = 0;
            Thread one = new Thread(new Runnable() {
                public void run() {
                    // Thread one starts first, so the next line can make it wait a little for thread two.
                    // Readers can adjust the wait time to suit their own machine.
                    //shortWait(100000);
                    a = 1;
                    x = b;
                }
            });

            Thread other = new Thread(new Runnable() {
                public void run() {
                    b = 1;
                    y = a;
                }
            });
            one.start(); other.start();
            one.join(); other.join();
            String result = "Run " + i + " (" + x + "," + y + ")";
            // (0,0) is only possible if the write and the read inside a thread were reordered.
            if (x == 0 && y == 0) {
                System.err.println(result);
                break;
            } else {
                //System.out.println(result);
            }
        }
    }

    // Busy-waits for roughly `interval` nanoseconds.
    public static void shortWait(long interval) {
        long start = System.nanoTime();
        long end;
        do {
            end = System.nanoTime();
        } while (start + interval >= end);
    }
}
```
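For contrast, the same experiment can be repeated with `a` and `b` declared volatile. This is a sketch, not part of the original course code: the class `OrderedWithVolatile` and the method `countZeroZero` are hypothetical names. Because volatile reads and writes may not be reordered with each other, the outcome (0,0) is no longer possible, so the loop is bounded by a round count instead of running until (0,0) appears.

```java
// Hypothetical variant of T04_Disorder; names are illustrative only.
final class OrderedWithVolatile {
    // Only a and b need volatile: they are the variables both threads write and read concurrently.
    private static volatile int a = 0, b = 0;
    // x and y are read by the main thread after join(), which already guarantees visibility.
    private static int x = 0, y = 0;

    // Repeats the experiment `rounds` times and returns how many rounds ended with (0,0).
    static int countZeroZero(int rounds) {
        int hits = 0;
        for (int i = 0; i < rounds; i++) {
            x = 0; y = 0;
            a = 0; b = 0;
            Thread one = new Thread(() -> { a = 1; x = b; });
            Thread other = new Thread(() -> { b = 1; y = a; });
            one.start();
            other.start();
            try {
                one.join();
                other.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
            if (x == 0 && y == 0) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        System.out.println(countZeroZero(10_000)); // 0
    }
}
```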
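### How the system implements data consistency at the lowest level

1. If MESI can solve it, use MESI
2. If not, lock the bus

### How the system guarantees ordering at the lowest level

1. Memory barriers: system primitives such as sfence, mfence, and lfence
2. Locking the bus

### How volatile solves instruction reordering

1. Source code: volatile i
2. Bytecode: ACC_VOLATILE
3. JVM memory barriers: instructions on the two sides of a barrier may not be reordered across it, which guarantees ordering (happens-before, as-if-serial)
4. HotSpot implementation

bytecodeinterpreter.cpp

```c++
int field_offset = cache->f2_as_index();
          if (cache->is_volatile()) {
            if (support_IRIW_for_not_multiple_copy_atomic_cpu) {
              OrderAccess::fence();
            }
```

orderaccess_linux_x86.inline.hpp

```c++
inline void OrderAccess::fence() {
  if (os::is_MP()) {
    // always use locked addl since mfence is sometimes expensive
#ifdef AMD64
    __asm__ volatile ("lock; addl $0,0(%%rsp)" : : : "cc", "memory");
#else
    __asm__ volatile ("lock; addl $0,0(%%esp)" : : : "cc", "memory");
#endif
  }
}
```

The LOCK prefix gives an instruction exclusive use of shared memory while it executes on a multiprocessor. Its effect is to flush the contents of the current processor's cache to memory and to invalidate the corresponding caches of other processors. It also acts as a memory barrier that ordered instructions cannot cross.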
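The ACC_VOLATILE flag from step 2 of the list above is stored in the class file and can be observed from Java through reflection. The following is a small sketch; the class `VolatileFlagDemo` and its helper `isVolatile` are hypothetical names, while `Modifier.isVolatile` is the standard reflection call.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

// Hypothetical demo class; names are illustrative only.
final class VolatileFlagDemo {
    static volatile int i = 0; // compiled with the ACC_VOLATILE access flag (0x0040)
    static int plain = 0;      // no ACC_VOLATILE flag

    // Returns true if the named field of this class carries the volatile modifier.
    static boolean isVolatile(String fieldName) {
        try {
            Field field = VolatileFlagDemo.class.getDeclaredField(fieldName);
            return Modifier.isVolatile(field.getModifiers());
        } catch (NoSuchFieldException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(isVolatile("i"));     // true
        System.out.println(isVolatile("plain")); // false
    }
}
```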
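Install hsdis

Code

```java
public class T {

    public static volatile int i = 0;

    public static void main(String[] args) {
        // The loop variable i shadows the static field; the loop only serves to trigger JIT compilation.
        for (int i = 0; i < 1000000; i++) {
            m();
            n();
        }
    }

    public static synchronized void m() {

    }

    public static void n() {
        i = 1; // volatile write to the static field
    }
}
```

```shell
java -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly T > 1.txt
```

Because the JIT generates assembly for all compiled code, search for T::m and T::n to find the assembly for the m() and n() methods.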
```
============================= C1-compiled nmethod ==============================
----------------------------------- Assembly -----------------------------------

Compiled method (c1)      67    1       3       java.lang.Object::<init> (1 bytes)
 total in heap  [0x00007f81d4d33010,0x00007f81d4d33360] = 848
 relocation     [0x00007f81d4d33170,0x00007f81d4d33198] = 40
 main code      [0x00007f81d4d331a0,0x00007f81d4d33260] = 192
 stub code      [0x00007f81d4d33260,0x00007f81d4d332f0] = 144
 metadata       [0x00007f81d4d332f0,0x00007f81d4d33300] = 16
 scopes data    [0x00007f81d4d33300,0x00007f81d4d33318] = 24
 scopes pcs     [0x00007f81d4d33318,0x00007f81d4d33358] = 64
 dependencies   [0x00007f81d4d33358,0x00007f81d4d33360] = 8

--------------------------------------------------------------------------------
[Constant Pool (empty)]

--------------------------------------------------------------------------------

[Entry Point]
  # {method} {0x00007f81d3cfe650} '<init>' '()V' in 'java/lang/Object'
  #           [sp+0x40]  (sp of caller)
  0x00007f81d4d331a0: mov 0x8(%rsi),%r10d
  0x00007f81d4d331a4: shl $0x3,%r10
  0x00007f81d4d331a8: cmp %rax,%r10
  0x00007f81d4d331ab: jne 0x00007f81d47eed00 ; {runtime_call ic_miss_stub}
  0x00007f81d4d331b1: data16 data16 nopw 0x0(%rax,%rax,1)
  0x00007f81d4d331bc: data16 data16 xchg %ax,%ax
[Verified Entry Point]
  0x00007f81d4d331c0: mov %eax,-0x14000(%rsp)
  0x00007f81d4d331c7: push %rbp
  0x00007f81d4d331c8: sub $0x30,%rsp
  0x00007f81d4d331cc: movabs $0x7f81d3f33388,%rdi ; {metadata(method data for {method} {0x00007f81d3cfe650} '<init>' '()V' in 'java/lang/Object')}
  0x00007f81d4d331d6: mov 0x13c(%rdi),%ebx
  0x00007f81d4d331dc: add $0x8,%ebx
  0x00007f81d4d331df: mov %ebx,0x13c(%rdi)
  0x00007f81d4d331e5: and $0x1ff8,%ebx
  0x00007f81d4d331eb: cmp $0x0,%ebx
  0x00007f81d4d331ee: je 0x00007f81d4d33204 ;*return {reexecute=0 rethrow=0 return_oop=0}
                                            ; - java.lang.Object::<init>@0 (line 50)
  0x00007f81d4d331f4: add $0x30,%rsp
  0x00007f81d4d331f8: pop %rbp
  0x00007f81d4d331f9: mov 0x108(%r15),%r10
  0x00007f81d4d33200: test %eax,(%r10) ; {poll_return}
  0x00007f81d4d33203: retq
  0x00007f81d4d33204: movabs $0x7f81d3cfe650,%r10 ; {metadata({method} {0x00007f81d3cfe650} '<init>' '()V' in 'java/lang/Object')}
  0x00007f81d4d3320e: mov %r10,0x8(%rsp)
  0x00007f81d4d33213: movq $0xffffffffffffffff,(%rsp)
  0x00007f81d4d3321b: callq 0x00007f81d489e000 ; ImmutableOopMap {rsi=Oop }
                                               ;*synchronization entry
                                               ; - java.lang.Object::<init>@-1 (line 50)
                                               ; {runtime_call counter_overflow Runtime1 stub}
  0x00007f81d4d33220: jmp 0x00007f81d4d331f4
  0x00007f81d4d33222: nop
  0x00007f81d4d33223: nop
  0x00007f81d4d33224: mov 0x3f0(%r15),%rax
  0x00007f81d4d3322b: movabs $0x0,%r10
  0x00007f81d4d33235: mov %r10,0x3f0(%r15)
  0x00007f81d4d3323c: movabs $0x0,%r10
  0x00007f81d4d33246: mov %r10,0x3f8(%r15)
  0x00007f81d4d3324d: add $0x30,%rsp
  0x00007f81d4d33251: pop %rbp
  0x00007f81d4d33252: jmpq 0x00007f81d480be80 ; {runtime_call unwind_exception Runtime1 stub}
  0x00007f81d4d33257: hlt
  0x00007f81d4d33258: hlt
  0x00007f81d4d33259: hlt
  0x00007f81d4d3325a: hlt
  0x00007f81d4d3325b: hlt
  0x00007f81d4d3325c: hlt
  0x00007f81d4d3325d: hlt
  0x00007f81d4d3325e: hlt
  0x00007f81d4d3325f: hlt
[Exception Handler]
  0x00007f81d4d33260: callq 0x00007f81d489ad00 ; {no_reloc}
  0x00007f81d4d33265: mov %rsp,-0x28(%rsp)
  0x00007f81d4d3326a: sub $0x80,%rsp
  0x00007f81d4d33271: mov %rax,0x78(%rsp)
  0x00007f81d4d33276: mov %rcx,0x70(%rsp)
  0x00007f81d4d3327b: mov %rdx,0x68(%rsp)
  0x00007f81d4d33280: mov %rbx,0x60(%rsp)
  0x00007f81d4d33285: mov %rbp,0x50(%rsp)
  0x00007f81d4d3328a: mov %rsi,0x48(%rsp)
  0x00007f81d4d3328f: mov %rdi,0x40(%rsp)
  0x00007f81d4d33294: mov %r8,0x38(%rsp)
  0x00007f81d4d33299: mov %r9,0x30(%rsp)
  0x00007f81d4d3329e: mov %r10,0x28(%rsp)
  0x00007f81d4d332a3: mov %r11,0x20(%rsp)
  0x00007f81d4d332a8: mov %r12,0x18(%rsp)
  0x00007f81d4d332ad: mov %r13,0x10(%rsp)
  0x00007f81d4d332b2: mov %r14,0x8(%rsp)
  0x00007f81d4d332b7: mov %r15,(%rsp)
  0x00007f81d4d332bb: movabs $0x7f81f15ff3e2,%rdi ; {external_word}
  0x00007f81d4d332c5: movabs $0x7f81d4d33265,%rsi ; {internal_word}
  0x00007f81d4d332cf: mov %rsp,%rdx
  0x00007f81d4d332d2: and $0xfffffffffffffff0,%rsp
  0x00007f81d4d332d6: callq 0x00007f81f1108240 ; {runtime_call}
  0x00007f81d4d332db: hlt
[Deopt Handler Code]
  0x00007f81d4d332dc: movabs $0x7f81d4d332dc,%r10 ; {section_word}
  0x00007f81d4d332e6: push %r10
  0x00007f81d4d332e8: jmpq 0x00007f81d47ed0a0 ; {runtime_call DeoptimizationBlob}
  0x00007f81d4d332ed: hlt
  0x00007f81d4d332ee: hlt
  0x00007f81d4d332ef: hlt
--------------------------------------------------------------------------------

============================= C1-compiled nmethod ==============================
----------------------------------- Assembly -----------------------------------

Compiled method (c1)      74    2       3       java.lang.StringLatin1::hashCode (42 bytes)
 total in heap  [0x00007f81d4d33390,0x00007f81d4d338a8] = 1304
 relocation     [0x00007f81d4d334f0,0x00007f81d4d33528] = 56
 main code      [0x00007f81d4d33540,0x00007f81d4d336c0] = 384
 stub code      [0x00007f81d4d336c0,0x00007f81d4d33750] = 144
 metadata       [0x00007f81d4d33750,0x00007f81d4d33758] = 8
 scopes data    [0x00007f81d4d33758,0x00007f81d4d337c0] = 104
 scopes pcs     [0x00007f81d4d337c0,0x00007f81d4d33890] = 208
 dependencies   [0x00007f81d4d33890,0x00007f81d4d33898] = 8
 nul chk table  [0x00007f81d4d33898,0x00007f81d4d338a8] = 16

--------------------------------------------------------------------------------
[Constant Pool (empty)]

--------------------------------------------------------------------------------

[Verified Entry Point]
  # {method} {0x00007f81d3e6ddd0} 'hashCode' '([B)I' in 'java/lang/StringLatin1'
  # parm0:    rsi:rsi   = '[B'
  #           [sp+0x40]  (sp of caller)
  0x00007f81d4d33540: mov %eax,-0x14000(%rsp)
```

[The full output is too long to include, roughly 140,000+ characters. If you are interested, send me a private message for the complete results.]