[PATCH][RFC] HLE support proposal

Michael Liao michael.liao at intel.com
Tue Apr 16 18:34:44 PDT 2013


Hi,

One issue with the last proposal for adding HLE support is that it requires
new features in SelectionDAG to propagate the HLE hint from LLVM IR into
the backend. As we're planning to move away from SelectionDAG, this
proposal takes an alternative approach: refactor atomic instruction code
generation so that the HLE hint can be passed into the backend, and hence
enable HLE code generation.

To minimize the dependence on SelectionDAG (as we have no way to bypass
it completely), this proposal adds a series of intrinsics that map to
native atomic instructions, e.g. llvm.x86.cas.* to x86's CMPXCHG, and adds
a pass just before instruction selection to transform all atomic
instructions into target-specific native intrinsics. For atomic
instructions not supported directly by hardware, that pass will transform
them into a CAS (compare-and-swap) loop or an LLSC (load-link &
store-conditional) loop after querying the target. E.g. atomicrmw max
%Ptr, %Val is not directly supported, so it will be translated into a CAS
loop or an LLSC loop as follows:

---- CAS loop ----
Orig := load(Ptr);
do {
  Old := PHI(Orig, Curr);
  New := max(Old, Val);
  {Curr, Flag} = CAS(Ptr, Old, New);
} while (!Flag);
v := Curr;

where {Curr, Flag} is the return value of CAS, Curr is the current value
in that memory and Flag indicates whether the New value is stored in
that location.
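
For readers who prefer source-level code, the CAS loop above can be
sketched in C++ with std::atomic. This is only an illustration of the
loop shape, not the code the pass generates; the function name
fetch_max_cas_loop is hypothetical, and compare_exchange_strong stands in
for the proposed CAS intrinsic. On failure it writes the current memory
value back into `old`, which plays the role of the PHI node.

```cpp
#include <algorithm>
#include <atomic>
#include <cstdint>

// Hypothetical helper: atomic "max" built from a CAS loop.
// Returns the value that was in memory before the update.
int64_t fetch_max_cas_loop(std::atomic<int64_t> &ptr, int64_t val) {
  // Orig := load(Ptr)
  int64_t old = ptr.load(std::memory_order_relaxed);
  int64_t desired;
  do {
    // New := max(Old, Val)
    desired = std::max(old, val);
    // {Curr, Flag} = CAS(Ptr, Old, New); on failure `old` is refreshed
    // with the current contents of memory, so the loop retries with it.
  } while (!ptr.compare_exchange_strong(old, desired,
                                        std::memory_order_acquire,
                                        std::memory_order_relaxed));
  return old;
}
```

The acquire ordering on success mirrors the ordering used in the attached
test case; any other valid ordering could be substituted.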

---- LLSC loop ----
do {
  Curr := LL(Ptr);
  New := max(Curr, Val);
  Flag := SC(Ptr, New);
} while (!Flag);
v := Curr;

where Flag indicates whether the SC succeeds.

(LLSC is available for most RISC targets. x86 has CAS only.)
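
C++ has no portable LL/SC primitives, so the sketch below uses a toy,
single-threaded model of one memory cell to show how the Flag returned
by SC drives the retry loop. Everything here (LLSCCell, load_link,
store_conditional, plain_store, llsc_max) is hypothetical and exists only
for illustration; real hardware tracks the reservation itself.

```cpp
#include <algorithm>
#include <cstdint>

// Toy model of a memory cell with LL/SC semantics (not thread-safe).
struct LLSCCell {
  int64_t value = 0;
  uint64_t version = 0;        // bumped on every successful store
  uint64_t reserved = 0;       // version observed by the last LL
  bool has_reservation = false;

  // LL: read the value and remember which version was seen.
  int64_t load_link() {
    reserved = version;
    has_reservation = true;
    return value;
  }

  // An ordinary store from "someone else"; it invalidates reservations.
  void plain_store(int64_t v) {
    value = v;
    ++version;
  }

  // SC: store only if nothing was written since the matching LL.
  bool store_conditional(int64_t v) {
    bool ok = has_reservation && reserved == version;
    has_reservation = false;
    if (ok) {
      value = v;
      ++version;
    }
    return ok;
  }
};

// Atomic "max" as an LL/SC loop; returns the value seen before the update.
int64_t llsc_max(LLSCCell &cell, int64_t val) {
  int64_t curr;
  bool flag;
  do {
    curr = cell.load_link();                 // Curr := LL(Ptr)
    int64_t next = std::max(curr, val);      // New := max(Curr, Val)
    flag = cell.store_conditional(next);     // Flag := SC(Ptr, New)
  } while (!flag);
  return curr;
}
```

In this model an intervening plain_store between load_link and
store_conditional makes the SC fail, which is the case the loop retries.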

With this atomic IR lowering pass, all atomic instructions are translated
into target native intrinsics or code sequences based on them. To support
HLE, the target native intrinsics will be extended with one extra
parameter, the HLE hint. Targets supporting HLE will lower them
accordingly, following the target ISA spec.
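
As a rough illustration of how such a hint could select the prefix, the
sketch below maps the metadata strings used in the attached test case
("acquire", "release") to the mnemonic sequence the test expects around
the CAS instruction. The names HLEHint, parse_hle_hint and cas_mnemonics
are hypothetical and are not taken from the patches; the actual lowering
lives in the X86 backend.

```cpp
#include <string>
#include <vector>

// Hypothetical encoding of the extra HLE hint parameter.
enum class HLEHint { None, Acquire, Release };

// Map the !hle.lock metadata string to a hint (unknown strings mean None).
HLEHint parse_hle_hint(const std::string &md) {
  if (md == "acquire") return HLEHint::Acquire;
  if (md == "release") return HLEHint::Release;
  return HLEHint::None;
}

// Mnemonics expected around a locked CAS, in the order the test checks
// them: "lock", then the optional HLE prefix, then the CAS instruction.
std::vector<std::string> cas_mnemonics(HLEHint hint, bool target_has_hle,
                                       const std::string &cas = "cmpxchg") {
  std::vector<std::string> out{"lock"};
  if (target_has_hle) {
    if (hint == HLEHint::Acquire) out.push_back("xacquire");
    else if (hint == HLEHint::Release) out.push_back("xrelease");
  }
  out.push_back(cas);
  return out;
}
```

Without the HLE feature the hint is simply dropped, which matches the
X32/X64 check lines in the test where no prefix appears.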

Before completing the full patch, I have attached the early patches for
review and demonstration purposes.

- 0001-Add-CAS-intrinsic-to-help-refactoring-Atomic-support.patch
  This patch adds llvm.x86.cas.* and llvm.x86.dcas.*, which will be
mapped to CMPXCHG, CMPXCHG8B, CMPXCHG16B separately.

- 0002-Add-X86-atomic-IR-lower-pass.patch
  This patch adds a pre-isel pass to lower all atomic instructions into
target native intrinsics or CAS/LLSC loops.

- 0003-Add-XACQ-XREL-prefix-and-encoding-asm-printer-suppor.patch
  This patch adds X86 HLE instruction encoding and assembler support.

- 0004-Add-HLE-code-generation.patch
  This patch extends X86 native intrinsics to propagate HLE hint and
enable HLE code generation.

An early test case (hle-atomic-max.ll) is also attached for your reference.

Thanks for review
- Michael



-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Add-CAS-intrinsic-to-help-refactoring-Atomic-support.patch
Type: text/x-patch
Size: 15357 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130416/565c2f90/attachment.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Add-X86-atomic-IR-lower-pass.patch
Type: text/x-patch
Size: 23442 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130416/565c2f90/attachment-0001.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0003-Add-XACQ-XREL-prefix-and-encoding-asm-printer-suppor.patch
Type: text/x-patch
Size: 8751 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130416/565c2f90/attachment-0002.bin>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0004-Add-HLE-code-generation.patch
Type: text/x-patch
Size: 27488 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20130416/565c2f90/attachment-0003.bin>
-------------- next part --------------
; RUN: llc < %s -x86-disable-atomic-IR-lower=0 -mtriple=i686-pc-linux -mcpu=corei7-avx | FileCheck -check-prefix X32 %s
; RUN: llc < %s -x86-disable-atomic-IR-lower=0 -mtriple=x86_64-pc-linux -mcpu=corei7-avx | FileCheck -check-prefix X64 %s
; RUN: llc < %s -x86-disable-atomic-IR-lower=0 -mtriple=i686-pc-linux -mcpu=corei7-avx -mattr=+hle | FileCheck -check-prefix HLE32 %s
; RUN: llc < %s -x86-disable-atomic-IR-lower=0 -mtriple=x86_64-pc-linux -mcpu=corei7-avx -mattr=+hle | FileCheck -check-prefix HLE64 %s

@sc = external global i64

define i64 @test0() {
  %r = atomicrmw max i64* @sc, i64 3 acquire, !hle.lock !0
  ret i64 %r
; X32: test0
; X32: cmp
; X32: test
; X32: test
; X32: cmov
; X32: cmov
; X32: lock
; X32-NEXT: cmpxchg8b
; X32-NEXT: jne
; X32: ret

; X64: test0
; X64: cmp
; X64: cmov
; X64: lock
; X64-NEXT: cmpxchg
; X64-NEXT: jne
; X64: ret

; HLE32: test0
; HLE32: cmp
; HLE32: test
; HLE32: test
; HLE32: cmov
; HLE32: cmov
; HLE32: lock
; HLE32-NEXT: xacquire
; HLE32-NEXT: cmpxchg8b
; HLE32-NEXT: jne
; HLE32: ret

; HLE64: test0
; HLE64: cmp
; HLE64: cmov
; HLE64: lock
; HLE64-NEXT: xacquire
; HLE64-NEXT: cmpxchg
; HLE64-NEXT: jne
; HLE64: ret
}

define i64 @test1(i64 %x) {
  %r = atomicrmw max i64* @sc, i64 %x acquire, !hle.lock !1
  ret i64 %r
; X32: test1
; X32: cmp
; X32: cmp
; X32: test
; X32: cmov
; X32: cmov
; X32: lock
; X32-NEXT: cmpxchg8b
; X32-NEXT: jne
; X32: ret

; X64: test1
; X64: cmp
; X64: cmov
; X64: lock
; X64-NEXT: cmpxchg
; X64-NEXT: jne
; X64: ret

; HLE32: test1
; HLE32: cmp
; HLE32: cmp
; HLE32: test
; HLE32: cmov
; HLE32: cmov
; HLE32: lock
; HLE32-NEXT: xrelease
; HLE32-NEXT: cmpxchg8b
; HLE32-NEXT: jne
; HLE32: ret

; HLE64: test1
; HLE64: cmp
; HLE64: cmov
; HLE64: lock
; HLE64-NEXT: xrelease
; HLE64-NEXT: cmpxchg
; HLE64-NEXT: jne
; HLE64: ret
}

define i64 @test2() {
  %r = atomicrmw min i64* @sc, i64 3 acquire, !hle.lock !0
  ret i64 %r
; X32: test2
; X32: cmp
; X32: test
; X32: test
; X32: cmov
; X32: cmov
; X32: lock
; X32-NEXT: cmpxchg8b
; X32-NEXT: jne
; X32: ret

; X64: test2
; X64: cmp
; X64: cmov
; X64: lock
; X64-NEXT: cmpxchg
; X64-NEXT: jne
; X64: ret

; HLE32: test2
; HLE32: cmp
; HLE32: test
; HLE32: test
; HLE32: cmov
; HLE32: cmov
; HLE32: lock
; HLE32-NEXT: xacquire
; HLE32-NEXT: cmpxchg8b
; HLE32-NEXT: jne
; HLE32: ret

; HLE64: test2
; HLE64: cmp
; HLE64: cmov
; HLE64: lock
; HLE64-NEXT: xacquire
; HLE64-NEXT: cmpxchg
; HLE64-NEXT: jne
; HLE64: ret
}

define i64 @test3(i64 %x) {
  %r = atomicrmw min i64* @sc, i64 %x acquire, !hle.lock !1
  ret i64 %r
; X32: test3
; X32: cmp
; X32: cmp
; X32: test
; X32: cmov
; X32: cmov
; X32: lock
; X32-NEXT: cmpxchg8b
; X32-NEXT: jne
; X32: ret

; X64: test3
; X64: cmp
; X64: cmov
; X64: lock
; X64-NEXT: cmpxchg
; X64-NEXT: jne
; X64: ret

; HLE32: test3
; HLE32: cmp
; HLE32: cmp
; HLE32: test
; HLE32: cmov
; HLE32: cmov
; HLE32: lock
; HLE32-NEXT: xrelease
; HLE32-NEXT: cmpxchg8b
; HLE32-NEXT: jne
; HLE32: ret

; HLE64: test3
; HLE64: cmp
; HLE64: cmov
; HLE64: lock
; HLE64-NEXT: xrelease
; HLE64-NEXT: cmpxchg
; HLE64-NEXT: jne
; HLE64: ret
}

!0 = metadata !{metadata !"acquire"}
!1 = metadata !{metadata !"release"}

