[Openmp-commits] [llvm] [libcxxabi] [lldb] [compiler-rt] [clang-tools-extra] [clang] [openmp] [flang] [mlir] [libcxx] [MachineCopyPropagation] When the source of PreviousCopy is undef, we cannot replace sub register (PR #74682)

via Openmp-commits openmp-commits at lists.llvm.org
Thu Dec 7 17:02:17 PST 2023


DianQK wrote:

> Hello. I think that if you removed undef from the first instruction the result would still be incorrect. With:
> 
> ```
> $x8 = ORRXrs $xzr, $x0, 0, implicit $w0
> $w8 = ORRWrs $wzr, $w0, 0, implicit-def $x8
> ```

I'm also curious about it, but this transformation has been around for a long time, so I'm assuming it's correct. I suspect some other pass handles this case.

> The top bits are not zero into the function?

Yes. Please see the link https://llvm.godbolt.org/z/zv3vz8zcs.

```asm
_call_from_main:
        sub     sp, sp, #32
        stp     x29, x30, [sp, #16]
        add     x29, sp, #16
        mov     w8, #1
        str     w8, [sp, #4] ;;; We save `0x00000001` on stack
        add     x0, sp, #4 ;;; Save the address
        bl      _device_create_texture
        ldp     x29, x30, [sp, #16]
        add     sp, sp, #32
        ret
_device_create_texture:
        sub     sp, sp, #32
        stp     x29, x30, [sp, #16]
        add     x29, sp, #16
        ldr     x8, [x0] ;;; Load 8-byte, get `0x6f60400000000001`
        str     x8, [sp]
        ldr     w9, [x0, #8]
        str     w9, [sp, #8]
        ubfx    x1, x8, #32, #8
        mov     x0, x8
        bl      __ZN12repro_11790213TextureFormat17required_features17h5e4d5de64322154bE
        ..
        ret
__ZN12repro_11790213TextureFormat17required_features17h5e4d5de64322154bE:
        mov     x8, x0 ;;;  x8/x0 is `0x6f60400000000001`
        mov     w0, #0
        adrp    x9, LJTI10_0@PAGE
        add     x9, x9, LJTI10_0@PAGEOFF
        adr     x10, LBB10_1
        ldrb    w11, [x9, x8] ;;; We expect x8 is `0x0000000000000001`
```
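To make the comments in the listing concrete, here is a minimal sketch of how a 4-byte store followed by an 8-byte load from the same address can produce `0x6f60400000000001`. The frame layout and the leftover bytes next to the stored word are assumptions, chosen only to reproduce the value shown above; this is a model, not the actual memory contents.

```python
import struct

# Hypothetical 32-byte stack frame (little-endian, as on AArch64 Darwin).
frame = bytearray(32)

# Assumed leftover bytes at sp+8..sp+11, picked to match the listing's value.
struct.pack_into("<I", frame, 8, 0x6F604000)

# str w8, [sp, #4]: only the 4-byte value 1 is written.
struct.pack_into("<I", frame, 4, 1)

# ldr x8, [x0] with x0 = sp + 4: 8 bytes are read from the same address,
# so the upper half comes from whatever sits after the stored word.
(x8,) = struct.unpack_from("<Q", frame, 4)

print(hex(x8))
```

Only the low 32 bits of the loaded value are meaningful to the callee; the upper 32 bits are whatever happened to be in memory.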

In this function we cannot guarantee that `x8`/`x0` is `0x1`, so a zero-extending `mov w8, w8` is required.
I therefore think that deleting the subregister copy relies on a convention similar to this one. In the following sequence we can delete `mov w8, w0`:

```asm
mov x8, x0
mov w8, w0
mov w8, w8
```
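A small model of the register semantics may help here. On AArch64, a write to a `w` register zeroes the upper 32 bits of the corresponding `x` register. The helper names `mov_x` and `mov_w` below are hypothetical and exist only for this illustration.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def mov_x(src):
    """Model `mov xD, xS`: all 64 bits are copied."""
    return src & MASK64

def mov_w(src):
    """Model `mov wD, wS`: low 32 bits are copied, upper 32 bits of xD become zero."""
    return src & MASK32

x0 = 0x6F60400000000001   # value from the listing; upper half is garbage

x8 = mov_x(x0)            # mov x8, x0  -> garbage upper bits survive
after_x_copy = x8
x8 = mov_w(x0)            # mov w8, w0  -> upper bits cleared
x8 = mov_w(x8)            # mov w8, w8  -> upper bits cleared again

# Dropping the middle instruction gives the same final value,
# because the trailing `mov w8, w8` zero-extends either way.
without_middle = mov_w(mov_x(x0))
print(hex(after_x_copy), hex(x8), hex(without_middle))
```

In this model the middle copy is redundant only because a zero-extending write follows it; without the trailing `mov w8, w8`, removing `mov w8, w0` would leave the garbage upper bits in `x8`.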

With `-O0`, I get the following result:

```asm
__ZN12repro_11790213TextureFormat17required_features17h5e4d5de64322154bE:
        sub     sp, sp, #16
        str     w1, [sp, #4]
        subs    w8, w0, #0
        mov     w8, w8 ; This looks like a transformation that guarantees that x8 is `0x0000000000000001`.
```

After that I found the following code in `lowerCopy`.

https://github.com/llvm/llvm-project/blob/0808be47b8fbf0307d0b6f2eb45ba9bfe1b3ae65/llvm/lib/Target/AArch64/AArch64InstrInfo.cpp#L4422-L4429

Lowering the copy to `mov x8, x0` instead of `mov w8, w0` may be somewhat unsafe, so this code leaves an undef.
My PR addresses that comment and the undef.

https://github.com/llvm/llvm-project/pull/74682

