[all-commits] [llvm/llvm-project] 5d0ff9: AMDGPU: Promote array alloca if used by memmove/me...
Ruiling, Song via All-commits
all-commits at lists.llvm.org
Tue Jan 10 18:00:29 PST 2023
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 5d0ff923c3a7cc6c47b6010bbaf68592124110a5
https://github.com/llvm/llvm-project/commit/5d0ff923c3a7cc6c47b6010bbaf68592124110a5
Author: Ruiling Song <ruiling.song at amd.com>
Date: 2023-01-11 (Wed, 11 Jan 2023)
Changed paths:
M llvm/lib/Target/AMDGPU/AMDGPUPromoteAlloca.cpp
M llvm/test/CodeGen/AMDGPU/promote-alloca-array-aggregate.ll
Log Message:
-----------
AMDGPU: Promote array alloca if used by memmove/memcpy
Reviewed by: arsenm
Differential Revision: https://reviews.llvm.org/D140599
Commit: cce24b6af0999c658fd3e4931eb9bc58252478b8
https://github.com/llvm/llvm-project/commit/cce24b6af0999c658fd3e4931eb9bc58252478b8
Author: Ruiling Song <ruiling.song at amd.com>
Date: 2023-01-11 (Wed, 11 Jan 2023)
Changed paths:
M llvm/lib/Target/AMDGPU/SIDefines.h
M llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
Log Message:
-----------
AMDGPU: Remove IsSourceOfDivergence check
This bit is not set or reserved in the .td files. Remove it for now;
we can always add it back if we need it.
Reviewed by: foad
Differential Revision: https://reviews.llvm.org/D141223
Commit: 9119d9bfcef47b245d15fc9d2e5044bc67724bfc
https://github.com/llvm/llvm-project/commit/9119d9bfcef47b245d15fc9d2e5044bc67724bfc
Author: Ruiling Song <ruiling.song at amd.com>
Date: 2023-01-11 (Wed, 11 Jan 2023)
Changed paths:
M llvm/lib/Target/AMDGPU/BUFInstructions.td
M llvm/lib/Target/AMDGPU/DSInstructions.td
M llvm/lib/Target/AMDGPU/FLATInstructions.td
M llvm/lib/Target/AMDGPU/SIDefines.h
M llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp
M llvm/lib/Target/AMDGPU/SIInstrFormats.td
M llvm/lib/Target/AMDGPU/SIInstrInfo.h
M llvm/test/CodeGen/AMDGPU/chain-hi-to-lo.ll
M llvm/test/CodeGen/AMDGPU/load-hi16.ll
M llvm/test/CodeGen/AMDGPU/vector_shuffle.packed.ll
Log Message:
-----------
AMDGPU/SIInsertWait: Skip dummy tied source
For D16 memory load instructions, the hardware usually writes only half
of the 32-bit register, but the MachineIR instruction defines its
destination as the full 32-bit register. LLVM therefore assumes the
instruction always overwrites the whole register, and without the extra
tied source it would treat a previous write to the other half of the
register as dead. Adding the tied source makes LLVM think the
instruction reads the register, so the previous write is kept alive.
However, this dummy tied source introduces an unnecessary
read-after-write dependency. This change bypasses the tied source when
it can be skipped, thus avoiding an unnecessary s_waitcnt.
Reviewed by: foad
Differential Revision: https://reviews.llvm.org/D140537
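
The effect described in the log message above can be illustrated with a
small standalone model. This is only a sketch, not the code from the
commit: the Operand and Inst structs, the IsDummyTiedSrc flag, and the
countWaits and d16Example functions are hypothetical stand-ins for
illustration, and the model tracks read-after-write hazards only
(write-after-write handling and the real counter types are ignored).

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical, simplified stand-ins; these are not LLVM's real classes.
struct Operand {
  int Reg;             // register number
  bool IsDef;          // true if the operand is written
  bool IsDummyTiedSrc; // use tied to the def only to keep the other half live
};

struct Inst {
  std::vector<Operand> Ops;
  bool IsMemLoad; // result arrives asynchronously, so later reads must wait
};

// Count the waits a toy pass would insert. Only read-after-write hazards
// are modeled; one wait is assumed to drain all outstanding loads.
int countWaits(const std::vector<Inst> &Prog, bool SkipDummyTied) {
  std::set<int> Pending; // registers with an outstanding load
  int Waits = 0;
  for (const Inst &I : Prog) {
    bool NeedWait = false;
    for (const Operand &Op : I.Ops) {
      if (Op.IsDef)
        continue; // write-after-write is not modeled in this sketch
      if (SkipDummyTied && Op.IsDummyTiedSrc)
        continue; // the register is not really read, so no dependency
      if (Pending.count(Op.Reg))
        NeedWait = true;
    }
    if (NeedWait) {
      ++Waits;
      Pending.clear();
    }
    if (I.IsMemLoad)
      for (const Operand &Op : I.Ops)
        if (Op.IsDef)
          Pending.insert(Op.Reg);
  }
  return Waits;
}

// Two D16-style loads fill the low and high halves of register 0,
// then an ALU instruction reads register 0 and writes register 1.
std::vector<Inst> d16Example() {
  Inst LoadLo{{{0, true, false}, {0, false, true}}, true};
  Inst LoadHi{{{0, true, false}, {0, false, true}}, true};
  Inst Add{{{1, true, false}, {0, false, false}}, false};
  return {LoadLo, LoadHi, Add};
}
```

In this model, treating the dummy tied source as a real read forces a
wait between the two loads in addition to the one before the ALU
instruction, while skipping it leaves only the wait before the real
read.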
Compare: https://github.com/llvm/llvm-project/compare/c0b475bd5ec9...9119d9bfcef4