[all-commits] [llvm/llvm-project] 33e6b4: [SelectionDAG] Fix and improve TargetLowering::Sim...
Björn Pettersson via All-commits
all-commits at lists.llvm.org
Fri Apr 12 07:54:47 PDT 2024
Branch: refs/heads/main
Home: https://github.com/llvm/llvm-project
Commit: 33e6b488beda80dddb68016a132ee1f0a3c04aa1
https://github.com/llvm/llvm-project/commit/33e6b488beda80dddb68016a132ee1f0a3c04aa1
Author: Björn Pettersson <bjorn.a.pettersson at ericsson.com>
Date: 2024-04-12 (Fri, 12 Apr 2024)
Changed paths:
M llvm/include/llvm/CodeGen/TargetLowering.h
M llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp
M llvm/test/CodeGen/ARM/simplifysetcc_narrow_load.ll
M llvm/test/CodeGen/PowerPC/simplifysetcc_narrow_load.ll
Log Message:
-----------
[SelectionDAG] Fix and improve TargetLowering::SimplifySetCC (#87646)
The load narrowing part of TargetLowering::SimplifySetCC is updated as
follows:
1) The offset calculation (for big endian) did not work properly for
   non-byte-sized types. This is basically solved by an early exit if
   the memory type isn't byte-sized, but the code is also corrected to
   use the store size when calculating the offset (see the offset
   sketch after this list).
2) To still allow some optimizations for non-byte-sized types, the
   TargetLowering::isPaddedAtMostSignificantBitsWhenStored hook is
   added. By default it assumes that scalar integer types are padded
   starting at the most significant bits, if the type needs padding
   when being stored to memory.
3) Allow the optimization when isPaddedAtMostSignificantBitsWhenStored
   is true, as that hook makes it possible for TargetLowering to know
   how the non-byte-sized value is laid out in memory.
4) Update the algorithm to always search for a narrowed load with a
   power-of-2 byte-sized type. In the past the algorithm started with
   the width of the original load and then divided it by two for each
   iteration. But for a type such as i48 that would just end up trying
   to narrow the load into an i24 or i12 load, and then we would fail
   sooner or later due to not finding a newVT that fulfilled
   newVT.isRound().
   With this new approach we can narrow the i48 load into an i8, i16
   or i32 load. By checking whether such a load is allowed (e.g.
   alignment-wise) at any multiple-of-8 offset, we find more
   opportunities for the optimization to trigger. So even for a
   byte-sized type such as i32 we may now end up narrowing the load
   into loading the 16 bits starting at offset 8 (if that is allowed
   by the target). The old algorithm did not even consider that case.
   A sketch of this search also follows the list.
5) Also start using getObjectPtrOffset instead of getMemBasePlusOffset
when creating the new ptr. This way we get "nsw" on the add.
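
As an illustration of items 1)-3), below is a minimal, self-contained
C++ sketch (not the actual LLVM code; narrowedLoadByteOffset and its
parameters are hypothetical names used only here) of how the byte
offset for a narrowed load can be computed when a non-byte-sized value
is padded at its most significant bits. The point is that the big
endian case has to count from the store size (the bit width rounded up
to whole bytes), not from the bit width divided by eight.

  #include <cassert>

  // Byte offset in memory for a narrowed load of NarrowBytes bytes
  // that should read the value bits starting at bit ValueByteOff * 8,
  // given a memory type of MemBits bits that is padded at its most
  // significant bits when stored.
  static unsigned narrowedLoadByteOffset(unsigned MemBits,
                                         unsigned ValueByteOff,
                                         unsigned NarrowBytes,
                                         bool BigEndian) {
    // Store size is the bit width rounded up to whole bytes; for a
    // non-byte-sized type such as i20 this is 3, not 20/8 == 2.
    unsigned StoreBytes = (MemBits + 7) / 8;
    assert(ValueByteOff + NarrowBytes <= StoreBytes &&
           "window out of range");
    if (!BigEndian)
      return ValueByteOff; // little endian: low bits at low addresses
    // Big endian: the low value bits live at the highest addresses,
    // so count the offset from the other end of the in-memory object.
    return StoreBytes - ValueByteOff - NarrowBytes;
  }

For example, reading the lowest byte of an i20 value (ValueByteOff = 0,
NarrowBytes = 1) gives offset 0 on little endian and offset 2 on big
endian, whereas using 20/8 == 2 in place of the store size would give
the wrong big-endian offset 1.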
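
And as a sketch of the search in item 4): the following self-contained
function (again with hypothetical names and a made-up legality check
standing in for the target queries) tries each power-of-2 byte-sized
width and each multiple-of-8 offset, and returns the first narrowed
load that covers all of the bits the setcc actually tests.

  #include <optional>

  struct NarrowedLoad {
    unsigned WidthBits;  // 8, 16, 32, ... (power-of-2, byte-sized)
    unsigned OffsetBits; // multiple of 8, from bit 0 of the value
  };

  // Stand-in for the target's legality/alignment check.
  static bool isNarrowLoadAllowed(unsigned WidthBits,
                                  unsigned OffsetBits) {
    (void)OffsetBits;
    return WidthBits <= 32; // e.g. target has 8/16/32-bit loads
  }

  // Bits [LoBit, HiBit] of an OrigWidth-bit loaded value are the only
  // bits tested by the setcc; find the smallest legal power-of-2
  // byte-sized load at a multiple-of-8 offset that contains them all.
  static std::optional<NarrowedLoad>
  findNarrowedLoad(unsigned OrigWidth, unsigned LoBit, unsigned HiBit) {
    for (unsigned Width = 8; Width < OrigWidth; Width *= 2) {
      if (HiBit - LoBit + 1 > Width)
        continue; // the tested bits do not fit in this width
      for (unsigned Offset = 0; Offset + Width <= OrigWidth; Offset += 8) {
        if (LoBit >= Offset && HiBit < Offset + Width &&
            isNarrowLoadAllowed(Width, Offset))
          return NarrowedLoad{Width, Offset};
      }
    }
    return std::nullopt; // keep the original load
  }

With this approach an i48 load can be narrowed into an i8, i16 or i32
load, and for an i32 load where only bits 8..23 are compared the
search returns a 16-bit load at bit offset 8, which is exactly the
case the old halving-based algorithm never considered.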