[llvm-dev] Lowering llvm.memset for ARM target

Friedman, Eli via llvm-dev llvm-dev at lists.llvm.org
Wed Oct 25 11:48:51 PDT 2017


On 10/25/2017 8:22 AM, Evgeny Astigeevich via llvm-dev wrote:
> Hi Bharathi,
>
> I did some debugging. The current problem is that the same threshold values are used for SIMD and non-SIMD memory instructions.
> For the test you provided, when the SIMD extension is disabled, the current implementation of llvm.memset lowering finds that 9 store instructions are required. As 9 > 8, llvm.memset is lowered to a call.
> When SIMD is enabled, it finds that 3 stores (two vst1.32 + one str) are enough. As 3 < 8, llvm.memset is lowered to a sequence of stores.
>
> So before changing the threshold values we need to figure out:
>
> 1. Do we need separate thresholds for SIMD and non-SIMD memory instructions? For example, 8 for SIMD and 10-16 for non-SIMD. Some benchmarking is needed to find proper values.
> 2. If we keep a single value, by how much should it be increased? This might affect the performance of applications that use SIMD, so benchmarking is needed again.
> 3. If STRD is used, only 5 instructions are needed and llvm.memset is lowered as expected. I don’t know why the variant with STRD is not considered; maybe to avoid register pressure.

We probably do want to lower memset to strd/stm when we don't have 
NEON.  Patch welcome. :)  (Note that in Thumb2, we often form strd 
anyway in ARMLoadStoreOptimizer, but we don't do it reliably, and we 
probably want to do something similar in ARM, and maybe Thumb1.)
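
The store counts quoted above can be reproduced with a small model. This is only a sketch, not the code LLVM uses: countStores and lowersInline are hypothetical helpers, the 36-byte size is an assumption chosen because it yields the quoted 9, 3 and 5 stores, and treating "equal to the threshold" as still inline is also an assumption.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper, not LLVM's implementation: greedily count the stores
// needed to set `bytes` bytes when the widest available store writes
// `widest` bytes (a power of two). Any tail falls back to narrower stores.
constexpr unsigned countStores(std::size_t bytes, std::size_t widest) {
  return bytes == 0 ? 0u
       : widest > bytes ? countStores(bytes, widest / 2)
       : static_cast<unsigned>(bytes / widest) +
             countStores(bytes % widest, widest / 2);
}

// Hypothetical decision rule: expand inline when the store count does not
// exceed the threshold, otherwise fall back to a library call.
constexpr bool lowersInline(std::size_t bytes, std::size_t widest,
                            unsigned limit) {
  return countStores(bytes, widest) <= limit;
}

// Checks the model against the numbers quoted in the thread, using the
// assumed 36-byte memset and the threshold of 8.
inline void checkQuotedCounts() {
  assert(countStores(36, 4) == 9);   // str only: 9 word stores
  assert(countStores(36, 16) == 3);  // two vst1.32 + one str
  assert(countStores(36, 8) == 5);   // four strd + one str
  assert(!lowersInline(36, 4, 8));   // 9 > 8: becomes a call
  assert(lowersInline(36, 16, 8));   // 3 < 8: becomes stores
}
```

In this model, counting strd as an 8-byte store, or giving the non-SIMD case its own threshold of 10 or more, would both keep the assumed 36-byte case inline.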

-Eli

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project


