[PATCH] Allow code generation of ARM usat/ssat instructions
weimingz at codeaurora.org
Tue Mar 17 10:47:27 PDT 2015
It's a good idea to have a generic sat intrinsic. It would capture the exact semantics of such operations and allow other transformations, even on targets that have no hardware SAT instructions.
For example, on AArch64, it would allow the vectorizer to find opportunities for UQXTN/SQXTN.
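To make the semantics concrete, here is a minimal C sketch of a per-lane saturating narrow and the kind of loop a vectorizer could map to UQXTN/SQXTN once the clamp is recognized. The helper names (sat_narrow_u16, sat_narrow_s16, narrow_u16_to_u8) are hypothetical and only for illustration; they are not the proposed intrinsic or any existing LLVM API.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: unsigned saturating narrow of one lane, 16 -> 8 bits.
 * Values above 255 clamp to 255. */
static uint8_t sat_narrow_u16(uint16_t x) {
    return x > UINT8_MAX ? UINT8_MAX : (uint8_t)x;
}

/* Hypothetical helper: signed saturating narrow of one lane, 16 -> 8 bits.
 * Values outside [-128, 127] clamp to the nearest bound. */
static int8_t sat_narrow_s16(int16_t x) {
    if (x > INT8_MAX) return INT8_MAX;
    if (x < INT8_MIN) return INT8_MIN;
    return (int8_t)x;
}

/* A loop of this shape is the vectorization candidate: each iteration
 * is an independent saturating narrow of one element. */
void narrow_u16_to_u8(const uint16_t *in, uint8_t *out, size_t n) {
    for (size_t i = 0; i < n; ++i)
        out[i] = sat_narrow_u16(in[i]);
}
```

Whether a given compiler actually selects a saturating-narrow instruction for this loop depends on the matching discussed in this thread.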
To catch as many clamp patterns as possible, matching in the DAG is too late in some cases, because the clamp is often combined with other operations. For example, in (clamp(x, 255) >> 2) << 5, EarlyCSE will hoist the shift, which prevents SimplifyCFG from converting the outer "if" to a "SELECT". Besides, other passes will convert it to (x > 255 ? 2016, ...), which makes pattern matching very difficult.
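The arithmetic behind that constant can be sketched in C. The two functions below are hypothetical illustrations, not compiler output: the first is the source-level clamp followed by the shifts, the second is roughly the shape left once the shifts have been applied to the constant arm, since (255 >> 2) << 5 = 63 * 32 = 2016. The exact IR depends on pass ordering.

```c
#include <stdint.h>

/* Source-level form: clamp to 255, then shift. */
uint32_t clamp_then_shift(uint32_t x) {
    if (x > 255)
        x = 255;
    return (x >> 2) << 5;
}

/* Equivalent form with the shifts folded into the clamped arm.
 * The clamp bound 255 is still in the compare, but the value
 * selected is now 2016, so a matcher looking for min(x, 255)
 * no longer sees it directly. */
uint32_t folded_form(uint32_t x) {
    return x > 255 ? 2016u : (x >> 2) << 5;
}
```

Both functions return the same value for every input, which is why the rewrite is legal and also why the original clamp is hard to recover afterwards.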
From: Ahmed Bougacha [mailto:ahmed.bougacha at gmail.com]
Sent: Monday, March 16, 2015 8:06 PM
To: weimingz at codeaurora.org; apazos at codeaurora.org; mcrosier at codeaurora.org
Cc: ahmed.bougacha at gmail.com; renato.golin at linaro.org; amara.emerson at arm.com; llvm-commits at cs.uiuc.edu
Subject: Re: [PATCH] Allow code generation of ARM usat/ssat instructions
I should note I'm sitting on patches to add generic saturation support; originally with dedicated intrinsics (there's a few-months-old RFC if you want to have a look), but now with SelectionDAG-level matching, with CodeGenPrepare's help. With a few other tweaks, this enables:
- other target support (I gather you generate the ARM intrinsics, but that's unavailable elsewhere?),
- vectorization (on say AArch64 or X86, there are only vector saturation instructions),
- as well as DAG combines (stuff like X86 PACKSS, or add-with-saturation instructions, etc.).
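For readers unfamiliar with the add-with-saturation idiom, a minimal C sketch follows. The function names are hypothetical, and this is only the source-level pattern, not the matching code from the patches mentioned above: the sum is computed in a wider type and then clamped back into the narrow range.

```c
#include <stdint.h>

/* Unsigned saturating add on 8-bit values: the sum is formed in a
 * wider type, then clamped to 255 instead of wrapping. */
uint8_t uadd_sat_u8(uint8_t a, uint8_t b) {
    unsigned sum = (unsigned)a + (unsigned)b;
    return sum > UINT8_MAX ? UINT8_MAX : (uint8_t)sum;
}

/* Signed saturating add on 16-bit values: the 32-bit sum cannot
 * overflow, so it can be clamped to [INT16_MIN, INT16_MAX]. */
int16_t sadd_sat_s16(int16_t a, int16_t b) {
    int32_t sum = (int32_t)a + (int32_t)b;
    if (sum > INT16_MAX) return INT16_MAX;
    if (sum < INT16_MIN) return INT16_MIN;
    return (int16_t)sum;
}
```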
Anyway: if you're interested, I can rebase and continue, and put the patch set up for review. In the meantime I'll try to have a closer look at this one.
Thanks for working on this!