[llvm] [SelectionDAG] Fix and improve TargetLowering::SimplifySetCC (PR #87646)

Fri Apr 12 08:34:29 PDT 2024

bjope wrote:

> > > For the record, the arm regression is code correct codegen, right?
> > 
> > 
> > Could you be more specific about which diff you are talking about?
> 
> The `lsls r0, r0, #8` one mentioned previously.

I described why that happens here: https://github.com/llvm/llvm-project/pull/87646#discussion_r1562338677

The test case shows that `-mtriple=arm` and `-mtriple=armv7` differ in whether they consider misaligned accesses to be "Fast". The improved TargetLowering::SimplifySetCC now takes into account that misaligned accesses are considered slow for `-mtriple=arm` and avoids introducing the narrowed load for that triple. Later, however, a different DAG combine triggers that does not check whether the misaligned access is considered "Fast", and that combine introduces the misaligned narrowed load anyway. It also introduces an SHL operation that isn't really needed.
I don't know enough about ARM to know if `lsls` is much worse than the `cmp`. But you say that it is a regression, so in what way? Is it a cycle regression or a code size regression? Maybe the load narrowing, which results in a possibly misaligned load, is even worse than using `lsls` instead of `cmp`?
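
To make the decision described above concrete, here is a minimal standalone sketch of the kind of check involved. It does not use the LLVM API: `TargetModel`, `alignAtOffset` and `shouldNarrowLoad` are hypothetical names invented for illustration, and the offsets and sizes are made-up examples rather than values from the test case. The only thing taken from the discussion is the idea that a narrowed load which may be misaligned should only be formed when the target reports misaligned accesses as fast.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-in for a target's answer to
// "are misaligned accesses fast?". This is NOT LLVM's TargetLowering.
struct TargetModel {
  bool MisalignedAccessIsFast;
};

// Alignment (in bytes) that can be guaranteed for an access at ByteOffset
// from a base pointer known to be aligned to BaseAlign (a power of two).
inline uint64_t alignAtOffset(uint64_t BaseAlign, uint64_t ByteOffset) {
  uint64_t A = BaseAlign;
  // Halve the alignment until it divides the offset.
  while (A > 1 && ByteOffset % A != 0)
    A /= 2;
  return A;
}

// Decide whether replacing a wide load by a narrower load of NarrowBytes
// at ByteOffset is acceptable under this simplified model.
inline bool shouldNarrowLoad(const TargetModel &T, uint64_t BaseAlign,
                             uint64_t ByteOffset, uint64_t NarrowBytes) {
  bool NaturallyAligned = alignAtOffset(BaseAlign, ByteOffset) >= NarrowBytes;
  // A possibly misaligned narrowed load is only accepted when the target
  // says such accesses are fast.
  return NaturallyAligned || T.MisalignedAccessIsFast;
}
```

With `TargetModel{false}` standing in for a target where misaligned accesses are slow, a 2-byte load at offset 1 from a 4-byte-aligned base is rejected, while `TargetModel{true}` accepts it. The point in the comment above is that a combine which skips this kind of query will form the narrowed load regardless of the target's answer.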

https://github.com/llvm/llvm-project/pull/87646