[llvm-dev] Load combine pass

Wed Sep 28 08:22:03 PDT 2016

Hi,

I'm trying to optimize a pattern like this into a single i16 load:
  %1 = bitcast i16* %pData to i8*
  %2 = load i8, i8* %1, align 1
  %3 = zext i8 %2 to i16
  %4 = shl nuw i16 %3, 8
  %5 = getelementptr inbounds i8, i8* %1, i16 1
  %6 = load i8, i8* %5, align 1
  %7 = zext i8 %6 to i16
  %8 = shl nuw nsw i16 %7, 0
  %9 = or i16 %8, %4

I came across load combine pass which is motivated by virtualliy the same pattern. Load combine optimizes the pattern by combining adjacent loads into one load and lets the rest of the optimizer cleanup the rest. From what I see on the initial review for load combine (https://reviews.llvm.org/D3580) it was not enabled by default because it caused some performance regressions. It's not very surprising, I see how this type of widening can obscure some facts for the rest of the optimizer.

I can't find any backstory for this pass, why was it chosen to optimize the pattern in question in this way? What is the current status of this pass?

I have an alternative implementation for it locally. I implemented an instcombine rule similar to recognise bswap/bitreverse idiom. It relies on collectBitParts (Local.cpp) to determine the origin of the bits in a given or value. If all the bits are happen to be loaded from adjacent locations it replaces the or with a single load or a load plus bswap.

If the alternative approach sounds reasonable I'll post my patches for review.

Artur
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.llvm.org/pipermail/llvm-dev/attachments/20160928/e2a12676/attachment.html>