[PATCH] D136396: [X86] Enable reassociation for ADD instructions
Simon Pilgrim via Phabricator via llvm-commits
llvm-commits at lists.llvm.org
Sat Oct 22 01:06:31 PDT 2022
RKSimon added inline comments.
================
Comment at: llvm/test/CodeGen/X86/reassociate-add.ll:2
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mtriple=x86_64-unknown-unknown -mcpu=x86-64 < %s | FileCheck %s
+
----------------
Carrot wrote:
> RKSimon wrote:
> > Drop -mcpu=x86-64
> >
> > Pre-commit these tests with current codegen to trunk and rebase to show the patch diffs.
> Test case has been sent out as https://reviews.llvm.org/D136501.
>
> -mcpu=x86-64 is required because MachineCombiner performs the reassociation either when the new sequence reduces latency or when the function doSubstitute returns true. In its implementation, if -mcpu is not specified, the target does not have a valid schedule model, so doSubstitute returns true and the reassociation is always performed.
>
> ```
> bool MachineCombiner::doSubstitute(unsigned NewSize, unsigned OldSize,
>                                    bool OptForSize) {
>   if (OptForSize && (NewSize < OldSize))
>     return true;
>   if (!TSchedModel.hasInstrSchedModelOrItineraries())
>     return true;
>   return false;
> }
> ```
OK - that makes sense!
================
Comment at: llvm/test/CodeGen/X86/reassociate-add.ll:4
+
+; This file checks the reassociation of ADD instruction.
+; The two ADD instructions add v0,v1,t2 together. t2 has a long dependence
----------------
Carrot wrote:
> craig.topper wrote:
> > Carrot wrote:
> > > spatel wrote:
> > > > RKSimon wrote:
> > > > > pengfei wrote:
> > > > > > No idea if we intended to not do reassociation in ADD instructions.
> > > > > > Is there a problem if unexpected overflow/underflow is generated during reassociation?
> > > > > Not that I can think of - EFLAGS will be the same and we already handle the $dst=$src0 constraint
> > > > I was working on this a long time ago (~2015), so it's hard to remember exactly, but I don't think there was a fundamental reason to exclude integer ADD.
> > > >
> > > > It just seemed like it did not have much potential gain with the limited register set and could interfere with other transforms like LEA formation.
> > > >
> > > > If there's evidence that this improves something (and doesn't cause regressions), then it should be ok.
> > > EFLAGS may be different. Suppose we are adding three bytes, 250 + 10 + 10.
> > >
> > > - If we add them in the order (250 + 10) + 10, the first ADD generates carry/overflow flags, the second ADD doesn't generate carry/overflow.
> > >
> > > - If we add them in the order (10 + 10) + 250, the first ADD doesn't generate carry/overflow flags, the second ADD generates these flags.
> > >
> > > So we need to check if the definition of EFLAGS is dead.
> > >
> > > Maybe this is the reason @spatel didn't add ADD instructions in 2015.
> > >
> > > Thanks @pengfei for the reminder.
> > >
> > Scalar MUL would have the same EFLAGS issue. So hopefully we already solved it?
> Indeed. So IMULrr should be handled similarly to ADD. But I can't find any code for it :(
Sorry, bad description - EFLAGS will be handled the same way as for the existing ops (we check that they are ignored...)
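
For reference, the byte example quoted above (250 + 10 + 10) can be checked with a small stand-alone model. This is only a sketch: `AddResult` and `addByte` are hypothetical helpers that mimic the CF/OF behaviour of an 8-bit x86 ADD, they are not LLVM code. It shows that both association orders produce the same sum while the flags left by the final ADD differ.

```cpp
#include <cstdint>

// Result of a modelled 8-bit ADD: the truncated sum plus two of the flags.
struct AddResult {
  std::uint8_t value;
  bool carry;     // CF: the unsigned sum did not fit in 8 bits
  bool overflow;  // OF: the signed sum did not fit in 8 bits
};

// Hypothetical helper (an assumption for illustration, not an LLVM API).
constexpr AddResult addByte(std::uint8_t a, std::uint8_t b) {
  return AddResult{
      static_cast<std::uint8_t>((a + b) & 0xFF),
      (a + b) > 0xFF,
      // Signed overflow: both operands share a sign bit that the result lacks.
      ((~(a ^ b)) & (a ^ (a + b)) & 0x80) != 0};
}

// Order 1: (250 + 10) + 10. The first ADD carries, the final ADD does not.
constexpr AddResult first1 = addByte(250, 10);
constexpr AddResult last1 = addByte(first1.value, 10);

// Order 2: (10 + 10) + 250. The first ADD does not carry, the final ADD does.
constexpr AddResult first2 = addByte(10, 10);
constexpr AddResult last2 = addByte(first2.value, 250);

static_assert(last1.value == last2.value, "both orders give the same sum");
static_assert(last1.carry != last2.carry, "the final CF depends on the order");
```

In this model only the carry flag differs for these particular inputs (the signed overflow flag stays clear in both orders), but that is already enough to make the reassociation unsafe whenever a later instruction reads the EFLAGS defined by the final ADD.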
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D136396/new/
https://reviews.llvm.org/D136396