[PATCH] D148558: [CodeGen] Improve handling -Ofast generated code by ComplexDeinterleaving pass

Thu May 11 04:14:17 PDT 2023

igor.kirillov added inline comments.

================
Comment at: llvm/lib/Target/AArch64/AArch64ISelLowering.cpp:24710
+    if (Rotation == ComplexDeinterleavingRotation::Rotation_0) {
+      return B.CreateFAdd(InputA, InputB);
+    } else if (Rotation == ComplexDeinterleavingRotation::Rotation_180) {
----------------
NickGuy wrote:
> igor.kirillov wrote:
> > NickGuy wrote:
> > > Do we want to be discarding the fastmath flags here? 
> > Yeah, you're right. Even though the code works fine without the fastmath flags and they might not be important at this point in the optimization pipeline, it's probably better to keep them. But it's not that simple. First of all, we don't know which instruction to get the flags from when dealing with -Ofast code. My idea is to use the **Collect **lambda function from identifyReassocNodes and //AND// all the flags together and store them in the **ComplexDeinterleavingNode**.
> > 
> > The second issue is that currently, replaceSymmetricNode only uses flags from the **Real **Instruction. But what if the flags for the **Imag ** Instruction are different?
> > 
> > The third thing to consider is whether we should add flags to the architecture-specific intrinsics generated by **createComplexDeinterleavingIR**.
> > 
> > I think the best way to handle all this is to add a Flags member to ComplexDeinterleavingNode and then apply those flags when using replaceNode.
> > 
> > What do you think?
> > My idea is to use the Collect lambda function from identifyReassocNodes and AND all the flags together and store them in the ComplexDeinterleavingNode.
> ANDing them feels like the most sensible to me, either that or ensuring they're all equal during identification.
> 
> > I think the best way to handle all this is to add a Flags member to ComplexDeinterleavingNode and then apply those flags when using replaceNode.
> That makes sense to me, but I'd be wary of how to apply the flags themselves. We'd either need to change the target hook to provide the flags, or have some way for a target to specify which instructions get which flags.
> If we do decide to apply the flags on the common side, there's also the problem of applying them to all relevant instructions, we only get the one returned from `createComplexDeinterleavingIR`, but it could create an entire subgraph of instructions if the target deems it necessary (try the existing implementation with an unrealistic <32 x double> for an example of this).
> 
> > The second issue is that currently, replaceSymmetricNode only uses flags from the Real Instruction. But what if the flags for the Imag Instruction are different?
> That's a fair point. I'd err on the side of caution, and only perform the replacement if the flags are consistent on both the Real and Imag sides.
> 
> > The third thing to consider is whether we should add flags to the architecture-specific intrinsics generated by createComplexDeinterleavingIR.
> That is something I don't feel qualified to answer :D 
> My instinct says that it wouldn't be a problem, as nothing would be checking the flags on intrinsics, but can't say for sure.
Added flag check to **identifySymmetricOperation**. And added two field to **ComplexDeinterleavingCompositeNode** Opcode and Flags that are used only by **replaceSymmetricNode**. Also, now I generate Symmetric nodes rather then CAdd with 0 and 180 degree rotation.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D148558/new/

https://reviews.llvm.org/D148558