[PATCH] D119720: [ARM] Pass for Cortex-A57 and Cortex-A72 Fused AES Erratum

Thu Mar 31 02:41:52 PDT 2022

dmgreen added inline comments.

================
Comment at: llvm/lib/Target/ARM/ARMFixCortexA57AES1742098Pass.cpp:200-207
+  case ARM::VMOVv2f32:
+  case ARM::VMOVv4f32:
+  case ARM::VMOVv2i32:
+  case ARM::VMOVv4i32:
+  case ARM::VMOVv4i16:
+  case ARM::VMOVv8i16:
+  case ARM::VMOVv8i8:
----------------
Are these vmov of an immediate? Are they not safe?

I was expecting it to be the lanes sets (VSETLNi8) and other scalar instructions that were unsafe.

================
Comment at: llvm/lib/Target/ARM/ARMFixCortexA57AES1742098Pass.cpp:306
+          // fixup at the start of the function.
+          LLVM_DEBUG(dbgs()
+                     << "Fixup Planned: Live-In (with safe defining instrs): "
----------------
lenary wrote:
> dmgreen wrote:
> > Can you explain more about the IsLiveIn && UnsafeCount==0 case. Am I understanding that correctly that it would be:
> > ```
> > function(q0, ...)
> >   lotsofcode...
> >   q0 = load
> >   aes q0
> > ```
> > Is there a better way to detect that the live-in doesn't matter in cases like that?
> I don't believe there is, and this comes down to issues with the `RDA.getGlobalReachingDefs` implementation, which I want to fix/enhance, but in a follow-up patch.
> 
> To start with, this is actually not a problem, as the pass is intended to be conservative, and we know the clobbers are a no-op at the architectural level (we insert them for their micro-architectural effects). So code will still do the right thing, but maybe with a little too much overhead in the case you showed.
> 
> However, this is necessary in some other cases, such as:
> 
> ```
> function(q0)
>    code
>    conditional branch to L2
> L1:
>    q0 = safe_op(…)
>    branch to L3
> L2:
>    code without update to q0
> L3:
>    aes q0
> ```
> 
> In this case, `AllDefs` is a set containing one single defining instruction for Q0, because there is only one within the function (which is all that `RDA.getGlobalReachingDefs` can report instructions for).
> 
> But in my example, we *need* to protect q0 on the other paths, because the safe definitions of q0, when considered as a set, do not entirely dominate the AES use of q0 (this is slightly stretching the conventional definition of dominance, but think of this as "there exists a path from entry to the aes, which does not contain any of the safe instructions". Sadly, MachineDominance doesn't allow us to make this kind of query either!).
> 
> In this case though, it is safe to insert the protection at function entry, because that will (by definition) dominate all the AES uses, and the protection doesn't need to be dominated by the safe definitions, as we know they're safe.
> 
> I intend to follow-up this initial patch with an enhancement to ReachingDefAnalysis which will also provide the information that you have a set of defs inside the function, and also you're live-in, as this is required info for any conservative pass using the ReachingDefAnalysis. I felt, however, that given the pass is safe as-is, it was good to proceed without this enhancement.
> 
> 
> 
> 
OK sounds good.

================
Comment at: llvm/lib/Target/ARM/ARMTargetMachine.cpp:585

+  addPass(createARMFixCortexA57AES1742098Pass());
+
----------------
lenary wrote:
> dmgreen wrote:
> > Passes can't insert new instructions (or move things further apart) after ConstantIslandPass. The branches/constant pools it has created may go out of range of the instructions that use them. Would it be OK to move it before that?
> TIL. I'll add a comment about the constant island pass as well.
> 
> Should I also look at the restrictions on the Branch Targets pass? I can imagine we also don't want to separate instructions once we've calculated their targets locally?
Yeah - It sounds like the BTI would need to remain as the first instruction in the block.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D119720/new/

https://reviews.llvm.org/D119720