[PATCH] D54896: Introduce control flow speculation tracking pass for AArch64.

Zola Bridges via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Fri Jan 11 16:04:29 PST 2019


zbrid added a comment.

Hi Kristof,
Sorry about this late review. I wanted to make sure that I understood your implementation of SLH for ARM, so I took a look and made a few comments with some questions on this first patch. Let me know your thoughts. I'm planning to take a look at the other commit you made a few days ago related to SLH too. Hopefully I didn't make too many comments that were already addressed in the follow up commit.



================
Comment at: llvm/trunk/lib/Target/AArch64/AArch64SpeculationHardening.cpp:100
+
+#define AARCH64_SPECULATION_HARDENING_NAME "AArch64 speculation hardening pass"
+
----------------
Why is this a macro instead of a const variable? 


================
Comment at: llvm/trunk/lib/Target/AArch64/AArch64SpeculationHardening.cpp:164-166
+  // If both the true and the false condition jump to the same basic block,
+  // there isn't need for any protection - whether the branch is speculated
+  // correctly or not, we end up executing the architecturally correct code.
----------------
Just curious in general about this, are there cases where this is expected to happen in practice? It seems useless to have a branch that goes to the same place for both true and false.


================
Comment at: llvm/trunk/lib/Target/AArch64/AArch64SpeculationHardening.cpp:172
+  // translate analyzeBranchCondCode to CondCode.
+  assert(analyzeBranchCondCode.size() == 1 && "unknown Cond array format");
+  CondCode = AArch64CC::CondCode(analyzeBranchCondCode[0].getImm());
----------------
How does this assert work? I don't understand how you're ANDing with the string 

```
"unknown Cond array format"
```




================
Comment at: llvm/trunk/lib/Target/AArch64/AArch64SpeculationHardening.cpp:191
+        .addUse(AArch64::XZR)
+        .addImm(CondCode);
+    SplitEdgeBB.addLiveIn(AArch64::NZCV);
----------------
>From the comments at the beginning of the file, I expected a CSDB instruction right after this instruction. Is that not needed or added somewhere else (like the follow up commit that adds the masking)?


================
Comment at: llvm/trunk/lib/Target/AArch64/AArch64SpeculationHardening.cpp:192
+        .addImm(CondCode);
+    SplitEdgeBB.addLiveIn(AArch64::NZCV);
+  }
----------------
What is a live in and why does this one get added?


================
Comment at: llvm/trunk/lib/Target/AArch64/AArch64SpeculationHardening.cpp:222
+    if (MBB.instr_end() != MBB.instr_begin())
+      DL = (--MBB.instr_end())->getDebugLoc();
+
----------------
Is this getting the instruction before the branch instruction? Then that instruction is used as the debug location for the FBB and TBB we got in lines 214/215? Did I understand this correctly?


================
Comment at: llvm/trunk/lib/Target/AArch64/AArch64SpeculationHardening.cpp:233
+  // Perform correct code generation around function calls and before returns.
+  {
+    SmallVector<MachineInstr *, 4> ReturnInstructions;
----------------
Why is this code in a separate code block?


================
Comment at: llvm/trunk/lib/Target/AArch64/AArch64SpeculationHardening.cpp:262
+
+void AArch64SpeculationHardening::insertSPToRegTaintPropagation(
+    MachineBasicBlock *MBB, MachineBasicBlock::iterator MBBI) const {
----------------
Why doesn't this function return bool to indicate whether instructions were modified?


================
Comment at: llvm/trunk/lib/Target/AArch64/AArch64SpeculationHardening.cpp:269-270
+    // insert full control flow speculation barrier (DSB SYS + ISB)
+    BuildMI(*MBB, MBBI, DebugLoc(), TII->get(AArch64::DSB)).addImm(0xf);
+    BuildMI(*MBB, MBBI, DebugLoc(), TII->get(AArch64::ISB)).addImm(0xf);
+    return;
----------------
Why do these instructions have an immediate operand? It looks like you're passing the default for ISB, but it's not clear to me what that number means for DSB. I checked this manual, but it doesn't say what the immediate operand is used for, only that it's limited in the value it can have: 

  - http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0802a/a64_general_alpha.html
  - http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0802a/a64_general_alpha.html 





================
Comment at: llvm/trunk/lib/Target/AArch64/AArch64SpeculationHardening.cpp:274
+
+  // CMP   SP, #0   === SUBS   xzr, SP, #0
+  BuildMI(*MBB, MBBI, DebugLoc(), TII->get(AArch64::SUBSXri))
----------------
Is having the zero register in ARM as the destination register equivalent to discarding the output or is the zero register affected?

If it's the former, why use CMP instead of SUBS if this instruction discards the result of the instruction in the same way that CMP does?


================
Comment at: llvm/trunk/lib/Target/AArch64/AArch64SpeculationHardening.cpp:280
+      .addImm(0); // no shift
+  // CSETM x16, NE  === CSINV  x16, xzr, xzr, EQ
+  BuildMI(*MBB, MBBI, DebugLoc(), TII->get(AArch64::CSINVXr))
----------------
Why use CSINV instead of CSETM here?


================
Comment at: llvm/trunk/lib/Target/AArch64/AArch64SpeculationHardening.cpp:297
+
+  // mov   Xtmp, SP  === ADD  Xtmp, SP, #0
+  BuildMI(*MBB, MBBI, DebugLoc(), TII->get(AArch64::ADDXri))
----------------
Why use ADD instead of MOV?


================
Comment at: llvm/trunk/lib/Target/AArch64/AArch64SpeculationHardening.cpp:298-314
+  BuildMI(*MBB, MBBI, DebugLoc(), TII->get(AArch64::ADDXri))
+      .addDef(TmpReg)
+      .addUse(AArch64::SP)
+      .addImm(0)
+      .addImm(0); // no shift
+  // and   Xtmp, Xtmp, TaintReg === AND Xtmp, Xtmp, TaintReg, #0
+  BuildMI(*MBB, MBBI, DebugLoc(), TII->get(AArch64::ANDXrs))
----------------
Why not the following instead of the three above instructions?
```
AND SP, SP, TaintReg
```


================
Comment at: llvm/trunk/lib/Target/AArch64/AArch64SpeculationHardening.cpp:338-339
+  MisspeculatingTaintReg = AArch64::X16;
+  TII = MF.getSubtarget().getInstrInfo();
+  TRI = MF.getSubtarget().getRegisterInfo();
+  bool Modified = false;
----------------
Why do you have to get this data every time the function is called? Could these be static since we wouldn't expect them to be different between the functions in the same compilation? Or are there cases where this would be different between two functions?


Repository:
  rL LLVM

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D54896/new/

https://reviews.llvm.org/D54896





More information about the llvm-commits mailing list