[PATCH] D55929: Initial AArch64 SLH implementation.
    Kristof Beyls via Phabricator via llvm-commits 
    llvm-commits at lists.llvm.org
       
    Wed Jan  9 01:41:25 PST 2019
    
    
  
kristof.beyls marked 3 inline comments as done.
kristof.beyls added inline comments.
================
Comment at: lib/Target/AArch64/AArch64SpeculationHardening.cpp:604
 
-  UseControlFlowSpeculationBarrier = functionUsesHardeningRegister(MF);
+  // Step 1: Enable automatic insertion of SpeculationSafeValue.
+  if (HardenLoads) {
----------------
kristof.beyls wrote:
> olista01 wrote:
> > Why do you need to insert pseudo-instructions here, only to replace them with ANDs later on? Could this loop be moved to after the control-flow tracking has been inserted, and create the ANDs directly?
> This design inserting pseudo-instructions is based on an earlier design I did (which did not get committed) to implement the intrinsics based approach that is also implemented in gcc.
> By keeping this design, it will be easier in the future to also support the intrinsics based approach (as documented from a user point-of-view at https://lwn.net/Articles/759423/).
> The idea being that the user-specified intrinsics will be lowered to the pseudo-instruction, and that SLH basically inserts the pseudo-instructions. These pseudo-instructions then get lowered in the same way no matter if they came from a user-written intrinsic or from an automatically inserted pseudo-instruction by SLH.
> Granted, maybe the only non-trivial part of lowering the pseudo-instruction is the algorithm to optimize/reduce the number of CSDB instructions that are inserted.
> 
> Obviously, we could not use the pseudo-instructions for now and only introduce them if we introduce the intrinsics-based approach too.
> However, I'm not sure in how far that will complicate the optimization reducing the number of CSDBs inserted.
> Let me look into how easy or complicated the alternative design without pseudo-instructions would be.
I've know looked into a design where the pass iterates over each basic block just once, inserting masking operations and csdb instructions in a single go.
It turns out that the logic to implement the optimized csdb insertion becomes much more fiddly to implement correctly (I'm not sure I've managed to implement it correctly when trying, even after a few iterations of fixing failing regression tests). An extra complexity with the insertion of csdb instructions is that there isn't an easy way to test that it has been done correctly. When csdb instructions aren't inserted in the places it should, the programs will still execute and produce results as expected.
For that reason, I'd prefer to keep the design where we iterate each basic block twice: once for deciding which registers must be masked at which program locations and once to insert CSDBs. That way, we can implement the CSDB instruction insertion separately from the other transformations in this pass and it is less likely there will be bugs in that implementation.
As to the use of pseudo instructions: we could not use pseudo instructions and encode the same information as the pseudo instructions in side data structures in the pass to convey information between the multiple iterations across a basic block. But that again might result in a more complex implementation. And furthermore, when we implement support for user-facing intrinsics, we'll still need those pseudos anyway.
Therefore, I think the current design overall is a better trade-off.
CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D55929/new/
https://reviews.llvm.org/D55929
    
    
More information about the llvm-commits
mailing list