[PATCH] D146278: [flang] add -flang-experimental-hlfir flag to flang-new

Slava Zakharin via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Mar 20 20:03:20 PDT 2023


vzakhari added a comment.

LGTM
Please address @awarzynski's comment about the test.



================
Comment at: flang/include/flang/Tools/CLOptions.inc:235
+///   passes pipeline
+inline void createHLFIRToFIRPassPipeline(mlir::PassManager &pm,
+    llvm::OptimizationLevel optLevel = defaultOptLevel) {
----------------
tblah wrote:
> vzakhari wrote:
> > Would you mind also calling this in `bbc`  driver?
> Adding this to bbc will have to wait until after `-emit-fir` and `-emit-hlfir` are different flags. Otherwise hlfir ops will be lowered to fir, breaking some tests (and presumably people's workflows).
Okay! Thank you for considering it!
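For illustration only, here is a minimal sketch of what calling this pipeline from a driver such as bbc could look like once `-emit-fir` and `-emit-hlfir` are separate flags. The `emitFIR` flag variable, the `fir::` qualification, and the surrounding `context`/`module` objects are assumptions, not part of this patch:

  mlir::PassManager pm(&context);
  // Hypothetical flag: lower HLFIR to FIR only when plain FIR output is requested.
  if (emitFIR)
    fir::createHLFIRToFIRPassPipeline(pm, llvm::OptimizationLevel::O0);
  if (mlir::failed(pm.run(*module)))
    return mlir::failure();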


================
Comment at: flang/include/flang/Tools/CLOptions.inc:238
+  pm.addPass(mlir::createCanonicalizerPass());
+  pm.addPass(hlfir::createLowerHLFIRIntrinsicsPass());
+  pm.addPass(hlfir::createBufferizeHLFIRPass());
----------------
tblah wrote:
> vzakhari wrote:
> > I would imagine we may not want to optimize MATMUL(TRANSPOSE) into MATMUL_TRANSPOSE at O0.  What is the best way to control this?  We may either disable canonicalization or let `LowerHLFIRIntrinsicsPass` lower MATMUL_TRANSPOSE differently based on the optimization level.  Or is it always okay to implement it as a combined operation?
> So far as I know, there should be no loss of precision from implementing it as a combined operation. Memory usage should be reduced, as we need one fewer temporary.
> 
> If static linking is used, including MATMUL_TRANSPOSE in the runtime library will increase code size (so long as both matmul and transpose are also called elsewhere). I haven't measured this, but I wouldn't expect it to be a large change relative to the size of a real-world application.
> 
> If dynamic linking is used, then whether or not this pass runs, MATMUL_TRANSPOSE will make the runtime library a little larger (there are a lot of template instantiations, but MATMUL_TRANSPOSE is only one of many similar functions, so the effect as a proportion of the whole should be small).
> 
> But I'll set the canonicalization pass to only run when we are optimizing for speed. Later canonicalization passes (after createLowerHLFIRIntrinsicsPass) won't find any hlfir.matmul operations to canonicalize, and so won't create a hlfir.matmul_transpose operation.
Thank you!
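For reference, a rough sketch of the gating described above, assuming the pipeline keeps the shape shown in the diff; the `isOptimizingForSpeed()` check and the trailing comment are illustrative, and the form actually landed may differ:

  inline void createHLFIRToFIRPassPipeline(mlir::PassManager &pm,
      llvm::OptimizationLevel optLevel = defaultOptLevel) {
    // Canonicalize only when optimizing for speed, so hlfir.matmul is not
    // rewritten into hlfir.matmul_transpose at -O0.
    if (optLevel.isOptimizingForSpeed())
      pm.addPass(mlir::createCanonicalizerPass());
    pm.addPass(hlfir::createLowerHLFIRIntrinsicsPass());
    pm.addPass(hlfir::createBufferizeHLFIRPass());
    // ... remaining HLFIR-to-FIR conversion passes ...
  }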


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146278/new/

https://reviews.llvm.org/D146278


