[PATCH] D146278: [flang] add -flang-experimental-hlfir flag to flang-new

Tom Eccles via Phabricator via cfe-commits cfe-commits at lists.llvm.org
Mon Mar 20 07:19:51 PDT 2023


tblah added inline comments.


================
Comment at: flang/include/flang/Tools/CLOptions.inc:235
+///   passes pipeline
+inline void createHLFIRToFIRPassPipeline(mlir::PassManager &pm,
+    llvm::OptimizationLevel optLevel = defaultOptLevel) {
----------------
vzakhari wrote:
> Would you mind also calling this in `bbc`  driver?
Adding this to bbc will have to wait until `-emit-fir` and `-emit-hlfir` become separate flags. Otherwise, HLFIR ops will be lowered to FIR, breaking some tests (and presumably people's workflows).


================
Comment at: flang/include/flang/Tools/CLOptions.inc:238
+  pm.addPass(mlir::createCanonicalizerPass());
+  pm.addPass(hlfir::createLowerHLFIRIntrinsicsPass());
+  pm.addPass(hlfir::createBufferizeHLFIRPass());
----------------
vzakhari wrote:
> I would imagine we may not want to optimize MATMUL(TRANSPOSE) into MATMUL_TRANSPOSE at O0.  What is the best way to control this?  We may either disable canonicalization or let `LowerHLFIRIntrinsicsPass` lower MATMUL_TRANSPOSE differently based on the optimization level.  Or is it always okay to implement it as a combined operation?
So far as I know, there should be no loss of precision from implementing it as a combined operation. Memory usage should be reduced because we need one fewer temporary.

If static linking is used, including MATMUL_TRANSPOSE in the runtime library will increase code size (so long as both matmul and transpose are also called elsewhere). I haven't measured this, but I wouldn't expect the change to be large relative to the size of a real-world application.

If dynamic linking is used, MATMUL_TRANSPOSE will make the runtime library a little larger whether or not this pass runs (there are a lot of template instantiations, but MATMUL_TRANSPOSE is only one of many similar functions, so the effect as a proportion of the whole should be small).

But I'll set the canonicalization pass to only run when we are optimizing for speed. Later canonicalization passes (after createLowerHLFIRIntrinsicsPass) won't find any hlfir.matmul operations to canonicalize and so won't create a hlfir.matmul_transpose operation.
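For illustration only, here is a minimal Python sketch (not the actual Flang runtime implementation) of why the combined operation saves a temporary: the fused form reads the first operand with swapped indices instead of materializing `transpose(a)`.

```python
# Illustrative sketch only -- NOT the Flang runtime's implementation.
# Shows why fusing TRANSPOSE into MATMUL (hlfir.matmul_transpose) needs
# one fewer temporary: the transposed matrix is never materialized.

def matmul_transpose_two_step(a, b):
    """Compute matmul(transpose(a), b) with an explicit temporary for
    transpose(a), as the unfused lowering would."""
    rows_a, cols_a = len(a), len(a[0])
    t = [[a[i][j] for i in range(rows_a)] for j in range(cols_a)]  # temporary
    return [[sum(t[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(t))]

def matmul_transpose_fused(a, b):
    """Compute matmul(transpose(a), b) in one pass by swapping the indices
    into `a`; no transposed temporary is allocated."""
    return [[sum(a[k][i] * b[k][j] for k in range(len(a)))
             for j in range(len(b[0]))] for i in range(len(a[0]))]
```

Both produce identical results, which is why the fusion costs no precision; the fused form simply skips allocating and filling the intermediate array.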


================
Comment at: flang/include/flang/Tools/CLOptions.inc:251
+    bool stackArrays = false, bool underscoring = true, bool useHLFIR = false) {
+  if (useHLFIR)
+    fir::createHLFIRToFIRPassPipeline(pm, optLevel);
----------------
vzakhari wrote:
> Is this check and the option really needed?  Except for the extra canonicalizer pass the newly added passes will be no-ops, so maybe we should just run them unconditionally.
> 
> It may be too much to ask for, but will it make sense not to bundle these passes into `MLIRToLLVM` pipeline and have the possibility to let driver emit post-HLFIR-lowering MLIR (under `-emit-fir`) and pre-HLFIR-lowering (under some new option)?
Okay, I'll run the HLFIR passes unconditionally.

I made -emit-fir output HLFIR to match what bbc already does, but I can see the usefulness of having both -emit-fir and -emit-hlfir. I won't do it in this patch, but I will get to it at some point (feel free to jump in sooner if you need the flag for something).


================
Comment at: flang/test/Driver/mlir-pass-pipeline.f90:15-18
+! O2-NEXT: Canonicalizer
+! ALL-NEXT: LowerHLFIRIntrinsics
+! ALL-NEXT: BufferizeHLFIR
+! ALL-NEXT: ConvertHLFIRtoFIR
----------------
awarzynski wrote:
> It looks like these passes are run unconditionally - what's the `-flang-experimental-hlfir` flag for then?
Yes, @vzakhari requested that the passes run unconditionally.

`-flang-experimental-hlfir` controls whether the HLFIR lowering is used.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D146278/new/

https://reviews.llvm.org/D146278


