[all-commits] [llvm/llvm-project] bfbccf: Add support for math.ctlz in convert-math-to-funcs

Mon Apr 10 10:02:23 PDT 2023

  Branch: refs/heads/main
  Home:   https://github.com/llvm/llvm-project
  Commit: bfbccfa17c97f29993555fcc4d9f191ba49d9606
      https://github.com/llvm/llvm-project/commit/bfbccfa17c97f29993555fcc4d9f191ba49d9606
  Author: Jeremy Kun <j2kun at users.noreply.github.com>
  Date:   2023-04-10 (Mon, 10 Apr 2023)

  Changed paths:
    M mlir/include/mlir/Conversion/Passes.td
    M mlir/lib/Conversion/MathToFuncs/CMakeLists.txt
    M mlir/lib/Conversion/MathToFuncs/MathToFuncs.cpp
    A mlir/test/Conversion/MathToFuncs/ctlz.mlir
    A mlir/test/Integration/Dialect/Math/CPU/mathtofuncs_ctlz.mlir
    M utils/bazel/llvm-project-overlay/mlir/BUILD.bazel

  Log Message:
  -----------
  Add support for math.ctlz in convert-math-to-funcs

This change adds a software implementation of the `math.ctlz` operation
and includes it in `--convert-math-to-funcs`.

This is my first change to MLIR, so please bear with me as I'm still learning
the idioms of the codebase.

The context for this change is that I have some larger scale project in which
I'd like to lower from a mix of MLIR dialects to CIRCT, but many of the CIRCT
passes don't support the `math` dialect.

I noticed the content of `convert-math-to-funcs` was limited entirely to
the `pow` functions, but otherwise provided the needed structure to implement
this feature with minimal changes.

Highlight of the changes:

- Add a dependence on the SCF dialect for this lower. I could have lowered
  directly to cf, following the pow lowerings in the same pass, but I felt it
  was not necessary given the existing support for lowering scf to cf.
- Generalize the DenseMap storing op implementations: modify the callback
  function hashmap to be keyed by both OperationType (for me this effectively
  means the name of the op being implemented in software) and the type
  signature of the resulting function.
- Implement the ctlz function as a loop. I had researched a variety of
  implementations that claimed to be more efficient (such as those based on a
  de Bruijn sequence), but it seems to me that the simplest approach would make
  it easier for later compiler optimizations to do a better (platform-aware)
  job optimizing this than I could do by hand.

Questions I had for the reviewer:

- [edit: found mlir-cpu-runner and added two tests] What would I add to the filecheck invocation to actually run the resulting MLIR on a value and assert the output is correct? I have done this manually with the C implementation but I'm not confident my port to MLIR is correct.
- Should I add a test for a vectorized version of this lowering? I followed suit with the ` VecOpToScalarOp` but I admit I don't fully understand what it's doing.

Reviewed By: vzakhari

Differential Revision: https://reviews.llvm.org/D146261