[clang] [CIR] Build out AST consumer patterns to reach the entry point into CIRGen (PR #91007)

Aaron Ballman via cfe-commits cfe-commits at lists.llvm.org
Tue May 21 07:58:39 PDT 2024


AaronBallman wrote:

> >  I have no idea whether that returns a `SourceLocation`, a `PresumedLoc`, a `FullSourceLoc`, etc, so I don't think that is a use of `auto` I would want to have to reason about as a reviewer. That said, if the only use of `loc` is: `Diag(loc, diag::err_something);` then the type really doesn't matter and use of `auto` is probably fine. It's a judgment call.
> 
> For location, `getLoc` always return `mlir::Location`, it's not related to Clang's source location. CIRGen use of `loc` is mostly to pass it down to MLIR APIs, because MLIR source locations are required whenever one is building or rewriting an operation. We usually convert Clang's location into MLIR and pass those around as expected. Has that changed your opinion for this specific case?

My preference is still to spell it out; I would have had no idea that there was an `mlir::Location` type and presumed this was using Clang's source location types.

I can see the argument either way, but I prefer we optimize for readbility of the code by people *not* ensconced in MLIR. Otherwise this becomes harder for anyone other than MLIR folks to get involved with. That said, if it's inside of the `lib` directory and isn't part of a public interface, it's probably not the end of the world to stick with `auto`. I just don't think it benefits anyone to do so other than to make it easier to land the initial patches, which doesn't seem like a compelling reason to stick with `auto`.

> > Again, I have no idea what the type of that is and I have no way to verify that it actually _uses_ value semantics because the pointer and qualifiers can be inferred and references can be dropped. That makes it harder to review the code; I would spell out the type in this case as well.
> 
> Makes sense. For some background: MLIR "instructions", usually called "operations", usually have pretty default return types when querying operands, `mlir::Value` for SSA values or when an `mlir::Attribute` is returned, the getter/setter is suffixed with `Attr`. Would it be reasonable to suggest a guideline where in CIRGen we get more strict on not using `auto` but for other passes and transformations it should be fine to use auto on those? My rationale here is that post CIRGen it's very likely the reviwer audience is going to be more among MLIR experts than Clang experts and the former are more familiar with the idiom.

I think that's something we can live with, but like above, this makes it harder for the community in general while making it easier for folks already involved with MLIR.

> > My thinking is: 1) (most important) the convention should be consistent with other nearby code, whatever that convention happens to be.
> 
> Ok, for camelBack, it seems like `clang/lib/CodeGen` is already adopting it in multiple places (unsure if it's intentional or not?), and looks incomplete because probably no one took the effort to make it uniform? The signal this gives me is that we should just do it (use camelBack) for CIRGen.

That seems reasonable to me.

> Yep, thank you @AaronBallman for putting the time here to help us set the right approach looking fwd. To be more specific, I'd suggest:
> 
> - `clang/include` -> LLVM
> - `lib/Driver/*` -> LLVM
> - `lib/FrontendTool` -> LLVM
> - `lib/CIR/FrontendAction` -> LLVM
> - `lib/CIR/CodeGen` -> MLIR
> - `lib/CIR/*` -> MLIR
> 
> What do you think?

I think that's a reasonable compromise. @erichkeane WDYT?

https://github.com/llvm/llvm-project/pull/91007


More information about the cfe-commits mailing list