[PATCH] D108602: [RISCV] Initial support .insn directive for the assembler.

Alex Bradbury via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Sep 2 05:02:10 PDT 2021


asb added a comment.

I made a start on .insn support quite some time ago (and unfortunately got distracted with other things and didn't get it to the point it was worth sharing - apologies for that).

I actually had an idea for doing it in a slightly different way vs SystemZ (that SystemZ could hopefully be moved towards as well). Essentially, if we parse `.insn` like an instruction and not like a directive, we can reuse most of the existing tablegen machinery for asm parsing.

e.g. adding a new hook to AsmParser:

  --- a/llvm/include/llvm/MC/MCParser/MCTargetAsmParser.h
  +++ b/llvm/include/llvm/MC/MCParser/MCTargetAsmParser.h
  @@ -416,6 +416,13 @@ public:
     /// \param DirectiveID - the identifier token of the directive.
     virtual bool ParseDirective(AsmToken DirectiveID) = 0;
   
  +  /// Return true if the given IDVal should be parsed as a directive. Called
  +  /// whenever a "normal" directive is encountered (i.e. not conditional
  +  /// assembly directives). Targets may use this to override default parsing
  +  /// behaviour, for instance to allow instruction-like directives such as
  +  /// .insn to be parsed as an instruction.
  +  virtual bool shouldParseAsDirective(StringRef IDVal) { return true; }
  +
     /// MatchAndEmitInstruction - Recognize a series of operands of a parsed
     /// instruction as an actual MCInst and emit it to the specified MCStreamer.
     /// This returns false on success and returns true on failure to match.
  diff --git a/llvm/lib/MC/MCParser/AsmParser.cpp b/llvm/lib/MC/MCParser/AsmParser.cpp
  index c05f26cbdda5..c3fd85f4a9e0 100644
  --- a/llvm/lib/MC/MCParser/AsmParser.cpp
  +++ b/llvm/lib/MC/MCParser/AsmParser.cpp
  @@ -1889,7 +1889,8 @@ bool AsmParser::parseStatement(ParseStatementInfo &Info,
     // Otherwise, we have a normal instruction or directive.
   
     // Directives start with "."
  -  if (IDVal.startswith(".") && IDVal != ".") {
  +  if (IDVal.startswith(".") && IDVal != "." &&
  +      getTargetParser().shouldParseAsDirective(IDVal)) {
       // There are several entities interested in parsing directives:
       //
       // 1. The target-specific assembly parser. Some directives are target

Then implement that hook in RISCVAsmParser:

  @@ -1756,6 +1808,17 @@ bool RISCVAsmParser::ParseInstruction(ParseInstructionInfo &Info,
     // First operand is token for instruction
     Operands.push_back(RISCVOperand::createToken(Name, NameLoc, isRV64()));
   
  +  // If parsing a .insn directive, then create a token containing the
  +  // instruction format identifier.
  +  if (Name == ".insn") {
  +    SMLoc Loc = getLoc();
  +    if (!getLexer().is(AsmToken::Identifier))
  +      return Error(Loc, ".insn should be followed by an instruction format");
  +    StringRef InstFormat = getParser().getTok().getIdentifier();
  +    getParser().Lex();
  +    Operands.push_back(RISCVOperand::createToken(InstFormat, Loc, isRV64()));
  +  }
  +
     // If there are no more operands, then finish
     if (getLexer().is(AsmToken::EndOfStatement))
       return false;
  @@ -1847,6 +1910,10 @@ bool RISCVAsmParser::ParseDirective(AsmToken DirectiveID) {
     return true;
   }
   
  +bool RISCVAsmParser::shouldParseAsDirective(StringRef IDVal) {
  +  return IDVal != ".insn";
  +}
  +

Then with some extra machinery for printing opcode values etc, you can make definitions in RISCVInstrInfo.td like:

  +//===----------------------------------------------------------------------===//
  +// .insn directive
  +//===----------------------------------------------------------------------===//
  +
  +class OpcodeVal<string name, bits<7> enc> {
  +  string Name = name;
  +  bits<7> Encoding = enc;
  +}
  +
  +def OpcodeValList : GenericTable {
  +  let FilterClass = "OpcodeVal";
  +  let Fields = [ "Name", "Encoding" ];
  +
  +  let PrimaryKey = [ "Encoding" ];
  +  let PrimaryKeyName = "lookupOpcodeValByEncoding";
  +}
  +
  +def lookupOpcodeValByName : SearchIndex {
  +  let Table = OpcodeValList;
  +  let Key = [ "Name" ];
  +}
  +
  +// These names match those accepted by the GNU assembler's .insn directive.
  +def : OpcodeVal<"C0", 0x0>;
  +def : OpcodeVal<"C1", 0x1>;
  +def : OpcodeVal<"C2", 0x2>;
  +def : OpcodeVal<"LOAD", 0x03>;
  +def : OpcodeVal<"LOAD_FP", 0x07>;
  +def : OpcodeVal<"CUSTOM_0", 0x0b>;
  +def : OpcodeVal<"MISC_MEM", 0x0f>;
  +def : OpcodeVal<"OP_IMM", 0x13>;
  +def : OpcodeVal<"AUIPC", 0x17>;
  +def : OpcodeVal<"OP_IMM_32", 0x1b>;
  +def : OpcodeVal<"STORE", 0x23>;
  +def : OpcodeVal<"STORE_FP", 0x27>;
  +def : OpcodeVal<"CUSTOM_1", 0x2b>;
  +def : OpcodeVal<"AMO", 0x2f>;
  +def : OpcodeVal<"OP", 0x33>;
  +def : OpcodeVal<"LUI", 0x37>;
  +def : OpcodeVal<"OP_32", 0x3b>;
  +def : OpcodeVal<"MADD", 0x43>;
  +def : OpcodeVal<"MSUB", 0x47>;
  +def : OpcodeVal<"NMADD", 0x4f>;
  +def : OpcodeVal<"NMSUB", 0x4b>;
  +def : OpcodeVal<"OP_FP", 0x53>;
  +def : OpcodeVal<"CUSTOM_2", 0x5b>;
  +def : OpcodeVal<"BRANCH", 0x63>;
  +def : OpcodeVal<"JALR", 0x67>;
  +def : OpcodeVal<"JAL", 0x6f>;
  +def : OpcodeVal<"SYSTEM", 0x73>;
  +def : OpcodeVal<"CUSTOM_3", 0x7b>;
  +
  +// Note that for all directive definitions, the passed constants (e.g. funct3,
  +// funct3, opcode) are all dummy values that are overridden by the .insn
  +// definition.
  +
  +let hasSideEffects = 1, mayLoad = 1, mayStore = 1,
  +    isAsmParserOnly = 1 in {
  +
  +def INSN_R : RVInstR<0, 0, OPC_LOAD, (outs GPR:$rd), (ins opcode_value:$opcode,
  +                     uimm3:$funct3, uimm7:$funct7, GPR:$rs1, GPR:$rs2),
  +                     ".insn r", "$opcode, $funct3, $funct7, $rd, $rs1, $rs2">,
  +             Sched<[]> {
  +  bits<7> funct7;
  +  bits<3> funct3;
  +  bits<7> opcode;
  +
  +  let Inst{31-25} = funct7;
  +  let Inst{14-12} = funct3;
  +  let Inst{6-0} = opcode;
  +}
  +
  +def INSN_I : RVInstI<0, OPC_LOAD, (outs GPR:$rd), (ins opcode_value:$opcode,
  +                     uimm3:$funct3, GPR:$rs1, simm12:$imm12), ".insn i",
  +                     "$opcode, $funct3, $rd, $rs1, $imm12">,
  +      Sched<[]> {
  +  bits<3> funct3;
  +  bits<7> opcode;
  +
  +  let Inst{14-12} = funct3;
  +  let Inst{6-0} = opcode;
  +}

I'm afraid the branch were I was playing with this got a bit messy so it's hard to extract a full patch, but hopefully that's enough to outline the idea (and I can spend more time pulling bits out if it's useful).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108602/new/

https://reviews.llvm.org/D108602



More information about the llvm-commits mailing list