[PATCH] D102794: [MCA] llvm-mca MCTargetStreamer segfault fix (Resubmission)

Patrick Holland via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed May 19 11:55:44 PDT 2021


holland11 created this revision.
holland11 added reviewers: andreadb, qcolombet.
holland11 added a project: LLVM.
Herald added subscribers: pengfei, gbedwell.
holland11 requested review of this revision.

**This is a resubmission as my first submission had an error in the test file (I failed to include the -mcpu command argument which caused build failures after it was committed).**

In order to create the code regions for llvm-mca to analyze, llvm-mca creates an `AsmCodeRegionGenerator` and calls `AsmCodeRegionGenerator::parseCodeRegions()`. Within this function, both an `MCAsmParser` and `MCTargetAsmParser` are created so that `MCAsmParser::Run()` can be used to create the code regions for us.

These parser classes were created for llvm-mc so they are designed to emit code with an `MCStreamer` and `MCTargetStreamer` that are expected to be setup and passed into the `MCAsmParser` constructor. Because llvm-mca doesn’t want to emit any code, an `MCStreamerWrapper` class gets created instead and passed into the `MCAsmParser` constructor. This wrapper inherits from `MCStreamer` and overrides many of the emit methods to just do nothing. The exception is the `emitInstruction()` method which calls `Regions.addInstruction(Inst)`.

This works well and allows llvm-mca to utilize llvm-mc’s `MCAsmParser` to build our code regions, however there are a few directives which rely on the `MCTargetStreamer`. llvm-mc assumes that the `MCStreamer` that gets passed into the `MCAsmParser`’s constructor has a valid pointer to an `MCTargetStreamer`. Because llvm-mca doesn’t setup an `MCTargetStreamer`, when the parser encounters one of those directives, a segfault will occur.

In x86, each one of these 7 directives will cause this segfault if they exist in the input assembly to llvm-mca:

  .cv_fpo_proc
  .cv_fpo_setframe
  .cv_fpo_pushreg
  .cv_fpo_stackalloc
  .cv_fpo_stackalign
  .cv_fpo_endprologue
  .cv_fpo_endproc

I haven’t looked at other targets, but I wouldn’t be surprised if some of the other ones also have certain directives which could result in this same segfault.

My proposed solution is to simply initialize an `MCTargetStreamer` after we initialize the `MCStreamerWrapper`. The `MCTargetStreamer` requires an ostream object, but we don’t actually want any of these directives to be emitted anywhere, so I use an ostream created with the `nulls()` function. Since this needs to happen after the `MCStreamerWrapper` has been initialized, it needs to happen within the `AsmCodeRegionGenerator::parseCodeRegions()` function. The `MCTargetStreamer` also needs an `MCInstPrinter` which is easiest to initialize within the `main()` function of llvm-mca. So this `MCInstPrinter` gets constructed within `main()` then passed into the `parseCodeRegions()` function as a parameter. (If you feel like it would be appropriate and possible to create the `MCInstPrinter` within the `parseCodeRegions()` function, then feel free to modify my solution. That would stop us from having to pass it into the function and would limit its scope / lifetime.)

My solution stops the segfault from happening and still passes all of the current (expected) llvm-mca tests. I also added a new test for x86 that checks for this segfault on an input that includes one of the .cv_fpo directives (this test fails without my solution, but passes with it).

As far as I can tell, all of the functions that I modified are only called from within llvm-mca so there shouldn’t be any worries about breaking other tools.

Assembly input that causes the segfault (relevant directive on line 5):

      .section    __TEXT,__text,regular,pure_instructions
      .build_version macos, 11, 0    sdk_version 11, 3
      .globl    __Z3funi                ## -- Begin function _Z3funi
      .p2align    4, 0x90
      .cv_fpo_pushreg ebx
  __Z3funi:                               ## @_Z3funi
      .cfi_startproc
  ## %bb.0:
      pushq    %rbp
      .cfi_def_cfa_offset 16
      .cfi_offset %rbp, -16
      movq    %rsp, %rbp
      .cfi_def_cfa_register %rbp
      movl    %edi, -4(%rbp)
      movl    -4(%rbp), %eax
      shll    $1, %eax
      popq    %rbp
      retq
      .cfi_endproc
                                          ## -- End function
      .globl    _main                   ## -- Begin function main
      .p2align    4, 0x90
  _main:                                  ## @main
      .cfi_startproc
  ## %bb.0:
      pushq    %rbp
      .cfi_def_cfa_offset 16
      .cfi_offset %rbp, -16
      movq    %rsp, %rbp
      .cfi_def_cfa_register %rbp
      subq    $32, %rsp
      movl    $0, -4(%rbp)
      movl    $5, -8(%rbp)
      movl    $3, -12(%rbp)
      movl    -8(%rbp), %eax
      imull    -12(%rbp), %eax
      movl    %eax, -16(%rbp)
      movl    -16(%rbp), %edi
      callq    __Z3funi
      movl    %eax, -20(%rbp)
      movl    -20(%rbp), %eax
      addq    $32, %rsp
      popq    %rbp
      retq
      .cfi_endproc
                                          ## -- End function
  .subsections_via_symbols

Output when attempting to run this file through llvm-mca without the fix:

  patrick at MacBook-Pro bin % pwd
  /Users/patrick/Desktop/llvm/open_llvm/llvm-project/build/release/bin
  patrick at MacBook-Pro bin % ./llvm-mca /Users/patrick/Desktop/files/temp_x86_crash.s
  
  
  
  /Users/patrick/Desktop/files/temp_x86_crash.s:2:2: warning: .build_version macos used while targeting darwin20.4.0
          .build_version macos, 11, 0     sdk_version 11, 3
          ^
  Assertion failed: (getParser().getStreamer().getTargetStreamer() && "do not have a target streamer"), function getTargetStreamer, file /Users/patrick/Desktop/llvm/open_llvm/llvm-project/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp, line 117.
  PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
  Stack dump:
  0.    Program arguments: ./llvm-mca /Users/patrick/Desktop/files/temp_x86_crash.s
  Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
  0  llvm-mca                 0x0000000103eac5f7 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 39
  1  llvm-mca                 0x0000000103eab528 llvm::sys::RunSignalHandlers() + 248
  2  llvm-mca                 0x0000000103eacfd0 SignalHandler(int) + 272
  3  libsystem_platform.dylib 0x00007fff20511d7d _sigtramp + 29
  4  libsystem_platform.dylib 0x00007fb15e00b701 _sigtramp + 18446743739737020833
  5  libsystem_c.dylib        0x00007fff20421411 abort + 120
  6  libsystem_c.dylib        0x00007fff204207e8 err + 0
  7  llvm-mca                 0x0000000103fe2541 (anonymous namespace)::X86AsmParser::ParseDirective(llvm::AsmToken) (.cold.13) + 33
  8  llvm-mca                 0x0000000103b7d426 (anonymous namespace)::X86AsmParser::ParseDirective(llvm::AsmToken) + 5238
  9  llvm-mca                 0x0000000103e15a69 (anonymous namespace)::AsmParser::parseStatement((anonymous namespace)::ParseStatementInfo&, llvm::MCAsmParserSemaCallback*) + 3513
  10 llvm-mca                 0x0000000103e0fddc (anonymous namespace)::AsmParser::Run(bool, bool) + 540
  11 llvm-mca                 0x0000000103a215d3 llvm::mca::AsmCodeRegionGenerator::parseCodeRegions() + 243
  12 llvm-mca                 0x0000000103a16725 main + 4261
  13 libdyld.dylib            0x00007fff204e7f3d start + 1
  
  
  
  zsh: abort      ./llvm-mca /Users/patrick/Desktop/files/temp_x86_crash.s
  patrick at MacBook-Pro bin % 

Output when attempting to run the file after the fix (only relevant part of the output is that it succeeds):

  patrick at MacBook-Pro bin % pwd
  /Users/patrick/Desktop/llvm/open_llvm/llvm-project/build/release/bin
  patrick at MacBook-Pro bin % ./llvm-mca /Users/patrick/Desktop/files/temp_x86_crash.s
  
  
  
  /Users/patrick/Desktop/files/temp_x86_crash.s:2:2: warning: .build_version macos used while targeting darwin20.4.0
          .build_version macos, 11, 0     sdk_version 11, 3
          ^
  warning: found a return instruction in the input assembly sequence.
  note: program counter updates are ignored.
  warning: found a call in the input assembly sequence.
  note: call instructions are not correctly modeled. Assume a latency of 100cy.
  Iterations:        100
  Instructions:      2300
  Total Cycles:      2231
  Total uOps:        3700
  
  Dispatch Width:    6
  uOps Per Cycle:    1.66
  IPC:               1.03
  Block RThroughput: 9.0
  
  
  Instruction Info:
  [1]: #uOps
  [2]: Latency
  [3]: RThroughput
  [4]: MayLoad
  [5]: MayStore
  [6]: HasSideEffects (U)
  
  [1]    [2]    [3]    [4]    [5]    [6]    Instructions:
   3      2     1.00           *            pushq    %rbp
   1      1     0.25                        movq    %rsp, %rbp
   1      1     1.00           *            movl    %edi, -4(%rbp)
   1      5     0.50    *                   movl    -4(%rbp), %eax
   1      1     0.50                        shll    %eax
   2      6     0.50    *                   popq    %rbp
   3      7     1.00                  U     retq
   3      2     1.00           *            pushq    %rbp
   1      1     0.25                        movq    %rsp, %rbp
   1      1     0.25                        subq    $32, %rsp
   1      1     1.00           *            movl    $0, -4(%rbp)
   1      1     1.00           *            movl    $5, -8(%rbp)
   1      1     1.00           *            movl    $3, -12(%rbp)
   1      5     0.50    *                   movl    -8(%rbp), %eax
   2      8     1.00    *                   imull    -12(%rbp), %eax
   1      1     1.00           *            movl    %eax, -16(%rbp)
   1      5     0.50    *                   movl    -16(%rbp), %edi
   4      3     1.00                        callq    __Z3funi
   1      1     1.00           *            movl    %eax, -20(%rbp)
   1      5     0.50    *                   movl    -20(%rbp), %eax
   1      1     0.25                        addq    $32, %rsp
   2      6     0.50    *                   popq    %rbp
   3      7     1.00                  U     retq
  
  
  Resources:
  [0]   - SKLDivider
  [1]   - SKLFPDivider
  [2]   - SKLPort0
  [3]   - SKLPort1
  [4]   - SKLPort2
  [5]   - SKLPort3
  [6]   - SKLPort4
  [7]   - SKLPort5
  [8]   - SKLPort6
  [9]   - SKLPort7
  
  
  Resource pressure per iteration:
  [0]    [1]    [2]    [3]    [4]    [5]    [6]    [7]    [8]    [9]    
   -      -     3.99   3.99   6.00   6.00   9.00   3.85   4.17   6.00   
  
  Resource pressure by instruction:
  [0]    [1]    [2]    [3]    [4]    [5]    [6]    [7]    [8]    [9]    Instructions:
   -      -     0.19   0.01   0.05    -     1.00   0.15   0.65   0.95   pushq    %rbp
   -      -     0.01   0.16    -      -      -     0.83    -      -     movq    %rsp, %rbp
   -      -      -      -     0.94    -     1.00    -      -     0.06   movl    %edi, -4(%rbp)
   -      -      -      -     0.05   0.95    -      -      -      -     movl    -4(%rbp), %eax
   -      -     0.85    -      -      -      -      -     0.15    -     shll    %eax
   -      -     0.16   0.83   0.01   0.99    -      -     0.01    -     popq    %rbp
   -      -     0.16   0.20   0.64   0.36    -     0.64   1.00    -     retq
   -      -      -     0.01    -      -     1.00   0.20   0.79   1.00   pushq    %rbp
   -      -      -     0.22    -      -      -     0.78    -      -     movq    %rsp, %rbp
   -      -     0.18   0.78    -      -      -      -     0.04    -     subq    $32, %rsp
   -      -      -      -     0.34   0.02   1.00    -      -     0.64   movl    $0, -4(%rbp)
   -      -      -      -      -      -     1.00    -      -     1.00   movl    $5, -8(%rbp)
   -      -      -      -      -      -     1.00    -      -     1.00   movl    $3, -12(%rbp)
   -      -      -      -     0.04   0.96    -      -      -      -     movl    -8(%rbp), %eax
   -      -      -     1.00   0.96   0.04    -      -      -      -     imull    -12(%rbp), %eax
   -      -      -      -      -     0.65   1.00    -      -     0.35   movl    %eax, -16(%rbp)
   -      -      -      -     0.04   0.96    -      -      -      -     movl    -16(%rbp), %edi
   -      -     0.94   0.62   0.01    -     1.00   0.23   0.21   0.99   callq    __Z3funi
   -      -      -      -     0.95   0.04   1.00    -      -     0.01   movl    %eax, -20(%rbp)
   -      -      -      -     0.96   0.04    -      -      -      -     movl    -20(%rbp), %eax
   -      -     0.18    -      -      -      -     0.67   0.15    -     addq    $32, %rsp
   -      -     0.67    -     0.04   0.96    -     0.16   0.17    -     popq    %rbp
   -      -     0.65   0.16   0.97   0.03    -     0.19   1.00    -     retq
  
  
  
  patrick at MacBook-Pro bin % 


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D102794

Files:
  llvm/test/tools/llvm-mca/X86/cv_fpo_directive_no_segfault.s
  llvm/tools/llvm-mca/CodeRegionGenerator.cpp
  llvm/tools/llvm-mca/CodeRegionGenerator.h
  llvm/tools/llvm-mca/llvm-mca.cpp


Index: llvm/tools/llvm-mca/llvm-mca.cpp
===================================================================
--- llvm/tools/llvm-mca/llvm-mca.cpp
+++ llvm/tools/llvm-mca/llvm-mca.cpp
@@ -389,9 +389,28 @@
   std::unique_ptr<MCInstrAnalysis> MCIA(
       TheTarget->createMCInstrAnalysis(MCII.get()));
 
+  // Need to initialize an MCInstPrinter as it is
+  // required for initializing the MCTargetStreamer
+  // which needs to happen within the CRG.parseCodeRegions() call below.
+  // Without an MCTargetStreamer, certain assembly directives can trigger a
+  // segfault. (For example, the .cv_fpo_proc directive on x86 will segfault if
+  // we don't initialize the MCTargetStreamer.)
+  unsigned IPtempOutputAsmVariant =
+      OutputAsmVariant == -1 ? 0 : OutputAsmVariant;
+  std::unique_ptr<MCInstPrinter> IPtemp(TheTarget->createMCInstPrinter(
+      Triple(TripleName), IPtempOutputAsmVariant, *MAI, *MCII, *MRI));
+  if (!IPtemp) {
+    WithColor::error()
+        << "unable to create instruction printer for target triple '"
+        << TheTriple.normalize() << "' with assembly variant "
+        << IPtempOutputAsmVariant << ".\n";
+    return 1;
+  }
+
   // Parse the input and create CodeRegions that llvm-mca can analyze.
   mca::AsmCodeRegionGenerator CRG(*TheTarget, SrcMgr, Ctx, *MAI, *STI, *MCII);
-  Expected<const mca::CodeRegions &> RegionsOrErr = CRG.parseCodeRegions();
+  Expected<const mca::CodeRegions &> RegionsOrErr =
+      CRG.parseCodeRegions(std::move(IPtemp));
   if (!RegionsOrErr) {
     if (auto Err =
             handleErrors(RegionsOrErr.takeError(), [](const StringError &E) {
Index: llvm/tools/llvm-mca/CodeRegionGenerator.h
===================================================================
--- llvm/tools/llvm-mca/CodeRegionGenerator.h
+++ llvm/tools/llvm-mca/CodeRegionGenerator.h
@@ -39,7 +39,8 @@
 public:
   CodeRegionGenerator(SourceMgr &SM) : Regions(SM) {}
   virtual ~CodeRegionGenerator();
-  virtual Expected<const CodeRegions &> parseCodeRegions() = 0;
+  virtual Expected<const CodeRegions &>
+  parseCodeRegions(const std::unique_ptr<MCInstPrinter> &IP) = 0;
 };
 
 /// This class is responsible for parsing input ASM and generating
@@ -60,7 +61,8 @@
         AssemblerDialect(0) {}
 
   unsigned getAssemblerDialect() const { return AssemblerDialect; }
-  Expected<const CodeRegions &> parseCodeRegions() override;
+  Expected<const CodeRegions &>
+  parseCodeRegions(const std::unique_ptr<MCInstPrinter> &IP) override;
 };
 
 } // namespace mca
Index: llvm/tools/llvm-mca/CodeRegionGenerator.cpp
===================================================================
--- llvm/tools/llvm-mca/CodeRegionGenerator.cpp
+++ llvm/tools/llvm-mca/CodeRegionGenerator.cpp
@@ -106,11 +106,21 @@
   Regions.beginRegion(Comment, Loc);
 }
 
-Expected<const CodeRegions &> AsmCodeRegionGenerator::parseCodeRegions() {
+Expected<const CodeRegions &> AsmCodeRegionGenerator::parseCodeRegions(
+    const std::unique_ptr<MCInstPrinter> &IP) {
   MCTargetOptions Opts;
   Opts.PreserveAsmComments = false;
   MCStreamerWrapper Str(Ctx, Regions);
 
+  // Need to initialize an MCTargetStreamer otherwise
+  // certain asm directives will cause a segfault.
+  // Using nulls() so that anything emitted by the MCTagetStreamer
+  // doesn't show up in the llvm-mca output.
+  raw_ostream &OSRef = nulls();
+  formatted_raw_ostream FOSRef(OSRef);
+  TheTarget.createAsmTargetStreamer(Str, FOSRef, IP.get(),
+                                    /*IsVerboseAsm=*/true);
+
   // Create a MCAsmParser and setup the lexer to recognize llvm-mca ASM
   // comments.
   std::unique_ptr<MCAsmParser> Parser(
Index: llvm/test/tools/llvm-mca/X86/cv_fpo_directive_no_segfault.s
===================================================================
--- /dev/null
+++ llvm/test/tools/llvm-mca/X86/cv_fpo_directive_no_segfault.s
@@ -0,0 +1,9 @@
+# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -mcpu=x86-64 -resource-pressure=false -instruction-info=false < %s | FileCheck %s
+
+.cv_fpo_pushreg ebx
+add %eax, %eax
+add %ebx, %ebx
+add %ecx, %ecx
+add %edx, %edx
+
+# CHECK:      Iterations:        100


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D102794.346526.patch
Type: text/x-patch
Size: 4121 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20210519/a02aa7ad/attachment-0001.bin>


More information about the llvm-commits mailing list