[llvm] e5d59db - [MCA] llvm-mca MCTargetStreamer segfault fix

Andrea Di Biagio via llvm-commits llvm-commits at lists.llvm.org
Wed May 19 10:37:26 PDT 2021


Author: Patrick Holland
Date: 2021-05-19T18:36:10+01:00
New Revision: e5d59db46938caf0280f7347e89181f2a7b0749c

URL: https://github.com/llvm/llvm-project/commit/e5d59db46938caf0280f7347e89181f2a7b0749c
DIFF: https://github.com/llvm/llvm-project/commit/e5d59db46938caf0280f7347e89181f2a7b0749c.diff

LOG: [MCA] llvm-mca MCTargetStreamer segfault fix

In order to create the code regions for llvm-mca to analyze, llvm-mca creates an
AsmCodeRegionGenerator and calls AsmCodeRegionGenerator::parseCodeRegions().
Within this function, both an MCAsmParser and MCTargetAsmParser are created so
that MCAsmParser::Run() can be used to create the code regions for us.

These parser classes were created for llvm-mc so they are designed to emit code
with an MCStreamer and MCTargetStreamer that are expected to be setup and passed
into the MCAsmParser constructor. Because llvm-mca doesn’t want to emit any
code, an MCStreamerWrapper class gets created instead and passed into the
MCAsmParser constructor. This wrapper inherits from MCStreamer and overrides
many of the emit methods to just do nothing. The exception is the
emitInstruction() method which calls Regions.addInstruction(Inst).

This works well and allows llvm-mca to utilize llvm-mc’s MCAsmParser to build
our code regions, however there are a few directives which rely on the
MCTargetStreamer. llvm-mc assumes that the MCStreamer that gets passed into the
MCAsmParser’s constructor has a valid pointer to an MCTargetStreamer. Because
llvm-mca doesn’t setup an MCTargetStreamer, when the parser encounters one of
those directives, a segfault will occur.

In x86, each one of these 7 directives will cause this segfault if they exist in
the input assembly to llvm-mca:

.cv_fpo_proc
.cv_fpo_setframe
.cv_fpo_pushreg
.cv_fpo_stackalloc
.cv_fpo_stackalign
.cv_fpo_endprologue
.cv_fpo_endproc
I haven’t looked at other targets, but I wouldn’t be surprised if some of the
other ones also have certain directives which could result in this same
segfault.

My proposed solution is to simply initialize an MCTargetStreamer after we
initialize the MCStreamerWrapper. The MCTargetStreamer requires an ostream
object, but we don’t actually want any of these directives to be emitted
anywhere, so I use an ostream created with the nulls() function. Since this
needs to happen after the MCStreamerWrapper has been initialized, it needs to
happen within the AsmCodeRegionGenerator::parseCodeRegions() function. The
MCTargetStreamer also needs an MCInstPrinter which is easiest to initialize
within the main() function of llvm-mca. So this MCInstPrinter gets constructed
within main() then passed into the parseCodeRegions() function as a parameter.
(If you feel like it would be appropriate and possible to create the
MCInstPrinter within the parseCodeRegions() function, then feel free to modify
my solution. That would stop us from having to pass it into the function and
would limit its scope / lifetime.)

My solution stops the segfault from happening and still passes all of the
current (expected) llvm-mca tests. I also added a new test for x86 that checks
for this segfault on an input that includes one of the .cv_fpo directives (this
test fails without my solution, but passes with it).

As far as I can tell, all of the functions that I modified are only called from
within llvm-mca so there shouldn’t be any worries about breaking other tools.

Differential Revision: https://reviews.llvm.org/D102709

Added: 
    llvm/test/tools/llvm-mca/X86/cv_fpo_directive_no_segfault.s

Modified: 
    llvm/tools/llvm-mca/CodeRegionGenerator.cpp
    llvm/tools/llvm-mca/CodeRegionGenerator.h
    llvm/tools/llvm-mca/llvm-mca.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/test/tools/llvm-mca/X86/cv_fpo_directive_no_segfault.s b/llvm/test/tools/llvm-mca/X86/cv_fpo_directive_no_segfault.s
new file mode 100644
index 0000000000000..68f719770caf2
--- /dev/null
+++ b/llvm/test/tools/llvm-mca/X86/cv_fpo_directive_no_segfault.s
@@ -0,0 +1,9 @@
+# RUN: llvm-mca -mtriple=x86_64-unknown-unknown -resource-pressure=false -instruction-info=false < %s | FileCheck %s
+
+.cv_fpo_pushreg ebx
+add %eax, %eax
+add %ebx, %ebx
+add %ecx, %ecx
+add %edx, %edx
+
+# CHECK:      Iterations:        100

diff  --git a/llvm/tools/llvm-mca/CodeRegionGenerator.cpp b/llvm/tools/llvm-mca/CodeRegionGenerator.cpp
index 831b76ab80cf7..6ad2a65592b96 100644
--- a/llvm/tools/llvm-mca/CodeRegionGenerator.cpp
+++ b/llvm/tools/llvm-mca/CodeRegionGenerator.cpp
@@ -106,11 +106,21 @@ void MCACommentConsumer::HandleComment(SMLoc Loc, StringRef CommentText) {
   Regions.beginRegion(Comment, Loc);
 }
 
-Expected<const CodeRegions &> AsmCodeRegionGenerator::parseCodeRegions() {
+Expected<const CodeRegions &> AsmCodeRegionGenerator::parseCodeRegions(
+    const std::unique_ptr<MCInstPrinter> &IP) {
   MCTargetOptions Opts;
   Opts.PreserveAsmComments = false;
   MCStreamerWrapper Str(Ctx, Regions);
 
+  // Need to initialize an MCTargetStreamer otherwise
+  // certain asm directives will cause a segfault.
+  // Using nulls() so that anything emitted by the MCTagetStreamer
+  // doesn't show up in the llvm-mca output.
+  raw_ostream &OSRef = nulls();
+  formatted_raw_ostream FOSRef(OSRef);
+  TheTarget.createAsmTargetStreamer(Str, FOSRef, IP.get(),
+                                    /*IsVerboseAsm=*/true);
+
   // Create a MCAsmParser and setup the lexer to recognize llvm-mca ASM
   // comments.
   std::unique_ptr<MCAsmParser> Parser(

diff  --git a/llvm/tools/llvm-mca/CodeRegionGenerator.h b/llvm/tools/llvm-mca/CodeRegionGenerator.h
index 9a10aa2c148b7..1c11784ca3fbb 100644
--- a/llvm/tools/llvm-mca/CodeRegionGenerator.h
+++ b/llvm/tools/llvm-mca/CodeRegionGenerator.h
@@ -39,7 +39,8 @@ class CodeRegionGenerator {
 public:
   CodeRegionGenerator(SourceMgr &SM) : Regions(SM) {}
   virtual ~CodeRegionGenerator();
-  virtual Expected<const CodeRegions &> parseCodeRegions() = 0;
+  virtual Expected<const CodeRegions &>
+  parseCodeRegions(const std::unique_ptr<MCInstPrinter> &IP) = 0;
 };
 
 /// This class is responsible for parsing input ASM and generating
@@ -60,7 +61,8 @@ class AsmCodeRegionGenerator final : public CodeRegionGenerator {
         AssemblerDialect(0) {}
 
   unsigned getAssemblerDialect() const { return AssemblerDialect; }
-  Expected<const CodeRegions &> parseCodeRegions() override;
+  Expected<const CodeRegions &>
+  parseCodeRegions(const std::unique_ptr<MCInstPrinter> &IP) override;
 };
 
 } // namespace mca

diff  --git a/llvm/tools/llvm-mca/llvm-mca.cpp b/llvm/tools/llvm-mca/llvm-mca.cpp
index 72bf2bc134129..4a4230ce4523d 100644
--- a/llvm/tools/llvm-mca/llvm-mca.cpp
+++ b/llvm/tools/llvm-mca/llvm-mca.cpp
@@ -389,9 +389,28 @@ int main(int argc, char **argv) {
   std::unique_ptr<MCInstrAnalysis> MCIA(
       TheTarget->createMCInstrAnalysis(MCII.get()));
 
+  // Need to initialize an MCInstPrinter as it is
+  // required for initializing the MCTargetStreamer
+  // which needs to happen within the CRG.parseCodeRegions() call below.
+  // Without an MCTargetStreamer, certain assembly directives can trigger a
+  // segfault. (For example, the .cv_fpo_proc directive on x86 will segfault if
+  // we don't initialize the MCTargetStreamer.)
+  unsigned IPtempOutputAsmVariant =
+      OutputAsmVariant == -1 ? 0 : OutputAsmVariant;
+  std::unique_ptr<MCInstPrinter> IPtemp(TheTarget->createMCInstPrinter(
+      Triple(TripleName), IPtempOutputAsmVariant, *MAI, *MCII, *MRI));
+  if (!IPtemp) {
+    WithColor::error()
+        << "unable to create instruction printer for target triple '"
+        << TheTriple.normalize() << "' with assembly variant "
+        << IPtempOutputAsmVariant << ".\n";
+    return 1;
+  }
+
   // Parse the input and create CodeRegions that llvm-mca can analyze.
   mca::AsmCodeRegionGenerator CRG(*TheTarget, SrcMgr, Ctx, *MAI, *STI, *MCII);
-  Expected<const mca::CodeRegions &> RegionsOrErr = CRG.parseCodeRegions();
+  Expected<const mca::CodeRegions &> RegionsOrErr =
+      CRG.parseCodeRegions(std::move(IPtemp));
   if (!RegionsOrErr) {
     if (auto Err =
             handleErrors(RegionsOrErr.takeError(), [](const StringError &E) {


        


More information about the llvm-commits mailing list