[llvm] 03512ae - [exegesis][X86] ParallelSnippetGenerator: don't accidentally create serialized instructions

Roman Lebedev via llvm-commits llvm-commits at lists.llvm.org
Tue Sep 7 02:39:39 PDT 2021


Author: Roman Lebedev
Date: 2021-09-07T12:39:23+03:00
New Revision: 03512ae9bf31b725d8233c03094fd463b5f46285

URL: https://github.com/llvm/llvm-project/commit/03512ae9bf31b725d8233c03094fd463b5f46285
DIFF: https://github.com/llvm/llvm-project/commit/03512ae9bf31b725d8233c03094fd463b5f46285.diff

LOG: [exegesis][X86] ParallelSnippetGenerator: don't accidentally create serialized instructions

In the case of no tied variables, we pick random defs, and then random uses that don't alias with defs we just picked.
Sounds good, except that an X86 instruction may have implicit reg uses,
e.g. for `MULX` it's `EDX`/`RDX`: `Intel SDM, 4-162 Vol. 2B MULX — Unsigned Multiply Without Affecting Flags`
> Performs an unsigned multiplication of the implicit source operand (EDX/RDX) and the specified source operand
> (the third operand) and stores the low half of the result in the second destination (second operand), the high half
> of the result in the first destination operand (first operand), without reading or writing the arithmetic flags.

And indeed, every once in a while `llvm-exegesis` happened to pick EDX as a def while measuring throughput,
and producing garbage output:
```
$ ./bin/llvm-exegesis -num-repetitions=1000000 -mode=inverse_throughput -repetition-mode=min --loop-body-size=4096 -dump-object-to-disk=false -opcode-name=MULX32rr --max-configs-per-opcode=65536
---
mode:            inverse_throughput
key:
  instructions:
    - 'MULX32rr EDX R11D R12D'
  config:          ''
  register_initial_values:
    - 'R12D=0x0'
    - 'EDX=0x0'
cpu_name:        znver3
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 1000000
measurements:
  - { key: inverse_throughput, value: 4.00014, per_snippet_value: 4.00014 }
error:           ''
info:            instruction has no tied variables picking Uses different from defs
assembled_snippet: 415441BC00000000BA00000000C4C223F6D4C4C223F6D4C4C223F6D4C4C223F6D4415CC3415441BC00000000BA0000000049B80200000000000000C4C223F6D4C4C223F6D44983C0FF75F0415CC3
...
```
```
$ ./bin/llvm-exegesis -num-repetitions=1000000 -mode=inverse_throughput -repetition-mode=min --loop-body-size=4096 -dump-object-to-disk=false -opcode-name=MULX32rr --max-configs-per-opcode=65536
---
mode:            inverse_throughput
key:
  instructions:
    - 'MULX32rr R13D EDX ECX'
  config:          ''
  register_initial_values:
    - 'ECX=0x0'
    - 'EDX=0x0'
cpu_name:        znver3
llvm_triple:     x86_64-unknown-linux-gnu
num_repetitions: 1000000
measurements:
  - { key: inverse_throughput, value: 3.00013, per_snippet_value: 3.00013 }
error:           ''
info:            instruction has no tied variables picking Uses different from defs
assembled_snippet: 4155B900000000BA00000000C4626BF6E9C4626BF6E9C4626BF6E9C4626BF6E9415DC34155B900000000BA0000000049B80200000000000000C4626BF6E9C4626BF6E94983C0FF75F0415DC3
...
```
Oops! Not only does that not look fun, i did hit that pitfail during AMD Zen 3 enablement.
While i have since then addressed this in rGd4d459e7475b4bb0d15280f12ed669342fa5edcd,
i suspect there may be other buggy results lying around, so we should at least stop producing them.

Reviewed By: courbet

Differential Revision: https://reviews.llvm.org/D109275

Added: 
    

Modified: 
    llvm/tools/llvm-exegesis/lib/ParallelSnippetGenerator.cpp
    llvm/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp

Removed: 
    


################################################################################
diff  --git a/llvm/tools/llvm-exegesis/lib/ParallelSnippetGenerator.cpp b/llvm/tools/llvm-exegesis/lib/ParallelSnippetGenerator.cpp
index ca5e5c9191d5c..7728fcb5d62ea 100644
--- a/llvm/tools/llvm-exegesis/lib/ParallelSnippetGenerator.cpp
+++ b/llvm/tools/llvm-exegesis/lib/ParallelSnippetGenerator.cpp
@@ -186,25 +186,59 @@ ParallelSnippetGenerator::generateCodeTemplates(
     return getSingleton(std::move(CT));
   }
   // No tied variables, we pick random values for defs.
+
+  // We don't want to accidentally serialize the instruction,
+  // so we must be sure that we don't pick a def that is an implicit use,
+  // or a use that is an implicit def, so record implicit regs now.
+  BitVector ImplicitUses(State.getRegInfo().getNumRegs());
+  BitVector ImplicitDefs(State.getRegInfo().getNumRegs());
+  for (const auto &Op : Instr.Operands) {
+    if (Op.isReg() && Op.isImplicit() && !Op.isMemory()) {
+      assert(Op.isImplicitReg() && "Not an implicit register operand?");
+      if (Op.isUse())
+        ImplicitUses.set(Op.getImplicitReg());
+      else {
+        assert(Op.isDef() && "Not a use and not a def?");
+        ImplicitDefs.set(Op.getImplicitReg());
+      }
+    }
+  }
+  const auto ImplicitUseAliases =
+      getAliasedBits(State.getRegInfo(), ImplicitUses);
+  const auto ImplicitDefAliases =
+      getAliasedBits(State.getRegInfo(), ImplicitDefs);
   BitVector Defs(State.getRegInfo().getNumRegs());
   for (const auto &Op : Instr.Operands) {
     if (Op.isReg() && Op.isExplicit() && Op.isDef() && !Op.isMemory()) {
       auto PossibleRegisters = Op.getRegisterAliasing().sourceBits();
-      // Do not use forbidden registers.
+      // Do not use forbidden registers and regs that are implicitly used.
+      // Note that we don't try to avoid using implicit defs explicitly.
       remove(PossibleRegisters, ForbiddenRegisters);
-      assert(PossibleRegisters.any() && "No register left to choose from");
+      remove(PossibleRegisters, ImplicitUseAliases);
+      if (!PossibleRegisters.any())
+        return make_error<StringError>(
+            Twine("no available registers:\ncandidates:\n")
+                .concat(debugString(State.getRegInfo(),
+                                    Op.getRegisterAliasing().sourceBits()))
+                .concat("\nforbidden:\n")
+                .concat(debugString(State.getRegInfo(), ForbiddenRegisters))
+                .concat("\nimplicit use:\n")
+                .concat(debugString(State.getRegInfo(), ImplicitUseAliases)),
+            inconvertibleErrorCode());
       const auto RandomReg = randomBit(PossibleRegisters);
       Defs.set(RandomReg);
       Variant.getValueFor(Op) = MCOperand::createReg(RandomReg);
     }
   }
   // And pick random use values that are not reserved and don't alias with defs.
+  // Note that we don't try to avoid using implicit uses explicitly.
   const auto DefAliases = getAliasedBits(State.getRegInfo(), Defs);
   for (const auto &Op : Instr.Operands) {
     if (Op.isReg() && Op.isExplicit() && Op.isUse() && !Op.isMemory()) {
       auto PossibleRegisters = Op.getRegisterAliasing().sourceBits();
       remove(PossibleRegisters, ForbiddenRegisters);
       remove(PossibleRegisters, DefAliases);
+      remove(PossibleRegisters, ImplicitDefAliases);
       assert(PossibleRegisters.any() && "No register left to choose from");
       const auto RandomReg = randomBit(PossibleRegisters);
       Variant.getValueFor(Op) = MCOperand::createReg(RandomReg);

diff  --git a/llvm/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp b/llvm/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp
index 2fd278de07bb7..b430cafd0498b 100644
--- a/llvm/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp
+++ b/llvm/unittests/tools/llvm-exegesis/X86/SnippetGeneratorTest.cpp
@@ -154,29 +154,6 @@ TEST_F(X86SerialSnippetGeneratorTest,
   consumeError(std::move(Error));
 }
 
-TEST_F(X86SerialSnippetGeneratorTest,
-       AvoidSerializingThroughImplicitRegisters) {
-  // MULX32rr implicitly uses EDX. We should not select that register to avoid
-  // serialization.
-  const unsigned Opcode = X86::MULX32rr;
-  randomGenerator().seed(0); // Initialize seed.
-  const Instruction &Instr = State.getIC().getInstr(Opcode);
-  // Forbid all registers but RDX/EDX/DX/DH/DL. The only option would be to
-  // choose that register, but that would serialize the instruction, so we
-  // should be returning an error.
-  auto AllRegisters = State.getRATC().emptyRegisters();
-  AllRegisters.flip();
-  AllRegisters.reset(X86::RDX);
-  AllRegisters.reset(X86::EDX);
-  AllRegisters.reset(X86::DX);
-  AllRegisters.reset(X86::DH);
-  AllRegisters.reset(X86::DL);
-  auto Error =
-      Generator.generateCodeTemplates(&Instr, AllRegisters).takeError();
-  // FIXME: EXPECT_TRUE + consumeError(std::move(Error)).
-  EXPECT_FALSE((bool)Error);
-}
-
 TEST_F(X86SerialSnippetGeneratorTest, DependencyThroughOtherOpcode) {
   // - CMP64rr
   // - Op0 Explicit Use RegClass(GR64)
@@ -385,6 +362,29 @@ TEST_F(X86ParallelSnippetGeneratorTest, MOV16ms) {
               testing::HasSubstr("no available registers"));
 }
 
+TEST_F(X86ParallelSnippetGeneratorTest,
+       AvoidSerializingThroughImplicitRegisters) {
+  // MULX32rr implicitly uses EDX. We should not select that register to avoid
+  // serialization.
+  const unsigned Opcode = X86::MULX32rr;
+  randomGenerator().seed(0); // Initialize seed.
+  const Instruction &Instr = State.getIC().getInstr(Opcode);
+  // Forbid all registers but RDX/EDX/DX/DH/DL. The only option would be to
+  // choose that register, but that would serialize the instruction, so we
+  // should be returning an error.
+  auto AllRegisters = State.getRATC().emptyRegisters();
+  AllRegisters.flip();
+  AllRegisters.reset(X86::RDX);
+  AllRegisters.reset(X86::EDX);
+  AllRegisters.reset(X86::DX);
+  AllRegisters.reset(X86::DH);
+  AllRegisters.reset(X86::DL);
+  auto Err = Generator.generateCodeTemplates(&Instr, AllRegisters);
+  EXPECT_FALSE((bool)Err);
+  EXPECT_THAT(toString(Err.takeError()),
+              testing::HasSubstr("no available registers"));
+}
+
 class X86FakeSnippetGenerator : public SnippetGenerator {
 public:
   X86FakeSnippetGenerator(const LLVMState &State, const Options &Opts)


        


More information about the llvm-commits mailing list