[PATCH] D60586: [Clang] Conversion of a switch table to a bitmap is not profitable for -Os and -Oz compilation

Ramakota Reddy via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Thu Apr 11 14:24:17 PDT 2019


ramred01 created this revision.
ramred01 added reviewers: hans, lattner.
ramred01 added a project: clang.
Herald added subscribers: kristof.beyls, javed.absar.

When compiling for code size, conversion of a switch table into a bitmap is not profitable. The bitmap requires materializing a constant value of 32-bit or 64-bit width into a register, which usually requires 2 - 3 instructions followed by a either a left-shift followed by a right-shift _or_ a bit extract instruction to get the desired result. Whereas, indexing into a jump table would cost at most 2 instructions, plus the size of the jump table.

The bitmap is profitable when compiling for speed since it removes an indexed load, which is costly. But when compiling for size, it simply bloats the code without adding any real value for embedded targets.

The issue is pronounced when the switch result is a sub-word type (e.g., short or char). With word types, unless a double word type exists on the machine, this issue is not seen.

The following code demonstrates this behaviour:

short test(unsigned a) {

short t;
switch (a) {
case 0:

  t = 500;
  break;

case 1:

  t = 200;
  break;

case 2:

  t = 17000;    
  break;

default:

  t = 0;
  break;

}
return (t);
}

On AArch64, it uses 3 instructions to materialize 6 bytes of data and then performs a logical shift left followed by a logical shift right to get the result of the switch. Whereas, simply storing the data in a RO table and indexing into it would have generated two instructions to load the table address followed by one instruction to index into the table and load the result.

We fix the issue by adding a new Codegen option called -fno-switch-bitmap (and its converse -fswitch-bitmap) and make the generation of the Bitmap conditional. Then we modify the Clang FE to pass this switch when either the user explicitly passes this switch or the user is compiling for AArch64 with either -Os or -Oz. Currently we restrict this default option to AArch64 alone since we do not know which other architecture may benefit from this. Other architectures can also make this the default for -Os and -Oz once it is verified.

This is the clang part of the patch.  The LLVM part of the patch is in the review ID D60584 <https://reviews.llvm.org/D60584>


https://reviews.llvm.org/D60586

Files:
  include/clang/Basic/CodeGenOptions.def
  include/clang/Driver/Options.td
  lib/CodeGen/BackendUtil.cpp
  lib/Driver/ToolChains/Clang.cpp
  lib/Frontend/CompilerInvocation.cpp


Index: lib/Frontend/CompilerInvocation.cpp
===================================================================
--- lib/Frontend/CompilerInvocation.cpp
+++ lib/Frontend/CompilerInvocation.cpp
@@ -1284,6 +1284,8 @@
 
   Opts.Addrsig = Args.hasArg(OPT_faddrsig);
 
+  Opts.NoSwitchBitmap = Args.hasArg(OPT_fno_switch_bitmap);
+
   if (Arg *A = Args.getLastArg(OPT_msign_return_address_EQ)) {
     StringRef SignScope = A->getValue();
 
Index: lib/Driver/ToolChains/Clang.cpp
===================================================================
--- lib/Driver/ToolChains/Clang.cpp
+++ lib/Driver/ToolChains/Clang.cpp
@@ -5340,6 +5340,14 @@
                        TC.useIntegratedAs()))
     CmdArgs.push_back("-faddrsig");
 
+  // If the target is Aarch64 and we are optimizing for size, we don't want to
+  // convert switch tables into bitmaps.
+  if (Args.hasFlag(options::OPT_fno_switch_bitmap, options::OPT_fswitch_bitmap,
+                   (Triple.getArch() == llvm::Triple::aarch64)) &&
+      getOptimizationLevelSize(Args)) {
+    CmdArgs.push_back("-fno-switch-bitmap");
+  }
+  
   // Finally add the compile command to the compilation.
   if (Args.hasArg(options::OPT__SLASH_fallback) &&
       Output.getType() == types::TY_Object &&
Index: lib/CodeGen/BackendUtil.cpp
===================================================================
--- lib/CodeGen/BackendUtil.cpp
+++ lib/CodeGen/BackendUtil.cpp
@@ -473,6 +473,7 @@
   Options.DebuggerTuning = CodeGenOpts.getDebuggerTuning();
   Options.EmitStackSizeSection = CodeGenOpts.StackSizeSection;
   Options.EmitAddrsig = CodeGenOpts.Addrsig;
+  Options.EmitSwitchBitmap = CodeGenOpts.NoSwitchBitmap;
 
   if (CodeGenOpts.getSplitDwarfMode() != CodeGenOptions::NoFission)
     Options.MCOptions.SplitDwarfFile = CodeGenOpts.SplitDwarfFile;
@@ -558,6 +559,7 @@
   PMBuilder.PrepareForThinLTO = CodeGenOpts.PrepareForThinLTO;
   PMBuilder.PrepareForLTO = CodeGenOpts.PrepareForLTO;
   PMBuilder.RerollLoops = CodeGenOpts.RerollLoops;
+  PMBuilder.NoSwitchBitmap = CodeGenOpts.NoSwitchBitmap;
 
   MPM.add(new TargetLibraryInfoWrapperPass(*TLII));
 
Index: include/clang/Driver/Options.td
===================================================================
--- include/clang/Driver/Options.td
+++ include/clang/Driver/Options.td
@@ -774,6 +774,10 @@
   HelpText<"Emit an address-significance table">;
 def fno_addrsig : Flag<["-"], "fno-addrsig">, Group<f_Group>, Flags<[CoreOption]>,
   HelpText<"Don't emit an address-significance table">;
+def fswitch_bitmap : Flag<["-"], "fswitch-bitmap">, Group<f_Group>, 
+    Flags<[CC1Option]>, HelpText<"Convert switch to bitmap">;
+def fno_switch_bitmap : Flag<["-"], "fno-switch-bitmap">, Group<f_Group>,
+    Flags<[CoreOption, CC1Option]>, HelpText<"Do not convert switch to bitmap">;
 def fblocks : Flag<["-"], "fblocks">, Group<f_Group>, Flags<[CoreOption, CC1Option]>,
   HelpText<"Enable the 'blocks' language feature">;
 def fbootclasspath_EQ : Joined<["-"], "fbootclasspath=">, Group<f_Group>;
Index: include/clang/Basic/CodeGenOptions.def
===================================================================
--- include/clang/Basic/CodeGenOptions.def
+++ include/clang/Basic/CodeGenOptions.def
@@ -356,6 +356,9 @@
 /// Whether to emit an address-significance table into the object file.
 CODEGENOPT(Addrsig, 1, 0)
 
+/// Whether to convert a switch table into a bitmap or not.
+CODEGENOPT(NoSwitchBitmap, 1, 0)
+
 ENUM_CODEGENOPT(SignReturnAddress, SignReturnAddressScope, 2, None)
 ENUM_CODEGENOPT(SignReturnAddressKey, SignReturnAddressKeyValue, 1, AKey)
 CODEGENOPT(BranchTargetEnforcement, 1, 0)


-------------- next part --------------
A non-text attachment was scrubbed...
Name: D60586.194759.patch
Type: text/x-patch
Size: 3610 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20190411/9d93bc0e/attachment.bin>


More information about the llvm-commits mailing list