[PATCH] D23909: [X86] Remove DenseMap for storing FMA3 grouping information

Craig Topper via llvm-commits llvm-commits at lists.llvm.org
Thu Aug 25 23:41:12 PDT 2016


craig.topper created this revision.
craig.topper added reviewers: v_klochkov, RKSimon, mkuper, delena, spatel.
craig.topper added a subscriber: llvm-commits.

This patch removes the DenseMap that keeps track of FMA3 grouping information, avoiding both the startup cost of populating the map and its memory usage.

Three new bits are added to the instruction TSFlags to keep track of the form (132, 213, or 231) and whether the instruction is a scalar intrinsic.
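
As a rough illustration of how three flag bits can carry this information, here is a minimal sketch. The bit positions, names, and the two-bits-for-form plus one-bit-for-intrinsic split are assumptions made for the example and are not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

namespace sketch {
// Hypothetical bit positions; the patch chooses its own slots in TSFlags.
const unsigned FMA3FormShift = 40;
const uint64_t FMA3FormMask = 0x3ULL << FMA3FormShift;          // two bits: form
const uint64_t FMA3Intrinsic = 1ULL << (FMA3FormShift + 2);     // one bit: scalar intrinsic

// Zero means "not an FMA3 instruction" in this sketch.
enum FMA3Form : unsigned { NotFMA3 = 0, Form132 = 1, Form213 = 2, Form231 = 3 };

// Pack the form and the intrinsic flag into a TSFlags-style 64-bit word.
inline uint64_t encodeFMA3(FMA3Form Form, bool IsIntrinsic) {
  assert(Form <= Form231 && "form must fit in two bits");
  return (uint64_t(Form) << FMA3FormShift) | (IsIntrinsic ? FMA3Intrinsic : 0);
}

// Extract the form from the flags word.
inline FMA3Form getFMA3Form(uint64_t TSFlags) {
  return FMA3Form((TSFlags & FMA3FormMask) >> FMA3FormShift);
}

// Test the scalar-intrinsic bit.
inline bool isFMA3Intrinsic(uint64_t TSFlags) {
  return (TSFlags & FMA3Intrinsic) != 0;
}
} // namespace sketch
```

With an encoding like this, a query such as sketch::getFMA3Form(Flags) is a mask and a shift, with no map lookup.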

The three forms of each instruction are combined into groups, similar to the current code, but each group is now stored in a static table. Each table is sorted by the opcode of each form. Since opcode encodings are assigned alphabetically and each form is named the same except for the 132, 213, or 231, sorting by one form also sorts the other two. With the tables sorted, we can find the group for a given opcode by getting the form from the TSFlags and doing a binary search through the appropriate column of the table.
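
The lookup described above can be sketched as follows. The struct, table contents, and function name are made up for illustration (the opcode numbers are not real X86 opcodes); the point is only that every column is sorted, so a binary search keyed on the column for the instruction's form finds the row.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>

namespace sketch {
// One row per group; columns hold the 132, 213, and 231 opcodes in that order.
struct FMA3Group {
  uint16_t Opcodes[3];
};

// Made-up opcode values. Each column is sorted in increasing order.
static const FMA3Group Groups[] = {
    {{100, 200, 300}},
    {{101, 201, 301}},
    {{102, 202, 302}},
};

// FormIndex selects the column: 0 for 132, 1 for 213, 2 for 231.
// Returns the matching row, or nullptr if the opcode is not in that column.
inline const FMA3Group *findGroup(uint16_t Opcode, unsigned FormIndex) {
  assert(FormIndex < 3 && "invalid form column");
  const FMA3Group *Begin = std::begin(Groups);
  const FMA3Group *End = std::end(Groups);
  const FMA3Group *I = std::lower_bound(
      Begin, End, Opcode, [FormIndex](const FMA3Group &G, uint16_t Op) {
        return G.Opcodes[FormIndex] < Op;
      });
  if (I != End && I->Opcodes[FormIndex] == Opcode)
    return I;
  return nullptr;
}
} // namespace sketch
```

Once the row is found, the other two forms of the same instruction are simply the other two entries of that row.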

There are 6 tables, split by the evex.b bit, memory/register, and masked/unmasked. The two evex.b tables contain masked and unmasked together. The masked/unmasked split for non-evex.b makes it easy to populate the load folding tables. The instructions that use evex.b cannot be folded. For the tables without evex.b, each register table is the same size as its equivalent memory table and the opcodes are in the same order. Converting from register form to memory form is as simple as finding the row in one table and looking up the same row in the opposite table. Which table an opcode belongs to can be determined from other TSFlags bits.
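
To illustrate the register-to-memory conversion, here is a small sketch with one pair of parallel tables. All names and opcode values are hypothetical, and returning 0 for "not found" is a choice made for the example only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <iterator>

namespace sketch {
// Columns hold the 132, 213, and 231 opcodes in that order.
struct FMA3Group {
  uint16_t Opcodes[3];
};

// Parallel tables: row N of MemGroups is the memory form of row N of RegGroups.
// Opcode values are made up; each column is sorted.
static const FMA3Group RegGroups[] = {{{100, 200, 300}}, {{102, 202, 302}}};
static const FMA3Group MemGroups[] = {{{101, 201, 301}}, {{103, 203, 303}}};
static_assert(sizeof(RegGroups) == sizeof(MemGroups),
              "register and memory tables must have the same number of rows");

// Find the row of RegOpcode in the register table and return the opcode at
// the same row and column of the memory table. Returns 0 if not found.
inline uint16_t regToMem(uint16_t RegOpcode, unsigned FormIndex) {
  assert(FormIndex < 3 && "invalid form column");
  const FMA3Group *Begin = std::begin(RegGroups);
  const FMA3Group *End = std::end(RegGroups);
  const FMA3Group *I = std::lower_bound(
      Begin, End, RegOpcode, [FormIndex](const FMA3Group &G, uint16_t Op) {
        return G.Opcodes[FormIndex] < Op;
      });
  if (I == End || I->Opcodes[FormIndex] != RegOpcode)
    return 0;
  std::size_t Row = static_cast<std::size_t>(I - Begin);
  return MemGroups[Row].Opcodes[FormIndex];
}
} // namespace sketch
```

Because the two tables share row order, no separate register-to-memory mapping needs to be stored; the row index is the mapping.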

Currently the getFMA3Group function is a private function in X86InstrInfo.cpp, but could be made a static function in the X86InstrInfo class if it becomes needed outside this file.

The load folding table creation and the commuting code have been updated to use the new interface. The commuting code uses both new and existing TSFlags to determine additional information about the opcodes beyond which group they are in.

https://reviews.llvm.org/D23909

Files:
  lib/Target/X86/CMakeLists.txt
  lib/Target/X86/MCTargetDesc/X86BaseInfo.h
  lib/Target/X86/X86InstrAVX512.td
  lib/Target/X86/X86InstrFMA.td
  lib/Target/X86/X86InstrFMA3Info.cpp
  lib/Target/X86/X86InstrFMA3Info.h
  lib/Target/X86/X86InstrFormats.td
  lib/Target/X86/X86InstrInfo.cpp
  lib/Target/X86/X86InstrInfo.h

-------------- next part --------------
A non-text attachment was scrubbed...
Name: D23909.69316.patch
Type: text/x-patch
Size: 59800 bytes
Desc: not available
URL: <http://lists.llvm.org/pipermail/llvm-commits/attachments/20160826/3087ea2b/attachment.bin>

