[llvm-dev] Addressing TableGen's error "Ran out of lanemask bits" in order to use more than 32 subregisters per register

Fri Jul 28 04:30:27 PDT 2017

   Hello.
     I am coming back to this older thread.

     As I've said before, I managed to patch the various back-end files related to the 
lanemask in order to support up to 1024 vector lanes. To do so I use a 1024-bit lanemask 
of type uint1024_t from boost::multiprecision, instead of uint32_t, and I changed the 
following LLVM source files:
       [repository]/llvm/utils/TableGen/CodeGenRegisters.cpp
       [repository]/llvm/utils/TableGen/CodeGenRegisters.h
       [repository]/llvm/utils/TableGen/RegisterInfoEmitter.cpp
       [repository]/llvm/lib/CodeGen/TargetRegisterInfo.cpp
       [repository]/llvm/lib/CodeGen/MachineVerifier.cpp
       [repository]/llvm/include/llvm/Target/TargetRegisterInfo.h
       [repository]/llvm/include/llvm/MC/MCRegisterInfo.h
       [repository]/llvm/include/llvm/CodeGen/MachineBasicBlock.h
       [repository]/llvm/include/llvm/CodeGen/RegisterPressure.h
     I plan to contribute patches for these changes to the llvm-commits mailing list.
     I have tested these changes for more than 6 months with llc on various benchmarks, 
and things seem to work well.
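
     As an illustration only (a minimal sketch, not the actual patch): the idea of one 
mask bit per lane can be shown with std::bitset<1024> standing in for uint1024_t, so that 
the example does not need boost. The names WideLaneMask, LaneBitAllocator and lanesOverlap 
are hypothetical and do not exist in LLVM.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a widened lane mask; this is NOT LLVM's LaneBitmask.
constexpr std::size_t kMaxLanes = 1024;
using WideLaneMask = std::bitset<kMaxLanes>;

// Hands out one fresh mask bit per subregister index.
class LaneBitAllocator {
public:
  // Writes a mask with a single new bit into `out` and returns true,
  // or returns false once all kMaxLanes bits are used up
  // (the situation TableGen reports as "Ran out of lanemask bits").
  bool allocate(WideLaneMask &out) {
    if (next_ >= kMaxLanes)
      return false;
    out.reset();
    out.set(next_++);
    return true;
  }

private:
  std::size_t next_ = 0;
};

// Two masks cover a common lane exactly when they share at least one bit.
inline bool lanesOverlap(const WideLaneMask &a, const WideLaneMask &b) {
  return (a & b).any();
}

// Mask for a single lane index, e.g. to build the mask of a full register.
inline WideLaneMask laneBit(std::size_t lane) {
  assert(lane < kMaxLanes && "lane index out of range");
  WideLaneMask m;
  m.set(lane);
  return m;
}
```

     With a 32-bit mask the same allocator would fail at the 33rd subregister; widening the 
mask type only moves that limit, it does not remove it.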

     Besides these changes I added new vector types: basically all vector types that were 
not already present in LLVM, from 32 lanes to 1024, for the element types i8, i16, i32, i64 
and f16/32/64, etc. Examples of types that I needed are v128i1, v128i16 and v1024f64. The 
files I changed are:
       [repository]/llvm/include/llvm/CodeGen/ValueTypes.td
       [repository]/lib/IR/ValueTypes.cpp
       [repository]/include/llvm/IR/Intrinsics.td
       [repository]/llvm/include/llvm/CodeGen/MachineValueType.h
       [repository]/llvm/utils/TableGen/CodeGenTarget.cpp
     Please let me know if you would like these changes committed as well. They are rather 
complex, in the sense that there are a lot of small dependencies between these types.
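
     To give an idea of the sizes involved, here is a small sketch (illustration only, 
with the hypothetical names VecType, sizeInBits and typeName, none of which are LLVM APIs) 
that computes the width of a vector type from its lane count and element width:

```cpp
#include <cstdint>
#include <string>

// Hypothetical description of a fixed-width vector type such as v128i16.
struct VecType {
  unsigned numElts; // number of lanes
  unsigned eltBits; // bits per element
};

// Total width of the vector in bits: lanes times element width.
constexpr std::uint64_t sizeInBits(VecType t) {
  return static_cast<std::uint64_t>(t.numElts) * t.eltBits;
}

// Builds the usual short name, with kind 'i' for integer or 'f' for floating point.
inline std::string typeName(const VecType &t, char kind) {
  return "v" + std::to_string(t.numElts) + kind + std::to_string(t.eltBits);
}

static_assert(sizeInBits(VecType{128, 1}) == 128, "v128i1 is 128 bits wide");
static_assert(sizeInBits(VecType{128, 16}) == 2048, "v128i16 is 2048 bits wide");
static_assert(sizeInBits(VecType{1024, 64}) == 65536, "v1024f64 is 65536 bits wide");
```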

   Best regards,
     Alex


On 9/20/2016 12:48 PM, Alex Susu wrote:
>   Hello.
>     I managed to use SIMD units with more than 32 lanes (32 subregisters per vector
> register) in TableGen, llc and opt. For example, I use SIMD instructions with types
> v128i16 and v512i16.
>
>     An important question I have is whether it is OK to add the types IIT_V128 = 37, IIT_V256
> = 38 like I did below:
>         enum IIT_Info {
>           ...
>           IIT_V2   = 9,
>           IIT_V4   = 10,
>           IIT_V8   = 11,
>           IIT_V16  = 12,
>           IIT_V32  = 13,
>           ...
>           IIT_V64  = 16,
>           IIT_V1   = 28,
>           IIT_VEC_OF_PTRS_TO_ELT = 33,
>           IIT_V512 = 35,
>           IIT_V1024 = 36,
>
>           /* Alex: added these new values. Note that these IIT_* that I add below must be
> defined in llvm.org/docs/doxygen/html/Function_8cpp.html also */
>           IIT_V128 = 37,
>           IIT_V256 = 38
>         };
>
>     I ask because enum IIT_Info has some values that are not consecutive for vector types
> for intrinsics (used e.g. in include/llvm/IR/Intrinsics*.td).
>     Although not important, I wonder why I still need to define them again (since these
> values are basically already defined in ValueTypes.td)?
>
>
>     So, I managed to get the code compiled. I had issues because I did not synchronize the
> following code:
>       - enum IIT_Info defined in files llvm/utils/TableGen/IntrinsicEmitter.cpp and
> llvm/lib/IR/Function.cpp;
>       - enum SimpleValueType defined in files llvm/include/llvm/CodeGen/ValueTypes.td and
> llvm/include/llvm/CodeGen/MachineValueType.h .
>     Because these were out of sync, I was getting errors like:
>       - "error:unhandled vector type width in intrinsic!", "error:unhandled MVT in
> intrinsic!"
>       - "Not a vector MVT!", "getSizeInBits called on extended MVT."
>
>
>   Best regards,
>     Alex
>
> On 9/19/2016 12:14 AM, Alex Susu wrote:
>>   Hello.
>>     I've managed to patch the various files from the back end related to lanemask - now I
>> have 1024-bit long lanemask.
>>     But now I get the following error when running make llc:
>>         <<error:unhandled vector type width in intrinsic!>>
>>     This error comes from the file
>> https://github.com/llvm-mirror/llvm/blob/master/utils/TableGen/IntrinsicEmitter.cpp
>> and is caused by the fact that there is no IIT_V128 (nor IIT_V256), and there is a switch
>> case using them in method static void EncodeFixedType(Record *R,
>> std::vector<unsigned char> &ArgCodes, std::vector<unsigned char> &Sig).
>>
>>     Is there any reason these enum IIT_Info values (IIT_V128, IIT_V256) are not added in file
>> /IntrinsicEmitter.cpp?
>>
>>   Thank you,
>>     Alex
>>
>>
>> On Tue, Sep 13, 2016 at 1:47 AM, Matthias Braun <mbraun at apple.com
>> <mailto:mbraun at apple.com>> wrote:
>>
>>
>>     > On Sep 8, 2016, at 6:37 AM, Alex Susu via llvm-dev <llvm-dev at lists.llvm.org
>> <mailto:llvm-dev at lists.llvm.org>> wrote:
>>     >
>>     >  Hello.
>>     >    In my TableGen back end description I need to use more than 32 (e.g., 128,
>> 1024, etc) subregisters per register for my research SIMD processor. So far I have used
>> 32 subregisters successfully.
>>     >
>>     >    However, with 128 subregisters, when I now run the command:
>>     >      llvm-tblgen -gen-register-info Connex.td
>>     >     I get an error message "error:Ran out of lanemask bits to represent
>> subregister sub_16_033".
>>     >
>>     >    To handle this limitation, I started editing the files where this error comes
>> from:
>>     >      llvm/utils/TableGen/CodeGenRegisters.h
>>     >      llvm/utils/TableGen/CodeGenRegisters.cpp
>>     >    More exactly, the error comes from the fact the member LaneMask of the classes
>> CodeGenSubRegIndex and CodeGenRegister is unsigned (i.e., 32 bits). So for every
>> lane/subregister we require a bit from the type LaneMask.
>>     >    I plan to use type long (or even type int1024_t from the boost library, header
>> cpp_int.hpp) for LaneMask and change accordingly the methods handling the type.
>>     >
>>     >    Is there any limitation I am not aware of (maybe in LLVM's register
>> allocator) that would prevent me from using more than 32 lanes/subregisters?
>>
>>     There is no known limitation. I chose uint32_t out of concern for compile time. Going
>>     up to uint64_t should be no problem; I'd be more concerned about bigger types.
>>     Hopefully all code properly uses the LaneBitmask type instead of plain unsigned; you
>>     may need a few fixes in that area.
>>     (For history: We had a scheme in the past where the liveness tracking mapped all lanes
>>     after lane 31 to the bit 32, however that turned out to need special code in some
>>     places that turned out to be a constant source of bugs that typically only happened in
>>     big and hard to debug inputs so we moved away from this scheme).
>>
>>     - Matthias
>>
>>
>>