[llvm-dev] Lowering ISD::TRUNCATE

Mon Aug 6 12:08:57 PDT 2018

I'm working on defining the instructions and implementing the lowering 
code for a Z80 backend. For now, the backend supports only the native 
CPU-supported datatypes, which are 8 and 16 bits wide (i.e. no 32 bit 
long, float, ... yet).

So far, a lot of the simple stuff like immediate loads and return values 
has been very straightforward, but now I am stuck on ISD::TRUNCATE, as in:

     typedef unsigned char uint8_t;
     uint8_t Func(uint8_t val1) { return val1 + val1; }

built with -O0 results in:

     target datalayout = "e-m:o-S8-p:16:8-p1:8:8-i16:8-i32:8-a:8-n8:16"
     target triple = "z80"
     ; Function Attrs: noinline nounwind optnone
     define dso_local zeroext i8 @Func(i8 zeroext %val1) #0 {
     entry:
       %val1.addr = alloca i8, align 1
       store i8 %val1, i8* %val1.addr, align 1
       %0 = load i8, i8* %val1.addr, align 1
       %conv = zext i8 %0 to i16
       %1 = load i8, i8* %val1.addr, align 1
       %conv1 = zext i8 %1 to i16
       %add = add nsw i16 %conv, %conv1
       %conv2 = trunc i16 %add to i8
       ret i8 %conv2
     }

I looked into the X86 backend, which has a Z80-like register design, 
i.e. being able to access the subregs AL (and AH) from AX directly, 
without any specific truncation operation necessary. But, to be honest, 
I do not really understand from the code where and how the i16 to i8 
case is handled.

So returning an 8 bit result would simply require loading the lower 8 
bits ("AL" on X86) of the resulting 16 bit value (%add) into the 8 bit 
return register, as defined by the calling convention.
(Or, to be Z80 specific: the 16 bit add operation will be "ADD HL,DE" and 
the calling convention defines register "A" for the i8 return value, so 
the last two IR lines should emit something like "LD A,L / RET".)
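
To make the intended semantics concrete, here is a small plain C model 
(not LLVM code) of the sequence described above. It is only a sketch: 
the RegPair type and all function names are hypothetical and exist 
purely for illustration. It checks that reading the low half of a 16 bit 
register pair after a 16 bit add gives the same result as the i8 
truncation in the IR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a Z80 16 bit register pair such as HL:
   'lo' stands for L, 'hi' stands for H. Illustrative names only. */
typedef struct {
    uint8_t lo;
    uint8_t hi;
} RegPair;

/* Split a 16 bit value into its two 8 bit halves. */
static RegPair make_pair(uint16_t v) {
    RegPair r;
    r.lo = (uint8_t)(v & 0xFFu);
    r.hi = (uint8_t)(v >> 8);
    return r;
}

/* Recombine the two halves into the 16 bit value. */
static uint16_t pair_value(RegPair r) {
    return (uint16_t)(((uint16_t)r.hi << 8) | r.lo);
}

/* Models "ADD HL,DE": a 16 bit add with wraparound. */
static RegPair add16(RegPair hl, RegPair de) {
    return make_pair((uint16_t)(pair_value(hl) + pair_value(de)));
}

/* Models "LD A,L": trunc i16 to i8 is just reading the low subreg. */
static uint8_t trunc_via_subreg(RegPair hl) {
    return hl.lo;
}

/* Func from the post, computed through the register model. */
static uint8_t func_model(uint8_t val1) {
    RegPair hl = make_pair(val1); /* zext i8 to i16: high half is 0 */
    RegPair de = make_pair(val1);
    return trunc_via_subreg(add16(hl, de));
}

/* Reference: what the C source computes after integer promotion. */
static uint8_t func_ref(uint8_t val1) {
    return (uint8_t)(val1 + val1);
}

/* Compare the model against the reference for every i8 input. */
static void check_all_inputs(void) {
    for (unsigned v = 0; v < 256; ++v)
        assert(func_model((uint8_t)v) == func_ref((uint8_t)v));
}
```

Calling check_all_inputs() exercises all 256 input values. The model 
says nothing about how LLVM should select the instruction; it only 
shows that the subreg read is a valid implementation of the truncate.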

That said, what is the correct way to implement ISD::TRUNCATE in 
the backend, using the CPU's capability that truncating i16 to i8 is 
simply accessing an i16 register's subreg?

Should this be handled in "LowerOperation" or in "PerformDAGCombine"?
Or could this be done with a target-independent combine?
Would returning true in "isTruncateFree" suffice?
Is any lowering code needed at all?

The X86 backend seems to do both: it calls 
"setTargetDAGCombine(ISD::TRUNCATE)", but then also registers a lot of 
MVTs via "setOperationAction(...,Custom)", depending on things like 
soft-float.
I guess I'm

And second:
In my case, with only i16 and i8 data types, are there other 
truncation operations that need to be supported? Is there any scenario 
where i8 to 
i1 is needed? My first guess was for conditional branching, but my tests 
showed that it works with flags, comparing "not equal" or "not zero", so 
I assume not.

Michael