[LLVMdev] Signed/unsigned value type resolution

Tue Nov 1 15:23:07 PDT 2011

Hi all,

I am currently working on a static analysis aimed at integer  
arithmetic overflow/underflow detection. We are attempting to build a  
sound abstract domain (based on Cousot & Cousot-style abstract  
interpretation), but practically speaking this really requires the  
ability to figure out the word size and signedness of values in the  
intermediate representation. I'm well aware that LLVM leverages the
(usual) equivalence of certain arithmetic operations in two's
complement form with respect to signedness, but from a program
analysis point of view it can be very important to know whether, for
example, 0xFFFF means 65535 or -1 (assuming 16 bits), particularly
when values are represented by conceptually infinitary abstract
domains.
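
To make the ambiguity concrete, here is a minimal sketch in plain C++ (not LLVM API code) of how one 16-bit pattern maps to two different mathematical integers. The helper names asUnsigned16 and asSigned16 are made up for illustration only.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: read a 16-bit pattern as an unsigned integer,
// widened to 64 bits so the result is an ordinary mathematical value.
inline std::int64_t asUnsigned16(std::uint16_t bits) {
    return static_cast<std::int64_t>(bits);
}

// Hypothetical helper: read the same 16 bits as a two's complement
// signed integer. If the sign bit is set, subtract 2^16 explicitly
// rather than relying on a narrowing cast.
inline std::int64_t asSigned16(std::uint16_t bits) {
    const std::int64_t wide = static_cast<std::int64_t>(bits);
    return (bits & 0x8000u) ? wide - 65536 : wide;
}
```

For the pattern 0xFFFF, asUnsigned16 gives 65535 and asSigned16 gives -1. An abstract domain over unbounded integers has to commit to one of the two, which is why the signedness matters to the analysis even though the bits are identical.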

There seems to be some support in the head version in the DIType class  
(specifically DIType::isUnsignedDIType()) for extracting this  
information from debug metadata, though this member function is  
missing in 2.9. It is also sometimes possible to infer signedness from  
context, since certain instructions imply it, but I'm finding that  
doing that still leaves many cases unresolved.
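
As a rough illustration of the context-based inference, the sketch below classifies LLVM IR opcode mnemonics and icmp predicates by the signedness they suggest for their operands. It works on plain strings rather than the LLVM C++ API, and the names SignHint and hintFromMnemonic are hypothetical. The shift and extension cases are only hints, not proofs.

```cpp
#include <cassert>
#include <string>

// Hypothetical classification of what an opcode or icmp predicate
// suggests about how its integer operands are being interpreted.
enum class SignHint { Signed, Unsigned, Unknown };

// Assumption: 'op' is an LLVM IR mnemonic such as "sdiv", or an icmp
// predicate such as "slt". This is a string-based sketch, not LLVM API.
inline SignHint hintFromMnemonic(const std::string &op) {
    // Operations whose semantics are defined on signed values.
    if (op == "sdiv" || op == "srem" || op == "ashr" || op == "sext" ||
        op == "sitofp" || op == "fptosi" ||
        op == "slt" || op == "sle" || op == "sgt" || op == "sge") {
        return SignHint::Signed;
    }
    // Operations whose semantics are defined on unsigned values.
    if (op == "udiv" || op == "urem" || op == "lshr" || op == "zext" ||
        op == "uitofp" || op == "fptoui" ||
        op == "ult" || op == "ule" || op == "ugt" || op == "uge") {
        return SignHint::Unsigned;
    }
    // add, sub, mul, shl, and, or, xor, eq, ne and the rest behave the
    // same either way in two's complement, so they carry no hint.
    return SignHint::Unknown;
}
```

Values that only ever flow through add, sub, mul, or equality comparisons come back as Unknown under this scheme, which matches the unresolved cases I keep running into.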

What's the best way to go with this?

Thank you in advance,
Sarah Thompson
NASA Ames (back doing LLVM stuff again after a while working on  
robotics)