[llvm-dev] [RFC] Adding range metadata to array subscripts.

Johannes Doerfert via llvm-dev llvm-dev at lists.llvm.org
Sun Mar 28 08:44:24 PDT 2021


On 3/28/21 3:49 AM, James Courtier-Dutton wrote:
> Hi,
>
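> #include <stdlib.h>
>
> char* test_fill(int size) {
>    char *test1 = malloc(size);
>    /* Off by one: n == size writes one byte past the end. */
>    for (int n = 0; n <= size; n++) {
>      test1[n] = 'A';
>    }
>    return test1;
> }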
>
> Would it be worth making the "range" information a little richer, so
> that it can use algebraic expressions as well as numeric ranges?
> Note: the above example code has an off by one overflow, and it would
> be helpful if one could catch that at compile time.
> In this case, it could catch that n must be less than size, and not
> less than or equal to size.
> That means giving the test1 pointer a range from the
> address of test1 to test1 + (size - 1).
>
> This can only be achieved if algebraic expressions are used for
> ranges, and not just constant values.
> Actual use cases can get much more complicated with for example,
> non-contiguous ranges. e.g. 0,1,4,5 ok, but 2,3,6,7 not ok.
>
> Another useful thing to have at compile time would be a warning that
> a pointer is being dereferenced and we were not able to apply a range
> expression to it, i.e. a warning about unbounded dereferences.
>
> I think it would be useful to at least consider how we would capture
> this more complex range information/metadata in LLVM IR.

I think what you want is the max object extent attribute, formerly
known as max object size when we only wanted to track an upper bound;
the next revision shall also include the lower one:
    https://reviews.llvm.org/D87975

If we allow values instead of only constants you can "properly"
generate warnings, using SCEV to determine the range of `n` above.

That said, in operand bundles we can generally allow non-constant
values, e.g., `["range"(%p, i32 0, i32 %N)]`
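
As a rough illustration of why non-constant bounds matter for the
warning, here is a toy model in C of the comparison involved. It is
not SCEV and not any LLVM API; the `affine_bound` type and the
function names are made up for this sketch, and it assumes bounds of
the form coeff * size + offset with size >= 0 and no integer overflow.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical symbolic bound of the form coeff * size + offset,
   where size is an unknown value assumed to be >= 0. */
typedef struct {
    long coeff;
    long offset;
} affine_bound;

/* Returns true if a <= b holds for every size >= 0.
   (b - a) = (b.coeff - a.coeff) * size + (b.offset - a.offset) is
   non-negative for all size >= 0 exactly when both differences are
   non-negative. */
bool bound_le(affine_bound a, affine_bound b) {
    return a.coeff <= b.coeff && a.offset <= b.offset;
}

/* Returns true if the largest index touched may exceed the last valid
   index of the object for some size >= 0. */
bool access_may_overflow(affine_bound max_index, affine_bound last_valid) {
    return !bound_le(max_index, last_valid);
}

/* The two loop shapes from the example, checked against malloc(size). */
void check_examples(void) {
    affine_bound last_valid = {1, -1};   /* size - 1 */
    affine_bound le_loop    = {1, 0};    /* n <= size: max n is size */
    affine_bound lt_loop    = {1, -1};   /* n < size: max n is size - 1 */

    assert(access_may_overflow(le_loop, last_valid));   /* would warn */
    assert(!access_may_overflow(lt_loop, last_valid));  /* no warning */
}
```

With only constant ranges neither loop could be judged, since both
bounds depend on `size`; with symbolic bounds the `n <= size` loop is
flagged and the `n < size` loop is not.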

~ Johannes


> Kind Regards
>
> James
>
>
>
>
>>>>>> On 3/24/21 9:06 AM, Clement Courbet wrote:
>>>>>>> On Wed, Mar 24, 2021 at 2:20 PM Johannes Doerfert <
>>>>>>> johannesdoerfert at gmail.com> wrote:
>>>>>>>
>>>>>>>> I really like encoding more (range) information in the IR,
>>>>>>>> more thoughts inlined.
>>>>>>>>
>>>>>>>> On 3/24/21 4:14 AM, Clement Courbet via llvm-dev wrote:
>>>>>>>>> struct Histogram {
>>>>>>>>>      int values[256];
>>>>>>>>>      int total;
>>>>>>>>> };
>>>>>>>>>
>>>>>>>>> Histogram DoIt(const int* image, int size) {
>>>>>>>>>      Histogram histogram;
>>>>>>>>>      for (int i = 0; i < size; ++i) {
>>>>>>>>>        ++histogram.values[image[i]];  // (A)
>>>>>>>>>        ++histogram.total;             // (B)
>>>>>>>>>      }
>>>>>>>>>      return histogram;
>>>>>>>>> }

