[llvm] [Support] Add decodeULEB128AndInc/decodeSLEB128AndInc (PR #85739)

David Blaikie via llvm-commits llvm-commits at lists.llvm.org
Mon Mar 25 10:14:11 PDT 2024


dwblaikie wrote:

> The tablegen use case guarantees that there is no error. The `Shift>63` check is pure overhead. Use cases that require rigid error checking can probably use `llvm/include/llvm/Support/DataExtractor.h` `getULEB128`, but the overhead would be substantial. At any rate, the error parameters could be added to `*AndInc` but I have found a good use case yet.


> I was _assuming_ that the code that uses these functions does a bounds safety check for the entire section. @MaskRay ?

But it's not possible to bounds check ULEB reading, right? Non-canonical ULEBs can be arbitrarily long (null padding) - so no untrusted/input buffer can be sure not to buffer overrun, I think? (looking at the implementation of decodeULEB128 - if `Slice` is zero, but the high bit is set, you keep reading the uleb, potentially overrunning any finite buffer)

So this is only safe to call on buffers with known contents? Which is special cases like tblgen, where the buffer is baked into the binary. Anything reading a uleb from unknown input could not use this API correctly.

I'd think it'd be best to  always require error checking, despite the few cases where that's unnecessary. 

Looks to me like a lot of the callers of the previous/existing/underlying API are unsafe. Tblgen's safety seems like it'd be in the minority of callers.

https://github.com/llvm/llvm-project/pull/85739


More information about the llvm-commits mailing list