[Mlir-commits] [mlir] [MLIR][Bytecode] Followup 8106c81 (PR #157136)
Mehdi Amini
llvmlistbot at llvm.org
Sat Sep 6 02:46:37 PDT 2025
================
@@ -300,8 +298,19 @@ class EncodingReader {
// alignment of the root buffer. If it is not, we cannot safely guarantee
// that the specified alignment is globally correct.
//
- // E.g. if the buffer is 8k aligned and the section is 16k aligned,
- // we could end up at an offset of 24k, which is not globally 16k aligned.
+ // E.g. if the buffer is 8k aligned and the section is marked to be 16k
+ // aligned:
+ // - (a) the alignTo call early returns when the pointer is 16k
+ // aligned but given the original 8k alignment we could offset into the
+ // padding by ~8k giving us 16k pointer alignment leaving another ~8k of
+ // padding in the bytecode file that will inadvertently be read when we
+ // attempt to parse the next section.
+ // - (b) we update alignTo to align relative to the start of the buffer,
+ // but given an 8k aligned buffer and section alignment of 16k, we could
+ // end up with a pointer that is 24k aligned (8k start alignment + 16k
+ // offset) instead of globally 16k aligned (versus 16k start alignment +
+ // 16k offset). This would result in incorrectly stated alignment for
+ // resources that reference data inside of the bytecode buffer.
----------------
joker-eph wrote:
To be honest, I still find this explanation quite confusing.
I iterated with ChatGPT to get something clearer (it can likely be pruned, though):
```
// If the section specifies an alignment requirement, handle it here.
if (hasAlignment) {
// Read the requested alignment value from the stream.
uint64_t alignment;
if (failed(parseVarInt(alignment)))
return failure();
// Sanity check: the requested alignment must not exceed the alignment of the
// root buffer itself. Otherwise we cannot guarantee that pointers derived
// from this buffer will actually satisfy the requested alignment globally.
//
// Why is this necessary?
//
// Consider a root buffer that is guaranteed to be 8k aligned, but not 16k
// aligned. For example, suppose the buffer starts at absolute address
// 5×8k = 40960. If a section inside this buffer declares a 16k alignment
// requirement, two problems can arise:
//
// (a) If we simply "align forward" the current pointer to the next
// 16k boundary, the amount of padding we skip depends on the buffer's
// starting address. For example:
//
// buffer_start = 40960
// next 16k boundary = 49152
// bytes skipped = 49152 - 40960 = 8192
//
// If the buffer had started at a different 8k-aligned address, the
// skipped bytes would change accordingly. This makes the section start
// unpredictable and leaves behind variable padding that could be
// misinterpreted as part of the next section.
//
// (b) If instead we align relative to the buffer start, we obtain
// addresses of the form "buffer_start + k * section_alignment", which
// are not necessarily globally aligned addresses. For example:
//
// buffer_start = 40960 (5×8k, 8k aligned but not 16k)
// offset = 16384 (first multiple of 16k)
// section_ptr = 40960 + 16384 = 57344
//
// 57344 is divisible by 8192, so it looks "8k aligned", but:
//
// 57344 % 16384 = 8192 ≠ 0
//
// i.e. the section pointer is at absolute address 57344, which is not
// truly 16k aligned. Any consumer expecting true 16k alignment would
// see this as a violation.
//
// In short: the section's declared alignment must not exceed the alignment
// of the root buffer; otherwise we cannot enforce it in a globally
// consistent and deterministic way.
```
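The arithmetic in cases (a) and (b) can be checked with a small standalone sketch. All helper names below (`alignUp`, `alignmentOf`, `paddingSkipped`, `relativeAlignedAddress`, `canHonorAlignment`) are hypothetical and exist only for illustration; they are not the `EncodingReader` API, and the final check is an assumption about what "must not exceed the alignment of the root buffer" means for a concrete start address.

```cpp
#include <cstdint>

// Hypothetical helper: round `value` up to the next multiple of `alignment`.
// Assumes `alignment` is a nonzero power of two.
constexpr std::uint64_t alignUp(std::uint64_t value, std::uint64_t alignment) {
  return (value + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: largest power of two that divides a nonzero address,
// i.e. the alignment that address actually has.
constexpr std::uint64_t alignmentOf(std::uint64_t address) {
  return address & (~address + 1);
}

// Case (a): align the absolute pointer forward. The number of bytes skipped
// depends on where the buffer happens to start.
constexpr std::uint64_t paddingSkipped(std::uint64_t bufferStart,
                                       std::uint64_t offset,
                                       std::uint64_t alignment) {
  return alignUp(bufferStart + offset, alignment) - (bufferStart + offset);
}

// Case (b): align the offset relative to the buffer start. The padding is
// deterministic, but the absolute address may not be globally aligned.
constexpr std::uint64_t relativeAlignedAddress(std::uint64_t bufferStart,
                                               std::uint64_t offset,
                                               std::uint64_t alignment) {
  return bufferStart + alignUp(offset, alignment);
}

// Illustrative version of the sanity check: the requested alignment is only
// enforceable if the buffer start itself satisfies it.
constexpr bool canHonorAlignment(std::uint64_t bufferStart,
                                 std::uint64_t alignment) {
  return bufferStart % alignment == 0;
}

// Numbers from the comment above: buffer at 5 * 8k, section wants 16k.
static_assert(alignmentOf(40960) == 8192, "8k aligned, not 16k aligned");
static_assert(paddingSkipped(40960, 0, 16384) == 8192, "case (a)");
static_assert(paddingSkipped(49152, 0, 16384) == 0, "case (a), other start");
static_assert(relativeAlignedAddress(40960, 16384, 16384) == 57344, "case (b)");
static_assert(57344 % 16384 == 8192, "57344 is not 16k aligned");
static_assert(!canHonorAlignment(40960, 16384), "check rejects this buffer");
```

The two `paddingSkipped` assertions show the same section skipping 8192 bytes or 0 bytes depending only on the buffer's start address, which is the nondeterminism described in (a).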
https://github.com/llvm/llvm-project/pull/157136