[clang] [Serialization] Fix source location data loss during decoding. (PR #145529)
Chuanqi Xu via cfe-commits
cfe-commits at lists.llvm.org
Wed Jun 25 00:33:33 PDT 2025
ChuanqiXu9 wrote:
> > The design is, the higher 32 bits are used for module file index and the lower bits are used for offsets. Could you give a concrete example why the current implementation is problematic?
>
> I don’t have a concrete failure case, but I noticed this while working on 64-bit source locations.
Yeah, the current situation conflicts with 64-bit source location. We discussed this. The conclusion is, if we need large source location space, use 48 bits and leave upper 16 bits for module file index. 48 bits should be enough for most cases.
>
> Currently, we encode offsets in two ways depending on the module file index:
>
> 1. If the module file index is `0`, we use delta encoding (see [this code path](https://github.com/llvm/llvm-project/blob/01b288fe6a1e627954329198ed5641f2bf55ee8d/clang/include/clang/Serialization/SourceLocationEncoding.h#L165-L168)).
> 2. otherwise, we use raw encoding (see [this code path](https://github.com/llvm/llvm-project/blob/main/clang/include/clang/Serialization/SourceLocationEncoding.h#L173-L183)).
>
> The 2) case is fine, as the encoded value fits into a 32-bit integer. However, the 1) case can produce a 33-bit value, which doesn’t fit in the lower 32 bits.
>
> It appears we only use 16 bits for the module file index, the fix here is to preserve an additional bit for the offset to avoid this issue.
https://github.com/llvm/llvm-project/pull/145529
More information about the cfe-commits
mailing list