[PATCH] D13189: Add support for sub-byte aligned writes to lib/Support/Endian.h

Tue Sep 29 09:43:23 PDT 2015

> On 2015-Sep-28, at 17:20, Teresa Johnson <tejohnson at google.com> wrote:
> 
> On Mon, Sep 28, 2015 at 5:12 PM, Duncan P. N. Exon Smith
> <dexonsmith at apple.com> wrote:
>> 
>>> On 2015-Sep-25, at 22:02, Teresa Johnson <tejohnson at google.com> wrote:
>>> 
>>> tejohnson created this revision.
>>> tejohnson added a reviewer: dexonsmith.
>>> tejohnson added a subscriber: llvm-commits.
>>> 
>>> As per Duncan's review for D12536, I extracted the sub-byte bit aligned
>>> reading and writing code into lib/Support. Added calls from
>>> BackpatchWord.
>>> 
>>> I made the support general instead of tailoring it to the specific
>>> use case and data type in the BackpatchWord case. However, I am not
>>> sure how to add unit testing for the other data types. I did test it
>>> for signed types with negative values to ensure the shifting/masking
>>> handled the sign bit properly, by temporarily adding some write/read
>>> calls into llvm operating on a temporary array.
>>> 
>>> http://reviews.llvm.org/D13189
>>> 
>>> Files:
>>> include/llvm/Bitcode/BitstreamWriter.h
>>> include/llvm/Support/Endian.h
>>> 
>>> <D13189.35794.patch>
>> 
>> I wonder if these would be hard to unit-test?  If not, it might be nice
>> to add one or two.  I imagine if someone inserts a bug, it'll be easier
>> to track down unit test failures than bitcode tests.  I'm not too
>> concerned, since this is tested implicitly by the bitcode tests.
> 
> It would be great to unittest them, I'm just not sure how, especially
> for any other data types like signed, since the calls just don't exist
> for that case yet. For the unsigned 32-bit case being called in
> BackpatchWord, I don't think we need unittests since it is tested
> pretty heavily by all the bitcode tests. I'm also not really sure how
> to unittest it in that case anyway? The alignment of the calls to
> BackpatchWord aren't easily controllable (I could create a test that
> causes BackpatchWord to be called with a particular alignment today,
> but as bitcode changes that alignment will drift).

I wasn't suggesting unit testing BackpatchWord() -- rather, the new utility functions you have here in lib/Support (somewhere in unittest/Support I imagine).

You could write to a char array and check the contents, or put something in a char array and read from it.  It'll take some care to make the tests valid for both little endian and big endian hosts, but that's probably just some byte swap logic.

> 
>> 
>>> Index: include/llvm/Support/Endian.h
>>> ===================================================================
>>> --- include/llvm/Support/Endian.h
>>> +++ include/llvm/Support/Endian.h
>>> @@ -77,6 +77,81 @@
>>>          &value,
>>>          sizeof(value_type));
>>> }
>>> +
>>> +/// Read a value of a particular endianness from memory, for a location
>>> +/// that starts at the given bit offset within the first byte.
>>> +template <typename value_type, endianness endian, std::size_t alignment>
>>> +inline value_type readAtBitAlignment(const void *memory, uint64_t startBit) {
>>> +  assert(startBit < 8);
>>> +  if (startBit == 0)
>>> +    return read<value_type, endian, alignment>(memory);
>>> +  else {
>>> +    // Read two values and compose the result from them.
>>> +    value_type val[2];
>>> +    memcpy(&val[0],
>>> +           LLVM_ASSUME_ALIGNED(
>>> +               memory, (detail::PickAlignment<value_type, alignment>::value)),
>>> +           sizeof(value_type) * 2);
>>> +    val[0] = byte_swap<value_type, endian>(val[0]);
>>> +    val[1] = byte_swap<value_type, endian>(val[1]);
>>> +
>>> +    // Shift bits from the lower value into place.
>>> +    unsigned lowerVal = val[0] >> startBit;
>>> +    // Mask off upper bits after right shift in case of signed type.
>>> +    unsigned numBitsFirstVal = (sizeof(value_type) * 8) - startBit;
>>> +    lowerVal &= (1 << numBitsFirstVal) - 1;
>>> +
>>> +    // Get the bits from the upper value.
>>> +    unsigned upperVal = val[1] & ((1 << startBit) - 1);
>>> +    // Shift them in to place.
>>> +    upperVal <<= numBitsFirstVal;
>>> +
>>> +    return lowerVal | upperVal;
>>> +  }
>>> +}
>>> +
>>> +/// Write a value to memory with a particular endianness, for a location
>>> +/// that starts at the given bit offset within the first byte.
>>> +template <typename value_type, endianness endian, std::size_t alignment>
>>> +inline void writeAtBitAlignment(void *memory, value_type value,
>>> +                                uint64_t startBit) {
>>> +  assert(startBit < 8);
>>> +  if (startBit == 0)
>>> +    write<value_type, endian, alignment>(memory, value);
>>> +  else {
>>> +    // Read two values and shift the result into them.
>>> +    value_type val[2];
>>> +    memcpy(&val[0],
>>> +           LLVM_ASSUME_ALIGNED(
>>> +               memory, (detail::PickAlignment<value_type, alignment>::value)),
>>> +           sizeof(value_type) * 2);
>>> +    val[0] = byte_swap<value_type, endian>(val[0]);
>>> +    val[1] = byte_swap<value_type, endian>(val[1]);
>>> +
>>> +    // Mask off any existing bits in the upper part of the lower value that
>>> +    // we want to replace.
>>> +    val[0] &= (1 << startBit) - 1;
>>> +    // Now shift in the new bits
>>> +    val[0] |= value << startBit;
>>> +
>>> +    // Mask off any existing bits in the lower part of the upper value that
>>> +    // we want to replace.
>>> +    val[1] &= ~((1 << startBit) - 1);
>>> +    // Next shift the bits that go into the upper value into position.
>>> +    unsigned numBitsFirstVal = (sizeof(value_type) * 8) - startBit;
>>> +    unsigned upperVal = value >> numBitsFirstVal;
>>> +    // Mask off upper bits after right shift in case of signed type.
>>> +    upperVal &= (1 << startBit) - 1;
>>> +    val[1] |= upperVal;
>>> +
>>> +    // Finally, rewrite values.
>>> +    val[0] = byte_swap<value_type, endian>(val[0]);
>>> +    val[1] = byte_swap<value_type, endian>(val[1]);
>>> +    memcpy(LLVM_ASSUME_ALIGNED(
>>> +               memory, (detail::PickAlignment<value_type, alignment>::value)),
>>> +           &val[0], sizeof(value_type) * 2);
>>> +  }
>>> +}
>>> } // end namespace endian
>>> 
>>> namespace detail {
>>> Index: include/llvm/Bitcode/BitstreamWriter.h
>>> ===================================================================
>>> --- include/llvm/Bitcode/BitstreamWriter.h
>>> +++ include/llvm/Bitcode/BitstreamWriter.h
>>> @@ -102,18 +102,17 @@
>>>   /// Backpatch a 32-bit word in the output at the given bit offset
>>>   /// with the specified value.
>>>   void BackpatchWord(uint64_t BitNo, unsigned NewWord) {
>>> +    using namespace llvm::support;
>>>     unsigned ByteNo = BitNo / 8;
>>> -    if ((BitNo & 7) == 0) {
>>> -      // Already 8-bit aligned
>>> -      support::endian::write32le(&Out[ByteNo], NewWord);
>>> -    } else {
>>> -      uint64_t CurDWord = support::endian::read64le(&Out[ByteNo]);
>>> -      unsigned StartBit = BitNo & 7;
>>> -      // Currently expect to backpatch 0-value placeholders.
>>> -      assert(((CurDWord >> StartBit) & 0xffffffff) == 0);
>>> -      CurDWord |= NewWord << StartBit;
>>> -      support::endian::write64le(&Out[ByteNo], CurDWord);
>>> -    }
>>> +#ifndef NDEBUG
>>> +    unsigned CurDWord =
>>> +#endif
>>> +        endian::readAtBitAlignment<uint32_t, little, unaligned>(&Out[ByteNo],
>>> +                                                                BitNo & 7);
>> 
>> Does readAtBitAlignment() have side-effects?  If not, why run it when
>> NDEBUG?
> 
> Good point.
> 
>> 
>>> +    // Currently expect to backpatch 0-value placeholders.
>>> +    assert(CurDWord == 0);
>> 
>> If you move the previous call inside the `#ifndef`, then you can
>> probably collapse this into:
>> --
>> assert(!endian::readAtBitAlignment<uint32_t, little, unaligned>(&Out[ByteNo],
>>                                                                BitNo & 7) &&
>>       "Expected to be patching over 0-value placeholders");
>> --
>> 
>> (Note, we generally prefer to be assertion comments inside the assertion
>> message instead of in a code comment.)
> 
> Yeah, that's much better, will make that change.
> 
> Thanks,
> Teresa
> 
>> 
>>> +    endian::writeAtBitAlignment<uint32_t, little, unaligned>(
>>> +        &Out[ByteNo], NewWord, BitNo & 7);
>>>   }
>>> 
>>>   void Emit(uint32_t Val, unsigned NumBits) {
>>> 
>> 
> 
> 
> 
> -- 
> Teresa Johnson | Software Engineer | tejohnson at google.com | 408-460-2413