[LLVMbugs] [Bug 13442] New: ReadDataFromGlobal bad read on unaligned array access

The following sample program:

#include <stdint.h>
#include <stdio.h>

const int test[4] = {1, 2, 3, 4};

int main(int argc, char** argv) {

  uint64_t r = *((uint64_t*)(((char*)test) + 2));
  printf("Result: %lx\n", r);
  return 0;
}

Compiled with gcc:

$ gcc test.c -O2 -o test
$ ./test
Result: 3000000020000

As you'd expect on a little-endian machine, the read covers the upper two bytes
of the first array element, all of the second, and the lower two bytes of the
third.

Compiled with Dragonegg (I'm using GCC 4.5 and LLVM 2.8, but the underlying
mistake is still present in trunk):

$ /home/chris/gcc-4.5-build/install/bin/gcc
-fplugin=/home/chris/dragonegg-2.8/dragonegg.so -O2 test.c -o test-ll
$ ./test-ll
Result: 200000000

The program is wrongly optimised.

In LLVM IR, the offending definition and instruction are:

@test = constant %"int[]" [i32 1, i32 2, i32 3, i32 4], align 16

%0 = load i64* bitcast (i8* getelementptr (i8* bitcast (%"int[]"* @test to
i8*), i64 2) to i64*), align 8

The cause of the bad optimisation is a call from Transforms/SCCP via
ConstantFoldLoadFromConstPtr (in Analysis/ConstantFolding.cpp) to
ReadDataFromGlobal: this handles a constant array using:

  if (ConstantArray *CA = dyn_cast<ConstantArray>(C)) {
    uint64_t EltSize = TD.getTypeAllocSize(CA->getType()->getElementType());
    uint64_t Index = ByteOffset / EltSize;
    uint64_t Offset = ByteOffset - Index * EltSize;
    for (; Index != CA->getType()->getNumElements(); ++Index) {
      if (!ReadDataFromGlobal(CA->getOperand(Index), Offset, CurPtr,
                              BytesLeft, TD))
        return false;
      if (EltSize >= BytesLeft)
        return true;

      Offset = 0;
      BytesLeft -= EltSize;
      CurPtr += EltSize;
    }
    return true;
  }

BytesLeft and CurPtr are both bumped by EltSize even when Offset is non-zero,
in which case the recursive call will have read less than a full element. As a
result we read 2 bytes from element 0 of the array, skip 2 uninitialised bytes
by incrementing the pointer by sizeof(int) == 4, then read the full next
element. By luck the uninitialised bytes were 0, so we get 0x200000000 with
the 0x03 byte missing.

The patch is simple:

      Offset = 0;      
      BytesLeft -= EltSize;
      CurPtr += EltSize;

should be:

      BytesLeft -= (EltSize - Offset);
      CurPtr += (EltSize - Offset);
      Offset = 0;

