[llvm-commits] [PATCH] ARM build attribute reading support

Renato Golin rengolin at systemcall.org
Wed Dec 5 01:58:06 PST 2012


Hi Amara,

Anton is probably away. Evan Cheng is the code owner for the ARM
backend, maybe he can give you the green light.

cheers,
--renato

On 5 December 2012 09:13, Amara Emerson <amara.emerson at arm.com> wrote:
> Ping.
>
> -----Original Message-----
> From: Amara Emerson [mailto:amara.emerson at arm.com]
> Sent: 21 November 2012 08:35
> To: 'Michael Spencer'; Anton Korobeynikov
> Cc: Renato Golin; llvm-commits
> Subject: RE: [llvm-commits] [PATCH] ARM build attribute reading support
>
>> It is correct to use it as *(Elf_Word*). Using *(uint32_t*) is wrong
> because it doesn't handle endianness and alignment properly.
>
> Makes much more sense. I thought that seemed strange.
>
> I'll make the changes pending any more comments from Anton.
>
> Thanks,
> Amara
>
> -----Original Message-----
> From: Michael Spencer [mailto:bigcheesegs at gmail.com]
> Sent: 20 November 2012 20:21
> To: Amara Emerson; Anton Korobeynikov
> Cc: Renato Golin; llvm-commits
> Subject: Re: [llvm-commits] [PATCH] ARM build attribute reading support
>
> On Tue, Nov 20, 2012 at 7:17 AM, Amara Emerson <amara.emerson at arm.com> wrote:
>> Thanks for the review. I believe I've addressed your concerns with the new patch attached. Renato covered the point regarding the endianness.
>>
>> To summarise to main changes:
>>  - Macros used to shorten the repetitive switch cases.
>
> Thanks, this helps a lot.
>
>>  - Variable/function renames.
>>  - New function to check if an attributes section exists in the file, used in readobj.
>>  - Slightly reworked parser code due to not using Elf_Word.
>
> Sorry for not being clear. I mean not to use Elf_Word as a local
> variable. As in:
>
> Elf_Word blah = 4;
>
> It is correct to use it as *(Elf_Word*). Using *(uint32_t*) is wrong
> because it doesn't handle endianness and alignment properly.
>
>>  - The public reader function has been changed to return an appropriate error_code instead of aborting with a fatal error.
>>  - Use of LLVM style RTTI for ELFObjectFile cast in llvm-readobj, eliminating the need for the MemoryBuffer.
>>
>> Amara
>
> Also doxygen comments should use \ instead of @.
>
> Other than that it looks good. But wait for Anton to review.
>
> - Michael Spencer
>
>>
>> -----Original Message-----
>> From: Michael Spencer [mailto:bigcheesegs at gmail.com]
>> Sent: 19 November 2012 20:38
>> To: Amara Emerson
>> Cc: Renato Golin; llvm-commits
>> Subject: Re: [llvm-commits] [PATCH] ARM build attribute reading support
>>
>> On Mon, Nov 19, 2012 at 7:39 AM, Amara Emerson <amara.emerson at arm.com> wrote:
>>> I've attached a new patch with some changes. Note that the patch is a git
>>> binary diff because of the new test object files, but I can supply separate
>>> patch + files if anyone wants me to.
>>>
>>> They are:
>>>  - Separate enum anonymous namespaces.
>>>  - Public getters/setters have been removed completely to make the class
>>> mostly just a data container.
>>>  - A new large test case added which exercises the previously non-tested
>>> attributes. llvm-readobj behaviour is congruent with readelf.
>>>
>>> Michael, does this look reasonable?
>>>
>>> Thanks,
>>> Amara
>>
>>> +++ b/include/llvm/Object/ELF.h
>>> @@ -20,12 +20,15 @@
>>>  #include "llvm/ADT/DenseMap.h"
>>>  #include "llvm/ADT/PointerIntPair.h"
>>>  #include "llvm/Object/ObjectFile.h"
>>> +#include "llvm/Object/ELF_ARM.h"
>>
>> Includes are ordered alphabetically.
>>
>>>  #include "llvm/Support/Casting.h"
>>>  #include "llvm/Support/ELF.h"
>>>  #include "llvm/Support/Endian.h"
>>> +#include "llvm/Support/LEB128.h"
>>>  #include "llvm/Support/ErrorHandling.h"
>>>  #include "llvm/Support/MemoryBuffer.h"
>>>  #include "llvm/Support/raw_ostream.h"
>>> +#include "llvm/Support/Debug.h"
>>>  #include <algorithm>
>>>  #include <limits>
>>>  #include <utility>
>>> @@ -486,6 +489,7 @@ private:
>>>    const Elf_Shdr *dot_gnu_version_sec;   // .gnu.version
>>>    const Elf_Shdr *dot_gnu_version_r_sec; // .gnu.version_r
>>>    const Elf_Shdr *dot_gnu_version_d_sec; // .gnu.version_d
>>> +  const Elf_Shdr *dot_arm_attributes_sec;// .ARM.attributes
>>>
>>>    // Pointer to SONAME entry in dynamic string table
>>>    // This is set the first time getLoadName is called.
>>> @@ -596,6 +600,10 @@ private:
>>>                                     bool &IsDefault) const;
>>>    void VerifyStrTab(const Elf_Shdr *sh) const;
>>>
>>> +  uintptr_t ARMReadSingleAttribute(uintptr_t readPtr,
>>> +                               ARMBuildAttrs::ARMGenericBuildAttrInfo &Attrs,
>>> +                               SmallVector<unsigned, 16> &TagsSet);
>>
>> readPtr should be ReadPtr. Also function names start with lower case.
>> readSingleARMAttribute or something similar.
>>
>>> +
>>>  protected:
>>>    const Elf_Sym  *getSymbol(DataRefImpl Symb) const; // FIXME: Should be private?
>>>    void            validateSymbol(DataRefImpl Symb) const;
>>> @@ -725,6 +733,8 @@ public:
>>>      return v->getType() == getELFType(target_endianness == support::little,
>>>                                        is64Bits);
>>>    }
>>> +  void ARMReadBuildAttributes(ARMBuildAttrs::ARMGenericBuildAttrInfo &Attrs,
>>> +                              SmallVector<unsigned, 16> &TagsSet);
>>
>> Same here. Also, this needs documentation.
>>
>>>  };
>>>
>>>  // Iterate through the version definitions, and place each Elf_Verdef
>>> @@ -1948,6 +1958,7 @@ ELFObjectFile<target_endianness, is64Bits>::ELFObjectFile(MemoryBuffer *Object
>>>    , dot_gnu_version_sec(0)
>>>    , dot_gnu_version_r_sec(0)
>>>    , dot_gnu_version_d_sec(0)
>>> +  , dot_arm_attributes_sec(0)
>>>    , dt_soname(0)
>>>   {
>>>
>>> @@ -2039,6 +2050,15 @@ ELFObjectFile<target_endianness, is64Bits>::ELFObjectFile(MemoryBuffer *Object
>>>        dot_gnu_version_r_sec = sh;
>>>        break;
>>>      }
>>> +    case ELF::SHT_ARM_ATTRIBUTES: {
>>> +      if (dot_arm_attributes_sec != NULL) {
>>> +        // FIXME: Proper error handling
>>> +        report_fatal_error("More than one .ARM.attributes section!");
>>> +        break;
>>> +      }
>>> +      dot_arm_attributes_sec = sh;
>>> +      break;
>>> +    }
>>>      }
>>>      ++sh;
>>>    }
>>> @@ -2673,7 +2693,310 @@ static inline error_code GetELFSymbolVersion(const ObjectFile *Obj,
>>>    llvm_unreachable("Object passed to GetELFSymbolVersion() is not ELF");
>>>  }
>>>
>>> +template<support::endianness target_endianness, bool is64Bits>
>>> +void ELFObjectFile<target_endianness, is64Bits>::
>>> +       ARMReadBuildAttributes(ARMBuildAttrs::ARMGenericBuildAttrInfo &Attrs,
>>> +                              SmallVector<unsigned, 16> &TagsSet) {
>>> +
>>> +  if (getArch() != Triple::arm)
>>> +    report_fatal_error("Cannot read ARM build attributes of a non-ARM file.");
>>> +
>>> +  if (!dot_arm_attributes_sec)
>>> +    report_fatal_error("No ARM attribute section found");
>>
>> There needs to be a way to check if calling this function is ok
>> without crashing.
>>
>>> +
>>
>> All the variable names in the following need to follow the LLVM style
>> guide. I know
>> a lot of this file doesn't, but I'd like added code to follow llvm
>> style. The only
>> exception to this are names directly derived from the ELF specification.
>>
>>> +  const char *sec_start = (const char*)base() +
>>> +                          dot_arm_attributes_sec->sh_offset;
>>> +  const char *sec_end = sec_start + dot_arm_attributes_sec->sh_size;
>>> +  char format_version = *sec_start;
>>> +  if (format_version != ARMBuildAttrs::Format_Version)
>>> +    report_fatal_error("ARM build attribute format version not recognised.");
>>> +
>>> +  uintptr_t sSectionPtr = (uintptr_t)sec_start + 1;
>>> +  uintptr_t readPtr = sSectionPtr;
>>> +  // Begin reading section after format-version byte
>>> +  while (readPtr < (uintptr_t)sec_end) {
>>> +    // Read a subsection, starting with: <section-length> "vendor-name"
>>> +    // For now, we only care about the "aeabi" pseudo-vendor subsection.
>>> +    Elf_Word sSectionLen = *(Elf_Word*)readPtr;
>>
>> Don't use Elf_Word as a local variable. Use uint32_t.
>>
>>> +    readPtr += sizeof(Elf_Word);
>>> +    StringRef vendor((char*)(readPtr));
>>> +    readPtr += vendor.size() + 1; // Vendor string + NUL byte
>>> +
>>> +    if (vendor != "aeabi") {
>>> +      sSectionPtr += sSectionLen;
>>> +      readPtr = sSectionPtr;
>>> +      continue; //skip to the next sub-section
>>> +    }
>>> +
>>> +    bool foundFileTag = false;
>>> +    uintptr_t ssSectionPtr = readPtr;
>>> +    Elf_Word ssSectionLen = *(Elf_Word*)readPtr;
>>> +    // Found aeabi subsection, now find the File scope tag.
>>> +    do {
>>> +      ssSectionPtr = readPtr;
>>> +      unsigned n = 0;
>>> +      uint64_t tag = decodeULEB128((uint8_t*)readPtr, &n);
>>
>> Is ULEB byte oriented? Or does endianess matter?
>>
>>> +      readPtr += n;
>>> +      ssSectionLen = *(Elf_Word*)readPtr;
>>> +      if (tag != ARMBuildAttrs::File) {
>>> +        // We only handle File scope attributes, skip ahead to the next
>>> +        // sub-subsection.
>>> +        ssSectionPtr += ssSectionLen;
>>> +        readPtr = ssSectionPtr;
>>> +        continue;
>>> +      }
>>> +      foundFileTag = true;
>>> +      break; // Found the file tag.
>>> +    } while (readPtr < sSectionPtr + sSectionLen);
>>> +
>>> +    if (!foundFileTag)
>>> +      report_fatal_error("Could not find ELF file scope attribute tag.");
>>> +
>>> +    readPtr += sizeof(Elf_Word);
>>> +    while (readPtr < ssSectionPtr + ssSectionLen) {
>>> +      // Read any number of attributes.
>>> +      // Attributes are pairs of <uleb128, uleb128> or <uleb128, NTBS>.
>>> +      readPtr = ARMReadSingleAttribute(readPtr, Attrs, TagsSet);
>>> +    }
>>> +    Attrs.setValid(true);
>>> +  }
>>>  }
>>> +
>>> +// Helper to read a single ARM attribute at the given pointer and return the
>>> +// read pointer moved forward to the next attribute location.
>>> +template<support::endianness target_endianness, bool is64Bits>
>>> +uintptr_t ELFObjectFile<target_endianness, is64Bits>::
>>> +                       ARMReadSingleAttribute(uintptr_t readPtr,
>>> +                           ARMBuildAttrs::ARMGenericBuildAttrInfo &Attrs,
>>> +                           SmallVector<unsigned, 16> &TagsSet) {
>>> +  // The ABI says that tags in the range 0-63 must be handled by tools.
>>> +  unsigned n = 0;
>>> +  uint64_t tagInt = decodeULEB128((uint8_t*)readPtr, &n);
>>> +  ARMBuildAttrs::AttrType tag = (ARMBuildAttrs::AttrType)tagInt;
>>> +  readPtr += n;
>>> +  uint64_t ulebValue = 0;
>>> +  StringRef strValue;
>>> +  switch (tag) {
>>> +    case ARMBuildAttrs::CPU_arch: // uleb128
>>
>> Case labels shouldn't be indented.
>>
>>> +      ulebValue = decodeULEB128((uint8_t*)readPtr, &n);
>>> +      readPtr += n;
>>> +      Attrs.Tag_CPU_arch = (ARMBuildAttrs::CPUArch)ulebValue;
>>> +      TagsSet.push_back(tag);
>>> +      break;
>>
>> [snip]
>>
>>> +    case ARMBuildAttrs::ABI_VFP_args: // uleb128
>>> +      ulebValue = decodeULEB128((uint8_t*)readPtr, &n);
>>> +      readPtr += n;
>>> +      Attrs.Tag_ABI_VFP_args = ulebValue;
>>> +      TagsSet.push_back(tag);
>>> +      break;
>>> +    default:
>>> +      // Unhandled build attribute tag, according to the spec we should be able
>>> +      // to infer the type of the value from (tag % 2) and skip over it.
>>> +      if (tag & 0x1) {
>>> +        // Value should be a null terminated byte string
>>> +        strValue = (char*)readPtr;
>>> +        readPtr += strValue.size() + 1;
>>> +      }
>>
>> No newline before else.
>>
>>> +      else {
>>> +        // Value should be a uleb128
>>> +        ulebValue = decodeULEB128((uint8_t*)readPtr, &n);
>>> +        readPtr += n;
>>> +      }
>>> +      break;
>>> +  }
>>> +  return readPtr;
>>>  }
>>>
>>> +} // object
>>> +} // llvm
>>> +
>>>  #endif
>>> diff --git a/include/llvm/Object/ELF_ARM.h b/include/llvm/Object/ELF_ARM.h
>>> new file mode 100644
>>> index 0000000..1cb1226
>>> --- /dev/null
>>> +++ b/include/llvm/Object/ELF_ARM.h
>>> @@ -0,0 +1,332 @@
>>> +//===-- ELF_ARM.h - ARM ELF ABI ---------------------------------*- C++ -*-===//
>>> +//
>>> +//                     The LLVM Compiler Infrastructure
>>> +//
>>> +// This file is distributed under the University of Illinois Open Source
>>> +// License. See LICENSE.TXT for details.
>>> +//
>>> +//===----------------------------------------------------------------------===//
>>> +//
>>> +// This file contains enumerations and support routines for ARM build attributes
>>> +// as defined in ARM ABI addenda document (ABI release 2.08).
>>> +//
>>> +//===----------------------------------------------------------------------===//
>>> +
>>> +#ifndef LLVM_OBJECT_ELF_ARM_H
>>> +#define LLVM_OBJECT_ELF_ARM_H
>>> +
>>> +#include "llvm/ADT/SmallVector.h"
>>> +#include "llvm/Support/Endian.h"
>>> +
>>> +namespace llvm {
>>> +
>>> +namespace ARMBuildAttrs {
>>
>> The contents of namespace this large should not be indented.
>>
>>> +  enum SpecialAttr {
>>> +    // This is for the .cpu asm attr. It translates into one or more
>>> +    // AttrType (below) entries in the .ARM.attributes section in the ELF.
>>> +    SEL_CPU
>>> +  };
>>> +
>>
>> [snip]
>>
>>> +
>>> +  class ARMGenericBuildAttrInfo {
>>> +public:
>>
>> public should be at the same indentation as class.
>>
>>> +    ARMGenericBuildAttrInfo()
>>> +      : Valid(false) {}
>>> +
>>> +    CPUArch Tag_CPU_arch;
>>> +    CPUArchProfile Tag_CPU_arch_profile;
>>> +    std::string Tag_CPU_raw_name;
>>
>> [snip]
>>
>>> diff --git a/tools/llvm-readobj/llvm-readobj.cpp b/tools/llvm-readobj/llvm-readobj.cpp
>>> index 3be1289..afe3d54 100644
>>> --- a/tools/llvm-readobj/llvm-readobj.cpp
>>> +++ b/tools/llvm-readobj/llvm-readobj.cpp
>>> @@ -184,6 +184,155 @@ void DumpHeaders(const ObjectFile *obj) {
>>>    outs() << "\n";
>>>  }
>>>
>>> +void printARMBuildAttributes(ObjectFile *obj, MemoryBuffer *Buffer) {
>>> +  ARMBuildAttrs::ARMGenericBuildAttrInfo BuildAttrs;
>>> +  SmallVector<unsigned, 16> AttrsRead;
>>> +  // Check the ELF header for endianness and word size class.
>>> +  char type = (uint8_t)Buffer->getBufferStart()[ELF::EI_CLASS];
>>> +  char endian = (uint8_t)Buffer->getBufferStart()[ELF::EI_DATA];
>>
>> This information is available from ELFObjectFile.
>>
>>> +
>>> +  if (endian == ELF::ELFDATA2LSB && type == ELF::ELFCLASS32)
>>> +    static_cast<ELFObjectFile<support::little, false> *>(obj)
>>> +      ->ARMReadBuildAttributes(BuildAttrs, AttrsRead);
>>> +  else if (endian == ELF::ELFDATA2LSB && type == ELF::ELFCLASS64)
>>> +    static_cast<ELFObjectFile<support::little, true> *>(obj)
>>> +      ->ARMReadBuildAttributes(BuildAttrs, AttrsRead);
>>> +  else if (endian == ELF::ELFDATA2MSB && type == ELF::ELFCLASS32)
>>> +    static_cast<ELFObjectFile<support::big, false> *>(obj)
>>> +      ->ARMReadBuildAttributes(BuildAttrs, AttrsRead);
>>> +  else
>>> +    static_cast<ELFObjectFile<support::big, true> *>(obj)
>>> +      ->ARMReadBuildAttributes(BuildAttrs, AttrsRead);
>>
>> There's no such thing as bigendian ARM, right?
>>
>>> +
>>> +  outs() << "ARM file attributes:\n";
>>> +  for (SmallVector<unsigned, 16>::iterator I =AttrsRead.begin(),
>>
>> Space after =.
>>
>>> +       E = AttrsRead.end(); I != E; ++I) {
>>> +    switch (*I) {
>>> +      case ARMBuildAttrs::CPU_name:
>>
>> Case labels should be indented to the same level as switch.
>>
>>> +        outs() << "  Tag_CPU_name: " << BuildAttrs.Tag_CPU_name << "\n";
>>> +        break;
>>
>> [snip]
>>
>>> +      case ARMBuildAttrs::ABI_FP_optimization_goals:
>>> +        outs() << "  Tag_ABI_FP_optimization_goals: " <<
>>> +                  BuildAttrs.Tag_ABI_FP_optimization_goals << "\n";
>>> +        break;
>>> +      default:
>>> +        break;
>>
>> This looks like it could be shrunk a lot with a simple macro.
>>
>>> +    }
>>> +  }
>>> +}
>>> +
>>>  int main(int argc, char** argv) {
>>>    error_code ec;
>>>    sys::PrintStackTraceOnErrorSignal();
>>> @@ -204,7 +353,7 @@ int main(int argc, char** argv) {
>>>      return 1;
>>>    }
>>>
>>> -  ObjectFile *obj = ObjectFile::createObjectFile(File.take());
>>> +  ObjectFile *obj = ObjectFile::createObjectFile(File.get());
>>
>> This is wrong. ObjectFile takes ownership of the MemoryBuffer. This
>> will double free.
>>
>>>    if (!obj) {
>>>      errs() << InputFilename << ": Object type not recognized\n";
>>>    }
>>> @@ -213,6 +362,10 @@ int main(int argc, char** argv) {
>>>    DumpSymbols(obj);
>>>    DumpDynamicSymbols(obj);
>>>    DumpLibrariesNeeded(obj);
>>> +
>>> +  if (obj->getArch() == Triple::arm && obj->isELF())
>>> +    printARMBuildAttributes(obj, File.get());
>>> +
>>
>> With the changes I mentioned above there's no need to pass the memory buffer in.
>>
>>>    return 0;
>>>  }
>>>
>>
>> I have no idea how ARM attributes are formatted so I'm not sure about
>> that part. But overall this patch seems very fragile and verbose.
>>
>> - Michael Spencer
>>
>>
>>> -----Original Message-----
>>> From: rengolin at gmail.com [mailto:rengolin at gmail.com] On Behalf Of Renato
>>> Golin
>>> Sent: 16 November 2012 08:53
>>> To: Amara Emerson
>>> Cc: llvm-commits at cs.uiuc.edu for LLVM
>>> Subject: Re: [llvm-commits] [PATCH] ARM build attribute reading support
>>>
>>> On 15 November 2012 18:18, Amara Emerson <amara.emerson at arm.com> wrote:
>>>> Thanks for the comments. Yes, I realise the tests are a bit light at the
>>>> moment. I was hoping to get some general functionality comments as it
>>> would
>>>> probably take a few hundred lines to test each attribute type (and bloat
>>> the
>>>> llvm-readobj tool somewhat). If the general structure looks good I can add
>>>> more tests.
>>>
>>> Should be enough to have a few objects with different configurations
>>> for the attribute table on a test dir. And maybe one with a giant
>>> table with all possible arguments in it.
>>>
>>>
>>>>> Does big.LITTLE have its own value, here?
>>>>
>>>> No this is for the current ARM ELF ABI spec, which has a defined list of
>>>> architecture types.
>>>
>>> I know, but since big.LITTLE will certainly need new thoughts on how
>>> to optimise and especially how to select libraries (probably something
>>> like gnu's ifunc etc), would be good to have it in, if ARM already
>>> have something public towards that.
>>>
>>>
>>>> I don't have much of a preference really. The header file was an extended
>>>> version of the existing one in the ARM backend so I continued with the
>>> enum
>>>> style already used.
>>>
>>> I know, much of it was in the old file.
>>>
>>>
>>>> I used getters/setters because I wasn't sure on how to do error handling
>>> if
>>>> an invalid or strange attribute value was read. libObject seems different
>>> to
>>>> the rest of LLVM in this sense. E.g. should reading an invalid/malformed
>>>> attribute value result in an fatal error? If not, how should it warn or be
>>>> handled silently. If anyone who knows it better than me can give tips here
>>>> it'd be welcome.
>>>
>>> My personal feeling is that a malformed attribute table should be
>>> treated as an error, and you should not have setters in the class.
>>>
>>> So, instead of transforming that into a POD, I'd leave it all private,
>>> implement only the getters and let only that function create it
>>> (either by marking its class as friend or by having a way to construct
>>> it with all the information at once - maybe a factory).
>>>
>>> The point is that we may want to merge different tables, but we don't
>>> want to silently change any of the original tables, and each one must
>>> reflect exactly what was in the object file. The factory mentioned
>>> above could also hold the merging rules (when constructing from an
>>> array of tables, for instance), if that proves necessary in the
>>> future.
>>>
>>> --
>>> cheers,
>>> --renato
>>>
>>> http://systemcall.org/
>>
>> The goal of error handling in LLVMObject is to allow as much laziness
>> as possible while not crashing on any invalid files. The ELF reader
>> currently doesn't do this very well, however the COFF reader has quite
>> a few examples of how this is handled. I admit that it's quite verbose
>> but the alternative is either silently dropping data or crashing on
>> parse errors :( (If someone has a better solution I'd love to see it)
>>
>> This patch really should be written to correctly handle parsing errors.
>>
>> - Michael Spencer
>
>
>
>
>
> _______________________________________________
> llvm-commits mailing list
> llvm-commits at cs.uiuc.edu
> http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits



-- 
cheers,
--renato

http://systemcall.org/



More information about the llvm-commits mailing list