[PATCH] add support for merging common strings

Michael Spencer bigcheesegs at gmail.com
Mon Feb 11 13:54:19 PST 2013



================
Comment at: lib/ReaderWriter/ELF/File.h:74
@@ +73,3 @@
+    int64_t operator()(const MergeSectionKey &k) const {
+      return (k._shdr->sh_name << 8) + k._offset;
+    }
----------------
Shankar Kalpathi Easwaran wrote:
> Michael Spencer wrote:
> > Use hash_combine.
> hash_combine takes a string, integer, .... I initially did that and reverted to this.
hash_combine takes any number of any type. You may need to cast sh_name to whatever its underlying type is.

================
Comment at: lib/ReaderWriter/ELF/File.h:225-226
@@ +224,4 @@
+      for (i = 0; i < sectionContents.size(); ++i) {
+        switch (sectionContents[i]) {
+        case '\0':
+          tokens.push_back(new (_readerStorage) 
----------------
Shankar Kalpathi Easwaran wrote:
> Michael Spencer wrote:
> > Michael Spencer wrote:
> > > if (sectionContents[i] == '\0')
> > This is also ignoring sh_entsize.
> The resolver doesnot support multibyte characters currently I think. 
Ah, yes. We should not treat the section as mergeable in that case and add a TODO.

================
Comment at: lib/ReaderWriter/ELF/File.h:468
@@ +467,3 @@
+                             llvm::ELF::SHF_STRINGS)) {
+          const MergeSectionKey ms(shdr, ri->offsetInAtom()+ri->addend());
+          ri->setTarget(_mergedSectionMap[ms]);
----------------
Shankar Kalpathi Easwaran wrote:
> Michael Spencer wrote:
> > Spaces around +.
> > 
> > Also, the spec doesn't require that relocations point to the beginning of of strings.
> Will that not be a relocation to the symbol that holds the data, than the data itself ?
> 
> for example :
> 
> const char *foo = "foo";
> 
> char fn() {
> return foo+3;
> }
> 
> relocation is here is relative to foo.
> 
> (or)
> 
> const char *foo1 = "foo";
> const char *bar = (char *)&foo1+1;
> 
> The relocation here is also to foo1
Yes, that's what the compiler generates. But it doesn't have to. For example:

const char *a = "abcd";
const char *b = "cd";

The compiler may do local merging itself, and refer directly to b instead of a + 2.


http://llvm-reviews.chandlerc.com/D387



More information about the llvm-commits mailing list