[PATCH] [LLD] Add a new PE/COFF port

Rui Ueyama ruiu at google.com
Thu May 28 10:06:54 PDT 2015


Thank you for reviewing!


================
Comment at: CMakeLists.txt:99
@@ -98,1 +98,2 @@
 add_subdirectory(docs)
+add_subdirectory(COFF)
----------------
rafael wrote:
> No need to change it now, but what do you think this will look like once we add ELF? Will this be renamed SectionObj or there will be an ELF directory and the common code will be moved to lib?
It depends on how it goes. If we end up having code that works really well for COFF but not fit to ELF, we would decide to share less code, and if that's the case keeping class definitions in the same directory would make sense. If that's not the case, we should move them out of this directory to lib.

================
Comment at: COFF/Chunks.h:19
@@ +18,3 @@
+
+using llvm::COFF::ImportDirectoryTableEntry;
+using llvm::object::COFFSymbolRef;
----------------
rafael wrote:
> Using "using" in headers? Maybe at least move them to a namespace.
> 
Done.

================
Comment at: COFF/Chunks.h:40
@@ +39,3 @@
+public:
+  virtual ~Chunk() {}
+
----------------
rafael wrote:
> Is there an out of line key method?
> 
> Is this deleted polimorficaly?
It's deleted polimorfically. Writer has a vector of unique_ptr<Chunk>.

================
Comment at: COFF/Chunks.h:44
@@ +43,3 @@
+  // this is a common or BSS chunk.
+  virtual const uint8_t *getData() const { llvm_unreachable("internal error"); }
+
----------------
rafael wrote:
> A more descriptive string would be nice :-)
Done.

================
Comment at: COFF/Chunks.h:65
@@ +64,3 @@
+  // by this chunk is filled with zeros.
+  virtual bool hasData() const { return true; }
+
----------------
rafael wrote:
> Having a HasData protected member would be more llvm like.
Writer uses this member function.

================
Comment at: COFF/Chunks.h:79
@@ +78,3 @@
+  // is illegal to call this function on non-section chunks because
+  // only section chunks are subject of garbage collection.
+  virtual void printDiscardedMessage() { llvm_unreachable("internal error"); }
----------------
rafael wrote:
> COFF has nothing like SHF_MERGE sections, right?
> 
> This is called only for COMDAT discarding, right? There is nothing like --gc_sections at the moment.
> 
> Where should it print? STDOUT? Document that.
It has COMDAT merging; search for "opt:icf". And gc-sections is already implemented in this patch.

================
Comment at: COFF/Chunks.h:112
@@ +111,3 @@
+
+// A chunk representing a section of an input file.
+class SectionChunk : public Chunk {
----------------
rafael wrote:
> s/representing/corresponding/?
Done.

================
Comment at: COFF/Chunks.h:228
@@ +227,3 @@
+
+// A chunk for the import descriptor table.
+class NullChunk : public Chunk {
----------------
rafael wrote:
> Why is it called Null then?
Added comments.

================
Comment at: COFF/Driver.cpp:124
@@ +123,3 @@
+    OS.flush();
+    return make_dynamic_error_code(StringRef(S));
+  }
----------------
rafael wrote:
> So, make_dynamic_error_code is a bit of a misfeature IMHO.
> 
> There are two things at play
> * Issuing a diagnostic
> * Returning an error
> 
> For the diagnostic, something as simple as llvm::errs() is a reasonable start (or passing down a raw_ostream). At some point we really have move the diagnostic handling code out of lib/IR so that it can be used for all of llvm.
> 
> For error handling, we just need as many error_code enums as the callers can handle. For example, for all of the bitcode reading we use just InvalidBitcodeSignature, CorruptedBitcode.
I agree with that. If you want to use this as a library (not in a fancy way, but just by passing a list of mix of strings (command line options) and memory buffer objects (instead of filename string) to link in-memory objects, you want to get a machine-readable error code instead of this kind of human-readable error messages.

However, there's a beauty of this dynamic_error_code thing. It contains a error string. Every time you check for an error, you can construct a new error message by appending prefix to that. For example, if you get an error string "file not found" from a subroutine, you can then construct a new error message "foo.obj: file not found" and then propagate it up to the caller. The caller may append more error messages, like "reading bar.obj's directive section: foo.obj: file not found".

I don't think you can do that with error code enums.

I'll leave this as is as this is not that important, I'll revisit this later.

================
Comment at: COFF/Driver.cpp:169
@@ +168,3 @@
+std::unique_ptr<InputFile> createFile(StringRef Path) {
+  if (StringRef(Path).endswith_lower(".lib"))
+    return llvm::make_unique<ArchiveFile>(Path);
----------------
rafael wrote:
> Does link.exe really use the file name instead of the file signature?
> Even if it does, it might be better to pass a MemoryBufferRef to the ArchiveFile and ObjectFile constructors and have them *not* own the buffer.
Maybe I should create Driver class and make it own MemoryBuffers. Currently as you can see below it's just a non-member function. Let me do that in a follow-up patch.

I don't think link.exe uses extensions at least this way. This is just my negligence.

================
Comment at: COFF/InputFiles.cpp:111
@@ +110,3 @@
+
+  if (!isa<COFFObjectFile>(Bin.get()))
+    return make_dynamic_error_code(Twine(Name) + " is not a COFF file.");
----------------
rafael wrote:
> dyn_cast instead of isa+cast
Done.

================
Comment at: COFF/InputFiles.cpp:150
@@ +149,3 @@
+      continue;
+    auto *C = new (Alloc) SectionChunk(this, Sec, I);
+    Chunks.push_back(C);
----------------
rafael wrote:
> who owns this?
This is a placement new.

================
Comment at: COFF/InputFiles.h:41
@@ +40,3 @@
+  // Returns symbols defined by this file.
+  virtual std::vector<SymbolBody *> &getSymbols() = 0;
+
----------------
rafael wrote:
> Exposing an iterator might be better, no?
Are you suggesting adding begin() and end() to InputFile? If so, because a subclass of thiis class contains another iterable object, vector<Chunk *>, adding begin and end is confusing.

================
Comment at: COFF/Memory.h:22
@@ +21,3 @@
+
+class StringAllocator {
+public:
----------------
rafael wrote:
> We should make this the one true StringSaver in llvm :-)
Yeah. This is off topic, but do you think if it makes sense to add these member functions to BumpPtrAllocator itself?

================
Comment at: COFF/Symbols.cpp:24
@@ +23,3 @@
+
+// Returns 1, 0 or -1 if this symbol should take precedence over the
+// Other in the symbol table, tie or lose, respectively.
----------------
rafael wrote:
> At least on ELF common is not that simple. The end result of linking two common symbols with the same name is a common symbol with the maximum size and the maximum alignment, so it is possible that it corresponds to none of the input common symbols. Is that the case for COFF?
The PE/COFF spec uses a term "common symbol" without defining it, there's no explanation what it is. I was thinking to investigate the behavior of the actual linker.

As to how to handle the case you described for ELF, I guess we can change the signature of this function to return ErrorOr<SymbolBody*> and return this, Other or a new SymbolBody object. This doesn't have to be done in this patch, but this needs considering. Thank you for pointing that out.

http://reviews.llvm.org/D10036

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the llvm-commits mailing list