[PATCH] [lld] [LinkerScript] Implement semantics for simple sections mappings

Rafael Auler rafaelauler at gmail.com
Tue Mar 10 14:04:59 PDT 2015


================
Comment at: include/lld/ReaderWriter/LinkerScript.h:1122
@@ +1121,3 @@
+  struct SectionKey {
+    StringRef archiveName;
+    StringRef memberName;
----------------
shankarke wrote:
> rafaelauler wrote:
> > shankarke wrote:
> > > can you rename this to archiveNameOrPath
> > I was under the impression that, from the linker script perspective, we should always work with file names instead of paths. For example, to make my test case work correctly, I need to drop the path that comes from the File object and use only the file name. Only then I can match it with a linker script rule because they omit the full path. Can a linker script specify a full path to an archive?
> I think so, you can specify a full path to the archive too. why not ?
I checked the manual and you are right, thanks for pointing out. I renamed this to archivePath (and the other to memberPath) to maintain consistency with file.archivePath, which also can be a name or a path.

I also deleted a function call that I was using to retrieve the file name from a path in DefaultLayout. Now I checked with GNU ld that if the linker script has a full path inside it, it must only match inputs with this exact name, and vice-verse (if the input file was specified with a full path, it does not match a rule that only specifies the file name).

================
Comment at: include/lld/ReaderWriter/LinkerScript.h:1130
@@ +1129,3 @@
+  /// number (caching results).
+  struct SectionKeyHash {
+    int64_t operator()(const SectionKey &k) const {
----------------
shankarke wrote:
> rafaelauler wrote:
> > ruiu wrote:
> > > Instead of defining a class (struct) implementing an operator=(const SectionKey &), you can simply define of a function (int64_t f(const SectionKey &)) and pass it to std::unordered_map.
> > > 
> > > You want to move this below private:.
> > I don't understand how caching here would be error prone. I'm simply implementing the design that was suggested from the previous thread of D7915:
> > 
> > * Make DefaultLayout ask Sema about the order of two sections by section name (or, more precisely, section "key") and do not expose "ids".
> > 
> > * Since these questions are frequently asked for the same input section, we cache the result.
> > 
> > Which users do you refer to? You mean, the TargetLayout?
> Yes, TargetLayout.
Ok, but in this current implementation, it is impossible for TargetLayouts to change this id, since this is private to Sema. If the user wants to change this order, we can later implement an interface to do this in a clean way that also updates our caches.

================
Comment at: include/lld/ReaderWriter/LinkerScript.h:1138
@@ +1137,3 @@
+  /// our hash table mapping Sections to a Layout Order number (caching results)
+  struct SectionKeyEq {
+    bool operator()(const SectionKey &lhs, const SectionKey &rhs) const {
----------------
ruiu wrote:
> Ditto
I'm under the impression that std::unordered_map only accepts functors, no? I tried to remove this from the class, but I couldn't get this to work. Am I missing something? But I moved this to private.

================
Comment at: include/lld/ReaderWriter/LinkerScript.h:1186
@@ +1185,3 @@
+  /// to the same set of expressions because of wildcards rules.
+  std::vector<const SymbolAssignment *> getExprs(const SectionKey &key);
+
----------------
shankarke wrote:
> Could we evaluate as and when the input section is appended to the output section ?
Woudn't this be equivalent to evaluating once we append to segments? I mean, the algorithm I have in mind is exactly the same. Before adding a section to the output section, we gather all expressions and put it there. This is exactly the same as putting expressions once we attach an input section to a segment.

================
Comment at: lib/ReaderWriter/ELF/DefaultLayout.h:229-235
@@ +228,9 @@
+
+  /// \brief Add extra chunks to a segment just before including the input
+  /// section given by <archiveName, memberName, sectionName>. This is used by
+  /// ScriptLayout to add linker script expressions before each section.
+  virtual void addExtraChunksToSegment(Segment<ELFT> *segment,
+                                       StringRef archiveName,
+                                       StringRef memberName,
+                                       StringRef sectionName) {}
+
----------------
rafaelauler wrote:
> shankarke wrote:
> > If you can associate the expression also to an outputsection, this would not be needed. The code gets overly simplified too.
> Ok, I'll work on this. Thanks.
See above.

================
Comment at: lib/ReaderWriter/ELF/DefaultLayout.h:350
@@ +349,3 @@
+/// override some of the functionalities.
+template <class ELFT> class ScriptLayout : public DefaultLayout<ELFT> {
+public:
----------------
shankarke wrote:
> ruiu wrote:
> > shankarke wrote:
> > > rafaelauler wrote:
> > > > ruiu wrote:
> > > > > Merge this with DefaultLayout and remove this class.
> > > > No problem, but first can we get a consensus on whether do we need ScriptLayout or not? I'm asking this because I only moved the code out of DefaultLayout due to Shankar's solicitation.
> > > DefaultLayout was designed as a class to handle generic ELF Layout when not having linker scripts. Some of the linker script usecases are not even relevant for the DefaultLayout. Conditionally executing things in the DefaultLayout if we have a linker script IMO is a bad design. 
> > I'm sorry but I completely disagree. I need to keep an eye on this kind of overkill abstraction as the history of this project shows. I spent so much time to reduce complexity of the linker, such as removal of InputGraph. Don't overdesign class hierarchy. That's going to become technical debt.
> I dont think making changes in the DefaultLayout to accomodate LinkerScripts is a good idea. 
> 
> If others / Rafael think its a great idea, please go for it.
> 
> 
> 
> 
I guess currently we only have a few functions for ScriptLayout, so I merged them back into DefaultLayout. I tried to maintain them factored out from the rest of DefaultLayout, so as to be easy to see where script-related functions hook in the default behavior.

================
Comment at: lib/ReaderWriter/ELF/DefaultLayout.h:587
@@ +586,3 @@
+
+  StringRef archiveName = llvm::sys::path::filename(da->file().archivePath());
+  StringRef memberName = llvm::sys::path::filename(da->file().memberPath());
----------------
rafaelauler wrote:
> ruiu wrote:
> > It's good to add a brief comment here, like, section names may be renamed by linker scripts.
> Sure
After a comment from Shankar explaining that we may need the full path (instead of just the file name), I removed this behavior (I don't call llvm::sys::path::filename() anymore).

================
Comment at: lib/ReaderWriter/ELF/SegmentChunks.h:437
@@ -402,1 +436,3 @@
+        if (auto expr = dyn_cast<ExpressionChunk<ELFT>>(section))
+          fileOffset += expr->virtualAddr() - lastVirtualAddress;
       // Align fileoffset to the alignment of the section.
----------------
shankarke wrote:
> rafaelauler wrote:
> > shankarke wrote:
> > > why would you do this ?
> > This is used to keep in sync the file image with the memory image.
> Do you have a specific test for this ? I would like to see that to understand here.
Well, the test attached to this patch tests this. Let me try to briefly show what's happening here:

you see the line below, "fileOffset = llvm::RoundUpToAlignment..." ? 

Well, it is fixing the offset because when we assigned virtual addresses, we may have changed the address where a given chunk resides to meet alignment contraints. Thus, to keep the fileOffset in sync with the virtual addresses (memory image), this line fixes this.

However, the linker script also has power to change the current virtual address (we see now that not only alignments can change this). So I added these three lines of code that basically checks if we are calculating the address of an expression chunk and, if yes, check if our virtual address (calculated when assigning virtual addresses to chunks) is deviating from the expected value found in "lastVirtualAddress". If yes, we need to fix "fileOffset" by adding this "gap". It's just like a "custom alignment".

To see how this is necessary, any test case that uses a linker script that changes any address will suffice. The attached test case works: it requests a .ro.data section to start somewhere around address 0x506000 (actually, it is 0x50600 added to the size of previously emitted sections), but the current address for sections is around 0x500000 + previous sections size. If this code is removed, even though symbols get fixed up as though these sections reside at 0x506000, the memory content at run time will not reflect this because the file contents will be created putting everything cluttered at 0x500000.

================
Comment at: lib/ReaderWriter/LinkerScript.cpp:2210
@@ +2209,3 @@
+      return exprOrder;
+    }
+  }
----------------
rafaelauler wrote:
> ruiu wrote:
> > This loop content is duplicate of the previous one. Do we really need to separate non-wildcard patterns and wildcard patterns? Technically, the former is a special case of the latter, and I don't think we have to optimize this routine. Simplicity is preferred, I guess.
> I will refactor this.
I gave a look at this and thinks are already refactored. The real contents of the loop are actually seen in "matchSectionName()". However, I can't merge these two different loops into one because they are iterating over quite different things: the first iterates over a range of a multimap that contains names associated with "order ids", while the second iterated over a vector. I don't see how I can "blend" these two different ranges into the same loop. But if you have an idea of how to simplify this and there is something I'm missing here, I'll gladly implement it.

http://reviews.llvm.org/D8157

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/






More information about the llvm-commits mailing list