[PATCH] D73821: [yaml2obj] Add -D k=v to preprocess the input YAML

Fangrui Song via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Wed Feb 5 09:46:28 PST 2020


MaskRay marked an inline comment as done.
MaskRay added inline comments.


================
Comment at: llvm/tools/yaml2obj/yaml2obj.cpp:76
+
+  return Preprocessed;
+}
----------------
jhenderson wrote:
> MaskRay wrote:
> > grimar wrote:
> > > MaskRay wrote:
> > > > grimar wrote:
> > > > > I'd suggest something like the following
> > > > > 
> > > > > ```
> > > > > std::vector<std::pair<std::string, StringRef>> Substitution;
> > > > >   for (StringRef Define : D) {
> > > > >     StringRef F, V;
> > > > >     std::tie(F, V) = Define.split('=');
> > > > >     if (!Define.count('=') || F.empty()) {
> > > > >       ErrHandler("bad override, missing field name: " + Define);
> > > > >       return {};
> > > > >     }
> > > > >     Substitution.push_back({(Twine("[[") + F + "]]").str(), V});
> > > > >   }
> > > > > 
> > > > >   std::string Document = Buf.str();
> > > > >   if (Substitution.empty())
> > > > >     return Document;
> > > > > 
> > > > >   for (std::pair<std::string, StringRef> &P : Substitution) {
> > > > >     while (true) {
> > > > >       auto I = Document.find(P.first);
> > > > >       if (I == std::string::npos)
> > > > >         break;
> > > > >       Document = Document.replace(I, P.first.size(), P.second.str());
> > > > >     }
> > > > >   }
> > > > > ```
> > > > Maybe we should avoid quadratic complexity behavior, just in case that there are a long run of `[[[[[[[[[[`.
> > > > just in case that there are a long run of [[[[[[[[[[.
> > > 
> > > It reminded me about D36307. I believe that `[[[[[[` is a misuse of the feature and hence we probably should not care much.
> > > The approach I suggested is just very simple and it is much easier to read. I think the common use case for this feature will be
> > > "replace 1-2 defines in a short enough YAML", i.e. I am not sure we should think about an additional optimizations here.
> > > 
> > > Probably it is time to hear a third opinion though. @jhenderson, what do you think?
> > > 
> > Doing `replace` repeatedly may also have a non-convergence problem...
> Recursive replacement causes issues. For example `-DFOO=[[FOO]]` would result in an infinite loop. Convergence issues are only a problem if we try to do this. I personally don't think we should in the first version.
> 
> Another issue is what to do about variables that expand to other variables. One example is `-D VAR1=FOO -D VAR2=[[VAR1]]BAR` which will result in `[[VAR2]]` becoming `[[VAR1]]BAR` if specified in that order, or `FOOBAR` if VAR2 is defined first. I don't think we want this behaviour, as it is unstable and unobvious.
> 
> (Relatedly, I think we should have a test showing that variables that expand to a pattern that could be another variable aren't recursively expanded)
> 
> As for the quadratic complexity, I'm not too bothered by it in this case, as I doubt we'll ever get a YAML doc in a test where it really matters (and even if we do, there won't be many, so the overall slow-down in the testsuite will be minimal).
> 
> FWIW, I think we can probably forbid names containing '[' or ']' themselves.
Added the following to `macro.yaml`

```lang=sh
# RUN: yaml2obj --docnum=2 -D a0='[[a1]]' -D a1='[[a0]]' %s | llvm-nm --just-symbol-name --no-sort - > %t.recur
# RUN: echo -e '[[a1]]\n[[a0]]' > %t0.recur
# RUN: diff -u %t0.recur %t.recur
```

This patch avoids quadratic complexity and non-convergence with very little code, so I think sticking with the current version is fine.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73821/new/

https://reviews.llvm.org/D73821





More information about the llvm-commits mailing list