[PATCH] D73226: [ms] [llvm-ml] Improve data support, adding names and complex initializers.

Eric Astor via Phabricator via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 24 10:03:00 PST 2020


epastor marked 3 inline comments as done.
epastor added inline comments.


================
Comment at: llvm/lib/MC/MCParser/MasmParser.cpp:3018
+      for (int i = 0; i < Repetitions; ++i)
+        ValuesAsInt.append(DuplicatedValues.begin(), DuplicatedValues.end());
+    } else {
----------------
thakis wrote:
> epastor wrote:
> > thakis wrote:
> > > Is there no more efficient way to express this internally? If you `1000000 dup (2)`, do we have to write out 1M 2s?
> > We could optimize the case of `1000000 dup (2)` with a fill-type instruction, but I'm not sure how we would handle (e.g.) `500000 dup (2,3)`. Also, this is data-initialization logic, so either way the actual binary is going to contain 1M literal copies. Do you think it's worth special-casing the single-value handling internally anyway?
> as-is is fine for now
Just a note for the future. Currently this means we actually make 1M copies of the APInt representing a real with value 2 in memory while compiling, and don't let go until we've processed the entire initializer list.

There are three cases here:
1. `1000000 dup (2)` - writes out 2,2,2,2,... to 1M positions.
2. `500000 dup (2,3)` - writes out 2,3,2,3,... to 1M positions.
3. `250000 dup (2, 3 dup (1))` - writes out 2,1,1,1,2,1,1,1,... to 1M positions.

Case 1 could be optimized by using Fill logic.
Case 2 could be handled by passing a "duplication count" to the recursive call, and letting it emit results directly.

Case 3 is more of a problem, though. Arbitrarily nested duplications get hard unless you expand them in place, as the current code does... or unless you parse them into a specialized tree structure, and expand it back to actual values only as needed.

All of this is complicated by the fact that once STRUCT support is in place, we do need to be able to parse initializers and hold them in memory, not just emit them immediately.

The ideal solution would probably be a tree-based intermediate representation for initializers, with duplication nodes representing "X DUP (<child>)". It's probably worth coming back to this eventually, but we'll leave it as-is for now.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73226/new/

https://reviews.llvm.org/D73226





More information about the llvm-commits mailing list