[llvm] New tool 'llvm-elf2bin'. (NOT READY FOR REVIEW – NO TESTS) (PR #73625)

Simon Tatham via llvm-commits llvm-commits at lists.llvm.org
Tue Nov 28 01:25:21 PST 2023


https://github.com/statham-arm created https://github.com/llvm/llvm-project/pull/73625

This implements conversion from ELF images to binary and hex file formats, in a similar way to some of llvm-objcopy's output modes.

The biggest difference is that it works exclusively with the ELF loadable segment view, that is, program header table entries with the PT_LOAD type. Thus, it will output the whole of every loadable segment whether or not a section covers the same area of the file. In particular, it can still operate on an ELF image which has no section header table at all. This difference is large enough that it would be much more difficult to put the same functionality into llvm-objcopy: it would involve working out the interactions with all llvm-objcopy's other options, particularly the section-oriented ones.
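
To make the segment-only view concrete, here is a minimal sketch of selecting loadable segments from a program header table. The `ProgramHeader` and `LoadSegment` structs and the `loadable_segments` function are hypothetical simplifications for illustration, not the types used in the patch; only the field names and the value of `PT_LOAD` (1) are taken from the ELF specification.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified program header entry holding only the fields
// used here. Field names follow the ELF specification.
struct ProgramHeader {
  uint32_t p_type;
  uint64_t p_offset, p_vaddr, p_paddr, p_filesz, p_memsz;
};

// Hypothetical record describing one loadable segment.
struct LoadSegment {
  uint64_t fileoffset, baseaddr, filesize, memsize;
};

constexpr uint32_t kPtLoad = 1; // PT_LOAD in the ELF specification

// Keep only the PT_LOAD entries, taking the base address from p_paddr or
// p_vaddr. No section header table is consulted at any point.
std::vector<LoadSegment>
loadable_segments(const std::vector<ProgramHeader> &phdrs, bool physical) {
  std::vector<LoadSegment> out;
  for (const ProgramHeader &ph : phdrs) {
    if (ph.p_type != kPtLoad)
      continue;
    // The file-backed part of a segment may not exceed its size in memory.
    assert(ph.p_filesz <= ph.p_memsz);
    out.push_back({ph.p_offset, physical ? ph.p_paddr : ph.p_vaddr,
                   ph.p_filesz, ph.p_memsz});
  }
  return out;
}
```

Any bytes between `filesize` and `memsize` are the zero-initialized tail of the segment, which the tool can optionally emit as padding.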

Other features:

 - supports both Intel hex and Motorola S-records

 - supports output of a binary file per ELF segment, or a single one with padding between the segments to place them at their correct offsets

 - option to use either p_paddr or p_vaddr from each segment

 - allows distributing the output binary data into multiple 'banks', to support hardware configurations in which each 32-bit word of memory is stored as 16 bits in each of two ROMs, or 8 in each of four, or similar
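
The bank splitting in the last item can be sketched as follows. This is a simplified illustration, not the patch's implementation: `extract_bank` is a hypothetical helper that works on an in-memory string and uses plain modulo arithmetic, whereas the patch streams the data through a filter and requires a power-of-two modulus.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Return the bytes that land in bank `bank` when `data` is dealt out
// `width` bytes at a time to `num_banks` banks in cyclic order.
std::string extract_bank(const std::string &data, std::size_t width,
                         std::size_t num_banks, std::size_t bank) {
  assert(width > 0 && num_banks > 0 && bank < num_banks);
  const std::size_t stride = width * num_banks; // bytes per full cycle
  const std::size_t first = bank * width;       // where this bank starts
  std::string out;
  for (std::size_t i = 0; i < data.size(); ++i) {
    const std::size_t pos = i % stride;
    if (pos >= first && pos < first + width)
      out.push_back(data[i]);
  }
  return out;
}
```

With this sketch, `extract_bank("ABCDEFGH", 2, 2, 0)` yields `"ABEF"` and bank 1 yields `"CDGH"`, which corresponds to the case of 16 bits in each of two ROMs.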

This PR is not really ready for review yet, because there are no tests in it. `elf2bin` was developed as a standalone tool, and it's only just been integrated into the LLVM code base. It was requested by quic-subhkedi in
https://discourse.llvm.org/t/bring-features-of-fromelf-of-arm-to-llvm-objcopy/73229/12
and I'll do the extra work of porting the tests into LLVM if there's agreement that this is acceptable as a separate tool.
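
For readers unfamiliar with the Intel hex format listed among the features above, the following sketch shows how a single record is framed: a byte count, a 16-bit big-endian address, a record type, the data, and a two's-complement checksum, all written as uppercase hex after a leading colon. `ihex_record` is a hypothetical standalone helper written for illustration; unlike the patch, it uses only the standard library and omits the trailing newline.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Build one Intel hex record (without a trailing newline).
std::string ihex_record(uint8_t type, uint16_t addr, const std::string &data) {
  assert(data.size() <= 0xFF); // the length field is a single byte

  // Binary form of the record, before hex encoding.
  std::string bytes;
  bytes.push_back(static_cast<char>(data.size()));
  bytes.push_back(static_cast<char>(addr >> 8));
  bytes.push_back(static_cast<char>(addr & 0xFF));
  bytes.push_back(static_cast<char>(type));
  bytes += data;

  // The checksum makes the sum of all record bytes zero modulo 256.
  uint8_t checksum = 0;
  for (unsigned char c : bytes)
    checksum = static_cast<uint8_t>(checksum - c);
  bytes.push_back(static_cast<char>(checksum));

  static const char hexdigits[] = "0123456789ABCDEF";
  std::string out = ":";
  for (unsigned char c : bytes) {
    out.push_back(hexdigits[c >> 4]);
    out.push_back(hexdigits[c & 0xF]);
  }
  return out;
}
```

For example, `ihex_record(1, 0, "")` produces the end-of-file record `:00000001FF`. Motorola S-records follow the same general pattern but use a one's-complement checksum and a length field that also counts the address and checksum bytes.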

>From 32eded2b00685a7468dad496d31e5533828cf853 Mon Sep 17 00:00:00 2001
From: Simon Tatham <simon.tatham at arm.com>
Date: Thu, 23 Nov 2023 14:25:49 +0000
Subject: [PATCH] New tool 'llvm-elf2bin'.

This implements conversion from ELF images to binary and hex file
formats, in a similar way to some of llvm-objcopy's output modes.

The biggest difference is that it works exclusively with the ELF
loadable segment view, that is, program header table entries with the
PT_LOAD type. Thus, it will output the whole of every loadable segment
whether or not a section covers the same area of the file. In
particular, it can still operate on an ELF image which has no section
header table at all. This difference is large enough that it would be
much more difficult to put the same functionality into llvm-objcopy:
it would involve working out the interactions with all llvm-objcopy's
other options, particularly the section-oriented ones.

Other features:

 - supports both Intel hex and Motorola S-records

 - supports output of a binary file per ELF segment, or a single one
   with padding between the segments to place them at their correct
   offsets

 - option to use either p_paddr or p_vaddr from each segment

 - allows distributing the output binary data into multiple 'banks',
   to support hardware configurations in which each 32-bit word of
   memory is stored as 16 bits in each of two ROMs, or 8 in each of
   four, or similar

This was developed as a standalone tool, and it's only just been
integrated into the LLVM code base. Until now, it's been tested using
a Python test harness that constructs test ELF files with no section
view. In order to bring the tests into the LLVM infrastructure, we
would probably need to start by enhancing yaml2obj to be able to write
out ELF files of that type.

Requested by quic-subhkedi in
https://discourse.llvm.org/t/bring-features-of-fromelf-of-arm-to-llvm-objcopy/73229/12
---
 llvm/tools/llvm-elf2bin/CMakeLists.txt   |  26 ++
 llvm/tools/llvm-elf2bin/Opts.td          |  63 ++++
 llvm/tools/llvm-elf2bin/bin.cpp          | 326 ++++++++++++++++
 llvm/tools/llvm-elf2bin/elf.cpp          |  66 ++++
 llvm/tools/llvm-elf2bin/hex.cpp          | 199 ++++++++++
 llvm/tools/llvm-elf2bin/llvm-elf2bin.cpp | 461 +++++++++++++++++++++++
 llvm/tools/llvm-elf2bin/llvm-elf2bin.h   | 170 +++++++++
 7 files changed, 1311 insertions(+)
 create mode 100644 llvm/tools/llvm-elf2bin/CMakeLists.txt
 create mode 100644 llvm/tools/llvm-elf2bin/Opts.td
 create mode 100644 llvm/tools/llvm-elf2bin/bin.cpp
 create mode 100644 llvm/tools/llvm-elf2bin/elf.cpp
 create mode 100644 llvm/tools/llvm-elf2bin/hex.cpp
 create mode 100644 llvm/tools/llvm-elf2bin/llvm-elf2bin.cpp
 create mode 100644 llvm/tools/llvm-elf2bin/llvm-elf2bin.h

diff --git a/llvm/tools/llvm-elf2bin/CMakeLists.txt b/llvm/tools/llvm-elf2bin/CMakeLists.txt
new file mode 100644
index 000000000000000..3c4344d049764b4
--- /dev/null
+++ b/llvm/tools/llvm-elf2bin/CMakeLists.txt
@@ -0,0 +1,26 @@
+# ===- CMakeLists.txt - llvm-elf2bin's build script -----------------------===//
+#
+#  Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+#  See https://llvm.org/LICENSE.txt for license information.
+#  SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+#
+# ===----------------------------------------------------------------------===//
+
+set(LLVM_LINK_COMPONENTS
+  Object
+  Option
+  Support)
+
+set(LLVM_TARGET_DEFINITIONS Opts.td)
+tablegen(LLVM Opts.inc -gen-opt-parser-defs)
+add_public_tablegen_target(Elf2BinOptsTableGen)
+
+add_llvm_tool(llvm-elf2bin
+  llvm-elf2bin.cpp
+  elf.cpp
+  bin.cpp
+  hex.cpp
+  DEPENDS
+  Elf2BinOptsTableGen
+  GENERATE_DRIVER
+)
diff --git a/llvm/tools/llvm-elf2bin/Opts.td b/llvm/tools/llvm-elf2bin/Opts.td
new file mode 100644
index 000000000000000..b6f654b5867f3b2
--- /dev/null
+++ b/llvm/tools/llvm-elf2bin/Opts.td
@@ -0,0 +1,63 @@
+//===- Opts.td - llvm-elf2bin's command-line option definitions -----------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+include "llvm/Option/OptParser.td"
+
+multiclass Value<string longname, string shortname, string metavar,
+                 string help, OptionGroup group> {
+  def NAME # _EQ : Joined<["--"], longname # "=">,
+                   HelpText<help>,
+                   MetaVarName<metavar>,
+                   Group<group>;
+  def : Separate<["--"], longname>,
+        Alias<!cast<Joined>(NAME # "_EQ")>;
+
+  if !not(!empty(shortname)) then {
+    def : JoinedOrSeparate<["-"], shortname>,
+          Alias<!cast<Joined>(NAME # "_EQ")>,
+          HelpText<"Alias for --" # longname>, Group<group>;
+  }
+}
+
+multiclass NoValue<string longname, string shortname, string help,
+                   OptionGroup group> {
+  def NAME : Flag<["--"], longname>,
+             HelpText<help>,
+             Group<group>;
+
+  if !not(!empty(shortname)) then {
+    def : Flag<["-"], shortname>,
+          Alias<!cast<Flag>(NAME)>,
+          HelpText<"Alias for --" # longname>;
+  }
+}
+
+def grp_mode: OptionGroup<"mode">, HelpText<"FORMAT OF OUTPUT">;
+def grp_file: OptionGroup<"output">, HelpText<"LOCATION OF OUTPUT">;
+def grp_opts: OptionGroup<"options">, HelpText<"OPTIONS">;
+
+defm output_file: Value<"output-file", "o", "FILENAME", "Name of the output file. Invalid if more than one output file is to be generated.", grp_file>;
+defm output_pattern: Value<"output-pattern", "O", "PATTERN", "Schema for naming all output files. May contain %f for base name of input file; %F for full name of input file; %a or %A for base address of ELF segment, in hex or in HEX; %b for bank number, if using --banks; %% for a literal % sign.", grp_file>;
+
+defm ihex: NoValue<"ihex", "", "Output Intel Hex files (\":108000\" style)", grp_mode>;
+defm srec: NoValue<"srec", "", "Output Motorola Hex files (\"S3150800\" style)", grp_mode>;
+defm bin: NoValue<"bin", "", "Output a binary file per ELF segment", grp_mode>;
+defm bincombined: NoValue<"bincombined", "", "Output a single binary file, containing all segments, with padding to place them at the correct relative positions", grp_mode>;
+defm vhx: NoValue<"vhx", "", "Output a Verilog Hex file per ELF segment", grp_mode>;
+defm vhxcombined: NoValue<"vhxcombined", "", "Output a single Verilog Hex file, containing all segments, with padding to place them at the correct relative positions", grp_mode>;
+
+defm base: Value<"base", "", "ADDRESS", "Base address of the whole output file, for --bincombined or --vhxcombined", grp_opts>;
+defm banks: Value<"banks", "", "WIDTHxNUM", "Partition the output into subfiles by writing WIDTH bytes in turn to each of NUM files in cyclic order, for binary or Verilog Hex output", grp_opts>;
+defm datareclen: Value<"datareclen", "", "LENGTH", "Number of data bytes to put into each record, for Intel Hex or Motorola Hex output", grp_opts>;
+defm segments: Value<"segments", "", "ADDRESS,ADDRESS,...", "Select only the ELF segments of the input which start at the specified addresses", grp_opts>;
+defm zi: NoValue<"zi", "", "Include the zero-initialized padding at the end of each ELF segment", grp_opts>;
+defm physical: NoValue<"physical", "", "Use the physical addresses (p_paddr) of ELF segments", grp_opts>;
+defm virtual: NoValue<"virtual", "", "Use the virtual addresses (p_vaddr) of ELF segments", grp_opts>;
+
+defm help: NoValue<"help", "", "Display this help", grp_opts>;
+defm version: NoValue<"version", "", "Display the version", grp_opts>;
diff --git a/llvm/tools/llvm-elf2bin/bin.cpp b/llvm/tools/llvm-elf2bin/bin.cpp
new file mode 100644
index 000000000000000..15044613aac895b
--- /dev/null
+++ b/llvm/tools/llvm-elf2bin/bin.cpp
@@ -0,0 +1,326 @@
+//===- bin.cpp - Code to write binary and VHX output for llvm-elf2bin -----===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm-elf2bin.h"
+
+#include <algorithm>
+#include <cassert>
+#include <memory>
+#include <queue>
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/ADT/bit.h"
+#include "llvm/Support/raw_ostream.h"
+
+using namespace llvm;
+
+/*
+ * Abstraction that provides a means of getting binary data from
+ * somewhere. In the simple case this will involve reading data from a
+ * segment in an ELF file. In more complex cases there might also be
+ * zero-byte padding, or one of these stream objects filtering out the
+ * even-index bytes of another.
+ */
+class BinaryDataStream {
+public:
+  // Returns a string of data, in whatever size is convenient. (But
+  // the size should be bounded, so that other streams filtering this
+  // one don't have to swallow a whole file in one go.)
+  //
+  // An empty returned string means EOF.
+  virtual StringRef read() = 0;
+  virtual ~BinaryDataStream() = default;
+};
+
+/*
+ * BinaryDataStream implementation that reads data from an ELF image.
+ */
+class ElfSegment : public BinaryDataStream {
+  InputObject &inobj;
+  size_t position, remaining;
+
+  StringRef read() override {
+    size_t readlen = std::min<size_t>(remaining, 65536);
+    size_t readpos = position;
+    position += readlen;
+    remaining -= readlen;
+    return inobj.membuf->getBuffer().substr(readpos, readlen);
+  }
+
+public:
+  ElfSegment(InputObject &inobj, size_t offset, size_t size)
+      : inobj(inobj), position(offset), remaining(size) {}
+};
+
+/*
+ * BinaryDataStream implementation that generates zero padding.
+ */
+class Padding : public BinaryDataStream {
+  size_t remaining;
+  char buffer[65536] = {}; // zero-initialized; avoids memset and <cstring>
+
+  StringRef read() override {
+    size_t readlen = std::min<size_t>(remaining, sizeof(buffer));
+    remaining -= readlen;
+    return StringRef(buffer, readlen);
+  }
+
+public:
+  Padding(size_t size) : remaining(size) {}
+};
+
+/*
+ * BinaryDataStream implementation that chains other BinaryDataStreams
+ * together.
+ */
+class Chain : public BinaryDataStream {
+  std::queue<std::unique_ptr<BinaryDataStream>> queue;
+
+  StringRef read() override {
+    while (!queue.empty()) {
+      StringRef data = queue.front()->read();
+      if (!data.empty())
+        return data;
+
+      queue.pop();
+    }
+
+    // If we get here, everything in our queue has finished.
+    return "";
+  }
+
+public:
+  Chain() = default;
+  void push(std::unique_ptr<BinaryDataStream> &&st) {
+    queue.push(std::move(st));
+  }
+};
+
+/*
+ * BinaryDataStream implementation that filters the output of another
+ * BinaryDataStream so as to return only a subset of the bytes,
+ * defined by a modulus and a range of residues. Specifically, a range
+ * of 'nres' consecutive bytes from the underlying stream is passed
+ * on, beginning at each byte whose index is congruent to 'firstres'
+ * mod 'modulus' (regarding the first byte of the underlying stream as
+ * having index 0).
+ *
+ * 'modulus' has to be a power of 2.
+ */
+class ModFilter : public BinaryDataStream {
+  std::unique_ptr<BinaryDataStream> st;
+  // Internal representation: 'mask' is the bitmask that reduces mod
+  // 'modulus'. 'pos' iterates cyclically over 0,...,modulus-1 with
+  // each byte we consume, and we return the byte if pos < nres.
+  uint64_t mask, nres, pos;
+  std::string outstring;
+
+  StringRef read() override {
+    outstring.clear();
+    llvm::raw_string_ostream outstream(outstring);
+
+    while (true) {
+      StringRef data = st->read();
+      if (data.empty())
+        break;
+
+      for (char c : data) {
+        if (pos < nres)
+          outstream << c;
+        pos = (pos + 1) & mask;
+      }
+
+      // If that batch of input contributed no bytes to our
+      // output, go round again. Otherwise, we have something to
+      // return.
+      if (!outstring.empty())
+        break;
+    }
+    return outstring;
+  }
+
+public:
+  ModFilter(std::unique_ptr<BinaryDataStream> &&st, uint64_t modulus,
+            uint64_t firstres, uint64_t nres)
+      : st(std::move(st)), nres(nres) {
+    // Check input values are reasonable.
+    assert(llvm::has_single_bit(modulus));
+    assert(nres > 0);
+    assert(nres < modulus);
+
+    mask = modulus - 1;
+
+    // Set pos to be (-firstres), so that we'll skip the right
+    // number of bytes before the first one we return.
+    //
+    // Written as (1 + ~firstres) to avoid Visual Studio
+    // complaining about negating an unsigned.
+    pos = (1 + ~firstres) & mask;
+  }
+};
+
+static void bin_write(BinaryDataStream &st, const std::string &outfile) {
+  std::error_code error;
+  llvm::raw_fd_ostream ofs(outfile, error);
+  if (error)
+    fatal(outfile, "unable to open", errorCodeToError(error));
+
+  while (true) {
+    StringRef data = st.read();
+    if (data.empty())
+      return;
+    ofs << data;
+  }
+}
+
+static void vhx_write(BinaryDataStream &st, const std::string &outfile) {
+  std::error_code error;
+  llvm::raw_fd_ostream ofs(outfile, error);
+  if (error)
+    fatal(outfile, "unable to open", errorCodeToError(error));
+
+  while (true) {
+    StringRef data = st.read();
+    if (data.empty())
+      return;
+    for (uint8_t c : data) {
+      static const char hexdigits[] = "0123456789ABCDEF";
+      ofs << hexdigits[c >> 4] << hexdigits[c & 0xF] << '\n';
+    }
+  }
+}
+
+static std::unique_ptr<BinaryDataStream> onesegment_prepare(InputObject &inobj,
+                                                            uint64_t fileoffset,
+                                                            uint64_t size,
+                                                            uint64_t zi_size) {
+  auto base_stream = std::make_unique<ElfSegment>(inobj, fileoffset, size);
+
+  if (!zi_size)
+    return base_stream;
+
+  auto chain = std::make_unique<Chain>();
+  chain->push(std::move(base_stream));
+  chain->push(std::make_unique<Padding>(zi_size));
+  return chain;
+}
+
+static std::unique_ptr<BinaryDataStream>
+combined_prepare(InputObject &inobj, const std::vector<Segment> &segments_orig,
+                 bool include_zi, std::optional<uint64_t> baseaddr) {
+  // Sort the segments by base address, in case they weren't already.
+  struct {
+    bool operator()(const Segment &a, const Segment &b) {
+      return a.baseaddr < b.baseaddr;
+    }
+  } comparator;
+  std::vector<Segment> sorted = segments_orig;
+  std::sort(sorted.begin(), sorted.end(), comparator);
+
+  // Spot and reject overlapping segments.
+  //
+  // (WIBNI: we _could_ tolerate these if they also agreed on what
+  // part of the ELF file corresponded to the overlapping range of
+  // address space. I don't see a reason to implement that in
+  // advance of someone actually having a good use for it, but
+  // that's why I'm leaving this overlap check as a separate pass
+  // rather than folding it into the next one - this way, we could
+  // write a modified set of segments into 'nonoverlapping'.)
+  std::vector<Segment> nonoverlapping;
+  if (!sorted.empty()) {
+    auto it = sorted.begin(), end = sorted.end();
+
+    if (baseaddr && it->baseaddr < baseaddr.value())
+      fatal(inobj, Twine("first segment is at address 0x") +
+                       Twine::utohexstr(it->baseaddr) +
+                       ", below the specified base address 0x" +
+                       Twine::utohexstr(baseaddr.value()));
+
+    nonoverlapping.push_back(*it++);
+    for (; it != end; ++it) {
+      const auto &prev = nonoverlapping.back(), &curr = *it;
+      if (curr.baseaddr - prev.baseaddr < prev.memsize)
+        fatal(inobj, Twine("segments at addresses 0x")
+              + Twine::utohexstr(prev.baseaddr) + " and 0x"
+              + Twine::utohexstr(curr.baseaddr) + " overlap");
+      nonoverlapping.push_back(curr);
+    }
+  }
+
+  // Make a chained output stream that inserts the right padding
+  // between all those segments.
+  auto chain = std::make_unique<Chain>();
+  if (!nonoverlapping.empty()) {
+    uint64_t addr =
+        (baseaddr ? baseaddr.value() : nonoverlapping.begin()->baseaddr);
+
+    for (const auto &seg : nonoverlapping) {
+      if (addr < seg.baseaddr)
+        chain->push(std::make_unique<Padding>(seg.baseaddr - addr));
+      chain->push(
+          std::make_unique<ElfSegment>(inobj, seg.fileoffset, seg.filesize));
+      addr = seg.baseaddr + seg.filesize;
+
+      if (include_zi && seg.memsize > seg.filesize) {
+        chain->push(std::make_unique<Padding>(seg.memsize - seg.filesize));
+        addr = seg.baseaddr + seg.memsize;
+      }
+    }
+  }
+  return chain;
+}
+
+static std::unique_ptr<BinaryDataStream>
+bank_prepare(std::unique_ptr<BinaryDataStream> stream, uint64_t bank_modulus,
+             uint64_t bank_firstres, uint64_t bank_nres) {
+  if (bank_modulus == 1)
+    return stream;
+
+  return std::make_unique<ModFilter>(std::move(stream), bank_modulus,
+                                     bank_firstres, bank_nres);
+}
+
+void bin_write(InputObject &inobj, const std::string &outfile,
+               uint64_t fileoffset, uint64_t size, uint64_t zi_size,
+               uint64_t bank_modulus, uint64_t bank_firstres,
+               uint64_t bank_nres) {
+  auto streamp = onesegment_prepare(inobj, fileoffset, size, zi_size);
+  streamp =
+      bank_prepare(std::move(streamp), bank_modulus, bank_firstres, bank_nres);
+  bin_write(*streamp, outfile);
+}
+
+void bincombined_write(InputObject &inobj, const std::string &outfile,
+                       const std::vector<Segment> &segments, bool include_zi,
+                       std::optional<uint64_t> baseaddr, uint64_t bank_modulus,
+                       uint64_t bank_firstres, uint64_t bank_nres) {
+  auto streamp = combined_prepare(inobj, segments, include_zi, baseaddr);
+  streamp =
+      bank_prepare(std::move(streamp), bank_modulus, bank_firstres, bank_nres);
+  bin_write(*streamp, outfile);
+}
+
+void vhx_write(InputObject &inobj, const std::string &outfile,
+               uint64_t fileoffset, uint64_t size, uint64_t zi_size,
+               uint64_t bank_modulus, uint64_t bank_firstres,
+               uint64_t bank_nres) {
+  auto streamp = onesegment_prepare(inobj, fileoffset, size, zi_size);
+  streamp =
+      bank_prepare(std::move(streamp), bank_modulus, bank_firstres, bank_nres);
+  vhx_write(*streamp, outfile);
+}
+
+void vhxcombined_write(InputObject &inobj, const std::string &outfile,
+                       const std::vector<Segment> &segments, bool include_zi,
+                       std::optional<uint64_t> baseaddr, uint64_t bank_modulus,
+                       uint64_t bank_firstres, uint64_t bank_nres) {
+  auto streamp = combined_prepare(inobj, segments, include_zi, baseaddr);
+  streamp =
+      bank_prepare(std::move(streamp), bank_modulus, bank_firstres, bank_nres);
+  vhx_write(*streamp, outfile);
+}
diff --git a/llvm/tools/llvm-elf2bin/elf.cpp b/llvm/tools/llvm-elf2bin/elf.cpp
new file mode 100644
index 000000000000000..aa87c2f8d2744d6
--- /dev/null
+++ b/llvm/tools/llvm-elf2bin/elf.cpp
@@ -0,0 +1,66 @@
+//===- elf.cpp - Code to read ELF data structures for llvm-elf2bin --------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm-elf2bin.h"
+
+using namespace llvm;
+using namespace llvm::object;
+
+template <typename ELFT>
+static std::vector<Segment>
+get_segments(InputObject &inobj, ELFObjectFile<ELFT> &elfobj, bool physical) {
+  std::vector<Segment> segments;
+
+  Expected<ArrayRef<typename ELFT::Phdr>> phdrs_or_err =
+      elfobj.getELFFile().program_headers();
+  if (!phdrs_or_err) {
+    fatal(inobj, "unable to read program header table",
+          phdrs_or_err.takeError());
+    return segments;
+  }
+
+  for (const typename ELFT::Phdr &phdr : *phdrs_or_err) {
+    Segment seg;
+    seg.fileoffset = phdr.p_offset;
+    seg.baseaddr = physical ? phdr.p_paddr : phdr.p_vaddr;
+    seg.filesize = phdr.p_filesz;
+    seg.memsize = phdr.p_memsz;
+    segments.push_back(seg);
+  }
+
+  return segments;
+}
+
+template <typename ELFT>
+static uint64_t get_entry_point(ELFObjectFile<ELFT> &obj) {
+  return obj.getELFFile().getHeader().e_entry;
+}
+
+std::vector<Segment> InputObject::segments(bool physical) {
+  if (auto *specific = dyn_cast<ELF32LEObjectFile>(elf.get()))
+    return get_segments(*this, *specific, physical);
+  if (auto *specific = dyn_cast<ELF32BEObjectFile>(elf.get()))
+    return get_segments(*this, *specific, physical);
+  if (auto *specific = dyn_cast<ELF64LEObjectFile>(elf.get()))
+    return get_segments(*this, *specific, physical);
+  if (auto *specific = dyn_cast<ELF64BEObjectFile>(elf.get()))
+    return get_segments(*this, *specific, physical);
+  llvm_unreachable("unexpected subclass of ELFObjectFileBase");
+}
+
+uint64_t InputObject::entry_point() {
+  if (auto *specific = dyn_cast<ELF32LEObjectFile>(elf.get()))
+    return get_entry_point(*specific);
+  if (auto *specific = dyn_cast<ELF32BEObjectFile>(elf.get()))
+    return get_entry_point(*specific);
+  if (auto *specific = dyn_cast<ELF64LEObjectFile>(elf.get()))
+    return get_entry_point(*specific);
+  if (auto *specific = dyn_cast<ELF64BEObjectFile>(elf.get()))
+    return get_entry_point(*specific);
+  llvm_unreachable("unexpected subclass of ELFObjectFileBase");
+}
diff --git a/llvm/tools/llvm-elf2bin/hex.cpp b/llvm/tools/llvm-elf2bin/hex.cpp
new file mode 100644
index 000000000000000..3bb5aa96fb2df7c
--- /dev/null
+++ b/llvm/tools/llvm-elf2bin/hex.cpp
@@ -0,0 +1,199 @@
+//===- hex.cpp - Code to write Intel and Motorola Hex for llvm-elf2bin ----===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm-elf2bin.h"
+
+#include <cassert>
+
+#include "llvm/Support/Endian.h"
+#include "llvm/Support/raw_ostream.h"
+
+using namespace llvm;
+
+class Hex {
+public:
+  virtual void data(uint64_t addr, const std::string &data) = 0;
+  virtual void trailer(uint64_t entry) = 0;
+  virtual ~Hex() = default;
+};
+
+template <typename Integer> std::string bigend(Integer i, size_t bytes_wanted) {
+  assert(bytes_wanted <= sizeof(i));
+  char buf[sizeof(i)];
+  llvm::support::endian::write(buf, i, llvm::endianness::big);
+  return std::string(buf + sizeof(i) - bytes_wanted, bytes_wanted);
+}
+
+class IHex : public Hex {
+  static std::string record(uint8_t type, uint16_t addr,
+                            const std::string &data) {
+    std::string binstring;
+    llvm::raw_string_ostream binstream(binstring);
+
+    binstream << (char)data.size() << bigend(addr, 2) << (char)type << data;
+
+    uint8_t checksum = 0;
+    for (uint8_t c : binstring)
+      checksum -= c;
+    binstream << (char)checksum;
+
+    std::string hexstring;
+    llvm::raw_string_ostream hexstream(hexstring);
+
+    hexstream << ':';
+    for (uint8_t c : binstring) {
+      static const char hexdigits[] = "0123456789ABCDEF";
+      hexstream << hexdigits[c >> 4] << hexdigits[c & 0xF];
+    }
+    hexstream << '\n';
+    return hexstring;
+  }
+
+  InputObject &inobj;
+  llvm::raw_ostream &os;
+  uint64_t curr_offset = 0;
+
+  void data(uint64_t addr, const std::string &data) override {
+    uint64_t offset = addr >> 16;
+    if (offset != curr_offset) {
+      if (offset >= 0x10000)
+        fatal(inobj, "data address does not fit in 32 bits");
+      curr_offset = offset;
+      os << record(4, 0, bigend(curr_offset, 2));
+    }
+    os << record(0, addr & 0xFFFF, data);
+  }
+
+  void trailer(uint64_t entry) override {
+    if (entry >= 0x100000000)
+      fatal(inobj, "entry point does not fit in 32 bits");
+    os << record(5, 0, bigend(entry, 4)); // entry point
+    os << record(1, 0, "");               // EOF
+  }
+
+public:
+  static constexpr uint64_t max_datalen = 0xFF;
+  IHex(InputObject &inobj, llvm::raw_ostream &os) : inobj(inobj), os(os) {}
+};
+
+class SRec : public Hex {
+  static std::string record(uint8_t type, uint32_t addr,
+                            const std::string &data) {
+    std::string binstring;
+    llvm::raw_string_ostream binstream(binstring);
+
+    size_t addrsize = (type == 2 || type == 8   ? 3
+                       : type == 3 || type == 7 ? 4
+                                                : 2);
+
+    binstream << (char)(data.size() + addrsize + 1) << bigend(addr, addrsize)
+              << data;
+
+    uint8_t checksum = -1;
+    for (uint8_t c : binstring)
+      checksum -= c;
+    binstream << (char)checksum;
+
+    std::string hexstring;
+    llvm::raw_string_ostream hexstream(hexstring);
+
+    hexstream << 'S' << (char)('0' + type);
+    for (uint8_t c : binstring) {
+      static const char hexdigits[] = "0123456789ABCDEF";
+      hexstream << hexdigits[c >> 4] << hexdigits[c & 0xF];
+    }
+    hexstream << '\n';
+    return hexstring;
+  }
+
+  InputObject &inobj;
+  llvm::raw_ostream &os;
+
+  void data(uint64_t addr, const std::string &data) override {
+    if (addr >= 0x100000000)
+      fatal(inobj, "data address does not fit in 32 bits");
+    os << record(3, static_cast<uint32_t>(addr), data);
+  }
+
+  void trailer(uint64_t entry) override {
+    if (entry >= 0x100000000)
+      fatal(inobj, "entry point does not fit in 32 bits");
+
+    // We could also write an S5 or S6 record here, containing the total number
+    // of data records in the file. However, srec_motorola(5) says one of these
+    // is optional, and I'm unaware of anyone depending on it existing. Also,
+    // we'd have to decide what to do if the file were so huge that the number
+    // wouldn't fit.
+
+    os << record(7, static_cast<uint32_t>(entry), "");
+  }
+
+public:
+  // In S-records, the length field includes the address and checksum,
+  // so we can have fewer data bytes in a record than 0xFF
+  static constexpr uint64_t max_datalen = 0xFF - 5;
+
+  SRec(InputObject &inobj, llvm::raw_ostream &os) : inobj(inobj), os(os) {}
+};
+
+static void hex_write(InputObject &inobj, Hex &hex,
+                      const std::vector<Segment> &segments, bool include_zi,
+                      uint64_t datareclen) {
+  for (const auto &seg : segments) {
+    uint64_t segsize = include_zi ? seg.memsize : seg.filesize;
+
+    for (uint64_t pos = 0; pos < segsize; pos += datareclen) {
+      size_t thisreclen = std::min<size_t>(datareclen, segsize - pos);
+      size_t readlen = pos >= seg.filesize
+                           ? 0
+                           : std::min<size_t>(thisreclen, seg.filesize - pos);
+
+      std::string data;
+      if (readlen)
+        data = std::string(
+            inobj.membuf->getBuffer().substr(seg.fileoffset + pos, readlen));
+      if (thisreclen > readlen)
+        data += std::string(thisreclen - readlen, '\0');
+      hex.data(seg.baseaddr + pos, data);
+    }
+  }
+
+  hex.trailer(inobj.entry_point());
+}
+
+template <typename HexFormat>
+static void hex_write(InputObject &inobj, const std::string &outfile,
+                      const std::vector<Segment> &segments, bool include_zi,
+                      uint64_t datareclen) {
+  if (datareclen > HexFormat::max_datalen)
+    fatal(inobj, "data record length must be at most " +
+                     Twine(unsigned(HexFormat::max_datalen)));
+
+  if (datareclen < 1)
+    fatal(inobj, "data record length must be at least 1");
+
+  std::error_code error;
+  llvm::raw_fd_ostream outstream(outfile, error);
+  if (error)
+    fatal(outfile, "unable to open", errorCodeToError(error));
+
+  HexFormat hex(inobj, outstream);
+  hex_write(inobj, hex, segments, include_zi, datareclen);
+}
+
+void ihex_write(InputObject &inobj, const std::string &outfile,
+                const std::vector<Segment> &segments, bool include_zi,
+                uint64_t datareclen) {
+  hex_write<IHex>(inobj, outfile, segments, include_zi, datareclen);
+}
+
+void srec_write(InputObject &inobj, const std::string &outfile,
+                const std::vector<Segment> &segments, bool include_zi,
+                uint64_t datareclen) {
+  hex_write<SRec>(inobj, outfile, segments, include_zi, datareclen);
+}
diff --git a/llvm/tools/llvm-elf2bin/llvm-elf2bin.cpp b/llvm/tools/llvm-elf2bin/llvm-elf2bin.cpp
new file mode 100644
index 000000000000000..78fed8e762f7bd3
--- /dev/null
+++ b/llvm/tools/llvm-elf2bin/llvm-elf2bin.cpp
@@ -0,0 +1,461 @@
+//===- llvm-elf2bin.cpp - Convert ELF image to binary formats -------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This is a tool that converts images into binary and hex formats, similar to
+// some of llvm-objcopy's functionality, but specialized to ELF, using only the
+// 'load view' of an ELF image, that is, the PT_LOAD segments in the program
+// header table. The output can be written to plain binary files or various hex
+// formats. An additional option allows the output to be split into multiple
+// 'banks' to be loaded into separate ROMs, e.g. with the first 2 bytes out of
+// every 4 going into one ROM and the other 2 bytes going into another.
+//
+//===----------------------------------------------------------------------===//
+
+#include "llvm-elf2bin.h"
+
+#include <cstdint>
+#include <cstdlib>
+#include <limits>
+#include <optional>
+#include <set>
+#include <string>
+#include <vector>
+
+#include "llvm/ADT/StringRef.h"
+#include "llvm/ADT/bit.h"
+#include "llvm/Object/ELFObjectFile.h"
+#include "llvm/Option/Arg.h"
+#include "llvm/Option/ArgList.h"
+#include "llvm/Option/Option.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/InitLLVM.h"
+#include "llvm/Support/LLVMDriver.h"
+
+using namespace llvm;
+using llvm::object::ELFObjectFileBase;
+
+[[noreturn]] static void fatal_common(std::optional<StringRef> filename,
+                                      Twine message,
+                                      std::optional<llvm::Error> err,
+                                      bool suggest_help) {
+  llvm::errs() << "llvm-elf2bin: ";
+  if (filename)
+    llvm::errs() << *filename << ": ";
+  llvm::errs() << message;
+  if (err) {
+    handleAllErrors(std::move(*err), [](const ErrorInfoBase &einfo) {
+      llvm::errs() << ": " << einfo.message();
+    });
+  }
+  llvm::errs() << "\n";
+  if (suggest_help)
+    llvm::errs() << "(try 'llvm-elf2bin --help' for help)\n";
+  exit(1);
+}
+
+[[noreturn]] void fatal(InputObject &inobj, Twine message, llvm::Error err) {
+  fatal_common(inobj.filename, message, std::move(err), false);
+}
+[[noreturn]] void fatal(InputObject &inobj, Twine message) {
+  fatal_common(inobj.filename, message, std::nullopt, false);
+}
+[[noreturn]] void fatal(StringRef filename, Twine message, llvm::Error err) {
+  fatal_common(filename, message, std::move(err), false);
+}
+[[noreturn]] void fatal(StringRef filename, Twine message) {
+  fatal_common(filename, message, std::nullopt, false);
+}
+[[noreturn]] void fatal(Twine message) {
+  fatal_common(std::nullopt, message, std::nullopt, false);
+}
+// Just like fatal() but also suggests --help
+[[noreturn]] void fatal_suggest_help(Twine message) {
+  fatal_common(std::nullopt, message, std::nullopt, true);
+}
+
+/*
+ * Format an output file name, according to the format string
+ * description provided by the user and documented in the tool's help text.
+ *
+ * (So, unlike sprintf, this function doesn't take an arbitrary
+ * variadic argument list. Its input data consists of the details of
+ * an output file that's about to be written, and the format
+ * directives refer to particular pieces of data like 'input file
+ * name' and 'bank number' rather than to data types.)
+ */
+std::string format_outfile(std::string pattern, std::string inpath,
+                           std::optional<uint64_t> baseaddr, uint64_t bank) {
+  std::string infile;
+  size_t slash = inpath.find_last_of("/"
+#ifdef _WIN32
+                                     "\\:"
+#endif
+  );
+  infile = inpath.substr(slash == std::string::npos ? 0 : slash + 1);
+
+  std::string outstring;
+  llvm::raw_string_ostream outstream(outstring);
+
+  for (auto it = pattern.begin(); it != pattern.end();) {
+    char c = *it++;
+    if (c == '%') {
+      if (it == pattern.end())
+        fatal(Twine("output pattern '") + pattern +
+              "' ends with incomplete % escape");
+      char d = *it++;
+      switch (d) {
+      case 'F':
+        outstream << infile;
+        break;
+      case 'f':
+        outstream << infile.substr(
+            0, std::min(infile.size(), infile.find_last_of('.')));
+        break;
+      case 'a':
+      case 'A': {
+        if (!baseaddr)
+          fatal(Twine("output pattern '") + pattern + "' contains '%" +
+                Twine(d) + "' but no base address is available");
+        // Format directly to a std::string rather than holding a Twine in a
+        // local variable. llvm::utohexstr is in llvm/ADT/StringExtras.h.
+        outstream << llvm::utohexstr(baseaddr.value(), /*LowerCase=*/d == 'a');
+        break;
+      }
+      case 'b':
+        outstream << bank;
+        break;
+      case '%':
+        outstream << '%';
+        break;
+      default:
+        fatal(Twine("output pattern '") + pattern +
+              "' contains unrecognized % escape '%" + Twine(d) + "'");
+      }
+    } else {
+      outstream << c;
+    }
+  }
+
+  return outstring;
+}
+
+enum class Format {
+  IHex,
+  SRec,
+  BinMultifile,
+  BinCombined,
+  VhxMultifile,
+  VhxCombined
+};
+
+namespace {
+using namespace llvm::opt; // for HelpHidden in Opts.inc
+enum ID {
+  OPT_INVALID = 0, // This is not an option ID.
+#define OPTION(...) LLVM_MAKE_OPT_ID(__VA_ARGS__),
+#include "Opts.inc"
+#undef OPTION
+};
+
+#define PREFIX(NAME, VALUE)                                                    \
+  static constexpr StringLiteral NAME##_init[] = VALUE;                        \
+  static constexpr ArrayRef<StringLiteral> NAME(NAME##_init,                   \
+                                                std::size(NAME##_init) - 1);
+#include "Opts.inc"
+#undef PREFIX
+
+static constexpr opt::OptTable::Info InfoTable[] = {
+#define OPTION(...) LLVM_CONSTRUCT_OPT_INFO(__VA_ARGS__),
+#include "Opts.inc"
+#undef OPTION
+};
+
+class Elf2BinOptTable : public opt::GenericOptTable {
+public:
+  Elf2BinOptTable() : opt::GenericOptTable(InfoTable) {}
+};
+} // namespace
+
+int llvm_elf2bin_main(int argc, char **argv, const llvm::ToolContext &) {
+  InitLLVM X(argc, argv);
+  BumpPtrAllocator A;
+  StringSaver Saver(A);
+  Elf2BinOptTable table;
+  opt::InputArgList Args =
+      table.parseArgs(argc, argv, OPT_UNKNOWN, Saver, fatal_suggest_help);
+
+  if (Args.hasArg(OPT_help)) {
+    table.printHelp(outs(), "llvm-elf2bin [options] <input ELF images>",
+                    "LLVM ELF-to-binary converter");
+    return 0;
+  }
+  if (Args.hasArg(OPT_version)) {
+    llvm::cl::PrintVersionMessage();
+    return 0;
+  }
+
+  std::vector<std::string> infiles = Args.getAllArgValues(OPT_INPUT);
+
+  std::optional<std::string> outfile, outpattern;
+  if (Arg *A = Args.getLastArg(OPT_output_file_EQ))
+    outfile = A->getValue();
+  if (Arg *A = Args.getLastArg(OPT_output_pattern_EQ))
+    outpattern = A->getValue();
+
+  std::optional<Format> format;
+  if (Arg *A = Args.getLastArg(OPT_ihex, OPT_srec, OPT_bin, OPT_bincombined,
+                               OPT_vhx, OPT_vhxcombined)) {
+    auto Option = A->getOption();
+    if (Option.matches(OPT_ihex))
+      format = Format::IHex;
+    if (Option.matches(OPT_srec))
+      format = Format::SRec;
+    if (Option.matches(OPT_bin))
+      format = Format::BinMultifile;
+    if (Option.matches(OPT_bincombined))
+      format = Format::BinCombined;
+    if (Option.matches(OPT_vhx))
+      format = Format::VhxMultifile;
+    if (Option.matches(OPT_vhxcombined))
+      format = Format::VhxCombined;
+  }
+
+  std::optional<uint64_t> baseaddr;
+  if (Arg *A = Args.getLastArg(OPT_base_EQ)) {
+    uint64_t value;
+    StringRef str = A->getValue();
+    if (str.getAsInteger(0, value))
+      fatal(Twine("cannot parse base address '") + str + "'");
+    baseaddr = value;
+  }
+
+  std::optional<std::set<uint64_t>> segments_wanted;
+  if (Arg *A = Args.getLastArg(OPT_segments_EQ)) {
+    std::set<uint64_t> bases;
+    SmallVector<StringRef, 4> fields;
+    SplitString(A->getValue(), fields, ",");
+    for (auto field : fields) {
+      uint64_t base;
+      if (field.getAsInteger(0, base))
+        fatal(Twine("cannot parse segment base address '") + field + "'");
+      bases.insert(base);
+    }
+    segments_wanted = bases;
+  }
+
+  uint64_t bankwidth = 1, nbanks = 1;
+  if (Arg *A = Args.getLastArg(OPT_banks_EQ)) {
+    StringRef str = A->getValue();
+    size_t xpos = str.find_first_of("x");
+    size_t xpos2 = str.find_last_of("x");
+    if (!(xpos != StringRef::npos && xpos == xpos2 &&
+          !str.substr(0, xpos).getAsInteger(0, bankwidth) &&
+          !str.substr(xpos + 1).getAsInteger(0, nbanks) &&
+          llvm::has_single_bit(bankwidth) && llvm::has_single_bit(nbanks)))
+      fatal(Twine("cannot parse bank specification '") + str + "'");
+  }
+
+  uint64_t datareclen = 16;
+  if (Arg *A = Args.getLastArg(OPT_datareclen_EQ)) {
+    StringRef str = A->getValue();
+    if (str.getAsInteger(0, datareclen))
+      fatal(Twine("cannot parse data record length '") + str + "'");
+  }
+
+  bool include_zi = Args.hasArg(OPT_zi);
+  bool physical = true;
+  if (Arg *A = Args.getLastArg(OPT_physical, OPT_virtual))
+    physical = A->getOption().matches(OPT_physical);
+
+  if (infiles.empty())
+    fatal_suggest_help("no input file specified");
+  if (!outfile && !outpattern)
+    fatal_suggest_help("no output filename or pattern specified");
+  if (!format)
+    fatal_suggest_help("no output format specified");
+  if (outfile && outpattern)
+    fatal_suggest_help("output filename and pattern both specified");
+
+  if ((format != Format::BinCombined && format != Format::VhxCombined) &&
+      baseaddr)
+    fatal("--base only applies to --bincombined and --vhxcombined");
+
+  if ((format != Format::BinMultifile && format != Format::BinCombined &&
+       format != Format::VhxMultifile && format != Format::VhxCombined) &&
+      (bankwidth != 1 || nbanks != 1))
+    fatal("--banks only applies to binary and VHX output");
+
+  /*
+   * Open the input files.
+   */
+  std::vector<InputObject> objects;
+
+  for (const std::string &infile : infiles) {
+    objects.emplace_back();
+    InputObject &inobj = objects.back();
+    inobj.filename = infile;
+
+    ErrorOr<std::unique_ptr<MemoryBuffer>> membuf_or_err =
+        MemoryBuffer::getFileOrSTDIN(infile, false, false);
+    if (std::error_code error = membuf_or_err.getError())
+      fatal(infile, "unable to open", errorCodeToError(error));
+    inobj.membuf = std::move(membuf_or_err.get());
+
+    Expected<std::unique_ptr<llvm::object::Binary>> binary_or_err =
+        llvm::object::createBinary(inobj.membuf->getMemBufferRef(), nullptr,
+                                   false);
+    if (!binary_or_err)
+      fatal(infile, "unable to process", binary_or_err.takeError());
+
+    std::unique_ptr<ELFObjectFileBase> elf =
+        dyn_cast<ELFObjectFileBase>(*binary_or_err);
+    if (!elf)
+      fatal(infile, "unable to process: not an ELF file");
+    inobj.elf = std::move(elf);
+  }
+
+  /*
+   * Helper function for listing the segments of a file, paying
+   * attention to the --segments option to restrict to a subset.
+   */
+  auto segments = [&](InputObject &inobj) {
+    std::vector<Segment> allsegs = inobj.segments(physical);
+    if (!segments_wanted)
+      return allsegs;
+
+    std::vector<Segment> segs;
+    auto &keep = segments_wanted.value();
+    for (auto seg : allsegs)
+      if (keep.find(seg.baseaddr) != keep.end())
+        segs.push_back(seg);
+    return segs;
+  };
+
+  /*
+   * Make a list of all the conversions we want to do.
+   */
+
+  struct Conversion {
+    InputObject *inobj;
+    std::string outfile;
+    std::optional<uint64_t> baseaddr, fileoffset, size, zisize;
+    uint64_t bank;
+  };
+  std::vector<Conversion> convs;
+
+  for (auto &inobj : objects) {
+    // Helper function to fill in inobj and outfile
+    auto add_conv = [&](Conversion conv) {
+      conv.inobj = &inobj;
+      if (outfile)
+        conv.outfile = outfile.value();
+      else
+        conv.outfile = format_outfile(outpattern.value(), conv.inobj->filename,
+                                      conv.baseaddr, conv.bank);
+      convs.push_back(conv);
+    };
+
+    switch (format.value()) {
+    case Format::BinMultifile:
+    case Format::VhxMultifile:
+      /*
+       * Separate output file per segment and per bank, so go
+       * through this input file and list its segments.
+       */
+      for (auto seg : segments(inobj)) {
+        for (uint64_t bank = 0; bank < nbanks; bank++) {
+          Conversion conv;
+          conv.baseaddr = seg.baseaddr;
+          conv.fileoffset = seg.fileoffset;
+          conv.size = seg.filesize;
+          conv.bank = bank;
+          if (include_zi && seg.memsize > seg.filesize)
+            conv.zisize = seg.memsize - seg.filesize;
+          else
+            conv.zisize = 0;
+          add_conv(conv);
+        }
+      }
+      break;
+    case Format::BinCombined:
+    case Format::VhxCombined:
+      /*
+       * Separate output file per bank, but each one contains
+       * the whole input file.
+       */
+      for (uint64_t bank = 0; bank < nbanks; bank++) {
+        Conversion conv;
+        conv.bank = bank;
+        add_conv(conv);
+      }
+      break;
+    default:
+      /*
+       * One output file per input file.
+       */
+      add_conv(Conversion{});
+      break;
+    }
+  }
+
+  std::set<std::string> outfiles;
+  for (const auto &conv : convs) {
+    if (outfiles.find(conv.outfile) != outfiles.end()) {
+      fatal(Twine("output file '") + conv.outfile +
+            "' would be written more than once by this command");
+    }
+    outfiles.insert(conv.outfile);
+  }
+
+  uint64_t bankmod = nbanks * bankwidth;
+
+  for (const auto &conv : convs) {
+    switch (format.value()) {
+    case Format::BinMultifile:
+      bin_write(*conv.inobj, conv.outfile, conv.fileoffset.value(),
+                conv.size.value(), conv.zisize.value(), bankmod,
+                conv.bank * bankwidth, bankwidth);
+      break;
+
+    case Format::BinCombined:
+      bincombined_write(*conv.inobj, conv.outfile, segments(*conv.inobj),
+                        include_zi, baseaddr, bankmod, conv.bank * bankwidth,
+                        bankwidth);
+      break;
+
+    case Format::VhxMultifile:
+      vhx_write(*conv.inobj, conv.outfile, conv.fileoffset.value(),
+                conv.size.value(), conv.zisize.value(), bankmod,
+                conv.bank * bankwidth, bankwidth);
+      break;
+
+    case Format::VhxCombined:
+      vhxcombined_write(*conv.inobj, conv.outfile, segments(*conv.inobj),
+                        include_zi, baseaddr, bankmod, conv.bank * bankwidth,
+                        bankwidth);
+      break;
+
+    case Format::IHex:
+      ihex_write(*conv.inobj, conv.outfile, segments(*conv.inobj), include_zi,
+                 datareclen);
+      break;
+
+    case Format::SRec:
+      srec_write(*conv.inobj, conv.outfile, segments(*conv.inobj), include_zi,
+                 datareclen);
+      break;
+    }
+  }
+
+  return 0;
+}
diff --git a/llvm/tools/llvm-elf2bin/llvm-elf2bin.h b/llvm/tools/llvm-elf2bin/llvm-elf2bin.h
new file mode 100644
index 000000000000000..daebd8640bcfe0f
--- /dev/null
+++ b/llvm/tools/llvm-elf2bin/llvm-elf2bin.h
@@ -0,0 +1,170 @@
+//===- llvm-elf2bin.h - Header file for llvm-elf2bin ----------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include <cstdint>
+#include <istream>
+#include <memory>
+#include <optional>
+#include <string>
+#include <vector>
+
+#include "llvm/ADT/Twine.h"
+#include "llvm/Object/ELFObjectFile.h"
+#include "llvm/Support/Error.h"
+
+/*
+ * A structure that describes a single loadable segment in an ELF
+ * image. 'fileoffset' and 'filesize' designate a region of bytes in
+ * the file; 'baseaddr' specifies the memory address that region is
+ * expected to be loaded at.
+ *
+ * 'memsize' indicates how many bytes the region will occupy once it's
+ * loaded. This should never be less than 'filesize'. If it's more
+ * than 'filesize', it indicates that zero padding should be appended
+ * by the loader to pad to the full size.
+ */
+struct Segment {
+  uint64_t baseaddr, fileoffset, filesize, memsize;
+};
+
+/*
+ * A structure containing the name of an input file to the program,
+ * and the result of loading it into an ELFObjectFile.
+ */
+struct InputObject {
+  std::string filename;
+  std::unique_ptr<llvm::MemoryBuffer> membuf;
+  std::unique_ptr<llvm::object::ELFObjectFileBase> elf;
+
+  /*
+   * List the loadable segments in the file.
+   *
+   * The flag 'physical' indicates that the 'baseaddr' value for each
+   * returned segment should be the physical address of that segment,
+   * i.e. the p_paddr field in its program header table entry. If it's
+   * false, the virtual address will be used instead, i.e. the p_vaddr
+   * field.
+   */
+  std::vector<Segment> segments(bool physical);
+
+  uint64_t entry_point();
+};
+
+/*
+ * Write a single binary or Verilog hex file. (A Verilog hex file is
+ * just the plain binary data, represented as a sequence of text lines
+ * each containing a hex-encoded byte).
+ *
+ * Input data comes from the input object 'inobj', at offset 'fileoffset',
+ * 'size' bytes long. 'zi_size' zero bytes are appended after that.
+ *
+ * Bank switching is supported by the bank_* parameters, which select
+ * a subset of the input bytes to be written to the output.
+ * Specifically, each input byte is included or excluded depending on
+ * the residue of its position in the input, mod 'bank_modulus'.
+ * 'bank_nres' consecutive residues are kept, starting at
+ * 'bank_firstres'. For example:
+ *
+ * modulus=8, firstres=0, nres=1: keep only the bytes whose positions
+ * are congruent to 0 mod 8. (Just one residue, namely 0.) That is,
+ * divide the input into 8-byte blocks and only write the initial byte
+ * of each block.
+ *
+ * modulus=8, firstres=0, nres=2: keep the bytes whose positions are
+ * congruent to {0,1} mod 8. (Two consecutive residues, starting at 0.)
+ *
+ * modulus=8, firstres=4, nres=2: keep the bytes whose positions are
+ * congruent to {4,5} mod 8. (Still two residues, but now starting at 4.)
+ *
+ * If no bank switching is needed, then modulus=1, firstres=0, nres=1
+ * is a combination that indicates 'write all input bytes to the output'.
+ *
+ * The output is written to the file 'outfile'.
+ *
+ * These functions will exit the entire program with an error message if
+ * anything goes wrong. So callers need not handle the failure case.
+ */
+void bin_write(InputObject &inobj, const std::string &outfile,
+               uint64_t fileoffset, uint64_t size, uint64_t zi_size,
+               uint64_t bank_modulus, uint64_t bank_firstres,
+               uint64_t bank_nres);
+void vhx_write(InputObject &inobj, const std::string &outfile,
+               uint64_t fileoffset, uint64_t size, uint64_t zi_size,
+               uint64_t bank_modulus, uint64_t bank_firstres,
+               uint64_t bank_nres);
+
+/*
+ * Write a combined binary or Verilog hex file, including multiple
+ * segments from an ELF file, at their correct relative offsets.
+ *
+ * 'segments' gives the list of segments from the ELF file to include.
+ * Each Segment structure includes the file offset, base address and
+ * size, so these functions can work out the padding required in
+ * between.
+ *
+ * 'baseaddr' gives the address corresponding to the start of the
+ * file. (So if this is lower than the base address of the first
+ * segment, then padding must be inserted at the very start of the
+ * file.)
+ *
+ * If 'include_zi' is set, then the ZI padding specified in the ELF
+ * file after each segment will be reliably included in the output
+ * file. (This is likely only relevant to the final segment, because
+ * if there are two segments with space between them, then the ZI
+ * padding for the first segment will occupy some of that space, and
+ * will be included in the file anyway.) If 'include_zi' is false, the
+ * output file will end as soon as the last byte of actual file data
+ * has been written.
+ *
+ * The bank_* parameters are interpreted identically to the previous
+ * pair of functions. So is 'outfile'.
+ *
+ * These functions too will exit with an error in case of failure.
+ */
+void bincombined_write(InputObject &inobj, const std::string &outfile,
+                       const std::vector<Segment> &segments, bool include_zi,
+                       std::optional<uint64_t> baseaddr, uint64_t bank_modulus,
+                       uint64_t bank_firstres, uint64_t bank_nres);
+void vhxcombined_write(InputObject &inobj, const std::string &outfile,
+                       const std::vector<Segment> &segments, bool include_zi,
+                       std::optional<uint64_t> baseaddr, uint64_t bank_modulus,
+                       uint64_t bank_firstres, uint64_t bank_nres);
+
+/*
+ * Write a structured hex file (Intel Hex or Motorola S-records)
+ * describing an ELF image.
+ *
+ * 'segments' lists the loadable segments that should be included
+ * (which may not be all the segments in the file, if the command line
+ * specified only a subset of them).
+ *
+ * 'include_zi' indicates whether the ZI padding at the end of each
+ * segment should be explicitly represented in the hex file.
+ *
+ * 'datareclen' indicates how many bytes of data should appear in each
+ * hex record. (Fewer bytes per record mean more file size overhead,
+ * but also less likelihood of overflowing a reader's size limit.)
+ *
+ * These functions will exit the entire program with an error message if
+ * anything goes wrong. So callers need not handle the failure case.
+ */
+void ihex_write(InputObject &inobj, const std::string &outfile,
+                const std::vector<Segment> &segments, bool include_zi,
+                uint64_t datareclen);
+void srec_write(InputObject &inobj, const std::string &outfile,
+                const std::vector<Segment> &segments, bool include_zi,
+                uint64_t datareclen);
+
+/*
+ * Error-reporting functions. These are all fatal.
+ */
+[[noreturn]] void fatal(llvm::StringRef filename, llvm::Twine message,
+                        llvm::Error err);
+[[noreturn]] void fatal(llvm::StringRef filename, llvm::Twine message);
+[[noreturn]] void fatal(InputObject &inobj, llvm::Twine message,
+                        llvm::Error err);
+[[noreturn]] void fatal(InputObject &inobj, llvm::Twine message);
+[[noreturn]] void fatal(llvm::Twine message);
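
The bank selection rule described in the header comment for bin_write and vhx_write can be illustrated with a small standalone sketch. This is not code from the patch: `select_bank` is a hypothetical helper that operates on an in-memory string instead of an ELF segment, and it assumes `modulus` is nonzero and that `firstres + nres` does not exceed `modulus`.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper (not part of the patch): keep only the bytes of
// 'input' whose position p satisfies
//   firstres <= (p mod modulus) < firstres + nres.
// Assumes modulus != 0 and firstres + nres <= modulus.
std::string select_bank(const std::string &input, uint64_t modulus,
                        uint64_t firstres, uint64_t nres) {
  std::string out;
  for (uint64_t pos = 0; pos < input.size(); ++pos) {
    uint64_t residue = pos % modulus;
    // Keep 'nres' consecutive residues starting at 'firstres'.
    if (residue >= firstres && residue - firstres < nres)
      out.push_back(input[pos]);
  }
  return out;
}
```

With the way main() derives these parameters (modulus = nbanks * bankwidth, firstres = bank * bankwidth, nres = bankwidth), an option such as --banks=2x2 corresponds to modulus 4: bank 0 keeps residues {0,1} and bank 1 keeps residues {2,3}. Applied to the 16-byte input "ABCDEFGHIJKLMNOP", the sketch yields "ABEFIJMN" for bank 0 and "CDGHKLOP" for bank 1.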


