[llvm] a29f0dd - [llubi] Add initial support for llubi (#180022)

via llvm-commits llvm-commits at lists.llvm.org
Mon Feb 9 09:54:44 PST 2026


Author: Yingwei Zheng
Date: 2026-02-10T01:54:34+08:00
New Revision: a29f0dd09680fbf7c24aa182c87f51cf4b93e21d

URL: https://github.com/llvm/llvm-project/commit/a29f0dd09680fbf7c24aa182c87f51cf4b93e21d
DIFF: https://github.com/llvm/llvm-project/commit/a29f0dd09680fbf7c24aa182c87f51cf4b93e21d.diff

LOG: [llubi] Add initial support for llubi (#180022)

This patch implements the initial support for upstreaming
[llubi](https://github.com/dtcxzyw/llvm-ub-aware-interpreter). It only
provides the minimal functionality to run a simple main function. I hope
we can focus on the interface design in this PR, rather than trivial
implementations for each instruction.
RFC link:
https://discourse.llvm.org/t/rfc-upstreaming-llvm-ub-aware-interpreter/89645

Excluding the driver `llubi.cpp`, this patch contains three components
for better decoupling:
+ `Value.h/cpp`: Value representation
+ `Context.h/cpp`: Global state management (e.g., memory) and
interpreter configuration
+ `Interpreter.cpp`: The main interpreter loop

Compared to the out-of-tree version, the major differences are listed
below:
+ The interpreter logic always returns control to its caller, i.e.,
it never calls `exit/abort` when an immediate UB is triggered.
+ `EventHandler` provides an interface to dump the trace. It also allows
callers to inspect the actual values and verify the correctness of
analysis passes (e.g., KnownBits/SCEV).
+ The context is designed to be reentrant. That is, you can call
`runFunction` multiple times. However, its usefulness remains in doubt
because side effects of earlier calls persist in the context.
+ `runFunction` handles function calls with a loop, instead of calling
itself recursively. This makes it no longer bounded by the stack depth.
+ Uninitialized memory is planned to be approximated by returning random
values each time an uninitialized byte is loaded.
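
To illustrate the loop-based call handling described above, here is a minimal standalone sketch. It is not the code from this patch: `ToyState`, `ToyFrame`, and `runSum` are hypothetical names, and the "program" is a toy function `f(n) = n == 0 ? 0 : n + f(n - 1)`. The sketch only shows how an explicit frame stack replaces host recursion, so the nesting depth is bounded by a configurable limit rather than by the native stack.

```cpp
#include <cstddef>
#include <cstdint>
#include <list>

// Hypothetical, simplified frame states (the patch uses a richer set).
enum class ToyState { Entry, Pending, Exit };

// One activation of the toy function f(n) = n == 0 ? 0 : n + f(n - 1).
struct ToyFrame {
  uint64_t N;
  uint64_t Result = 0;       // value returned by this frame
  uint64_t CalleeResult = 0; // slot filled in when the callee returns
  ToyState State = ToyState::Entry;
  explicit ToyFrame(uint64_t N) : N(N) {}
};

// Evaluates f(N) with an explicit call stack instead of host recursion.
// Returns false if MaxStackDepth (0 = unlimited) would be exceeded.
bool runSum(uint64_t N, size_t MaxStackDepth, uint64_t &RetVal) {
  std::list<ToyFrame> CallStack; // std::list keeps references stable
  CallStack.emplace_back(N);
  while (!CallStack.empty()) {
    ToyFrame &Top = CallStack.back();
    switch (Top.State) {
    case ToyState::Entry:
      if (Top.N == 0) {
        Top.Result = 0;
        Top.State = ToyState::Exit;
        break;
      }
      if (MaxStackDepth != 0 && CallStack.size() >= MaxStackDepth)
        return false; // always hand control back to the caller
      // "Call": suspend this frame and push the callee's frame.
      Top.State = ToyState::Pending;
      CallStack.emplace_back(Top.N - 1);
      break;
    case ToyState::Pending:
      // The callee has returned; consume its result.
      Top.Result = Top.N + Top.CalleeResult;
      Top.State = ToyState::Exit;
      break;
    case ToyState::Exit: {
      uint64_t R = Top.Result; // copy before the frame is destroyed
      CallStack.pop_back();
      if (CallStack.empty())
        RetVal = R;
      else
        CallStack.back().CalleeResult = R;
      break;
    }
    }
  }
  return true;
}
```

With no depth limit, `runSum(100000, 0, R)` completes without deep host recursion. With a limit of 256, a chain that needs 301 frames is rejected by returning `false` rather than aborting the process.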

Added: 
    llvm/docs/CommandGuide/llubi.rst
    llvm/test/tools/llubi/main.ll
    llvm/test/tools/llubi/main2.ll
    llvm/test/tools/llubi/poison.ll
    llvm/tools/llubi/CMakeLists.txt
    llvm/tools/llubi/lib/CMakeLists.txt
    llvm/tools/llubi/lib/Context.cpp
    llvm/tools/llubi/lib/Context.h
    llvm/tools/llubi/lib/Interpreter.cpp
    llvm/tools/llubi/lib/Value.cpp
    llvm/tools/llubi/lib/Value.h
    llvm/tools/llubi/llubi.cpp

Modified: 
    llvm/docs/CommandGuide/index.rst
    llvm/test/CMakeLists.txt
    llvm/test/lit.cfg.py

Removed: 
    


################################################################################
diff  --git a/llvm/docs/CommandGuide/index.rst b/llvm/docs/CommandGuide/index.rst
index 8f080ded81c69..c6427d1245b9a 100644
--- a/llvm/docs/CommandGuide/index.rst
+++ b/llvm/docs/CommandGuide/index.rst
@@ -17,6 +17,7 @@ Basic Commands
    dsymutil
    llc
    lli
+   llubi
    llvm-as
    llvm-cgdata
    llvm-config

diff  --git a/llvm/docs/CommandGuide/llubi.rst b/llvm/docs/CommandGuide/llubi.rst
new file mode 100644
index 0000000000000..f652af83d810a
--- /dev/null
+++ b/llvm/docs/CommandGuide/llubi.rst
@@ -0,0 +1,79 @@
+llubi - LLVM UB-aware Interpreter
+=================================
+
+.. program:: llubi
+
+SYNOPSIS
+--------
+
+:program:`llubi` [*options*] [*filename*] [*program args*]
+
+DESCRIPTION
+-----------
+
+:program:`llubi` directly executes programs in LLVM bitcode format and tracks values according to LLVM IR semantics.
+Unlike :program:`lli`, :program:`llubi` is designed to be aware of undefined behavior during execution.
+It detects immediate undefined behavior such as integer division by zero, and respects poison-generating flags
+like `nsw` and `nuw`. As it captures most of the guardable undefined behaviors, it is highly suitable for
+constructing an interestingness test for miscompilation bugs.
+
+If `filename` is not specified, then :program:`llubi` reads the LLVM bitcode for the
+program from standard input.
+
+The optional *args* specified on the command line are passed to the program as
+arguments.
+
+GENERAL OPTIONS
+---------------
+
+.. option:: -fake-argv0=executable
+
+ Override the ``argv[0]`` value passed into the executing program.
+
+.. option:: -entry-function=function
+
+ Specify the name of the function to execute as the program's entry point.
+ By default, :program:`llubi` uses the function named ``main``.
+
+.. option:: -help
+
+ Print a summary of command line options.
+
+.. option:: -verbose
+
+ Print results for each instruction executed.
+
+.. option:: -version
+
+ Print out the version of :program:`llubi` and exit without doing anything else.
+
+INTERPRETER OPTIONS
+-------------------
+
+.. option:: -max-mem=N
+
+  Limit the amount of memory (in bytes) that can be allocated by the program, including
+  stack, heap, and global variables. If the limit is exceeded, execution will be terminated.
+  By default, there is no limit (N = 0).
+
+.. option:: -max-stack-depth=N
+
+  Limit the maximum stack depth to N. If the limit is exceeded, execution will be terminated.
+  The default limit is 256. Set N to 0 to disable the limit.
+
+.. option:: -max-steps=N
+
+  Limit the number of instructions executed to N. If the limit is reached, execution will
+  be terminated. By default, there is no limit (N = 0).
+
+.. option:: -vscale=N
+
+  Set the value of `llvm.vscale` to N. The default value is 4.
+
+EXIT STATUS
+-----------
+
+If :program:`llubi` fails to load the program, or an error occurs during execution (e.g., an immediate undefined
+behavior is triggered), it will exit with an exit code of 1.
+If the return type of the entry function is not an integer type, it will return 0.
+Otherwise, it will return the exit code of the program.

diff  --git a/llvm/test/CMakeLists.txt b/llvm/test/CMakeLists.txt
index 77fbbe28ca56d..388ce613ad1d0 100644
--- a/llvm/test/CMakeLists.txt
+++ b/llvm/test/CMakeLists.txt
@@ -76,6 +76,7 @@ set(LLVM_TEST_DEPENDS
   llc
   lli
   lli-child-target
+  llubi
   llvm-addr2line
   llvm-ar
   llvm-as

diff  --git a/llvm/test/lit.cfg.py b/llvm/test/lit.cfg.py
index 8463e667d9f71..79b78ffeb2dab 100644
--- a/llvm/test/lit.cfg.py
+++ b/llvm/test/lit.cfg.py
@@ -235,6 +235,7 @@ def get_asan_rtlib():
         "dsymutil",
         "lli",
         "lli-child-target",
+        "llubi",
         "llvm-ar",
         "llvm-as",
         "llvm-addr2line",

diff  --git a/llvm/test/tools/llubi/main.ll b/llvm/test/tools/llubi/main.ll
new file mode 100644
index 0000000000000..c10824621018e
--- /dev/null
+++ b/llvm/test/tools/llubi/main.ll
@@ -0,0 +1,11 @@
+; RUN: llubi --verbose < %s 2>&1 | FileCheck %s
+
+define i32 @main(i32 %argc, ptr %argv) {
+  ret i32 0
+}
+
+; CHECK: Entering function: main
+; CHECK:   i32 %argc = i32 1
+; CHECK:   ptr %argv = ptr 0x8 [argv]
+; CHECK:   ret i32 0
+; CHECK: Exiting function: main

diff  --git a/llvm/test/tools/llubi/main2.ll b/llvm/test/tools/llubi/main2.ll
new file mode 100644
index 0000000000000..58c5744bb0909
--- /dev/null
+++ b/llvm/test/tools/llubi/main2.ll
@@ -0,0 +1,9 @@
+; RUN: llubi --verbose < %s 2>&1 | FileCheck %s
+
+define i32 @main() {
+  ret i32 0
+}
+
+; CHECK: Entering function: main
+; CHECK:   ret i32 0
+; CHECK: Exiting function: main

diff  --git a/llvm/test/tools/llubi/poison.ll b/llvm/test/tools/llubi/poison.ll
new file mode 100644
index 0000000000000..cf3b69d1aeb77
--- /dev/null
+++ b/llvm/test/tools/llubi/poison.ll
@@ -0,0 +1,11 @@
+; RUN: not llubi --verbose < %s 2>&1 | FileCheck %s
+
+define i32 @main(i32 %argc, ptr %argv) {
+  ret i32 poison
+}
+; CHECK: Entering function: main
+; CHECK:   i32 %argc = i32 1
+; CHECK:   ptr %argv = ptr 0x8 [argv]
+; CHECK:   ret i32 poison
+; CHECK: Exiting function: main
+; CHECK: error: Execution of function 'main' resulted in poison return value.

diff  --git a/llvm/tools/llubi/CMakeLists.txt b/llvm/tools/llubi/CMakeLists.txt
new file mode 100644
index 0000000000000..46d06f6e5dfc2
--- /dev/null
+++ b/llvm/tools/llubi/CMakeLists.txt
@@ -0,0 +1,17 @@
+set(LLVM_LINK_COMPONENTS
+  Analysis
+  Core
+  IRPrinter
+  IRReader
+  Support
+  )
+
+add_llvm_tool(llubi
+  llubi.cpp
+
+  DEPENDS
+  intrinsics_gen
+  )
+
+add_subdirectory(lib)
+target_link_libraries(llubi PRIVATE LLVMUBAwareInterpreter)

diff  --git a/llvm/tools/llubi/lib/CMakeLists.txt b/llvm/tools/llubi/lib/CMakeLists.txt
new file mode 100644
index 0000000000000..d3b54d0bd45b5
--- /dev/null
+++ b/llvm/tools/llubi/lib/CMakeLists.txt
@@ -0,0 +1,12 @@
+set(LLVM_LINK_COMPONENTS
+  Analysis
+  Core
+  Support
+  )
+
+add_llvm_library(LLVMUBAwareInterpreter
+  STATIC
+  Context.cpp
+  Interpreter.cpp
+  Value.cpp
+  )

diff  --git a/llvm/tools/llubi/lib/Context.cpp b/llvm/tools/llubi/lib/Context.cpp
new file mode 100644
index 0000000000000..6b5362204cfde
--- /dev/null
+++ b/llvm/tools/llubi/lib/Context.cpp
@@ -0,0 +1,129 @@
+//===- Context.cpp - State Tracking for llubi -----------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file tracks the global states (e.g., memory) of the interpreter.
+//
+//===----------------------------------------------------------------------===//
+
+#include "Context.h"
+#include "llvm/Support/MathExtras.h"
+
+namespace llvm::ubi {
+
+Context::Context(Module &M)
+    : Ctx(M.getContext()), M(M), DL(M.getDataLayout()),
+      TLIImpl(M.getTargetTriple()) {}
+
+Context::~Context() = default;
+
+AnyValue Context::getConstantValueImpl(Constant *C) {
+  if (isa<PoisonValue>(C))
+    return AnyValue::getPoisonValue(*this, C->getType());
+
+  // TODO: Handle ConstantInt vector.
+  if (auto *CI = dyn_cast<ConstantInt>(C))
+    return CI->getValue();
+
+  llvm_unreachable("Unrecognized constant");
+}
+
+const AnyValue &Context::getConstantValue(Constant *C) {
+  auto It = ConstCache.find(C);
+  if (It != ConstCache.end())
+    return It->second;
+
+  return ConstCache.emplace(C, getConstantValueImpl(C)).first->second;
+}
+
+MemoryObject::~MemoryObject() = default;
+MemoryObject::MemoryObject(uint64_t Addr, uint64_t Size, StringRef Name,
+                           unsigned AS, MemInitKind InitKind)
+    : Address(Addr), Size(Size), Name(Name), AS(AS),
+      State(InitKind != MemInitKind::Poisoned ? MemoryObjectState::Alive
+                                              : MemoryObjectState::Dead) {
+  switch (InitKind) {
+  case MemInitKind::Zeroed:
+    Bytes.resize(Size, Byte{0, ByteKind::Concrete});
+    break;
+  case MemInitKind::Uninitialized:
+    Bytes.resize(Size, Byte{0, ByteKind::Undef});
+    break;
+  case MemInitKind::Poisoned:
+    Bytes.resize(Size, Byte{0, ByteKind::Poison});
+    break;
+  }
+}
+
+IntrusiveRefCntPtr<MemoryObject> Context::allocate(uint64_t Size,
+                                                   uint64_t Align,
+                                                   StringRef Name, unsigned AS,
+                                                   MemInitKind InitKind) {
+  if (MaxMem != 0 && SaturatingAdd(UsedMem, Size) >= MaxMem)
+    return nullptr;
+  uint64_t AlignedAddr = alignTo(AllocationBase, Align);
+  auto MemObj =
+      makeIntrusiveRefCnt<MemoryObject>(AlignedAddr, Size, Name, AS, InitKind);
+  MemoryObjects[AlignedAddr] = MemObj;
+  AllocationBase = AlignedAddr + Size;
+  UsedMem += Size;
+  return MemObj;
+}
+
+bool Context::free(uint64_t Address) {
+  auto It = MemoryObjects.find(Address);
+  if (It == MemoryObjects.end())
+    return false;
+  UsedMem -= It->second->getSize();
+  It->second->markAsFreed();
+  MemoryObjects.erase(It);
+  return true;
+}
+
+Pointer Context::deriveFromMemoryObject(IntrusiveRefCntPtr<MemoryObject> Obj) {
+  assert(Obj && "Cannot determine the address space of a null memory object");
+  return Pointer(
+      Obj,
+      APInt(DL.getPointerSizeInBits(Obj->getAddressSpace()), Obj->getAddress()),
+      /*Offset=*/0);
+}
+
+void MemoryObject::markAsFreed() {
+  State = MemoryObjectState::Freed;
+  Bytes.clear();
+}
+
+void MemoryObject::writeRawBytes(uint64_t Offset, const void *Data,
+                                 uint64_t Length) {
+  assert(SaturatingAdd(Offset, Length) <= Size && "Write out of bounds");
+  const uint8_t *ByteData = static_cast<const uint8_t *>(Data);
+  for (uint64_t I = 0; I < Length; ++I)
+    Bytes[Offset + I].set(ByteData[I]);
+}
+
+void MemoryObject::writeInteger(uint64_t Offset, const APInt &Int,
+                                const DataLayout &DL) {
+  uint64_t BitWidth = Int.getBitWidth();
+  uint64_t IntSize = divideCeil(BitWidth, 8);
+  assert(SaturatingAdd(Offset, IntSize) <= Size && "Write out of bounds");
+  for (uint64_t I = 0; I < IntSize; ++I) {
+    uint64_t ByteIndex = DL.isLittleEndian() ? I : (IntSize - 1 - I);
+    uint64_t Bits = std::min(BitWidth - ByteIndex * 8, uint64_t(8));
+    Bytes[Offset + I].set(Int.extractBitsAsZExtValue(Bits, ByteIndex * 8));
+  }
+}
+void MemoryObject::writeFloat(uint64_t Offset, const APFloat &Float,
+                              const DataLayout &DL) {
+  writeInteger(Offset, Float.bitcastToAPInt(), DL);
+}
+void MemoryObject::writePointer(uint64_t Offset, const Pointer &Ptr,
+                                const DataLayout &DL) {
+  writeInteger(Offset, Ptr.address(), DL);
+  // TODO: provenance
+}
+
+} // namespace llvm::ubi

diff  --git a/llvm/tools/llubi/lib/Context.h b/llvm/tools/llubi/lib/Context.h
new file mode 100644
index 0000000000000..a0153752b4404
--- /dev/null
+++ b/llvm/tools/llubi/lib/Context.h
@@ -0,0 +1,185 @@
+//===--- Context.h - State Tracking for llubi -------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TOOLS_LLUBI_CONTEXT_H
+#define LLVM_TOOLS_LLUBI_CONTEXT_H
+
+#include "Value.h"
+#include "llvm/ADT/DenseMap.h"
+#include "llvm/Analysis/TargetLibraryInfo.h"
+#include "llvm/IR/Module.h"
+#include <map>
+
+namespace llvm::ubi {
+
+enum class MemInitKind {
+  Zeroed,
+  Uninitialized,
+  Poisoned,
+};
+
+enum class MemoryObjectState {
+  // This memory object is accessible.
+  // Valid transitions:
+  //   -> Dead (after the end of lifetime of an alloca)
+  //   -> Freed (after free is called on a heap object)
+  Alive,
+  // This memory object is out of lifetime. It is OK to perform
+  // operations that do not access its content, e.g., getelementptr.
+  // Otherwise, an immediate UB occurs.
+  // Valid transition:
+  //   -> Alive (after the start of lifetime of an alloca)
+  Dead,
+  // This heap memory object has been freed. Any access to it
+  // causes immediate UB. Like dead objects, it is still possible to
+  // perform operations that do not access its content.
+  Freed,
+};
+
+class MemoryObject : public RefCountedBase<MemoryObject> {
+  uint64_t Address;
+  uint64_t Size;
+  SmallVector<Byte, 8> Bytes;
+  StringRef Name;
+  unsigned AS;
+
+  MemoryObjectState State;
+  bool IsConstant = false;
+
+public:
+  MemoryObject(uint64_t Addr, uint64_t Size, StringRef Name, unsigned AS,
+               MemInitKind InitKind);
+  MemoryObject(const MemoryObject &) = delete;
+  MemoryObject(MemoryObject &&) = delete;
+  MemoryObject &operator=(const MemoryObject &) = delete;
+  MemoryObject &operator=(MemoryObject &&) = delete;
+  ~MemoryObject();
+
+  uint64_t getAddress() const { return Address; }
+  uint64_t getSize() const { return Size; }
+  StringRef getName() const { return Name; }
+  unsigned getAddressSpace() const { return AS; }
+  MemoryObjectState getState() const { return State; }
+  bool isConstant() const { return IsConstant; }
+  void setIsConstant(bool C) { IsConstant = C; }
+
+  Byte &operator[](uint64_t Offset) {
+    assert(Offset < Size && "Offset out of bounds");
+    return Bytes[Offset];
+  }
+  void writeRawBytes(uint64_t Offset, const void *Data, uint64_t Length);
+  void writeInteger(uint64_t Offset, const APInt &Int, const DataLayout &DL);
+  void writeFloat(uint64_t Offset, const APFloat &Float, const DataLayout &DL);
+  void writePointer(uint64_t Offset, const Pointer &Ptr, const DataLayout &DL);
+
+  void markAsFreed();
+};
+
+/// An interface for handling events and managing outputs during interpretation.
+/// If the handler returns false from any of the methods, the interpreter will
+/// stop execution immediately.
+class EventHandler {
+public:
+  virtual ~EventHandler() = default;
+
+  virtual bool onInstructionExecuted(Instruction &I, const AnyValue &Result) {
+    return true;
+  }
+  virtual void onUnrecognizedInstruction(Instruction &I) {}
+  virtual void onImmediateUB(StringRef Msg) {}
+  virtual bool onBBJump(Instruction &I, BasicBlock &To) { return true; }
+  virtual bool onFunctionEntry(Function &F, ArrayRef<AnyValue> Args,
+                               CallBase *CallSite) {
+    return true;
+  }
+  virtual bool onFunctionExit(Function &F, const AnyValue &RetVal) {
+    return true;
+  }
+  virtual bool onPrint(StringRef Msg) {
+    outs() << Msg;
+    return true;
+  }
+};
+
+/// The global context for the interpreter.
+/// It tracks global state such as heap memory objects and floating point
+/// environment.
+class Context {
+  // Module
+  LLVMContext &Ctx;
+  Module &M;
+  const DataLayout &DL;
+  const TargetLibraryInfoImpl TLIImpl;
+
+  // Configuration
+  uint64_t MaxMem = 0;
+  uint32_t VScale = 4;
+  uint32_t MaxSteps = 0;
+  uint32_t MaxStackDepth = 256;
+
+  // Memory
+  uint64_t UsedMem = 0;
+  // The addresses of memory objects are monotonically increasing.
+  // For now we don't model the behavior of address reuse, which is common
+  // with stack coloring.
+  uint64_t AllocationBase = 8;
+  std::map<uint64_t, IntrusiveRefCntPtr<MemoryObject>> MemoryObjects;
+
+  // Constants
+  // Use std::map to avoid iterator/reference invalidation.
+  std::map<Constant *, AnyValue> ConstCache;
+  AnyValue getConstantValueImpl(Constant *C);
+
+  // TODO: errno and fpenv
+
+public:
+  explicit Context(Module &M);
+  Context(const Context &) = delete;
+  Context(Context &&) = delete;
+  Context &operator=(const Context &) = delete;
+  Context &operator=(Context &&) = delete;
+  ~Context();
+
+  void setMemoryLimit(uint64_t Max) { MaxMem = Max; }
+  void setVScale(uint32_t VS) { VScale = VS; }
+  void setMaxSteps(uint32_t MS) { MaxSteps = MS; }
+  void setMaxStackDepth(uint32_t Depth) { MaxStackDepth = Depth; }
+  uint64_t getMemoryLimit() const { return MaxMem; }
+  uint32_t getVScale() const { return VScale; }
+  uint32_t getMaxSteps() const { return MaxSteps; }
+  uint32_t getMaxStackDepth() const { return MaxStackDepth; }
+
+  LLVMContext &getContext() const { return Ctx; }
+  const DataLayout &getDataLayout() const { return DL; }
+  const TargetLibraryInfoImpl &getTLIImpl() const { return TLIImpl; }
+  uint32_t getEVL(ElementCount EC) const {
+    if (EC.isScalable())
+      return VScale * EC.getKnownMinValue();
+    return EC.getFixedValue();
+  }
+
+  const AnyValue &getConstantValue(Constant *C);
+  IntrusiveRefCntPtr<MemoryObject> allocate(uint64_t Size, uint64_t Align,
+                                            StringRef Name, unsigned AS,
+                                            MemInitKind InitKind);
+  bool free(uint64_t Address);
+  // Derive a pointer from a memory object with offset 0.
+  // Please use Pointer's interface for further manipulations.
+  Pointer deriveFromMemoryObject(IntrusiveRefCntPtr<MemoryObject> Obj);
+
+  /// Execute the function \p F with arguments \p Args, and store the return
+  /// value in \p RetVal if the function is not void.
+  /// Returns true if the function executed successfully. False indicates an
+  /// error occurred during execution.
+  bool runFunction(Function &F, ArrayRef<AnyValue> Args, AnyValue &RetVal,
+                   EventHandler &Handler);
+};
+
+} // namespace llvm::ubi
+
+#endif

diff  --git a/llvm/tools/llubi/lib/Interpreter.cpp b/llvm/tools/llubi/lib/Interpreter.cpp
new file mode 100644
index 0000000000000..aaad8fb15262e
--- /dev/null
+++ b/llvm/tools/llubi/lib/Interpreter.cpp
@@ -0,0 +1,202 @@
+//===- Interpreter.cpp - Interpreter Loop for llubi -----------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements the evaluation loop for each kind of instruction.
+//
+//===----------------------------------------------------------------------===//
+
+#include "Context.h"
+#include "Value.h"
+#include "llvm/IR/InstVisitor.h"
+#include "llvm/Support/Allocator.h"
+
+namespace llvm::ubi {
+
+enum class FrameState {
+  // It is about to enter the function.
+  // Valid transition:
+  //   -> Running
+  Entry,
+  // It is executing instructions inside the function.
+  // Valid transitions:
+  //   -> Pending (on call)
+  //   -> Exit (on return)
+  Running,
+  // It is about to enter a callee or handle return value from the callee.
+  // Valid transitions:
+  //   -> Running (after returning from callee)
+  Pending,
+  // It is about to return the control to the caller.
+  Exit,
+};
+
+/// Context for a function call.
+/// This struct maintains the state during the execution of a function,
+/// including the control flow, values of executed instructions, and stack
+/// objects.
+struct Frame {
+  Function &Func;
+  Frame *LastFrame;
+  CallBase *CallSite;
+  ArrayRef<AnyValue> Args;
+  AnyValue &RetVal;
+
+  TargetLibraryInfo TLI;
+  BasicBlock *BB;
+  BasicBlock::iterator PC;
+  FrameState State = FrameState::Entry;
+  // Stack objects allocated in this frame. They will be automatically freed
+  // when the function returns.
+  SmallVector<IntrusiveRefCntPtr<MemoryObject>> Allocas;
+  // Values of arguments and executed instructions in this function.
+  DenseMap<Value *, AnyValue> ValueMap;
+
+  // Reserved for in-flight subroutines.
+  SmallVector<AnyValue> CalleeArgs;
+  AnyValue CalleeRetVal;
+
+  Frame(Function &F, CallBase *CallSite, Frame *LastFrame,
+        ArrayRef<AnyValue> Args, AnyValue &RetVal,
+        const TargetLibraryInfoImpl &TLIImpl)
+      : Func(F), LastFrame(LastFrame), CallSite(CallSite), Args(Args),
+        RetVal(RetVal), TLI(TLIImpl, &F) {
+    assert((Args.size() == F.arg_size() ||
+            (F.isVarArg() && Args.size() >= F.arg_size())) &&
+           "Expected enough arguments to call the function.");
+    BB = &Func.getEntryBlock();
+    PC = BB->begin();
+    for (Argument &Arg : F.args())
+      ValueMap[&Arg] = Args[Arg.getArgNo()];
+  }
+};
+
+/// Instruction executor using the visitor pattern.
+/// visit* methods return true on success, false on error.
+/// Unlike the Context class that manages the global state,
+/// InstExecutor only maintains the state for call frames.
+class InstExecutor : public InstVisitor<InstExecutor, bool> {
+  Context &Ctx;
+  EventHandler &Handler;
+  std::list<Frame> CallStack;
+  // Used to indicate whether the interpreter should continue execution.
+  bool Status;
+  Frame *CurrentFrame = nullptr;
+  AnyValue None;
+
+  void reportImmediateUB(StringRef Msg) {
+    // Check if we have already reported an immediate UB.
+    if (!Status)
+      return;
+    Status = false;
+    // TODO: Provide stack trace information.
+    Handler.onImmediateUB(Msg);
+  }
+
+  const AnyValue &getValue(Value *V) {
+    if (auto *C = dyn_cast<Constant>(V))
+      return Ctx.getConstantValue(C);
+    return CurrentFrame->ValueMap.at(V);
+  }
+
+public:
+  InstExecutor(Context &C, EventHandler &H, Function &F,
+               ArrayRef<AnyValue> Args, AnyValue &RetVal)
+      : Ctx(C), Handler(H), Status(true) {
+    CallStack.emplace_back(F, /*CallSite=*/nullptr, /*LastFrame=*/nullptr, Args,
+                           RetVal, Ctx.getTLIImpl());
+  }
+  bool visitReturnInst(ReturnInst &RI) {
+    if (auto *RV = RI.getReturnValue())
+      CurrentFrame->RetVal = getValue(RV);
+    CurrentFrame->State = FrameState::Exit;
+    return Handler.onInstructionExecuted(RI, None);
+  }
+  bool visitInstruction(Instruction &I) {
+    Handler.onUnrecognizedInstruction(I);
+    return false;
+  }
+
+  /// This function implements the main interpreter loop.
+  /// It handles function calls in a non-recursive manner to avoid stack
+  /// overflows.
+  bool runMainLoop() {
+    uint32_t MaxSteps = Ctx.getMaxSteps();
+    uint32_t Steps = 0;
+    while (Status && !CallStack.empty()) {
+      Frame &Top = CallStack.back();
+      CurrentFrame = &Top;
+      if (Top.State == FrameState::Entry) {
+        Handler.onFunctionEntry(Top.Func, Top.Args, Top.CallSite);
+        // TODO: Handle arg attributes
+      } else {
+        assert(Top.State == FrameState::Pending &&
+               "Expected to return from a callee.");
+      }
+
+      Top.State = FrameState::Running;
+      // Interpreter loop inside a function
+      while (Status) {
+        assert(Top.State == FrameState::Running &&
+               "Expected to be in running state.");
+        if (MaxSteps != 0 && Steps >= MaxSteps) {
+          reportImmediateUB("Exceeded maximum number of execution steps.");
+          break;
+        }
+        ++Steps;
+
+        Instruction &I = *Top.PC;
+        if (!visit(&I)) {
+          Status = false;
+          break;
+        }
+        if (!Status)
+          break;
+
+        if (Top.State != FrameState::Pending && !I.isTerminator()) {
+          if (I.getType()->isVoidTy())
+            Handler.onInstructionExecuted(I, None);
+          else
+            Handler.onInstructionExecuted(I, Top.ValueMap.at(&I));
+        }
+
+        // A function call or return has occurred.
+        // We need to exit the inner loop and switch to a different frame.
+        if (Top.State != FrameState::Running)
+          break;
+
+        // Otherwise, move to the next instruction if it is not a terminator.
+        // For terminators, the PC is updated in the visit* method.
+        if (!I.isTerminator())
+          ++Top.PC;
+      }
+
+      if (!Status)
+        break;
+
+      if (Top.State == FrameState::Exit) {
+        assert((Top.Func.getReturnType()->isVoidTy() || !Top.RetVal.isNone()) &&
+               "Expected return value to be set on function exit.");
+        // TODO: Handle retval attributes
+        Handler.onFunctionExit(Top.Func, Top.RetVal);
+        CallStack.pop_back();
+      } else {
+        assert(Top.State == FrameState::Pending &&
+               "Expected to enter a callee.");
+      }
+    }
+    return Status;
+  }
+};
+
+bool Context::runFunction(Function &F, ArrayRef<AnyValue> Args,
+                          AnyValue &RetVal, EventHandler &Handler) {
+  InstExecutor Executor(*this, Handler, F, Args, RetVal);
+  return Executor.runMainLoop();
+}
+
+} // namespace llvm::ubi

diff  --git a/llvm/tools/llubi/lib/Value.cpp b/llvm/tools/llubi/lib/Value.cpp
new file mode 100644
index 0000000000000..57cd94ef0f7bb
--- /dev/null
+++ b/llvm/tools/llubi/lib/Value.cpp
@@ -0,0 +1,230 @@
+//===- Value.cpp - Value Representation for llubi -------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements utility functions for the value representation.
+//
+//===----------------------------------------------------------------------===//
+
+#include "Value.h"
+#include "Context.h"
+#include "llvm/ADT/SmallString.h"
+
+namespace llvm::ubi {
+
+void Pointer::print(raw_ostream &OS) const {
+  SmallString<32> AddrStr;
+  Address.toStringUnsigned(AddrStr, 16);
+  OS << "ptr 0x" << AddrStr << " [";
+  if (Obj) {
+    OS << Obj->getName();
+    if (Offset)
+      OS << " + " << Offset;
+  } else {
+    OS << "dangling";
+  }
+  OS << "]";
+}
+
+AnyValue Pointer::null(unsigned BitWidth) {
+  return AnyValue(Pointer(nullptr, APInt::getZero(BitWidth), 0));
+}
+
+void AnyValue::print(raw_ostream &OS) const {
+  switch (Kind) {
+  case StorageKind::Integer:
+    if (IntVal.getBitWidth() == 1) {
+      OS << (IntVal.getBoolValue() ? "T" : "F");
+      break;
+    }
+    OS << "i" << IntVal.getBitWidth() << ' ' << IntVal;
+    break;
+  case StorageKind::Float:
+    OS << FloatVal;
+    break;
+  case StorageKind::Pointer:
+    PtrVal.print(OS);
+    break;
+  case StorageKind::Poison:
+    OS << "poison";
+    break;
+  case StorageKind::None:
+    OS << "none";
+    break;
+  case StorageKind::Aggregate:
+    OS << "{ ";
+    for (size_t I = 0, E = AggVal.size(); I != E; ++I) {
+      if (I != 0)
+        OS << ", ";
+      AggVal[I].print(OS);
+    }
+    OS << " }";
+    break;
+  }
+}
+
+void AnyValue::destroy() {
+  switch (Kind) {
+  case StorageKind::Integer:
+    IntVal.~APInt();
+    break;
+  case StorageKind::Float:
+    FloatVal.~APFloat();
+    break;
+  case StorageKind::Pointer:
+    PtrVal.~Pointer();
+    break;
+  case StorageKind::Poison:
+  case StorageKind::None:
+    break;
+  case StorageKind::Aggregate:
+    AggVal.~vector();
+    break;
+  }
+}
+
+AnyValue::AnyValue(const AnyValue &Other) : Kind(Other.Kind) {
+  switch (Other.Kind) {
+  case StorageKind::Integer:
+    new (&IntVal) APInt(Other.IntVal);
+    break;
+  case StorageKind::Float:
+    new (&FloatVal) APFloat(Other.FloatVal);
+    break;
+  case StorageKind::Pointer:
+    new (&PtrVal) Pointer(Other.PtrVal);
+    break;
+  case StorageKind::Poison:
+  case StorageKind::None:
+    break;
+  case StorageKind::Aggregate:
+    new (&AggVal) std::vector<AnyValue>(Other.AggVal);
+    break;
+  }
+}
+AnyValue::AnyValue(AnyValue &&Other) : Kind(Other.Kind) {
+  switch (Other.Kind) {
+  case StorageKind::Integer:
+    new (&IntVal) APInt(std::move(Other.IntVal));
+    break;
+  case StorageKind::Float:
+    new (&FloatVal) APFloat(std::move(Other.FloatVal));
+    break;
+  case StorageKind::Pointer:
+    new (&PtrVal) Pointer(std::move(Other.PtrVal));
+    break;
+  case StorageKind::Poison:
+  case StorageKind::None:
+    break;
+  case StorageKind::Aggregate:
+    new (&AggVal) std::vector<AnyValue>(std::move(Other.AggVal));
+    break;
+  }
+}
+
+AnyValue &AnyValue::operator=(const AnyValue &Other) {
+  if (&Other == this)
+    return *this;
+
+  destroy();
+  Kind = Other.Kind;
+  switch (Other.Kind) {
+  case StorageKind::Integer:
+    new (&IntVal) APInt(Other.IntVal);
+    break;
+  case StorageKind::Float:
+    new (&FloatVal) APFloat(Other.FloatVal);
+    break;
+  case StorageKind::Pointer:
+    new (&PtrVal) Pointer(Other.PtrVal);
+    break;
+  case StorageKind::Poison:
+  case StorageKind::None:
+    break;
+  case StorageKind::Aggregate:
+    new (&AggVal) std::vector<AnyValue>(Other.AggVal);
+    break;
+  }
+
+  return *this;
+}
+AnyValue &AnyValue::operator=(AnyValue &&Other) {
+  if (&Other == this)
+    return *this;
+  destroy();
+  Kind = Other.Kind;
+  switch (Other.Kind) {
+  case StorageKind::Integer:
+    new (&IntVal) APInt(std::move(Other.IntVal));
+    break;
+  case StorageKind::Float:
+    new (&FloatVal) APFloat(std::move(Other.FloatVal));
+    break;
+  case StorageKind::Pointer:
+    new (&PtrVal) Pointer(std::move(Other.PtrVal));
+    break;
+  case StorageKind::Poison:
+  case StorageKind::None:
+    break;
+  case StorageKind::Aggregate:
+    new (&AggVal) std::vector<AnyValue>(std::move(Other.AggVal));
+    break;
+  }
+
+  return *this;
+}
+
+AnyValue AnyValue::getPoisonValue(Context &Ctx, Type *Ty) {
+  if (Ty->isFloatingPointTy() || Ty->isIntegerTy() || Ty->isPointerTy())
+    return AnyValue::poison();
+  if (auto *VecTy = dyn_cast<VectorType>(Ty)) {
+    uint32_t NumElements = Ctx.getEVL(VecTy->getElementCount());
+    return AnyValue(std::vector<AnyValue>(NumElements, AnyValue::poison()));
+  }
+  if (auto *ArrTy = dyn_cast<ArrayType>(Ty)) {
+    uint64_t NumElements = ArrTy->getNumElements();
+    return AnyValue(std::vector<AnyValue>(
+        NumElements, getPoisonValue(Ctx, ArrTy->getElementType())));
+  }
+  if (auto *StructTy = dyn_cast<StructType>(Ty)) {
+    std::vector<AnyValue> Elements;
+    Elements.reserve(StructTy->getNumElements());
+    for (uint32_t I = 0, E = StructTy->getNumElements(); I != E; ++I)
+      Elements.push_back(getPoisonValue(Ctx, StructTy->getElementType(I)));
+    return AnyValue(std::move(Elements));
+  }
+  llvm_unreachable("Unsupported type");
+}
+AnyValue AnyValue::getNullValue(Context &Ctx, Type *Ty) {
+  if (Ty->isIntegerTy())
+    return AnyValue(APInt::getZero(Ty->getIntegerBitWidth()));
+  if (Ty->isFloatingPointTy())
+    return AnyValue(APFloat::getZero(Ty->getFltSemantics()));
+  if (Ty->isPointerTy())
+    return Pointer::null(
+        Ctx.getDataLayout().getPointerSizeInBits(Ty->getPointerAddressSpace()));
+  if (auto *VecTy = dyn_cast<VectorType>(Ty)) {
+    uint32_t NumElements = Ctx.getEVL(VecTy->getElementCount());
+    return AnyValue(std::vector<AnyValue>(
+        NumElements, getNullValue(Ctx, VecTy->getElementType())));
+  }
+  if (auto *ArrTy = dyn_cast<ArrayType>(Ty)) {
+    uint64_t NumElements = ArrTy->getNumElements();
+    return AnyValue(std::vector<AnyValue>(
+        NumElements, getNullValue(Ctx, ArrTy->getElementType())));
+  }
+  if (auto *StructTy = dyn_cast<StructType>(Ty)) {
+    std::vector<AnyValue> Elements;
+    Elements.reserve(StructTy->getNumElements());
+    for (uint32_t I = 0, E = StructTy->getNumElements(); I != E; ++I)
+      Elements.push_back(getNullValue(Ctx, StructTy->getElementType(I)));
+    return AnyValue(std::move(Elements));
+  }
+  llvm_unreachable("Unsupported type");
+}
+
+} // namespace llvm::ubi

diff  --git a/llvm/tools/llubi/lib/Value.h b/llvm/tools/llubi/lib/Value.h
new file mode 100644
index 0000000000000..0828941538798
--- /dev/null
+++ b/llvm/tools/llubi/lib/Value.h
@@ -0,0 +1,152 @@
+//===--- Value.h - Value Representation for llubi ---------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_TOOLS_LLUBI_VALUE_H
+#define LLVM_TOOLS_LLUBI_VALUE_H
+
+#include "llvm/ADT/APFloat.h"
+#include "llvm/ADT/APInt.h"
+#include "llvm/ADT/IntrusiveRefCntPtr.h"
+#include "llvm/IR/Type.h"
+#include "llvm/Support/raw_ostream.h"
+
+namespace llvm::ubi {
+
+class MemoryObject;
+class Context;
+class AnyValue;
+
+enum class ByteKind : uint8_t {
+  // A concrete byte with a known value.
+  Concrete,
+  // An uninitialized byte. Each load from an uninitialized byte yields
+  // a nondeterministic value.
+  Undef,
+  // A poisoned byte. It occurs when the program stores a poison value to
+  // memory,
+  // or when a memory object is dead.
+  Poison,
+};
+
+struct Byte {
+  uint8_t Value;
+  ByteKind Kind : 2;
+  // TODO: provenance
+
+  void set(uint8_t V) {
+    Value = V;
+    Kind = ByteKind::Concrete;
+  }
+};
+
+// TODO: Byte
+enum class StorageKind {
+  Integer,
+  Float,
+  Pointer,
+  Poison,
+  None,      // Placeholder for void type
+  Aggregate, // Struct, Array or Vector
+};
+
+class Pointer {
+  // The underlying memory object. It can be null for invalid or dangling
+  // pointers.
+  IntrusiveRefCntPtr<MemoryObject> Obj;
+  // The address of the pointer. The bit width is determined by
+  // DataLayout::getPointerSizeInBits.
+  APInt Address;
+  // The offset within the memory object.
+  uint64_t Offset;
+  // TODO: modeling inrange(Start, End) attribute
+
+public:
+  explicit Pointer(IntrusiveRefCntPtr<MemoryObject> Obj, const APInt &Address,
+                   uint64_t Offset)
+      : Obj(std::move(Obj)), Address(Address), Offset(Offset) {}
+  static AnyValue null(unsigned BitWidth);
+  void print(raw_ostream &OS) const;
+  const APInt &address() const { return Address; }
+  MemoryObject *getMemoryObject() const { return Obj.get(); }
+};
+
+// Runtime representation of the actual value of an LLVM value.
+// We don't model undef values here (except for byte types).
+class [[nodiscard]] AnyValue {
+  StorageKind Kind;
+  union {
+    APInt IntVal;
+    APFloat FloatVal;
+    Pointer PtrVal;
+    std::vector<AnyValue> AggVal;
+  };
+
+  struct PoisonTag {};
+  void destroy();
+
+public:
+  AnyValue() : Kind(StorageKind::None) {}
+  explicit AnyValue(PoisonTag) : Kind(StorageKind::Poison) {}
+  AnyValue(APInt Val) : Kind(StorageKind::Integer), IntVal(std::move(Val)) {}
+  AnyValue(APFloat Val) : Kind(StorageKind::Float), FloatVal(std::move(Val)) {}
+  AnyValue(Pointer Val) : Kind(StorageKind::Pointer), PtrVal(std::move(Val)) {}
+  AnyValue(std::vector<AnyValue> Val)
+      : Kind(StorageKind::Aggregate), AggVal(std::move(Val)) {}
+  AnyValue(const AnyValue &Other);
+  AnyValue(AnyValue &&Other);
+  AnyValue &operator=(const AnyValue &);
+  AnyValue &operator=(AnyValue &&);
+  ~AnyValue() { destroy(); }
+
+  void print(raw_ostream &OS) const;
+
+  static AnyValue poison() { return AnyValue(PoisonTag{}); }
+  static AnyValue getPoisonValue(Context &Ctx, Type *Ty);
+  static AnyValue getNullValue(Context &Ctx, Type *Ty);
+
+  bool isNone() const { return Kind == StorageKind::None; }
+  bool isPoison() const { return Kind == StorageKind::Poison; }
+
+  const APInt &asInteger() const {
+    assert(Kind == StorageKind::Integer && "Expect an integer value");
+    return IntVal;
+  }
+
+  const APFloat &asFloat() const {
+    assert(Kind == StorageKind::Float && "Expect a float value");
+    return FloatVal;
+  }
+
+  const Pointer &asPointer() const {
+    assert(Kind == StorageKind::Pointer && "Expect a pointer value");
+    return PtrVal;
+  }
+
+  const std::vector<AnyValue> &asAggregate() const {
+    assert(Kind == StorageKind::Aggregate &&
+           "Expect an aggregate/vector value");
+    return AggVal;
+  }
+
+  // Helper function for C++17 structured bindings.
+  template <size_t I> const AnyValue &get() const {
+    assert(Kind == StorageKind::Aggregate &&
+           "Expect an aggregate/vector value");
+    assert(I < AggVal.size() && "Index out of bounds");
+    return AggVal[I];
+  }
+};
+
+inline raw_ostream &operator<<(raw_ostream &OS, const AnyValue &V) {
+  V.print(OS);
+  return OS;
+}
+
+} // namespace llvm::ubi
+
+#endif
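
Note on the `AnyValue` storage pattern: the class above keeps a `StorageKind` tag next to a union of non-trivial types, so the copy/move operations in `Value.cpp` must construct the active member with placement new and destroy it explicitly. The following is a minimal standalone sketch of that pattern, not code from the patch. `MiniValue`, its `Kind` enum, and the helpers `copyFrom`/`moveFrom` are hypothetical names; `std::string` and `std::vector` stand in for `APInt`/`APFloat`/`Pointer` and the aggregate case. It assumes a C++17 compiler (for `std::destroy_at` and vectors of incomplete types).

```cpp
#include <cassert>
#include <memory>
#include <new>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for AnyValue: a kind tag plus a union whose members
// have non-trivial constructors and destructors.
class MiniValue {
public:
  enum class Kind { None, Text, List };

private:
  Kind K;
  union {
    std::string Text;            // stands in for APInt/APFloat/Pointer
    std::vector<MiniValue> List; // stands in for the aggregate case
  };

  // End the lifetime of whichever member is currently active.
  void destroy() {
    if (K == Kind::Text)
      std::destroy_at(&Text);
    else if (K == Kind::List)
      std::destroy_at(&List);
  }

  // Start the lifetime of the member selected by K.
  // Both helpers require that no member is currently active.
  void copyFrom(const MiniValue &O) {
    if (K == Kind::Text)
      new (&Text) std::string(O.Text);
    else if (K == Kind::List)
      new (&List) std::vector<MiniValue>(O.List);
  }
  void moveFrom(MiniValue &&O) {
    if (K == Kind::Text)
      new (&Text) std::string(std::move(O.Text));
    else if (K == Kind::List)
      new (&List) std::vector<MiniValue>(std::move(O.List));
  }

public:
  MiniValue() : K(Kind::None) {}
  MiniValue(std::string S) : K(Kind::Text), Text(std::move(S)) {}
  MiniValue(std::vector<MiniValue> L) : K(Kind::List), List(std::move(L)) {}
  MiniValue(const MiniValue &O) : K(O.K) { copyFrom(O); }
  MiniValue(MiniValue &&O) : K(O.K) { moveFrom(std::move(O)); }
  ~MiniValue() { destroy(); }

  MiniValue &operator=(const MiniValue &O) {
    if (&O == this)
      return *this;
    destroy(); // the old member may be of a different kind
    K = O.K;
    copyFrom(O);
    return *this;
  }
  MiniValue &operator=(MiniValue &&O) {
    if (&O == this)
      return *this;
    destroy();
    K = O.K;
    moveFrom(std::move(O));
    return *this;
  }

  Kind kind() const { return K; }
  const std::string &text() const {
    assert(K == Kind::Text && "Expect a text value");
    return Text;
  }
  const std::vector<MiniValue> &list() const {
    assert(K == Kind::List && "Expect a list value");
    return List;
  }
};
```

As in the patch, assignment destroys the old member before constructing the new one because the two sides may hold different kinds; the self-assignment check keeps `destroy()` from invalidating the source.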

diff  --git a/llvm/tools/llubi/llubi.cpp b/llvm/tools/llubi/llubi.cpp
new file mode 100644
index 0000000000000..67ab01eca89fe
--- /dev/null
+++ b/llvm/tools/llubi/llubi.cpp
@@ -0,0 +1,239 @@
+//===------------- llubi.cpp - LLVM UB-aware Interpreter --------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This utility provides a UB-aware interpreter for programs in LLVM bitcode.
+// It is not built on top of the existing ExecutionEngine interface, but instead
+// implements its own value representation, state tracking and interpreter loop.
+//
+//===----------------------------------------------------------------------===//
+
+#include "lib/Context.h"
+#include "llvm/Config/llvm-config.h"
+#include "llvm/IR/LLVMContext.h"
+#include "llvm/IR/Module.h"
+#include "llvm/IR/Type.h"
+#include "llvm/IRReader/IRReader.h"
+#include "llvm/Support/CommandLine.h"
+#include "llvm/Support/Format.h"
+#include "llvm/Support/InitLLVM.h"
+#include "llvm/Support/MathExtras.h"
+#include "llvm/Support/SourceMgr.h"
+#include "llvm/Support/WithColor.h"
+#include "llvm/Support/raw_ostream.h"
+
+using namespace llvm;
+
+static cl::opt<std::string> InputFile(cl::desc("<input bitcode>"),
+                                      cl::Positional, cl::init("-"));
+
+static cl::list<std::string> InputArgv(cl::ConsumeAfter,
+                                       cl::desc("<program arguments>..."));
+
+static cl::opt<std::string>
+    EntryFunc("entry-function",
+              cl::desc("Specify the entry function (default = 'main') "
+                       "of the executable"),
+              cl::value_desc("function"), cl::init("main"));
+
+static cl::opt<std::string>
+    FakeArgv0("fake-argv0",
+              cl::desc("Override the 'argv[0]' value passed into the executing"
+                       " program"),
+              cl::value_desc("executable"));
+
+static cl::opt<bool>
+    Verbose("verbose", cl::desc("Print results for each instruction executed."),
+            cl::init(false));
+
+cl::OptionCategory InterpreterCategory("Interpreter Options");
+
+static cl::opt<unsigned> MaxMem(
+    "max-mem",
+    cl::desc("Max amount of memory (in bytes) that can be allocated by the"
+             " program, including stack, heap, and global variables."
+             " Set to 0 to disable the limit."),
+    cl::value_desc("N"), cl::init(0), cl::cat(InterpreterCategory));
+
+static cl::opt<unsigned>
+    MaxSteps("max-steps",
+             cl::desc("Max number of instructions executed."
+                      " Set to 0 to disable the limit."),
+             cl::value_desc("N"), cl::init(0), cl::cat(InterpreterCategory));
+
+static cl::opt<unsigned> MaxStackDepth(
+    "max-stack-depth",
+    cl::desc("Max stack depth (default = 256). Set to 0 to disable the limit."),
+    cl::value_desc("N"), cl::init(256), cl::cat(InterpreterCategory));
+
+static cl::opt<unsigned>
+    VScale("vscale", cl::desc("The value of llvm.vscale (default = 4)"),
+           cl::value_desc("N"), cl::init(4), cl::cat(InterpreterCategory));
+
+class VerboseEventHandler : public ubi::EventHandler {
+public:
+  bool onInstructionExecuted(Instruction &I,
+                             const ubi::AnyValue &Result) override {
+    if (Result.isNone()) {
+      errs() << I << '\n';
+    } else {
+      errs() << I << " => " << Result << '\n';
+    }
+
+    return true;
+  }
+
+  void onImmediateUB(StringRef Msg) override {
+    errs() << "Immediate UB detected: " << Msg << '\n';
+  }
+
+  bool onBBJump(Instruction &I, BasicBlock &To) override {
+    errs() << I << " jump to ";
+    To.printAsOperand(errs(), /*PrintType=*/false);
+    return true;
+  }
+
+  bool onFunctionEntry(Function &F, ArrayRef<ubi::AnyValue> Args,
+                       CallBase *CallSite) override {
+    errs() << "Entering function: " << F.getName() << '\n';
+    size_t ArgSize = F.arg_size();
+    for (auto &&[Idx, Arg] : enumerate(Args)) {
+      if (Idx >= ArgSize)
+        errs() << "  vaarg[" << (Idx - ArgSize) << "] = " << Arg << '\n';
+      else
+        errs() << "  " << *F.getArg(Idx) << " = " << Arg << '\n';
+    }
+    return true;
+  }
+
+  bool onFunctionExit(Function &F, const ubi::AnyValue &RetVal) override {
+    errs() << "Exiting function: " << F.getName() << '\n';
+    return true;
+  }
+
+  void onUnrecognizedInstruction(Instruction &I) override {
+    errs() << "Unrecognized instruction: " << I << '\n';
+  }
+};
+
+int main(int argc, char **argv) {
+  InitLLVM X(argc, argv);
+
+  cl::ParseCommandLineOptions(argc, argv, "llvm ub-aware interpreter\n");
+
+  if (EntryFunc.empty()) {
+    WithColor::error() << "--entry-function name cannot be empty\n";
+    return 1;
+  }
+
+  LLVMContext Context;
+
+  // Load the bitcode...
+  SMDiagnostic Err;
+  std::unique_ptr<Module> Owner = parseIRFile(InputFile, Err, Context);
+  Module *Mod = Owner.get();
+  if (!Mod) {
+    Err.print(argv[0], errs());
+    return 1;
+  }
+
+  // If the user specifically requested an argv[0] to pass into the program,
+  // do it now.
+  if (!FakeArgv0.empty()) {
+    InputFile = static_cast<std::string>(FakeArgv0);
+  } else {
+    // Otherwise, if the executable name has a .bc suffix, strip it off, since
+    // it might confuse the program.
+    if (StringRef(InputFile).ends_with(".bc"))
+      InputFile.erase(InputFile.length() - 3);
+  }
+
+  // Add the module's name to the start of the vector of arguments to main().
+  InputArgv.insert(InputArgv.begin(), InputFile);
+
+  // Initialize the execution context and set parameters.
+  ubi::Context Ctx(*Mod);
+  Ctx.setMemoryLimit(MaxMem);
+  Ctx.setVScale(VScale);
+  Ctx.setMaxSteps(MaxSteps);
+  Ctx.setMaxStackDepth(MaxStackDepth);
+
+  // Call the entry function from Mod as if its signature were:
+  //   int main (int argc, char **argv)
+  // using the contents of InputArgv to determine argc & argv
+  Function *EntryFn = Mod->getFunction(EntryFunc);
+  if (!EntryFn) {
+    WithColor::error() << '\'' << EntryFunc
+                       << "\' function not found in module.\n";
+    return 1;
+  }
+  TargetLibraryInfo TLI(Ctx.getTLIImpl());
+  Type *IntTy = IntegerType::get(Ctx.getContext(), TLI.getIntSize());
+  auto *MainFuncTy = FunctionType::get(
+      IntTy, {IntTy, PointerType::getUnqual(Ctx.getContext())}, false);
+  SmallVector<ubi::AnyValue> Args;
+  if (EntryFn->getFunctionType() == MainFuncTy) {
+    Args.push_back(
+        Ctx.getConstantValue(ConstantInt::get(IntTy, InputArgv.size())));
+
+    uint32_t PtrSize = Ctx.getDataLayout().getPointerSize();
+    uint64_t PtrsSize = PtrSize * (InputArgv.size() + 1);
+    auto ArgvPtrsMem = Ctx.allocate(PtrsSize, 8, "argv",
+                                    /*AS=*/0, ubi::MemInitKind::Zeroed);
+    if (!ArgvPtrsMem) {
+      WithColor::error() << "Failed to allocate memory for argv pointers.\n";
+      return 1;
+    }
+    for (const auto &[Idx, Arg] : enumerate(InputArgv)) {
+      uint64_t Size = Arg.length() + 1;
+      auto ArgvStrMem = Ctx.allocate(Size, 8, "argv_str",
+                                     /*AS=*/0, ubi::MemInitKind::Zeroed);
+      if (!ArgvStrMem) {
+        WithColor::error() << "Failed to allocate memory for argv strings.\n";
+        return 1;
+      }
+      ubi::Pointer ArgPtr = Ctx.deriveFromMemoryObject(ArgvStrMem);
+      ArgvStrMem->writeRawBytes(0, Arg.c_str(), Arg.length());
+      ArgvPtrsMem->writePointer(Idx * PtrSize, ArgPtr, Ctx.getDataLayout());
+    }
+    Args.push_back(Ctx.deriveFromMemoryObject(ArgvPtrsMem));
+  } else if (!EntryFn->arg_empty()) {
+    // If the signature does not match (e.g., llvm-reduce changes the signature
+    // of main), pass null values for all arguments.
+    WithColor::warning()
+        << "The signature of function '" << EntryFunc
+        << "' does not match 'int main(int, char**)', passing null values for "
+           "all arguments.\n";
+    Args.reserve(EntryFn->arg_size());
+    for (Argument &Arg : EntryFn->args())
+      Args.push_back(ubi::AnyValue::getNullValue(Ctx, Arg.getType()));
+  }
+
+  ubi::EventHandler NoopHandler;
+  VerboseEventHandler VerboseHandler;
+  ubi::AnyValue RetVal;
+  if (!Ctx.runFunction(*EntryFn, Args, RetVal,
+                       Verbose ? VerboseHandler : NoopHandler)) {
+    WithColor::error() << "Execution of function '" << EntryFunc
+                       << "' failed.\n";
+    return 1;
+  }
+
+  // If the function returns an integer, return that as the exit code.
+  if (EntryFn->getReturnType()->isIntegerTy()) {
+    assert(!RetVal.isNone() && "Expected a return value from entry function");
+    if (RetVal.isPoison()) {
+      WithColor::error() << "Execution of function '" << EntryFunc
+                         << "' resulted in poison return value.\n";
+      return 1;
+    }
+    APInt Result = RetVal.asInteger();
+    return (int)Result.extractBitsAsZExtValue(
+        std::min(Result.getBitWidth(), 8U), 0);
+  }
+  return 0;
+}
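
Note on the exit code: the driver above returns only the low `min(bitwidth, 8)` bits of the entry function's integer result, zero-extended. The sketch below models that truncation in plain C++ without LLVM headers. `exitCodeFromReturn` is a hypothetical helper, not part of the patch, and it assumes the return value fits in 64 bits and that `BitWidth` is at least 1.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical model of the driver's exit-code computation: keep the low
// min(BitWidth, 8) bits of the returned integer and zero-extend them.
int exitCodeFromReturn(std::uint64_t RawBits, unsigned BitWidth) {
  unsigned NumBits = std::min(BitWidth, 8U);
  std::uint64_t Mask = (std::uint64_t{1} << NumBits) - 1;
  return static_cast<int>(RawBits & Mask);
}
```

Under this model, an `i32` result of 256 yields exit code 0 and an `i32` result of -1 (all bits set) yields 255, so two runs can only be told apart by exit code when their results differ in the low byte.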


        

