[llvm] [MC,CodeGen] Update .prefalign for symbol-based preferred alignment (PR #184032)
Fangrui Song via llvm-commits
llvm-commits at lists.llvm.org
Mon Mar 2 22:05:26 PST 2026
https://github.com/MaskRay updated https://github.com/llvm/llvm-project/pull/184032
From f42b3383765cfaeea7a0654995bb6939278325f4 Mon Sep 17 00:00:00 2001
From: Fangrui Song <i at maskray.me>
Date: Sat, 28 Feb 2026 13:25:09 -0800
Subject: [PATCH 1/3] [MC,CodeGen] Update .prefalign for symbol-based preferred
alignment
https://discourse.llvm.org/t/rfc-enhancing-function-alignment-attributes/88019/17
The recently-introduced .prefalign only worked when each function was in
its own section (-ffunction-sections), because the section size gave the
function body size needed for the alignment rule.
This led to -ffunction-sections and -fno-function-sections AsmPrinter
differences (#155529), which is rather unusual.
This patch fixes this AsmPrinter difference by extending .prefalign to
accept an end symbol and a required fill operand:
  .prefalign <pref_align>, <end_sym>, nop
  .prefalign <pref_align>, <end_sym>, <fill_byte>
The body size (end_sym_offset - start_offset) determines the alignment:
  body_size == 0              => ComputedAlign = 1
  0 < body_size < pref_align  => ComputedAlign = NextPowerOf2(body_size-1)
  body_size >= pref_align     => ComputedAlign = pref_align
To also enforce a minimum alignment, emit a .p2align before .prefalign.
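As an illustration only, the alignment rule can be modeled outside of LLVM
with plain integers. The sketch below is not the code in this patch: the names
ceilPowerOf2, computedAlign and paddingFor are hypothetical, and the
body_size == 0 case follows the comment added to MCSection.h.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: smallest power of two >= n, for n >= 1.
// It plays the role of NextPowerOf2(n - 1) in the rule above.
// Only called with n below a power-of-two prefAlign, so the shift
// cannot overflow.
inline uint64_t ceilPowerOf2(uint64_t n) {
  uint64_t p = 1;
  while (p < n)
    p <<= 1;
  return p;
}

// Hypothetical model of the rule: the alignment chosen for a body of
// bodySize bytes when the preferred alignment is prefAlign.
inline uint64_t computedAlign(uint64_t bodySize, uint64_t prefAlign) {
  assert(prefAlign != 0 && (prefAlign & (prefAlign - 1)) == 0 &&
         "prefAlign must be a power of 2");
  if (bodySize == 0)
    return 1; // empty body: no alignment requested
  if (bodySize >= prefAlign)
    return prefAlign; // large body: use the preferred alignment
  return ceilPowerOf2(bodySize); // small body: round the size up
}

// Padding bytes needed to advance offset to a multiple of align.
inline uint64_t paddingFor(uint64_t offset, uint64_t align) {
  return (align - offset % align) % align;
}
```

Under this model, a 5-byte body with pref_align 16 is aligned to 8, so it
cannot straddle an 8-byte boundary, while a 100-byte body is simply aligned
to 16.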
The fill operand is required: `nop` generates target-appropriate NOP
instructions via writeNopData, while an integer in [0,255] fills the
padding with that byte value.
In ELFObjectWriter::writeSectionHeader, sh_addralign is set to the
maximum of regular alignment values and ComputedAlign over all
FT_PrefAlign fragments.
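The sh_addralign computation can be pictured with a small standalone model.
This is a sketch under simplifying assumptions, not the ELFObjectWriter code:
alignments are plain integers, and sectionAddrAlign is a hypothetical name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: sectionAlign is the section's regular alignment (from
// .p2align and similar directives); computedAligns holds the ComputedAlign
// of every FT_PrefAlign fragment in the section. The result is the value
// that would be written to sh_addralign.
inline uint64_t sectionAddrAlign(uint64_t sectionAlign,
                                 const std::vector<uint64_t> &computedAligns) {
  assert(sectionAlign != 0 && (sectionAlign & (sectionAlign - 1)) == 0 &&
         "sectionAlign must be a power of 2");
  uint64_t result = sectionAlign;
  for (uint64_t a : computedAligns)
    result = std::max(result, a);
  return result;
}
```

A section without any .prefalign keeps its regular alignment, and a single
large function can raise the alignment of the whole section.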
Initialize MCSection::CurFragList to nullptr and add a null check
to skip ELFObjectWriter-created sections like .strtab/.symtab
that never receive changeSection calls.
---
llvm/docs/Extensions.rst | 32 ++--
llvm/include/llvm/MC/MCAssembler.h | 1 +
llvm/include/llvm/MC/MCObjectStreamer.h | 3 +-
llvm/include/llvm/MC/MCSection.h | 70 +++++--
llvm/include/llvm/MC/MCStreamer.h | 3 +-
llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp | 32 ++--
llvm/lib/MC/ELFObjectWriter.cpp | 17 +-
llvm/lib/MC/MCAsmStreamer.cpp | 14 +-
llvm/lib/MC/MCAssembler.cpp | 63 +++++++
llvm/lib/MC/MCFragment.cpp | 8 +-
llvm/lib/MC/MCObjectStreamer.cpp | 10 +-
llvm/lib/MC/MCParser/AsmParser.cpp | 47 ++++-
llvm/lib/MC/MCSection.cpp | 10 -
llvm/lib/MC/MCStreamer.cpp | 3 +-
.../AArch64/preferred-function-alignment.ll | 9 +-
.../ARM/preferred-function-alignment.ll | 13 +-
.../CodeGen/LoongArch/linker-relaxation.ll | 1 -
llvm/test/CodeGen/PowerPC/code-align.ll | 4 +-
llvm/test/CodeGen/PowerPC/ppc64-calls.ll | 2 +-
llvm/test/CodeGen/SystemZ/vec-perm-14.ll | 6 +-
llvm/test/CodeGen/X86/eh-label.ll | 2 +-
llvm/test/CodeGen/X86/empty-function.ll | 2 +-
llvm/test/CodeGen/X86/kcfi-arity.ll | 3 +-
.../X86/kcfi-patchable-function-prefix.ll | 14 +-
llvm/test/CodeGen/X86/kcfi.ll | 3 +-
llvm/test/CodeGen/X86/prefalign.ll | 12 +-
llvm/test/CodeGen/X86/statepoint-invoke.ll | 4 +-
llvm/test/MC/ELF/prefalign-errors.s | 46 ++++-
llvm/test/MC/ELF/prefalign.s | 175 +++++++++---------
llvm/test/MC/RISCV/prefalign.s | 34 ++++
30 files changed, 454 insertions(+), 189 deletions(-)
create mode 100644 llvm/test/MC/RISCV/prefalign.s
diff --git a/llvm/docs/Extensions.rst b/llvm/docs/Extensions.rst
index c8de7f59de5c0..e910c2bdff5e8 100644
--- a/llvm/docs/Extensions.rst
+++ b/llvm/docs/Extensions.rst
@@ -31,17 +31,27 @@ hexadecimal format instead of decimal if desired.
``.prefalign`` directive
------------------------
-The ``.prefalign`` directive sets the preferred alignment for a section,
-and enables the section's final alignment to be set in a way that is
-dependent on the section size (currently only supported with ELF).
-
-If the section size is less than the section's minimum alignment as
-determined using ``.align`` family directives, the section's alignment
-will be equal to its minimum alignment. Otherwise, if the section size is
-between the minimum alignment and the preferred alignment, the section's
-alignment will be equal to the power of 2 greater than or equal to the
-section size. Otherwise, the section's alignment will be equal to the
-preferred alignment.
+.. code-block:: gas
+
+ .prefalign <pref_align>, <end_sym>, nop
+ .prefalign <pref_align>, <end_sym>, <fill_byte>
+
+The ``.prefalign`` directive pads the current location so that the code
+between the directive and ``end_sym`` starts at an alignment that depends
+on the size of that code (currently only supported with ELF). ``pref_align``
+must be a power of 2. ``end_sym`` must be a symbol defined in the same
+section. The fill operand is required: ``nop`` fills the padding with
+target-appropriate NOP instructions, while an integer in ``[0, 255]``
+fills the padding with that byte value.
+
+The alignment is determined by the *body_size* (the number of bytes between
+the padded start and ``end_sym``):
+
+- If *body_size* < *pref_align*: align to the smallest power of 2
+ greater than or equal to *body_size*.
+- If *body_size* ≥ *pref_align*: align to *pref_align*.
+
+To also enforce a minimum alignment, emit a ``.p2align`` before ``.prefalign``.
Machine-specific Assembly Syntax
================================
diff --git a/llvm/include/llvm/MC/MCAssembler.h b/llvm/include/llvm/MC/MCAssembler.h
index dbae271a1c198..a7b865cb16b81 100644
--- a/llvm/include/llvm/MC/MCAssembler.h
+++ b/llvm/include/llvm/MC/MCAssembler.h
@@ -112,6 +112,7 @@ class MCAssembler {
void relaxInstruction(MCFragment &F);
void relaxLEB(MCFragment &F);
void relaxBoundaryAlign(MCBoundaryAlignFragment &BF);
+ void relaxPrefAlign(MCFragment &F);
void relaxDwarfLineAddr(MCFragment &F);
void relaxDwarfCallFrameFragment(MCFragment &F);
void relaxSFrameFragment(MCFragment &DF);
diff --git a/llvm/include/llvm/MC/MCObjectStreamer.h b/llvm/include/llvm/MC/MCObjectStreamer.h
index 5fc17b2b383b1..cb2694b231d5b 100644
--- a/llvm/include/llvm/MC/MCObjectStreamer.h
+++ b/llvm/include/llvm/MC/MCObjectStreamer.h
@@ -139,7 +139,8 @@ class LLVM_ABI MCObjectStreamer : public MCStreamer {
unsigned MaxBytesToEmit = 0) override;
void emitCodeAlignment(Align ByteAlignment, const MCSubtargetInfo *STI,
unsigned MaxBytesToEmit = 0) override;
- void emitPrefAlign(Align Alignment) override;
+ void emitPrefAlign(Align Alignment, const MCSymbol &End, bool EmitNops,
+ uint8_t Fill, const MCSubtargetInfo &STI) override;
void emitValueToOffset(const MCExpr *Offset, unsigned char Value,
SMLoc Loc) override;
void emitDwarfLocDirective(unsigned FileNo, unsigned Line, unsigned Column,
diff --git a/llvm/include/llvm/MC/MCSection.h b/llvm/include/llvm/MC/MCSection.h
index 4c36ed567de62..8dc6a62dc77eb 100644
--- a/llvm/include/llvm/MC/MCSection.h
+++ b/llvm/include/llvm/MC/MCSection.h
@@ -53,6 +53,7 @@ class MCFragment {
FT_Data,
FT_Relaxable,
FT_Align,
+ FT_PrefAlign,
FT_Fill,
FT_LEB,
FT_Nops,
@@ -132,6 +133,19 @@ class MCFragment {
// Value to use for filling padding bytes.
int64_t Fill;
} align;
+ struct {
+ // Symbol denoting the end of the region; always non-null.
+ const MCSymbol *End;
+ // The preferred (maximum) alignment.
+ Align PreferredAlign;
+ // The alignment computed during relaxation.
+ Align ComputedAlign;
+ // If true, fill padding with target NOPs via writeNopData; the STI field
+ // holds the subtarget info needed. If false, fill with Fill byte.
+ bool EmitNops;
+ // Fill byte used when !EmitNops.
+ uint8_t Fill;
+ } prefalign;
struct {
// True if this is a sleb128, false if uleb128.
bool IsSigned;
@@ -268,6 +282,45 @@ class MCFragment {
return u.align.EmitNops;
}
+ //== FT_PrefAlign functions
+ // Initialize an FT_PrefAlign fragment. The region starts at this fragment and
+ // ends at \p End. ComputedAlign is set during relaxation:
+ // body_size == 0 => ComputedAlign = 1
+ // 0 < body_size < PrefAlign => ComputedAlign = NextPowerOf2(body_size-1)
+ // body_size >= PrefAlign => ComputedAlign = PrefAlign
+ void makePrefAlign(Align PrefAlign, const MCSymbol &End, bool EmitNops,
+ uint8_t Fill) {
+ Kind = FT_PrefAlign;
+ u.prefalign.End = &End;
+ u.prefalign.PreferredAlign = PrefAlign;
+ u.prefalign.EmitNops = EmitNops;
+ u.prefalign.Fill = Fill;
+ }
+ const MCSymbol &getPrefAlignEnd() const {
+ assert(Kind == FT_PrefAlign);
+ return *u.prefalign.End;
+ }
+ Align getPrefAlignPreferred() const {
+ assert(Kind == FT_PrefAlign);
+ return u.prefalign.PreferredAlign;
+ }
+ Align getPrefAlignComputed() const {
+ assert(Kind == FT_PrefAlign);
+ return u.prefalign.ComputedAlign;
+ }
+ void setPrefAlignComputed(Align A) {
+ assert(Kind == FT_PrefAlign);
+ u.prefalign.ComputedAlign = A;
+ }
+ bool getPrefAlignEmitNops() const {
+ assert(Kind == FT_PrefAlign);
+ return u.prefalign.EmitNops;
+ }
+ uint8_t getPrefAlignFill() const {
+ assert(Kind == FT_PrefAlign);
+ return u.prefalign.Fill;
+ }
+
//== FT_LEB functions
void makeLEB(bool IsSigned, const MCExpr *Value) {
assert(Kind == FT_Data);
@@ -538,14 +591,14 @@ class LLVM_ABI MCSection {
private:
// At parse time, this holds the fragment list of the current subsection. At
// layout time, this holds the concatenated fragment lists of all subsections.
- FragList *CurFragList;
+ // Null until the first fragment is added to this section.
+ FragList *CurFragList = nullptr;
// In many object file formats, this denotes the section symbol. In Mach-O,
// this denotes an optional temporary label at the section start.
MCSymbol *Begin;
MCSymbol *End = nullptr;
/// The alignment requirement of this section.
Align Alignment;
- MaybeAlign PreferredAlignment;
/// The section index in the assemblers section list.
unsigned Ordinal = 0;
// If not -1u, the first linker-relaxable fragment's order within the
@@ -606,19 +659,6 @@ class LLVM_ABI MCSection {
Alignment = MinAlignment;
}
- Align getPreferredAlignment() const {
- if (!PreferredAlignment || Alignment > *PreferredAlignment)
- return Alignment;
- return *PreferredAlignment;
- }
-
- void ensurePreferredAlignment(Align PrefAlign) {
- if (!PreferredAlignment || PrefAlign > *PreferredAlignment)
- PreferredAlignment = PrefAlign;
- }
-
- Align getAlignmentForObjectFile(uint64_t Size) const;
-
unsigned getOrdinal() const { return Ordinal; }
void setOrdinal(unsigned Value) { Ordinal = Value; }
diff --git a/llvm/include/llvm/MC/MCStreamer.h b/llvm/include/llvm/MC/MCStreamer.h
index 148d69ae5098f..05cd0f214c025 100644
--- a/llvm/include/llvm/MC/MCStreamer.h
+++ b/llvm/include/llvm/MC/MCStreamer.h
@@ -845,7 +845,8 @@ class LLVM_ABI MCStreamer {
virtual void emitCodeAlignment(Align Alignment, const MCSubtargetInfo *STI,
unsigned MaxBytesToEmit = 0);
- virtual void emitPrefAlign(Align A);
+ virtual void emitPrefAlign(Align A, const MCSymbol &End, bool EmitNops,
+ uint8_t Fill, const MCSubtargetInfo &STI);
/// Emit some number of copies of \p Value until the byte offset \p
/// Offset is reached.
diff --git a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
index 083b83567e47f..1d9550c53db09 100644
--- a/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
+++ b/llvm/lib/CodeGen/AsmPrinter/AsmPrinter.cpp
@@ -1046,19 +1046,20 @@ void AsmPrinter::emitFunctionHeader() {
emitLinkage(&F, CurrentFnSym);
if (MAI->hasFunctionAlignment()) {
- // Make sure that the preferred alignment directive (.prefalign) is
- // supported before using it. The preferred alignment directive will not
- // have the intended effect unless function sections are enabled, so check
- // for that as well.
+ Align PrefAlign = MF->getPreferredAlignment();
+ // Use .prefalign when the integrated assembler supports it and the target
+ // has a preferred alignment distinct from the minimum. The end symbol must
+ // be created here, before the function body, so that .prefalign can
+ // reference it; emitFunctionBody will emit the label at the function end.
if (MAI->useIntegratedAssembler() && MAI->hasPreferredAlignment() &&
- TM.getFunctionSections()) {
- Align Alignment = MF->getAlignment();
- Align PrefAlignment = MF->getPreferredAlignment();
- emitAlignment(Alignment, &F);
- if (Alignment != PrefAlignment)
- OutStreamer->emitPrefAlign(PrefAlignment);
+ MF->getAlignment() != PrefAlign) {
+ emitAlignment(MF->getAlignment(), &F);
+ CurrentFnEnd = createTempSymbol("func_end");
+ OutStreamer->emitPrefAlign(PrefAlign, *CurrentFnEnd,
+ /*EmitNops=*/true, /*Fill=*/0,
+ getSubtargetInfo());
} else {
- emitAlignment(MF->getPreferredAlignment(), &F);
+ emitAlignment(PrefAlign, &F);
}
}
@@ -2365,9 +2366,11 @@ void AsmPrinter::emitFunctionBody() {
// SPIR-V supports label instructions only inside a block, not after the
// function body.
if (TT.getObjectFormat() != Triple::SPIRV &&
- (EmitFunctionSize || needFuncLabels(*MF, *this))) {
- // Create a symbol for the end of function.
- CurrentFnEnd = createTempSymbol("func_end");
+ (EmitFunctionSize || needFuncLabels(*MF, *this) || CurrentFnEnd)) {
+ // Create a symbol for the end of function, if not already pre-created
+ // (e.g. for .prefalign directive).
+ if (!CurrentFnEnd)
+ CurrentFnEnd = createTempSymbol("func_end");
OutStreamer->emitLabel(CurrentFnEnd);
}
@@ -3121,6 +3124,7 @@ void AsmPrinter::SetupMachineFunction(MachineFunction &MF) {
CurrentFnSymForSize = CurrentFnSym;
CurrentFnBegin = nullptr;
CurrentFnBeginLocal = nullptr;
+ CurrentFnEnd = nullptr;
CurrentSectionBeginSym = nullptr;
CurrentFnCallsiteEndSymbols.clear();
MBBSectionRanges.clear();
diff --git a/llvm/lib/MC/ELFObjectWriter.cpp b/llvm/lib/MC/ELFObjectWriter.cpp
index b23fa92ac194d..408aee4f6bc68 100644
--- a/llvm/lib/MC/ELFObjectWriter.cpp
+++ b/llvm/lib/MC/ELFObjectWriter.cpp
@@ -912,10 +912,19 @@ void ELFWriter::writeSectionHeader(uint32_t GroupSymbolIndex, uint64_t Offset,
sh_link = Sym->getSection().getOrdinal();
}
- writeSectionHeaderEntry(
- StrTabBuilder.getOffset(Section.getName()), Section.getType(),
- Section.getFlags(), 0, Offset, Size, sh_link, sh_info,
- Section.getAlignmentForObjectFile(Size), Section.getEntrySize());
+ // Compute sh_addralign as the maximum ComputedAlign over all FT_PrefAlign
+ // fragments, falling back to the section's minimum alignment. curFragList()
+ // can be nullptr for ELFObjectWriter-created sections like .strtab and
+ // .symtab.
+ Align SHAlign = Section.getAlign();
+ if (Section.curFragList())
+ for (const MCFragment &F : Section)
+ if (F.getKind() == MCFragment::FT_PrefAlign)
+ SHAlign = std::max(SHAlign, F.getPrefAlignComputed());
+ writeSectionHeaderEntry(StrTabBuilder.getOffset(Section.getName()),
+ Section.getType(), Section.getFlags(), 0, Offset,
+ Size, sh_link, sh_info, SHAlign,
+ Section.getEntrySize());
}
void ELFWriter::writeSectionHeaders() {
diff --git a/llvm/lib/MC/MCAsmStreamer.cpp b/llvm/lib/MC/MCAsmStreamer.cpp
index 1a50ae43cd9c9..78b5e8309d004 100644
--- a/llvm/lib/MC/MCAsmStreamer.cpp
+++ b/llvm/lib/MC/MCAsmStreamer.cpp
@@ -286,7 +286,8 @@ class MCAsmStreamer final : public MCStreamer {
void emitCodeAlignment(Align Alignment, const MCSubtargetInfo *STI,
unsigned MaxBytesToEmit = 0) override;
- void emitPrefAlign(Align Alignment) override;
+ void emitPrefAlign(Align Alignment, const MCSymbol &End, bool EmitNops,
+ uint8_t Fill, const MCSubtargetInfo &STI) override;
void emitValueToOffset(const MCExpr *Offset,
unsigned char Value,
@@ -1562,8 +1563,15 @@ void MCAsmStreamer::emitCodeAlignment(Align Alignment,
emitAlignmentDirective(Alignment.value(), std::nullopt, 1, MaxBytesToEmit);
}
-void MCAsmStreamer::emitPrefAlign(Align Alignment) {
- OS << "\t.prefalign\t" << Alignment.value();
+void MCAsmStreamer::emitPrefAlign(Align Alignment, const MCSymbol &End,
+ bool EmitNops, uint8_t Fill,
+ const MCSubtargetInfo &) {
+ OS << "\t.prefalign\t" << Alignment.value() << ", ";
+ End.print(OS, MAI);
+ if (EmitNops)
+ OS << ", nop";
+ else
+ OS << ", " << static_cast<unsigned>(Fill);
EmitEOL();
}
diff --git a/llvm/lib/MC/MCAssembler.cpp b/llvm/lib/MC/MCAssembler.cpp
index e649ea7fedabe..f6f64a6f64f3d 100644
--- a/llvm/lib/MC/MCAssembler.cpp
+++ b/llvm/lib/MC/MCAssembler.cpp
@@ -219,6 +219,9 @@ uint64_t MCAssembler::computeFragmentSize(const MCFragment &F) const {
return Size;
}
+ case MCFragment::FT_PrefAlign:
+ return F.getSize();
+
case MCFragment::FT_Nops:
return cast<MCNopsFragment>(F).getNumBytes();
@@ -451,6 +454,23 @@ static void writeFragment(raw_ostream &OS, const MCAssembler &Asm,
}
} break;
+ case MCFragment::FT_PrefAlign: {
+ OS << StringRef(F.getContents().data(), F.getContents().size());
+ uint64_t PadSize = FragmentSize - F.getContents().size();
+ if (F.getPrefAlignEmitNops()) {
+ if (!Asm.getBackend().writeNopData(OS, PadSize, F.getSubtargetInfo()))
+ reportFatalInternalError("unable to write nop sequence of " +
+ Twine(PadSize) + " bytes");
+ } else if (F.getPrefAlignFill() == 0) {
+ OS.write_zeros(PadSize);
+ } else {
+ char B = char(F.getPrefAlignFill());
+ for (uint64_t I = 0; I < PadSize; ++I)
+ OS << B;
+ }
+ break;
+ }
+
case MCFragment::FT_Fill: {
++stats::EmittedFillFragments;
const MCFillFragment &FF = cast<MCFillFragment>(F);
@@ -584,6 +604,10 @@ void MCAssembler::writeSectionData(raw_ostream &OS,
// 0.
assert(F.getAlignFill() == 0 && "Invalid align in virtual section!");
break;
+ case MCFragment::FT_PrefAlign:
+ assert(!F.getPrefAlignEmitNops() && F.getPrefAlignFill() == 0 &&
+ "Invalid align in BSS");
+ break;
case MCFragment::FT_Fill:
HasNonZero = cast<MCFillFragment>(F).getValue() != 0;
break;
@@ -884,6 +908,39 @@ void MCAssembler::relaxBoundaryAlign(MCBoundaryAlignFragment &BF) {
BF.setSize(NewSize);
}
+void MCAssembler::relaxPrefAlign(MCFragment &F) {
+ const MCSymbol &End = F.getPrefAlignEnd();
+ if (!End.getFragment() || End.getFragment()->getParent() != F.getParent()) {
+ recordError(SMLoc(), "end symbol '" + End.getName() +
+ "' must be a symbol in the current section");
+ return;
+ }
+ uint64_t EndOffset;
+ if (!getSymbolOffset(End, EndOffset))
+ return;
+ // RawStart is the start of the (variable) padding region; StartOffset is
+ // the start of the body (RawStart plus current padding). BodySize is
+ // measured from StartOffset, not RawStart, so that padding is not counted
+ // as part of the body.
+ uint64_t RawStart = F.Offset + F.getFixedSize();
+ uint64_t StartOffset = RawStart + F.getVarSize();
+ Align NewAlign;
+ if (StartOffset < EndOffset) {
+ uint64_t BodySize = EndOffset - StartOffset;
+ if (BodySize < F.getPrefAlignPreferred().value())
+ NewAlign = Align(NextPowerOf2(BodySize - 1));
+ else
+ NewAlign = F.getPrefAlignPreferred();
+ }
+ F.setPrefAlignComputed(NewAlign);
+ // Compute padding to align the body start to NewAlign.
+ uint64_t NewPadSize = offsetToAlignment(RawStart, NewAlign);
+ F.VarContentStart = F.getFixedSize();
+ F.VarContentEnd = F.VarContentStart + NewPadSize;
+ if (F.VarContentEnd > F.getParent()->ContentStorage.size())
+ F.getParent()->ContentStorage.resize(F.VarContentEnd);
+}
+
void MCAssembler::relaxDwarfLineAddr(MCFragment &F) {
if (getBackend().relaxDwarfLineAddr(F))
return;
@@ -962,6 +1019,9 @@ bool MCAssembler::relaxFragment(MCFragment &F) {
case MCFragment::FT_BoundaryAlign:
relaxBoundaryAlign(static_cast<MCBoundaryAlignFragment &>(F));
break;
+ case MCFragment::FT_PrefAlign:
+ relaxPrefAlign(F);
+ break;
case MCFragment::FT_CVInlineLines:
getContext().getCVContext().encodeInlineLineTable(
*this, static_cast<MCCVInlineLineTableFragment &>(F));
@@ -979,6 +1039,9 @@ bool MCAssembler::relaxFragment(MCFragment &F) {
void MCAssembler::layoutSection(MCSection &Sec) {
uint64_t Offset = 0;
+ // Note: fragments are not relaxed here. Some fragments depend on
+ // downstream symbols whose offsets have not been set in this pass yet.
+ // They are instead relaxed by relaxFragment.
for (MCFragment &F : Sec) {
F.Offset = Offset;
if (F.getKind() == MCFragment::FT_Align) {
diff --git a/llvm/lib/MC/MCFragment.cpp b/llvm/lib/MC/MCFragment.cpp
index 85d1c5888f1da..21a304da0bb4f 100644
--- a/llvm/lib/MC/MCFragment.cpp
+++ b/llvm/lib/MC/MCFragment.cpp
@@ -55,7 +55,8 @@ LLVM_DUMP_METHOD void MCFragment::dump() const {
case MCFragment::FT_DwarfFrame: OS << "DwarfCallFrame"; break;
case MCFragment::FT_SFrame: OS << "SFrame"; break;
case MCFragment::FT_LEB: OS << "LEB"; break;
- case MCFragment::FT_BoundaryAlign: OS<<"BoundaryAlign"; break;
+ case MCFragment::FT_BoundaryAlign: OS << "BoundaryAlign"; break;
+ case MCFragment::FT_PrefAlign: OS << "PrefAlign"; break;
case MCFragment::FT_SymbolId: OS << "SymbolId"; break;
case MCFragment::FT_CVInlineLines: OS << "CVInlineLineTable"; break;
case MCFragment::FT_CVDefRange: OS << "CVDefRangeTable"; break;
@@ -170,6 +171,11 @@ LLVM_DUMP_METHOD void MCFragment::dump() const {
<< " Size:" << BF->getSize();
break;
}
+ case MCFragment::FT_PrefAlign:
+ OS << " PrefAlign:" << getPrefAlignPreferred().value()
+ << " End:" << getPrefAlignEnd().getName()
+ << " ComputedAlign:" << getPrefAlignComputed().value();
+ break;
case MCFragment::FT_SymbolId: {
const auto *F = cast<MCSymbolIdFragment>(this);
OS << " Sym:" << F->getSymbol();
diff --git a/llvm/lib/MC/MCObjectStreamer.cpp b/llvm/lib/MC/MCObjectStreamer.cpp
index 58aa7945d7393..f6d1ae7e50295 100644
--- a/llvm/lib/MC/MCObjectStreamer.cpp
+++ b/llvm/lib/MC/MCObjectStreamer.cpp
@@ -690,8 +690,14 @@ void MCObjectStreamer::emitCodeAlignment(Align Alignment,
F->STI = STI;
}
-void MCObjectStreamer::emitPrefAlign(Align Alignment) {
- getCurrentSectionOnly()->ensurePreferredAlignment(Alignment);
+void MCObjectStreamer::emitPrefAlign(Align Alignment, const MCSymbol &End,
+ bool EmitNops, uint8_t Fill,
+ const MCSubtargetInfo &STI) {
+ auto *F = getCurrentFragment();
+ F->makePrefAlign(Alignment, End, EmitNops, Fill);
+ if (EmitNops)
+ F->STI = &STI;
+ newFragment();
}
void MCObjectStreamer::emitValueToOffset(const MCExpr *Offset,
diff --git a/llvm/lib/MC/MCParser/AsmParser.cpp b/llvm/lib/MC/MCParser/AsmParser.cpp
index 3452708bcec8a..c30c4d09797f0 100644
--- a/llvm/lib/MC/MCParser/AsmParser.cpp
+++ b/llvm/lib/MC/MCParser/AsmParser.cpp
@@ -3468,13 +3468,54 @@ bool AsmParser::parseDirectivePrefAlign() {
int64_t Alignment;
if (checkForValidSection() || parseAbsoluteExpression(Alignment))
return true;
- if (parseEOL())
- return true;
if (!isPowerOf2_64(Alignment))
return Error(AlignmentLoc, "alignment must be a power of 2");
- getStreamer().emitPrefAlign(Align(Alignment));
+ // Parse end symbol: .prefalign N, sym
+ SMLoc SymLoc = getLexer().getLoc();
+ if (!getLexer().is(AsmToken::Comma))
+ return Error(SymLoc, "expected ',' and end symbol");
+ Lex();
+ StringRef Name;
+ SymLoc = getLexer().getLoc();
+ if (parseIdentifier(Name))
+ return Error(SymLoc, "expected symbol name");
+ MCSymbol *End = getContext().getOrCreateSymbol(Name);
+
+ // Parse fill operand: integer byte [0, 255] or "nop".
+ SMLoc FillLoc = getLexer().getLoc();
+ if (!getLexer().is(AsmToken::Comma))
+ return Error(FillLoc, "expected ',' followed by 'nop' or fill byte");
+ Lex();
+
+ bool EmitNops = false;
+ uint8_t Fill = 0;
+ SMLoc FillLoc2 = getLexer().getLoc();
+ if (getLexer().is(AsmToken::Integer)) {
+ int64_t FillVal = getLexer().getTok().getIntVal();
+ Lex();
+ if (FillVal < 0 || FillVal > 255)
+ return Error(FillLoc2, "fill value must be in range [0, 255]");
+ Fill = static_cast<uint8_t>(FillVal);
+ } else if (getLexer().is(AsmToken::Identifier) &&
+ getLexer().getTok().getIdentifier() == "nop") {
+ EmitNops = true;
+ Lex();
+ } else {
+ return Error(FillLoc2, "expected integer fill byte or 'nop'");
+ }
+
+ if (parseEOL())
+ return true;
+ if ((EmitNops || Fill != 0) &&
+ getStreamer().getCurrentSectionOnly()->isBssSection())
+ return Error(FillLoc, "non-zero fill in BSS section '" +
+ getStreamer().getCurrentSectionOnly()->getName() +
+ "'");
+
+ getStreamer().emitPrefAlign(Align(Alignment), *End, EmitNops, Fill,
+ getTargetParser().getSTI());
return false;
}
diff --git a/llvm/lib/MC/MCSection.cpp b/llvm/lib/MC/MCSection.cpp
index 8285379eeaf81..a668e7919b7b9 100644
--- a/llvm/lib/MC/MCSection.cpp
+++ b/llvm/lib/MC/MCSection.cpp
@@ -30,16 +30,6 @@ MCSymbol *MCSection::getEndSymbol(MCContext &Ctx) {
return End;
}
-Align MCSection::getAlignmentForObjectFile(uint64_t Size) const {
- if (Size < getAlign().value())
- return getAlign();
-
- if (Size < getPreferredAlignment().value())
- return Align(NextPowerOf2(Size - 1));
-
- return getPreferredAlignment();
-}
-
bool MCSection::hasEnded() const { return End && End->isInSection(); }
#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
diff --git a/llvm/lib/MC/MCStreamer.cpp b/llvm/lib/MC/MCStreamer.cpp
index a913528d53a70..4133ea1235227 100644
--- a/llvm/lib/MC/MCStreamer.cpp
+++ b/llvm/lib/MC/MCStreamer.cpp
@@ -1354,7 +1354,8 @@ void MCStreamer::emitFill(const MCExpr &NumBytes, uint64_t Value, SMLoc Loc) {}
void MCStreamer::emitFill(const MCExpr &NumValues, int64_t Size, int64_t Expr,
SMLoc Loc) {}
void MCStreamer::emitValueToAlignment(Align, int64_t, uint8_t, unsigned) {}
-void MCStreamer::emitPrefAlign(Align A) {}
+void MCStreamer::emitPrefAlign(Align A, const MCSymbol &End, bool EmitNops,
+ uint8_t Fill, const MCSubtargetInfo &STI) {}
void MCStreamer::emitCodeAlignment(Align Alignment, const MCSubtargetInfo *STI,
unsigned MaxBytesToEmit) {}
void MCStreamer::emitValueToOffset(const MCExpr *Offset, unsigned char Value,
diff --git a/llvm/test/CodeGen/AArch64/preferred-function-alignment.ll b/llvm/test/CodeGen/AArch64/preferred-function-alignment.ll
index a6cb7123e5af4..d272dae33e814 100644
--- a/llvm/test/CodeGen/AArch64/preferred-function-alignment.ll
+++ b/llvm/test/CodeGen/AArch64/preferred-function-alignment.ll
@@ -29,10 +29,11 @@ define void @test() {
}
; CHECK-LABEL: test
-; ALIGN2: .p2align 2
-; ALIGN3: .p2align 3
-; ALIGN4: .p2align 4
-; ALIGN5: .p2align 5
+; CHECK: .p2align 2
+; ALIGN2-NOT: .prefalign
+; ALIGN3-NEXT: .prefalign 8
+; ALIGN4-NEXT: .prefalign 16
+; ALIGN5-NEXT: .prefalign 32
define void @test_optsize() optsize {
ret void
diff --git a/llvm/test/CodeGen/ARM/preferred-function-alignment.ll b/llvm/test/CodeGen/ARM/preferred-function-alignment.ll
index 2fc67905f6db7..3ae20da251df1 100644
--- a/llvm/test/CodeGen/ARM/preferred-function-alignment.ll
+++ b/llvm/test/CodeGen/ARM/preferred-function-alignment.ll
@@ -1,15 +1,18 @@
; RUN: llc -mtriple=arm-none-eabi -mcpu=cortex-m85 < %s | FileCheck --check-prefixes=CHECK,ALIGN-64,ALIGN-CS-16 %s
; RUN: llc -mtriple=arm-none-eabi -mcpu=cortex-m23 < %s | FileCheck --check-prefixes=CHECK,ALIGN-16,ALIGN-CS-16 %s
-; RUN: llc -mtriple=arm-none-eabi -mcpu=cortex-a5 < %s | FileCheck --check-prefixes=CHECK,ALIGN-32,ALIGN-CS-32 %s
-; RUN: llc -mtriple=arm-none-eabi -mcpu=cortex-m33 < %s | FileCheck --check-prefixes=CHECK,ALIGN-32,ALIGN-CS-16 %s
-; RUN: llc -mtriple=arm-none-eabi -mcpu=cortex-m55 < %s | FileCheck --check-prefixes=CHECK,ALIGN-32,ALIGN-CS-16 %s
+; RUN: llc -mtriple=arm-none-eabi -mcpu=cortex-a5 < %s | FileCheck --check-prefixes=CHECK,ALIGN-32A,ALIGN-CS-32 %s
+; RUN: llc -mtriple=arm-none-eabi -mcpu=cortex-m33 < %s | FileCheck --check-prefixes=CHECK,ALIGN-32T,ALIGN-CS-16 %s
+; RUN: llc -mtriple=arm-none-eabi -mcpu=cortex-m55 < %s | FileCheck --check-prefixes=CHECK,ALIGN-32T,ALIGN-CS-16 %s
; RUN: llc -mtriple=arm-none-eabi -mcpu=cortex-m7 < %s | FileCheck --check-prefixes=CHECK,ALIGN-64,ALIGN-CS-16 %s
; CHECK-LABEL: test
; ALIGN-16: .p2align 1
-; ALIGN-32: .p2align 2
-; ALIGN-64: .p2align 3
+; ALIGN-32A: .p2align 2
+; ALIGN-32T: .p2align 1
+; ALIGN-32T-NEXT: .prefalign 4
+; ALIGN-64: .p2align 1
+; ALIGN-64-NEXT: .prefalign 8
define void @test() {
ret void
diff --git a/llvm/test/CodeGen/LoongArch/linker-relaxation.ll b/llvm/test/CodeGen/LoongArch/linker-relaxation.ll
index 6b197bc578919..873a1f9168323 100644
--- a/llvm/test/CodeGen/LoongArch/linker-relaxation.ll
+++ b/llvm/test/CodeGen/LoongArch/linker-relaxation.ll
@@ -77,7 +77,6 @@ declare dso_local void @callee3() nounwind
; RELAX-NEXT: R_LARCH_RELAX - 0x0
; CHECK-RELOC-NEXT: R_LARCH_PCALA_LO12 g_i1 0x0
; RELAX-NEXT: R_LARCH_RELAX - 0x0
-; RELAX-NEXT: R_LARCH_ALIGN - 0x1C
; CHECK-RELOC-NEXT: R_LARCH_CALL36 callee1 0x0
; RELAX-NEXT: R_LARCH_RELAX - 0x0
; CHECK-RELOC-NEXT: R_LARCH_CALL36 callee2 0x0
diff --git a/llvm/test/CodeGen/PowerPC/code-align.ll b/llvm/test/CodeGen/PowerPC/code-align.ll
index 805873816c4d9..841636d65d87e 100644
--- a/llvm/test/CodeGen/PowerPC/code-align.ll
+++ b/llvm/test/CodeGen/PowerPC/code-align.ll
@@ -20,9 +20,7 @@ entry:
ret i32 %mul
; CHECK-LABEL: .globl foo
-; GENERIC: .p2align 2
-; BASIC: .p2align 4
-; PWR: .p2align 4
+; CHECK: .p2align 2
; CHECK: @foo
}
diff --git a/llvm/test/CodeGen/PowerPC/ppc64-calls.ll b/llvm/test/CodeGen/PowerPC/ppc64-calls.ll
index 2c2743f5400d9..67ff626b4f680 100644
--- a/llvm/test/CodeGen/PowerPC/ppc64-calls.ll
+++ b/llvm/test/CodeGen/PowerPC/ppc64-calls.ll
@@ -19,7 +19,7 @@ define dso_local void @test_direct() nounwind readnone {
tail call void @foo() nounwind
; Because of tail call optimization, it can be 'b' instruction.
; CHECK: [[BR:b[l]?]] foo
-; CHECK-NOT: nop
+; CHECK-NOT: {{^[[:space:]]+}}nop
ret void
}
diff --git a/llvm/test/CodeGen/SystemZ/vec-perm-14.ll b/llvm/test/CodeGen/SystemZ/vec-perm-14.ll
index 0b392676fa3ec..5d437ce8b091d 100644
--- a/llvm/test/CodeGen/SystemZ/vec-perm-14.ll
+++ b/llvm/test/CodeGen/SystemZ/vec-perm-14.ll
@@ -61,7 +61,8 @@ define <4 x i8> @fun1(<2 x i8> %arg) {
; CHECK-NEXT: .space 1
; CHECK-NEXT: .text
; CHECK-NEXT: .globl fun1
-; CHECK-NEXT: .p2align 4
+; CHECK-NEXT: .p2align 1
+; CHECK-NEXT: .prefalign 16, .Lfunc_end1, nop
; CHECK-NEXT: .type fun1,@function
; CHECK-NEXT: fun1: # @fun1
; CHECK-NEXT: .cfi_startproc
@@ -96,7 +97,8 @@ define <4 x i8> @fun2(<2 x i8> %arg) {
; CHECK-NEXT: .space 1
; CHECK-NEXT: .text
; CHECK-NEXT: .globl fun2
-; CHECK-NEXT: .p2align 4
+; CHECK-NEXT: .p2align 1
+; CHECK-NEXT: .prefalign 16, .Lfunc_end2, nop
; CHECK-NEXT: .type fun2, at function
; CHECK-NEXT:fun2: # @fun2
; CHECK-NEXT: .cfi_startproc
diff --git a/llvm/test/CodeGen/X86/eh-label.ll b/llvm/test/CodeGen/X86/eh-label.ll
index 78611000e18dd..b3954700463eb 100644
--- a/llvm/test/CodeGen/X86/eh-label.ll
+++ b/llvm/test/CodeGen/X86/eh-label.ll
@@ -7,7 +7,7 @@ define void @f() personality ptr @g {
bb0:
call void asm ".Lfunc_end0:", ""()
; CHECK: #APP
-; CHECK-NEXT: .Lfunc_end0:
+; CHECK-NEXT: .Lfunc_end0{{.*}}:
; CHECK-NEXT: #NO_APP
invoke void @g() to label %bb2 unwind label %bb1
diff --git a/llvm/test/CodeGen/X86/empty-function.ll b/llvm/test/CodeGen/X86/empty-function.ll
index 7d908311ec8dc..bf05c8e359130 100644
--- a/llvm/test/CodeGen/X86/empty-function.ll
+++ b/llvm/test/CodeGen/X86/empty-function.ll
@@ -16,7 +16,7 @@ entry:
; CHECK-LABEL: f:
; WIN32: nop
; WIN64: nop
-; LINUX-NOT: nop
+; LINUX-NOT: {{^[[:space:]]+}}nop
; LINUX-NOT: ud2
}
diff --git a/llvm/test/CodeGen/X86/kcfi-arity.ll b/llvm/test/CodeGen/X86/kcfi-arity.ll
index 5a19bcd7835ea..d84e7aae9a07c 100644
--- a/llvm/test/CodeGen/X86/kcfi-arity.ll
+++ b/llvm/test/CodeGen/X86/kcfi-arity.ll
@@ -3,7 +3,8 @@
; RUN: llc -mtriple=x86_64-unknown-linux-gnu -verify-machineinstrs -stop-after=finalize-isel < %s | FileCheck %s --check-prefixes=MIR,ISEL
; RUN: llc -mtriple=x86_64-unknown-linux-gnu -verify-machineinstrs -stop-after=kcfi < %s | FileCheck %s --check-prefixes=MIR,KCFI
-; ASM: .p2align 4
+; ASM: .p2align 2
+; ASM: .prefalign 16
; ASM: .type __cfi_f1,@function
; ASM-LABEL: __cfi_f1:
; ASM-NEXT: nop
diff --git a/llvm/test/CodeGen/X86/kcfi-patchable-function-prefix.ll b/llvm/test/CodeGen/X86/kcfi-patchable-function-prefix.ll
index 1b7bd7835e890..cc99739febe41 100644
--- a/llvm/test/CodeGen/X86/kcfi-patchable-function-prefix.ll
+++ b/llvm/test/CodeGen/X86/kcfi-patchable-function-prefix.ll
@@ -1,6 +1,6 @@
; RUN: llc -mtriple=x86_64-unknown-linux-gnu -verify-machineinstrs < %s | FileCheck %s
-; CHECK: .p2align 4
+; CHECK: .prefalign 16
; CHECK-LABEL: __cfi_f1:
; CHECK-COUNT-11: nop
; CHECK-NEXT: movl $12345678, %eax
@@ -13,9 +13,9 @@ define void @f1(ptr noundef %x) !kcfi_type !1 {
ret void
}
-; CHECK: .p2align 4
+; CHECK: .prefalign 16
; CHECK-NOT: __cfi_f2:
-; CHECK-NOT: nop
+; CHECK-NOT: {{^[[:space:]]+}}nop
; CHECK-LABEL: f2:
define void @f2(ptr noundef %x) {
; CHECK: addl -4(%r{{..}}), %r10d
@@ -23,9 +23,9 @@ define void @f2(ptr noundef %x) {
ret void
}
-; CHECK: .p2align 4
+; CHECK: .prefalign 16
; CHECK-LABEL: __cfi_f3:
-; CHECK-NOT: nop
+; CHECK-NOT: {{^[[:space:]]+}}nop
; CHECK-NEXT: movl $12345678, %eax
; CHECK-COUNT-11: nop
; CHECK-LABEL: f3:
@@ -35,9 +35,9 @@ define void @f3(ptr noundef %x) #0 !kcfi_type !1 {
ret void
}
-; CHECK: .p2align 4
+; CHECK: .prefalign 16
; CHECK-NOT: __cfi_f4:
-; CHECK-COUNT-16: nop
+; CHECK-COUNT-16: {{^[[:space:]]+}}nop
; CHECK-LABEL: f4:
define void @f4(ptr noundef %x) #0 {
; CHECK: addl -15(%r{{..}}), %r10d
diff --git a/llvm/test/CodeGen/X86/kcfi.ll b/llvm/test/CodeGen/X86/kcfi.ll
index fd93b8e3d4188..62cb78e770d6c 100644
--- a/llvm/test/CodeGen/X86/kcfi.ll
+++ b/llvm/test/CodeGen/X86/kcfi.ll
@@ -2,7 +2,8 @@
; RUN: llc -mtriple=x86_64-unknown-linux-gnu -verify-machineinstrs -stop-after=finalize-isel < %s | FileCheck %s --check-prefixes=MIR,ISEL
; RUN: llc -mtriple=x86_64-unknown-linux-gnu -verify-machineinstrs -stop-after=kcfi < %s | FileCheck %s --check-prefixes=MIR,KCFI
-; ASM: .p2align 4
+; ASM: .p2align 2
+; ASM: .prefalign 16
; ASM: .type __cfi_f1,@function
; ASM-LABEL: __cfi_f1:
; ASM-NEXT: nop
diff --git a/llvm/test/CodeGen/X86/prefalign.ll b/llvm/test/CodeGen/X86/prefalign.ll
index 062cf740eabeb..45b700e611fa0 100644
--- a/llvm/test/CodeGen/X86/prefalign.ll
+++ b/llvm/test/CodeGen/X86/prefalign.ll
@@ -1,12 +1,11 @@
-; RUN: llc < %s | FileCheck --check-prefixes=CHECK,NOFS %s
-; RUN: llc -function-sections < %s | FileCheck --check-prefixes=CHECK,FS %s
+; RUN: llc < %s | FileCheck %s
+; RUN: llc -function-sections < %s | FileCheck %s
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"
; CHECK: .globl f1
-; NOFS-NEXT: .p2align 4
-; FS-NEXT: .prefalign 16
+; CHECK-NEXT: .prefalign 16
define void @f1() {
ret void
}
@@ -19,9 +18,8 @@ define void @f2() prefalign(1) {
}
; CHECK: .globl f3
-; NOFS-NEXT: .p2align 2
-; FS-NEXT: .p2align 1
-; FS-NEXT: .prefalign 4
+; CHECK-NEXT: .p2align 1
+; CHECK-NEXT: .prefalign 4
define void @f3() align 2 prefalign(4) {
ret void
}
diff --git a/llvm/test/CodeGen/X86/statepoint-invoke.ll b/llvm/test/CodeGen/X86/statepoint-invoke.ll
index 34dbc21a8a8cb..b9400974b4136 100644
--- a/llvm/test/CodeGen/X86/statepoint-invoke.ll
+++ b/llvm/test/CodeGen/X86/statepoint-invoke.ll
@@ -56,7 +56,7 @@ exceptional_return:
; CHECK: .uleb128 .Ltmp{{[0-9]+}}-.Ltmp{{[0-9]+}}
; CHECK: .uleb128 .Ltmp{{[0-9]+}}-.Lfunc_begin{{[0-9]+}}
; CHECK: .byte 0
-; CHECK: .p2align 4
+; CHECK: .prefalign 16
define ptr addrspace(1) @test_result(ptr addrspace(1) %obj,
; CHECK-LABEL: test_result:
@@ -99,7 +99,7 @@ exceptional_return:
; CHECK: .uleb128 .Ltmp{{[0-9]+}}-.Ltmp{{[0-9]+}}
; CHECK: .uleb128 .Ltmp{{[0-9]+}}-.Lfunc_begin{{[0-9]+}}
; CHECK: .byte 0
-; CHECK: .p2align 4
+; CHECK: .prefalign 16
define ptr addrspace(1) @test_same_val(i1 %cond, ptr addrspace(1) %val1, ptr addrspace(1) %val2, ptr addrspace(1) %val3)
; CHECK-LABEL: test_same_val:
diff --git a/llvm/test/MC/ELF/prefalign-errors.s b/llvm/test/MC/ELF/prefalign-errors.s
index 802a78fde7c44..35f35834e308e 100644
--- a/llvm/test/MC/ELF/prefalign-errors.s
+++ b/llvm/test/MC/ELF/prefalign-errors.s
@@ -1,5 +1,47 @@
-// RUN: not llvm-mc -filetype=asm -triple x86_64-pc-linux-gnu %s -o - 2>&1 | FileCheck %s
+# RUN: rm -fr %t && split-file %s %t && cd %t
+# RUN: not llvm-mc -triple=x86_64 a.s 2>&1 | FileCheck a.s
+# RUN: not llvm-mc -triple=x86_64 -filetype=obj b.s 2>&1 | FileCheck b.s
+# RUN: not llvm-mc -triple=x86_64 -filetype=obj c.s 2>&1 | FileCheck c.s
+#--- a.s
.section .text.f1,"ax",@progbits
-// CHECK: {{.*}}.s:[[# @LINE+1]]:12: error: alignment must be a power of 2
+# CHECK: [[#@LINE+1]]:12: error: alignment must be a power of 2
.prefalign 3
+
+# CHECK: [[#@LINE+1]]:13: error: expected ',' and end symbol
+.prefalign 4
+
+# CHECK: [[#@LINE+1]]:14: error: expected symbol name
+.prefalign 4,
+
+# CHECK: [[#@LINE+1]]:23: error: expected integer fill byte or 'nop'
+.prefalign 4,.text.f1,trap
+
+# CHECK: [[#@LINE+1]]:23: error: fill value must be in range [0, 255]
+.prefalign 4,.text.f1,256
+
+# CHECK: [[#@LINE+1]]:23: error: expected integer fill byte or 'nop'
+.prefalign 4,.text.f1,-1
+
+## Non-zero fill in a BSS section.
+.bss
+# CHECK: [[#@LINE+1]]:19: error: non-zero fill in BSS section '.bss'
+.prefalign 4,.Lend,1
+# CHECK: [[#@LINE+1]]:19: error: non-zero fill in BSS section '.bss'
+.prefalign 4,.Lend,nop
+.space 1
+.Lend:
+
+#--- b.s
+## End symbol is undefined.
+.section .text.f1,"ax",@progbits
+# CHECK: <unknown>:0: error: end symbol 'undef' must be a symbol in the current section
+.prefalign 4,undef,0
+
+#--- c.s
+## End symbol is defined in a different section.
+.section .text.f1,"ax",@progbits
+.prefalign 4,.Lend,0
+# CHECK: <unknown>:0: error: end symbol '.Lend' must be a symbol in the current section
+.section .text.f2,"ax",@progbits
+.Lend:
diff --git a/llvm/test/MC/ELF/prefalign.s b/llvm/test/MC/ELF/prefalign.s
index 803bb5d730340..7629d45657df6 100644
--- a/llvm/test/MC/ELF/prefalign.s
+++ b/llvm/test/MC/ELF/prefalign.s
@@ -1,104 +1,109 @@
-// RUN: llvm-mc -triple x86_64 %s -o - | FileCheck --check-prefix=ASM %s
-// RUN: llvm-mc -filetype=obj -triple x86_64 %s -o - | llvm-readelf -SW - | FileCheck --check-prefix=OBJ %s
+# RUN: llvm-mc -triple x86_64 %s -o - | FileCheck --check-prefix=ASM %s
+# RUN: llvm-mc -filetype=obj -triple x86_64 %s -o %t
+# RUN: llvm-readelf -SW %t | FileCheck --check-prefix=OBJ %s
+# RUN: llvm-objdump -d --no-show-raw-insn %t | FileCheck --check-prefix=DIS %s
-// Minimum alignment >= preferred alignment, no effect on sh_addralign.
-// ASM: .section .text.f1lt
-// ASM: .p2align 2
-// ASM: .prefalign 2
-// OBJ: .text.f1lt PROGBITS 0000000000000000 000040 000003 00 AX 0 0 4
-.section .text.f1lt,"ax",@progbits
+## MinAlign >= PrefAlign: the three-way rule is bounded by MinAlign regardless
+## of body size, so sh_addralign stays at MinAlign.
+# ASM: .section .text.f1
+# ASM: .p2align 2
+# ASM: .prefalign 2, .Lf1_end, 0
+# OBJ: .text.f1 PROGBITS 0000000000000000 {{[0-9a-f]+}} 000003 00 AX 0 0 4
+.section .text.f1,"ax",@progbits
.p2align 2
-.prefalign 2
+.prefalign 2, .Lf1_end, 0
.rept 3
-nop
+clc
.endr
+.Lf1_end:
-// ASM: .section .text.f1eq
-// ASM: .p2align 2
-// ASM: .prefalign 2
-// OBJ: .text.f1eq PROGBITS 0000000000000000 000044 000004 00 AX 0 0 4
-.section .text.f1eq,"ax",@progbits
+## Multiple .prefalign on the same end symbol: effective PrefAlign is the maximum.
+# ASM: .section .text.f2
+# ASM: .prefalign 8, .Lf2_end, 0
+# ASM: .prefalign 16, .Lf2_end, 0
+# ASM: .prefalign 8, .Lf2_end, 0
+# OBJ: .text.f2 PROGBITS 0000000000000000 {{[0-9a-f]+}} 000009 00 AX 0 0 16
+.section .text.f2,"ax",@progbits
.p2align 2
-.prefalign 2
-.rept 4
-nop
-.endr
-
-// ASM: .section .text.f1gt
-// ASM: .p2align 2
-// ASM: .prefalign 2
-// OBJ: .text.f1gt PROGBITS 0000000000000000 000048 000005 00 AX 0 0 4
-.section .text.f1gt,"ax",@progbits
-.p2align 2
-.prefalign 2
-.rept 5
-nop
+.prefalign 8, .Lf2_end, 0
+.prefalign 16, .Lf2_end, 0
+.prefalign 8, .Lf2_end, 0
+.rept 9
+clc
.endr
+.Lf2_end:
-// Minimum alignment < preferred alignment, sh_addralign influenced by section size.
-// Use maximum of all .prefalign directives.
-// ASM: .section .text.f2lt
-// ASM: .p2align 2
-// ASM: .prefalign 8
-// ASM: .prefalign 16
-// ASM: .prefalign 8
-// OBJ: .text.f2lt PROGBITS 0000000000000000 000050 000003 00 AX 0 0 4
-.section .text.f2lt,"ax",@progbits
+## Multiple functions in a section, each with its own .prefalign.
+## nop fill; f3b's 5-byte padding is a NOP.
+## f3b: ComputedAlign=8, padding=5
+## f3c: ComputedAlign=16, padding=0
+# ASM: .prefalign 16, .Lf3a_end, nop
+# ASM: .prefalign 16, .Lf3b_end, nop
+# ASM: .prefalign 16, .Lf3c_end, 204
+# OBJ: .text.f3 PROGBITS 0000000000000000 {{[0-9a-f]+}} 000020 00 AX 0 0 16
+# DIS: Disassembly of section .text.f3:
+# DIS: 0: clc
+# DIS-NEXT: 1: clc
+# DIS-NEXT: 2: clc
+# DIS-NEXT: 3: nopl
+# DIS-NEXT: 8: stc
+# DIS: f: stc
+# DIS-NEXT: 10: clc
+# DIS: 1f: clc
+# DIS-EMPTY:
+.section .text.f3,"ax",@progbits
.p2align 2
-.prefalign 8
-.prefalign 16
-.prefalign 8
+.prefalign 16, .Lf3a_end, nop
.rept 3
-nop
+clc
.endr
-
-// ASM: .section .text.f2between1
-// OBJ: .text.f2between1 PROGBITS 0000000000000000 000054 000008 00 AX 0 0 8
-.section .text.f2between1,"ax",@progbits
-.p2align 2
-.prefalign 8
-.prefalign 16
-.prefalign 8
+.Lf3a_end:
+.prefalign 16, .Lf3b_end, nop
.rept 8
-nop
-.endr
-
-// OBJ: .text.f2between2 PROGBITS 0000000000000000 00005c 000009 00 AX 0 0 16
-.section .text.f2between2,"ax",@progbits
-.p2align 2
-.prefalign 8
-.prefalign 16
-.prefalign 8
-.rept 9
-nop
+stc
.endr
-
-// OBJ: .text.f2between3 PROGBITS 0000000000000000 000068 000010 00 AX 0 0 16
-.section .text.f2between3,"ax",@progbits
-.p2align 2
-.prefalign 8
-.prefalign 16
-.prefalign 8
+.Lf3b_end:
+.prefalign 16, .Lf3c_end, 0xcc
.rept 16
-nop
+clc
.endr
+.Lf3c_end:
+## No-op prefalign
+.prefalign 16, .Lf3d_end, 0xcc
+.Lf3d_end:
+.prefalign 16, .Lf3a_end, 0xcc
-// OBJ: .text.f2gt1 PROGBITS 0000000000000000 000078 000011 00 AX 0 0 16
-.section .text.f2gt1,"ax",@progbits
+## Two functions in one section where the second function's padding depends on
+## the first function's size.
+# OBJ: .text.f4 PROGBITS 0000000000000000 {{[0-9a-f]+}} 00001e 00 AX 0 0 16
+# DIS: Disassembly of section .text.f4:
+# DIS: 0: pushq
+# DIS: 7: retq
+# DIS-NEXT: 8: nopl
+# DIS-NEXT: 10: movl
+# DIS: 1d: retq
+# DIS-EMPTY:
+.section .text.f4,"ax",@progbits
.p2align 2
-.prefalign 8
-.prefalign 16
-.prefalign 8
-.rept 17
-nop
-.endr
+.prefalign 16, .Lf4a_end, nop
+pushq %rbp
+movq %rsp, %rbp
+xorl %eax, %eax
+popq %rbp
+retq
+.Lf4a_end:
+.prefalign 16, .Lf4b_end, nop
+movl $0, 0
+xorl %eax, %eax
+retq
+.Lf4b_end:
-// OBJ: .text.f2gt2 PROGBITS 0000000000000000 00008c 000021 00 AX 0 0 16
-.section .text.f2gt2,"ax",@progbits
+## .prefalign in a BSS section with zero fill.
+# ASM: .bss
+# ASM: .prefalign 16, .Lbss_end, 0
+# OBJ: .bss NOBITS 0000000000000000 {{[0-9a-f]+}} 000004 00 WA 0 0 4
+.bss
.p2align 2
-.prefalign 8
-.prefalign 16
-.prefalign 8
-.rept 33
-nop
-.endr
+.prefalign 16, .Lbss_end, 0
+.space 4
+.Lbss_end:
diff --git a/llvm/test/MC/RISCV/prefalign.s b/llvm/test/MC/RISCV/prefalign.s
new file mode 100644
index 0000000000000..0e48a953707fb
--- /dev/null
+++ b/llvm/test/MC/RISCV/prefalign.s
@@ -0,0 +1,34 @@
+# RUN: llvm-mc -filetype=obj -triple riscv64 -mattr=+relax %s -o %t
+# RUN: llvm-readelf -SW %t | FileCheck --check-prefix=OBJ %s
+# RUN: llvm-objdump -d -M no-aliases --no-show-raw-insn %t | FileCheck --check-prefix=DIS %s
+# RUN: llvm-readobj -r %t | FileCheck --check-prefix=RELOC %s
+
+## Two functions in one section with nop fill.
+## f1: body = 12 bytes < 16, ComputedAlign=16, but section start is 16-aligned
+## so pad = 0
+## f2: body = 32 bytes >= 16, ComputedAlign=16, pad = 4 (one nop at 0xc)
+# OBJ: .text.f1 PROGBITS {{[0-9a-f]+}} {{[0-9a-f]+}} 000030 00 AX 0 0 16
+# DIS: 0: addi a0, zero, 0x1
+# DIS-NEXT: 4: addi a0, zero, 0x2
+# DIS-NEXT: 8: add a0, a0, a1
+## Padding nop for f2
+# DIS-NEXT: c: addi zero, zero, 0x0
+## f2 starts at 0x10, aligned to 16
+# DIS-NEXT: 10: add a0, a0, a1
+.section .text.f1,"ax",@progbits
+.p2align 2
+.prefalign 16, .Lf1_end, nop
+addi a0, zero, 1
+addi a0, zero, 2
+add a0, a0, a1
+.Lf1_end:
+.prefalign 16, .Lf2_end, nop
+.rept 8
+add a0, a0, a1
+.endr
+.Lf2_end:
+
+## .prefalign does not emit R_RISCV_ALIGN relocations. The padding is fully
+## resolved at assembly time, so no linker adjustment is needed.
+# RELOC: Relocations [
+# RELOC-NEXT: ]
>From c05329dad2b8568cee018bdf1d4218826ec4cfc8 Mon Sep 17 00:00:00 2001
From: Fangrui Song <i at maskray.me>
Date: Sun, 1 Mar 2026 12:18:33 -0800
Subject: [PATCH 2/3] coverage
---
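Reviewer note: a small sketch of the rule the added test pins down, assuming
(per the commit message of patch 1) that sh_addralign is the maximum of the
regular alignment and every ComputedAlign in the section. `sectionAddrAlign`
is a hypothetical helper for illustration, not an MC API.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper: the section alignment is the largest of the regular
// alignment (e.g. from .p2align) and all ComputedAlign values contributed by
// .prefalign directives in that section. It can only grow, never shrink.
static uint64_t sectionAddrAlign(uint64_t RegularAlign,
                                 const std::vector<uint64_t> &ComputedAligns) {
  uint64_t Result = RegularAlign;
  for (uint64_t C : ComputedAligns)
    Result = std::max(Result, C);
  return Result;
}
```

For .text.f5, the 3-byte body gives ComputedAlign = NextPowerOf2(2) = 4, so
`sectionAddrAlign(32, {4})` stays at 32, as the OBJ line expects.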
llvm/test/MC/ELF/prefalign.s | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/llvm/test/MC/ELF/prefalign.s b/llvm/test/MC/ELF/prefalign.s
index 7629d45657df6..57ecc64781646 100644
--- a/llvm/test/MC/ELF/prefalign.s
+++ b/llvm/test/MC/ELF/prefalign.s
@@ -98,6 +98,16 @@ xorl %eax, %eax
retq
.Lf4b_end:
+## sh_addralign stays at 32, not downgraded by .prefalign.
+# OBJ: .text.f5 PROGBITS 0000000000000000 {{[0-9a-f]+}} 000003 00 AX 0 0 32
+.section .text.f5,"ax",@progbits
+.p2align 5
+.prefalign 16, .Lf5_end, 0
+.rept 3
+clc
+.endr
+.Lf5_end:
+
## .prefalign in a BSS section with zero fill.
# ASM: .bss
# ASM: .prefalign 16, .Lbss_end, 0
>From 66d5257d5598b5dd7939af917b6d65f88a49736e Mon Sep 17 00:00:00 2001
From: Fangrui Song <i at maskray.me>
Date: Mon, 2 Mar 2026 22:05:16 -0800
Subject: [PATCH 3/3] fix quadratic convergence issue
---
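Reviewer note: a minimal standalone sketch of what a single layout pass
computes once the running offset is carried forward. It is not the MC code:
`nextPowerOf2`, `computedAlign`, `padTo`, and `layoutStarts` are hypothetical
names, fragments are reduced to plain body sizes, and an empty body is assumed
to get alignment 1 (matching the default-constructed `Align` in the patch).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical stand-in for llvm::NextPowerOf2: the smallest power of two
// strictly greater than A.
static uint64_t nextPowerOf2(uint64_t A) {
  uint64_t P = 1;
  while (P <= A)
    P <<= 1;
  return P;
}

// ComputedAlign as described in the commit message. An empty body is assumed
// to get alignment 1, i.e. no padding.
static uint64_t computedAlign(uint64_t BodySize, uint64_t PrefAlign) {
  if (BodySize == 0)
    return 1;
  if (BodySize < PrefAlign)
    return nextPowerOf2(BodySize - 1);
  return PrefAlign;
}

// Bytes needed to round Offset up to a multiple of Align (a power of two).
static uint64_t padTo(uint64_t Offset, uint64_t Align) {
  return (Align - Offset % Align) % Align;
}

// Lay out consecutive bodies in one pass, carrying the running offset forward
// so that no body sees a stale start. Returns the start offset of each body.
static std::vector<uint64_t> layoutStarts(uint64_t Offset,
                                          const std::vector<uint64_t> &Bodies,
                                          uint64_t PrefAlign) {
  std::vector<uint64_t> Starts;
  for (uint64_t Body : Bodies) {
    Offset += padTo(Offset, computedAlign(Body, PrefAlign));
    Starts.push_back(Offset);
    Offset += Body;
  }
  return Starts;
}
```

With the inputs of prefalign-convergence.s (one leading byte, then ten 5-byte
bodies with a preferred alignment of 16),
`layoutStarts(1, std::vector<uint64_t>(10, 5), 16)` yields 0x8, 0x10, ...,
0x50, which matches the DIS lines in the new test.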
llvm/include/llvm/MC/MCAssembler.h | 2 +-
llvm/lib/MC/MCAssembler.cpp | 53 +++++++++------
llvm/test/MC/ELF/prefalign-convergence.s | 86 ++++++++++++++++++++++++
3 files changed, 121 insertions(+), 20 deletions(-)
create mode 100644 llvm/test/MC/ELF/prefalign-convergence.s
diff --git a/llvm/include/llvm/MC/MCAssembler.h b/llvm/include/llvm/MC/MCAssembler.h
index a7b865cb16b81..dad3bf01c9feb 100644
--- a/llvm/include/llvm/MC/MCAssembler.h
+++ b/llvm/include/llvm/MC/MCAssembler.h
@@ -112,7 +112,7 @@ class MCAssembler {
void relaxInstruction(MCFragment &F);
void relaxLEB(MCFragment &F);
void relaxBoundaryAlign(MCBoundaryAlignFragment &BF);
- void relaxPrefAlign(MCFragment &F);
+ void layoutPrefAlign(MCFragment &F, uint64_t RawStart);
void relaxDwarfLineAddr(MCFragment &F);
void relaxDwarfCallFrameFragment(MCFragment &F);
void relaxSFrameFragment(MCFragment &DF);
diff --git a/llvm/lib/MC/MCAssembler.cpp b/llvm/lib/MC/MCAssembler.cpp
index f6f64a6f64f3d..7a98616d3fb5e 100644
--- a/llvm/lib/MC/MCAssembler.cpp
+++ b/llvm/lib/MC/MCAssembler.cpp
@@ -908,32 +908,34 @@ void MCAssembler::relaxBoundaryAlign(MCBoundaryAlignFragment &BF) {
BF.setSize(NewSize);
}
-void MCAssembler::relaxPrefAlign(MCFragment &F) {
+// Compute the body size by walking forward from F to the End symbol and
+// summing fragment sizes. This avoids depending on stale layout offsets.
+void MCAssembler::layoutPrefAlign(MCFragment &F, uint64_t RawStart) {
const MCSymbol &End = F.getPrefAlignEnd();
if (!End.getFragment() || End.getFragment()->getParent() != F.getParent()) {
recordError(SMLoc(), "end symbol '" + End.getName() +
"' must be a symbol in the current section");
return;
}
- uint64_t EndOffset;
- if (!getSymbolOffset(End, EndOffset))
+ const MCFragment *EndFrag = End.getFragment();
+ if (EndFrag->getLayoutOrder() <= F.getLayoutOrder())
return;
- // RawStart is the start of the (variable) padding region; StartOffset is
- // the start of the body (RawStart plus current padding). BodySize is
- // measured from StartOffset, not RawStart, so that padding is not counted
- // as part of the body.
- uint64_t RawStart = F.Offset + F.getFixedSize();
- uint64_t StartOffset = RawStart + F.getVarSize();
+ uint64_t BodySize = 0;
+ for (const MCFragment *Cur = F.getNext();; Cur = Cur->getNext()) {
+ if (Cur == EndFrag) {
+ BodySize += End.getOffset();
+ break;
+ }
+ BodySize += computeFragmentSize(*Cur);
+ }
Align NewAlign;
- if (StartOffset < EndOffset) {
- uint64_t BodySize = EndOffset - StartOffset;
+ if (BodySize) {
if (BodySize < F.getPrefAlignPreferred().value())
NewAlign = Align(NextPowerOf2(BodySize - 1));
else
NewAlign = F.getPrefAlignPreferred();
}
F.setPrefAlignComputed(NewAlign);
- // Compute padding to align the body start to NewAlign.
uint64_t NewPadSize = offsetToAlignment(RawStart, NewAlign);
F.VarContentStart = F.getFixedSize();
F.VarContentEnd = F.VarContentStart + NewPadSize;
@@ -1020,7 +1022,7 @@ bool MCAssembler::relaxFragment(MCFragment &F) {
relaxBoundaryAlign(static_cast<MCBoundaryAlignFragment &>(F));
break;
case MCFragment::FT_PrefAlign:
- relaxPrefAlign(F);
+ layoutPrefAlign(F, F.Offset + F.getFixedSize());
break;
case MCFragment::FT_CVInlineLines:
getContext().getCVContext().encodeInlineLineTable(
@@ -1037,11 +1039,16 @@ bool MCAssembler::relaxFragment(MCFragment &F) {
return computeFragmentSize(F) != Size;
}
+// Assign offsets to fragments. While most fragments are relaxed by
+// relaxFragment, alignment fragments are exceptions: their padding
+// depends on the current offset. If computed in relaxFragment,
+// the offset comes from F.Offset set by the previous layoutSection call.
+// When an upstream alignment fragment changes padding, F.Offset becomes
+// stale, causing each relaxOnce iteration to fix only one more fragment
+// — O(N) iterations for N alignment fragments. Computing them here with
+// the tracked Offset avoids this.
void MCAssembler::layoutSection(MCSection &Sec) {
uint64_t Offset = 0;
- // Note: fragments are not relaxed here. Some fragments depend on
- // downstream symbols whose offsets have not been set in this pass yet.
- // They are instead relaxed by relaxFragment.
for (MCFragment &F : Sec) {
F.Offset = Offset;
if (F.getKind() == MCFragment::FT_Align) {
@@ -1067,6 +1074,10 @@ void MCAssembler::layoutSection(MCSection &Sec) {
if (F.VarContentEnd > F.getParent()->ContentStorage.size())
F.getParent()->ContentStorage.resize(F.VarContentEnd);
Offset += Size;
+ } else if (F.getKind() == MCFragment::FT_PrefAlign) {
+ Offset += F.getFixedSize();
+ layoutPrefAlign(F, Offset);
+ Offset += F.getVarSize();
} else {
Offset += computeFragmentSize(F);
}
@@ -1074,7 +1085,7 @@ void MCAssembler::layoutSection(MCSection &Sec) {
}
unsigned MCAssembler::relaxOnce(unsigned FirstStable) {
- ++stats::RelaxationSteps;
+ uint64_t MaxIterations = 0;
PendingErrors.clear();
unsigned Res = 0;
@@ -1082,8 +1093,10 @@ unsigned MCAssembler::relaxOnce(unsigned FirstStable) {
// Assume each iteration finalizes at least one extra fragment. If the
// layout does not converge after N+1 iterations, bail out.
auto &Sec = *Sections[I];
- auto MaxIter = Sec.curFragList()->Tail->getLayoutOrder() + 1;
+ auto Limit = Sec.curFragList()->Tail->getLayoutOrder() + 1;
+ auto MaxIter = Limit;
for (;;) {
+ --MaxIter;
bool Changed = false;
for (MCFragment &F : Sec)
if (F.getKind() != MCFragment::FT_Data && relaxFragment(F))
@@ -1095,11 +1108,13 @@ unsigned MCAssembler::relaxOnce(unsigned FirstStable) {
// sections. Therefore, we must re-evaluate all sections.
FirstStable = Sections.size();
Res = I;
- if (--MaxIter == 0)
+ if (MaxIter == 0)
break;
layoutSection(Sec);
}
+ MaxIterations = std::max(MaxIterations, uint64_t(Limit - MaxIter));
}
+ stats::RelaxationSteps += MaxIterations;
// The subsequent relaxOnce call only needs to visit Sections [0,Res) if no
// change occurred.
return Res;
diff --git a/llvm/test/MC/ELF/prefalign-convergence.s b/llvm/test/MC/ELF/prefalign-convergence.s
new file mode 100644
index 0000000000000..3debc4210a0ef
--- /dev/null
+++ b/llvm/test/MC/ELF/prefalign-convergence.s
@@ -0,0 +1,86 @@
+// REQUIRES: asserts
+// Test that sections with many .prefalign fragments converge in a small
+// number of relaxation steps (not O(N) steps). Without the layoutSection
+// fix, each relaxOnce inner iteration would only correctly resolve one
+// PrefAlign fragment (because subsequent fragments see stale offsets),
+// leading to O(N) iterations. With the fix, layoutSection recomputes all
+// PrefAlign fragments using the tracked offset, converging in 1 iteration.
+
+// RUN: llvm-mc -filetype=obj -triple x86_64 --stats %s -o %t 2>&1 \
+// RUN: | FileCheck %s
+// CHECK: 1 assembler - Number of assembler layout and relaxation steps
+
+// RUN: llvm-objdump -d --no-show-raw-insn %t | FileCheck --check-prefix=DIS %s
+
+.section .text,"ax",@progbits
+.byte 0
+
+// DIS: 8: nop
+.prefalign 16, .Lend0, nop
+.rept 5
+nop
+.endr
+.Lend0:
+
+// DIS: 10: nop
+.prefalign 16, .Lend1, nop
+.rept 5
+nop
+.endr
+.Lend1:
+
+// DIS: 18: nop
+.prefalign 16, .Lend2, nop
+.rept 5
+nop
+.endr
+.Lend2:
+
+// DIS: 20: nop
+.prefalign 16, .Lend3, nop
+.rept 5
+nop
+.endr
+.Lend3:
+
+// DIS: 28: nop
+.prefalign 16, .Lend4, nop
+.rept 5
+nop
+.endr
+.Lend4:
+
+// DIS: 30: nop
+.prefalign 16, .Lend5, nop
+.rept 5
+nop
+.endr
+.Lend5:
+
+// DIS: 38: nop
+.prefalign 16, .Lend6, nop
+.rept 5
+nop
+.endr
+.Lend6:
+
+// DIS: 40: nop
+.prefalign 16, .Lend7, nop
+.rept 5
+nop
+.endr
+.Lend7:
+
+// DIS: 48: nop
+.prefalign 16, .Lend8, nop
+.rept 5
+nop
+.endr
+.Lend8:
+
+// DIS: 50: nop
+.prefalign 16, .Lend9, nop
+.rept 5
+nop
+.endr
+.Lend9: