PHONY: stable-ghc-9.14 additions#119

Draft
angerman wants to merge 97 commits into ghc-9.14 from stable-ghc-9.14

angerman commented Nov 29, 2025

Summary

This PR tracks all additions in stable-ghc-9.14 relative to upstream ghc-9.14.

Cabal-based Multi-Stage Build System

  • Modularize RTS: extract headers (rts-headers) and filesystem utilities (rts-fs) into separate packages
  • Implement cabal-based multi-stage build system (stage0 → stage1 → stage2)
  • Split RTS into sub-libraries (threaded/non-threaded, debug/nodebug variants)
  • Add -no-rts compiler flag for bootstrap builds

Static Linking Improvements

  • Add -fully-static and -exclude-static-external flags
  • Proper support for statically linking executables
  • Better error handling when linking statically
  • Ensure extra-libraries-static is consistently defined

Bundled libffi

  • Add libffi-clib as bundled library (replaces system libffi dependency)
  • Enable PIC on Linux/FreeBSD x86_64

Build System & Tooling

  • Add ghc-toolchain --output-settings support
  • Add genprimopcode --wrappers/--prim-module options
  • Add ghc-config additional fields
  • Add ghc-pkg --target support and mermaid diagram generation
  • Better "could not execute" error messages

CI & Testing

  • Add release workflow
  • Fix fragile tests (T13786, T7040_ghci, T25240, T20604)
  • Skip problematic tests (T14999, uniques test without git repo)
  • Testsuite adjustments for RTS split

Fixes

  • Fix FreeBSD stage2/stage3 builds
  • Fix header copying in Makefile
  • Fix preprocessor flags in RTS
  • Allow building with boot compiler lacking ghc-internal
  • Warn when -dynamic is mixed with -staticlib

luite and others added 2 commits November 25, 2025 18:23

This fixes two problems with handling eager black holes, introduced
by a1de535.

- the closure mutation must be recorded even for eager black holes,
  since the mutator has mutated it before calling threadPaused

- The assertion that an unmarked eager black hole must be owned by
  the TSO calling threadPaused is incorrect, since multiple threads
  can race to claim the black hole.

fixes #26495

(cherry picked from commit 3ba3d9f)
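
The second point above can be illustrated with a deliberately simplified C model. This is only a sketch, not the RTS code: `Thunk`, `eager_claim` and `owned_by` are hypothetical names, and the assumption is that an eager claim is an unsynchronized store of the claiming thread's identity, so a later claimant overwrites an earlier one.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a closure with an "owner" slot. */
typedef struct {
    const void *owner;   /* NULL means nobody has claimed it yet */
} Thunk;

/* Assumption: an eager claim is a plain store with no synchronization,
 * so the last thread to run this wins and earlier claims are lost. */
static void eager_claim(Thunk *t, const void *tso)
{
    t->owner = tso;
}

/* A thread that is being paused can only check ownership; it cannot
 * assume it, because another thread may have claimed the thunk since. */
static bool owned_by(const Thunk *t, const void *tso)
{
    return t->owner == tso;
}
```

If thread A calls `eager_claim` and thread B does the same before A is paused, `owned_by(&t, A)` is false even though A did claim the thunk. Under this model, asserting ownership at pause time is wrong, which matches the reasoning in the commit message.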
…llCache. The original strings are temporary and might be freed at an arbitrary point.

Fixes #26613

(cherry picked from commit 5072da4)

wz1000 and others added 27 commits December 17, 2025 20:07

This change reverts part of !14544, which forces the bootstrap
compiler to have ghc-internal.  As such it breaks booting with
ghc 9.8.4. A better solution would be to make this conditional
on the ghc version in the cabal file!

…ernal

If the boot compiler doesn't have ghc-internal, use "<unavailble>" as the
`cGhcInternalUnitId`.  This allows booting with older compilers. The
subsequent stage2 compilers will have the proper ghc-internal id from
the stage1 compiler that boots them.

mermaid is a common diagram format that can be inlined in markdown
files, and e.g. github will even render it.  This change adds
support for mermaid diagram output to ghc-pkg.

This adds support to ghc-pkg to infer a package-db from a target name.

Make the first simple optimization pass after desugaring a real CoreToDo
pass. This allows CorePlugins to decide whether they want to be executed
before or after this pass.

It's more user-friendly to directly print the right thing instead of
requiring the user to retry with the additional `-dppr-debug` flag.

The referenced issue 20706 also doesn't list T13786 as a broken test.

By mistake we tried to use deriveConstant without passing
`--gcc-flag -fcommon` (which Hadrian does) and it failed.

This patch adds deriveConstant support for constants stored in the .bss
section so that deriveConstant works without passing `-fcommon` to the C
compiler.

This commit restructures the Runtime System (RTS) components for better
modularity and reusability across different build configurations. The
changes enable cleaner separation of concerns and improved support for
cross-compilation scenarios.

Key changes:
- Extract RTS headers into standalone rts-headers package
  * Moved include/rts/Bytecodes.h to rts-headers
  * Moved include/rts/storage/ClosureTypes.h to rts-headers
  * Moved include/rts/storage/FunTypes.h to rts-headers
  * Moved include/stg/MachRegs/* to rts-headers
- Create rts-fs package for filesystem utilities
  * Extracted filesystem code from utils/fs
  * Provides reusable filesystem operations for RTS
- Rename utils/iserv to utils/ghc-iserv for consistency
  * Better naming alignment with other GHC utilities
  * Updated all references throughout the codebase
- Update RTS configuration and build files
  * Modified rts/configure.ac for new structure
  * Updated rts.cabal with new dependencies
  * Adjusted .gitignore for new artifacts

Rationale:
The modularization allows different stages of the compiler build to
share common RTS components without circular dependencies. This is
particularly important for:
- Cross-compilation where host and target RTS differ
- JavaScript backend which needs selective RTS components
- Stage1/Stage2 builds that require different RTS configurations

Contributors:
- Moritz Angermann: RTS modularization architecture and implementation
- Sylvain Henry: JavaScript backend RTS adjustments
- Andrea Bedini: Build system integration

This refactoring maintains full backward compatibility while providing
a cleaner foundation for multi-target support.

This commit introduces a comprehensive cabal-based build infrastructure
to support multi-target and cross-compilation scenarios for GHC. The new
build system provides a clean separation between different build stages
and better modularity for toolchain components.

Key changes:
- Add Makefile with stage1, stage2, and stage3 build targets
- Create separate cabal.project files for each build stage
- Update configure.ac for new build system requirements
- Adapt hie.yaml to support cabal-based builds
- Update GitHub CI workflow for new build process

Build stages explained:
- Stage 1: Bootstrap compiler built with system GHC
- Stage 2: Intermediate compiler built with Stage 1
- Stage 3: Final compiler built with Stage 2 (for validation)

This modular approach enables:
- Clean cross-compilation support
- Better dependency management
- Simplified build process for different targets
- Improved build reproducibility

Contributors:
- Andrea Bedini: Build system design and Makefile implementation
- Moritz Angermann: Cross-compilation infrastructure

The new build system maintains compatibility with existing workflows
while providing a more maintainable foundation for future enhancements.

andreabedini and others added 21 commits January 14, 2026 22:09
#107)

* docs(readme): consolidate building and contributing guides into README

- Delete separate HACKING.md and INSTALL.md files, consolidating their content into README.md for a unified reference
- Update README to reflect Stable Haskell Edition fork with GitHub issue tracker at stable-haskell/ghc
- Revise clone instructions to point to GitHub stable-haskell/ghc repository instead of GitLab
- Simplify build instructions to use make-based build system with clearer GHCup setup steps
- Integrate developer contribution guidelines and communication channels directly into README
- Update dependency references and remove outdated tool links (Happy, Alex)
- Add test suite running instructions to building section
- Reorganize content with clearer section headers for Getting Started, Useful Resources, and communication channels

capture dependencies of configure scripts and generated files

improve cleaning

- Improve the consistency across the stage cabal.project files by using the same ordering and delimiter comments
- "package-dbs: clear, global" is the default so it's removed

The refactoring in 'reorganize cabal.project files' accidentally removed
libraries/ghc-platform from cabal.project.stage1. This package is required
because ghc-boot depends on ghc-platform >= 0.1.

Without this fix, all CI builds fail with:
  Error: [Cabal-7107]
  Could not resolve dependencies:
  unknown package: host:ghc-platform (dependency of host:ghc-boot)

Add entries to prevent AI agent config files from being accidentally
committed. These files contain project-specific instructions for various
AI coding assistants and should remain local.

Covers: Claude Code, GitHub Copilot, Cursor, Gemini CLI/Jules,
OpenAI Codex, and JetBrains Junie.

See: https://agents.md/ for the AGENTS.md standard

This patch teaches GHC how to build the external interpreter program
when it is missing. As long as we have the `ghci` library, doing this is
trivial so most of this patch is refactoring for doing it sanely.

(cherry picked from commit 55eab80)

Avoid overflows in jump tables by using a base label closer to the jump
targets. See added Note [Jump tables]

Commit 76d1041 seems to have introduced this bug, ultimately leading
to failure of test T11788. I can only theorize that this test isn't run
in upstream's CI, because they don't build a static GHC.

The culprit is that we go through the thin archive, trying
to follow the members on the filesystem, but don't
re-identify the new object format of the member. This pins
`object_fmt` to `NotObject` from the thin archive.

Thanks to @angerman for spotting this.

The gc_thread timing fields (gc_start_cpu, gc_end_cpu, gc_start_elapsed,
gc_end_elapsed, gc_sync_start_elapsed) were not being initialized when
gc_threads were allocated. Since gc_threads are allocated with
stgMallocAlignedBytes (which doesn't zero memory), these fields contained
garbage values.

The initialization must be in new_gc_thread(), not init_gc_thread(),
because:

1. new_gc_thread() is called once when a gc_thread is first allocated
2. init_gc_thread() is called at the START of each GC cycle
3. stat_startGC() sets the timing fields BEFORE init_gc_thread() is called
4. If we initialize in init_gc_thread(), we would reset the values that
   stat_startGC() just set, breaking the timing calculations

The garbage values caused wild statistics like:
  gc_elapsed_ns=50426020081527 (14 hours of supposed GC time!)
  exit_elapsed_ns=18446741672370457118 (a negative duration of about -2.4e12 ns, wrapped around as an unsigned 64-bit value)

These were being accumulated into stats and causing all productivity
calculations to fail with massively negative values.
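
The ordering argument above can be sketched in C as follows. This is a minimal model under stated assumptions, not the code in rts/sm/GC.c: the struct and the `*_sketch` functions are hypothetical stand-ins for `gc_thread`, `new_gc_thread()`, `stat_startGC()` and `init_gc_thread()`, and plain `malloc` stands in for the non-zeroing aligned allocator.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced stand-in for the RTS gc_thread struct. */
typedef struct {
    uint64_t gc_start_elapsed;  /* written by the stats code before each GC */
    uint64_t gc_end_elapsed;    /* written when the GC finishes */
    uint32_t scanned;           /* per-cycle scratch state */
} GcThreadSketch;

/* One-time setup. malloc does not zero memory, so the timing fields
 * must be given defined values here. */
static GcThreadSketch *new_gc_thread_sketch(void)
{
    GcThreadSketch *t = malloc(sizeof *t);
    if (t == NULL) return NULL;
    t->gc_start_elapsed = 0;
    t->gc_end_elapsed = 0;
    t->scanned = 0;
    return t;
}

/* Stand-in for stat_startGC(): runs BEFORE the per-cycle init below. */
static void stat_start_sketch(GcThreadSketch *t, uint64_t now)
{
    t->gc_start_elapsed = now;
}

/* Per-cycle reset. It must leave the timing fields alone, otherwise it
 * would wipe the start time that was recorded a moment earlier. */
static void init_gc_thread_sketch(GcThreadSketch *t)
{
    t->scanned = 0;
}

/* Elapsed time of the cycle, computed from the recorded start. */
static uint64_t gc_elapsed_sketch(GcThreadSketch *t, uint64_t now)
{
    t->gc_end_elapsed = now;
    return t->gc_end_elapsed - t->gc_start_elapsed;
}
```

Calling `stat_start_sketch(t, 1000)`, then `init_gc_thread_sketch(t)`, then `gc_elapsed_sketch(t, 1500)` yields 500 in this model. Zeroing the timing fields inside the per-cycle init would instead report 1500.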
Introduce stgCallocAlignedBytes as a zeroing aligned allocator, replacing
stgMallocAlignedBytes. This allows removing ~40 lines of redundant zero/NULL
initializations in new_gc_thread() and initCapability().

Changes:
- Rename stgMallocAlignedBytes to stgCallocAlignedBytes and add memset(0)
- Add deprecated stgMallocAlignedBytes wrapper for backwards compatibility
- Update call sites in GC.c and Capability.c to use stgCallocAlignedBytes
- Remove redundant zero/NULL/false initializations from:
  - new_gc_thread(): timing fields, free_blocks, gc_count, workspace fields
  - initCapability(): most boolean/numeric/pointer fields

The zeroing overhead is negligible (startup-time allocation, ~500-1000 bytes)
while the benefits include:
- Cleaner code with only non-zero initializations remaining
- Safer: new struct fields automatically start at zero
- Catches uninitialized memory bugs (was causing garbage timing values)
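
A zeroing aligned allocator of the kind described above could look roughly like this. It is a sketch only: `calloc_aligned_bytes` is a hypothetical name, the real `stgCallocAlignedBytes` signature and implementation may differ, and C11 `aligned_alloc` is an assumption made here for portability of the example.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical zeroing aligned allocator. `align` must be a power of
 * two; size overflow is not handled in this sketch. */
static void *calloc_aligned_bytes(size_t n, size_t align)
{
    /* C11 aligned_alloc expects the size to be a multiple of the
     * alignment, so round the request up first. */
    size_t rounded = (n + align - 1) & ~(align - 1);
    void *p = aligned_alloc(align, rounded);
    if (p != NULL) {
        /* Zero the whole block so every field starts at 0/NULL/false. */
        memset(p, 0, rounded);
    }
    return p;
}
```

With an allocator like this, a caller only needs to assign the fields whose initial value is non-zero, which is the cleanup the commit describes. Memory obtained this way is released with `free`.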
When a GC cycle straddles the exit boundary (starts before stat_startExit()
but finishes during the exit phase), the calculated exit_gc_elapsed can
exceed the actual exit duration, resulting in negative exit_elapsed_ns.

This occurs because:
1. stat_startExit() captures start_exit_gc_elapsed = stats.gc_elapsed_ns
   (which doesn't include the in-progress GC)
2. When the straddling GC completes, its FULL duration is added to
   stats.gc_elapsed_ns
3. exit_gc_elapsed = stats.gc_elapsed_ns - start_exit_gc_elapsed now
   includes GC time from BEFORE exit started

This was observed on Alpine Linux (musl libc) where different scheduler
behavior or timing granularity makes the race condition more likely to
manifest.

Fix by clamping exit_cpu_ns and exit_elapsed_ns to zero when negative,
matching the existing pattern for mutator_cpu_ns. These statistics are
best-effort approximations, and this edge case is rare.

Also remove WARNs that can fire erroneously in timing edge cases:
- WARN(exit_gc_elapsed > 0) - fires if no GC during exit
- WARN(stats.mutator_elapsed_ns >= 0) - same timing edge case
- WARN(INIT + MUT + GC + EXIT == total) - violated by clamping

See Note [Clamping exit_cpu_ns and exit_elapsed_ns] in rts/Stats.c.
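
The arithmetic behind the straddling-GC case can be sketched in C. This is an illustration under assumptions, not the code in rts/Stats.c: `Time`, `clamp_non_negative` and `exit_elapsed_sketch` are hypothetical names, and times are plain signed nanosecond counts.

```c
#include <stdint.h>

typedef int64_t Time;  /* nanoseconds; signed so differences can go negative */

/* Clamp a computed duration to zero when it comes out negative. */
static Time clamp_non_negative(Time t)
{
    return t < 0 ? 0 : t;
}

/* Exit time = wall-clock time spent in the exit phase minus the GC time
 * attributed to that phase. The GC time is the difference between the
 * running GC total at the end and the total captured when exit started. */
static Time exit_elapsed_sketch(Time exit_start, Time exit_end,
                                Time gc_total_at_exit_start,
                                Time gc_total_at_end)
{
    Time exit_gc = gc_total_at_end - gc_total_at_exit_start;
    return clamp_non_negative((exit_end - exit_start) - exit_gc);
}
```

For example, suppose a GC runs from t=100 to t=200 and the exit phase spans t=190 to t=210. The GC total captured at exit start is 0, because the in-progress GC is not yet counted, and it is 100 at the end. The unclamped result is 20 - 100 = -80, which the clamp turns into 0.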
Add "Stable Haskell Edition" branding to user-visible output while
maintaining drop-in compatibility with upstream GHC:

- ghc --version: Append "(Stable Haskell Edition)" suffix
- ghc -v2 banner: Add edition to verbose compiler banner
- GHCi welcome: Add edition and update URL to GitHub repo
- ghc --info: Add new "Edition" field (keeps "Project name" unchanged)
- Bug reports: Redirect all URLs to github.com/stable-haskell/ghc/issues

All internal identifiers (cProjectVersion, unit IDs, etc.) remain
unchanged to preserve ABI and tool compatibility.

The branding commit changed the bug report URL from
haskell.org/ghc/reportabug to github.com/stable-haskell/ghc/issues.
Update test expectation files to match the new URL output.

Fixes CI failures in T11223_link_order_a_b_2_fail and
T11223_simple_duplicate_lib tests across all platforms.

The comment still referenced the old `linkBinary` name after
the rename to `linkExecutable` in 55ff022.

This commit fixes several critical issues with the RTS object linker that
prevented dynamic GHC builds from loading code correctly.

Key fixes:

1. Detect data vs code references in X86_64_ELF_NONPIC_HACK (Elf.c)
   - The jump island mechanism was incorrectly applied to data references
   - For info table pointers (_con_info symbols), embedding a jump island
     address caused GC crashes ("strange closure type")
   - Now distinguishes R_X86_64_PLT32 (code) from R_X86_64_PC32 (data)
   - Data references use GOT-style indirection through extra->addr instead

2. Preserve dlerror for linker script fallback handling (LoadNativeObjPosix.c)
   - dlerror() reports a pending error only once; the call clears it, so a later call loses the error context
   - Now saves error string before retry logic
   - Fixes misleading error messages when loading fails

3. Promote boot libraries to RTLD_GLOBAL for dynamic code loading (RtsStartup.c)
   - Boot libraries loaded with RTLD_LOCAL weren't visible to dlsym
   - Dynamic object loading failed to resolve symbols from boot libs
   - Now re-opens boot libraries with RTLD_GLOBAL flag at startup

4. Skip loading libc/libm already linked into process (LoadNativeObjPosix.c)
   - Avoids redundant loading and symbol conflicts
   - Checks if library is already resident before calling dlopen

5. Dynamic lookup of stg_interp_constr entry points via dlsym (RtsSymbols.c)
   - Interpreter constructor symbols need runtime resolution
   - Adds dynamic fallback when static symbols unavailable

6. Remove residual debug instrumentation
   - Cleans up debugging code from Evac.c and LoadNativeObjPosix.c
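
The code-versus-data distinction in fix 1 can be sketched as a small classifier. This is not the logic in Elf.c, only an illustration of the idea as the commit message states it: the names `RefKind`, `classify_reloc` and `may_use_jump_island` are hypothetical, and the two relocation numbers are the values assigned by the x86-64 System V psABI.

```c
#include <stdbool.h>
#include <stdint.h>

/* Relocation type numbers from the x86-64 System V psABI. */
enum {
    RELOC_X86_64_PC32  = 2,
    RELOC_X86_64_PLT32 = 4
};

typedef enum { REF_CODE, REF_DATA, REF_OTHER } RefKind;

/* Following the commit message: PLT32 is treated as a code reference,
 * PC32 as a data reference. Everything else is left to other paths. */
static RefKind classify_reloc(uint32_t type)
{
    switch (type) {
    case RELOC_X86_64_PLT32: return REF_CODE;
    case RELOC_X86_64_PC32:  return REF_DATA;
    default:                 return REF_OTHER;
    }
}

/* Only code references may be redirected through a jump island. A data
 * reference (such as an info table pointer) must resolve to the real
 * address, for example through a GOT-style slot. */
static bool may_use_jump_island(uint32_t type)
{
    return classify_reloc(type) == REF_CODE;
}
```

In this model a `_con_info` reference that arrives as PC32 is never handed a jump island address, which is the failure mode the commit describes as causing "strange closure type" crashes.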
This commit adds infrastructure for RTS sublibrary loading in dynamic builds,
enabling the split RTS architecture to work with shared library linking.

Key changes:

1. RTS sublibrary infrastructure (rts/rts.cabal)
   - Define separate sublibraries for RTS components
   - Add proper library dependencies and visibility
   - Configure shared library generation for RTS parts

2. Configure support for dynamic builds (rts/configure.ac)
   - Detect platform-specific dynamic linking requirements
   - Set appropriate linker flags for each sublibrary
   - Handle symbol visibility for exported functions

3. API updates for sublibrary boundaries (rts/include/RtsAPI.h)
   - Adjust exported symbol declarations
   - Ensure proper visibility across sublibrary boundaries

4. AutoApply support for interpreter (rts/AutoApply*.cmm)
   - Add AutoApply.cmm and vector variants (V16, V32, V64)
   - Required for dynamic bytecode interpreter operation

5. Cabal project configuration
   - cabal.project.stage1: Add no-ghc-internal flag for stage1 builds
   - cabal.project.stage2: Configure full RTS with all sublibraries

6. Thread infrastructure (rts/Threads.h)
   - Updates for sublibrary thread handling

This commit addresses compiler and driver issues specific to dynamic GHC
builds, ensuring proper code generation and linking behavior.

Key changes:

1. Make Opt_ExternalDynamicRefs default on all PIC platforms (DynFlags.hs)
   - Previously only enabled for specific configurations
   - Dynamic builds require external dynamic references for proper GOT usage
   - Prevents relocation issues with large code models

2. Pipeline and session handling updates
   - Driver/Pipeline.hs: Handle dynamic linking in compilation pipeline
   - Driver/Session.hs: Session configuration for dynamic builds
   - Driver/Flags.hs: Flag handling for dynamic mode

3. Linker updates for dynamic mode
   - Linker/Executable.hs: Executable linking for dynamic builds
   - Linker/Static.hs: Static linking coordination
   - ByteCode/Linker.hs: Bytecode linker for dynamic interpreter

4. Unit state and GHCi support
   - Unit/State.hs: Package database handling for dynamic libs
   - GHCi/InfoTable.hsc: Info table generation for dynamic mode
   - Tc/Gen/Splice.hs: Template Haskell splice handling

GHC and ghc-iserv load Haskell shared libraries dynamically for Template
Haskell and GHCi. These libraries reference RTS symbols (e.g.,
stg_INTLIKE_closure) that are linked into the executable. Without special
linker flags, those symbols aren't visible to dlopen'd libraries.

This commit adds platform-specific linker flags to export these symbols:

- Linux/FreeBSD: -rdynamic (passes --export-dynamic to ld)
- macOS: -flat_namespace (makes all symbols visible across namespaces)
- Windows: Cannot use --export-all-symbols due to 65535 symbol limit

See Note [ghc-iserv and dynamic symbol export] in ghc-iserv.cabal.in
for detailed explanation of the approach and alternatives considered.

This commit adds build system support for creating dynamic GHC builds,
including Makefile targets, bindist generation, and utility configurations.

Key changes:

1. Makefile enhancements
   - Add DYNAMIC=1 build variable support
   - Create dylib symlinks for macOS dynamic builds
   - Use concrete file target for testsuite-timeout
   - Include ghc-iserv-dyn in tarballs for all targets
   - Proper bindist generation for dynamic builds

2. Utility cabal files (hp2ps.cabal, unlit.cabal)
   - Configure for dynamic linking support
   - Ensure utilities work with dynamic GHC

3. ghc-iserv infrastructure (iservmain.c)
   - Updates for dynamic interpreter server
   - Proper initialization for dynamic linking context

4. Test expectations for Stable Haskell
   - Update bug report URL in test expectations

Usage:
  make DYNAMIC=1 _build/bindist  # Build dynamic GHC bindist

This commit updates the testsuite to handle the split RTS architecture
and dynamic GHC build configuration.

Key changes:

1. testlib.py improvements
   - More robust test driver for dynamic builds
   - Better handling of shared library paths
   - Improved error detection and reporting

2. Test infrastructure (boilerplate.mk)
   - Configure tests for dynamic linking environment
   - Set proper library paths for test execution

3. Test adjustments for RTS split
   - T18072debug: Update grep to match cabal-based RTS naming
   - T23142.hs: Revert module name to fix -Di debug output test
   - keep-cafs-fail.stdout: Update expected output

4. Dynamic linking test updates
   - ghci/linking/dyn/all.T: Adjust for dynamic GHC
   - T2228: Restore expect_broken(7298) for dynamic builds
   - T11531.stderr: Update expected error messages

5. Platform-specific adjustments
   - T10458: Skip on musl with dynamic GHC
   - T11223 tests: Update stderr expectations for Windows

6. Test configuration
   - .gitignore: Add patterns for dynamic test artifacts
   - dynlibs/Makefile: Update for dynamic build testing
   - perf/size/all.T: Adjust size expectations

This commit extends the CI/CD pipeline to build and test dynamic GHC
configurations alongside the existing static builds.

Key changes:

1. ci.yml - Main CI workflow
   - Add DYNAMIC=1 to build matrix
   - Configure dynamic build jobs for Linux and macOS
   - Run ghci-ext tests on dynamic builds (require interpreter)
   - Parallel execution of static and dynamic builds

2. reusable-release.yml - Release workflow
   - Add dynamic GHC builds to release artifacts
   - Generate separate bindists for dynamic configuration
   - Include ghc-iserv-dyn in release tarballs
   - Re-enable release workflow on pull requests for testing

The dynamic build matrix allows testing of:
- Template Haskell with dynamic code loading
- GHCi interactive features
- Dynamic library loading and linking
- Interpreter-based test suites (ghci-ext)

Build configurations:
- Static (default): DYNAMIC=0 or unset
- Dynamic: DYNAMIC=1
6 participants