<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/"><channel><title>Publications on Rootsec</title><link>https://roots.ec/publications/</link><description>Recent publications from Rootsec</description><generator>Hugo -- gohugo.io</generator><language>en-us</language><lastBuildDate>Fri, 14 Aug 2026 00:00:00 +0000</lastBuildDate><atom:link href="https://roots.ec/publications/feed.xml" rel="self" type="application/rss+xml"/><item><title>StackWarp: Breaking AMD SEV-SNP Integrity via Deterministic Stack-Pointer Manipulation through the CPU’s Stack Engine</title><link>https://roots.ec/publications/zhang2026stackwarp/</link><pubDate>Fri, 14 Aug 2026 00:00:00 +0000</pubDate><dc:creator>Ruiyi Zhang</dc:creator><dc:creator>Tristan Hornetz</dc:creator><dc:creator>Daniel Weber</dc:creator><dc:creator>Fabian Thomas</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/zhang2026stackwarp/</guid><description>Confidential Virtual Machines (CVMs), such as AMD SEV-SNP, aim to protect guest operating systems from an untrusted host by encrypting state and constraining privileged control. These platforms promise isolation even in multi-tenant cloud setups where simultaneous multithreading (SMT) remains enabled. While prior attacks focus on the memory hierarchy or execution units, they largely ignore frontend configurations. In this paper, we present StackWarp, a software-based architectural attack exploiting the stack engine on AMD Zen CPUs to modify the stack pointer within an SEV-SNP guest, fully breaking integrity. StackWarp relies on an undocumented bit within a shared model-specific register (MSR) available on AMD Zen 1–5 CPUs that enables or disables the stack engine. 
Our reverse engineering shows that the state of the stack engine is not correctly synchronized across the logical cores, allowing an attacker to deterministically adjust the stack pointer on the sibling logical core across Zen generations, including fully patched Zen 5. We discover StackWarp via a systematic exploration of the MSR space, including undocumented MSRs. By flipping MSR bits, we identify bits that affect SEV-SNP guests running on a sibling logical core. To demonstrate the security impact, we show StackWarp in four end-to-end attacks on SEV-SNP guests: RSA-CRT private-key recovery, OpenSSH password-authentication bypass, and privilege escalations using either sudo or a kernel-mode ROP chain. We conclude with software hardening guidance and argue for a microcode or hardware change that prevents cross-core control of the stack engine when CVMs are active. Our results show that leaving SMT enabled undermines SEV-SNP integrity guarantees today.<br/><br/>Published at USENIX Security 2026.</description><category>USENIX Security</category></item><item><title>InstrSem: Automatically and Generically Inferring Semantics of (Undocumented) CPU Instructions</title><link>https://roots.ec/publications/hetterich2026instrsem/</link><pubDate>Wed, 12 Aug 2026 00:00:00 +0000</pubDate><dc:creator>Lorenz Hetterich</dc:creator><dc:creator>Fabian Thomas</dc:creator><dc:creator>Tristan Hornetz</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/hetterich2026instrsem/</guid><description>Modern CPUs implement complex Instruction Set Architectures (ISAs), yet machine-readable semantics are often incomplete. Worse, many CPUs support undocumented instructions, i.e., bitstrings that execute on hardware but are absent from specifications, leading to potential security vulnerabilities. 
In this paper, we present InstrSem, an ISA-agnostic, modular, fully automated approach to infer instruction semantics from execution behavior alone and provide semantics that are understandable by both humans and machines. Starting from a raw encoding, InstrSem executes it under systematically varied architectural states and synthesizes compact mathematical functions that explain every changed state component. By mutating encoding bits and correlating induced behavioral changes with bit positions, InstrSem then generalizes from a single encoding to a full instruction, recovering register and immediate fields. In contrast to prior work focusing on a single ISA, InstrSem is generic. It requires only a lightweight ISA model and a per-architecture user-space runner and supports fixed- and variable-length encodings (RISC and CISC), memory accesses, and conditional behavior. We evaluate InstrSem on RV64I, AArch64, and LA64, and additionally showcase CISC applicability on a Logitech macro language and partial x86-64. InstrSem automatically recovers correct semantics for over 97.81% of the RV64I base instruction set, and 136 instructions covering 1,009,055,744 instruction encodings within 77 hours for the LA64 instruction set. InstrSem discovers undocumented vector instructions, inconsistencies between QEMU and Loongson hardware, and instructions that crash QEMU. InstrSem enables scalable recovery of instruction semantics, substantially automating reverse engineering across commodity and niche targets and strengthening the foundations for emulation, verification, and security analysis. 
With minimal requirements to support new architectures, its modular design, and human-readable output, InstrSem can aid future security analysis.<br/><br/>Published at USENIX Security 2026.</description><category>USENIX Security</category></item><item><title>Crucible: Retrofitting Commodity CPUs with Vulnerabilities via Transparent Software Emulation</title><link>https://roots.ec/publications/hornetz2026crucible/</link><pubDate>Mon, 18 May 2026 00:00:00 +0000</pubDate><dc:creator>Tristan Hornetz</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/hornetz2026crucible/</guid><description><br/><br/>Published at S&amp;P 2026.</description><category>S&amp;P</category></item><item><title>TDXRay: Microarchitectural Side-Channel Analysis of Intel TDX for Real-World Workloads</title><link>https://roots.ec/publications/hornetz2026tdxray/</link><pubDate>Mon, 18 May 2026 00:00:00 +0000</pubDate><dc:creator>Tristan Hornetz</dc:creator><dc:creator>Hosein Yavarzadeh</dc:creator><dc:creator>Albert Cheu</dc:creator><dc:creator>Adria Gascon</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Daniel Moghimi</dc:creator><dc:creator>Phillipp Schoppmann</dc:creator><dc:creator>Michael Schwarz</dc:creator><dc:creator>Ruiyi Zhang</dc:creator><guid>https://roots.ec/publications/hornetz2026tdxray/</guid><description><br/><br/>Published at S&amp;P 2026.</description><category>S&amp;P</category></item><item><title>RISCy Cache Coherence: Timer-Free Architectural Cache Attacks via Instruction/Data Cache Incoherence</title><link>https://roots.ec/publications/thomas2026riscy/</link><pubDate>Mon, 18 May 2026 00:00:00 +0000</pubDate><dc:creator>Fabian Thomas</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/thomas2026riscy/</guid><description><br/><br/>Published at S&amp;P 2026.</description><category>S&amp;P</category></item><item><title>TREVEX: A Black-Box Detection Framework 
For Data-Flow Transient Execution Vulnerabilities</title><link>https://roots.ec/publications/weber2026trevex/</link><pubDate>Mon, 18 May 2026 00:00:00 +0000</pubDate><dc:creator>Daniel Weber</dc:creator><dc:creator>Fabian Thomas</dc:creator><dc:creator>Leon Trampert</dc:creator><dc:creator>Ruiyi Zhang</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/weber2026trevex/</guid><description><br/><br/>Published at S&amp;P 2026.</description><category>S&amp;P</category></item><item><title>SNPeek: Side-Channel Analysis for Privacy Applications on Confidential VMs</title><link>https://roots.ec/publications/zhang2025snpeek/</link><pubDate>Mon, 23 Feb 2026 00:00:00 +0000</pubDate><dc:creator>Ruiyi Zhang</dc:creator><dc:creator>Albert Cheu</dc:creator><dc:creator>Adria Gascon</dc:creator><dc:creator>Daniel Moghimi</dc:creator><dc:creator>Phillipp Schoppmann</dc:creator><dc:creator>Michael Schwarz</dc:creator><dc:creator>Octavian Suciu</dc:creator><guid>https://roots.ec/publications/zhang2025snpeek/</guid><description>Confidential virtual machines (CVMs) based on trusted execution environments (TEEs) enable new privacy-preserving solutions. Yet, they leave side-channel leakage outside their threat model, shifting the responsibility of mitigating such attacks to developers. However, mitigations are either not generic or too slow for practical use, and developers currently lack a systematic, efficient way to measure and compare leakage across real-world deployments. In this paper, we present SNPeek, an open-source toolkit that offers configurable side-channel tracing primitives on production AMD SEV-SNP hardware and couples them with statistical and machine-learning-based analysis pipelines for automated leakage estimation. 
We apply SNPeek to three representative workloads that are deployed on CVMs to enhance user privacy (private information retrieval, private heavy hitters, and Wasm user-defined functions) and uncover previously unnoticed leaks, including a covert channel that exfiltrates data at 497 kbit/s. The results show that SNPeek pinpoints vulnerabilities and guides low-overhead mitigations based on oblivious memory and differential privacy, giving practitioners a practical path to deploy CVMs with meaningful confidentiality guarantees.<br/><br/>Published at NDSS 2026.</description><category>NDSS</category></item><item><title>Zero-Store Elimination and its Implications on the SIKE Cryptosystem</title><link>https://roots.ec/publications/gerlach2026zerostore/</link><pubDate>Tue, 03 Feb 2026 00:00:00 +0000</pubDate><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Niklas Flentje</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/gerlach2026zerostore/</guid><description>Modern processors spend a significant amount of their execution cycles waiting on memory. Value-based optimizations tackle this bottleneck by optimizing for specific memory content patterns. Zero-store elimination, in particular, skips memory writes for redundant zero values, reducing memory pressure and boosting processor performance. We investigate the state of zero-store elimination in modern Intel processors and design experiments to reverse engineer its properties. We identify the conditions that trigger zero-store elimination and demonstrate how an attacker can selectively induce zero-store elimination. Similar to previous work on pointer prediction on Apple silicon, our analysis reveals that zero-store elimination has severe security implications, reaffirming Intel’s decision to turn off this optimization via microcode updates. 
Our analysis reveals that value-based optimizations extend traditional side-channel attacker models, exposing partial information about the processed values (as opposed to just metadata). This expanded attack surface, created by value-based optimizations, breaks constant-time programming techniques, enabling attacks such as key leakage from Supersingular Isogeny Key Encapsulation (SIKE). We design a zero-store elimination-based attack on SIKE that recovers 208 of the 217 bits of the secret key in 3.7 s. Additionally, we provide a dynamic analysis tool to detect zero-store elimination in programs and verify that it successfully detects SIKE’s weakness toward zero-store elimination. We propose mitigations that allow a tradeoff between security and performance. Our findings caution against the broader implications of value-based optimizations and urge careful consideration of their security risks in future processor designs.<br/><br/>Published at uASC 2026.</description><category>uASC</category></item><item><title>ExfilState: Automated Discovery of Timer-Free Cache Side Channels on ARM CPUs</title><link>https://roots.ec/publications/thomas2025exfilstate/</link><pubDate>Mon, 13 Oct 2025 00:00:00 +0000</pubDate><dc:creator>Fabian Thomas</dc:creator><dc:creator>Michael Torres</dc:creator><dc:creator>Daniel Moghimi</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/thomas2025exfilstate/</guid><description>Microarchitectural attacks and reverse-engineering efforts rely on inferring the cache state of cache lines. While high-resolution timers traditionally enable this, such timers are increasingly restricted or unavailable to unprivileged users on modern ARM64 systems. We introduce a fuzzing-based methodology to automatically discover instruction sequences that leak cache state into architectural state—without timing measurements. 
Our proof-of-concept, ExfilState, uses differential testing, F-score ranking, and covert-channel verification to identify architectural side channels on ARM64 CPUs. Across 160 devices with 37 microarchitectures—including smartphones, laptops, and cloud servers—ExfilState uncovers 5 undocumented side channels, 2 of which are reliably and widely exploitable. We demonstrate their practical impact with a timer-free Spectre variant, a cache-based AES key-recovery attack, and a novel defense mechanism that aborts sensitive algorithms on eviction of victim cache lines. Our findings show that architectural side channels are both real and exploitable, even in environments without timers, broadening the attack surface on modern ARM64 platforms.<br/><br/>Published at CCS 2025.</description><category>CCS</category></item><item><title>RISCover: Automatic Discovery of User-exploitable Architectural Security Vulnerabilities in Closed-Source RISC-V CPUs</title><link>https://roots.ec/publications/thomas2025riscover/</link><pubDate>Mon, 13 Oct 2025 00:00:00 +0000</pubDate><dc:creator>Fabian Thomas</dc:creator><dc:creator>Eric García Arribas</dc:creator><dc:creator>Lorenz Hetterich</dc:creator><dc:creator>Daniel Weber</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Ruiyi Zhang</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/thomas2025riscover/</guid><description>The open and extensible RISC-V instruction set has enabled many new CPU vendors and implementations, but most commercial CPUs are closed-source, significantly hindering vulnerability analysis—especially for bugs exploitable from unprivileged user space. We present RISCover, a user-space framework for detecting architectural vulnerabilities in closed-source RISC-V CPUs. It compares instruction-sequence behavior across CPUs, identifying deviations without source code, hardware changes, or models, and achieving orders-of-magnitude speedups over RTL-based methods. 
Unlike prior work, RISCover runs user code on Linux directly on real hardware, exposing vulnerabilities exploitable by unprivileged attackers. Evaluated on 8 off-the-shelf CPUs from 3 different vendors, it uncovers 4 previously unknown vulnerabilities. Notably, GhostWrite lets unprivileged code write chosen bytes to physical memory, enabling arbitrary data leakage and full machine-mode execution, while 3 unprivileged "halt-and-catch-fire" bugs halt CPUs and misaligned zero-stores silently corrupt data. Our results highlight the pressing need for post-silicon fuzzing techniques. RISCover complements existing RTL-level fuzzers by enabling rapid and automated security analysis of closed-source CPUs.<br/><br/>Published at CCS 2025.</description><category>CCS</category></item><item><title>Styled to Steal: The Overlooked Attack Surface in Email Clients</title><link>https://roots.ec/publications/trampert2025styled/</link><pubDate>Mon, 13 Oct 2025 00:00:00 +0000</pubDate><dc:creator>Leon Trampert</dc:creator><dc:creator>Daniel Weber</dc:creator><dc:creator>Christian Rossow</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/trampert2025styled/</guid><description>Email is still a widely used communication medium, particularly in professional contexts. Standards such as OpenPGP and S/MIME offer encryption while maintaining compatibility with existing infrastructure. Within the end-to-end encryption threat model, email servers are untrusted, which creates opportunities for attackers to inject malicious HTML or CSS into encrypted emails. Although the 2018 Efail attack led to substantial mitigations against direct content exfiltration in such mixed-context scenarios, it remains unclear whether these measures in email clients sufficiently protect encrypted content from more subtle, software-level rendering attacks. In this paper, we show that isolation mechanisms in widely used email client software remain inadequate. 
We present a novel scriptless attack that extracts arbitrary plaintext from encrypted emails using only CSS without requiring JavaScript. Our approach leverages container queries, lazy-loading fonts, and adaptive font ligatures to leak sensitive information without visual clues for the victim. We can incrementally extract unknown textual data from mixed-context emails. This approach undermines the security of email encryption by enabling text exfiltration from encrypted emails in a single shot. We demonstrate the severity of this threat through an end-to-end attack, successfully exfiltrating PGP-encrypted text from an email rendered in the latest version of Mozilla Thunderbird. Furthermore, we show that our technique affects code integrity tools and sanitization techniques reused in software stacks, including Meta's Code Verify. Our findings led to practical mitigations in Thunderbird, as well as a revision of Meta's threat model to include CSS. These results underline the need for robust content isolation in email client software and challenge the assumption that existing mitigations fully prevent encrypted content leakage.<br/><br/>Published at CCS 2025.</description><category>CCS</category></item><item><title>Confusing Value with Enumeration: Studying the Use of CVEs in Academia</title><link>https://roots.ec/publications/schloegel2025cve/</link><pubDate>Wed, 13 Aug 2025 00:00:00 +0000</pubDate><dc:creator>Moritz Schloegel</dc:creator><dc:creator>Daniel Klischies</dc:creator><dc:creator>Simon Koch</dc:creator><dc:creator>David Klein</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Malte Wessels</dc:creator><dc:creator>Leon Trampert</dc:creator><dc:creator>Martin Johns</dc:creator><dc:creator>Mathy Vanhoef</dc:creator><dc:creator>Michael Schwarz</dc:creator><dc:creator>Thorsten Holz</dc:creator><dc:creator>Jo Van Bulck</dc:creator><guid>https://roots.ec/publications/schloegel2025cve/</guid><description>Common Vulnerabilities and Exposures (CVE) IDs 
serve as unique identifiers for security-relevant bugs, facilitating clear communication and tracking of affected products. Originally intended solely for identification, the CVE system has faced increasing criticism due to the misconception that assigning a CVE implies a serious security issue. Notably, academic works on security vulnerabilities often claim CVEs, presumably to demonstrate the practical impact of their methods. We systematically study the use of CVEs in academic papers to better understand the correlation of academic CVEs with real-world implications. To this end, we present the trends we identified through quantitative analysis, qualitative review of published papers, and a user survey. We observe a clear shift towards more frequent use of CVEs in academic papers over the last 25 years, especially in certain research areas. Our qualitative review of 1,803 CVEs claimed in papers published in the past five years reveals that 34% have not been publicly confirmed or were disputed by the maintainers of the affected software, challenging the notion of real-world effects. Our survey of 103 academic reviewers and authors reveals widespread misconceptions about the CVE system and an explicit preference for reporting CVE numbers, but without indicating any implicit bias in the review process. 
We advise caution on using CVEs as a proxy for real-world impact and provide actionable recommendations for the academic security community and practitioners.<br/><br/>Published at USENIX Security 2025.</description><category>USENIX Security</category></item><item><title>SCASE: Automated Secret Recovery via Side-Channel-Assisted Symbolic Execution</title><link>https://roots.ec/publications/weber2025scase/</link><pubDate>Wed, 13 Aug 2025 00:00:00 +0000</pubDate><dc:creator>Daniel Weber</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Leon Trampert</dc:creator><dc:creator>Youheng Lue</dc:creator><dc:creator>Jo Van Bulck</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/weber2025scase/</guid><description>In recent years, there has been an explosion of research on software-based side-channel attacks, which commonly require an in-depth understanding of the victim application to extract sensitive information. With ever more leakage sources and targets, an important remaining challenge is how to automatically reconstruct secrets from side-channel traces. This paper proposes SCASE, a novel methodology for inferring secrets from an opaque victim binary using symbolic execution, guided by a concrete side-channel trace. Our key innovation is in utilizing the memory accesses observed in the side-channel trace to effectively prune the symbolic-execution space, thus avoiding state explosion. To demonstrate the effectiveness of our approach, we introduce Athena, a proof-of-concept framework to automatically recover secrets from Intel SGX enclaves via controlled channels. We show that Athena can automatically recover the 2048-bit secret key of an enclave running RSA within 4 minutes and the 256-bit key from an RC4 KSA implementation within 5 minutes. Furthermore, we demonstrate key recovery of OpenSSL’s 256-bit AES S-Box implementation and recover the inputs to OpenSSL’s binary extended Euclidean algorithm. 
To demonstrate the versatility of our approach beyond cryptographic applications, we further recover the input to a poker-hand evaluator. In conclusion, our findings indicate that constraining symbolic execution via side-channel traces is an effective way to automate software-based side-channel attacks without requiring an in-depth understanding of the victim application.<br/><br/>Published at USENIX Security 2025.</description><category>USENIX Security</category></item><item><title>Taming the Linux Memory Allocator for Rapid Prototyping</title><link>https://roots.ec/publications/zhang2025mapalloc/</link><pubDate>Wed, 09 Jul 2025 00:00:00 +0000</pubDate><dc:creator>Ruiyi Zhang</dc:creator><dc:creator>Tristan Hornetz</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/zhang2025mapalloc/</guid><description>Microarchitectural attacks pose an increasing threat to system security. They enable attackers to extract sensitive information such as cryptographic keys, website usage patterns, or keystrokes. Software-level defenses, such as constant-time implementations, mitigate some attack vectors but impose significant challenges on developers. Operating-system-level mitigations, such as page coloring and memory isolation, address these threats but require intricate kernel modifications and time-consuming workflows, making prototyping new defenses complex. In this paper, we present MAPAlloc (Microarchitectural Prototyping Allocator), a flexible, cross-architecture framework for rapidly prototyping memory allocation-based defenses and attacks on Linux systems. Using a simple domain-specific language, MAPAlloc allows precise control over physical memory allocation on x86, ARMv8, and RISC-V. MAPAlloc enables quick implementation and evaluation of mitigations such as page coloring and novel techniques like layered page coloring, increasing the number of cache colors from 32 to 256 on modern CPUs. 
We demonstrate MAPAlloc’s versatility through case studies that prevent Prime+Probe and DRAMA attacks and reverse-engineer the AMD Zen 4 complex cache-indexing function for use in layered page coloring. Additionally, we prototype a Prime+Probe attack with an incomplete non-linear slice function from previous work by limiting the physical memory using MAPAlloc. Without MAPAlloc, such defense and attack prototypes require complicated modifications of the Linux kernel, making them hard to develop and test. Thus, MAPAlloc is an essential framework for simplifying research in microarchitectural security.<br/><br/>Published at DIMVA 2025.</description><category>DIMVA</category></item><item><title>Rapid Reversing of Non-Linear CPU Cache Slice Functions: Unlocking Physical Address Leakage</title><link>https://roots.ec/publications/rainer2025rapid/</link><pubDate>Mon, 12 May 2025 00:00:00 +0000</pubDate><dc:creator>Mikka Rainer</dc:creator><dc:creator>Lorenz Hetterich</dc:creator><dc:creator>Fabian Thomas</dc:creator><dc:creator>Tristan Hornetz</dc:creator><dc:creator>Leon Trampert</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/rainer2025rapid/</guid><description>Microarchitectural attacks are a growing threat to modern computing systems. CPU caches are an essential but complex element in many microarchitectural attacks, making it crucial to understand the inner workings. Despite progress in reverse-engineering techniques, non-linear cache-slice functions remain challenging to analyze, especially in recent Intel hybrid microarchitectures. In this paper, we introduce a novel approach towards reverse-engineering complex, non-linear cache-slice functions, particularly on modern Intel CPUs with hybrid microarchitectures. 
Our method significantly advances prior work by understanding the specific structure of microarchitectural hash functions, reducing the time required for reverse-engineering from days to minutes. In contrast to prior work, our technique successfully handles systems with 512GB of memory and diverse slice configurations. We present 17 newly identified functions used for cache-slice addressing and extend existing functions to support systems with more DRAM for multiple CPU generations. Additionally, we introduce an unprivileged virtual-to-physical address oracle that is a direct consequence of the complexity of the non-linear slice functions. Our method is particularly effective on modern Intel hybrid CPUs, including Alder Lake and Meteor Lake, where previously used methods for measuring slices or leaking physical addresses are unavailable. In 3 case studies, we validate our approach, demonstrating its effectiveness in executing targeted Spectre attacks on non-attacker-mapped memory, enabling DRAMA attacks, and creating cache eviction sets. Our findings emphasize the increased attack surface introduced by complex cache-slice functions in modern CPUs.<br/><br/>Published at S&amp;P 2025.</description><category>S&amp;P</category></item><item><title>Do Compilers Break Constant-time Guarantees?</title><link>https://roots.ec/publications/gerlach2025compiler/</link><pubDate>Mon, 14 Apr 2025 00:00:00 +0000</pubDate><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Robert Pietsch</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/gerlach2025compiler/</guid><description>Side-channel attacks are a significant concern for the implementation of cryptographic algorithms. Data-oblivious programming is a discipline that helps mitigate side-channel attacks by preventing data leakage over side channels. However, due to various optimizations in modern compilers, data-obliviousness cannot be guaranteed in high-level languages. 
This work investigates to what extent compiler optimizations violate data-obliviousness. To this end, we present the data-oblivious compiler checker (DOCC), an automated binary testing pipeline for detecting data-obliviousness violations under different compiler configurations. We show that DOCC is applicable across 6 widely used compilers. Additionally, DOCC can retrofit existing analysis tools with advanced leakage models, such as data-dependent instruction execution times and data-obliviousness under speculation. We evaluate DOCC on 5 major cryptographic libraries and the recently proposed NIST lightweight cryptography primitives. We reveal data-obliviousness violations in 93 out of the 127 tested algorithms and 1,845 out of the 12,917 test cases across different cryptographic libraries, building blocks, and programming languages. We demonstrate that the choice of compiler and optimizations heavily influences the resulting binary’s properties.<br/><br/>Published at FC 2025.</description><category>FC</category></item><item><title>Lixom: Protecting Encryption Keys with Execute-Only Memory</title><link>https://roots.ec/publications/hornetz2025lixom/</link><pubDate>Mon, 14 Apr 2025 00:00:00 +0000</pubDate><dc:creator>Tristan Hornetz</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/hornetz2025lixom/</guid><description>The confidentiality of cryptographic secrets is crucial for the security of modern computing systems. However, ensuring confidentiality can be difficult in the presence of privileged attackers or transient-execution vulnerabilities such as Meltdown or Spectre. Trusted Execution Environments (TEEs) offer protection but are not always available and may require significant redesigns. In this paper, we present Lixom, a lightweight and generic technique for providing leakage resistance to cryptographic secrets on x86 processors. 
Lixom achieves its confidentiality guarantees by storing secrets in code instead of data and preventing accesses with execute-only memory (XOM). In virtual machines, Lixom can protect secrets from a compromised guest kernel, providing security guarantees comparable to TEEs. Additionally, Lixom provides robust protection against Spectre attacks, Meltdown, and Foreshadow, without impacting the throughput of algorithms such as AES. Lixom is broadly applicable as a hardening mechanism and can tangibly improve the security of applications like disk encryption or digital rights management.<br/><br/>Published at FC 2025.</description><category>FC</category></item><item><title>Peripheral Instinct: How External Devices Breach Browser Sandboxes</title><link>https://roots.ec/publications/trampert2025peripheralinstinct/</link><pubDate>Tue, 08 Apr 2025 00:00:00 +0000</pubDate><dc:creator>Leon Trampert</dc:creator><dc:creator>Lorenz Hetterich</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Mona Schappert</dc:creator><dc:creator>Christian Rossow</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/trampert2025peripheralinstinct/</guid><description>Browser APIs such as WebHID, WebUSB, Web Serial, and Web MIDI enable web applications to interact directly with external devices. The support of such APIs in Chromium-based browsers, such as Chrome and Edge, radically changes the threat model for peripherals and increases the attack surface. In the past, devices could assume a trusted host, i.e., the operating system. Now, the host is a potentially malicious website and cannot be trusted. We show how this changed threat model leads to security and privacy problems, up to a complete compromise of the operating system. While the API specifications list initial security considerations, they shift the responsibility to (unprepared) device vendors. 
We systematically analyze the security implications of external devices exposed by such new APIs. By reverse-engineering peripheral devices of several popular vendors, we show that many vendors allow controlling devices via Web APIs up to reprogramming or even fully replacing the firmware. Consequently, web attackers can reprogram devices with malicious payloads and custom firmware without requiring any physical interaction. To demonstrate the security implications, we build several full-chain exploits, leading to arbitrary code execution on the victim system, circumventing the browser sandbox. Our research shows that browser security should not rely on the secure implementation of third-party hardware.<br/><br/>Published at WWW 2025.</description><category>Browser</category><category>Browser Security</category><category>WWW</category></item><item><title>ShadowLoad: Injecting State into Hardware Prefetchers</title><link>https://roots.ec/publications/hetterich2025shadowload/</link><pubDate>Sun, 30 Mar 2025 00:00:00 +0000</pubDate><dc:creator>Lorenz Hetterich</dc:creator><dc:creator>Fabian Thomas</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Ruiyi Zhang</dc:creator><dc:creator>Nils Bernsdorf</dc:creator><dc:creator>Eduard Ebert</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/hetterich2025shadowload/</guid><description>Hardware prefetchers are an optimization in modern CPUs that predict memory accesses and preemptively load the corresponding value into the cache. Previous work showed that the internal state of hardware prefetchers can act as a side channel, leaking information across security boundaries such as processes, user and kernel space, and even trusted execution environments. In this paper, we present ShadowLoad, a new attack primitive to bring inaccessible victim data into the cache by injecting state into the hardware prefetcher. 
ShadowLoad relies on the inner workings of the hardware stride prefetchers, which we automatically reverse-engineer using our tool StrideRE. We illustrate how ShadowLoad extends the attack surface of existing microarchitectural attacks such as Meltdown and software-based power analysis attacks like Collide+Power and how it can partially bypass L1TF mitigations on clouds, such as AWS. We further demonstrate FetchProbe, a stride prefetcher side-channel attack leaking offsets of memory accesses with sub-cache-line granularity, extending previous work on control-flow leakage. We demonstrate FetchProbe on the side-channel hardened Base64 implementation of WolfSSL, showing that even real-world side-channel-hardened implementations can be attacked with our new attack.<br/><br/>Published at ASPLOS 2025.</description><category>ASPLOS</category></item><item><title>Cascading Spy Sheets: Exploiting the Complexity of Modern CSS for Email and Browser Fingerprinting</title><link>https://roots.ec/publications/trampert2025cascadingspysheets/</link><pubDate>Sun, 23 Feb 2025 00:00:00 +0000</pubDate><dc:creator>Leon Trampert</dc:creator><dc:creator>Daniel Weber</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Christian Rossow</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/trampert2025cascadingspysheets/</guid><description>In an attempt to combat user tracking, both privacy-aware browsers (e.g., Tor) and email applications usually disable JavaScript. This effectively closes a major angle for user fingerprinting. However, recent findings hint at the potential for privacy leakage through selected Cascading Style Sheets (CSS) features. Nevertheless, the full fingerprinting potential of CSS remains unknown, and it is unclear if attacks apply to more restrictive settings such as email. 
In this paper, we systematically investigate the modern dynamic features of CSS and their applicability for script-less fingerprinting, bypassing many state-of-the-art mitigations. We present three innovative techniques based on fuzzing and templating that exploit nuances in CSS container queries, arithmetic functions, and complex selectors. This allows us to infer detailed application, OS, and hardware configurations at high accuracy. For browsers, we can distinguish 97.95% of 1176 tested browser-OS combinations. Our methods also apply to email applications, as shown for 8 out of 21 tested web, desktop, or mobile email applications. This demonstrates that fingerprinting is possible in the highly restrictive setting of HTML emails and expands the scope of tracking beyond traditional web environments. In response to these and potential future CSS-based tracking capabilities, we propose two defense mechanisms that eliminate the root causes of privacy leakage. For browsers, we propose to preload conditional resources, which eliminates feature-dependent leakage. For the email setting, we design an email proxy service that retains privacy and email integrity while largely preserving feature compatibility. 
Our work provides new insights and solutions to the ongoing privacy debate, highlighting the importance of robust defenses against emerging tracking methods.<br/><br/>Published at NDSS 2025.</description><category>Browser</category><category>Browser Fingerprinting</category><category>Email Security</category><category>NDSS</category></item><item><title>PortPrint: Identifying Inaccessible Code with Port Contention</title><link>https://roots.ec/publications/hornetz2025portprint/</link><pubDate>Wed, 19 Feb 2025 00:00:00 +0000</pubDate><dc:creator>Tristan Hornetz</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/hornetz2025portprint/</guid><description>In many real-world scenarios, being able to infer specific software versions or variations of cryptographic libraries is critical to mounting targeted exploits. For this, traditional version-detection approaches often rely on direct inspection of programs. However, modern computing platforms frequently employ protection for code, e.g., using execute-only memory (XOM) or trusted execution environments (TEE) to safeguard sensitive code from disclosure and reverse engineering. This paper demonstrates how side-channel measurements via CPU port contention reveal distinctive execution signatures, even when code is inaccessible for inspection. Our proof-of-concept implementation PortPrint identifies cryptographic functions, reveals library versions, and even uncovers whether a WolfSSL build is vulnerable to CVE-2024-1544 or if Spectre mitigations are active in Xen. We verify that PortPrint works despite state-of-the-art code protection mechanisms, such as memory protection keys, hypervisor-based XOM, Intel SGX, Intel TDX, and AMD SEV. We also report a negative result for leaking code protected with these techniques using Meltdown and Foreshadow, providing valuable insights into the limitations of these attacks. 
Our results show that hardware-based isolation is insufficient to conceal instruction streams.<br/><br/>Published at uASC 2025.</description><category>uASC</category></item><item><title>Hidden in Plain Sight: Scriptless Microarchitectural Attacks via TrueType Font Hinting</title><link>https://roots.ec/publications/trampert2025hiddenplainsight/</link><pubDate>Wed, 19 Feb 2025 00:00:00 +0000</pubDate><dc:creator>Leon Trampert</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/trampert2025hiddenplainsight/</guid><description>Microarchitectural attacks threaten system security and privacy, especially if they can be mounted without native code execution. Recent research has shown that such attacks are possible from within web browsers via JavaScript and WebAssembly. Moreover, recent works have demonstrated that 'scriptless' attacks, using only CSS, can be leveraged for side-channel attacks, including cache contention and user fingerprinting. In this paper, we introduce a new class of scriptless attacks that use the hinting instructions embedded within TrueType font files. We show that the hinting language is sufficiently robust to craft cache attacks, demonstrating cache-contention attacks and precise L1 Prime+Probe attacks. We demonstrate a website fingerprinting attack, as well as a method to track which page of a PDF is currently displayed. Our results demonstrate the practicality of font-based scriptless attacks in real-world scenarios. 
This emphasizes the need for future mitigations that go beyond traditional scripting languages.<br/><br/>Published at uASC 2025.</description><category>Browser</category><category>Website Fingerprinting</category><category>uASC</category></item><item><title>No Leakage Without State Change: Repurposing Configurable CPU Exceptions to Prevent Microarchitectural Attacks</title><link>https://roots.ec/publications/weber2024irqguard/</link><pubDate>Mon, 09 Dec 2024 00:00:00 +0000</pubDate><dc:creator>Daniel Weber</dc:creator><dc:creator>Leonard Niemann</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Jan Reineke</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/weber2024irqguard/</guid><description>Microarchitectural side-channel attacks have become significant threats to computer system security. While writing side-channel-resistant code can mitigate these attacks, it is time-consuming and error-prone. Detection approaches provide an alternative by monitoring the system for signs of ongoing attacks. However, distinguishing between malicious and benign processes is complex, error-prone, and ineffective against sophisticated attacks. In this paper, we propose a novel approach, IRQGuard, which shifts the focus to proactive mitigation. IRQGuard enables the victim to monitor its own microarchitectural events resulting from microarchitectural state changes. Leveraging existing CPU features, IRQGuard uses interrupt requests (IRQs) triggered by victim-specific microarchitectural state changes within predefined code regions. This self-monitoring eliminates noise from unrelated applications, enabling immediate detection and response to potential attacks. Our proof-of-concept implementation demonstrates that IRQGuard stops information leakage in under 200 CPU cycles, outperforming current methods significantly. We evaluate IRQGuard on both cryptographic (OpenSSL) and non-cryptographic (toilet command-line utility) applications. 
We demonstrate IRQGuard's real-world viability by protecting an OpenSSH server from cache attacks. IRQGuard offers a practical, low-overhead solution for mitigating a wide range of microarchitectural attacks on Intel, AMD, and Arm CPUs.<br/><br/>Published at ACSAC 2024.</description><category>ACSAC</category></item><item><title>CacheWarp: Software-based Fault Injection using Selective State Reset</title><link>https://roots.ec/publications/zhang2024cachewarp/</link><pubDate>Wed, 14 Aug 2024 00:00:00 +0000</pubDate><dc:creator>Ruiyi Zhang</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Daniel Weber</dc:creator><dc:creator>Lorenz Hetterich</dc:creator><dc:creator>Youheng Lü</dc:creator><dc:creator>Andreas Kogler</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/zhang2024cachewarp/</guid><description>AMD SEV is a trusted-execution environment (TEE), providing confidentiality and integrity for virtual machines (VMs). With AMD SEV, it is possible to securely run VMs on an untrusted hypervisor. While previous attacks demonstrated architectural shortcomings of earlier SEV versions, AMD claims that SEV-SNP prevents all attacks on the integrity. In this paper, we introduce CacheWarp, a new software-based fault attack on AMD SEV-ES and SEV-SNP, exploiting the possibility to architecturally revert modified cache lines of guest VMs to their previous (stale) state. Unlike previous attacks on the integrity, CacheWarp is not mitigated on the newest SEV-SNP implementation, and it does not rely on specifics of the guest VM. CacheWarp only has to interrupt the VM at an attacker-chosen point to invalidate modified cache lines without them being written back to memory. Consequently, the VM continues with architecturally stale data. 
In 3 case studies, we demonstrate an attack on RSA in the Intel IPP crypto library, recovering the entire private key, logging into an OpenSSH server without authentication, and escalating privileges to root via the sudo binary. While we implement a software-based mitigation proof-of-concept, we argue that mitigations are difficult, as the root cause is in the hardware.<br/><br/>Published at USENIX Security 2024.</description><category>USENIX Security</category></item><item><title>Switchpoline: A Software Mitigation for Spectre-BTB and Spectre-BHB on ARMv8</title><link>https://roots.ec/publications/bauer2024switchpoline/</link><pubDate>Mon, 01 Jul 2024 00:00:00 +0000</pubDate><dc:creator>Markus Bauer</dc:creator><dc:creator>Lorenz Hetterich</dc:creator><dc:creator>Christian Rossow</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/bauer2024switchpoline/</guid><description>Spectre-BTB, also known as Spectre Variant 2, is often considered the most dangerous Spectre variant. While there are widely-deployed software workarounds on x86, such as Retpoline, there are no automated software workarounds for protecting generic userspace applications on ARMv8. Moreover, hardware solutions do not consider in-place mistraining or variants such as branch-history injection (Spectre-BHI), also known as Spectre-BHB. In this paper, we introduce Switchpoline, the first automated Spectre-BTB and Spectre-BHB software workaround protecting C and C++ userspace applications on ARMv8 against all variants of Spectre-BTB and Spectre-BHB. The main security of Switchpoline is that eliminating indirect branches eliminates attacks on indirect branches. Switchpoline is based on a static compiler pass and a dynamic just-in-time (JIT) compiler component that rewrite indirect control-flow transfers into direct control-flow transfers. 
Switchpoline successfully prevents Spectre-BTB and Spectre-BHB in userspace applications with a negligible mean performance overhead of 1.8 % measured in the SPEC CPU 2017 benchmark. Moreover, unlike many x86-specific mitigations, Switchpoline is compatible with existing orthogonal defenses, such as (hardware) CFI or Spectre-PHT mitigations. Hence, Switchpoline is a practical generic software mitigation on ARMv8.<br/><br/>Published at ASIACCS 2024.</description><category>ASIACCS</category></item><item><title>Efficient and Generic Microarchitectural Hash-Function Recovery</title><link>https://roots.ec/publications/gerlach2024hash/</link><pubDate>Mon, 20 May 2024 00:00:00 +0000</pubDate><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Simon Schwarz</dc:creator><dc:creator>Nicolas Faroß</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/gerlach2024hash/</guid><description>Modern CPUs use a variety of undocumented microarchitectural hash functions to efficiently distribute data within microarchitectural structures such as caches. A well-known function is the cache slice function that distributes cache lines to the slices of the last-level cache. Knowing these functions improves microarchitectural attacks, such as Prime+Probe or Rowhammer, drastically. However, while several such linear functions have been reverse-engineered, there is no generic or automated approach for reverse-engineering non-linear functions, which have become common with modern CPUs. In this paper, we introduce a novel generic approach for automatically reverse-engineering a wide range of microarchitectural hash functions. Our approach combines techniques initially used for logic-gate minimization and from computer algebra to infer the hash functions based on input-output pairs observed via side channels. With our framework, we infer 3 previously-unknown non-linear hash functions on both AMD and Intel CPUs, including the new Alder Lake hybrid-CPU architecture. 
We verify our approach by reproducing known hash functions and evaluating side-channel attacks that rely on these functions, resulting in success rates above 97.65%. We stress the need to design such functions with both performance and security in mind and discuss alternative designs that can be used in future CPUs.<br/><br/>Published at S&amp;P 2024.</description><category>S&amp;P</category></item><item><title>FetchBench: Systematic Identification and Characterization of Proprietary Prefetchers</title><link>https://roots.ec/publications/schlueter2023fetchbench/</link><pubDate>Sun, 26 Nov 2023 00:00:00 +0000</pubDate><dc:creator>Till Schlüter</dc:creator><dc:creator>Amit Choudhari</dc:creator><dc:creator>Lorenz Hetterich</dc:creator><dc:creator>Leon Trampert</dc:creator><dc:creator>Hamed Nemati</dc:creator><dc:creator>Ahmad Ibrahim</dc:creator><dc:creator>Michael Schwarz</dc:creator><dc:creator>Christian Rossow</dc:creator><dc:creator>Nils Ole Tippenhauer</dc:creator><guid>https://roots.ec/publications/schlueter2023fetchbench/</guid><description>Prefetchers are features in modern CPUs that allow speculative fetching of memory based on predictions on future memory use of applications. Different CPU models may use different prefetcher types, and two implementations of the same prefetcher can differ in detail in their characteristics, leading to distinct runtime behavior. For a few implementations, security researchers showed through manual analysis how to exploit specific prefetchers to leak secret data. Identifying such vulnerabilities required tedious reverse-engineering as prefetcher implementations are proprietary and undocumented. So far, no systematic study of prefetchers in common CPUs is available, preventing further security assessment. In this work, we address the following question: How can we systematically identify and characterize under-specified prefetchers in proprietary processors? 
To answer this question, we systematically analyze approaches to prefetching, design cross-platform tests to identify and characterize them on a given CPU, and demonstrate that our implementation FetchBench can characterize prefetchers on 14 different ARM and x86-64 CPUs. For example, FetchBench uncovers and characterizes a previously unknown replay-based prefetcher on the ARM Cortex-A72 CPU. Based on these findings, we demonstrate two novel attacks that exploit this undocumented prefetcher as a side channel to leak secret information, even from the secure TrustZone into the normal world.<br/><br/>Published at CCS 2023.</description><category>CCS</category></item><item><title>Honey, I Cached our Security Tokens - Re-usage of Security Tokens in the Wild</title><link>https://roots.ec/publications/trampert2023honey/</link><pubDate>Mon, 16 Oct 2023 00:00:00 +0000</pubDate><dc:creator>Leon Trampert</dc:creator><dc:creator>Ben Stock</dc:creator><dc:creator>Sebastian Roth</dc:creator><guid>https://roots.ec/publications/trampert2023honey/</guid><description>In order to mitigate the effect of Web attacks, modern browsers support a plethora of different security mechanisms. Mechanisms such as anti-Cross-Site Request Forgery (CSRF) tokens or nonces in a Content Security Policy rely on a random number that must only be used once. Notably, those Web security mechanisms are shipped through HTML tags or HTTP response headers from the server to the client side. To decrease the server load and the traffic burden on the server infrastructure, many Web applications are served via a Content Delivery Network (CDN), which caches certain responses from the server to deliver them to multiple clients. This, however, affects not only the content but also the settings of the security mechanisms deployed via HTML meta tags or HTTP headers. If those are also cached, their content is fixed, and the security tokens are no longer random for each request. 
Even if the responses are not cached, operators may re-use tokens, as generating random numbers that are unique for each request introduces additional complexity for preserving the state on the server side. This work sheds light on the re-usage of security tokens in the wild, investigates what caused the static tokens, and elaborates on the security impact of the non-random security tokens.<br/><br/>Published at RAID 2023.</description><category>Web Security</category><category>CDN</category><category>RAID</category></item><item><title>A Rowhammer Reproduction Study Using the Blacksmith Fuzzer</title><link>https://roots.ec/publications/gerlach2023blacksmithrepro/</link><pubDate>Mon, 25 Sep 2023 00:00:00 +0000</pubDate><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Fabian Thomas</dc:creator><dc:creator>Robert Pietsch</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/gerlach2023blacksmithrepro/</guid><description>Rowhammer is a hardware vulnerability that can be exploited to induce bit flips in dynamic random access memory (DRAM), compromising the security of a computer system. Multiple ways of exploiting Rowhammer have been shown, and even in the presence of mitigations such as target row refresh (TRR), DRAM modules remain partially vulnerable. In this paper, we present a large-scale reproduction study on the Rowhammer vulnerability using the Blacksmith Rowhammer fuzzer. The main focus of our study is the impact of the fuzzing environment. Our study uses a diverse set of 10 DRAM chips from various manufacturers, with different capacities and memory frequencies. We show that the runtime, used seeds, and DRAM coverage of the fuzzer have been underestimated in previous work. Additionally, we study the entire hardware setup’s impact on the transferability of Rowhammer by fuzzing the same DRAM on 4 identical machines. 
The transferability study heavily relates to Rowhammer-based physically unclonable functions (PUFs) which rely on the stability of Rowhammer-induced bit flips. Our results confirm the findings of the Blacksmith fuzzer, showing that even modern DRAM chips are vulnerable to Rowhammer. In addition, we show that PUFs are challenging to achieve on commodity systems due to the high variability of Rowhammer bit flips.<br/><br/>Published at ESORICS 2023.</description><category>ESORICS</category></item><item><title>Indirect Meltdown: Building Novel Side-Channel Attacks from Transient Execution Attacks</title><link>https://roots.ec/publications/weber2023masc/</link><pubDate>Mon, 25 Sep 2023 00:00:00 +0000</pubDate><dc:creator>Daniel Weber</dc:creator><dc:creator>Fabian Thomas</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Ruiyi Zhang</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/weber2023masc/</guid><description>The transient-execution attack Meltdown leaks sensitive information by transiently accessing inaccessible data during out-of-order execution. Although Meltdown is fixed in hardware for recent CPU generations, most currently-deployed CPUs have to rely on software mitigations, such as KPTI. Still, Meltdown is considered non-exploitable on current systems. In this paper, we show that adding another layer of indirection to Meltdown transforms a transient-execution attack into a side-channel attack, leaking metadata instead of data. We show that despite software mitigations, attackers can still leak metadata from other security domains by observing the success rate of Meltdown on non-secret data. With LeakIDT, we present the first cache-line granular monitoring of kernel addresses. LeakIDT allows an attacker to obtain cycle-accurate timestamps for attacker-chosen interrupts. We use our attack to get accurate inter-keystroke timings and fingerprint visited websites. 
While we propose a low-overhead software mitigation to prevent the exploitation of LeakIDT, we emphasize that the side-channel aspect of transient-execution attacks should not be underestimated.<br/><br/>Published at ESORICS 2023.</description><category>ESORICS</category></item><item><title>Reviving Meltdown 3a</title><link>https://roots.ec/publications/weber2023meltdown3a/</link><pubDate>Mon, 25 Sep 2023 00:00:00 +0000</pubDate><dc:creator>Daniel Weber</dc:creator><dc:creator>Fabian Thomas</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Ruiyi Zhang</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/weber2023meltdown3a/</guid><description>Since the initial discovery of Meltdown and Spectre in 2017, different variants of these attacks have been discovered. One often overlooked variant is Meltdown 3a, also known as Meltdown-CPL-REG. Even though Meltdown-CPL-REG was initially discovered in 2018, the available information regarding the vulnerability is still sparse. In this paper, we analyze Meltdown-CPL-REG on 19 different CPUs from different vendors using an automated tool. We observe that the impact is more diverse than documented and differs from CPU to CPU. Surprisingly, while the newest Intel CPUs do not seem affected by Meltdown-CPL-REG, the newest available AMD CPUs (Zen3+) are still affected by the vulnerability. Furthermore, given our attack primitive CounterLeak, we show that despite up-to-date patches, Meltdown-CPL-REG can still be exploited as we reenable performance-counter-based attacks on cryptographic algorithms, break KASLR, and mount Spectre attacks. 
Although Meltdown-CPL-REG is not as powerful as other transient-execution attacks, its attack surface should not be underestimated.<br/><br/>Published at ESORICS 2023.</description><category>ESORICS</category></item><item><title>Collide+Power: Leaking Inaccessible Data with Software-based Power Side Channels</title><link>https://roots.ec/publications/kogler2023collidepower/</link><pubDate>Wed, 09 Aug 2023 00:00:00 +0000</pubDate><dc:creator>Andreas Kogler</dc:creator><dc:creator>Jonas Juffinger</dc:creator><dc:creator>Lukas Giner</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Martin Schwarzl</dc:creator><dc:creator>Michael Schwarz</dc:creator><dc:creator>Daniel Gruss</dc:creator><dc:creator>Stefan Mangard</dc:creator><guid>https://roots.ec/publications/kogler2023collidepower/</guid><description>Differential Power Analysis (DPA) measures single-bit differences between data values used in computer systems by statistical analysis of power traces. In this paper, we show that the mere co-location of data values, e.g., attacker and victim data in the same buffers and caches, leads to power leakage in modern CPUs that depends on a combination of both values, resulting in a novel attack, Collide+Power. We systematically analyze the power leakage of the CPU's memory hierarchy to derive precise leakage models enabling practical end-to-end attacks. These attacks can be conducted in software with any signal related to power consumption, e.g., power consumption interfaces or throttling-induced timing variations. Leakage due to throttling requires 133.3 times more samples than direct power measurements. We develop a novel differential measurement technique amplifying the exploitable leakage by a factor of 8.778 on average, compared to a straightforward DPA approach. We demonstrate that Collide+Power leaks single-bit differences from the CPU's memory hierarchy with fewer than 23000 measurements. 
Collide+Power varies attacker-controlled data in our end-to-end DPA attacks. We present a Meltdown-style attack, leaking from attacker-chosen memory locations, and a faster MDS-style attack, which leaks 4.82 bit/h. Collide+Power is a generic attack applicable to any modern CPU, arbitrary memory locations, and victim applications and data. However, the Meltdown-style attack is not yet practical, as it is limited by the state of the art of prefetching victim data into the cache, leading to an unrealistic real-world attack runtime with throttling of more than a year for a single bit. Given the different variants and potentially more practical prefetching methods, we consider Collide+Power a relevant threat that is challenging to mitigate.<br/><br/>Published at USENIX Security 2023.</description><category>USENIX Security</category></item><item><title>(M)WAIT for It: Bridging the Gap between Microarchitectural and Architectural Side Channels</title><link>https://roots.ec/publications/zhang2023mwait/</link><pubDate>Wed, 09 Aug 2023 00:00:00 +0000</pubDate><dc:creator>Ruiyi Zhang</dc:creator><dc:creator>Taehyun Kim</dc:creator><dc:creator>Daniel Weber</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/zhang2023mwait/</guid><description>In the last years, there has been a rapid increase in microarchitectural attacks, exploiting side effects of various parts of the CPU. Most of them have in common that they rely on timing differences, requiring a high-resolution timer to make microarchitectural states visible to an attacker. In this paper, we present a new primitive that converts microarchitectural states into architectural states without relying on time measurements. We exploit the unprivileged idle-loop optimization instructions umonitor and umwait introduced with the new Intel microarchitectures (Tremont and Alder Lake). 
Although not documented, these instructions provide architectural feedback about the transient usage of a specified memory region. In three case studies, we show the versatility of our primitive. First, with Spectral, we present a way of enabling transient-execution attacks to leak bits architecturally with up to 200 kbit/s without requiring any timer. Second, we show traditional side-channel attacks without relying on a timer. Finally, we demonstrate that when augmented with a coarse-grained timer, we can also mount interrupt-timing attacks, allowing us to, e.g., detect which website a user opens. Our case studies highlight that the boundary between architecture and microarchitecture becomes more and more blurry, leading to new attack variants and complicating effective countermeasures.<br/><br/>Published at USENIX Security 2023.</description><category>USENIX Security</category></item><item><title>Hammulator: Simulate Now - Exploit Later</title><link>https://roots.ec/publications/thomas2023hammulator/</link><pubDate>Sat, 17 Jun 2023 00:00:00 +0000</pubDate><dc:creator>Fabian Thomas</dc:creator><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/thomas2023hammulator/</guid><description>Rowhammer, first considered a reliability issue, turned out to be a significant threat to the security of systems. Hence, several mitigation techniques have been proposed to prevent the exploitation of the Rowhammer effect. Consequently, attackers developed more sophisticated hammering and exploitation techniques to circumvent mitigations. Still, the development and testing of Rowhammer exploits can be a tedious process, taking multiple hours to get the bit flip at the correct location. In this paper, we propose Hammulator, an open-source rapid-prototyping framework for Rowhammer exploits. 
We simulate the Rowhammer effect using the gem5 simulator and DRAMsim3 model, with a parameterizable implementation that allows researchers to simulate various types of systems. Hammulator enables faster and more deterministic bit flips, facilitating the development of Rowhammer proof-of-concept exploits and defenses. We evaluate our simulator by reproducing 2 open-source Rowhammer exploits. We also evaluate 2 previously proposed mitigations, PARA and TRR, in our simulator. Additionally, our micro- and macrobenchmarks show that our simulator has a small average overhead in the range of 6.96 % to 10.21 %. Our results show that Hammulator can be used to compare Rowhammer exploits objectively by providing a consistent testing environment. Hammulator and all experiments and evaluations are open source, hoping to ease the research on Rowhammer.<br/><br/>Published at DRAMSec 2023.</description><category>DRAMSec</category></item><item><title>CustomProcessingUnit: Reverse Engineering and Customization of Intel Microcode</title><link>https://roots.ec/publications/borrello2023cpu/</link><pubDate>Thu, 25 May 2023 00:00:00 +0000</pubDate><dc:creator>Pietro Borrello</dc:creator><dc:creator>Catherine Easdon</dc:creator><dc:creator>Martin Schwarzl</dc:creator><dc:creator>Roland Czerny</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/borrello2023cpu/</guid><description>Microcode provides an abstraction layer over the instruction set to decompose complex instructions into simpler micro-operations that can be more easily implemented in hardware. It is an essential optimization to simplify the design of x86 processors. However, introducing an additional layer of software beneath the instruction set poses security and reliability concerns. The microcode details are confidential to the manufacturers, preventing independent auditing or customization of the microcode. 
Moreover, microcode patches are signed and encrypted to prevent unauthorized patching and reverse engineering. However, recent research has recovered decrypted microcode and reverse-engineered read/write debug mechanisms on Intel Goldmont (Atom), making analysis and customization of microcode possible on a modern Intel microarchitecture. In this work, we present the first framework for static and dynamic analysis of Intel microcode. Building upon prior research, we reverse-engineer Goldmont microcode semantics and reconstruct the patching primitives for microcode customization. For static analysis, we implement a Ghidra processor module for decompilation and analysis of decrypted microcode. For dynamic analysis, we create a UEFI application that can trace and patch microcode to provide complete microcode control on Goldmont systems. Leveraging our framework, we reverse-engineer the confidential Intel microcode update algorithm and perform the first security analysis of its design and implementation. In three further case studies, we illustrate the potential security and performance benefits of microcode customization. 
We provide the first x86 Pointer Authentication Code (PAC) microcode implementation and its security evaluation, design and implement fast software breakpoints that are more than 1000x faster than standard breakpoints, and present constant-time microcode division, illustrating the potential security and performance benefits of microcode customization.<br/><br/>Published at WOOT 2023.</description><category>WOOT</category></item><item><title>A Security RISC: Microarchitectural Attacks on Hardware RISC-V CPUs</title><link>https://roots.ec/publications/gerlach2023riscv/</link><pubDate>Mon, 22 May 2023 00:00:00 +0000</pubDate><dc:creator>Lukas Gerlach</dc:creator><dc:creator>Daniel Weber</dc:creator><dc:creator>Ruiyi Zhang</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/gerlach2023riscv/</guid><description>Microarchitectural attacks threaten the security of computer systems even in the absence of software vulnerabilities. Such attacks are well explored on x86 and ARM CPUs, with a wide range of proposed but not-yet deployed hardware countermeasures. With the standardization of the RISC-V instruction set architecture and the announcement of support for the architecture by major processor vendors, RISC-V CPUs are on the verge of becoming ubiquitous. However, the microarchitectural attack surface of the first commercially available RISC-V hardware CPUs is not yet explored. This paper analyzes the two commercially-available off-the-shelf 64-bit RISC-V (hardware) CPUs used in most RISC-V systems running a full-fledged commodity Linux system. We evaluate the microarchitectural attack surface, which leads to the introduction of 3 new microarchitectural attack techniques: Cache+Time, a novel cache-line-granular cache attack without shared memory, Flush+Fault exploiting the Harvard cache architecture for Flush+Reload, and CycleDrift exploiting unprivileged access to instruction-retirement information. 
Additionally, we show that many known attacks are applicable to these RISC-V CPUs, mainly due to non-existing hardware countermeasures and instruction-set subtleties that do not consider the microarchitectural attack surface. We demonstrate our attacks in 6 case studies, including the first RISC-V-specific microarchitectural KASLR break and a CycleDrift-based method for detecting kernel activity. Based on our analysis, we stress the need to consider the microarchitectural attack surface during every step of a CPU design, including custom instruction-set extensions.<br/><br/>Published at S&amp;P 2023.</description><category>S&amp;P</category></item><item><title>Practical Timing Side-Channel Attacks on Memory Compression</title><link>https://roots.ec/publications/schwarzl2023compression/</link><pubDate>Mon, 22 May 2023 00:00:00 +0000</pubDate><dc:creator>Martin Schwarzl</dc:creator><dc:creator>Pietro Borrello</dc:creator><dc:creator>Gururaj Saileshwar</dc:creator><dc:creator>Hanna Müller</dc:creator><dc:creator>Michael Schwarz</dc:creator><dc:creator>Daniel Gruss</dc:creator><guid>https://roots.ec/publications/schwarzl2023compression/</guid><description>Compression algorithms are widely used as they save memory without losing data. However, elimination of redundant symbols and sequences in data leads to a compression side channel. So far, compression attacks have only focused on the compression-ratio side channel, i.e., the size of compressed data, and largely targeted HTTP traffic and website content. In this paper, we present the first memory compression attacks exploiting timing side channels in compression algorithms, targeting a broad set of applications using compression. Our work systematically analyzes different compression algorithms and demonstrates timing leakage in each. We present Comprezzor, an evolutionary fuzzer which finds memory layouts that lead to amplified latency differences for decompression and therefore enable remote attacks. 
We demonstrate a remote covert channel exploiting small local timing differences transmitting on average 643.25 bit/h over 14 hops over the internet. We also demonstrate memory compression attacks that can leak secrets bytewise as well as in dictionary attacks in three different case studies. First, we show that an attacker can disclose secrets co-located and compressed with attacker data in PHP applications using Memcached. Second, we present an attack that leaks database records from PostgreSQL, managed by a Python-Flask application, over the internet. Third, we demonstrate an attack that leaks secrets from transparently compressed pages with ZRAM, the memory compression module in Linux. We conclude that memory-compression attacks are a practical threat.<br/><br/>Published at S&amp;P 2023.</description><category>S&amp;P</category></item><item><title>TALUS: Reinforcing TEE Confidentiality with Cryptographic Coprocessors</title><link>https://roots.ec/publications/chakraborty2023talus/</link><pubDate>Mon, 01 May 2023 00:00:00 +0000</pubDate><dc:creator>Dhiman Chakraborty</dc:creator><dc:creator>Michael Schwarz</dc:creator><dc:creator>Sven Bugiel</dc:creator><guid>https://roots.ec/publications/chakraborty2023talus/</guid><description>Platforms are nowadays typically equipped with trusted execution environments (TEEs), such as Intel SGX or ARM TrustZone. However, recent microarchitectural attacks on TEEs repeatedly broke their confidentiality guarantees, including the leakage of long-term cryptographic secrets. These systems are typically also equipped with a cryptographic coprocessor, such as a TPM or Google Titan. These coprocessors offer a unique set of security features focused on safeguarding cryptographic secrets. Still, despite their simultaneous availability, the integration between these technologies is practically nonexistent, which prevents them from benefitting from each other’s strengths. 
In this paper, we propose TALUS, a general design and a set of three main requirements for a secure symbiosis between TEEs and cryptographic coprocessors. We implement a proof-of-concept of TALUS based on Intel SGX and a hardware TPM. We show that with TALUS, the long-term secrets used in the SGX life cycle can be moved to the TPM. We demonstrate that our design is robust even in the presence of transient execution attacks, preventing an entire class of attacks due to the reduced attack surface on the shared hardware.<br/><br/>Published at FC 2023.</description><category>FC</category></item><item><title>HyperDbg: Reinventing Hardware-Assisted Debugging</title><link>https://roots.ec/publications/karvandi2022hyperdbg/</link><pubDate>Mon, 07 Nov 2022 00:00:00 +0000</pubDate><dc:creator>Mohammad Sina Karvandi</dc:creator><dc:creator>MohammadHossein Gholamrezaei</dc:creator><dc:creator>Saleh Khalaj Monfared</dc:creator><dc:creator>Soroush Meghdadizanjani</dc:creator><dc:creator>Behrooz Abbassi</dc:creator><dc:creator>Ali Amini</dc:creator><dc:creator>Reza Mortazavi</dc:creator><dc:creator>Saeid Gorgin</dc:creator><dc:creator>Dara Rahmati</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/karvandi2022hyperdbg/</guid><description>Software analysis, debugging, and reverse engineering have a crucial impact in today's software industry. Efficient and stealthy debuggers are especially relevant for malware analysis. However, existing debugging platforms fail to address a transparent, effective, and high-performance low-level debugger due to their detectable fingerprints, complexity, and implementation restrictions. In this paper, we present a new hypervisor-assisted debugger for high-performance and stealthy debugging of user and kernel applications. To accomplish this, HyperDbg relies on state-of-the-art hardware features available in today's CPUs, such as VT-x and Extended Page Table (EPT). 
In contrast to other widely used existing debuggers, we design HyperDbg using a custom hypervisor, making it independent of OS functionality or API. We propose hardware-based instruction-level emulation and OS-level API hooking via extended page tables to increase the stealthiness. Our results of the dynamic analysis of 10,853 malware samples show that HyperDbg's stealthiness allows debugging on average 22% and 26% more samples than WinDbg and x64dbg, respectively. Moreover, in contrast to existing debuggers, HyperDbg is not detected by any of the 13 tested packers and protectors. We improve the performance over other debuggers by deploying a VMX-compatible script engine, eliminating unnecessary context switches. Our experiment on three concrete debugging scenarios shows that compared to WinDbg as the only kernel debugger, HyperDbg performs step-in, conditional breaks, and syscall recording, 2.98x, 1319x, and 2018x faster, respectively. We finally show real-world applications, such as a 0-day analysis, structure reconstruction for reverse engineering, software performance analysis, and code-coverage analysis.<br/><br/>Published at CCS 2022.</description><category>CCS</category></item><item><title>CPU Port Contention Without SMT</title><link>https://roots.ec/publications/rokicki2022portcontention/</link><pubDate>Mon, 26 Sep 2022 00:00:00 +0000</pubDate><dc:creator>Thomas Rokicki</dc:creator><dc:creator>Clémentine Maurice</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/rokicki2022portcontention/</guid><description>CPU port contention has been used in the last years as a stateless side channel to perform side-channel attacks and transient execution attacks. One drawback of this channel is that it heavily relies on simultaneous multi-threading, which can be absent from some CPUs or simply disabled by the OS. In this paper, we present sequential port contention, which does not require SMT. 
It exploits sub-optimal scheduling to execution ports for instruction-level parallelization. As a result, specifically-crafted instruction sequences on a single thread suffer from an increased latency. We show that sequential port contention can be exploited from web browsers in WebAssembly. We present an automated framework to search for instruction sequences leading to sequential port contention for specific CPU generations, which we evaluated on 50 different CPUs. An attacker can use these sequences from the browser to determine the CPU generation within 12 seconds with a 95% accuracy. This fingerprint is highly stable and resistant to system noise, and we show that mitigations are either expensive or only probabilistic.<br/><br/>Published at ESORICS 2022.</description><category>ESORICS</category></item><item><title>Robust and Scalable Process Isolation against Spectre in the Cloud</title><link>https://roots.ec/publications/schwarzl2022dpi/</link><pubDate>Mon, 26 Sep 2022 00:00:00 +0000</pubDate><dc:creator>Martin Schwarzl</dc:creator><dc:creator>Pietro Borrello</dc:creator><dc:creator>Andreas Kogler</dc:creator><dc:creator>Kenton Varda</dc:creator><dc:creator>Thomas Schuster</dc:creator><dc:creator>Michael Schwarz</dc:creator><dc:creator>Daniel Gruss</dc:creator><guid>https://roots.ec/publications/schwarzl2022dpi/</guid><description>In the quest for efficiency and performance, edge-computing providers replace process isolation with sandboxes, to support a high number of tenants per machine. While secure against software vulnerabilities, microarchitectural attacks can bypass these sandboxes. In this paper, we present a Spectre attack leaking secrets from co-located tenants in edge computing. Our remote Spectre attack, using amplification techniques and a remote timing server, leaks 2 bit/min. 
This motivates our main contribution, DyPrIs, a scalable process-isolation mechanism that only isolates suspicious worker scripts following a lightweight detection mechanism. In the worst case, DyPrIs boils down to process isolation. Our proof-of-concept implementation augments real-world cloud infrastructure used in production at large scale, Cloudflare Workers. With a false-positive rate of only 0.61 %, we demonstrate that DyPrIs outperforms strict process isolation while statistically maintaining its security guarantees, fully mitigating cross-tenant Spectre attacks.<br/><br/>Published at ESORICS 2022.</description><category>ESORICS</category></item><item><title>Browser-based CPU Fingerprinting</title><link>https://roots.ec/publications/trampert2022uarchfp/</link><pubDate>Mon, 26 Sep 2022 00:00:00 +0000</pubDate><dc:creator>Leon Trampert</dc:creator><dc:creator>Christian Rossow</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/trampert2022uarchfp/</guid><description>Mounting microarchitectural attacks, such as Spectre or Rowhammer, is possible from browsers. However, to be realistically exploitable, they require precise knowledge about microarchitectural properties. While a native attacker can easily query many of these properties, the sandboxed environment in browsers prevents this. In this paper, we present eight side-channel-related benchmarks that reveal CPU properties, such as cache sizes or cache associativities. Our benchmarks are implemented in JavaScript and run in unmodified browsers on multiple platforms. Based on a study with 834 participants using 297 different CPU models, we show that we can infer microarchitectural properties with an accuracy of up to 100%. Combining multiple properties also allows identifying the CPU vendor with an accuracy of 97.5%, and the microarchitecture and CPU model each with an accuracy of above 60%. 
The benchmarks are unaffected by current side-channel and browser fingerprinting mitigations, and can thus be used for more targeted attacks and to increase the entropy in browser fingerprinting.<br/><br/>Published at ESORICS 2022.</description><category>Browser</category><category>Browser Fingerprinting</category><category>ESORICS</category></item><item><title>ÆPIC Leak: Architecturally Leaking Uninitialized Data from the Microarchitecture</title><link>https://roots.ec/publications/borrello2022aepicleak/</link><pubDate>Wed, 10 Aug 2022 00:00:00 +0000</pubDate><dc:creator>Pietro Borrello</dc:creator><dc:creator>Andreas Kogler</dc:creator><dc:creator>Martin Schwarzl</dc:creator><dc:creator>Moritz Lipp</dc:creator><dc:creator>Daniel Gruss</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/borrello2022aepicleak/</guid><description>CPU vulnerabilities undermine the security guarantees provided by software- and hardware-security improvements. While the discovery of transient-execution attacks increased the interest in CPU vulnerabilities on a microarchitectural level, architectural CPU vulnerabilities are still understudied. In this paper, we systematically analyze existing CPU vulnerabilities showing that CPUs suffer from vulnerabilities whose root causes match with those in complex software. We show that transient-execution attacks and architectural vulnerabilities often arise from the same type of bug and identify the blank spots. Investigating the blank spots, we focus on architecturally improperly initialized data locations. We discover ÆPIC Leak, the first architectural CPU bug that leaks stale data from the microarchitecture without using a side channel. ÆPIC Leak works on all recent SunnyCove-based Intel CPUs (i.e., Ice Lake and Alder Lake). It architecturally leaks stale data incorrectly returned by reading undefined APIC-register ranges. 
ÆPIC Leak samples data transferred between the L2 and last-level cache, including SGX enclave data, from the superqueue. We target data in use, e.g., register values and memory loads, as well as data at rest, e.g., SGX-enclave data pages. Our end-to-end attack extracts AES-NI, RSA, and even the Intel SGX attestation keys from enclaves within a few seconds. We discuss mitigations and conclude that the only short-term mitigations for ÆPIC Leak are to disable APIC MMIO or not rely on SGX.<br/><br/>Published at USENIX Security 2022.</description><category>USENIX Security</category></item><item><title>Rapid Prototyping for Microarchitectural Attacks</title><link>https://roots.ec/publications/easdon2022rapid/</link><pubDate>Wed, 10 Aug 2022 00:00:00 +0000</pubDate><dc:creator>Catherine Easdon</dc:creator><dc:creator>Michael Schwarz</dc:creator><dc:creator>Martin Schwarzl</dc:creator><dc:creator>Daniel Gruss</dc:creator><guid>https://roots.ec/publications/easdon2022rapid/</guid><description>In recent years, microarchitectural attacks have been demonstrated to be a powerful attack class. However, as our empirical analysis shows, there are numerous implementation challenges that hinder discovery and subsequent mitigation of these vulnerabilities. In this paper, we examine the attack development process, the features and usability of existing tools, and the real-world challenges faced by practitioners. We propose a novel approach to microarchitectural attack development, based on rapid prototyping, and present two open-source software frameworks, libtea and SCFirefox, that improve upon state-of-the-art tooling to facilitate rapid prototyping of attacks. libtea demonstrates that native code attacks can be abstracted sufficiently to permit cross-platform implementations while retaining fine-grained control of microarchitectural behavior. We evaluate its effectiveness by developing proof-of-concept Foreshadow and LVI attacks. 
Our LVI prototype runs on x86-64 and ARMv8-A, and is the first public demonstration of LVI on ARM. SCFirefox is the first tool for browser-based microarchitectural attack development, providing the functionality of libtea in JavaScript. This functionality can then be used to iteratively port a prototype to unmodified browsers. We demonstrate this process by prototyping the first browser-based ZombieLoad attack and deriving a vanilla JavaScript and WebAssembly PoC running in an unmodified recent version of Firefox. We discuss how libtea and SCFirefox contribute to the security landscape by providing attack researchers and defenders with frameworks to prototype attacks and assess their feasibility.<br/><br/>Published at USENIX Security 2022.</description><category>USENIX Security</category></item><item><title>Repurposing Segmentation as a Practical LVI-NULL Mitigation in SGX</title><link>https://roots.ec/publications/giner2022lvi/</link><pubDate>Wed, 10 Aug 2022 00:00:00 +0000</pubDate><dc:creator>Lukas Giner</dc:creator><dc:creator>Andreas Kogler</dc:creator><dc:creator>Claudio Canella</dc:creator><dc:creator>Michael Schwarz</dc:creator><dc:creator>Daniel Gruss</dc:creator><guid>https://roots.ec/publications/giner2022lvi/</guid><description>Load Value Injection (LVI) uses Meltdown-type data flows in Spectre-like confused-deputy attacks. LVI has been demonstrated in practical attacks on Intel SGX enclaves, and consequently, mitigations were deployed that incur tremendous overheads of factor 2 to 19. However, as we discover, on fixed hardware LVI-NULL leakage is still present. Hence, to mitigate LVI-NULL in SGX enclaves on LVI-fixed CPUs, the expensive mitigations would still be necessary. In this paper, we propose a lightweight mitigation focused on LVI-NULL in SGX, LVI-NULLify. We systematically analyze and categorize LVI-NULL variants. Our analysis reveals that previously proposed mitigations targeting LVI-NULL are not effective. 
Our novel mitigation addresses this problem by repurposing segmentation, a fast legacy hardware mechanism that x86 already uses for every memory operation. LVI-NULLify consists of a modified SGX-SDK and a compiler extension which put the enclave in control of LVI-NULL-exploitable memory locations. We evaluate LVI-NULLify on the LVI-fixed Comet Lake CPU and observe a performance overhead below 10% for the worst case, which is substantially lower than previous defenses with a prohibitive overhead of 1220% in the worst case. We conclude that LVI-NULLify is a practical solution to protect SGX enclaves against LVI-NULL today.<br/><br/>Published at USENIX Security 2022.</description><category>USENIX Security</category></item><item><title>Minefield: A Software-only Protection for SGX Enclaves against DVFS Attacks</title><link>https://roots.ec/publications/kogler2022minefield/</link><pubDate>Wed, 10 Aug 2022 00:00:00 +0000</pubDate><dc:creator>Andreas Kogler</dc:creator><dc:creator>Daniel Gruss</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/kogler2022minefield/</guid><description>Modern CPUs adapt clock frequencies and voltage levels to workloads to reduce energy consumption and heat dissipation. This mechanism, dynamic voltage and frequency scaling (DVFS), is controlled from privileged software but affects all execution modes, including SGX. Prior work showed that manipulating voltage or frequency can fault instructions and thereby subvert SGX enclaves. Consequently, Intel disabled the overclocking mailbox (OCM) required for software undervolting, also preventing benign use for energy saving. In this paper, we propose Minefield, the first software-level defense against DVFS attacks. The idea of Minefield is not to prevent DVFS faults but to deflect faults to trap instructions and handle them before they lead to harmful behavior. 
As groundwork for Minefield, we systematically analyze DVFS attacks and observe a timing gap of at least 57.8 us between every OCM transition, leading to random faults over at least 57000 cycles. Minefield places highly fault-susceptible trap instructions in the victim code during compilation. Like redundancy countermeasures, Minefield is scalable and enables enclave developers to choose a security parameter between 0% and almost 100%, yielding a fine-grained security-performance trade-off. Our evaluation shows a density of 0.75, i.e., one trap after every 1-2 instructions, mitigates all known DVFS attacks in 99% on Intel SGX, incurring an overhead of 148.4% on protected enclaves. However, Minefield has no performance effect on the remaining system. Thus, Minefield is a better solution than hardware- or microcode-based patches disabling the OCM interface.<br/><br/>Published at USENIX Security 2022.</description><category>USENIX Security</category></item><item><title>AMD Prefetch Attacks through Power and Time</title><link>https://roots.ec/publications/lipp2022amd/</link><pubDate>Wed, 10 Aug 2022 00:00:00 +0000</pubDate><dc:creator>Moritz Lipp</dc:creator><dc:creator>Daniel Gruss</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/lipp2022amd/</guid><description>Modern operating systems fundamentally rely on the strict isolation of user applications from the kernel. This isolation is enforced by the hardware. On Intel CPUs, this isolation has been shown to be imperfect, for instance, with the prefetch side-channel. With Meltdown, it was even completely circumvented. Both the prefetch side channel and Meltdown have been mitigated with the same software patch on Intel. As AMD is believed to be not vulnerable to these attacks, this software patch is not active by default on AMD CPUs. In this paper, we show that the isolation on AMD CPUs suffers from the same type of side-channel leakage. 
We discover timing and power variations of the prefetch instruction that can be observed from unprivileged user space. In contrast to previous work on prefetch attacks on Intel, we show that the prefetch instruction on AMD leaks even more information. We demonstrate the significance of this side channel with multiple case studies in real-world scenarios. We demonstrate the first microarchitectural break of (fine-grained) KASLR on AMD CPUs. We monitor kernel activity, e.g., if audio is played over Bluetooth, and establish a covert channel. Finally, we even leak kernel memory with 52.85 B/s with simple Spectre gadgets in the Linux kernel. We show that stronger page table isolation should be activated on AMD CPUs by default to mitigate our presented attacks successfully.<br/><br/>Published at USENIX Security 2022.</description><category>USENIX Security</category></item><item><title>Branch Different - Spectre Attacks on Apple Silicon</title><link>https://roots.ec/publications/hetterich2022applespectre/</link><pubDate>Wed, 29 Jun 2022 00:00:00 +0000</pubDate><dc:creator>Lorenz Hetterich</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/hetterich2022applespectre/</guid><description>Since the disclosure of Spectre, extensive research has been conducted on both new attacks, attack variants, and mitigations. However, most research focuses on x86 CPUs, with only very few insights on ARM CPUs, despite their huge market share. In this paper, we focus on the ARMv8-based Apple CPUs and demonstrate a reliable Spectre attack. For this, we solve several challenges specific to Apple CPUs and their operating system. We systematically evaluate alternative high-resolution timing primitives, as timers used for microarchitectural attacks on other ARM CPUs are unavailable. As cache-maintenance instructions are ineffective, we demonstrate a reliable eviction-set generation from an unprivileged application. 
Based on these building blocks, we demonstrate a fast Evict+Reload cross-core covert channel, and a Spectre-PHT attack leaking more than 1500 B/s on an iPhone. Without mitigations for all Spectre variants and the rising market share of ARM CPUs, we stress that more research on ARM CPUs is required.<br/><br/>Published at DIMVA 2022.</description><category>DIMVA</category></item><item><title>Finding and Exploiting CPU Features using MSR Templating</title><link>https://roots.ec/publications/kogler2022msrtemplate/</link><pubDate>Mon, 23 May 2022 00:00:00 +0000</pubDate><dc:creator>Andreas Kogler</dc:creator><dc:creator>Daniel Weber</dc:creator><dc:creator>Martin Haubenwallner</dc:creator><dc:creator>Moritz Lipp</dc:creator><dc:creator>Daniel Gruss</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/kogler2022msrtemplate/</guid><description>To ensure backward compatibility while adding new features to CPUs, CPU vendors enable a limited CPU configuration via so-called model-specific registers (MSRs). These MSRs have been introduced for various features, such as debugging, performance monitoring, or security. While many MSRs are documented, there is still a plethora of undocumented or sparsely documented MSRs in modern CPUs. Furthermore, with multiple hundred MSRs, each providing up to 64 configuration bits, it is tedious to find specific configuration options. In this paper, we show that MSRs and their configuration bits can be detected automatically on Intel and AMD CPUs. We introduce MSRevelio, a framework to automatically detect bits that influence the behavior of instructions and semi-automatically find bits controlled by BIOS settings. We show that previously overlooked bits can harden systems against microarchitectural attacks such as Medusa, CrossTalk, and software-prefetch attacks. 
Additionally, we show that an undocumented lock bit allows disabling AES-NI at runtime, forcing mbedTLS to fall back to an AES implementation vulnerable to cache attacks. Exploiting this fallback inside an SGX enclave, we fully recover the AES key used by the enclave. With our detection approach, we show that security features retrofitted with microcode updates can be easily detected, even before the public documentation of the underlying vulnerability. In our analysis of the Xen hypervisor, we show that Xen’s handling of MSRs was flawed for a long time, allowing guests to access undocumented and unhandled MSRs and fingerprint specific Xen versions. Using automated correlation analysis between documented and undocumented MSRs, we discover a previously undocumented MSR correlating with the CPU’s timestamp counter. This MSR is also accessible from Xen guests, and we demonstrate a Foreshadow attack when all other timers are unavailable or artificially deteriorated. Our results highlight that transparency is crucial for features interacting closely with CPU internals.<br/><br/>Published at S&amp;P 2022.</description><category>S&amp;P</category></item><item><title>Automating Seccomp Filter Generation for Linux Applications</title><link>https://roots.ec/publications/canella2021chestnut/</link><pubDate>Sun, 14 Nov 2021 00:00:00 +0000</pubDate><dc:creator>Claudio Canella</dc:creator><dc:creator>Mario Werner</dc:creator><dc:creator>Daniel Gruss</dc:creator><dc:creator>Michael Schwarz</dc:creator><guid>https://roots.ec/publications/canella2021chestnut/</guid><description>Software vulnerabilities undermine the security of applications. By blocking unused functionality, the impact of potential exploits can be reduced. While seccomp provides a solution for filtering syscalls, it requires manual implementation of filter rules for each individual application. Recent work has investigated approaches to automate this task. 
However, as we show, these approaches make assumptions that are not necessary or require overly time-consuming analysis. In this paper, we propose Chestnut, an automated approach for generating strict syscall filters with lower requirements and limitations. Chestnut comprises two phases, with the first phase consisting of two static components, i.e., a compiler and a binary analyzer, that statically extract the used syscalls. The compiler-based approach of Chestnut is up to factor 73 faster than previous approaches with the same accuracy. On the binary level, our approach extends over previous ones by also applying to non-PIC binaries. An optional second phase of Chestnut is dynamic refinement to restrict the set of allowed syscalls further. We demonstrate that Chestnut on average blocks 302 syscalls (86.5 %) via the compiler and 288 (82.5 %) using the binary analysis on a set of 18 applications. Chestnut blocks the dangerous exec syscall in 50 % and 77.7 % of the tested applications using the compiler- and binary-based approach, respectively. For the tested applications, Chestnut blocks exploitation of more than 61 % of the 175 CVEs that target the kernel via syscalls.<br/><br/>Published at CCSW 2021.</description><category>CCSW</category></item></channel></rss>