Understanding WebAssembly Speculative Optimizations: Deopts and Inlining in V8

In Chrome M137, V8 introduced two key optimizations for WebAssembly: speculative call_indirect inlining and deoptimization support. These techniques, long standard in JavaScript JIT compilation, now accelerate WebAssembly execution—especially for WasmGC programs. This Q&A breaks down how they work, why they matter, and what performance gains they deliver.

What are speculative optimizations, and why weren’t they needed for WebAssembly before?

Speculative optimizations are techniques where a compiler generates fast machine code based on assumptions from past runtime behavior. For example, if a variable consistently behaves as an integer, the compiler emits integer-specific code. If the assumption later fails, the engine deoptimizes—discards the optimized code and reverts to slower, general code. This approach is crucial for JavaScript’s dynamic types.
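The integer example above can be sketched as a toy Python model (not V8's actual machinery — the names and the guard are illustrative): the "compiler" emits a fast path guarded by a type check, and a failed guard triggers deoptimization to generic code.

```python
# Toy model of speculative optimization: specialize for an observed type,
# guard the assumption at runtime, and deoptimize when it fails.

def make_speculative_add(observed_type):
    state = {"deoptimized": False}

    def generic_add(a, b):
        # Slow, general code: handles every type combination.
        return a + b

    def optimized_add(a, b):
        # Guard: verify that the speculation still holds.
        if type(a) is observed_type and type(b) is observed_type:
            return a + b            # fast path specialized for the observed type
        state["deoptimized"] = True  # assumption failed: deoptimize
        return generic_add(a, b)     # fall back to the generic code

    return optimized_add, state

add, state = make_speculative_add(int)
add(1, 2)                  # stays on the fast path
assert not state["deoptimized"]
add("a", "b")              # string input violates the speculation
assert state["deoptimized"]
```

The essential shape — cheap guard, fast specialized body, slow generic fallback — is the same one V8 applies to WebAssembly indirect calls below.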

Source: v8.dev

WebAssembly 1.0 (2017) didn’t require such speculation because its static typing and ahead-of-time compilation (via LLVM or Binaryen) already produced efficient binaries. Languages like C++ and Rust compile to low-level, predictable WebAssembly, minimizing runtime ambiguity. Thus, V8 initially relied on static optimization alone for WebAssembly.

Why are speculative optimizations now needed for WebAssembly?

The introduction of WasmGC (WebAssembly Garbage Collection) changed the landscape. WasmGC supports high-level types—structs, arrays, subtyping—and is designed for managed languages like Java, Kotlin, and Dart. These types introduce runtime polymorphism and dynamic behavior similar to JavaScript, making static optimization less effective.

Without speculation, the compiler must generate generic code to handle all possible type variants, which is slow. By collecting runtime feedback and making optimistic assumptions (e.g., “this function always calls the same target”), V8 can produce specialized, faster machine code. This is why speculative optimizations became essential for WasmGC performance.

What are the two specific optimizations added?

V8 added two complementary techniques:

- Speculative call_indirect inlining: using runtime feedback about which function an indirect call actually reaches, the compiler inlines the most frequent target behind a cheap guard, replacing the generic table dispatch with a direct, inlineable call.
- Deoptimization support: when a speculative assumption turns out to be wrong, the engine discards the optimized code path and resumes execution in generic code, so speculation never compromises correctness.

How do speculative inlining and deoptimization work together?

The two optimizations form a feedback loop. First, V8 profiles WebAssembly execution to gather data on indirect call targets. Using this feedback, the compiler speculatively inlines the most frequent target—generating a fast, direct call path. If the prediction fails, the deopt mechanism activates: it jumps back to a pre-optimized state and continues with the generic call sequence.
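This feedback loop can be sketched as a small Python simulation (a simplified model, not V8's implementation — the hotness threshold of 3 and the class names are made up for illustration): a call site profiles its targets, speculates on a hot one, and deoptimizes when the guess misses.

```python
from collections import Counter

class CallSite:
    """Toy model of a call_indirect site with feedback-driven speculation."""

    def __init__(self):
        self.feedback = Counter()  # runtime profile of observed call targets
        self.speculated = None     # target we have "inlined", if any
        self.deopts = 0

    def call(self, target, arg):
        if self.speculated is not None:
            if target is self.speculated:
                return target(arg)      # fast path: direct, inlineable call
            self.deopts += 1            # guard failed: deoptimize
            self.speculated = None      # discard the optimized code
        self.feedback[target] += 1      # generic path collects feedback
        if self.feedback[target] >= 3:  # hot enough: speculate on this target
            self.speculated = target
        return target(arg)              # generic indirect dispatch

def square(x): return x * x
def double(x): return 2 * x

site = CallSite()
for _ in range(5):
    site.call(square, 3)       # square becomes the speculated target
assert site.speculated is square
site.call(double, 3)           # a different target triggers a deopt
assert site.deopts == 1
```

Note how the deopt also feeds the profile: the miss is recorded, so a later re-optimization can speculate with better information, as described below.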

This cooperation enables V8 to “bet” on common behavior without sacrificing correctness. Over time, deopts provide additional profiling data, allowing re-optimization with better assumptions. The result is code that adapts dynamically to actual program behavior, bridging the gap between static Wasm and dynamic languages.

What performance gains have been observed?

The impact is significant, especially for WasmGC programs. In Dart microbenchmarks, the combination of these optimizations yielded an average speedup of over 50%. For larger, realistic applications (e.g., compiled from Java or Kotlin), improvements ranged from 1% to 8%.

These numbers reflect both micro-level gains (faster indirect calls) and macro-level benefits (better inlining cascading into other optimizations). While not universal, the optimizations provide a meaningful boost for workloads that exhibit predictable indirect call patterns—common in object-oriented WasmGC programs.

What is deoptimization, and how does it ensure correctness?

Deoptimization is the ability to revert from optimized code to a safe, unoptimized state when an assumption fails. Imagine the compiler inlines a function based on the guess that call_indirect always points to target A. If the next call actually goes to target B, the machine code would produce wrong results. Instead, V8 detects the mismatch at runtime, triggers a deopt, and jumps to a “bailout” point where it executes the original, generic WebAssembly code.

This process preserves program correctness while allowing the engine to speculatively optimize. Deopts are well-tested in V8’s JavaScript pipeline and have been adapted for WebAssembly to handle Wasm-specific states (like stack frames and locals). They are a critical safety net for any speculative technique.
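The frame-state aspect can be illustrated with a toy sketch (hypothetical names; real deopt metadata in V8 is far more involved): at the bailout point, the optimized code hands the generic code enough state — live locals and a resume position — that execution continues mid-function without redoing completed work.

```python
def run_optimized(x):
    """Fast path; on guard failure, returns a deopt state instead of a result."""
    local = x + 1                      # work already done before the guard
    if x < 0:                          # speculation: x is non-negative
        # Bail out: capture the live locals and where to resume.
        return {"deopt": True,
                "locals": {"x": x, "local": local},
                "resume_at": "double"}
    return {"deopt": False, "result": local * 2}

def run_generic(state):
    """Generic code resumes from the captured state, not from the start."""
    local = state["locals"]["local"]
    if state["resume_at"] == "double":
        return local * 2

out = run_optimized(5)
assert not out["deopt"] and out["result"] == 12
out = run_optimized(-3)
assert out["deopt"]
assert run_generic(out) == -4          # (-3 + 1) * 2, no work repeated
```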

What future optimizations do these features enable?

Deoptimization support in WebAssembly unlocks a broader optimization pipeline. For example, V8 can now apply speculative type specialization for WasmGC objects—generating code that assumes a struct always has a certain subtype, with deopts handling mismatches. Similarly, loop-invariant code motion and constant folding become more powerful when combined with inlining.

Beyond that, the infrastructure paves the way for feedback-driven tiering in WebAssembly, similar to JavaScript's Ignition/TurboFan pipeline. As WasmGC adoption grows, these building blocks will enable aggressive optimizations that were previously impossible, narrowing the gap between WebAssembly and native code performance for managed languages.
