Overview
The React Compiler (formerly React Forget) is not a transpiler in the conventional sense. It is a full semantic analysis engine that understands the React component model at a structural level and emits memoized equivalents of component and hook bodies without requiring the author to annotate a single useMemo or useCallback. The compiler operates under a closed world assumption about React's rules: components are pure functions of their props and state, and hooks obey the rules of hooks. Violations of either contract produce undefined behavior both at runtime and under compilation.
The pipeline can be understood in four distinct layers. The first is a syntactic frontend that ingests TypeScript or JavaScript source and produces a Babel AST. The second is a semantic lowering phase that transforms the AST into a High Level Intermediate Representation (HIR), which is the compiler's native working format. The third is a dataflow analysis phase that computes reactive scopes using Static Single Assignment form and alias analysis. The fourth is a code generation phase that emits augmented JavaScript with inlined memoization guards. Each phase has specific invariants that must hold for the next phase to be sound.
This document covers each layer in full architectural depth, including the V8 Garbage Collector implications of reference stable memoization, and a dedicated security analysis of how impure components can cause cross-request data leakage in Server-Side Rendering environments. The audience is an engineer who already writes React professionally and wants to reason about what the compiler does and does not guarantee.
Phase 1: Babel Frontend and AST Construction
The compiler's entry point is a Babel plugin. When the build tool (Vite, webpack, or the Next.js SWC integration) encounters a .js, .jsx, .ts, or .tsx file, the React Compiler plugin receives the file's Babel AST after the parser has already run. The compiler does not invoke its own parser; it consumes the AST that Babel has already produced via @babel/parser, which emits Babel's own AST format (a modified superset of ESTree, with node types such as NumericLiteral in place of ESTree's Literal) annotated with source locations.
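The plugin shape can be sketched as follows. This is an illustrative skeleton with assumed names, not the compiler's actual entry point; it only shows where the already-parsed AST arrives.

```javascript
// Illustrative Babel-plugin skeleton (assumed names, not the real compiler).
// Babel hands the plugin already-parsed AST nodes through visitor callbacks.
function reactCompilerSketchPlugin() {
  return {
    name: "react-compiler-sketch", // hypothetical plugin name
    visitor: {
      FunctionDeclaration(path) {
        // path.node is the Babel AST node for the function; the real
        // compiler would test candidacy here and lower the body to HIR.
      },
      ArrowFunctionExpression(path) {
        // arrow-function components arrive through a separate visitor key
      },
    },
  };
}
```

The significant point is that by the time either visitor fires, parsing is finished; the plugin operates purely on the node tree.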
The plugin entry point identifies candidate functions for compilation. A candidate is any function whose name begins with an uppercase letter (a component heuristic) or any function whose name matches the hooks naming convention (a use prefix followed by an uppercase letter). The compiler then performs a quick pre-pass to determine whether the function obeys the rules of React. This pre-pass is conservative: if it cannot prove purity statically, it bails out entirely and leaves the function unmodified. Bailout is tracked and surfaced through the compiler's diagnostic API.
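The naming heuristic can be sketched as a pair of predicates. This is a simplification for exposition: the actual compiler also inspects call sites and configuration, not just names.

```javascript
// Sketch of the candidate-naming heuristic (assumed simplification: names
// only; the real compiler also considers usage and configuration).
function isComponentName(name) {
  return /^[A-Z]/.test(name); // components start with an uppercase letter
}
function isHookName(name) {
  return /^use[A-Z]/.test(name); // hooks: "use" + uppercase, e.g. useState
}
function isCompilationCandidate(name) {
  return isComponentName(name) || isHookName(name);
}
```

Note that the uppercase requirement after `use` matters: a function named `user` is not treated as a hook.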
The AST representation at this stage is a standard Babel AST. The following is a simplified illustration of what the parser produces for a minimal component.
```jsx
// Source
function Counter({ count }) {
  const doubled = count * 2;
  return <span>{doubled}</span>;
}
```

```js
// Abbreviated Babel AST (pseudo-JSON)
{
  type: "FunctionDeclaration",
  id: { type: "Identifier", name: "Counter" },
  params: [
    {
      type: "ObjectPattern",
      properties: [{ type: "ObjectProperty", key: { name: "count" } }]
    }
  ],
  body: {
    type: "BlockStatement",
    body: [
      {
        type: "VariableDeclaration",
        declarations: [{
          type: "VariableDeclarator",
          id: { name: "doubled" },
          init: {
            type: "BinaryExpression",
            operator: "*",
            left: { name: "count" },
            right: { type: "NumericLiteral", value: 2 }
          }
        }]
      },
      {
        type: "ReturnStatement",
        argument: {
          type: "JSXElement",
          openingElement: { name: "span" },
          children: [{ type: "JSXExpressionContainer", expression: { name: "doubled" } }]
        }
      }
    ]
  }
}
```
The AST preserves all syntactic structure but carries no semantic information about reactivity, side effects, or data flow between variables. The compiler's first real work begins in the lowering phase, where this structure is translated into HIR.
Phase 2: HIR Lowering and the High Level Intermediate Representation
The HIR is the React Compiler's internal language. It is not JavaScript, but it is not a low level IR like LLVM IR either. It sits between the two: it is statement oriented like JavaScript but stripped of all syntactic sugar, and it makes data flow and control flow explicit in a form that is amenable to analysis. The lowering from Babel AST to HIR is handled by buildHIR in the compiler's source.
HIR represents a function as a control flow graph (CFG) of basic blocks. Each basic block is a linear sequence of instructions with no internal branches. Branches, loops, and early returns are modeled as explicit edges between blocks. Every value in HIR is a Place, which is a typed slot identified by an integer ID. Instructions consume Places as operands and write their result into a destination Place. This is not yet SSA, but it is the prerequisite structure from which SSA is constructed in the next phase.
The key semantic types in HIR are as follows.
| HIR Node Type | Semantics |
|---|---|
| LoadLocal | Read a named variable into a Place |
| StoreLocal | Write a Place into a named variable |
| CallExpression | Call with callee Place and argument Places |
| JSXElement | JSX node with prop Places and children Places |
| Phi | Join point for Places from multiple predecessor blocks |
| BinaryExpression | Arithmetic or logical op on two Places |
| PropertyLoad | Member access on an object Place |
| ObjectExpression | Construct object literal with keyed Place values |
Every Place carries a ReactiveScope annotation that will be populated by the analysis phase. Before that phase runs, these annotations are empty. The lowering phase is purely structural; it does not compute any semantic properties. Its job is to make data flow explicit so that subsequent passes have a uniform representation to traverse.
The lowering of the Counter example above produces a CFG with a single basic block (no branches) containing the following HIR instructions.
```text
Block 0 (entry):
  $0 = LoadLocal count
  $1 = Primitive 2
  $2 = BinaryExpression (*) $0 $1
  $3 = JSXElement "span" [] [$2]
  Return $3
```
Each $N is a Place. The structural linearity here is deceptive: in a real component with conditional branches, the CFG will have multiple blocks connected by conditional edges, and Phi nodes will appear at block entry points where values from different predecessor blocks must be merged.
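The block and instruction shapes described above can be modeled as plain data. This is an illustrative model with assumed field names, not the compiler's actual types.

```javascript
// Minimal model of the HIR building blocks (illustrative field names, not
// the compiler's real type definitions).
function makePlace(id) {
  return { kind: "Place", id, reactiveScope: null }; // scope assigned later
}
function makeBlock(id) {
  return { id, instructions: [], successors: [] }; // straight-line body + edges
}

// Lowering `count * 2` into the entry block:
const entry = makeBlock(0);
const p0 = makePlace(0); // destination of LoadLocal count
const p1 = makePlace(1); // destination of Primitive 2
const p2 = makePlace(2); // destination of the multiply
entry.instructions.push(
  { op: "LoadLocal", operands: ["count"], dest: p0 },
  { op: "Primitive", operands: [2], dest: p1 },
  { op: "BinaryExpression", operands: [p0, p1], dest: p2 }
);
```

Because each instruction names its operand Places explicitly, a later pass can walk from any destination back to the instructions that produced its inputs without consulting variable names.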
Phase 3: Reactive Scope Inference and Static Single Assignment
This phase is the intellectual core of the compiler. Its goal is to determine which values in the HIR must be recomputed when specific inputs change, and to group those computations into memoization scopes. A reactive scope is a contiguous region of the HIR that can be wrapped in a useMemo-equivalent guard.
The phase begins by converting the HIR into Static Single Assignment form. SSA is a property of an IR in which every Place is assigned exactly once, and every use of a value refers unambiguously to the single instruction that produced it. To achieve this in the presence of control flow, Phi instructions are inserted at the convergence points of branches. A Phi takes one operand from each predecessor block and produces a single Place that represents the merged value downstream.
```jsx
// Source with branch
function Label({ isError, text }) {
  let color;
  if (isError) {
    color = "red";
  } else {
    color = "blue";
  }
  return <span style={{ color }}>{text}</span>;
}
```

```text
// HIR in SSA form
Block 0 (entry):
  $0 = LoadLocal isError
  Branch $0 -> Block1, Block2
Block 1 (then):
  $1 = Primitive "red"
  Jump -> Block3
Block 2 (else):
  $2 = Primitive "blue"
  Jump -> Block3
Block 3 (merge):
  $3 = Phi($1 from Block1, $2 from Block2) // $3 is color
  $4 = LoadLocal text
  $5 = ObjectExpression { color: $3 }     // style object
  $6 = JSXElement "span" [style=$5] [$4]
  Return $6
```
With SSA in place, the compiler runs a dataflow pass called reactive dependency analysis. For each Place that is derived from a component prop or state value, the compiler follows def-use chains forward through all uses of that Place to find which JSX elements or hook calls depend on it transitively. Places that share a common reactive root are grouped into a single reactive scope.
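Assuming def-use edges are available from the SSA form, the transitive tracing can be sketched as a simple worklist propagation. The data shapes here are assumptions for exposition, not the compiler's internals.

```javascript
// Sketch of transitive reactive-dependency propagation (assumed data shapes).
// `uses` maps a Place id to the ids of Places computed from it; `roots` are
// the Places loaded directly from props or state.
function computeReactivePlaces(uses, roots) {
  const reactive = new Set(roots);
  const worklist = [...roots];
  while (worklist.length > 0) {
    const place = worklist.pop();
    for (const user of uses.get(place) ?? []) {
      if (!reactive.has(user)) {
        reactive.add(user); // transitively derived from a reactive input
        worklist.push(user);
      }
    }
  }
  return reactive;
}

// Tracing only the isError root of the Label example:
// $0 (isError) -> $3 (Phi) -> $5 (style object) -> $6 (JSX element)
const uses = new Map([[0, [3]], [3, [5]], [5, [6]]]);
const reactive = computeReactivePlaces(uses, [0]);
```

Every Place reachable from $0 ends up in the reactive set, so the JSX element $6 must be recomputed whenever isError changes.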
A reactive scope is then materialized as a memoization block in the output. The compiler emits a cache array per component and generates a sequence of cache slot reads and equality checks before each scope. If all inputs to the scope are reference-equal to their cached values, the scope body is skipped and the cached output is returned. This is semantically equivalent to useMemo with the correct dependency array, but the dependency array is computed by the compiler rather than the author.
```jsx
// Compiler output (conceptual, not exact emit)
function Label_compiled(t0) {
  const $ = useMemoCache(5);
  const { isError, text } = t0;
  let color;
  if ($[0] !== isError) {
    color = isError ? "red" : "blue";
    $[0] = isError;
    $[1] = color;
  } else {
    color = $[1];
  }
  let t1;
  if ($[2] !== color || $[3] !== text) {
    t1 = <span style={{ color }}>{text}</span>;
    $[2] = color;
    $[3] = text;
    $[4] = t1;
  } else {
    t1 = $[4];
  }
  return t1;
}
```
The cache array is allocated once per component instance via useMemoCache, a React internal hook. Each slot is initialized to a sentinel value (Symbol.for("react.memo_cache_sentinel")) that is guaranteed to fail equality checks on the first render. Subsequent renders will hit the cache for any scope whose inputs have not changed by reference equality.
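The slot-and-sentinel mechanics can be approximated in userland. This is not React's internal useMemoCache, which ties the array to the component's fiber through a hook; it only demonstrates the cache-hit behavior under the sentinel assumption.

```javascript
// Userland approximation of the cache mechanics (illustrative only; the
// real useMemoCache is a React internal bound to the component instance).
const SENTINEL = Symbol.for("react.memo_cache_sentinel");
function makeMemoCache(size) {
  return new Array(size).fill(SENTINEL); // every slot misses on first render
}

const $ = makeMemoCache(2);
function renderScope(input) {
  let out;
  if ($[0] !== input) {
    out = { doubled: input * 2 }; // scope body runs: fresh allocation
    $[0] = input;
    $[1] = out;
  } else {
    out = $[1]; // cache hit: reference-stable output
  }
  return out;
}
```

Calling renderScope(1) twice returns the identical object reference; changing the input to 2 fails the equality check, re-runs the body, and allocates a new output.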
The SSA transformation is what makes this sound. Because each Place is assigned exactly once, the compiler can reason precisely about which assignments flow into which uses without tracking mutable variable state. Without SSA, alias analysis becomes undecidable in the general case and the compiler would be forced to make conservative approximations that would prevent memoization of any value that flows through a reassigned variable.
Phase 4: V8 GC Implications of Reference-Stable Memoization
The compiler's memoization strategy has a direct and measurable impact on V8's garbage collector, specifically the generational scavenger and the incremental marking phases of the major collector. (V8's GC work is known by the project name Orinoco; Oilpan is Blink's separate C++ object collector and does not manage the JavaScript heap.)
In a non-compiled React application, every render call to a component that creates derived objects (style={{ color }}, array .map() results, callback closures) allocates new objects on the V8 heap. These objects are allocated in the young generation (also called the nursery or new space), which is collected frequently by a fast scavenge collector. If the component re-renders many times per second (for example, during animation driven by state), these short-lived objects create significant scavenge pressure. The scavenge collector must trace all roots, evacuate surviving objects, and update pointers on every collection cycle. High allocation rate in new space means high scavenge frequency.
When the compiler produces a component that reuses cached references, the style object { color } is only allocated once per cache miss. On subsequent renders where isError has not changed, the same object reference is read back out of its cache slot. V8 sees no new allocation for that object. There is nothing to collect in the young generation for that render path.
The deeper implication is pointer stability in the old generation. When an object survives enough scavenge cycles, V8 promotes it to the old generation, which is collected by the incremental Major GC. The major GC uses a card table to track cross-generational pointers: when a young object is stored into a field of an old object, V8 marks the old object's card as dirty so the scavenger knows to trace it. If React's fiber tree (which lives in the old generation after a few renders) holds references to freshly allocated style objects in each render, every render marks those fiber cards dirty, causing the scavenger to re-trace a large portion of the fiber tree on each collection.
With compiler-stable references, the fiber node's prop object pointer does not change between renders. The card is not re-dirtied. The scavenger does not need to trace the fiber subtree for those nodes. This effect compounds across a component tree: a compiled tree with stable references produces dramatically fewer dirty cards than an equivalent uncompiled tree, which reduces scavenge work roughly in proportion to the fraction of renders that hit the memoization cache.
The following architectural rules govern when the compiler can guarantee reference stability.
| Condition | Compiler behavior |
|---|---|
| All inputs to a scope are primitive values | Scope cached; output object is reference-stable on cache hit |
| Input is a prop passed from outside (object) | Stability depends on parent's stability; compiler traces into parent if it is also compiled |
| Input is the result of a hook (useState) | State identity is managed by React; stable unless setState was called |
| Input involves a module-level mutable variable | Compiler bails out of memoization for that scope |
| Input involves a ref.current read | Compiler bails out; refs are intentionally mutable |
The practical performance model is: compiler-emitted memoization is most valuable for components that render frequently with unchanged props (components inside animated containers, items in a virtualized list) because those are precisely the cases where GC pressure from repeated allocation is highest.
Phase 5: SSR Security and Cross-Request Data Leakage from Impure Components
This section addresses a class of security vulnerability that is unique to server-side React rendering and that the React Compiler can silently make worse rather than better. The vulnerability class is cross-request data leakage, sometimes called request bleed or session bleed.
The React SSR model assumes that each call to renderToPipeableStream or renderToString is an isolated execution that produces output solely from the props and React context passed into the root component for that specific request. This assumption holds when all components are pure: their output is a deterministic function of their inputs only, with no reads from or writes to external mutable state.
When a component reads from a module-level mutable variable, it violates purity. In a browser environment this is often harmless because each browser tab has its own JS heap. On a Node.js server, however, all concurrent requests share the same module scope. A module-level variable mutated during one request is visible to all concurrent requests for the lifetime of the process.
```jsx
// DANGEROUS: module-level mutable cache
let cachedUserData = null;

export function UserProfile({ userId }) {
  if (!cachedUserData) {
    cachedUserData = fetchUserDataSync(userId); // synchronous, for illustration
  }
  return <div>{cachedUserData.name}</div>;
}
```
In this pattern, the first request populates cachedUserData with User A's data. Any subsequent request that renders UserProfile before cachedUserData is cleared will render User A's name regardless of which user made the request. This is a direct data leakage between sessions.
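The leak can be reproduced without a server by simulating two sequential requests against the module-level cache. fetchUser below is a hypothetical stand-in for the data source; the mechanism, not the helper, is the point.

```javascript
// Simulation of cross-request leakage (no real server; fetchUser is a
// hypothetical stand-in for a per-user data source).
let cachedUserData = null;
function renderUserProfile(userId, fetchUser) {
  if (!cachedUserData) {
    cachedUserData = fetchUser(userId); // only the FIRST request populates this
  }
  return cachedUserData.name; // later requests see the first user's data
}

const fetchUser = (id) => ({ name: `user-${id}` });
const firstRequest = renderUserProfile("alice", fetchUser);
const secondRequest = renderUserProfile("bob", fetchUser); // leaks alice's data
```

Both calls return "user-alice": the second request's userId is never consulted because the module-level cache is already populated.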
The React Compiler makes this failure mode more likely and harder to detect. When the compiler analyzes UserProfile, it sees cachedUserData as a module-level binding. Because it cannot reason about mutations of module scope, it correctly bails out of memoizing the component body. However, this bailout is silent: the component is emitted unmodified. An engineer who sees compiled output and assumes "the compiler memoized this" is wrong. The output is unchanged, and the leakage behavior is unchanged. The compiler provides no warning that the component accesses module-level mutable state; it simply does not apply reactive scope memoization.
The more dangerous scenario involves module-level state that the compiler does not bail on because it looks like a read of a stable value, but which is in fact mutated by a side effect outside the component.
```jsx
// request-scoped store, wrongly implemented at module level
const requestContext = {
  currentUser: null,
  requestId: null,
};

export function setRequestContext(user, id) {
  requestContext.currentUser = user;
  requestContext.requestId = id;
}

export function UserBadge() {
  // compiler sees: PropertyLoad on a stable module binding
  // it may memoize this scope thinking `requestContext` is stable
  return <span>{requestContext.currentUser?.name}</span>;
}
```
The compiler may analyze requestContext as a stable reference (the binding itself never changes; it always points to the same object) and include requestContext.currentUser?.name in a memoized scope with no reactive dependencies. On the first render, this produces the correct output. On a subsequent render in the same request, the memoization cache returns the cached JSX element. Across requests, if the component is somehow reused (which should not happen in a correct SSR implementation but can happen with certain module-level component instance caching patterns), the cached element from a previous request is returned.
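The staleness hazard can be simulated directly. The cache variable below stands in for a compiler-emitted scope that inferred no reactive dependencies; this is an assumed model for illustration, not actual compiler output.

```javascript
// Simulation of a memoized scope with no reactive dependencies reading a
// mutable module object (cachedBadgeText stands in for a compiler cache slot).
const requestContext = { currentUser: null };
let cachedBadgeText; // never invalidated: the scope has no inferred inputs
function renderUserBadge() {
  if (cachedBadgeText === undefined) {
    // no reactive inputs were inferred, so there is no equality check here
    cachedBadgeText = requestContext.currentUser?.name ?? "anonymous";
  }
  return cachedBadgeText;
}

requestContext.currentUser = { name: "alice" };
const firstRender = renderUserBadge();        // computes from alice
requestContext.currentUser = { name: "bob" }; // later request mutates context
const secondRender = renderUserBadge();       // stale: cached text survives
```

The mutation between renders is invisible to the cache check, so the second render serves the first request's output.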
The correct architecture for SSR is to use React Context for all request-scoped data, never module-level mutable variables. The compiler is aware of React Context reads via the useContext hook and treats them as reactive dependencies. A context value that changes between requests will correctly invalidate the compiler-generated cache on each render call.
```jsx
// CORRECT: request-scoped data via React Context
const UserContext = React.createContext(null);

export function UserBadge() {
  const user = useContext(UserContext); // compiler sees this as a reactive dependency
  return <span>{user?.name}</span>;
}

// In the server request handler:
app.get('/profile', (req, res) => {
  const user = getUserFromRequest(req);
  const stream = renderToPipeableStream(
    <UserContext.Provider value={user}>
      <App />
    </UserContext.Provider>
  );
  stream.pipe(res);
});
```
The following table summarizes the leakage risk by state location pattern.
| State location | Leakage risk | Compiler behavior |
|---|---|---|
| Module-level let / var | Critical: shared across all requests | Bails out of memoization; no warning |
| Module-level object mutated in-place | Critical: same heap object, all requests see mutations | May memoize if reference looks stable; silent incorrect cache |
| React Context via useContext | None: new Provider value per request | Correctly treated as reactive dependency |
| useState inside component | None: component tree is re-instantiated per request | Correctly handled |
| useRef inside component | None: ref is per-instance | Bails out of memoization involving ref.current reads |
| Closure over request-handler-scoped variable | None: closure captures per-request binding | Correctly handled if compiler sees capture |
The enforcement rule is absolute: all request-scoped data in a server-rendered React application must flow through React Context, React Server Component props, or function arguments. Module-level mutable state is categorically incompatible with concurrent SSR and with the React Compiler's memoization model.
Essential References
- React Compiler source repository and architecture notes: https://github.com/facebook/react/tree/main/compiler
- React Compiler working group discussions: https://github.com/reactwg/react-compiler
- React Compiler playground (inspect HIR and output): https://playground.react.dev
- Babel AST explorer: https://astexplorer.net
- V8 blog on generational garbage collection: https://v8.dev/blog/trash-talk
- V8 card table and remembered set internals: https://v8.dev/blog/concurrent-marking
- React renderToPipeableStream API reference: https://react.dev/reference/react-dom/server/renderToPipeableStream
- OWASP Insecure Direct Object Reference (related server-side object isolation failures): https://owasp.org/www-community/vulnerabilities/Insecure_Direct_Object_Reference
- React Server Components security model: https://react.dev/blog/2023/03/22/react-labs-what-we-have-been-working-on-march-2023