Reading Code - React

July 15, 2022 13 minute read

It’s daunting to even think about reading the code of a framework I’ve been using for years. I’ve always seen React as a black box that takes JSX and puts things on the screen without thinking too much about it.

Of course, I know this and that about the virtual DOM and the reconciliation algorithm. At least in theory. But it wasn’t until this year that my curiosity finally overwhelmed me. I opened React’s repository but this time with the intent to read the code.

Before you continue reading, I have to mention that this is not meant to be an exhaustive overview of how React works under the hood. In fact, some of my assumptions about that will probably be wrong.

I will focus on the design of the codebase instead. We’ll see what practices and patterns the React core team has applied to a piece of software used by countless engineers. Hopefully, we can draw inspiration from their decisions.

Monorepo

I opened the repository expecting to find a src directory. Instead, I saw one named packages and immediately knew that I was looking at a monorepo. A monorepo is a single repository that hosts multiple different applications or libraries.

The React repo holds 30+ packages including react, react-dom, react-server, react-devtools and many others whose purpose I don’t know.

Its main benefits are an easier local setup for large projects consisting of multiple independent parts and better code reusability between them. This comes at the cost of more complex deployments and increased overall complexity of the codebase.

I’ve never had the chance to talk about my grudges with monorepos so I might as well use the opportunity.

I’ve worked with various monorepo setups in the last three years and my main problem with them is that they make large refactorings too easy. When things are split between repositories you’re forced to implement changes gradually and put more thought into your design.

I find that a productive workflow with a monorepo is harder to implement because your local environment differs from the one running in production. This has bitten me more times than I dare confess.

This isn’t a criticism of React, though. I’m sure the maintainers have figured their processes out.

Where to start?

Reading code you’re not familiar with is confusing. Even more so when you don’t know anyone who can walk you through it. So we have to start establishing an understanding of the codebase from somewhere and build upon it, taking notice of patterns that we identify along the way.

Regardless of the software, I always grab one of its public APIs and start from there.

So what are React’s public APIs? Hooks like useState and useEffect are part of them. But starting from them would make it borderline impossible for us to understand what is going on since we don’t have the context of how components or rendering work.

Instead, we’ll start from one of React’s public methods that gets called only once in every application.

import { createRoot } from 'react-dom/client'

const root = createRoot(container)
root.render(element)

This is the syntax to mount an application to the DOM from React 18 onwards. By reading the implementation of these functions and tracing the operations they do, we can start chipping away at the codebase.

You may have noticed that the function we’ll be exploring isn’t a part of the core library. It’s part of react-dom which represents the browser bindings for React, connecting it to the DOM. This leads me to think that the core is written in an agnostic way, allowing it to be used in different environments.

Creating the Root

The createRoot function only wraps an internal function and does some light validation. Rather than putting the functionality there, the authors keep this check in the entry point and leave the creation of the root instance to a separate file, which also defines the methods attached to the actual root object.

function createRoot(
  container: Element | Document | DocumentFragment,
  options?: CreateRootOptions
): RootType {
  if (__DEV__) {
    if (!Internals.usingClientEntryPoint && !__UMD__) {
      console.error(
        'You are importing createRoot from "react-dom" which is not supported. ' +
          'You should instead import it from "react-dom/client".'
      )
    }
  }
  return createRootImpl(container, options)
}

The internal function has the same name, followed by the Impl suffix. I would’ve named it something more specific like createRootInstance or createRootEntity instead of relying on this pattern to signal that it’s a private function.

It’s worth noting that the private function actually has the same name but is imported with an alias to avoid the name conflict. In general, I’m against naming functions the same unless I’m implementing some polymorphic behavior.

export function createRoot(
  container: Element | Document | DocumentFragment,
  options?: CreateRootOptions
): RootType {
  if (!isValidContainer(container)) {
    throw new Error(
      'createRoot(...): Target container is not a DOM element.'
    )
  }

  // ...

  const root = createContainer(
    container,
    ConcurrentRoot,
    null,
    isStrictMode,
    concurrentUpdatesByDefaultOverride,
    identifierPrefix,
    onRecoverableError,
    transitionCallbacks
  )

  // ...

  return new ReactDOMRoot(root)
}

The first thing the createRoot function does is validate the container and exit early if it’s invalid. This is what’s known as a guard clause and it helps us avoid multiple levels of nesting and more complex conditional statements.
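To make the benefit of a guard clause concrete, here is a minimal sketch comparing the two styles. The functions mountNested and mountGuarded and the container check are hypothetical and written only for this comparison; they are not part of React.

```javascript
// Hypothetical example: neither function exists in React.

// Nested version: the happy path is buried inside the conditionals.
function mountNested(container) {
  if (container != null) {
    if (typeof container.appendChild === 'function') {
      return 'mounted'
    } else {
      throw new Error('Target container cannot hold children.')
    }
  } else {
    throw new Error('Target container is missing.')
  }
}

// Guard-clause version: invalid input exits early and the happy path stays flat.
function mountGuarded(container) {
  if (container == null) {
    throw new Error('Target container is missing.')
  }
  if (typeof container.appendChild !== 'function') {
    throw new Error('Target container cannot hold children.')
  }
  return 'mounted'
}
```

Both functions behave the same way. The second one simply reads top to bottom without forcing the reader to track which else belongs to which if.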

The function is about 80 lines long but I’ve omitted the details so we can focus on the most important parts. The container makes the connection to React’s reconciler. Notice the design decision to use factory functions like createContainer but also rely on the new keyword to create the ReactDOMRoot.

If we look at the ReactDOMRoot function it only binds a value to this.

function ReactDOMRoot(internalRoot: FiberRoot) {
  this._internalRoot = internalRoot
}

Its methods are not defined inline but are instead attached to ReactDOMRoot.prototype, making them available to all instances created by that function. It’s worth noticing the chained assignment in this example, which attaches the same function to both ReactDOMRoot and ReactDOMHydrationRoot.

ReactDOMHydrationRoot.prototype.render =
  ReactDOMRoot.prototype.render = function (
    children: ReactNodeList
  ): void {
    const root = this._internalRoot

    if (root === null) {
      throw new Error('Cannot update an unmounted root.')
    }

    if (__DEV__) {
      // ...
    }

    updateContainer(children, root, null, null)
  }

I’m not a developer who utilises prototypes often. Usually, I rely on factory functions and closures but this approach has its benefits. You can use the instanceof operator to check the object type.

But probably the main reason to rely on prototypes is performance. Functions attached to the prototype are created only once and shared by every instance, whereas a closure-based factory creates new function objects each time it is called.
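The difference between the two styles can be sketched with a small example. The counters below are hypothetical and exist only to show where the functions live; whether the saving is measurable in practice depends on the engine and on how many instances get created.

```javascript
// Hypothetical counters, used only to compare the two styles.

// Factory function with a closure: every call creates a new `increment` function.
function createCounter() {
  let count = 0
  return {
    increment() {
      count += 1
      return count
    },
  }
}

// Constructor with a prototype method: `increment` is created once and shared.
function Counter() {
  this.count = 0
}

Counter.prototype.increment = function () {
  this.count += 1
  return this.count
}

const closureA = createCounter()
const closureB = createCounter()
const protoA = new Counter()
const protoB = new Counter()

console.log(closureA.increment === closureB.increment) // false
console.log(protoA.increment === protoB.increment) // true
console.log(protoA instanceof Counter) // true
console.log(closureA instanceof Counter) // false
```

The closure version keeps count truly private, which is the usual argument in its favour. The prototype version trades that privacy for shared methods and instanceof checks.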

Connecting to the Reconciler

The container is an internal value that gets created when we initially create the ReactDOMRoot. I decided to follow the updateContainer method a little further, even though it’s part of the reconciler.

export const updateContainer = enableNewReconciler
  ? updateContainer_new
  : updateContainer_old

This is probably the first time I’m seeing a conditional export being used. All exported values in the reconciler are selected this way based on the enableNewReconciler feature flag. Feature flags are mostly used for gradual deployments or A/B testing in normal software development, but they can also be used to opt in to or out of different versions of the same functionality.
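The same pattern can be sketched outside of React. The flag enableNewFormatter and both formatName implementations below are hypothetical; only the naming style mirrors what the reconciler does.

```javascript
// Hypothetical flag and implementations; the names only mirror React's naming style.
const enableNewFormatter = false

function formatName_old(name) {
  return name.trim()
}

function formatName_new(name) {
  return name.trim().toLowerCase()
}

// In a module this line would be prefixed with `export`,
// so callers never know which version they received.
const formatName = enableNewFormatter ? formatName_new : formatName_old

console.log(formatName('  Ada ')) // Ada
```

The decision is made once, when the module is evaluated, so call sites stay free of flag checks.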

The different implementations are kept in files named the same way, differing only in the .new.js or .old.js suffix given to them. I got lost in the implementation of the actual function, though. The code itself was readable and all the functions had descriptive names.

But I was lacking the context around how the reconciliation algorithm works to make more sense of it.

export function updateContainer(
  element: ReactNodeList,
  container: OpaqueRoot,
  parentComponent: ?React$Component<any, any>,
  callback: ?Function
): Lane {
  // ...

  const eventTime = requestEventTime()
  const lane = requestUpdateLane(current)

  if (enableSchedulingProfiler) {
    markRenderScheduled(lane)
  }

  // ...

  const update = createUpdate(eventTime, lane)
  // Caution: React DevTools currently depends on this property
  // being called "element".
  update.payload = { element }

  // ...

  const root = enqueueUpdate(current, update, lane)

  // ...

  return lane
}

I couldn’t tell you much about what the function does beyond the fact that it enqueues the next component tree update, but I wanted to point out the comment below the createUpdate call. It’s a good example of a valuable comment for an operation whose reason wouldn’t be obvious.

This is as far as we’ll go into the reconciler, but before we continue with the renderer, it’s important to understand what internal structure React uses for its components.

What is a Component?

If we look at the signature of ReactDOMRoot.prototype.render we see it expects a tree of React elements describing our components. Fortunately, React doesn’t make users work with the internal object representations directly. Instead, we have a handy HTML-like syntax in JSX that allows us to describe our components much like any other web page.

Then, before our application runs, this JSX is transpiled to regular function calls that create objects.

Every JSX element can be replaced with a call to React.createElement. It’s possible to describe your UI using this function directly but it would be much harder than the alternative syntax. So our render method will receive the result of the call to the function mentioned above.

export function createElement(type, config, children) {
  let propName

  // Reserved names are extracted
  const props = {}

  let key = null
  let ref = null
  let self = null
  let source = null

  if (config != null) {
    // ...

    for (propName in config) {
      if (
        hasOwnProperty.call(config, propName) &&
        !RESERVED_PROPS.hasOwnProperty(propName)
      ) {
        props[propName] = config[propName]
      }
    }
  }

  const childrenLength = arguments.length - 2
  if (childrenLength === 1) {
    props.children = children
  } else if (childrenLength > 1) {
    // ...
  }

  // ...

  return ReactElement(
    type,
    key,
    ref,
    self,
    source,
    ReactCurrentOwner.current,
    props
  )
}

But since React 17, JSX doesn’t automatically get transpiled to React.createElement. Up until this update, each time you used JSX you had to have React in scope for the transpilation to work.

But this was unintuitive for engineers who didn’t know how JSX works. This new update, though, allows build tools to use a different function that is not attached to the React object. Its implementation is quite similar to React.createElement.

export function jsx(type, config, maybeKey) {
  let propName

  // Reserved names are extracted
  const props = {}

  let key = null
  let ref = null

  // ...

  for (propName in config) {
    if (
      hasOwnProperty.call(config, propName) &&
      !RESERVED_PROPS.hasOwnProperty(propName)
    ) {
      props[propName] = config[propName]
    }
  }

  // ...

  return ReactElement(
    type,
    key,
    ref,
    undefined,
    undefined,
    ReactCurrentOwner.current,
    props
  )
}
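The two entry points differ mostly in how they receive their arguments: createElement takes children as trailing arguments and reads the key from the config, while jsx receives children inside the config and the key as a separate argument. The sketch below illustrates this with simplified stand-ins. createElementSketch and jsxSketch are hypothetical and deliberately skip refs, default props, and every development-only check.

```javascript
// Simplified stand-ins written for this example; they are not React's code.
function createElementSketch(type, config, ...children) {
  // The key travels inside the config and has to be pulled out of the props.
  const { key = null, ...props } = config || {}
  if (children.length === 1) {
    props.children = children[0]
  } else if (children.length > 1) {
    props.children = children
  }
  return { type, key, props }
}

function jsxSketch(type, config, maybeKey) {
  // Children are already part of the config; the key arrives separately.
  const key = maybeKey === undefined ? null : maybeKey
  return { type, key, props: { ...config } }
}

// Both calls describe: <li key="a" className="item">Hello</li>
const classic = createElementSketch('li', { key: 'a', className: 'item' }, 'Hello')
const automatic = jsxSketch('li', { className: 'item', children: 'Hello' }, 'a')

console.log(JSON.stringify(classic) === JSON.stringify(automatic)) // true
```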

In the end, both of them delegate to a factory function called ReactElement. They are nearly identical, yet the authors have decided to duplicate the logic, which raises an important question. When should you extract a common function?

The common wisdom preaches to eliminate repetition on sight. But it seems that the practical advice on clean code is to wait and duplicate before you create a common abstraction. Repetitive code is annoying to manage but it’s not difficult. A wrong abstraction can become a hotbed for complexity, though.

const ReactElement = function (
  type,
  key,
  ref,
  self,
  source,
  owner,
  props
) {
  const element = {
    $$typeof: REACT_ELEMENT_TYPE,
    type: type,
    key: key,
    ref: ref,
    props: props,
    _owner: owner,
  }

  if (__DEV__) {
    element._store = {}

    Object.defineProperty(element._store, 'validated', {
      configurable: false,
      enumerable: false,
      writable: true,
      value: false,
    })

    Object.defineProperty(element, '_self', {
      configurable: false,
      enumerable: false,
      writable: false,
      value: self,
    })

    Object.defineProperty(element, '_source', {
      configurable: false,
      enumerable: false,
      writable: false,
      value: source,
    })

    // ...
  }

  return element
}

There is nothing too exciting going on in this method - it just assigns the proper values to the properties of the component object. It’s worth noting the usage of the defineProperty method in development, though.

It gives you more granular control over how every property on the object behaves.

Specifying writable: false means that the value can’t be reassigned afterward. The enumerable property defines whether the property will appear in a for..in loop or when Object.keys is called with the object. And the configurable value specifies whether the property can be deleted or have these options changed afterward.
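A short sketch shows the three flags in action. The element object and the 'App.js' value are hypothetical. Reflect.set and Reflect.deleteProperty are used because they report failure as false in both strict and sloppy mode, whereas a plain assignment or delete would throw only in strict mode.

```javascript
// Hypothetical object used only to demonstrate the descriptor flags.
const element = { type: 'div' }

Object.defineProperty(element, '_source', {
  configurable: false,
  enumerable: false,
  writable: false,
  value: 'App.js',
})

// enumerable: false hides the property from Object.keys and for...in.
console.log(Object.keys(element)) // [ 'type' ]

// The value is still readable when accessed directly.
console.log(element._source) // App.js

// writable: false rejects reassignment.
console.log(Reflect.set(element, '_source', 'Other.js')) // false

// configurable: false rejects deletion and later descriptor changes.
console.log(Reflect.deleteProperty(element, '_source')) // false
console.log(element._source) // App.js
```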

I haven’t had the need to implement such control over an object’s properties in my day-to-day but it’s a handy way for the library to create internals that will only get used in development and prevent them from surfacing in places they don’t want.

The REACT_ELEMENT_TYPE constant is actually a Symbol used to mark that the object is of type component. In all honesty, I haven’t used symbols but this seems like a very good use case. Since they’re calling a factory function, not using the new keyword, checks made with instanceof won’t work.
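Here is a minimal sketch of how a symbol can tag plain objects so they can be recognised without instanceof. ELEMENT_TYPE, createElementObject, and isElementObject are hypothetical stand-ins and the symbol key is made up. The sketch also shows one property of this approach that I’m certain of: a symbol does not survive a JSON round trip, so an object parsed from JSON can never carry the tag.

```javascript
// Stand-in for React's constant. Symbol.for returns the same symbol
// for the same key everywhere in the program.
const ELEMENT_TYPE = Symbol.for('example.element')

// Hypothetical factory and check, loosely modeled on the snippets above.
function createElementObject(type, props) {
  return { $$typeof: ELEMENT_TYPE, type, props }
}

function isElementObject(value) {
  return (
    typeof value === 'object' &&
    value !== null &&
    value.$$typeof === ELEMENT_TYPE
  )
}

const real = createElementObject('div', {})

// JSON has no representation for symbols, so the tag is dropped on serialization.
const roundTripped = JSON.parse(JSON.stringify(real))

console.log(isElementObject(real)) // true
console.log(isElementObject(roundTripped)) // false
```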

My first thought was that this is an obvious inconsistency with how the ReactDOMRoot was created. To my surprise, there was a comment above the ReactElement function which explained the specific reason. Consistency is important and when you’re breaking it you should always explain why.

Interactions Between Renderer and Reconciler

At this point, we know how the renderer gets created and what internal representation React uses for its components. But to learn how elements are actually displayed on the screen we need to read the reconciler’s docs.

The reconciler holds the diffing algorithm that does the imperative work underneath React’s declarative syntax. It figures out the changes and tells the renderer to display them. The renderer has to define certain methods to handle the rendering of the components but it doesn’t know when to use them.

These methods are only called by the reconciler.

We won’t be dabbling into the reconciliation immediately, though. Even if it’s enticing to try and understand its secrets, we still have a long way to go when it comes to rendering. I read the reconciler’s README and it goes into details about how it interacts with the renderer.

Essentially, every renderer must follow a specific interface. It must provide a number of methods and properties on which the reconciler will rely. This means that as long as you have the required methods you can build your very own renderer and display components in the console instead of the browser.

const Reconciler = require('react-reconciler')

const HostConfig = {
  createInstance(type, props) {
    // e.g. DOM renderer returns a DOM node
  },
  // ...
  supportsMutation: true, // it works by mutating nodes
  appendChild(parent, child) {
    // e.g. DOM renderer would call .appendChild() here
  },
  // ...
}

const MyRenderer = Reconciler(HostConfig)

const RendererPublicAPI = {
  render(element, container, callback) {
    // Call MyRenderer.updateContainer() to schedule changes on the roots.
    // See ReactDOM, React Native, or React ART for practical examples.
  },
}

module.exports = RendererPublicAPI

The name used to provide this interface is HostConfig. I understand that they’ve named it as such because the renderer is connecting React to the host environment but when I read the word configuration I imagine something related to environment variables.

Naming things is one of the hardest problems in computer science, after all, and I’m not sure my naming preferences would’ve made things any better.
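To get a feel for the split of responsibilities without pulling in react-reconciler, here is a toy sketch. consoleHostConfig is a hypothetical host config that builds plain objects instead of DOM nodes, and mountTree is a stand-in for the reconciler that only walks the tree once. The real reconciler also diffs, schedules, and calls many more methods than the three used here.

```javascript
// Hypothetical host config that "renders" to plain objects instead of the DOM.
const consoleHostConfig = {
  createInstance(type, props) {
    return { type, props, children: [] }
  },
  createTextInstance(text) {
    return { text }
  },
  appendChild(parent, child) {
    parent.children.push(child)
  },
}

// Toy stand-in for the reconciler: it decides when to call the host config.
function mountTree(element, hostConfig) {
  if (typeof element === 'string') {
    return hostConfig.createTextInstance(element)
  }
  const { children = [], ...props } = element.props
  const instance = hostConfig.createInstance(element.type, props)
  for (const child of children) {
    hostConfig.appendChild(instance, mountTree(child, hostConfig))
  }
  return instance
}

// Turns the host instances into a string we can print to the console.
function serialize(instance) {
  if ('text' in instance) {
    return instance.text
  }
  const inner = instance.children.map(serialize).join('')
  return `<${instance.type}>${inner}</${instance.type}>`
}

const tree = {
  type: 'ul',
  props: {
    children: [
      { type: 'li', props: { children: ['one'] } },
      { type: 'li', props: { children: ['two'] } },
    ],
  },
}

console.log(serialize(mountTree(tree, consoleHostConfig))) // <ul><li>one</li><li>two</li></ul>
```

Swapping consoleHostConfig for an object whose methods create and append DOM nodes would change the output target without touching mountTree, which is the point of the interface.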

Renderer Methods

Here you can see all the renderer methods which the reconciler will call at specific times.

The renderer can reach great levels of complexity and we won’t cover each function. Instead, we’ll focus on the main ones, used to display content on the screen in order to piece the puzzle of how rendering works.

It’s important to note that even though you can create your own custom renderer, the react-reconciler API doesn’t follow stability guarantees. This means that it’s getting tweaks far more often than the renderer or the core.

The createInstance function is the method that will get called to render the visual representation for each component. In this case, it will prepare DOM elements, but the renderer can theoretically paint anything on the screen.

export function createInstance(
  type: string,
  props: Props,
  rootContainerInstance: Container,
  hostContext: HostContext,
  internalInstanceHandle: Object
): Instance {
  let parentNamespace: string

  if (__DEV__) {
    // ...
  } else {
    parentNamespace = hostContext
  }

  const domElement: Instance = createElement(
    type,
    props,
    rootContainerInstance,
    parentNamespace
  )

  //...

  return domElement
}

Essentially this function is doing a conditional assignment and it creates a DOM element using a factory function. On a side note, I would have assigned the value to parentNamespace before the conditionals to avoid the else statement.
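The alternative I have in mind can be sketched as follows. Both functions are hypothetical, and the assumption that the development context is an object with a namespace property is mine, made only so the example has something to read in the development branch.

```javascript
// Hypothetical helpers; `isDev` stands in for React's __DEV__ flag.

// Shape used in the snippet above: declare first, assign in both branches.
function resolveWithElse(context, isDev) {
  let parentNamespace
  if (isDev) {
    parentNamespace = context.namespace // assumed shape of the dev context
  } else {
    parentNamespace = context
  }
  return parentNamespace
}

// Alternative: start from the production value and override it in development.
function resolveWithDefault(context, isDev) {
  let parentNamespace = context
  if (isDev) {
    parentNamespace = context.namespace // assumed shape of the dev context
  }
  return parentNamespace
}
```

Both versions return the same value for the same input. The second one only removes a branch, so this is a matter of taste more than correctness.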

The createTextInstance function does the same for pure text nodes.

export function createTextInstance(
  text: string,
  rootContainerInstance: Container,
  hostContext: HostContext,
  internalInstanceHandle: Object
): TextInstance {
  if (__DEV__) {
    const hostContextDev = hostContext
    validateDOMNesting(null, text, hostContextDev.ancestorInfo)
  }
  const textNode: TextInstance = createTextNode(
    text,
    rootContainerInstance
  )

  precacheFiberNode(internalInstanceHandle, textNode)
  return textNode
}

The pattern it uses is very similar to what we’ve already seen. It does any necessary validation in development mode and then delegates the creation to another factory function.

These two functions are responsible for creating the proper elements that will then get added to the DOM, but we haven’t looked at how exactly this happens yet. The appendInitialChild and appendChild functions are responsible for this.

export function appendChild(
  parentInstance: Instance,
  child: Instance | TextInstance
): void {
  parentInstance.appendChild(child)
}

They receive a parent instance and a child, and call the parent’s appendChild method to put the rendered component into place. I guess that it would be very hard to pinpoint the exact place where a component has to be inserted so this is always done with the help of the parent element.

Summary

With that, we wrap up our initial overview of React’s rendering process. We learned how renderers work and what methods they implement. We now know how they connect to the reconciler and how components are represented internally.

Together with that, we saw what design practices the React core team has applied in their codebase and most importantly - that reading a library’s implementation is not that scary after all!
