Take you step by step to understand how the virtual DOM works

Overview

Both React and Vue have the concept of virtual DOM. How should we understand and grasp the essence of virtual DOM? I recommend everyone to learn the Snabbdom project. Snabbdom is a virtual DOM implementation library. The reason for the recommendation is that the code is relatively small, and the core code is only a few hundred lines; the second is that Vue uses the ideas of this project to implement virtual DOM; the third is the design/implementation and expansion ideas of this project It is worth reference.

snabb /snab/, Swedish, means fast.

To learn virtual DOM, we must first know the basics of DOM and the pain points of directly manipulating DOM with JS.

The role and type structure of the DOM

DOM (Document Object Model) is a document object model that uses an object tree structure to represent an HTML/XML document. The end point of each branch of the tree is a node, and each node contains objects. The methods of the DOM API allow you to manipulate this tree in specific ways, using these methods you can change the structure, style, or content of the document.

All nodes in the DOM tree are first a Node , Node is a base class. Element , Text and Comment all inherit from it.
换句话说, Element , Text Comment Node , ELEMENT_NODE ,
TEXT_NODE and COMMENT_NODE represent element nodes (HTML tags), text nodes, and comment nodes. Among them Element and a subclass is HTMLElement , what is the difference between HTMLElement and Element ? HTMLElement represents elements in HTML, such as: <span> , <img> , etc., and some elements are not HTML standard, such as <svg> . You can use the following method to determine whether this element is HTMLElement :

 document.getElementById('myIMG') instanceof HTMLElement;

Why do you need a virtual DOM?

Browser creation of the DOM is “expensive”. For a classic example, we can create a simple div element by document.createElement('div') and print all the attributes:

It can be seen that there are a lot of properties printed out, and when the complex DOM tree is frequently updated, it will cause performance problems. Virtual DOM uses a native JS object to describe a DOM node, so creating a JS object is much less expensive than creating a DOM object.

VNode

Vnode is an object structure describing virtual DOM in Snabbdom, the content is as follows:

 type Key = string | number | symbol;

interface VNode {
  // CSS 选择器,比如:'div#container'。
  sel: string | undefined;
  
  // 通过 modules 操作 CSS classes、attributes 等。
  data: VNodeData | undefined; 
  
   // 虚拟子节点数组,数组元素也可以是 string。
  children: Array<VNode | string> | undefined;
  
  // 指向创建的真实 DOM 对象。
  elm: Node | undefined;
  
  /**
   * text 属性有两种情况:
   * 1. 没有设置 sel 选择器,说明这个节点本身是一个文本节点。
   * 2. 设置了 sel,说明这个节点的内容是一个文本节点。
   */
  text: string | undefined;
  
  // 用于给已存在的 DOM 提供标识,在同级元素之间必须唯一,有效避免不必要地重建操作。
  key: Key | undefined;
}

// vnode.data 上的一些设置,class 或者生命周期函数钩子等等。
interface VNodeData {
  props?: Props;
  attrs?: Attrs;
  class?: Classes;
  style?: VNodeStyle;
  dataset?: Dataset;
  on?: On;
  attachData?: AttachData;
  hook?: Hooks;
  key?: Key;
  ns?: string; // for SVGs
  fn?: () => VNode; // for thunks
  args?: any[]; // for thunks
  is?: string; // for custom elements v1
  [key: string]: any; // for any other 3rd party module
}

For example, define a vnode object like this:

 const vnode = h(
  'div#container',
  { class: { active: true } },
  [
    h('span', { style: { fontWeight: 'bold' } }, 'This is bold'),
    ' and this is just normal text'
]);

We create the vnode object through the h(sel, b, c) function. h() The main purpose of the code implementation is to determine whether the b and c parameters exist, and process them into data and children. Children will eventually be in the form of an array. Finally, the vnode() function returns the VNode type format defined above.

The running process of Snabbdom

For the convenience of understanding, I divided the typical process of Snabbdom into the creation process and the update process, as shown in the following figure:

Diff processing is the process used to calculate the difference between old and new nodes.

Let’s look at a sample code that Snabbdom runs:

 import {
  init,
  classModule,
  propsModule,
  styleModule,
  eventListenersModule,
  h,
} from 'snabbdom';

const patch = init([
  // 通过传入模块初始化 patch 函数
  classModule, // 开启 classes 功能
  propsModule, // 支持传入 props
  styleModule, // 支持内联样式同时支持动画
  eventListenersModule, // 添加事件监听
]);

// <div id="container"></div>
const container = document.getElementById('container');

const vnode = h(
  'div#container.two.classes',
  { on: { click: someFn } },
  [
    h('span', { style: { fontWeight: 'bold' } }, 'This is bold'),
    ' and this is just normal text',
    h('a', { props: { href: '/foo' } }, "I'll take you places!"),
  ]
);

// 传入一个空的元素节点,创建 DOM。
patch(container, vnode);

const newVnode = h(
  'div#container.two.classes',
  { on: { click: anotherEventHandler } },
  [
    h(
      'span',
      { style: { fontWeight: 'normal', fontStyle: 'italic' } },
      'This is now italic type'
    ),
    ' and this is still just normal text',
    h('a', { props: { href: ''/bar' } }, "I'll take you places!"),
  ]
);

// 再次调用 patch(),将旧节点更新为新节点。
patch(vnode, newVnode);

The running process is described as follows:

  • Creation process: First call init() to initialize, and you need to configure the modules to be used during initialization. For example classModule module is used to configure the class attribute of an element in the form of an object, eventListenersModule module is used to configure event listeners and so on. init() will return after calling patch() function. Create an initialized vnode object through the h() function, call the patch() function, and finally create a real DOM object through createElm() .
  • Update process: Create a new vnode object, call the patch() function to update, and complete the difference update between this node and child nodes through patchVnode() and updateChildren() .

Before explaining the specific creation and update process, let’s take a look at the module idea in Snabbdom. Snabbdom abstracts the update code of virtual DOM related properties from the core code through the design of life cycle functions and module extension. So how is this designed and implemented?

Lifecycle functions and modules

Snabbdom provides a rich set of life cycle functions, namely hook functions, which are applicable in modules or can be defined directly on vnodes. For example, we can define the execution of the hook on the vnode like this:

 h('div.row', {
  key: 'myRow',
  hook: {
    // insert 钩子。
    insert: (vnode) => {
      console.log(vnode.elm.offsetHeight);
    },
  },
});

All lifecycle functions are declared as follows:

nametrigger nodecallback parameter
prepatch starts executingnone
initvnode is addedvnode
createA vnode-based DOM element is createdemptyVnode, vnode
insertThe element is inserted into the DOMvnode
prepatchElement is about to be patchedoldVnode, vnode
updateelement updatedoldVnode, vnode
postpatchElement has been patchedoldVnode, vnode
destroyelements are removed directly or indirectlyvnode
removeElement has been removed from the DOMvnode, removeCallback
postThe patch process has been completednone

其中适用于模块的是: pre , create , update , destroy , remove , post .适用于vnode 声明的是: init , create , insert , prepatch , update , postpatch , destroy , remove .

Let’s see how it is implemented. For example, we take the classModule module as an example. Kangkang’s statement:

 import { VNode, VNodeData } from "../vnode";
import { Module } from "./module";

export type Classes = Record<string, boolean>;

function updateClass(oldVnode: VNode, vnode: VNode): void {
  // 这里是更新 class 属性的细节,先不管。
  // ...
}

export const classModule: Module = { create: updateClass, update: updateClass };

You can see that the last exported module definition is an object, and the key of the object is the name of the hook function. The definition of the module object Module is as follows:

 import {
  PreHook,
  CreateHook,
  UpdateHook,
  DestroyHook,
  RemoveHook,
  PostHook,
} from '../hooks';

export type Module = Partial<{
  pre: PreHook;
  create: CreateHook;
  update: UpdateHook;
  destroy: DestroyHook;
  remove: RemoveHook;
  post: PostHook;
}>;

In TS Partial indicates that the attributes of each key in the object can be null, that is to say, which hook you care about in the module definition, just define which hook. Now that the hook is defined, how is it executed in the process? Then let’s look at the init() function:

 // 模块中可能定义的钩子有哪些。
const hooks: Array<keyof Module> = [
  "create",
  "update",
  "remove",
  "destroy",
  "pre",
  "post",
];

export function init(
  modules: Array<Partial<Module>>,
  domApi?: DOMAPI,
  options?: Options
) {
  // 模块中定义的钩子函数最后会存在这里。
  const cbs: ModuleHooks = {
    create: [],
    update: [],
    remove: [],
    destroy: [],
    pre: [],
    post: [],
  };

  // ...

  // 遍历模块中定义的钩子,并存起来。
  for (const hook of hooks) {
    for (const module of modules) {
      const currentHook = module[hook];
      if (currentHook !== undefined) {
        (cbs[hook] as any[]).push(currentHook);
      }
    }
  }
  
  // ...
}

It can be seen that init() traverses each module during execution, and then stores the hook function in the cbs object. When executing, you can Kangkang patch() the function:

 export function init(
  modules: Array<Partial<Module>>,
  domApi?: DOMAPI,
  options?: Options
) {
  // ...
  
  return function patch(
  oldVnode: VNode | Element | DocumentFragment,
   vnode: VNode
  ): VNode {
    // ...
    
    // patch 开始了,执行 pre 钩子。
    for (i = 0; i < cbs.pre.length; ++i) cbs.pre[i]();
    
    // ...
  }
}

Here is an example of the pre pre hook, —dfae9aa2daad514565e3b5fc37697f66—The execution timing of the hook is when patch() starts to execute. It can be seen that the patch() function is called cyclically at the beginning of the execution cbs 04d55558147391e1e382b4a3e4c4c10a—The related hook stored in pre . The calls of other life cycle functions are similar to this. You can see the places where the corresponding life cycle functions are called in other places in the source code.

The design idea here is the observer pattern . The non-core functions are implemented in modules, combined with the definition of the life cycle, the module can define its own hooks of interest, and then init() is processed into cbs object is to register these hooks ; When execution time comes, these hooks are called to notify the module to process. This separates the core code from the module code. From here, we can see that the observer pattern is a common pattern for code decoupling.

Create a process

Next, let’s take a practical example to look at the creation process from virtual DOM to DOM object, first look at the following code:

 const patch = init([
  // 配置需要使用的模块。
  classModule,
  propsModule,
  styleModule,
  eventListenersModule,
]);

// <div id="container"></div>
const container = document.getElementById('container');

const vnode = h(
  'div#container.two.classes',
  { on: { click: someFn } },
  [
    h('span', { style: { fontWeight: 'bold' } }, 'This is bold'),
    ' and this is just normal text',
    h('a', { props: { href: '/foo' } }, "I'll take you places!"),
  ]
);

// 传入一个空的元素节点,创建 DOM。
patch(container, vnode);

The content of the code is to create a vnode, call the patch() function, pass in a DOM node and a vnode to complete the mounting process.

patch()

Next, let’s come to the core function of Kangkang patch() , this function is returned after the call of init() 8be8691eac73139d30f6dde1b9d4a459—, the function is to mount and update the VNode, the signature is as follows:

 function patch(oldVnode: VNode | Element | DocumentFragment, vnode: VNode): VNode {
    // 为简单起见先不关注 DocumentFragment。
    // ...
}

oldVnode parameter is the old VNode or DOM element or document fragment, vnode parameter is the updated object. Here we only focus on the part of the code related to the creation process, which is described as follows:

  1. Call the pre hook registered on the module.
  2. During the creation process oldVnode the parameter is a DOM object, then it is converted to an empty vnode object, and the attribute records elm . Here the empty vnode object means there is no data and children .
  3. Then call sameVnode() , determine that oldVnode and vnode are not the same, and enter the creation of a logical branch. The specific judgment logic of sameVnode() will be introduced in detail in the update process.
  4. Call createElm() to create a new DOM node, insert the DOM node after creation and delete the oldVnode associated old DOM node. There may also be child nodes that need to be inserted during this process. All inserted vnodes are put into the insertedVnodeQueue queue.
  5. Traverse calls the insertedVnodeQueue insert hook of the vnode object inserted in —545f3a301ad6bf0c612152c61df2236d—. Why is it called uniformly here instead of calling it directly after inserting the DOM node in the previous step? This is explained in createElm() .
  6. Finally, the post hook registered on the module is called.

Let’s take a look at createElm() how DOM nodes are created.

createElm()

The detailed processing process of createElm() is as follows:

  1. Call the possible init hook on the vnode object.
  2. Then deal with the following situations:
    1. If vnode.sel === '!' , this is the method Snabbdom uses to delete the original node, which will insert a new comment node. Because the old node will be deleted after createElm() , so this setting can achieve the purpose of uninstallation.
    2. If vnode.sel the selector definition exists:
      1. Parse the selectors and get id , tag and class .
      2. document.createElement() cc0ff75136e7f0f744ea3d476bf40cbb—或—510dacd736fe511e7d39174e080859a5 document.createElementNS DOM 节点,并记录到vnode.elm中, id 、 tag and class .
      3. Call the create hook on the module.
      4. Process children Child node array:
        1. If children is an array, then recursively call createElm() after creating the child node, call appendChild to mount it to vnode.elm under —fae0c62bc816d7c12909.
        2. children vnode.text ,说明这个元素的内容是个文本, createTextNode创建文本节点并挂载到vnode.elm down.
      5. Call the create hook on the vnode. If the insert hook is defined on the vnode, add the vnode to the insertedVnodeQueue queue.
    3. The remaining case is that vnode.sel does not exist, indicating that the node itself is text, then call createTextNode to create a text node and record it to vnode.elm .
  3. Finally it returns vnode.elm .

In general, createElm() is based on different settings of the sel selector to choose how to create a DOM node. Here is a detailed description of the role of the insertedVnodeQueue queue. The reason why this queue is needed is to call insert after all the descendant nodes have been inserted, so that we can calculate the size and position information of the element in insert to be accurate. Combined with the above process of creating child nodes, createElm() creating child nodes is a recursive call, so the queue will record the child nodes first, and then record itself. This way the order is guaranteed when executing the queue at the end of patch() .

update process

The updated code is as follows:

 // ...

const vnode = h(
  'div#container.two.classes',
  { on: { click: someFn } },
  [
    h('span', { style: { fontWeight: 'bold' } }, 'This is bold'),
    ' and this is just normal text',
    h('a', { props: { href: '/foo' } }, "I'll take you places!"),
  ]
);

// 传入一个空的元素节点,创建 DOM。
patch(container, vnode);

// 创建一个新的 vnode。
const newVnode = h(
  'div#container.two.classes',
  { on: { click: anotherEventHandler } },
  [
    h(
      'span',
      { style: { fontWeight: 'normal', fontStyle: 'italic' } },
      'This is now italic type'
    ),
    ' and this is still just normal text',
    h('a', { props: { href: ''/bar' } }, "I'll take you places!"),
  ]
);

// 再次调用 patch(),将旧节点更新为新节点。
patch(vnode, newVnode);

First of all, patch() will still be called first. We mainly focus on the differences with the creation process. patch() sameVnode() oldVnode vnode的, sameVnode() :

 function sameVnode(vnode1: VNode, vnode2: VNode): boolean {
  // 同样的 key。
  const isSameKey = vnode1.key === vnode2.key;
  
  // Web component,自定义元素标签名,看这里:
  // https://developer.mozilla.org/zh-CN/docs/Web/API/Document/createElement
  const isSameIs = vnode1.data?.is === vnode2.data?.is;
  
  // 同样的选择器。
  const isSameSel = vnode1.sel === vnode2.sel;

  // 三者都相同即是相同的。
  return isSameSel && isSameKey && isSameIs;
}

Next, it will call patchVnode() to do the diff update.

patchVnode()

patchVnode() is the core function of virtual DOM. The specific processing flow is as follows:

  1. First execute the prepatch hook on the vnode.
  2. If oldVnode and vnode are the same object reference, return without processing.
  3. Call the update hook on the module and vnode.
  4. If vnode.text is not defined, several cases of children are handled:
    1. If oldVnode.children and vnode.children both exist and are not the same. Then call updateChildren to update.
    2. vnode.children exists and oldVnode.children does not exist. If oldVnode.text exists, clear it first, then call addVnodes to add a new vnode.children .
    3. vnode.children does not exist and oldVnode.children exists. Call removeVnodes remove oldVnode.children .
    4. If oldVnode.children nor vnode.children exist. Empty if oldVnode.text exists.
  5. If defined vnode.text and is different from oldVnode.text . If oldVnode.children exists, call removeVnodes to clear it. Then set the text content by textContent .
  6. Finally, execute the postpatch hook on the vnode.

It can be seen from the process that the changes to the related attributes of the own node in the diff, such as class , style and the like are updated by modules, but there is not much to expand here. You can look at the code related to the module. The main core processing of diff is focused on children , and then Kangkang diff processes children several related functions.

addVnodes()

This is very simple, first call createElm() to create, and then insert it into the corresponding parent.

removeVnodes()

When it is removed, the destory and remove hooks will be called first. Here we will focus on the calling logic and difference between these two hooks.

  • destory , call this hook first. The logic is to call the hook on the vnode object first, and then call the hook on the module. Then call this hook recursively for vnode.children in this order.
  • remove , this hook will only be triggered when the current element is removed from its parent, the child elements of the removed element will not be triggered, and this hook will be called on both the module and the vnode object , the order is to call on the module first and then on the vnode. What’s more special is to wait for all remove will be called before the element is really removed, which can achieve some delayed deletion requirements.

It can be seen from the above that the calling logic of these two hooks is different, especially remove will only be called on elements that are directly detached from the parent.

updateChildren()

updateChildren() is used to deal with sub-node diff, and it is also a more complicated function in Snabbdom.总的oldCh 5f5f78e80f68ff507cd230f267c0a222—和newCh头、尾一共四个指针,这四个指针oldStartIdx 、 oldEndIdx , newStartIdx and newEndIdx . Then compare the two arrays in the while (oldStartIdx <= oldEndIdx && newStartIdx <= newEndIdx) loop, find the same part for multiplexing and update, and move at most a pair of pointers per comparison process. The detailed traversal process is processed in the following order:

  1. If any of these four pointers points to vnode == null, the pointer moves to the middle, for example: start++ or end–, the generation of null will be explained in the following.
  2. If the old and new start nodes are the same, that is, sameVnode(oldStartVnode, newStartVnode) returns true, then the diff is performed with patchVnode() , and both start nodes are one step forward to the middle.
  3. If the old and new end nodes are the same, patchVnode() is also used for processing, and the two end nodes step back to the middle.
  4. If the old start node is the same as the new end node, process the update first with patchVnode() . Then you need to move the DOM node corresponding to oldStart. The strategy of moving is to move before the next sibling node of the corresponding DOM node oldEndVnode . Why move like this? First of all, oldStart is the same as newEnd, indicating that in the current loop processing, the start node of the old array is moved to the right; because each processing is to move the head and tail pointers to the middle, we update the old array to a new one, this time oldEnd may not have been processed yet, but at this point oldStart is determined to be the last in the current processing of the new array, so it makes sense to move before oldEnd’s next sibling. After moving, oldStart++, newEnd–, move one step to the middle of their respective arrays respectively.
  5. If the old end node is the same as the new start node, first use patchVnode() to process the update, and then move the DOM node corresponding to oldStartVnode before the corresponding DOM node, the reason for moving is the same as the previous step. After moving, oldEnd–, newStart++.
  6. If none of the above is the case, use the key of newStartVnode to find the subscript idx in oldChildren , and there are two different processing logics depending on whether the subscript exists:
    1. If the subscript does not exist, newStartVnode is newly created. Create a new DOM by createElm() and insert it before the corresponding DOM of oldStartVnode .
    2. If the subscript exists, it should be handled in two cases:
      1. If the sel of the two vnodes is different, it is still regarded as newly created, and a new DOM is created by createElm() and inserted before the corresponding DOM of oldStartVnode .
      2. If the sel is the same, the update is processed by patchVnode() , and the vnode corresponding to the subscript oldChildren is set to undefined, which is why == null appears in the previous double pointer traversal . Then insert the updated node before the DOM corresponding to oldStartVnode .
    3. After the above operations, newStart++.

After the traversal is over, there are two more cases to deal with. One is that oldCh has been all processed, and there are new nodes in —62b687797611083350524c74f54d9d70 newCh , and need to create a new DOM for each of the remaining newCh ; The other is that newCh all processing is completed, and oldCh there are old nodes, and the redundant nodes need to be removed. Both cases are handled as follows:

 function updateChildren(
    parentElm: Node,
    oldCh: VNode[],
    newCh: VNode[],
    insertedVnodeQueue: VNodeQueue
  ) { 
    // 双指针遍历过程。
    // ...
      
    // newCh 中还有新的节点需要创建。
    if (newStartIdx <= newEndIdx) {
      // 需要插入到最后一个处理好的 newEndIdx 之前。
      before = newCh[newEndIdx + 1] == null ? null : newCh[newEndIdx + 1].elm;
      addVnodes(
        parentElm,
        before,
        newCh,
        newStartIdx,
        newEndIdx,
        insertedVnodeQueue
      );
    }
      
    // oldCh 中还有旧的节点要移除。
    if (oldStartIdx <= oldEndIdx) {
      removeVnodes(parentElm, oldCh, oldStartIdx, oldEndIdx);
    }
  }

Let’s take a look at the processing of updateChildren() with an actual example:

  1. The initial state is as follows, the old child node array is [A, B, C], and the new node array is [B, A, C, D]:
  1. In the first round of comparison, the start and end nodes are different, so see if newStartVnode exists in the old node, find the position in oldCh[1], then execute patchVnode() to update, and then update oldCh[ 1] = undefined, and insert the DOM before — oldStartVnode , newStartIdx move one step backward, the status after processing is as follows:
  1. 第二轮比较, oldStartVnode newStartVnode , patchVnode()更新, oldStartIdx和—03b93ddaf9d3db2acd7ce04051edf7d3 newStartIdx移动, the status after processing is as follows:
  1. The third round of comparison, oldStartVnode == null , oldStartIdx move to the middle, and the status is updated as follows:
  1. Fourth round of comparisons.  oldStartVnode newStartVnode , patchVnode()更新, oldStartIdx和—0951dabc72ba79d7558c3d08830a08fb newStartIdx移动, the status after processing is as follows:
  1. At this point oldStartIdx is greater than oldEndIdx , and the loop ends. At this time, there are new nodes in newCh that have not been processed yet. You need to call addVnodes() to insert, and the final status is as follows:

Summarize

At this point, the working principle and core content of virtual DOM have been sorted out. If you have time, you can go to the details of the source code of Kangkang Snabbdom to have a closer look. The design ideas and implementation are worth learning.