Complex data transformations with normalizr

The author once developed a Mini Program for data sharing, with sharing logic similar to Baidu Netdisk. The recipient of shared data can process it and then share it onward (the sharer controls the expiration time of the data and whether it may be processed and re-shared).

The shared data is a deeply nested JSON object. When a user reads shared data, it is stored in the Mini Program cloud database (shared data is kept separate from business data, so the business server is not used to maintain it). If the data were stored exactly as received, the cloud database would quickly grow very large; moreover, we could not analyze individual items or retrieve individual pieces of sub-data for sharers.

This is where the data needs to be transformed so that it can be split up and maintained. For this we can use normalizr, a library written by Redux author Dan Abramov.

normalizr was created to handle deep, complex nested objects.

How to use

Slightly modifying the official example, suppose we have obtained the following book data:

 const originalData = {
  id: "1",
  title: "JavaScript 从入门到放弃",
  // author
  author: {
    id: "1",
    name: "chc"
  },
  // comments
  comments: [
    {
      id: "1",
      content: "作者写的太好了",
      commenter: {
        id: "1",
        name: "chc"
      }
    },
    {
      id: "2",
      content: "楼上造假数据哈",
      commenter: {
        id: "2",
        name: "dcd"
      }
    },
  ]
};

From this data we can identify three entities: books, comments, and users. We build the schemas starting from the most basic entity:

 import { normalize, schema } from 'normalizr';

// First entity: users
const user = new schema.Entity('users');

// Second entity: comments
const comment = new schema.Entity('comments', {
  // the commenter is a user
  commenter: user
});

// Third entity: books
const book = new schema.Entity('books', {
  // author
  author: user,
  // comments
  comments: [comment]
});

// Pass in the data together with the outermost schema
const normalizedData = normalize(originalData, book);

Let’s take a look at the final data first.

 {
  "entities": {
    "users": {
      "1": {
        "id": "1",
        "name": "chc"
      },
      "2": {
        "id": "2",
        "name": "dcd"
      }
    },
    "comments": {
      "1": {
        "id": "1",
        "content": "作者写的太好了",
        "commenter": "1"
      },
      "2": {
        "id": "2",
        "content": "楼上造假数据哈",
        "commenter": "2"
      }
    },
    "books": {
      "1": {
        "id": "1",
        "title": "JavaScript 从入门到放弃",
        "author": "1",
        "comments": [
          "1",
          "2"
        ]
      }
    }
  },
  "result": "1"
}

Setting the other information aside, we can see that three entity collections are produced: users, comments, and books. In each collection the key is the entity's id and the value is the flattened record. From here we can add and update data using either the objects themselves or arrays (via Object.values).
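As a rough illustration of the last point, the sketch below keeps the object form for lookups by id and uses Object.values to get one array of rows per collection. The sample data and the names commenterOfSecond and rowsByCollection are made up for this example; how the rows are then written to a cloud database depends on that database's API and is not shown.

```javascript
// Hypothetical sample shaped like normalizedData.entities from the example above.
const entities = {
  users: { 1: { id: '1', name: 'chc' }, 2: { id: '2', name: 'dcd' } },
  comments: {
    1: { id: '1', content: 'first', commenter: '1' },
    2: { id: '2', content: 'second', commenter: '2' },
  },
  books: { 1: { id: '1', title: 'Demo book', author: '1', comments: ['1', '2'] } },
};

// The object form gives direct lookups by id ...
const commenterOfSecond = entities.users[entities.comments['2'].commenter];

// ... while Object.values gives one array of rows per collection,
// which is convenient for batch inserts or updates.
const rowsByCollection = Object.fromEntries(
  Object.entries(entities).map(([name, table]) => [name, Object.values(table)])
);

console.log(commenterOfSecond.name);        // prints dcd
console.log(rowsByCollection.users.length); // prints 2
```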

Parsing logic

At this point you may be confused. Setting the code implementation aside, let's first analyze how the library interprets the schema we wrote, so that you can apply it in real scenarios. Look at the data and the schema definitions again:

Data structure

 {
  id: "1",
  title: "JavaScript 从入门到放弃",
  // author
  author: {
    id: "1",
    name: "chc"
  },
  // comments
  comments: [
    {
      id: "1",
      content: "作者写的太好了",
      commenter: {
        id: "1",
        name: "chc"
      }
    },
     {
      id: "2",
      content: "楼上造假数据哈",
      commenter: {
        id: "2",
        name: "dcd"
      }
    },
  ]
}

  • Book information is the first-level object. Its data contains id, title, author, and comments, and the corresponding schema is as follows:

      const book = new schema.Entity('books', {
        // author
        author: user,
        // one book has many comments, so an array is used here
        comments: [comment]
      });

    Here id and title are properties of the book itself and need no attention; the definition only lists the nested structures that have to be parsed. The 'books' string plays no part in parsing; it is the key under which books appear in the entities object.
  • Next, look at user:

      const user = new schema.Entity('users');

    A user has no nested information to parse, so the entity is simply defined directly.
  • Finally, the comments:

      const comment = new schema.Entity('comments', {
        // the commenter is a user
        commenter: user
      });

      {
        id: "1",
        content: "作者写的太好了",
        commenter: {
          id: "1",
          name: "chc"
        }
      }

    Once a single comment is taken out of the original data structure, the mapping is very clear.
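To make this walk-through concrete, here is a heavily simplified sketch of the idea in plain JavaScript. It is not normalizr's actual implementation: the Entity class, visit, and normalizeSketch are hypothetical stand-ins written for this article, and the sketch assumes every record has an id field and ignores options such as idAttribute.

```javascript
// Simplified stand-in for schema.Entity: a collection name plus the
// nested fields that point to other entities.
class Entity {
  constructor(key, definition = {}) {
    this.key = key;
    this.definition = definition;
  }
}

function visit(value, schema, entities) {
  // An array schema such as [comment] means "a list of that entity".
  if (Array.isArray(schema)) {
    return value.map((item) => visit(item, schema[0], entities));
  }
  const record = { ...value };
  // Replace every nested field named in the definition with its id(s).
  for (const [field, fieldSchema] of Object.entries(schema.definition)) {
    if (record[field] !== undefined) {
      record[field] = visit(record[field], fieldSchema, entities);
    }
  }
  // Store the flattened record under entities[schema.key][id].
  const table = (entities[schema.key] = entities[schema.key] || {});
  table[record.id] = { ...table[record.id], ...record };
  return record.id;
}

function normalizeSketch(input, schema) {
  const entities = {};
  const result = visit(input, schema, entities);
  return { entities, result };
}

const user = new Entity('users');
const comment = new Entity('comments', { commenter: user });
const book = new Entity('books', { author: user, comments: [comment] });

const sketchOutput = normalizeSketch(
  {
    id: '1',
    title: 'Demo book',
    author: { id: '1', name: 'chc' },
    comments: [
      { id: '1', content: 'first', commenter: { id: '1', name: 'chc' } },
      { id: '2', content: 'second', commenter: { id: '2', name: 'dcd' } },
    ],
  },
  book
);

console.log(sketchOutput.result);                          // prints 1
console.log(Object.keys(sketchOutput.entities.users));     // prints [ '1', '2' ]
```

Fields that are not listed in a definition (id, title, content, name) are copied through untouched, which is why the schema only needs to describe the nested parts.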

Advanced usage

Working with arrays

normalizr can parse a single object, but what if the business passes an array instead? Just as with comments, wrap the schema in an array:

 [
  {
    id: '1',
    title: "JavaScript 从入门到放弃"
    // ...
  },
  {
    id: '2',
    // ...
  }
]

const normalizedData = normalize(originalData, [book]);

Reverse parsing

To recover the original nested data, we only need the result and entities from the normalizedData obtained above.

 import { denormalize, schema } from 'normalizr';

//...

denormalize(normalizedData.result, book, normalizedData.entities);
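Conceptually, reverse parsing walks the ids back into nested objects. The following is a small hand-written sketch of that idea, not normalizr's implementation: the relations table and denormalizeSketch are hypothetical names, and the entities object is sample data in the shape shown earlier.

```javascript
// Hypothetical description of which fields hold ids and which
// collection those ids refer to (an array means "list of ids").
const relations = {
  books: { author: 'users', comments: ['comments'] },
  comments: { commenter: 'users' },
  users: {},
};

function denormalizeSketch(id, type, entities) {
  const record = { ...entities[type][id] };
  for (const [field, target] of Object.entries(relations[type])) {
    if (record[field] === undefined) continue;
    record[field] = Array.isArray(target)
      ? record[field].map((childId) => denormalizeSketch(childId, target[0], entities))
      : denormalizeSketch(record[field], target, entities);
  }
  return record;
}

const entities = {
  users: { 1: { id: '1', name: 'chc' }, 2: { id: '2', name: 'dcd' } },
  comments: {
    1: { id: '1', content: 'first', commenter: '1' },
    2: { id: '2', content: 'second', commenter: '2' },
  },
  books: { 1: { id: '1', title: 'Demo book', author: '1', comments: ['1', '2'] } },
};

const restored = denormalizeSketch('1', 'books', entities);
console.log(restored.author.name);                 // prints chc
console.log(restored.comments[1].commenter.name);  // prints dcd
```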

Entity configuration

During development, the way an entity is parsed can be customized through its configuration options.

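 const book = new schema.Entity('books', {
  // author
  author: user,
  // one book has many comments, so an array is used here
  comments: [comment]
}, {
  // The primary key defaults to id; set idAttribute to use another field, such as cid or key
  idAttribute: 'id',
  // Pre-processing strategy; the arguments are the entity's input value, its parent object,
  // and the key under which it appears in the parent
  processStrategy: (value, parent, key) => value,
  // Merge strategy for two records with the same id. The default simply spreads both,
  // { ...prev, ...next }; it can be customized, as here
  mergeStrategy: (prev, next) => ({
    ...prev,
    ...next,
    // Marks the record as merged; added whenever a duplicate id is encountered
    isMerge: true
  }),
});

// A more complex example, using user
const user = new schema.Entity('users', {
}, {
  processStrategy: (value, parent, key) => {
    // Add a property to the parent object,
    // e.g. commenter: "1" => commenterId: "1", or author: "2" => authorId: "2".
    // Note that the commenter or author property itself cannot currently be removed with delete
    parent[`${key}Id`] = value.id

    // A user reached through a comment gets a commentIds property
    if (key === 'commenter') {
      return {
        ...value,
        commentIds: [parent.id]
      }
    }
    // Do not forget to return a value; otherwise no user record is generated
    return {
      ...value,
      bookIds: [parent.id]
    };
  },
  mergeStrategy: (prev, next) => ({
    ...prev,
    ...next,
    // Gather all of the user's comments; either record may lack the array
    commentIds: [...(prev.commentIds || []), ...(next.commentIds || [])],
    // Gather all of the user's books
    bookIds: [...(prev.bookIds || []), ...(next.bookIds || [])],
    isMerge: true
  }),
})

// The resulting user records, for input in which dcd wrote comments 2 and 3
{
  "1": {
    "id": "1",
    "name": "chc",
    // chc is both the book's author and a commenter, so the two records were merged
    "commentIds": ["1"],
    "bookIds": ["1"],
    "isMerge": true
  },
  "2": {
    "id": "2",
    "name": "dcd",
    // dcd wrote 2 comments, which were merged
    "commentIds": [
      "2",
      "3"
    ],
    "bookIds": [],
    "isMerge": true
  }
}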
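To see when a merge strategy fires, the sketch below replays the idea by hand without the library. It assumes, as a simplification, that each occurrence of a user has already passed through a processStrategy like the one above; the occurrences array and the users store are made up for this illustration, and the fallback to an empty array guards against a record that lacks commentIds or bookIds.

```javascript
// A merge strategy in the spirit of the user example; missing arrays
// are treated as empty so that spreading never fails.
const mergeStrategy = (prev, next) => ({
  ...prev,
  ...next,
  commentIds: [...(prev.commentIds || []), ...(next.commentIds || [])],
  bookIds: [...(prev.bookIds || []), ...(next.bookIds || [])],
  isMerge: true,
});

// Hypothetical occurrences of users, in the order they are encountered:
// chc as the book's author, chc as a commenter, dcd as a commenter.
const occurrences = [
  { id: '1', name: 'chc', bookIds: ['1'] },
  { id: '1', name: 'chc', commentIds: ['1'] },
  { id: '2', name: 'dcd', commentIds: ['2'] },
];

// The first record with a given id is stored as is; later records
// with the same id go through the merge strategy.
const users = {};
for (const record of occurrences) {
  users[record.id] = users[record.id]
    ? mergeStrategy(users[record.id], record)
    : record;
}

console.log(users['1'].isMerge);    // prints true
console.log(users['1'].commentIds); // prints [ '1' ]
console.log(users['2'].isMerge);    // prints undefined
```

A record that is seen only once never reaches the merge strategy, so any field the strategy adds (isMerge here) is absent from it.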

The library can also perform more complex data formatting; you can learn more from the API documentation.

Other

That said, normalizr has a limited range of use cases, and its maintainer has changed. The main repository is currently unmaintained (its issues have also been closed). Still, the normalizr code itself is stable enough.

The author is also considering some new scenarios and trying to add new features (such as id conversion) and improvements (a TypeScript refactor) to normalizr. If you run into any problems while using normalizr, you can also contact me. The repository is currently at normalizr-helper.