Share a few tips on index creation

Brother Song has talked with friends a lot about indexes in MySQL before. However, there are still some details to watch when using them. If these details are ignored, the effectiveness of an index can be greatly reduced.

1. Redundant index

Note that I say redundant indexes here rather than duplicate indexes. I think the probability of friends creating duplicate indexes is still relatively small: creating multiple identical indexes on the same field is an obvious mistake, and few people make it. Redundant indexes, however, are easily overlooked. Let’s take a look at them.

1.1 The left column of the joint index

For example, suppose I create a joint index (A, B, C). According to the leftmost matching principle we discussed before, this joint index will be used when we query by A, by (A, B), or by (A, B, C), so there is no need to create a separate index on the A field or a joint index on the A and B fields.
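The leftmost matching principle can be checked with a small sketch like the one below. The table name t_demo, its columns, and the index name idx_a_b_c are hypothetical and exist only for illustration. The execution plan you actually get depends on your data and MySQL version.

```sql
-- Hypothetical table, used only to illustrate the leftmost-prefix rule.
CREATE TABLE t_demo (
    id INT PRIMARY KEY,
    a  INT,
    b  INT,
    c  INT,
    KEY idx_a_b_c (a, b, c)
) ENGINE = InnoDB;

-- Each predicate below is a leftmost prefix of (a, b, c), so the optimizer
-- can consider idx_a_b_c for all three queries. Separate indexes on (a)
-- or (a, b) would therefore be redundant.
EXPLAIN SELECT * FROM t_demo WHERE a = 1;
EXPLAIN SELECT * FROM t_demo WHERE a = 1 AND b = 2;
EXPLAIN SELECT * FROM t_demo WHERE a = 1 AND b = 2 AND c = 3;
```

In the EXPLAIN output, the key column shows which index the optimizer chose for each query.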

1.2 Add primary key to index

Suppose I have a table with the following fields (ID, A, B, C), where ID is the primary key, and now I have created a joint index (A, ID) on the two fields A and ID.

As friends know from Brother Song’s previous introduction, the leaf nodes of a secondary index store the primary key value, which here is the ID, so the ID field in this joint index is obviously redundant.
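A minimal sketch of this point, assuming an InnoDB table. The names t_pk_demo and idx_a are hypothetical.

```sql
-- Hypothetical table: id is the primary key and only a is indexed.
CREATE TABLE t_pk_demo (
    id INT PRIMARY KEY,
    a  INT,
    b  INT,
    c  INT,
    KEY idx_a (a)
) ENGINE = InnoDB;

-- InnoDB stores the primary key value in every secondary index entry,
-- so this query can be answered from idx_a alone. There is no need to
-- declare the index as (a, id).
EXPLAIN SELECT id FROM t_pk_demo WHERE a = 1;
```

If the Extra column of the plan reports Using index, the query was satisfied from the secondary index without going back to the table.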

In most cases we do not need redundant indexes, but there are some special situations where we may have to create them. Friends still need to analyze each case on its own merits.

Another thing to note is that indexes of different types on the same field are not regarded as duplicates. For example, if a field has both a normal index and a full-text index, they are not considered duplicate indexes.

2. Hidden index sorting

In the last article, Brother Song talked with everyone about index sorting.

Based on that article, friends, let’s think about this question: suppose I have a table with the fields (ID, A, B), where ID is the primary key, and I create an index on the A field. Now suppose I have the following query SQL:

select ... from table where A = xxx order by ID

Since the entries of secondary index A also contain the ID field, rows with the same value of A are stored in ID order, so the above query can use index sorting. If, due to other needs, we later expand index A into a joint index (A, B), rows with the same value of A are ordered by B first and only then by ID, so the above query can no longer use index sorting and has to fall back to filesort. It is easy for friends to overlook this kind of problem when creating or modifying indexes, so be sure to be careful.
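The effect can be observed with a sketch like the following. The names t_sort_demo, idx_a, and idx_a_b are hypothetical, and the plans are best compared on a table that actually holds data, since the optimizer may behave differently on an empty one.

```sql
-- Hypothetical table matching the (ID, A, B) example above.
CREATE TABLE t_sort_demo (
    id INT PRIMARY KEY,
    a  INT,
    b  INT,
    KEY idx_a (a)
) ENGINE = InnoDB;

-- With an index on (a), entries for a = 1 are already ordered by id,
-- so the ORDER BY can be satisfied from the index.
EXPLAIN SELECT id FROM t_sort_demo WHERE a = 1 ORDER BY id;

-- Expand the index to (a, b). Entries for a = 1 are now ordered by b, then id.
ALTER TABLE t_sort_demo DROP INDEX idx_a, ADD INDEX idx_a_b (a, b);

-- The same query may now need an extra sort step; check whether the
-- Extra column reports Using filesort.
EXPLAIN SELECT id FROM t_sort_demo WHERE a = 1 ORDER BY id;
```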

3. Delete unused indexes

Some indexes may have been created out of excessive caution and have never been used since. Such indexes should also be deleted.

Friends know that although indexes can increase query speed, they will slow down insertion and modification speeds.

In MySQL’s sys system database there is a view named schema_unused_indexes, which lists indexes that have been created but not used.
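A minimal sketch of querying this view is shown below. The schema name mydb is a placeholder. Note that the view relies on Performance Schema data collected since the server started, so an index listed here may simply not have been used yet in the current uptime.

```sql
-- List indexes that have seen no recorded use.
-- 'mydb' is a hypothetical schema name; replace it with your own.
SELECT object_schema, object_name, index_name
FROM sys.schema_unused_indexes
WHERE object_schema = 'mydb';
```

It is worth observing a full business cycle before dropping anything that this query reports.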

4. Manually update index statistics

When we view the execution plan of a SQL statement, the plan shows an estimate of how many rows will be scanned when the statement is executed.
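For instance, a sketch along these lines shows where the estimate appears. The table t_rows_demo and its columns are hypothetical.

```sql
-- Hypothetical table used only to produce an execution plan.
CREATE TABLE t_rows_demo (
    id INT PRIMARY KEY,
    a  INT,
    KEY idx_a (a)
) ENGINE = InnoDB;

-- The rows column of the EXPLAIN output is the optimizer's estimate of
-- how many rows it expects to examine. It comes from index statistics
-- and is not an exact count.
EXPLAIN SELECT * FROM t_rows_demo WHERE a = 1;
```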

This estimated number of scanned rows is very important: it is a key reference for the MySQL optimizer when deciding how to execute the SQL. If the table has no such statistics, or the statistics are inaccurate, the optimizer may make the wrong decision.

These statistics are automatically generated or updated when any of the following happens:

  • The table is opened for the first time.
  • The size of the table changes significantly.
  • SHOW TABLE STATUS is executed.
  • SHOW INDEX is executed.
  • The MySQL client has auto-completion turned on.
  • Certain related tables in the information_schema library are accessed.

These behaviors all trigger an automatic update of the statistics. If the table holds a relatively large amount of data and you are worried that this will hurt performance, you can turn off the innodb_stats_on_metadata parameter so that metadata operations such as SHOW TABLE STATUS and SHOW INDEX no longer trigger the update. In MySQL 5.6.6 and later this parameter is already OFF by default.

Of course, we can also manually execute the analyze table command to update the index statistics.
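The settings and commands mentioned above can be sketched as follows. The table name t_stats_demo is hypothetical, and SET GLOBAL requires sufficient privileges on the server.

```sql
-- Hypothetical table, created only so that the statements below can run.
CREATE TABLE IF NOT EXISTS t_stats_demo (
    id INT PRIMARY KEY,
    a  INT,
    KEY idx_a (a)
) ENGINE = InnoDB;

-- Check whether metadata statements trigger a statistics refresh.
SHOW VARIABLES LIKE 'innodb_stats_on_metadata';

-- Turn the behavior off if it is enabled.
SET GLOBAL innodb_stats_on_metadata = OFF;

-- Recalculate the index statistics manually.
ANALYZE TABLE t_stats_demo;

-- Inspect the result: the Cardinality column reflects the refreshed estimate.
SHOW INDEX FROM t_stats_demo;
```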

5. Optimize tables in a timely manner

The index in InnoDB is a B+Tree, which we have talked about with friends in previous articles. A B+Tree organizes data as a multi-way balanced search tree. However, the nodes of this tree are not necessarily contiguous in physical storage. If they are contiguous, data operations will be much faster; if they are not, performance will inevitably suffer. Generally speaking, there are several different forms of fragmentation:

  1. Row fragmentation: A single data row is stored in several pieces in different places, so reading the row involves multiple random IOs.
  2. Inter-row fragmentation: Logically, rows or data pages should be contiguous, but they are not contiguous when stored on disk. What used to be sequential IO for a full table scan becomes random IO.
  3. Remaining space fragmentation: Friends know that the page is the most basic unit in which InnoDB operates on table data, and one page is 16KB by default. In other words, InnoDB reads from and writes to disk in units of at least 16KB. Sometimes a 16KB page contains very little valid data and the rest is free space, which causes InnoDB to waste a lot of IO when reading and writing data.

For the above situations, we can reorganize the data by executing optimize table. If the storage engine does not support the optimize table command, we can also reorganize the data by executing the alter table <table> engine=xxx command (xxx in the command is the original engine of the table).
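A minimal sketch of both approaches, assuming an InnoDB table. The name t_frag_demo is hypothetical. Both statements rebuild the table, so on a large table they are best run during a quiet period.

```sql
-- Hypothetical table, created only so that the statements below can run.
CREATE TABLE IF NOT EXISTS t_frag_demo (
    id INT PRIMARY KEY,
    a  INT,
    KEY idx_a (a)
) ENGINE = InnoDB;

-- data_free gives a rough idea of the allocated but unused space.
SELECT table_name, data_length, index_length, data_free
FROM information_schema.tables
WHERE table_schema = DATABASE()
  AND table_name = 't_frag_demo';

-- Reorganize the table. For InnoDB, MySQL performs this as a table rebuild
-- followed by a statistics update.
OPTIMIZE TABLE t_frag_demo;

-- Alternative: rebuild the table by "changing" it to the engine it already uses.
ALTER TABLE t_frag_demo ENGINE = InnoDB;
```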

Of course, there are still some details to watch when executing the optimize table command. Brother Song will write a separate article to share them with friends later.