主键值对空格的处理疑问

【 TiDB 使用环境】测试
【 TiDB 版本】v7.1.0
【复现路径】

发现7.1版本中对于主键存在空格时的行为比较怪异:

mysql> select @@sql_mode;
+-------------------------------------------------------------------------------------------------------------------------------------------+
| @@sql_mode                                                                                                                                |
+-------------------------------------------------------------------------------------------------------------------------------------------+
| STRICT_TRANS_TABLES,NO_ZERO_IN_DATE,NO_ZERO_DATE,ERROR_FOR_DIVISION_BY_ZERO,NO_AUTO_CREATE_USER,NO_ENGINE_SUBSTITUTION,ONLY_FULL_GROUP_BY |
+-------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> select @@global.tidb_enable_clustered_index ;
+--------------------------------------+
| @@global.tidb_enable_clustered_index |
+--------------------------------------+
| ON                                   |
+--------------------------------------+
1 row in set (0.00 sec)

聚簇表行为:

mysql> create table tt (id varchar(10) not null primary key);
Query OK, 0 rows affected (0.12 sec)

mysql> show create table tt;
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                         |
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tt    | CREATE TABLE `tt` (
  `id` varchar(10) NOT NULL,
  PRIMARY KEY (`id`) /*T![clustered_index] CLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+-------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> insert into tt values('a ');
Query OK, 1 row affected (0.00 sec)

mysql> insert into tt values('a');
ERROR 1062 (23000): Duplicate entry 'a' for key 'tt.PRIMARY'

mysql> select *,length(id),char_length(id) from tt;
+----+------------+-----------------+
| id | length(id) | char_length(id) |
+----+------------+-----------------+
| a  |          2 |               2 |
+----+------------+-----------------+
1 row in set (0.00 sec)
mysql> select length('a'),char_length('a');
+-------------+------------------+
| length('a') | char_length('a') |
+-------------+------------------+
|           1 |                1 |
+-------------+------------------+
1 row in set (0.00 sec)

mysql> select length('a '),char_length('a ');
+--------------+-------------------+
| length('a ') | char_length('a ') |
+--------------+-------------------+
|            2 |                 2 |
+--------------+-------------------+
1 row in set (0.00 sec)

‘a’ 和 'a '对于id字段来说是不一的,写入时报主键冲突。

非聚簇表情况也一样:

mysql> create table tt2 (id varchar(10) not null primary key NONCLUSTERED);
Query OK, 0 rows affected (0.13 sec)

mysql> show create table tt2;
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table                                                                                                                                                             |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| tt2   | CREATE TABLE `tt2` (
  `id` varchar(10) NOT NULL,
  PRIMARY KEY (`id`) /*T![clustered_index] NONCLUSTERED */
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin |
+-------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.00 sec)

mysql> insert into tt2 values('a ');
Query OK, 1 row affected (0.01 sec)

mysql> insert into tt2 values('a');
ERROR 1062 (23000): Duplicate entry 'a' for key 'tt2.PRIMARY'
mysql> select *,length(id),char_length(id) from tt2;
+----+------------+-----------------+
| id | length(id) | char_length(id) |
+----+------------+-----------------+
| a  |          2 |               2 |
+----+------------+-----------------+
1 row in set (0.01 sec)

【遇到的问题:问题现象及影响】

相同的用例在4.0.14版本中是可以写入的:


my:root@127.0.0.1:4000=> select tidb_version();
                          tidb_version()
-------------------------------------------------------------------
 Release Version: v4.0.14                                         +
 Edition: Community                                               +
 Git Commit Hash: 4f919b07f8f013e48521894c70cb69004f43c622        +
 Git Branch: heads/refs/tags/v4.0.14                              +
 UTC Build Time: 2021-07-26 10:53:03                              +
 GoVersion: go1.13                                                +
 Race Enabled: false                                              +
 TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306+
 Check Table Before Drop: false
(1 row)
my:root@127.0.0.1:4000=>
my:root@127.0.0.1:4000=> use test;
USE
my:root@127.0.0.1:4000=> create table tt2 (id varchar(10) not null primary key);
CREATE TABLE
my:root@127.0.0.1:4000=> show create table tt2;
 Table |                        Create Table
-------+-------------------------------------------------------------
 tt2   | CREATE TABLE "tt2" (                                       +
       |   "id" varchar(10) NOT NULL,                               +
       |   PRIMARY KEY ("id")                                       +
       | ) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
(1 row)

my:root@127.0.0.1:4000=> insert into tt2 values('a ');
INSERT 1
my:root@127.0.0.1:4000=> insert into tt2 values('a');
INSERT 1
my:root@127.0.0.1:4000=>
my:root@127.0.0.1:4000=> select *,length(id),char_length(id) from tt;
error: mysql: 1146: Table 'test.tt' doesn't exist
my:root@127.0.0.1:4000=> select *,length(id),char_length(id) from tt2;
 id | length(id) | char_length(id)
----+------------+-----------------
 a  |          1 |               1
 a  |          2 |               2
(2 rows)

不知道哪个版本的有问题。

https://docs.pingcap.com/zh/tidb/stable/character-set-and-collation#旧框架下的排序规则支持

‘a’ 和 'a ’ 建议应用端做好去重,尽量不要留到数据库端处理

MySQL :: MySQL 8.4 Reference Manual :: 13.3.2 The CHAR and VARCHAR Types
For those cases where trailing pad characters are stripped or comparisons ignore them, if a column has an index that requires unique values, inserting into the column values that differ only in number of trailing pad characters results in a duplicate-key error. For example, if a table contains , an attempt to store causes a duplicate-key error. 'a' 'a '
mysql上的处理方式就是这样,会忽略掉最后空格

1 个赞

已确认是4.0版本旧排序规则的bug,正常的行为是忽略末尾的空格,6.0版本以后默认启用新排序规则已修复了这个bug,所以主键冲突是符合预期的。

这个一般前端做验证好一些。

排序规则的问题~除了这个,还需要注意mysql5.7驱动和8.0驱动的默认排序规则也有变化。

此话题已在最后回复的 60 天后被自动关闭。不再允许新回复。