where条件查询出来的结果,跟条件值不一致

这种检测不出来,比如通过该SQL,共检测出4600+条有问题的数据行,随机取一行:
mysql> select goods_id,pid, concat(goods_id, pid) from xxxxxx ignore index(primary) where goods_id = ‘1331230160172’ and pid = ‘100001258472’;
±--------------±-------------±--------------------------+
| goods_id | pid | concat(goods_id, pid) |
±--------------±-------------±--------------------------+
| 1331230160172 | 100001258472 | 1331230160172100001258472 |
| 1331230160172 | 100001258472 | 1331230160172100001258472 |
±--------------±-------------±--------------------------+
2 rows in set

mysql> select goods_id,pid, concat(goods_id, pid) from xxxxxx where goods_id = ‘1331230160172’ and pid = ‘100001258472’;
±--------------±-------------±--------------------------+
| goods_id | pid | concat(goods_id, pid) |
±--------------±-------------±--------------------------+
| 1331230160172 | 100001258472 | 1331230160172100001258472 |
±--------------±-------------±--------------------------+
1 row in set

mysql>

其实这种 table scan 扫主键值有重复就是有问题的行,如果希望找到和重复行不匹配的索引可以试下这个

select idx.pid, idx.goods_id, idx.rowid from (select pid, goods_id, _tidb_rowid as rowid from xxxxxx use index(primary)) idx left join  (select pid, goods_id, _tidb_rowid as rowid from xxxxxx ignore index(primary)) tb on idx.pid = tb.pid and idx.goods_id = tb.goods_id and idx.rowid = tb.rowid where tb.pid is null;

思路是用 index 扫描 left outer join table 扫描,找到 join 右侧主键为 null 即是 "index 有数据但 table 却没有匹配主键值记录"的索引记录

执行步骤 5 时 restore 日志有没有看到 checksum error 之类的信息?

在这一步之前,可以试试先 rename 目标集群中的表,然后 br restore 数据

另外帮忙查一下重复主键的 mvcc 信息

  1. 获取重复主键的 rowid (handle id):

    select _tidb_rowid, goods_id, pid from xxxxxx ignore index(primary) where goods_id = ‘1331230160172’ and pid = ‘100001258472’;

  2. 根据 rowid 查询 mvcc (假设上一步查到 _tidb_rowid 分别为 123456 和 789012)

    curl http://{TiDBIP}:10080/mvcc/key/{db}/{table}/123456
    curl http://{TiDBIP}:10080/mvcc/key/{db}/{table}/789012

mysql> select _tidb_rowid, goods_id, pid from xxxxxx ignore index(primary) where goods_id = ‘1331230160172’ and pid = ‘100001258472’;
±------------±--------------±-------------+
| _tidb_rowid | goods_id | pid |
±------------±--------------±-------------+
| 346770622 | 1331230160172 | 100001258472 |
| 348570151 | 1331230160172 | 100001258472 |
±------------±--------------±-------------+
2 rows in set

mysql>

===================
curl http://xxx.xxx.xxx:10080/mvcc/key/xxxxxx/xxxxxx/346770622
{
“key”: “7480000000000036715F728000000014AB4CBE”,
“region_id”: 4280036,
“value”: {
“info”: {
“writes”: [
{
“start_ts”: 425965087999131657,
“commit_ts”: 425965087999131660,
“short_value”: “gAAFAAAAAQIDBAUMABkALAB5AIwAMTAwMDAxMjU4NDcyMTMzMTIzMDE2MDE3MkNoYW1waW9u55m9VOefreiilkxodHRwczovL2ltYWdlLWMud2VpbW9id21jLmNvbS9zYWFzLXd4Yml6L2FmODc4NWZiZDFkNjRkYWFiNzU3YzE5OTdlYjEwNzk3LnBuZzIwMjEtMDYtMjkgMDg6NTA6NTQ=”
}
]
}
}
}

curl http://xxx.xxx.xxx:10080/mvcc/key/xxxxxx/xxxxxx/348570151
{
“key”: “7480000000000036715F728000000014C6C227”,
“region_id”: 4280036,
“value”: {
“info”: {
“writes”: [
{
“start_ts”: 425965088002539532,
“commit_ts”: 425965088002539533,
“short_value”: “gAAFAAAAAQIDBAUMABkALAB5AIwAMTAwMDAxMjU4NDcyMTMzMTIzMDE2MDE3MkNoYW1waW9u55m9VOefreiilkxodHRwczovL2ltYWdlLWMud2VpbW9id21jLmNvbS9zYWFzLXd4Yml6L2FmODc4NWZiZDFkNjRkYWFiNzU3YzE5OTdlYjEwNzk3LnBuZzIwMjEtMDYtMjkgMDg6NTA6NTQ=”
}
]
}
}
}

执行步骤 5 时 restore 日志有没有看到 checksum error 之类的信息?

在这一步之前,可以试试先 rename 目标集群中的表,然后 br restore 数据


当时有警告, 我们查询数据,发现数据都在没有重视起来

没有先rename或drop,后面找时间我们在复现一次:先rename在restore吧

最后解决了么?我这边也遇到了相同的问题,目标库有数据的情况下,使用br还原,还原之后在目标库中按主键查询,查出不符合where条件的数据,正常应该返回一条,实际返回了两条。