Issue
opened 03:41AM - 16 Jul 24 UTC
closed 03:18AM - 17 Jul 24 UTC
type/bug
sig/planner
sig/transaction
severity/critical
affects-6.1
affects-6.5
affects-7.1
affects-7.5
affects-8.1
## Bug Report
Please answer these questions before submitting your issue. Tha… nks!
### 1. Minimal reproduce step (Required)
```
mysql> select @@autocommit; -- enable autocommit
+--------------+
| @@autocommit |
+--------------+
| 1 |
+--------------+
create table t (pk int, a int, primary key(pk)); -- create a table with PK
prepare st from 'select * from t where pk=? for update'; -- prepare a PointPlan statement
set @pk=1;
execute st using @pk; -- execute this statement to generate a PointPlan cached in Plan Cache
-- plan of this exec-statement, Lock operations for "for update" are optimized by auto-commit
+-------------+---------+---------+------+---------------+------------------------------------------------------------+---------------+--------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+-------------+---------+---------+------+---------------+------------------------------------------------------------+---------------+--------+------+
| Point_Get_1 | 1.00 | 0 | root | table:t | time:94.1µs, loops:1, Get:{num_rpc:1, total_time:42.5µs} | handle:2 | N/A | N/A |
+-------------+---------+---------+------+---------------+------------------------------------------------------------+---------------+--------+------+
begin;
set @pk=1;
execute st using @pk; -- the optimizer decided to reuse the prior PointPlan, which is incorrect.
mysql> select @@last_plan_from_cache;
+------------------------+
| @@last_plan_from_cache |
+------------------------+
| 1 |
+------------------------+
```
Reusing this PointPlan without Lock in the second exec-statement can cause wrong results.
The correct plan for the second exec-statement should have Lock opearations:
```
+-------------+---------+---------+------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------+--------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+-------------+---------+---------+------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------+--------+------+
| Point_Get_1 | 1.00 | 0 | root | table:t | time:1.74ms, loops:1, lock_keys: {time:1.69ms, region:1, keys:1, slowest_rpc: {total: 0.000s, region_id: 93, store: store1, }, lock_rpc:165µs, rpc_count:1} | handle:1, lock | N/A | N/A |
+-------------+---------+---------+------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------+--------+------+
```
### 2. What did you expect to see? (Required)
Shouldn't reuse the first PointPlan for the second exec-statement and the second exec-statement's plan should have Lock operations:
```
+-------------+---------+---------+------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------+--------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+-------------+---------+---------+------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------+--------+------+
| Point_Get_1 | 1.00 | 0 | root | table:t | time:1.74ms, loops:1, lock_keys: {time:1.69ms, region:1, keys:1, slowest_rpc: {total: 0.000s, region_id: 93, store: store1, }, lock_rpc:165µs, rpc_count:1} | handle:1, lock | N/A | N/A |
+-------------+---------+---------+------+---------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------+----------------+--------+------+
```
### 3. What did you see instead (Required)
The second exec-statement's plan has no Lock:
```
+-------------+---------+---------+------+---------------+-------------------------------------------------------------+---------------+--------+------+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
+-------------+---------+---------+------+---------------+-------------------------------------------------------------+---------------+--------+------+
| Point_Get_1 | 1.00 | 0 | root | table:t | time:123.7µs, loops:1, Get:{num_rpc:1, total_time:63.3µs} | handle:1 | N/A | N/A |
+-------------+---------+---------+------+---------------+-------------------------------------------------------------+---------------+--------+------+
```
### 4. What is your TiDB version? (Required)
Master
问题描述
由于 LOCK 语义未被正确执行,上述问题可能造成丢失更新等并发异常,以及并发事务写入结果不正确写入数据丢失。
根因分析
如下例子所述,当以 autocommit 方式在事务外执行 “select for update” 语句时,”for update” 语义不会生效(参考 https://docs.pingcap.com/zh/tidb/stable/pessimistic-transaction#和-mysql-innodb-的差异 第五点),最终生成的执行计划不包含 lock 语义(见下方例子中 Point_Get_1 算子)。
后续 session 开启显示事务后,执行相同 “select for update” 语句,前面的执行计划被复用,导致无 LOCK 语义的计划被使用,造成 LOCK 语义丢失,使得事务内 “select for update” 语句按照 “select” 语义执行。
mysql> select @@autocommit ; – enable autocommit
±-------------+
| @@autocommit |
±-------------+
| 1 |
±-------------+
create table t (pk int, a int, primary key(pk)); – create a table with PK
prepare st from ‘select * from t where pk=? for update’; – prepare a PointPlan statement
set @pk=1;
execute st using @pk ; – execute this statement to generate a PointPlan cached in Plan Cache
– plan of this exec-statement, Lock operations for “for update” are optimized by auto-commit
±------------±--------±--------±-----±--------------±-----------------------------------------------------------±--------------±-------±-----+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
±------------±--------±--------±-----±--------------±-----------------------------------------------------------±--------------±-------±-----+
| Point_Get_1 | 1.00 | 0 | root | table:t | time:94.1µs, loops:1, Get:{num_rpc:1, total_time:42.5µs} | handle:2 | N/A | N/A |
±------------±--------±--------±-----±--------------±-----------------------------------------------------------±--------------±-------±-----+
begin;
set @pk=1;
execute st using @pk ; – the optimizer decided to reuse the prior PointPlan, which is incorrect.
mysql> select @@last_plan_from_cache ;
±-----------------------+
| @@last_plan_from_cache |
±-----------------------+
| 1 |
±-----------------------+
诊断方法
使用上述例子中类似步骤,查看后执行的 “select for update” 语句对应执行计划是否从 plan cache 中获取,以及 operator info 中是否包含 lock 标记。
影响版本
v6.1.0 - v6.1.7
v6.5.0 - v6.5.10
v7.1.0 - v7.1.5
v7.5.0 - v7.5.2
v8.1.0
问题修复版本
修复 PR:
pingcap:master
← qw4990:fix-54652
opened 08:06AM - 16 Jul 24 UTC
<!--
Thank you for contributing to TiDB!
PR Title Format:
1. pkg [, pkg2,… pkg3]: what's changed
2. *: what's changed
-->
### What problem does this PR solve?
<!--
Please create an issue first to describe the problem.
There MUST be one line starting with "Issue Number: " and
linking the relevant issues via the "close" or "ref".
For more info, check https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/contribute-code.html#referring-to-an-issue.
-->
Issue Number: close #54652
Problem Summary: planner: fix the issue of reusing wrong point-plan for "select ... for update"
### What changed and how does it work?
Encode more txn state into the plan cache key, and check whether the key has changed before reusing point-get plans.
### Check List
Tests
- [x] Unit test
- [ ] Integration test
- [ ] Manual test (add detailed scripts or steps below)
- [ ] No need to test
> - [ ] I checked and no code files have been changed.
>
Side effects
- [ ] Performance regression: Consumes more CPU
- [ ] Performance regression: Consumes more Memory
- [ ] Breaking backward compatibility
Documentation
- [ ] Affects user behaviors
- [ ] Contains syntax changes
- [ ] Contains variable changes
- [ ] Contains experimental features
- [ ] Changes MySQL compatibility
### Release note
Please refer to [Release Notes Language Style Guide](https://pingcap.github.io/tidb-dev-guide/contribute-to-tidb/release-notes-style-guide.html) to write a quality release note.
```release-note
None
```
修复版本:v6.1.8, v6.5.11, v7.1.6, v7.5.3, v8.1.1
Workaround 方法
两种方式
2 个赞