explain analyze 中的total_keys和key_skipped_count 的疑问?

请问下各位老师,对sysbench 产生的sbtest1表进行测试
执行了2次explain analyze delete from sbtest1 limit 1000,有下列2个疑问?
1.执行第二次explain analyze delete from sbtest1 limit 1000 后,执行计划里面的total_keys 为什么是3001,按照我的理解应该是2000左右才对,
2.执行第二次explain analyze delete from sbtest1 limit 1000 后,执行计划里面的key_skipped_count 为什么变成了4000

按照我的理解 key_skipped_count RocksDB代表 扫数据时所有遇到的 Key 数量,怎么会变成4000,就执行了2次delete limit 1000,
mysql> explain analyze delete from sbtest1 limit 1000;
±-----------------------------±--------±--------±----------±--------------±--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------±-------------------------------±---------±-----+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
±-----------------------------±--------±--------±----------±--------------±--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------±-------------------------------±---------±-----+
| Delete_4 | N/A | 0 | root | | time:14.3ms, loops:1 | N/A | 247.5 KB | N/A |
| └─Limit_8 | 1000.00 | 1000 | root | | time:9.89ms, loops:2 | offset:0, count:1000 | N/A | N/A |
| └─TableReader_12 | 1000.00 | 1000 | root | | time:9.88ms, loops:1, cop_task: {num: 3, max: 4.11ms, min: 2.75ms, avg: 3.22ms, p95: 4.11ms, max_proc_keys: 992, p95_proc_keys: 992, tot_proc: 1ms, rpc_num: 3, rpc_time: 9.64ms, copr_cache_hit_ratio: 0.00, distsql_concurrency: 1} | data:Limit_11 | 302.7 KB | N/A |
| └─Limit_11 | 1000.00 | 1696 | cop[tikv] | | tikv_task:{proc max:2ms, min:0s, avg: 1.33ms, p80:2ms, p95:2ms, iters:12, tasks:3}, scan_detail: {total_process_keys: 1696, total_process_keys_size: 379904, total_keys: 1699, get_snapshot_time: 832.6µs, rocksdb: {key_skipped_count: 1696, block: {cache_hit_count: 25, read_count: 2, read_byte: 52.2 KB, read_time: 394.1µs}}} | offset:0, count:1000 | N/A | N/A |
| └─TableFullScan_10 | 1000.00 | 1696 | cop[tikv] | table:sbtest1 | tikv_task:{proc max:2ms, min:0s, avg: 1.33ms, p80:2ms, p95:2ms, iters:12, tasks:3} | keep order:false, stats:pseudo | N/A | N/A |
±-----------------------------±--------±--------±----------±--------------±--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------±-------------------------------±---------±-----+
5 rows in set (0.10 sec)

mysql> show variables like ‘%paging%’;
±---------------------±------+
| Variable_name | Value |
±---------------------±------+
| tidb_enable_paging | ON |
| tidb_max_paging_size | 50000 |
| tidb_min_paging_size | 128 |
±---------------------±------+
3 rows in set (0.01 sec)

mysql> set tidb_enable_paging = OFF;
Query OK, 0 rows affected (0.00 sec)

mysql> show variables like ‘%paging%’;
±---------------------±------+
| Variable_name | Value |
±---------------------±------+
| tidb_enable_paging | OFF |
| tidb_max_paging_size | 50000 |
| tidb_min_paging_size | 128 |
±---------------------±------+
3 rows in set (0.00 sec)

mysql> explain analyze delete from sbtest1 limit 1000;
±-----------------------------±--------±--------±----------±--------------±---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------±-------------------------------±---------±-----+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
±-----------------------------±--------±--------±----------±--------------±---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------±-------------------------------±---------±-----+
| Delete_4 | N/A | 0 | root | | time:9ms, loops:1 | N/A | 220.6 KB | N/A |
| └─Limit_8 | 1000.00 | 1000 | root | | time:4.68ms, loops:2 | offset:0, count:1000 | N/A | N/A |
| └─TableReader_12 | 1000.00 | 1000 | root | | time:4.68ms, loops:1, cop_task: {num: 1, max: 4.58ms, proc_keys: 1000, tot_proc: 1ms, rpc_num: 1, rpc_time: 4.56ms, copr_cache_hit_ratio: 0.00, distsql_concurrency: 1} | data:Limit_11 | 205.6 KB | N/A |
| └─Limit_11 | 1000.00 | 1000 | cop[tikv] | | tikv_task:{time:2ms, loops:6}, scan_detail: {total_process_keys: 1000, total_process_keys_size: 224000, total_keys: 3001, get_snapshot_time: 781µs, rocksdb: {delete_skipped_count: 1000, key_skipped_count: 4000, block: {cache_hit_count: 14}}} | offset:0, count:1000 | N/A | N/A |
| └─TableFullScan_10 | 1000.00 | 1000 | cop[tikv] | table:sbtest1 | tikv_task:{time:2ms, loops:6} | keep order:false, stats:pseudo | N/A | N/A |
±-----------------------------±--------±--------±----------±--------------±---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------±-------------------------------±---------±-----+
5 rows in set (0.09 sec)

参考下:
delete_skipped_count 就是记录了这样被跳过的删除操作的数量。
key_skipped_count 就是记录了这样被跳过的 key 的数量。
需要扫描整个表的数据,total_keys 就是记录了过程中扫描的 key 的总数。

delete_skipped_count 记录被删除标记了,会被跳过
key_skipped_count 记录已经失效了(删除标记之后未释放),会被跳过

以上供你参考

第一个。执行扫描操作时,可能会遇到一些额外的键,例如索引键或其他辅助键,会超过预期

我觉得这个是多版本的机制吧,取数据时会先获取时间戳,然后按时间戳找对应版本,时间戳之前的版本就忽略掉。

key_skipped_count 记录已经失效了(删除标记之后未释放),会被跳过

第一次对这个表进行删除(之前没有删除过数据)key_skipped_count 是1696 这个怎么理解呢?
mysql> explain analyze delete from sbtest1 limit 1000;
±-----------------------------±--------±--------±----------±--------------±--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------±-------------------------------±---------±-----+
| id | estRows | actRows | task | access object | execution info | operator info | memory | disk |
±-----------------------------±--------±--------±----------±--------------±--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------±-------------------------------±---------±-----+
| Delete_4 | N/A | 0 | root | | time:14.3ms, loops:1 | N/A | 247.5 KB | N/A |
| └─Limit_8 | 1000.00 | 1000 | root | | time:9.89ms, loops:2 | offset:0, count:1000 | N/A | N/A |
| └─TableReader_12 | 1000.00 | 1000 | root | | time:9.88ms, loops:1, cop_task: {num: 3, max: 4.11ms, min: 2.75ms, avg: 3.22ms, p95: 4.11ms, max_proc_keys: 992, p95_proc_keys: 992, tot_proc: 1ms, rpc_num: 3, rpc_time: 9.64ms, copr_cache_hit_ratio: 0.00, distsql_concurrency: 1} | data:Limit_11 | 302.7 KB | N/A |
| └─Limit_11 | 1000.00 | 1696 | cop[tikv] | | tikv_task:{proc max:2ms, min:0s, avg: 1.33ms, p80:2ms, p95:2ms, iters:12, tasks:3}, scan_detail: {total_process_keys: 1696, total_process_keys_size: 379904, total_keys: 1699, get_snapshot_time: 832.6µs, rocksdb: {key_skipped_count: 1696, block: {cache_hit_count: 25, read_count: 2, read_byte: 52.2 KB, read_time: 394.1µs}}} | offset:0, count:1000 | N/A | N/A |
| └─TableFullScan_10 | 1000.00 | 1696 | cop[tikv] | table:sbtest1 | tikv_task:{proc max:2ms, min:0s, avg: 1.33ms, p80:2ms, p95:2ms, iters:12, tasks:3} | keep order:false, stats:pseudo | N/A | N/A |
±-----------------------------±--------±--------±----------±--------------±--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------±-------------------------------±---------±-----+
5 rows in set (0.10 sec)

key_skipped_count是本次请求rocksdb扫描的所有数据(总扫描开销),该值越大说明rocksdb扫描成本越高。
deleted_skipped_count是扫描过的大于gc时间后将标记写入rocksdb中(tomstone状态)的记录,可以理解为rocksdb层过滤的无效数据,该值越大说明compaction需要合并的数据越多。
total_keys是rocksdb返回给tikv的符合条件的在gc范围之内的总记录数,该值与total_process_keys差值越大说明gc时间之内数据版本积攒就越多(比如你频繁删除了数据还未发生gc,就都会被扫描出来发给tikv)
total_process_keys是通过mvcc版本过滤后真正的有效记录,该值越大说明真正需要的记录数就越多。

另外,你这里total_keys: 1699为何大于key_skipped_count: 1696,不太清楚,期待大佬解答。

  • Total_keys:表示 Coprocessor 扫过的 key 的数量。
  • Process_keys:表示 Coprocessor 处理的 key 的数量。相比 total_keys,processed_keys 不包含 MVCC 的旧版本。如果 processed_keys 和 total_keys 相差很大,说明旧版本比较多。
  • Rocksdb_delete_skipped_count:RocksDB 读数据过程中已删除 Key 的扫描数。
  • Rocksdb_key_skipped_count:RocksDB 扫数据时遇到的已删除 (tombstone) Key 数量。

不只是delete会产生mvcc历史数据,update也会的。

1.key_skipped_count是本次请求rocksdb扫描的所有数据(总扫描开销),该值越大说明rocksdb扫描成本越高。
为什么我执行第二次delete limit 1000的时候,这个key_skipped_count 是4000呢?第一次删除了1000行,第二次删除了1000行

2.deleted_skipped_count是扫描过的大于gc时间后将标记写入rocksdb中(tomstone状态)的记录,可以理解为rocksdb层过滤的无效数据,该值越大说明compaction需要合并的数据越多。
我执行第二次delete … limit 1000的时候,这个时候还没有发生gc