【 TiDB Environment】Production
【 TiDB Version】7.5.1
【Problem: symptoms and impact】
There are three tables: an original meta table; a module data table associated with meta (module, module_version); and a version table associated with meta (meta_version). The module table has roughly 100 million rows, a single module can have tens of millions of records, and the module column is indexed. The version table has tens of millions of rows, a single version can have millions of records, and the version column is also indexed.
A single meta id can map to 100+ module records and 10+ versions.
The SQL below runs extremely slowly; I'd appreciate advice on how to optimize it.
SELECT
count(*) AS count_1
FROM
(
SELECT
t_meta.id AS t_meta_id
FROM
t_meta
WHERE
t_meta.id IN (
SELECT
anon_2.id
FROM
(
SELECT
t_meta_versions.id AS id
FROM
t_meta_versions
WHERE
t_meta_versions.version = 'v0918'
) AS anon_2
)
AND t_meta.id IN (
SELECT
anon_3.id
FROM
(
SELECT
t_meta_module.id AS id
FROM
t_meta_module
WHERE
t_meta_module.state = 1
AND t_meta_module.module = 'Planner'
) AS anon_3
)
) AS anon_1;
【Resource Configuration】8c 32g, TiFlash enabled
Post the execution plan, and share the cluster architecture as well; also check whether you're running out of memory.
Here is the original statement:
SELECT count(*) AS count_1 FROM (
SELECT t_meta.id AS t_meta_id FROM t_meta
WHERE t_meta.id IN (
SELECT anon_2.id FROM (
SELECT t_meta_versions.id AS id FROM t_meta_versions
WHERE t_meta_versions.version = 'v0918'
) AS anon_2
)
AND t_meta.id IN (
SELECT anon_3.id FROM (
SELECT t_meta_module.id AS id FROM t_meta_module
WHERE t_meta_module.state = 1
AND t_meta_module.module = 'Planner'
) AS anon_3
)
) AS anon_1;
1. Post the indexes of the three tables.
2. Post the EXPLAIN ANALYZE result.
3. How does this simplified statement perform?
SELECT count(*) count_1 FROM t_meta
WHERE t_meta.id IN (
SELECT t_meta_versions.id FROM t_meta_versions
WHERE t_meta_versions.version = 'v0918'
)
AND t_meta.id IN (
SELECT t_meta_module.id AS id FROM t_meta_module
WHERE t_meta_module.state = 1
AND t_meta_module.module = 'Planner'
)
4. Try EXISTS as well; if the result set behind the IN is large, EXISTS should work better.
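Suggestion 4 could look like the sketch below (same table and column names as the original query; EXISTS is a semi-join, so the count semantics match the IN version):

```sql
-- EXISTS variant: each subquery becomes a correlated semi-join probe
-- instead of a materialized IN list.
SELECT count(*) AS count_1
FROM t_meta
WHERE EXISTS (
    SELECT 1 FROM t_meta_versions v
    WHERE v.id = t_meta.id
      AND v.version = 'v0918'
)
AND EXISTS (
    SELECT 1 FROM t_meta_module m
    WHERE m.id = t_meta.id
      AND m.state = 1
      AND m.module = 'Planner'
);
```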
I feel a join would work better here.
IN vs. EXISTS is only theory; in practice IN is usually faster in TiDB.
How about rewriting it directly as a JOIN?
SELECT COUNT(DISTINCT tm.id) AS count_1
FROM t_meta tm
JOIN t_meta_versions tmv ON tm.id = tmv.id
JOIN t_meta_module tmm ON tm.id = tmm.id
WHERE tmv.version = 'v0918'
AND tmm.state = 1
AND tmm.module = 'Planner';
That join would produce duplicate rows though, wouldn't it? Hence the DISTINCT.
Index structure:
- In t_meta, id is a unique index.
- In t_meta_versions, state is a validity flag and has no index; there is a composite unique index on (meta_id, version), i.e. uniq_meta_id_version, plus a standalone index on the version column.
- In t_meta_module, state is a validity flag and has no index; there is a composite unique index on (meta_id, module), i.e. uniq_meta_id_module, plus a standalone index on the module column.
EXPLAIN ANALYZE result:
HashAgg_201 1.00 1 root time:1s, loops:2, RU:0.000000, partial_worker:{wall_time:1.00197014s, concurrency:5, task_num:1, tot_wait:5.009597322s, tot_exec:6.572µs, tot_time:5.009613005s, max:1.00193351s, p95:1.00193351s}, final_worker:{wall_time:0s, concurrency:5, task_num:1, tot_wait:5.009784638s, tot_exec:13.925µs, tot_time:5.009802224s, max:1.001971863s, p95:1.001971863s}
└─TableReader_203 1.00 3 root time:1s, loops:2, cop_task: {num: 6, max: 0s, min: 0s, avg: 0s, p95: 0s, copr_cache_hit_ratio: 0.00}
└─ExchangeSender_202 1.00 3 mpp[tiflash] tiflash_task:{proc max:1s, min:966.9ms, avg: 982.7ms, p80:1s, p95:1s, iters:3, tasks:3, threads:3}
└─HashAgg_26 1.00 3 mpp[tiflash] tiflash_task:{proc max:1s, min:966.9ms, avg: 982.7ms, p80:1s, p95:1s, iters:3, tasks:3, threads:3}
└─HashJoin_200 241818.27 510438 mpp[tiflash] tiflash_task:{proc max:999.1ms, min:966.9ms, avg: 982ms, p80:999.1ms, p95:999.1ms, iters:1728, tasks:3, threads:12}
├─ExchangeReceiver_54(Build) 241818.27 510438 mpp[tiflash] tiflash_task:{proc max:719.1ms, min:718.1ms, avg: 718.7ms, p80:719.1ms, p95:719.1ms, iters:60, tasks:3, threads:12}
│ └─ExchangeSender_53 241818.27 510438 mpp[tiflash] tiflash_task:{proc max:743.6ms, min:0s, avg: 247.9ms, p80:743.6ms, p95:743.6ms, iters:21, tasks:3, threads:24}
│ └─HashJoin_34 241818.27 510438 mpp[tiflash] tiflash_task:{proc max:716.6ms, min:0s, avg: 238.9ms, p80:716.6ms, p95:716.6ms, iters:21, tasks:3, threads:24}
│ ├─ExchangeReceiver_51(Build) 241818.53 1531314 mpp[tiflash] tiflash_task:{proc max:462.6ms, min:0s, avg: 154.2ms, p80:462.6ms, p95:462.6ms, iters:183, tasks:3, threads:24}
│ │ └─ExchangeSender_50 241818.53 510438 mpp[tiflash] tiflash_task:{proc max:434.3ms, min:0s, avg: 144.8ms, p80:434.3ms, p95:434.3ms, iters:3072, tasks:3, threads:12}
│ │ └─Projection_46 241818.53 510438 mpp[tiflash] tiflash_task:{proc max:425.3ms, min:0s, avg: 141.8ms, p80:425.3ms, p95:425.3ms, iters:3072, tasks:3, threads:12}
│ │ └─HashAgg_47 241818.53 510438 mpp[tiflash] tiflash_task:{proc max:425.3ms, min:0s, avg: 141.8ms, p80:425.3ms, p95:425.3ms, iters:3072, tasks:3, threads:12}
│ │ └─ExchangeReceiver_49 241818.53 510438 mpp[tiflash] tiflash_task:{proc max:383.3ms, min:0s, avg: 127.8ms, p80:383.3ms, p95:383.3ms, iters:66, tasks:3, threads:12}
│ │ └─ExchangeSender_48 241818.53 510438 mpp[tiflash] tiflash_task:{proc max:352.9ms, min:0s, avg: 117.6ms, p80:352.9ms, p95:352.9ms, iters:768, tasks:3, threads:24}
│ │ └─HashAgg_36 241818.53 510438 mpp[tiflash] tiflash_task:{proc max:351.8ms, min:0s, avg: 117.3ms, p80:351.8ms, p95:351.8ms, iters:768, tasks:3, threads:24}
│ │ └─Selection_45 374858.10 510438 mpp[tiflash] tiflash_task:{proc max:166ms, min:0s, avg: 55.3ms, p80:166ms, p95:166ms, iters:11, tasks:3, threads:24}
│ │ └─TableFullScan_44 374858.10 510438 mpp[tiflash] table:t_versions tiflash_task:{proc max:166ms, min:0s, avg: 55.3ms, p80:166ms, p95:166ms, iters:11, tasks:3, threads:24}, tiflash_scan:{dtfile:{total_scanned_packs:201, total_skipped_packs:312, total_scanned_rows:1566020, total_skipped_rows:2445033, total_rs_index_check_time: 1ms, total_read_time: 90ms, total_disagg_read_cache_hit_size: 0, total_disagg_read_cache_miss_size: 0}, total_create_snapshot_time: 0ms, total_local_region_num: 21, total_remote_region_num: 0, total_learner_read_time: 37ms}
│ └─TableFullScan_52(Probe) 1286202.00 1284610 mpp[tiflash] table:t_meta tiflash_task:{proc max:645.6ms, min:0s, avg: 215.2ms, p80:645.6ms, p95:645.6ms, iters:21, tasks:3, threads:24}, tiflash_scan:{dtfile:{total_scanned_packs:279, total_skipped_packs:272, total_scanned_rows:2202769, total_skipped_rows:2158975, total_rs_index_check_time: 0ms, total_read_time: 138ms, total_disagg_read_cache_hit_size: 0, total_disagg_read_cache_miss_size: 0}, total_create_snapshot_time: 0ms, total_local_region_num: 10, total_remote_region_num: 0, total_learner_read_time: 3ms}
└─Projection_66(Probe) 1272563.62 1277421 mpp[tiflash] tiflash_task:{proc max:863.1ms, min:848.9ms, avg: 854ms, p80:863.1ms, p95:863.1ms, iters:3072, tasks:3, threads:12}
└─HashAgg_67 1272563.62 1277421 mpp[tiflash] tiflash_task:{proc max:863.1ms, min:847.9ms, avg: 853.4ms, p80:863.1ms, p95:863.1ms, iters:3072, tasks:3, threads:12}
└─ExchangeReceiver_69 1272563.62 1277454 mpp[tiflash] tiflash_task:{proc max:762.1ms, min:756.1ms, avg: 758.7ms, p80:762.1ms, p95:762.1ms, iters:156, tasks:3, threads:12}
└─ExchangeSender_68 1272563.62 1277454 mpp[tiflash] tiflash_task:{proc max:553.5ms, min:0s, avg: 184.5ms, p80:553.5ms, p95:553.5ms, iters:768, tasks:3, threads:24}
└─HashAgg_56 1272563.62 1277454 mpp[tiflash] tiflash_task:{proc max:492.5ms, min:0s, avg: 164.2ms, p80:492.5ms, p95:492.5ms, iters:768, tasks:3, threads:24}
└─Selection_65 9113922.75 9114054 mpp[tiflash] tiflash_task:{proc max:135.5ms, min:0s, avg: 45.2ms, p80:135.5ms, p95:135.5ms, iters:158, tasks:3, threads:24}
└─TableFullScan_64 9125959.00 9125734 mpp[tiflash] table:t_module tiflash_task:{proc max:80.5ms, min:0s, avg: 26.8ms, p80:80.5ms, p95:80.5ms, iters:161, tasks:3, threads:24}, tiflash_scan:{dtfile:{total_scanned_packs:1247, total_skipped_packs:1952, total_scanned_rows:9987735, total_skipped_rows:15649079, total_rs_index_check_time: 3ms, total_read_time: 1114ms, total_disagg_read_cache_hit_size: 0, total_disagg_read_cache_miss_size: 0}, total_create_snapshot_time: 0ms, total_local_region_num: 71, total_remote_region_num: 0, total_learner_read_time: 27ms}
- The simplified statement takes about the same time.
- The result set behind the IN is indeed very large, but in some cases it is small; I'd like both cases to be handled.
This SQL touches very few columns, and in the TiFlash MPP plan most of the time goes to shuffling data between MPP tasks. I'd try covering the relevant columns with indexes and routing the query through TiKV instead. In theory that returns roughly 3 million rows of index data to TiDB for the join, which might be faster.
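A sketch of that suggestion (the index names are hypothetical; READ_FROM_STORAGE is TiDB's hint for pinning tables to a storage engine):

```sql
-- Hypothetical covering indexes: the filter columns and the join key
-- are both in the index, so no table lookup is needed.
ALTER TABLE t_meta_versions ADD INDEX idx_version_id (version, id);
ALTER TABLE t_meta_module ADD INDEX idx_module_state_id (module, state, id);

-- Pin the query to TiKV so the optimizer does not pick the MPP plan:
SELECT /*+ READ_FROM_STORAGE(TIKV[t_meta, t_meta_versions, t_meta_module]) */
       count(*) AS count_1
FROM t_meta
WHERE t_meta.id IN (
    SELECT id FROM t_meta_versions WHERE version = 'v0918'
)
AND t_meta.id IN (
    SELECT id FROM t_meta_module WHERE state = 1 AND module = 'Planner'
);
```

Note that if id is already the clustered primary key of those tables, the existing single-column version and module indexes may already cover id, and only the hint would be needed.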
Try rewriting it as a JOIN to reduce the nesting and possibly improve performance:
SELECT COUNT(DISTINCT t_meta.id) AS count_1
FROM t_meta
JOIN t_meta_versions ON t_meta.id = t_meta_versions.id
JOIN t_meta_module ON t_meta.id = t_meta_module.id
WHERE t_meta_versions.version = 'v0918'
AND t_meta_module.state = 1
AND t_meta_module.module = 'Planner';
At the hundred-million-row scale, you want either index-based joins or direct index lookups; the more complex the join, the slower it gets. And with the right indexes, the predicates here can filter out most of the data up front.
First check indexes, then memory; my feeling is this one is short on memory.
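If memory is the suspicion, one place to start checking (assuming default settings; tidb_mem_quota_query is TiDB's per-query memory quota in bytes, tidb_server_memory_limit is the instance-level limit):

```sql
-- Inspect the current per-query quota and instance-level memory limit:
SHOW VARIABLES LIKE 'tidb_mem_quota_query';
SHOW VARIABLES LIKE 'tidb_server_memory_limit';

-- Raise the per-session quota for this heavy query, e.g. to 4 GiB:
SET tidb_mem_quota_query = 4294967296;
```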