表亿级数据,查询速度慢

【 TiDB 使用环境】生产环境
【 TiDB 版本】7.5.1
【遇到的问题:问题现象及影响】
有3张数据表,一张原始meta表, 一张meta对应的模块数据表(module, module_version), 一张meta对应的版本表(meta_version), 模块数据表大概有上亿条,一个module的record有千万条, module字段有索引。版本表大概有千万条,一个版本的record有百万条, version字段也有索引
其中一个meta id, 可以对应百+条模块数据记录,也可以对应10+版本
执行下面的sql特别特别慢,想请教下应该如何优化,
SELECT
count(*) AS count_1
FROM
(
SELECT
t_meta.id AS t_meta_id
FROM
t_meta
WHERE
t_meta.id IN (
SELECT
anon_2.id
FROM
(
SELECT
t_meta_versions.id AS id
FROM
t_meta_versions
WHERE
t_meta_versions.version = ‘v0918’
) AS anon_2
)
AND t_meta.id IN (
SELECT
anon_3.id
FROM
(
SELECT
t_meta_module.id AS id
FROM
t_meta_module
WHERE
t_meta_module.state = 1
AND t_meta_module.module = ‘Planner’
) AS anon_3
)
) AS anon_1;
【资源配置】8c 32g, 开了tiflash

看看执行计划 另外集群架构发下,你也观察下是不是内存资源不够了

原始语句如下

SELECT count(*) AS count_1 FROM (
    SELECT t_meta.id AS t_meta_id FROM t_meta
    WHERE t_meta.id IN (
        SELECT anon_2.id FROM (
            SELECT t_meta_versions.id AS id FROM t_meta_versions
            WHERE t_meta_versions.version = 'v0918'
        ) AS anon_2
    )
    AND t_meta.id IN (
        SELECT anon_3.id FROM (
            SELECT t_meta_module.id AS id FROM t_meta_module
            WHERE t_meta_module.state = 1
            AND t_meta_module.module = 'Planner'
        ) AS anon_3
    )
) AS anon_1;

1、发一下三个表的索引看看
2、发一下explain analyze执行结果看看
3、简化有语句执行效率怎么样?

SELECT count(*) count_1 FROM t_meta
WHERE t_meta.id IN (
    SELECT t_meta_versions.id FROM t_meta_versions
    WHERE t_meta_versions.version = 'v0918'
)
AND t_meta.id IN (
    SELECT t_meta_module.id AS id FROM t_meta_module
    WHERE t_meta_module.state = 1
    AND t_meta_module.module = 'Planner'
)

4、尝试用exists试试,如果in后面的结果集比较大,用exists应该更好一些。

1 个赞

我感觉使用关联会更好点

1 个赞

in和exists 这只是理论上的,实际上tidb一般in更快

直接改成join咋样?
SELECT COUNT(DISTINCT tm.id) AS count_1
FROM t_meta tm
JOIN t_meta_versions tmv ON tm.id = tmv.id
JOIN t_meta_module tmm ON tm.id = tmm.id
WHERE tmv.version = ‘v0918’
AND tmm.state = 1
AND tmm.module = ‘Planner’;

:thinking:会有重复行吧。这样子。

  1. 索引结构:
    t_meta里,id为唯一索引
    t_meta_versions里,state为是否有效的状态,没有索引,有 meta_id 和 version的联合唯一索引,即uniq_meta_id_version, 还有version列的单独索引
    t_meta_module里,state为是否有效的状态,没有索引,有meta_id 和 module的联合唯一索引,即uniq_meta_id_module, 也有module列的单独索引

  2. explain analyze执行结果:

HashAgg_201	1.00	1	root		time:1s, loops:2, RU:0.000000, partial_worker:{wall_time:1.00197014s, concurrency:5, task_num:1, tot_wait:5.009597322s, tot_exec:6.572µs, tot_time:5.009613005s, max:1.00193351s, p95:1.00193351s}, final_worker:{wall_time:0s, concurrency:5, task_num:1, tot_wait:5.009784638s, tot_exec:13.925µs, tot_time:5.009802224s, max:1.001971863s, p95:1.001971863s}
└─TableReader_203	1.00	3	root		time:1s, loops:2, cop_task: {num: 6, max: 0s, min: 0s, avg: 0s, p95: 0s, copr_cache_hit_ratio: 0.00}
  └─ExchangeSender_202	1.00	3	mpp[tiflash]		tiflash_task:{proc max:1s, min:966.9ms, avg: 982.7ms, p80:1s, p95:1s, iters:3, tasks:3, threads:3}
    └─HashAgg_26	1.00	3	mpp[tiflash]		tiflash_task:{proc max:1s, min:966.9ms, avg: 982.7ms, p80:1s, p95:1s, iters:3, tasks:3, threads:3}
      └─HashJoin_200	241818.27	510438	mpp[tiflash]		tiflash_task:{proc max:999.1ms, min:966.9ms, avg: 982ms, p80:999.1ms, p95:999.1ms, iters:1728, tasks:3, threads:12}
        ├─ExchangeReceiver_54(Build)	241818.27	510438	mpp[tiflash]		tiflash_task:{proc max:719.1ms, min:718.1ms, avg: 718.7ms, p80:719.1ms, p95:719.1ms, iters:60, tasks:3, threads:12}
        │ └─ExchangeSender_53	241818.27	510438	mpp[tiflash]		tiflash_task:{proc max:743.6ms, min:0s, avg: 247.9ms, p80:743.6ms, p95:743.6ms, iters:21, tasks:3, threads:24}
        │   └─HashJoin_34	241818.27	510438	mpp[tiflash]		tiflash_task:{proc max:716.6ms, min:0s, avg: 238.9ms, p80:716.6ms, p95:716.6ms, iters:21, tasks:3, threads:24}
        │     ├─ExchangeReceiver_51(Build)	241818.53	1531314	mpp[tiflash]		tiflash_task:{proc max:462.6ms, min:0s, avg: 154.2ms, p80:462.6ms, p95:462.6ms, iters:183, tasks:3, threads:24}
        │     │ └─ExchangeSender_50	241818.53	510438	mpp[tiflash]		tiflash_task:{proc max:434.3ms, min:0s, avg: 144.8ms, p80:434.3ms, p95:434.3ms, iters:3072, tasks:3, threads:12}
        │     │   └─Projection_46	241818.53	510438	mpp[tiflash]		tiflash_task:{proc max:425.3ms, min:0s, avg: 141.8ms, p80:425.3ms, p95:425.3ms, iters:3072, tasks:3, threads:12}
        │     │     └─HashAgg_47	241818.53	510438	mpp[tiflash]		tiflash_task:{proc max:425.3ms, min:0s, avg: 141.8ms, p80:425.3ms, p95:425.3ms, iters:3072, tasks:3, threads:12}
        │     │       └─ExchangeReceiver_49	241818.53	510438	mpp[tiflash]		tiflash_task:{proc max:383.3ms, min:0s, avg: 127.8ms, p80:383.3ms, p95:383.3ms, iters:66, tasks:3, threads:12}
        │     │         └─ExchangeSender_48	241818.53	510438	mpp[tiflash]		tiflash_task:{proc max:352.9ms, min:0s, avg: 117.6ms, p80:352.9ms, p95:352.9ms, iters:768, tasks:3, threads:24}
        │     │           └─HashAgg_36	241818.53	510438	mpp[tiflash]		tiflash_task:{proc max:351.8ms, min:0s, avg: 117.3ms, p80:351.8ms, p95:351.8ms, iters:768, tasks:3, threads:24}
        │     │             └─Selection_45	374858.10	510438	mpp[tiflash]		tiflash_task:{proc max:166ms, min:0s, avg: 55.3ms, p80:166ms, p95:166ms, iters:11, tasks:3, threads:24}
        │     │               └─TableFullScan_44	374858.10	510438	mpp[tiflash]	table:t_versions	tiflash_task:{proc max:166ms, min:0s, avg: 55.3ms, p80:166ms, p95:166ms, iters:11, tasks:3, threads:24}, tiflash_scan:{dtfile:{total_scanned_packs:201, total_skipped_packs:312, total_scanned_rows:1566020, total_skipped_rows:2445033, total_rs_index_check_time: 1ms, total_read_time: 90ms, total_disagg_read_cache_hit_size: 0, total_disagg_read_cache_miss_size: 0}, total_create_snapshot_time: 0ms, total_local_region_num: 21, total_remote_region_num: 0, total_learner_read_time: 37ms}
        │     └─TableFullScan_52(Probe)	1286202.00	1284610	mpp[tiflash]	table:t_meta	tiflash_task:{proc max:645.6ms, min:0s, avg: 215.2ms, p80:645.6ms, p95:645.6ms, iters:21, tasks:3, threads:24}, tiflash_scan:{dtfile:{total_scanned_packs:279, total_skipped_packs:272, total_scanned_rows:2202769, total_skipped_rows:2158975, total_rs_index_check_time: 0ms, total_read_time: 138ms, total_disagg_read_cache_hit_size: 0, total_disagg_read_cache_miss_size: 0}, total_create_snapshot_time: 0ms, total_local_region_num: 10, total_remote_region_num: 0, total_learner_read_time: 3ms}
        └─Projection_66(Probe)	1272563.62	1277421	mpp[tiflash]		tiflash_task:{proc max:863.1ms, min:848.9ms, avg: 854ms, p80:863.1ms, p95:863.1ms, iters:3072, tasks:3, threads:12}
          └─HashAgg_67	1272563.62	1277421	mpp[tiflash]		tiflash_task:{proc max:863.1ms, min:847.9ms, avg: 853.4ms, p80:863.1ms, p95:863.1ms, iters:3072, tasks:3, threads:12}
            └─ExchangeReceiver_69	1272563.62	1277454	mpp[tiflash]		tiflash_task:{proc max:762.1ms, min:756.1ms, avg: 758.7ms, p80:762.1ms, p95:762.1ms, iters:156, tasks:3, threads:12}
              └─ExchangeSender_68	1272563.62	1277454	mpp[tiflash]		tiflash_task:{proc max:553.5ms, min:0s, avg: 184.5ms, p80:553.5ms, p95:553.5ms, iters:768, tasks:3, threads:24}
                └─HashAgg_56	1272563.62	1277454	mpp[tiflash]		tiflash_task:{proc max:492.5ms, min:0s, avg: 164.2ms, p80:492.5ms, p95:492.5ms, iters:768, tasks:3, threads:24}
                  └─Selection_65	9113922.75	9114054	mpp[tiflash]		tiflash_task:{proc max:135.5ms, min:0s, avg: 45.2ms, p80:135.5ms, p95:135.5ms, iters:158, tasks:3, threads:24}
                    └─TableFullScan_64	9125959.00	9125734	mpp[tiflash]	table:t_module	tiflash_task:{proc max:80.5ms, min:0s, avg: 26.8ms, p80:80.5ms, p95:80.5ms, iters:161, tasks:3, threads:24}, tiflash_scan:{dtfile:{total_scanned_packs:1247, total_skipped_packs:1952, total_scanned_rows:9987735, total_skipped_rows:15649079, total_rs_index_check_time: 3ms, total_read_time: 1114ms, total_disagg_read_cache_hit_size: 0, total_disagg_read_cache_miss_size: 0}, total_create_snapshot_time: 0ms, total_local_region_num: 71, total_remote_region_num: 0, total_learner_read_time: 27ms}
  1. 简化语句耗时差不多
  2. in后面的结果集确实很大,但也有部分情况下in后面的结果集很小,希望都可以考虑到

看着这个 sql 用的字段非常少,tiflash mpp 的执行计划耗时主要是在 mpp 传输数据,我觉得可以尝试下把相关字段用索引全覆盖,然后走 tikv 试试。这样理论来说大概有 300w 数据的索引数据库返回到 tidb 做 join ,可能会快一些 :see_no_evil:

试试改成join以减少嵌套查询的复杂性,并可能提高查询性能
SELECT COUNT(DISTINCT t_meta.id) AS count_1
FROM t_meta
JOIN t_meta_versions ON t_meta.id = t_meta_versions.id
JOIN t_meta_module ON t_meta.id = t_meta_module.id
WHERE t_meta_versions.version = ‘v0918’
AND t_meta_module.state = 1
AND t_meta_module.module = ‘Planner’;

上亿级别的。要么索引关联,要么索引直查,,关联越复杂,越慢。再加上谓词索引过滤掉一大半数据。

一索引,二内存,感觉这个是内存不够