TiDB 5.2.3 - table join performance issue

To get help more efficiently, please provide the following information; a clear problem description gets resolved faster:
【TiDB Version】
5.7.25-TiDB-v5.2.3
【TiDB Deployment Environment】
|Component|CPU|Memory|Local Storage|Network|
|---|---|---|---|---|
|TiDB|8 cores+|16 GB+|SAS, 200 GB+|Gigabit NIC|
|PD|4 cores+|8 GB+|SAS, 200 GB+|Gigabit NIC|
|TiKV|8 cores+|32 GB+|SSD, 200 GB+|Gigabit NIC|
|TiFlash|32 cores+|64 GB+|SSD, 200 GB+|Gigabit NIC|

【Overview】Scenario + problem summary
1. order_item, order_item_details, and order_item_extra each hold 50 million rows; expenses_item holds 12,205 rows.
2. The final result set is about 10 million rows, and the query takes 400+ seconds.
3. This performance doesn't feel great. Can this kind of table join be optimized further?

【Are the statistics up to date?】

【Execution Plan】


【SQL text, schema, and data distribution】
SELECT count(0) FROM order_item oi
INNER JOIN order_item_details oid ON oi.item_no = oid.item_no
INNER JOIN order_item_extra oie ON oi.item_no = oie.item_no
INNER JOIN expenses_item ei ON ei.id = oi.setmanag_item_id
where oi.cust_merch_id = 111267


Check the selectivity of cust_merch_id and of the join columns of the related tables in the statistics. If it is high enough, try adding a hint to force an index join.
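
A minimal sketch of what the hinted query could look like, reusing the aliases from the SQL above (which tables to push to the inner side of the index join depends on the actual selectivity, so treat the hint below as an illustration rather than the recommended plan):

SELECT /*+ INL_JOIN(oid, oie) */ count(0) FROM order_item oi
INNER JOIN order_item_details oid ON oi.item_no = oid.item_no
INNER JOIN order_item_extra oie ON oi.item_no = oie.item_no
INNER JOIN expenses_item ei ON ei.id = oi.setmanag_item_id
WHERE oi.cust_merch_id = 111267;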


Here is the statistics info, copied out below for you to take a look (sorry for the trouble). I don't know how to read the selectivity.

HashAgg_20 1.00 1 root time:7m3.6s, loops:2, partial_worker:{wall_time:7m3.631135051s, concurrency:5, task_num:11563, tot_wait:35m17.548307611s, tot_exec:604.271381ms, tot_time:35m18.155366057s, max:7m3.631100076s, p95:7m3.631100076s}, final_worker:{wall_time:7m3.631153672s, concurrency:5, task_num:5, tot_wait:35m18.155651862s, tot_exec:23.604µs, tot_time:35m18.155679881s, max:7m3.631145104s, p95:7m3.631145104s} funcs:count(0)->Column#182 63.7 KB N/A
└─IndexJoin_26 10116.25 11840102 root time:7m3.6s, loops:11564, inner:{total:5m12.6s, concurrency:5, task:471, construct:12.5s, fetch:4m56.7s, build:3.3s}, probe:3.92s inner join, inner:IndexReader_25, outer key:sltas_ods.order_item.item_no, inner key:sltas_ods.order_item_details.item_no, equal cond:eq(sltas_ods.order_item.item_no, sltas_ods.order_item_details.item_no) 1.12 GB N/A
├─HashJoin_72(Build) 10084.05 11840123 root time:6m1.9s, loops:11565, build_hash_table:{total:6m0.1s, fetch:5m58.9s, build:1.15s}, probe:{concurrency:5, total:31m3.3s, max:7m3s, probe:1m3s, fetch:30m0.4s} inner join, equal:[eq(sltas_ods.order_item.setmanag_item_id, sltas_ods.expenses_item.id)] 457.0 MB 0 Bytes
│ ├─IndexJoin_79(Build) 10079.92 11840123 root time:6m0s, loops:11564, inner:{total:29m57s, concurrency:5, task:1977, construct:52.6s, fetch:29m1s, build:3.41s}, probe:12.3s inner join, inner:IndexLookUp_78, outer key:sltas_ods.order_item_extra.item_no, inner key:sltas_ods.order_item.item_no, equal cond:eq(sltas_ods.order_item_extra.item_no, sltas_ods.order_item.item_no) 3.75 GB N/A
│ │ ├─TableReader_110(Build) 10000.00 50369404 root time:96.4ms, loops:49332, cop_task: {num: 290, max: 518ms, min: 3.84ms, avg: 93.2ms, p95: 171.9ms, rpc_num: 290, rpc_time: 27s, copr_cache_hit_ratio: 0.00} data:TableRangeScan_109 96.0 MB N/A
│ │ │ └─TableRangeScan_109 10000.00 50369404 cop[tiflash] table:oie tiflash_task:{proc max:59.8ms, min:2.47ms, p80:25.9ms, p95:35.4ms, iters:1468, tasks:290, threads:290} range:[0,+inf], keep order:false, stats:pseudo N/A N/A
│ │ └─IndexLookUp_78(Probe) 1.01 11840123 root time:28m28.8s, loops:14579, index_task: {total_time: 22m9.8s, fetch_handle: 22m9.8s, build: 7.56ms, wait: 32.3ms}, table_task: {total_time: 2h7m44.4s, num: 9858, concurrency: 9885} 33.0 KB N/A
│ │ ├─IndexRangeScan_75(Build) 4.33 50369403 cop[tikv] table:oi, index:idx_item_no(item_no) time:22m7.7s, loops:55136, cop_task: {num: 3951, max: 1.15s, min: 570.5µs, avg: 348.3ms, p95: 768.4ms, max_proc_keys: 25000, p95_proc_keys: 25000, tot_proc: 21m20.8s, tot_wait: 279ms, rpc_num: 3951, rpc_time: 22m56s, copr_cache_hit_ratio: 0.02}, tikv_task:{proc max:1.09s, min:0s, p80:630ms, p95:719ms, iters:66776, tasks:3951}, scan_detail: {total_process_keys: 50338334, total_keys: 100676748, rocksdb: {delete_skipped_count: 0, key_skipped_count: 50338334, block: {cache_hit_count: 517151368, read_count: 22380, read_byte: 237.5 MB}}} range: decided by [eq(sltas_ods.order_item.item_no, sltas_ods.order_item_extra.item_no)], keep order:false N/A N/A
│ │ └─Selection_77(Probe) 1.01 11840123 cop[tikv] time:15m54.1s, loops:28974, cop_task: {num: 10709, max: 368.3ms, min: 277.4µs, avg: 90.2ms, p95: 209.3ms, max_proc_keys: 9964, p95_proc_keys: 9672, tot_proc: 14m49s, tot_wait: 802ms, rpc_num: 10709, rpc_time: 16m5.5s, copr_cache_hit_ratio: 0.05}, tikv_task:{proc max:293ms, min:0s, p80:144ms, p95:195ms, iters:99505, tasks:10709}, scan_detail: {total_process_keys: 49757647, total_keys: 100753931, rocksdb: {delete_skipped_count: 0, key_skipped_count: 57001240, block: {cache_hit_count: 362250163, read_count: 245780, read_byte: 2.52 GB}}} eq(sltas_ods.order_item.cust_merch_id, 111267) N/A N/A
│ │ └─TableRowIDScan_76 4.33 50369403 cop[tikv] table:oi tikv_task:{proc max:293ms, min:0s, p80:143ms, p95:195ms, iters:99505, tasks:10709} keep order:false N/A N/A
│ └─IndexReader_125(Probe) 12205.00 12205 root time:1.3ms, loops:13, cop_task: {num: 1, max: 1.25ms, proc_keys: 0, rpc_num: 1, rpc_time: 1.24ms, copr_cache_hit_ratio: 1.00} index:IndexFullScan_124 95.7 KB N/A
│ └─IndexFullScan_124 12205.00 12205 cop[tikv] table:ei, index:merchant_base_info_id(merchant_base_info_id) tikv_task:{time:7ms, loops:16} keep order:false N/A N/A
└─IndexReader_25(Probe) 1.00 11840102 root time:4m47.4s, loops:12569, cop_task: {num: 1781, max: 1.01s, min: 262.6µs, avg: 188.2ms, p95: 700.2ms, max_proc_keys: 25000, p95_proc_keys: 25000, tot_proc: 5m3.1s, tot_wait: 142ms, rpc_num: 1781, rpc_time: 5m35.2s, copr_cache_hit_ratio: 0.08} index:IndexRangeScan_24 30.4 KB N/A
└─IndexRangeScan_24 1.00 11840102 cop[tikv] table:oid, index:idx_item_no_order_item_details_0(item_no) tikv_task:{proc max:880ms, min:0s, p80:562ms, p95:646ms, iters:18186, tasks:1781}, scan_detail: {total_process_keys: 11796605, total_keys: 23593240, rocksdb: {delete_skipped_count: 0, key_skipped_count: 11796605, block: {cache_hit_count: 122117697, read_count: 15331, read_byte: 222.9 MB}}} range: decided by [eq(sltas_ods.order_item_details.item_no, sltas_ods.order_item.item_no)], keep order:false N/A N/A

What you pasted above is the execution plan, not the statistics. For the statistics, you can run SHOW STATS_HISTOGRAMS and look at the distinct count of the relevant columns:
https://docs.pingcap.com/zh/tidb/stable/sql-statement-show-histograms/#show-stats_histograms
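
A quick example against the order_item table from this thread (the Distinct_count column in the output is what the selectivity estimate comes from):

SHOW STATS_HISTOGRAMS WHERE Table_name = 'order_item';
-- or narrow it down to a single column:
SHOW STATS_HISTOGRAMS WHERE Table_name = 'order_item' AND Column_name = 'cust_merch_id';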


Take a look at order_item: it has 50 million rows, so why is the distinct count of the cust_merch_id index only 1086?

Have the statistics gone stale? Run SHOW STATS_HEALTHY to check the health score, or compute it directly with SELECT count(DISTINCT cust_merch_id).
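
Spelled out against the order_item table from this thread, the two checks would be:

SHOW STATS_HEALTHY WHERE Table_name = 'order_item';
SELECT count(DISTINCT cust_merch_id) FROM order_item;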

The statistics health score is 100.

Then the statistics for this table are accurate. If the data were evenly distributed, the filter oi.cust_merch_id = 111267 should return roughly 50 million / 1,086 ≈ 46,000 rows. Try joining only with the other small table and see how long that takes, for example:
SELECT count(0) FROM order_item oi
INNER JOIN expenses_item ei ON ei.id = oi.setmanag_item_id
where oi.cust_merch_id = 111267;

The merchant cust_merch_id = 111267 has about 10 million rows. I ran the SQL you suggested and the count came back as 11,840,147.

Then that is a fairly large merchant and the data distribution is quite skewed; for this case a hash join is usually the better choice.

I added /*+ INL_JOIN(oi,oid,oie,ei,eica) */ and it looks a lot better.

Try changing the hint to a hash join.

/*+ HASH_JOIN(oi,oid,oie,ei,eica) */ works even better. For a complex SQL like this, when the system's default execution plan isn't good, is the approach just to try the Optimizer Hints one by one and then pick the best hint?
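
For reference, placed into the query posted at the top of the thread the hint would look roughly like this (the eica alias in the hint above presumably refers to an extra table in the poster's full query that isn't shown here, so it is left out of this sketch):

SELECT /*+ HASH_JOIN(oi, oid, oie, ei) */ count(0) FROM order_item oi
INNER JOIN order_item_details oid ON oi.item_no = oid.item_no
INNER JOIN order_item_extra oie ON oi.item_no = oie.item_no
INNER JOIN expenses_item ei ON ei.id = oi.setmanag_item_id
WHERE oi.cust_merch_id = 111267;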

If it's convenient, could you export the statistics and table schemas of these tables? Then the development team can look into whether it can be optimized further.
https://docs.pingcap.com/zh/tidb/v5.2/statistics/#导出统计信息
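
A minimal sketch of the export/import flow described in that doc (the host, status port, and file name below are placeholders; sltas_ods is the schema name visible in the execution plan above):

-- export: fetch the statistics dump from the tidb-server status port, e.g.
--   curl -G "http://<tidb-server-ip>:10080/stats/dump/sltas_ods/order_item" > order_item.json
-- import it into a test cluster to reproduce the plan:
LOAD STATS 'order_item.json';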

I've put together the statistics and the table schemas, but I'd rather not post them on the public internet. Please add me on WeChat. WeChat ID: wx799983050

Received. After importing the statistics, the count for order_item_extra shows as 0. Does this table actually contain data?

Yes, it also has 50 million rows.

Maybe the statistics were never collected? The exported order_item_extra.json has no data at the end:
"ext_stats": null,
"count": 0,
"modify_count": 0,
"partitions": null

If the data in the order_item table is unevenly distributed, it is recommended to collect histogram statistics on cust_merch_id so that the optimizer can choose an execution plan based on the actual value that is passed in.
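
A minimal sketch of re-collecting the statistics (plain ANALYZE TABLE is used here since it collects column histograms in v5.2; per-column collection syntax is not assumed to be available in this version):

ANALYZE TABLE order_item_extra;
ANALYZE TABLE order_item;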
