Why do two SQL statements differ in performance by tens of times with a TiFlash replica enabled?

Soysauce520 · September 4, 2025, 08:16

Could you post the execution plan after adding the hint? It may still be an estimation problem.

nobody · September 4, 2025, 08:20

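The row count estimated for the two filter conditions is far too small, only 0.29, so the optimizer decided that hash agg was not worth using.

Could you try collecting statistics?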
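For reference, a minimal sketch of collecting statistics. The database and table names are taken from the execution plan posted in this thread; adjust them if your schema differs.

```sql
-- Collect fresh statistics for the table that appears in the plan.
-- Names below are copied from the plan's operator info.
ANALYZE TABLE environment_uun000000000001.record_stress__t;
```

On a large table this can take a while, so running it off-peak is probably safer.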

li214528367 · September 4, 2025, 08:35

| id | estRows | estCost | actRows | task | access object | execution info | operator info | memory | disk |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| HashAgg_25 | 1.00 | 1797431.32 | 1 | root | | time:92.1ms, loops:2, partial_worker:{wall_time:92.083852ms, concurrency:5, task_num:1, tot_wait:91.697417ms, tot_exec:22.425µs, tot_time:458.667668ms, max:91.734547ms, p95:91.734547ms}, final_worker:{wall_time:92.108213ms, concurrency:5, task_num:5, tot_wait:3.567µs, tot_exec:136ns, tot_time:458.75351ms, max:91.754452ms, p95:91.754452ms} | funcs:count(Column#50)->Column#48 | 6.23 KB | 0 Bytes |
| └─TableReader_27 | 1.00 | 1795904.06 | 1 | root | | time:91.7ms, loops:2, cop_task: {num: 2, max: 0s, min: 0s, avg: 0s, p95: 0s, copr_cache_hit_ratio: 0.00} | MppVersion: 2, data:ExchangeSender_26 | 808 Bytes | N/A |
| └─ExchangeSender_26 | 1.00 | 26938543.28 | 1 | mpp[tiflash] | | tiflash_task:{time:88.5ms, loops:1, threads:1} | ExchangeType: PassThrough | N/A | N/A |
| └─HashAgg_10 | 1.00 | 26938543.28 | 1 | mpp[tiflash] | | tiflash_task:{time:88.5ms, loops:1, threads:1} | funcs:count(1)->Column#50 | N/A | N/A |
| └─Selection_24 | 0.29 | 26938469.76 | 3369270 | mpp[tiflash] | | tiflash_task:{time:88.5ms, loops:112, threads:16} | eq(environment_uun000000000001.record_stress__t.string5, "Alice5"), eq(environment_uun000000000001.record_stress__t.string6, "Alice6") | N/A | N/A |
| └─TableFullScan_23 | 285485.00 | 25568141.76 | 3369270 | mpp[tiflash] | table:record_stress__t | tiflash_task:{time:88.5ms, loops:112, threads:16}, tiflash_wait: {pipeline_queue_wait: 2ms}, tiflash_scan:{mvcc_input_rows:2305174, mvcc_input_bytes:85291438, mvcc_output_rows:1152587, lm_skip_rows:0, local_regions:32, remote_regions:0, tot_learner_read:4ms, region_balance:{instance_num: 1, max/min: 32/32=1.000000}, delta_rows:51787, delta_bytes:93734470, segments:14, stale_read_regions:0, tot_build_snapshot:0ms, tot_build_bitmap:395ms, tot_build_inputstream:401ms, min_local_stream:70ms, max_local_stream:81ms, dtfile:{data_scanned_rows:4995513, data_skipped_rows:0, mvcc_scanned_rows:3311441, mvcc_skipped_rows:524288, lm_filter_scanned_rows:0, lm_filter_skipped_rows:0, tot_rs_index_check:2ms, tot_read:380ms}} | keep order:false, stats:pseudo | N/A | N/A |
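
Two things stand out in this plan: Selection_24 has estRows 0.29 against actRows 3369270, and TableFullScan_23 shows `stats:pseudo`, which means the optimizer was working from pseudo statistics rather than collected ones. A rough sketch for checking the state of the statistics, assuming the database and table names shown in the plan:

```sql
-- Health score of the table's statistics (0 to 100; low values mean stale stats).
SHOW STATS_HEALTHY
WHERE db_name = 'environment_uun000000000001'
  AND table_name = 'record_stress__t';

-- Last update time, modify count, and row count recorded in the statistics.
SHOW STATS_META
WHERE db_name = 'environment_uun000000000001'
  AND table_name = 'record_stress__t';
```

If no rows come back, or the health score is low, that would be consistent with the misestimate above.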

li214528367 · September 4, 2025, 08:42

I refreshed the statistics, and now the query is fast even without /*+ HASH_AGG() */.

tidb菜鸟一只 · September 4, 2025, 08:59

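The statistics were simply too stale. The optimizer thought the data volume was too small to need Hash Aggregation. Hash Aggregation runs with multi-threaded concurrency and is fast, but it uses more memory than Stream Aggregation, so the optimizer chose Stream Aggregation.
Take a look at the latest execution plan after collecting statistics; it should now pick Hash Aggregation automatically.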
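
A sketch of how the two aggregation algorithms could be compared side by side. The query text here is an assumption reconstructed from the operator info in the plan (a count with equality filters on string5 and string6); substitute the real statement.

```sql
-- Let the optimizer choose, and see which aggregation operator it picks.
EXPLAIN ANALYZE
SELECT COUNT(*)
FROM environment_uun000000000001.record_stress__t
WHERE string5 = 'Alice5' AND string6 = 'Alice6';

-- Force Hash Aggregation with a hint.
EXPLAIN ANALYZE
SELECT /*+ HASH_AGG() */ COUNT(*)
FROM environment_uun000000000001.record_stress__t
WHERE string5 = 'Alice5' AND string6 = 'Alice6';

-- Force Stream Aggregation with a hint, for comparison.
EXPLAIN ANALYZE
SELECT /*+ STREAM_AGG() */ COUNT(*)
FROM environment_uun000000000001.record_stress__t
WHERE string5 = 'Alice5' AND string6 = 'Alice6';
```

Comparing estRows with actRows on the Selection operator in each output should show whether the estimate is now close to reality.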