如何解决并发查询的延迟问题?

【 TiDB 使用环境】生产环境
【资源配置】

TiKV 6 * 16vCore 64GB Mem
TiDB 3 * 16vCore 64GB Mem

【遇到的问题:问题现象及影响】

服务端有一个面板,打开面板后会同时发送13条查询到后端,这13条查询都需要用到一张 12GB 119,352,523条数据的表usd
这些查询单独出现的时候基本都在10秒以下,但同时打过来基本就都在30秒左右了

查询后发现主要的问题流程是这样的

  • 查询因为DATE函数的缘故跳过了最优的索引,使用了次优索引,导致用TableRowIDScan查询了接近 五百万 的数据,随后Selection只选取了其中 1000 条,这个比较容易优化,加索引或改SQL就行
| id                                       | estRows    | actRows | task      | access object                                              | execution info                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | memory    | disk     |
| Projection_45                            | 7.68       | 0       | root      |                                                            | time:30.1s, loops:1, Concurrency:OFF                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    | 2.02 KB   | N/A      |
| └─Selection_46                           | 7.68       | 0       | root      |                                                            | time:30.1s, loops:1                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     | 2.02 KB   | N/A      |
|   └─HashAgg_47                           | 9.60       | 0       | root      |                                                            | time:30.1s, loops:1, partial_worker:{wall_time:30.051346914s, concurrency:5, task_num:0, tot_wait:2m30.256330022s, tot_exec:0s, tot_time:2m30.256337186s, max:30.051275149s, p95:30.051275149s}, final_worker:{wall_time:30.051697299s, concurrency:5, task_num:2, tot_wait:746.438µs, tot_exec:74ns, tot_time:2m30.257469315s, max:30.051632369s, p95:30.051632369s}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                   | 31.5 KB   | N/A      |
|     └─Projection_75                      | 5508656.20 | 0       | root      |                                                            | time:30.1s, loops:1, Concurrency:5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | 18.2 KB   | N/A      |
|       └─Projection_48                    | 5508656.20 | 0       | root      |                                                            | time:30.1s, loops:1, Concurrency:5                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      | 9.92 KB   | N/A      |
|         └─HashJoin_60                    | 5508656.20 | 0       | root      |                                                            | time:30.1s, loops:1, build_hash_table:{total:11.7ms, fetch:5.37ms, build:6.33ms}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | 2.92 MB   | 0 Bytes  |
|           ├─IndexReader_74(Build)        | 21945.43   | 17853   | root      |                                                            | time:6.56ms, loops:19, cop_task: {num: 7, max: 1.87ms, min: 537.7µs, avg: 1.11ms, p95: 1.87ms, max_proc_keys: 992, p95_proc_keys: 992, tot_proc: 1.57ms, tot_wait: 1.23ms, copr_cache_hit_ratio: 0.71, build_task_duration: 4.37µs, max_distsql_concurrency: 1}, rpc_info:{Cop:{num_rpc:7, total_time:7.71ms}}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          | 955.1 KB  | N/A      |
|           │ └─IndexRangeScan_73          | 21945.43   | 17853   | cop[tikv] | table:cust, index:PRIMARY(PARTNER_ORG_ID, CUSTOMER_ORG_ID) | tikv_task:{proc max:137ms, min:0s, avg: 46.1ms, p80:133ms, p95:137ms, iters:44, tasks:7}, scan_detail: {total_process_keys: 1472, total_process_keys_size: 197248, total_keys: 2552, get_snapshot_time: 86.6µs, rocksdb: {key_skipped_count: 2550, block: {cache_hit_count: 30}}}, time_detail: {total_process_time: 1.57ms, total_wait_time: 1.23ms, tikv_wall_time: 3.63ms}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | N/A       | N/A      |
|           └─HashJoin_63(Probe)           | 5011463.00 | 0       | root      |                                                            | time:30.1s, loops:1, build_hash_table:{total:1.29ms, fetch:1.26ms, build:28.1µs}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        | 33.7 KB   | 0 Bytes  |
|             ├─TableReader_65(Build)      | 12.00      | 12      | root      |                                                            | time:1.22ms, loops:2, cop_task: {num: 1, max: 1.59ms, proc_keys: 12, tot_proc: 99.7µs, tot_wait: 413.2µs, copr_cache_hit_ratio: 0.00, build_task_duration: 7.07µs, max_distsql_concurrency: 1}, rpc_info:{Cop:{num_rpc:1, total_time:1.58ms}}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | 790 Bytes | N/A      |
|             │ └─TableFullScan_64         | 12.00      | 12      | cop[tikv] | table:offering                                             | tikv_task:{time:0s, loops:1}, scan_detail: {total_process_keys: 12, total_process_keys_size: 841, total_keys: 13, get_snapshot_time: 15µs, rocksdb: {key_skipped_count: 12, block: {cache_hit_count: 11}}}, time_detail: {total_process_time: 99.7µs, total_wait_time: 413.2µs, tikv_wall_time: 612.9µs}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                | N/A       | N/A      |
|             └─IndexLookUp_72(Probe)      | 4009170.40 | 0       | root      | partition:all                                              | time:30.1s, loops:1, index_task: {total_time: 30s, fetch_handle: 497.2ms, build: 512µs, wait: 29.5s}, table_task: {total_time: 2m30.2s, num: 253, concurrency: 5}, next: {wait_index: 3.88ms, wait_table_lookup_build: 182µs, wait_table_lookup_resp: 29.8s}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            | 23.7 MB   | N/A      |
|               ├─IndexRangeScan_69(Build) | 5011463.00 | 4981930 | cop[tikv] | table:usd, index:partner_org_id_idx(partner_org_id)        | time:118.2ms, loops:4924, cop_task: {num: 256, max: 7.02ms, min: 411.6µs, avg: 1.24ms, p95: 2.92ms, max_proc_keys: 2016, p95_proc_keys: 992, tot_proc: 29.6ms, tot_wait: 80.6ms, copr_cache_hit_ratio: 0.72, build_task_duration: 488.5µs, max_distsql_concurrency: 2}, rpc_info:{Cop:{num_rpc:256, total_time:313.2ms}}, tikv_task:{proc max:973ms, min:0s, avg: 65.2ms, p80:35ms, p95:408ms, iters:5839, tasks:256}, scan_detail: {total_process_keys: 41856, total_process_keys_size: 3808896, total_keys: 41927, get_snapshot_time: 5.27ms, rocksdb: {key_skipped_count: 41856, block: {cache_hit_count: 646}}}, time_detail: {total_process_time: 29.6ms, total_suspend_time: 2.22ms, total_wait_time: 80.6ms, total_kv_read_wall_time: 34ms, tikv_wall_time: 157.5ms}                                                                                                                             | N/A       | N/A      |
|               └─Selection_71(Probe)      | 4009170.40 | 1458    | cop[tikv] |                                                            | time:2m28.9s, loops:254, cop_task: {num: 629, max: 1.56s, min: 0s, avg: 377ms, p95: 1.04s, max_proc_keys: 20480, p95_proc_keys: 15797, tot_proc: 56.8s, tot_wait: 249.5ms, copr_cache_hit_ratio: 0.00, build_task_duration: 78.1ms, max_distsql_concurrency: 4, max_extra_concurrency: 1}, rpc_info:{Cop:{num_rpc:629, total_time:3m57.1s}, rpc_errors:{context canceled:1}}, tikv_task:{proc max:1.55s, min:0s, avg: 372.1ms, p80:676ms, p95:1.03s, iters:7556, tasks:628}, scan_detail: {total_process_keys: 4872113, total_process_keys_size: 778806888, total_keys: 5902859, get_snapshot_time: 16ms, rocksdb: {key_skipped_count: 3561631, block: {cache_hit_count: 18038654, read_count: 1, read_byte: 15.9 KB, read_time: 105.4µs}}}, time_detail: {total_process_time: 56.8s, total_suspend_time: 2m56.9s, total_wait_time: 249.5ms, total_kv_read_wall_time: 3m53.3s, tikv_wall_time: 3m55.5s} | N/A       | N/A      |
|                 └─TableRowIDScan_70      | 5011463.00 | 4872113 | cop[tikv] | table:usd                                                  | tikv_task:{proc max:1.55s, min:0s, avg: 371.5ms, p80:674ms, p95:1.02s, iters:7556, tasks:628}                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           | N/A       | N/A      |
  • 同时因为13条SQL都触发了相同的对usd的查询,因此TiKV Selection算子的读取速度变得很慢,total_suspend_time: 2m56.9s,而单条查询时total_suspend_time: 5.48 s,明显TiKV上的延迟导致了查询缓慢

我个人理解是出现了读取热点,而且分配不均匀,但是查看监控后我发现TiKV的CPU、Memory、IO都没有打满(30秒的查询发生在17:42左右,有一个明显的上涨,但没有到达最大)

唯一比较紧张的应该是Unified read pool CPU,但也没有打满,同时我开启了auto-adjust-pool-size,理论上应该可以打满TiKV的

现在虽然通过增改索引可以部分解决问题,但我还是想知道 — 在CPU、Memory、IO都没有打满的情况下,为什么total_suspend_time会升高,是否能优化?

你的分析是对的,其实就是Unified read pool CPU打满了。
短期的用量已经非常高了。

另外你这种场景,应该是一个AP系统的dashboard。

推荐尝试使用tiflash处理AP系统的sql,而不是采用tikv硬扫这么多数据。如果tikv来做这类聚合计算,是难以避免对TP系统的影响的。

执行计划上看也是个非常典型的聚合计算。通过tiflash做这类计算扫描的数据会减少很多。

如果并发高的话可以尝试查询结果物化

https://docs.pingcap.com/zh/tidb/stable/tiflash-results-materialization#tiflash-查询结果物化

3 个赞

一般这种情况我们应用层会做处理,依次查询,避免并发查询。

我看也是,你都有资源部署6个16c64G的tikv了,再部署个tiflash应该也能接受吧,收益肯定肯定比你再多部几套tikv要大

感谢,确实是一个AP的场景,我后续会试一下TiFlash在这类场景的表现的

因为我理解既然TiKV有16核,而且没有完全用到1600%,所以猜测硬盘或IO方面存在限制,导致CPU起不来
按照您说的

短期的用量已经非常高了。

我理解其实上述IO或硬盘问题并不存在,在短时间内Unified read pool CPU的快速上涨其实就代表CPU已经打满了,只不过查询时间还不够长,导致CPU还没涨到1600%就已经结束了。

但是我还是比较疑惑,因为看起来Unified read pool CPU的涨幅并不均匀,最大的已经到了600%,但最小的还在200%,这是不是也代表存在数据分布不均,导致Unified read pool CPU的涨幅看起来不均匀?

这是有可能的。
主要耗时阶段在indexlookup。我看你注意到了selection的选择性很好。所以调整数据分布,可能不如索引来的直接,收益也比较模糊,不好评估。
从成本和收益的角度还是应该先调整索引。

此话题已在最后回复的 7 天后被自动关闭。不再允许新回复。