TiDB GROUP BY query is slow

The execution info is as follows:

time:14m12.8s, loops:2, partial_worker:{wall_time:14m12.789413799s, concurrency:5, task_num:541, tot_wait:1h11m3.775483758s, tot_exec:170.857449ms, tot_time:1h11m3.946601487s, max:14m12.789360702s, p95:14m12.789360702s}, final_worker:{wall_time:14m12.78943776s, concurrency:5, task_num:5, tot_wait:1h11m3.946844401s, tot_exec:49.13µs, tot_time:1h11m3.946897548s, max:14m12.789387859s, p95:14m12.789387859s}

time:14m12.8s, loops:542, partial_worker:{wall_time:14m12.176761805s, concurrency:5, task_num:1, tot_wait:1h10m57.012021845s, tot_exec:644.187609ms, tot_time:1h10m57.786230423s, max:14m12.176599303s, p95:14m12.176599303s}, final_worker:{wall_time:14m12.789305829s, concurrency:5, task_num:5, tot_wait:1h11m0.883257546s, tot_exec:2.99572837s, tot_time:1h11m3.878995141s, max:14m12.789260606s, p95:14m12.789260606s}

time:14m11.4s, loops:2, cop_task: {num: 1, max: 2m0.9s, proc_keys: 999863, tot_proc: 59.8s, rpc_num: 16, rpc_time: 13m59.9s, copr_cache_hit_ratio: 0.00}, backoff{tikvRPC: 11.5s, regionMiss: 2ms}

tikv_task:{time:59.7s, loops:977}, scan_detail: {total_process_keys: 999863, total_process_keys_size: 54992465, total_keys: 1000218, rocksdb: {delete_skipped_count: 7, key_skipped_count: 1000224, block: {cache_hit_count: 851, read_count: 31, read_byte: 831.5 KB}}}

tikv_task:{time:553ms, loops:977}

@dba远航 @tidb菜鸟一只

As I said, this query does not actually spend much time at the TiKV layer. There are nearly 1 million rows in total, but about 500,000 distinct deal_id values among them. TiKV therefore returns each deal_id with its corresponding maximum content_id from every node to tidb-server, and tidb-server then has to aggregate those 500,000 rows again, so essentially all of the time is spent on tidb-server. With TiFlash this problem would not exist.
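A minimal sketch of trying the TiFlash route, assuming the table is called t_deal_content (the real table name is not shown in this thread):

```sql
-- Assumption: the table is t_deal_content; replace with the real name.
-- Create one TiFlash replica so the optimizer can push the aggregation down to TiFlash.
ALTER TABLE t_deal_content SET TIFLASH REPLICA 1;

-- Check replication progress; AVAILABLE = 1 means the replica is ready.
SELECT TABLE_SCHEMA, TABLE_NAME, AVAILABLE, PROGRESS
FROM information_schema.tiflash_replica
WHERE TABLE_NAME = 't_deal_content';

-- Optionally restrict this session to TiFlash (plus tidb) to verify the new plan.
SET SESSION tidb_isolation_read_engines = 'tidb,tiflash';
```

Once the replica shows AVAILABLE = 1, EXPLAIN should show the aggregation handled by TiFlash rather than a HashAgg running on tidb-server.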


Aggregating that many rows on tidb-server is what hurts performance.
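If the aggregation has to stay on tidb-server, one knob to experiment with is the executor concurrency; the execution info above shows the default of 5 partial/final workers. This is only a sketch, and whether it helps depends on spare CPU on tidb-server:

```sql
-- Sketch only: raise the HashAgg worker concurrency for the current session.
-- The execution info shows concurrency:5, which is the default.
SET SESSION tidb_executor_concurrency = 10;

-- Older versions expose HashAgg-specific variables instead:
-- SET SESSION tidb_hashagg_partial_concurrency = 10;
-- SET SESSION tidb_hashagg_final_concurrency = 10;
```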

Probably not enough resources.

Try adding a few more TiKV nodes.

Judging from the execution plan, the table's health (statistics) looks fine.

Here the outer query only takes a total count, so computing max(content_id) in the inner query adds little value; selecting just the GROUP BY column would be enough. See the sketch below.
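The original SQL is not posted in the thread, but based on the description it probably has the shape of the first statement below; a rewrite that drops the unused MAX follows. Table and column roles are guesses:

```sql
-- Guessed shape of the original query: count the distinct deal_id groups while
-- also computing MAX(content_id), which the outer query never uses.
SELECT COUNT(*)
FROM (
    SELECT deal_id, MAX(content_id) AS max_content_id
    FROM t_deal_content
    GROUP BY deal_id
) t;

-- Equivalent result without the unused MAX (assuming deal_id is NOT NULL);
-- TiDB can answer this with a single distinct-count aggregation instead of
-- materializing ~500k groups on tidb-server.
SELECT COUNT(DISTINCT deal_id)
FROM t_deal_content;
```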

The data volume is not that large, so it should not be this slow. Could the tidb-server's CPU allocation be too low?