tiflash ap查询优化

为提高效率,请提供以下信息,问题描述清晰能够更快得到解决:
【 TiDB 使用环境】
v5

【概述】 场景 + 问题概述
tiflash的ap查询性能没有想象中的那么好,是我用的姿势不对吗?

SQL:

SELECT
/+ read_from_storage(tiflash[consume_gift_log]) /
time DIV 600 * 600 AS “time”,
alias AS metric,
count(
) AS "
"
FROM
consume_gift_log
WHERE
time >= 1633174955
AND time <= 1633779755
AND alias IN (
',
'1
’,
‘lmpro’,
',
‘*’,
‘latemeet’,
‘us’,
‘ar’,
'l
me’
)
GROUP BY
1,
2
ORDER BY
time DIV 600 * 600;

执行计划


根据这个执行计划看tiflash的耗时不长,发送给tidb的聚合操作耗时10S秒多,我这样理解是对的吗?

然后我参照 https://docs.pingcap.com/zh/tidb/stable/tune-tiflash-performance tiflash调优,设置了 ```
set @@tidb_distsql_scan_concurrency = 80;
set @@tidb_allow_batch_cop = 1;

![image|690x269](upload://cUwXfQbJpFbvrVfllKt6lHojy5a.png) 
但是好像不起作用,那个concurrency没有变化,我应该如何调整优化呢


【TiDB 版本】 v5
2 个赞

2 个赞

可以试试将 GROUP BY 1, 2 改成 GROUP BY time, metric

2 个赞

没有明显变化:joy:

2 个赞

explain select 查看执行计划后 show warnings; 有没有提示

2 个赞

1 个赞

SELECT
/+ read_from_storage(tiflash[consume_gift_log]) /
time DIV 600 * 600 AS “time1”,
alias AS metric,
count(
) AS "
"
FROM consume_gift_log
WHERE
time >= 1633174955 AND time <= 1633779755 and
alias in (‘voice’,‘bothlive’,‘lmpro’,‘ds’,‘lenovo’,‘latemeet’,‘us’,‘ar’,‘ew’) and
source_type in (201)
GROUP BY time1,metric
ORDER BY time1;

1 个赞

time div下推不了?

1 个赞

把 DIV 改成 / 呢,另外 time DIV 600 * 600 这个计算出来跟 time 原值是一样吧

1 个赞

time DIV 600 * 600是想把10min的数据统计成一个数据

1 个赞

1 个赞


根据这个执行计划看tiflash的耗时不长,发送给tidb的聚合操作耗时10S秒多,我这样理解是对的吗?

我想知道是tiflash计算无法下推导致的查询10S多,还是tidb聚合的时候占大头呢?这个执行计划是怎么看的呢?

1 个赞

从上面的执行计划,应该是由于 tiflash 过滤的结果集比较大,tidb 在 table reader 这一步耗时比较高

1 个赞

tidb_distsql_scan_concurrency 调了这个参数没啥用,还需要调哪些参数呢?需要重启tidb-server和tiflash吗,感觉配置没生效

试试加个 hint SELECT /*+ read_from_storage(tiflash[consume_gift_log]) AGG_TO_COP() */ …

SELECT
/+ read_from_storage(tiflash[consume_gift_log]) AGG_TO_COP() /
time DIV 600 * 600 AS “time1”,
alias AS metric,
count(
) AS "
"
FROM
consume_gift_log
WHERE
time >= 1633174955
AND time <= 1633779755
AND alias IN (
‘voice’,
‘bothlive’,
‘lmsdpro’,
‘twoo’,
‘lenovo’,
‘latemeet’,
‘us’,
‘ar’,
‘livdsme’
)
AND source_type IN (201)
GROUP BY
time1,
metric
ORDER BY
time1;

没啥变化:sweat_smile:

发一下 explain analyze 结果

1、不对time做10min处理,2、去掉orderby速度怎么样?

id estRows actRows task access object execution info operator info memory disk
Sort_6 342398.37 5251 root time:12.7s, loops:7 Column#36 164.8 KB 0 Bytes
└─Projection_8 342398.37 5251 root time:12.7s, loops:10, Concurrency:5 mul(intdiv(livemedatahouse.consume_gift_log.time, 600), 600)->Column#36, livemedatahouse.consume_gift_log.alias, Column#35 170.2 KB N/A
└─HashAgg_9 342398.37 5251 root time:12.7s, loops:10, partial_worker:{wall_time:12.688041006s, concurrency:5, task_num:7305, tot_wait:1m1.439624724s, tot_exec:1.992892867s, tot_time:1m3.439108576s, max:12.687964758s, p95:12.687964758s}, final_worker:{wall_time:12.69182294s, concurrency:5, task_num:25, tot_wait:1m3.438241678s, tot_exec:19.741072ms, tot_time:1m3.457994282s, max:12.691762492s, p95:12.691762492s} group by:Column#40, Column#41, funcs:count(1)->Column#35, funcs:firstrow(Column#38)->livemedatahouse.consume_gift_log.time, funcs:firstrow(Column#39)->livemedatahouse.consume_gift_log.alias 4.38 MB N/A
└─Projection_14 4451866.61 7460710 root time:12.3s, loops:7306, Concurrency:5 livemedatahouse.consume_gift_log.time, livemedatahouse.consume_gift_log.alias, mul(intdiv(livemedatahouse.consume_gift_log.time, 600), 600)->Column#40, livemedatahouse.consume_gift_log.alias 375.6 KB N/A
└─TableReader_13 4451866.61 7460710 root time:12.3s, loops:7306, cop_task: {num: 1772, max: 4.3s, min: 33.1ms, avg: 559.8ms, p95: 1.64s, rpc_num: 1772, rpc_time: 16m32s, copr_cache_hit_ratio: 0.00} data:Selection_12 15.7 MB N/A
└─Selection_12 4451866.61 7460710 cop[tiflash] tiflash_task:{proc max:146.5ms, min:13.7ms, p80:65.4ms, p95:78.3ms, iters:208, tasks:1772, threads:1772} eq(livemedatahouse.consume_gift_log.source_type, 201), ge(livemedatahouse.consume_gift_log.time, 1633174955), in(livemedatahouse.consume_gift_log.alias, “voice”, “bothlive”, “lmpro”, “twoo”, “lenovo”, “latemeet”, “us”, “ar”, “liveme”), le(livemedatahouse.consume_gift_log.time, 1633779755) N/A N/A
└─TableFullScan_11 454952989.00 459760062 cop[tiflash] table:consume_gift_log tiflash_task:{proc max:122.5ms, min:6.7ms, p80:48.6ms, p95:61.4ms, iters:8667, tasks:1772, threads:1772} keep order:false N/A N/A

目前 5.0/5.1 还不支持 intdiv 下推到 tiflash 考虑先去掉 DIV,或者 playground 部署 5.2 测一下这个版本走 mpp 的效果