如何使sum下推,或者不使用ExchangeSender完成计算

【 TiDB 使用环境】生产环境
【 TiDB 版本】V7.5.0
【遇到的问题:问题现象及影响】跑了5s多,一看主要是数据传输

sql:
SELECT
date,
count(CASE WHEN tool = 2 THEN 1 END) AS POLYGON_2D,
count(CASE WHEN tool = 3 THEN 1 END) AS POLYLINE_2D,
count(CASE WHEN tool = 4 THEN 1 END) AS POINT_2D,
count(CASE WHEN tool = 5 THEN 1 END) AS CUBOID_3D,
count(CASE WHEN tool = 6 THEN 1 END) AS POLYGON_3D,
count(CASE WHEN tool = 8 THEN 1 END) AS SEG_3D,
count(CASE WHEN tool = 9 THEN 1 END) AS POLYLINE_3D,
count(CASE WHEN tool = 10 THEN 1 END) AS TEXT_SUMMARY_LLM,
count(CASE WHEN tool = 11 THEN 1 END) AS DIALOGUE_PROMPT_LLM,
count(CASE WHEN tool = 12 THEN 1 END) AS DIALOGUE_ANSWER_LLM,
count(CASE WHEN tool = 13 THEN 1 END) AS RECTANGLE_2D,
count(CASE WHEN tool = 14 THEN 1 END) AS CUBOID_2D,
count(CASE WHEN tool = 15 THEN 1 END) AS CIRCLE_2D,
count(CASE WHEN tool = 16 THEN 1 END) AS ELLIPSE_2D,
count(CASE WHEN tool = 17 THEN 1 END) AS SKELETON_2D
FROM
(
SELECT
date_format( t1.create_time, ‘%Y-%m-%d’ ) AS date,
tool,
row_number() over ( PARTITION BY t1.annotate_id ORDER BY t1.create_time DESC ) AS rn
FROM
(select
task_subject_id, annotate_id, JSON_EXTRACT(data, ‘$.tool’ ) AS tool, workflow_id, create_time
from annotate_log
WHERE record_type IN ( 2, 4 )
AND project_id = 2017612633062162347
AND create_time >= ‘2024-07-13 00:00:00’
AND create_time <= DATE_ADD(‘2024-07-19 00:00:00’, INTERVAL 1 DAY )
AND annotate_id IN ( SELECT id FROM annotate WHERE is_delete = 0 AND project_id = 2017612633062162347 )
and workflow_id in ( SELECT id FROM workflow WHERE type = 1 and is_delete = 0 AND project_id = 2017612633062162347)
and task_subject_id in ( SELECT id FROM task_subject WHERE is_delete = 0 AND project_id = 2017612633062162347)
AND project_id = 2017612633062162347
and is_delete = 0
) t1
)t
WHERE
rn = 1
AND date IS NOT NULL
GROUP BY
date
HAVING
date IS NOT NULL
ORDER BY
date;

explain analyze:
_explain_analyze_SELECT_date_sum_IF_tool_2_1_0_AS_POLYGON_2D_sum_202407211602.csv (9.6 KB)

你这sql也没有sum啊,下推啥,你annotate_log表上project_id 字段有索引吗,过滤性好不,不行走下tikv索引试下,感觉走tiflash也不是很合理。

set @@session.tidb_allow_mpp=1;
set @@session.tidb_enforce_mpp=1;

https://docs.pingcap.com/zh/tidb/stable/use-tiflash-mpp-mode#控制是否选择-mpp-模式

建议强制mpp,观察一下执行效率是否会变好,如果不好,看看执行计划是什么。

至于你现在这个执行计划,是个tiflash负优化的情况。

tiflash擅长的是聚合计算,而不是扫描。你原来这个执行计划就是tidb去做聚合了。tiflash做扫描。这是执行时间不理想的根因。

没有sum,louzhu说的是count吧,having可以去掉吧

sql也没有sum