TiFlash: a simple "limit 1" query fails with insufficient memory

xxxxxxxx · February 20, 2025, 06:05

TiDB version: 6.1.7

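Background: our cluster has four TiDB nodes. Three of them serve normal business reads and writes through TiKV. The fourth serves big-data extraction and analysis, and all of its traffic goes to TiFlash, so the two workloads are effectively physically isolated. We set it up this way because, when the optimizer was left to choose, the extraction queries (range queries returning tens of thousands of rows each) were sent to TiKV and slowed down normal business reads and writes (latency increased). That is why we deployed a dedicated TiDB node for big-data extraction and analysis.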

Symptom: we recently found that one class of limit 1 queries from big-data analysis fails with an out-of-memory error.

The TiFlash configuration is as follows:

  tiflash:
    log.file.max-days: 30
    profiles.default.max_memory_usage: 4294967296
    profiles.default.max_memory_usage_for_all_queries: 8589934592

The MPP settings are at their defaults.

mysql> show variables like '%mpp%';
+------------------------------------------+-------+
| Variable_name                            | Value |
+------------------------------------------+-------+
| tidb_allow_mpp                           | ON    |
| tidb_enforce_mpp                         | OFF   |
| tidb_mpp_store_fail_ttl                  | 60s   |
| tidb_opt_mpp_outer_join_fixed_build_side | OFF   |
+------------------------------------------+-------+
4 rows in set (0.00 sec)

mysql>

Connecting to that TiDB node and running the query gives the following:

ERROR 1105 (HY000): other error for mpp stream: DB::Exception: Memory limit (for query) exceeded: would use 4.00 GiB (attempt to allocate chunk of 10512061 bytes), maximum: 4.00 GiB: (while reading from DTFile: /work/tidb-oltp-145-v6.1.7/data/tiflash-34145/data/t_77/stable/dmf_1252)
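
The 4.00 GiB ceiling in this error appears to be the per-query limit from the configuration above, since profiles.default.max_memory_usage is given in bytes. A quick conversion, as a sketch using only the two values from the config:

```sql
-- Convert the configured byte limits to GiB (1 GiB = 1024^3 bytes).
SELECT 4294967296 / POW(1024, 3) AS max_memory_usage_gib,
       8589934592 / POW(1024, 3) AS max_memory_usage_for_all_queries_gib;
-- Evaluates to 4 and 8
```

So the query is being cancelled by the configured per-query limit, not necessarily because the host itself ran out of memory.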

The TiFlash log reports the following error:

[2025/02/19 14:46:02.378 +08:00] [INFO] [MPPTaskStatistics.cpp:127] ["mpp_task_tracing:MPP<query:456116813736706056,task:1> {\"query_tso\":456116813736706056,\"task_id\":1,\"is_root\":true,\"sender_executor_id\":\"ExchangeSender_20\",\"
executors\":[{\"id\":\"ExchangeSender_20\",\"type\":\"ExchangeSender\",\"children\":[\"Limit_19\"],\"outbound_rows\":0,\"outbound_blocks\":0,\"outbound_bytes\":0,\"execution_time_ns\":0,\"partition_num\":1,\"sender_target_task_ids\":[-1
],\"exchange_type\":\"PassThrough\",\"connection_details\":[{\"tunnel_id\":\"tunnel1+-1\",\"sender_target_task_id\":-1,\"sender_target_host\":\"10.45.164.116:43480\",\"is_local\":false,\"packets\":0,\"bytes\":0}]},{\"id\":\"Limit_19\",\"t
ype\":\"Limit\",\"children\":[\"Selection_18\"],\"outbound_rows\":0,\"outbound_blocks\":0,\"outbound_bytes\":0,\"execution_time_ns\":0},{\"id\":\"Selection_18\",\"type\":\"Selection\",\"children\":[\"TableFullScan_17\"],\"outbound_rows\
":0,\"outbound_blocks\":0,\"outbound_bytes\":0,\"execution_time_ns\":0},{\"id\":\"TableFullScan_17\",\"type\":\"TableScan\",\"children\":[],\"outbound_rows\":0,\"outbound_blocks\":0,\"outbound_bytes\":0,\"execution_time_ns\":0,\"connect
ion_details\":[{\"is_local\":true,\"packets\":0,\"bytes\":0},{\"is_local\":false,\"packets\":0,\"bytes\":0}]}],\"host\":\"10.45.165.143:36145\",\"task_init_timestamp\":1739947562184573000,\"task_start_timestamp\":1739947562209934000,\"ta
sk_end_timestamp\":1739947562378048000,\"compile_start_timestamp\":1739947562185304000,\"compile_end_timestamp\":1739947562209834000,\"read_wait_index_start_timestamp\":1739947562186673000,\"read_wait_index_end_timestamp\":1739947562193
046000,\"local_input_bytes\":0,\"remote_input_bytes\":0,\"output_bytes\":0,\"status\":\"CANCELLED\",\"error_message\":\"DB::Exception: Memory limit (for query) exceeded: would use 4.01 GiB (attempt to allocate chunk of 10078218 bytes), 
maximum: 4.00 GiB: (while reading from DTFile: /work/tidb-oltp-145-v6.1.7/data/tiflash-34145/data/t_77/stable/dmf_1030)\",\"working_time\":0,\"m

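As shown, a simple select ... limit 1 reports insufficient memory, but the query returns a result once a where condition is added. TiFlash has no concept of indexes, so in principle both SQL statements are full table scans, and I don't understand why they behave differently. Is this a bug or expected behavior? If it is expected, what is the cause?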

有猫万事足 · February 20, 2025, 06:21

Even if this is not a bug, it should be something that can be optimized.

solotzg-PingCAP · February 20, 2025, 07:39

Could you confirm whether the TiFlash process's actual memory usage is excessively high?

xxxxxxxx · February 20, 2025, 08:22

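No. We also suspected that excessive TiFlash memory usage was the cause, but after reloading the TiFlash node and retesting, the result was the same. The problem also reproduces reliably after restarting the TiFlash node.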

solotzg-PingCAP · February 20, 2025, 08:40

If you set the session variable tidb_allow_mpp=off before running the SQL (disabling MPP and using the cop protocol instead), does the query run to completion?

The-Fallen-Angel · February 20, 2025, 10:01

select * from table limit 1 reports insufficient memory when using MPP.
select * from table where a = '' limit 1 runs normally when using MPP.
Is that what you mean?

The-Fallen-Angel · February 21, 2025, 00:24

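TiFlash supports optimizations such as predicate pushdown and projection pushdown. It filters and pre-aggregates data at the TiFlash layer, which reduces the amount of data that has to be transferred to the client. It does not use indexes directly to speed up queries, but by cutting unnecessary data scans it achieves a similar performance benefit.
In MPP mode, data is split into multiple shards that are processed in parallel on different nodes. Each node needs enough memory to process the shards assigned to it. If a shard is too large or the load is unbalanced across nodes, some nodes may run out of memory.
Intermediate results are cached in memory, and if they grow too large they can cause OOM.
Several other aspects of memory usage can also cause problems.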

solotzg-PingCAP · February 21, 2025, 03:44

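Is the where condition here a comparison on the primary key (shown as tablerangescan in the explain output)? If you switch to a condition unrelated to the primary key, is the problem still triggered?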
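
One way to answer this is to compare the plans of the two statements. The sketch below assumes a hypothetical table t with a column a (neither name comes from the thread) and a session that already reads from TiFlash:

```sql
-- `t` and `a` are hypothetical placeholders for the real table and column.
-- Look at the operator names in the `id` column (TableFullScan vs. TableRangeScan)
-- and at the `task` column, which indicates whether the plan runs as mpp or cop.
EXPLAIN SELECT * FROM t LIMIT 1;
EXPLAIN SELECT * FROM t WHERE a = '' LIMIT 1;
```

If both plans show TableFullScan under an mpp task, the difference in behavior is unlikely to come from the scan type alone.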

xxxxxxxx · February 24, 2025, 07:30

A condition on an ordinary (non-primary-key) index column also reports the out-of-memory error.

solotzg-PingCAP · February 24, 2025, 08:50

Does this achieve what you expect?

solotzg-PingCAP · February 24, 2025, 09:19

Is the end goal of this case to extract data from TiFlash to downstream systems using SQL such as "select * from ? limit ?" or "select * from ?"?

tidb_isolation_read_engines is currently already set to tiflash.

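select * from ? ... limit ? uses the MPP protocol by default. TiFlash has no indexes: a where condition that filters on the primary key uses tablerangescan, and anything else uses tablefullscan. In a tablefullscan task, TiFlash reads the relevant data on each node as fast as it can, which may push memory usage over the limit for a short time.
select * from ? uses the cop protocol by default, and each cop task processes only a small portion of the data.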

Initial suggestions:

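[opt1] For select * from ? limit ?, set tidb_allow_mpp to off at the session level to disable the MPP protocol, so that the SQL runs over the cop protocol.
[opt2] Keep tidb_allow_mpp on at the session level and set tidb_max_tiflash_threads to a small value (for example 1). This limits the concurrency with which TiFlash executes MPP tasks and lowers peak memory usage.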
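
The two options could look roughly like this at the session level. This is a sketch only: the table name t is a hypothetical placeholder, and the two options are alternatives, not steps to run together.

```sql
-- [opt1] Disable MPP for this session so the query runs over the cop protocol.
SET SESSION tidb_allow_mpp = OFF;
SELECT * FROM t LIMIT 1;  -- `t` is a hypothetical table name

-- [opt2] Keep MPP enabled but cap TiFlash request concurrency for this session.
SET SESSION tidb_allow_mpp = ON;
SET SESSION tidb_max_tiflash_threads = 1;
SELECT * FROM t LIMIT 1;
```

Because both settings are session-scoped here, they affect only the connection that runs the extraction query.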

xxxxxxxx · February 24, 2025, 09:38

After testing:
set tidb_allow_mpp = off;
set global tidb_max_tiflash_threads = 1
Both of these operations work.
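
Note that the second statement tested here uses SET GLOBAL, while the suggestion was a session-level setting. If the intent is to limit only the extraction connection, something like the following may be closer. It is a sketch, and the value -1 is an assumption based on the documented default of tidb_max_tiflash_threads, so check it against your version first:

```sql
-- Inspect the current global and session values.
SHOW GLOBAL VARIABLES LIKE 'tidb_max_tiflash_threads';
SHOW SESSION VARIABLES LIKE 'tidb_max_tiflash_threads';

-- Limit only the current extraction session.
SET SESSION tidb_max_tiflash_threads = 1;

-- Assumption: -1 is the default; restore the global value if the
-- cluster-wide limit is not wanted.
SET GLOBAL tidb_max_tiflash_threads = -1;
```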

system · March 3, 2025, 09:39

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.