TiDB frequently logs the warning: Hist for column should already be loaded as sync but not found.

[TiDB environment] Production
[TiDB version] v7.5.5
[Problem encountered: symptoms and impact]

[2025/02/13 11:28:55.526 +08:00] [WARN] [handle_hist.go:127] ["SyncWaitStatsLoad meets error"] [errors="[\"sync load took too long to return\"]"]
[2025/02/13 11:28:55.526 +08:00] [WARN] [handle_hist.go:127] ["SyncWaitStatsLoad meets error"] [errors="[\"sync load took too long to return\"]"]
[2025/02/13 11:28:55.526 +08:00] [WARN] [plan_stats.go:200] ["SyncWaitStatsLoad failed"] [error="sync load stats timeout"]
[2025/02/13 11:28:55.526 +08:00] [WARN] [handle_hist.go:127] ["SyncWaitStatsLoad meets error"] [errors="[\"sync load took too long to return\"]"]
[2025/02/13 11:28:55.526 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.526 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.526 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.526 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.527 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.526 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.527 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.527 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.527 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.527 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.527 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.527 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.527 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.527 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.527 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.527 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.527 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.527 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.527 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]
[2025/02/13 11:28:55.527 +08:00] [WARN] [column.go:170] ["Hist for column should already be loaded as sync but not found."] [16=error_code]

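Apart from the warnings themselves, these logs have no effect on SQL execution, but the sheer volume of log output takes up a lot of storage (the related SQL statements run at a high frequency).
The column type is bit(32). A simplified table schema follows: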

create table t_answer_card
(
    answer_card_id              bigint auto_increment comment '答题卡' primary key,
    error_code                  bit(32)                                             null
) comment '答题卡' charset = utf8mb4;

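This looks like an error raised while collecting statistics. Check in the blackbox monitoring whether the ping latency from TiDB to the TiKV components is high.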

It is not high; the average is around 300 µs.


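The synchronous load of missing statistics timed out while the SQL was executing.
A few points need to be confirmed:
1. Does the cluster have a large number of tables and high memory usage? That can leave the statistics not fully loaded into memory.
2. Is the cluster under heavy load? That can make the load operation take a long time.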

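If the current environment has not seen any execution plan changes caused by missing statistics, consider setting this parameter (tidb_stats_load_sync_wait) to 0 to turn the feature off. The downside is that some statistics may be missing when a plan is built.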

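The whole cluster has close to 2,000 tables across all databases. The problem does not seem related to memory or cluster load: the log also appears when both memory usage and load are low.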


What is the tidb_stats_cache_mem_quota variable set to, and how much memory does the TiDB server have?


Also, what is the parameter mentioned above set to?

tidb_stats_cache_mem_quota is 0.
tidb_stats_load_sync_wait uses the default value of 100.
The TiDB server has 32 cores and 256 GB of memory.
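
For anyone following along, the values reported above can be read directly from a SQL session. This is a minimal sketch, assuming a MySQL-compatible client connected to the TiDB server; the variable names are the ones mentioned in this thread.

```sql
-- Memory quota for the statistics cache (reported as 0 in this thread).
SHOW GLOBAL VARIABLES LIKE 'tidb_stats_cache_mem_quota';

-- Settings for synchronous statistics loading; the pattern matches
-- tidb_stats_load_sync_wait and any related tidb_stats_load_* variables.
SHOW GLOBAL VARIABLES LIKE 'tidb_stats_load%';
```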

Take a look at the panels under TiDB -> Statistics & Plan Management in the monitoring dashboards.

I have already set tidb_stats_load_sync_wait to 0 for now.

-- Increase the timeout for synchronous statistics loading (default 100 ms)
SET GLOBAL tidb_stats_load_sync_wait = 600; -- unit: milliseconds

-- Increase the statistics loading concurrency (default 5)
SET GLOBAL tidb_stats_load_concurrency = 10;

These parameters can be increased as appropriate.
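
One cross-check before tuning concurrency: the TiDB documentation describes the concurrency of synchronous statistics loading as a configuration-file item, stats-load-concurrency under [performance], rather than a system variable. The statement below is a hedged sketch for inspecting it; the LIKE pattern is an assumption about how the item appears in SHOW CONFIG output.

```sql
-- List TiDB configuration items whose names mention stats-load.
SHOW CONFIG WHERE type = 'tidb' AND name LIKE '%stats-load%';
```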

I'll give it a try.

Has the situation improved after increasing the statistics loading concurrency and the timeout?

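Increasing the concurrency and adjusting the timeout made no real difference. After setting tidb_stats_load_sync_wait to 0, the related logs disappeared.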


Setting tidb_stats_load_sync_wait to 0 switches to asynchronous loading, so the timeout warnings are no longer logged :joy:
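
The change discussed here can be written out in SQL. This is a minimal sketch: the variable name and the default of 100 come from this thread, and the millisecond unit is taken from the TiDB documentation. The two SET statements are alternatives, not a sequence.

```sql
-- Option A: disable synchronous loading. The optimizer no longer waits,
-- and missing statistics are loaded asynchronously.
SET GLOBAL tidb_stats_load_sync_wait = 0;

-- Option B: go back to the default wait of 100 ms, for example if
-- execution plans regress after switching to asynchronous loading.
SET GLOBAL tidb_stats_load_sync_wait = 100;

-- Confirm the value currently in effect.
SHOW GLOBAL VARIABLES LIKE 'tidb_stats_load_sync_wait';
```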


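Troubleshooting suggestions
1. Check the state of the statistics
Run SHOW STATS_META and SHOW STATS_HISTOGRAMS and confirm that the statistics for the target column are current (the Update_time column shows when they were last refreshed). If they lag behind, run ANALYZE TABLE manually to refresh them.
2. Monitor GC and load metrics
Use TiDB Dashboard to watch the GC Duration and Region Error metrics and make sure GC pressure is normal. If GC Duration stays above 1 minute, increase tidb_gc_life_time and tune the TiKV configuration.
3. Upgrade or hotfix
If a version defect is confirmed, prefer upgrading to v6.5.0 or later (which introduced a proactive synchronization mechanism for global statistics). For environments that cannot be upgraded right away, setting tidb_enable_fast_analyze=ON can temporarily bypass the sync check (this may affect execution plan quality).
4. Clean up stale statistics
For dropped columns or tables, run DROP STATS to clear leftover statistics so that the asynchronous threads do not try to load invalid data.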
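
Steps 1 and 4 above map onto a handful of statements. This is a sketch using the t_answer_card table and error_code column from the original post. The WHERE filters are illustrative, and DROP STATS is left commented out because it discards the table's statistics.

```sql
-- Step 1: when were the table's statistics last updated, and how many
-- rows have been modified since then?
SHOW STATS_META WHERE table_name = 't_answer_card';

-- Histogram metadata for the column named in the warning.
SHOW STATS_HISTOGRAMS
WHERE table_name = 't_answer_card' AND column_name = 'error_code';

-- Refresh the statistics if they look stale.
ANALYZE TABLE t_answer_card;

-- Step 4: remove a table's statistics (destructive; uncomment deliberately).
-- DROP STATS t_answer_card;
```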

:thinking: Switching to asynchronous loading by setting it to 0 carries some risk, doesn't it? It can leave statistics missing.