Problem background:
This instance has one special setting: raft-engine is disabled. Its other parameters are about the same as our other instances.
We have upgraded many instances across many versions before and never hit anything like this.
The symptom this time:
After upgrading to 7.5.6, a lot of requests got stuck. Checking the SQL, every query was doing a full table scan.
As follows:
The TiKV logs report a large number of CDC-related errors; we do run a CDC service.
Throughout the process, upgrading TiKV went fine; the problem appears to start after upgrading tidb-server.
We suspect the tidb-server did not load the statistics.
From the upgrade logs, tidb-server successfully loaded the schema,
but the stats load looks like it failed.
There is also this error:
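For reference (not from the original logs), one way to confirm that stats were not loaded is to check whether plans fall back to pseudo statistics, and when each table's stats were last updated; `db.tbl` below is a placeholder:

```sql
-- Plans built without loaded statistics are tagged "stats:pseudo"
-- in the operator info column of EXPLAIN output.
EXPLAIN SELECT * FROM db.tbl WHERE id = 1;

-- Check the last stats update time and row counts for a table
-- (db / tbl are placeholder names):
SHOW STATS_META WHERE Db_name = 'db' AND Table_name = 'tbl';
```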
[handle_hist.go:125] [“SyncWaitStatsLoad meets error”] [errors="["sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load took too long to return","sync load stats channel is full and timeout sending task to channel","sync load stats channel is full and timeout sending task to channel","sync load stats channel is full and timeout sending task to channel","sync load took too long to return","sync load stats channel is full and timeout
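Both messages in that log ("sync load took too long to return" and "sync load stats channel is full") come from the synchronous stats-loading path: each query waits up to `tidb_stats_load_sync_wait` milliseconds for column stats to load. A sketch of how one might inspect and loosen these settings (the value 2000 is an arbitrary example, not a recommendation from the thread):

```sql
-- Give queries a larger budget to wait for synchronously loaded stats
-- (milliseconds; the default is 100).
SET GLOBAL tidb_stats_load_sync_wait = 2000;

-- With tidb_stats_load_pseudo_timeout = ON, a query that still times out
-- falls back to pseudo stats, which can produce full-table-scan plans.
SHOW VARIABLES LIKE 'tidb_stats_load%';
```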
The instance has several partitioned tables; partition counts are as follows:
The relevant stats parameters are:
| information_schema_stats_expiry | 86400 |
| innodb_stats_auto_recalc | 1 |
| innodb_stats_method | nulls_equal |
| innodb_stats_on_metadata | 0 |
| innodb_stats_persistent | ON |
| innodb_stats_persistent_sample_pages | 20 |
| innodb_stats_sample_pages | 8 |
| innodb_stats_transient_sample_pages | 8 |
| myisam_stats_method | nulls_unequal |
| tidb_auto_build_stats_concurrency | 1 |
| tidb_build_sampling_stats_concurrency | 2 |
| tidb_build_stats_concurrency | 4 |
| tidb_enable_async_merge_global_stats | OFF |
| tidb_enable_extended_stats | OFF |
| tidb_enable_historical_stats | ON |
| tidb_enable_historical_stats_for_capture | OFF |
| tidb_enable_pseudo_for_outdated_stats | OFF |
| tidb_historical_stats_duration | 168h0m0s |
| tidb_merge_partition_stats_concurrency | 1 |
| tidb_plan_cache_invalidation_on_fresh_stats | OFF |
| tidb_skip_missing_partition_stats | ON |
| tidb_stats_cache_mem_quota | 0 |
| tidb_stats_load_pseudo_timeout | ON |
| tidb_stats_load_sync_wait | 100 |
It was caused by the timeouts that occurred during the upgrade.
Was it fixed after running ANALYZE?
Yes, everything returned to normal after ANALYZE.
Then it is most likely related to that timeout error; you can manually ANALYZE the other tables as well.
All tables were analyzed at the time. It's just that ANALYZE is a fairly long process, and it did impact the workload while it ran.
Right, it does impact the workload. We always run it late at night, prioritizing large tables with low stats health.
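To find which tables to prioritize for off-peak ANALYZE, one option is to sort by stats health; the threshold 90 below is an arbitrary example and `db.tbl` is a placeholder:

```sql
-- List tables whose statistics health has dropped below 90:
SHOW STATS_HEALTHY WHERE Healthy < 90;

-- Then analyze a specific table during a low-traffic window:
ANALYZE TABLE db.tbl;
```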
Since v5.3, the default value of tidb_analyze_version
has been 2 — were any statistics-related parameters modified here?
It was probably upgraded from a lower version; in that case this parameter keeps its pre-upgrade value rather than switching to the new default.
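A quick way to check which value the cluster actually ended up with after the upgrade:

```sql
-- Clusters upgraded from versions before v5.3 may keep tidb_analyze_version = 1
-- instead of adopting the new default of 2.
SELECT @@global.tidb_analyze_version;
```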