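【TiDB environment】Test environment
【TiDB version】6.5.2
【Reproduction path】The TiDB cluster ran out of disk space, and the cluster was then restarted
【Problem encountered: symptoms and impact】
tikv.log keeps printing the following log lines, and the cluster is unavailable: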
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261033: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261035: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261031: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261029: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261025: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261027: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261023: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261021: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261019: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261017: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261015: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261013: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261011: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261009: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261007: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
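To get a feel for how widespread these warnings are, the log can be summarized per region instead of read line by line. The following is only a rough sketch: the function name `summarize_tso_timeouts` is made up for illustration, and the regular expression assumes the warning text looks exactly like the excerpt above.

```python
import re
from collections import Counter

# Assumption: the warning text matches the tikv.log excerpt in this post.
# The capture group is the region ID.
WARN_RE = re.compile(
    r"failed to update max timestamp for region (\d+): .*get timestamp timeout"
)


def summarize_tso_timeouts(lines):
    """Count 'get timestamp timeout' warnings per region ID."""
    counts = Counter()
    for line in lines:
        match = WARN_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

It could be used by opening tikv.log and passing the file object to `summarize_tso_timeouts`. `len(result)` then gives the number of affected regions and `sum(result.values())` the total number of warnings.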
First: cleaned up the TiKV logs to free disk space.
Attempt 1: rebuild PD with pd-recover
– Get the cluster ID
cat /data/tidb-deploy/pd-2379/log/pd.log | grep 'init cluster id'
[2024/03/31 13:25:37.607 +08:00] [INFO] [server.go:384] ["init cluster id"] [cluster-id=7166168149192488053]
– Get the largest allocated ID
[webapp@lg-test-shuabao log]$ cat /data/tidb-deploy/pd-2379/log/pd.log | grep "idAllocator allocates a new id" | awk -F'=' '{print $2}' | awk -F']' '{print $1}' | sort -r -n | head -n 1
1417000
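For reference, the grep/awk pipeline above can also be written as a small Python function, which makes the parsing step easier to check. This is just a sketch: `max_allocated_id` is a hypothetical helper name, and it deliberately mirrors the awk logic (text after the first `=`, cut at the next `]`) rather than assuming a particular field name in pd.log.

```python
def max_allocated_id(lines):
    """Return the largest ID from 'idAllocator allocates a new id' lines, or None."""
    ids = []
    for line in lines:
        if "idAllocator allocates a new id" not in line:
            continue
        # Same splitting as the awk pipeline: take the text after the
        # first '=' and cut it at the next ']'.
        parts = line.split("=")
        if len(parts) < 2:
            continue
        value = parts[1].split("]")[0].strip()
        if value.isdigit():
            ids.append(int(value))
    return max(ids) if ids else None
```

As far as I understand the pd-recover documentation, the value passed to `-alloc-id` should be larger than the largest ID that was ever allocated, so it may be worth adding a safety margin on top of whatever this returns.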
Rebuild PD
./pd-recover -endpoints http://127.0.0.1:2379 -cluster-id 7166168149192488053 -alloc-id 1417000
The error persists:
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261033: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
Fetching a TSO as described in the following document works without errors:
https://docs.pingcap.com/zh/tidb/stable/tso#tidb-中的-timestamp-oracle-tso
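When checking a TSO by hand, it can help to decode it and compare the physical part with the wall clock. According to the TSO documentation, a TSO is a 64-bit integer whose low 18 bits are a logical counter and whose remaining high bits are a physical timestamp in milliseconds. A minimal sketch under that assumption (the names `parse_tso` and `tso_to_datetime` are illustrative helpers, not part of any TiDB tool):

```python
from datetime import datetime, timezone

LOGICAL_BITS = 18  # low 18 bits of a TSO hold the logical counter


def parse_tso(tso):
    """Split a TSO into (physical milliseconds since epoch, logical counter)."""
    physical_ms = tso >> LOGICAL_BITS
    logical = tso & ((1 << LOGICAL_BITS) - 1)
    return physical_ms, logical


def tso_to_datetime(tso):
    """Convert the physical part of a TSO to a timezone-aware UTC datetime."""
    physical_ms, _ = parse_tso(tso)
    return datetime.fromtimestamp(physical_ms / 1000, tz=timezone.utc)
```

If the decoded time is far from the current time on the PD host, that would point toward a clock or TSO allocation problem rather than a TiKV-side issue.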
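Attempt 2
Tried a patch-version upgrade from 6.5.2 to 6.5.8 and restarted the cluster; the problem remains.
After starting the cluster, the TiKV nodes flood the log with: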
[2024/03/31 23:10:28.240 +08:00] [WARN] [pd.rs:1707] ["failed to update max timestamp for region 261007: Pd(Other("[components/pd_client/src/client.rs:981]: get timestamp timeout"))"]
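【Resource configuration】Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of that page
【Attachments: screenshots/logs/monitoring】
TiDB, PD, and TiKV are deployed together on the same virtual machine.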