High load on TiFlash node

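[TiDB environment] Production
[TiDB version] 5.4.3
The load on the TiFlash node spiked, and the cause is high read IO. To avoid affecting the business, I set the TiFlash replica count of every table to 0. The node load is still very high, though, and restarting neither the service nor the server fixes it.

The relevant logs are as follows: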
root@ /data/tidb-deploy/tiflash-9000/log#tail tiflash_cluster_manager.log

[2024/06/18 15:41:00.355 +08:00] [INFO] [etcd.client] [Try to init master success, ttl: 60, create new key: /tiflash/cluster/leader]

[2024/06/18 15:41:00.355 +08:00] [INFO] [TiFlashManager] [After init, become master]

[2024/06/18 15:41:00.404 +08:00] [INFO] [TiFlashManager] [all replicas are available at global schema version 20469]

[2024/06/18 15:51:13.531 +08:00] [INFO] [etcd.client] [Try to init master success, ttl: 60, create new key: /tiflash/cluster/leader]

[2024/06/18 15:51:13.531 +08:00] [INFO] [TiFlashManager] [After init, become master]

[2024/06/18 15:51:13.580 +08:00] [INFO] [TiFlashManager] [all replicas are available at global schema version 20469]

[2024/06/18 15:54:19.227 +08:00] [INFO] [etcd.client] [Try to init master success, ttl: 60, create new key: /tiflash/cluster/leader]

[2024/06/18 15:54:19.228 +08:00] [INFO] [TiFlashManager] [After init, become master]

[2024/06/18 15:56:19.876 +08:00] [INFO] [TiFlashManager] [all replicas are available at global schema version 20469]

root@ /data/tidb-deploy/tiflash-9000/log#tail tiflash_error.log

[2024/06/18 15:55:57.859 +08:00] [ERROR] [] ["pingcap.pd:Send TsoRequest failed"] [thread_id=37]

[2024/06/18 15:55:57.869 +08:00] [ERROR] [] ["pingcap.pd:get safe point failed: 4: Deadline Exceeded"] [thread_id=38]

[2024/06/18 15:55:57.894 +08:00] [WARN] [] ["pd/oracle:update ts error: Exception: Send TsoRequest failed"] [thread_id=37]

[2024/06/18 15:55:57.896 +08:00] [WARN] [TCPHandler.cpp:69] ["TCPHandler:Client has not sent any data."] [thread_id=39]

[2024/06/18 15:56:11.870 +08:00] [ERROR] [] ["pingcap.pd:get safe point failed: 4: Deadline Exceeded"] [thread_id=38]

[2024/06/18 15:56:11.876 +08:00] [ERROR] [] ["pingcap.pd:Send TsoRequest failed"] [thread_id=37]

[2024/06/18 15:56:14.889 +08:00] [WARN] [] ["pd/oracle:update ts error: Exception: Send TsoRequest failed"] [thread_id=37]

[2024/06/18 15:56:36.096 +08:00] [WARN] [StorageConfigParser.cpp:215] ["Application:The configuration \"path\" is deprecated. Check [storage] section for new style."] [thread_id=1]

[2024/06/18 15:56:51.573 +08:00] [WARN] [TCPHandler.cpp:69] ["TCPHandler:Client has not sent any data."] [thread_id=26]

[2024/06/18 15:57:26.855 +08:00] [ERROR] [] ["pingcap.pd:Receive TsoResponse failed"] [thread_id=27]

root@ /data/tidb-deploy/tiflash-9000/log#tail tiflash.log

[2024/06/18 15:57:28.874 +08:00] [DEBUG] [] ["grpc:/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tics/contrib/grpc/src/core/lib/iomgr/tcp_posix.cc, line number: 1261, log msg : cannot set inq fd=1311 errno=92"] [thread_id=19]

[2024/06/18 15:57:28.874 +08:00] [DEBUG] [] ["grpc:/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tics/contrib/grpc/src/core/lib/iomgr/tcp_posix.cc, line number: 1261, log msg : cannot set inq fd=1312 errno=92"] [thread_id=19]

[2024/06/18 15:57:28.874 +08:00] [DEBUG] [] ["grpc:/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tics/contrib/grpc/src/core/lib/iomgr/tcp_posix.cc, line number: 1261, log msg : cannot set inq fd=1313 errno=92"] [thread_id=19]

[2024/06/18 15:57:28.874 +08:00] [DEBUG] [] ["grpc:/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tics/contrib/grpc/src/core/lib/iomgr/tcp_posix.cc, line number: 1261, log msg : cannot set inq fd=1314 errno=92"] [thread_id=19]

[2024/06/18 15:57:28.874 +08:00] [DEBUG] [] ["grpc:/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tics/contrib/grpc/src/core/lib/iomgr/tcp_posix.cc, line number: 1261, log msg : cannot set inq fd=1315 errno=92"] [thread_id=19]

[2024/06/18 15:57:28.874 +08:00] [DEBUG] [] ["grpc:/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tics/contrib/grpc/src/core/lib/iomgr/tcp_posix.cc, line number: 1261, log msg : cannot set inq fd=1316 errno=92"] [thread_id=19]

[2024/06/18 15:57:28.874 +08:00] [DEBUG] [] ["grpc:/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tics/contrib/grpc/src/core/lib/iomgr/tcp_posix.cc, line number: 1261, log msg : cannot set inq fd=1317 errno=92"] [thread_id=19]

[2024/06/18 15:57:34.884 +08:00] [INFO] [RateLimiter.cpp:690] ["IOLimitTuner:limiter 0 write 0 read 0 NOT need to tune."] [thread_id=17]

[2024/06/18 15:57:38.849 +08:00] [INFO] [] ["WaitCheckRegionReady:28714 regions need to fetch latest commit-index in next round, sleep for 5s"] [thread_id=1]

[2024/06/18 15:57:44.872 +08:00] [ERROR] [] ["pingcap.tikv:Get Failed4: Deadline Exceeded"] [thread_id=28]

root@ /data/tidb-deploy/tiflash-9000/log#tail tiflash_stderr.log

Logging debug to /data/tidb-deploy/tiflash-9000/log/tiflash.log

Logging errors to /data/tidb-deploy/tiflash-9000/log/tiflash_error.log

Logging debug to /data/tidb-deploy/tiflash-9000/log/tiflash.log

Logging errors to /data/tidb-deploy/tiflash-9000/log/tiflash_error.log

Logging debug to /data/tidb-deploy/tiflash-9000/log/tiflash.log

Logging errors to /data/tidb-deploy/tiflash-9000/log/tiflash_error.log

Logging debug to /data/tidb-deploy/tiflash-9000/log/tiflash.log

Logging errors to /data/tidb-deploy/tiflash-9000/log/tiflash_error.log

Logging debug to /data/tidb-deploy/tiflash-9000/log/tiflash.log

Logging errors to /data/tidb-deploy/tiflash-9000/log/tiflash_error.log

root@ /data/tidb-deploy/tiflash-9000/log#tail tiflash_tikv.log

[2024/06/18 15:57:28.871 +08:00] [WARN] [future.rs:24] ["paired_future_callback: Failed to send result to the future rx, discarded."]

[2024/06/18 15:57:28.871 +08:00] [WARN] [future.rs:24] ["paired_future_callback: Failed to send result to the future rx, discarded."]

[2024/06/18 15:57:28.871 +08:00] [WARN] [future.rs:24] ["paired_future_callback: Failed to send result to the future rx, discarded."]

[2024/06/18 15:57:28.871 +08:00] [WARN] [future.rs:24] ["paired_future_callback: Failed to send result to the future rx, discarded."]

[2024/06/18 15:57:28.871 +08:00] [WARN] [future.rs:24] ["paired_future_callback: Failed to send result to the future rx, discarded."]

[2024/06/18 15:57:28.871 +08:00] [WARN] [future.rs:24] ["paired_future_callback: Failed to send result to the future rx, discarded."]

[2024/06/18 15:57:28.871 +08:00] [WARN] [future.rs:24] ["paired_future_callback: Failed to send result to the future rx, discarded."]

[2024/06/18 15:57:28.871 +08:00] [WARN] [future.rs:24] ["paired_future_callback: Failed to send result to the future rx, discarded."]

[2024/06/18 15:57:28.871 +08:00] [WARN] [future.rs:24] ["paired_future_callback: Failed to send result to the future rx, discarded."]

[2024/06/18 15:57:42.899 +08:00] [WARN] [store.rs:859] ["[store 128] handle 1 pending peers include 1 ready, 0 entries, 0 messages and 0 snapshots"] [takes=32976]

Are your PD nodes OK? I see a lot of PD-related errors in there.

Could the server firewall have been turned on without you noticing?
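One rough way to test the firewall theory from the TiFlash host is to try opening a TCP connection to each PD endpoint. The sketch below is only an illustration: `can_connect` is a hypothetical helper, and the PD address and port must come from your own cluster topology (2379 is PD's default client port, but your deployment may differ).

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        # create_connection raises OSError (including timeouts and refusals) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example call (the address is a placeholder, not taken from this thread):
#   can_connect("<pd-host>", 2379)
```

A `False` result only says the TCP handshake failed or timed out; it does not tell a firewall drop apart from a PD process that is down, so it is worth checking the PD status as well.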

Is it still syncing data?
With IO this high, is it running on HDD storage?
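For the HDD question, Linux exposes a per-device flag at `/sys/block/<device>/queue/rotational` (`1` for rotational disks, `0` otherwise). A minimal sketch for reading it follows; `is_rotational` is a hypothetical helper, and the device name (for example `sda`) is an assumption you would replace with the device backing the TiFlash data directory.

```python
from pathlib import Path


def is_rotational(device: str, sys_block: str = "/sys/block") -> bool:
    """Return True if the kernel reports the block device as rotational (HDD-like)."""
    flag = Path(sys_block) / device / "queue" / "rotational"
    # The file holds a single character: "1" for rotational, "0" for non-rotational.
    return flag.read_text().strip() == "1"
```

On virtual machines or behind RAID controllers the flag may not reflect the physical media, so treat the result as a hint rather than proof.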

Are the TiKV nodes and PD nodes in a normal state?

The errors all look like it cannot reach PD.
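To back that impression with numbers, one could tally the ERROR and WARN lines by the component prefix that precedes the first colon in the message (such as `pingcap.pd` or `TCPHandler`). The sketch below assumes the line layout seen in the logs posted above; `LINE_RE` and `tally` are hypothetical names, and lines that do not fit the pattern are simply skipped.

```python
import re
from collections import Counter

# Matches: [timestamp] [LEVEL] [source] ["component:message..."]
# Accepts a straight or a curly opening quote, since forum rendering may alter quotes.
LINE_RE = re.compile(
    r'^\[[^\]]+\] \[(?P<level>[A-Z]+)\] \[[^\]]*\] \[["\u201c](?P<component>[^:]+):'
)


def tally(lines, levels=("ERROR", "WARN")):
    """Count log lines per (level, component) for the given levels."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line.strip())
        if m and m.group("level") in levels:
            counts[(m.group("level"), m.group("component"))] += 1
    return counts


# Example usage against a log file:
#   with open("tiflash_error.log", encoding="utf-8") as f:
#       for key, n in tally(f).most_common():
#           print(key, n)
```

If most of the counts land on PD-related components, that supports looking at PD health and network reachability before anything else.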

Is it HDD storage?

That is a lot of errors. Communication with PD has probably broken down.