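【TiDB Environment】Production
【TiDB Version】v7.5.3
【Steps to Reproduce】The cluster uses local disks. The disks filled up, so the data was migrated.
【Problem: Symptoms and Impact】
After the migration, data in the other tables is normal, but for two fairly large tables (about 100 million rows) every query, insert, and update fails with ERROR 9001 (HY000): PD server timeout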
【Resource Configuration】
pd: 8c 24G, 3 replicas
kv: 8c 24G, 3 replicas
tidb: 8c 24G, 2 replicas
tiflash: 8c 32G, 3 replicas
【TiDB Operator Version】v1.4.5
【K8s Version】1.22.12
【Attachments: Screenshots/Logs/Monitoring】
server log:
[WARN] [session.go:2287] [“run statement failed”] [schemaVersion=2719] [error=“[tikv:9001]PD server timeout: “] [session=”{\n "currDBName": "",\n "id": 0,\n "status": 2,\n "strictMode": true,\n "user": null\n}”]
[2025/05/19 16:31:01.234 +08:00] [WARN] [backoff.go:179] [“pdRPC backoffer.maxSleep 10000ms is exceeded, errors:\nPD returned regions have gaps, startKey: "t\x80\x00\x00\x00\x00\x00\x05\x1c_r\x011\xf5\xed\xbc\x95\xac2\x80", endKey: "t\x80\x00\x00\x00\x00\x00\x05\x1c_r\xfb", limit: 128 at 2025-05-19T16:30:53.63639004+08:00\nPD returned regions have gaps, startKey: "t\x80\x00\x00\x00\x00\x00\x05\x1c_r\x011\xf5\xed\xbc\x95\xac2\x80", endKey: "t\x80\x00\x00\x00\x00\x00\x05\x1c_r\xfb", limit: 128 at 2025-05-19T16:30:56.231346226+08:00\nPD returned regions have gaps, startKey: "t\x80\x00\x00\x00\x00\x00\x05\x1c_r\x011\xf5\xed\xbc\x95\xac2\x80", endKey: "t\x80\x00\x00\x00\x00\x00\x05\x1c_r\xfb", limit: 128 at 2025-05-19T16:30:59.019954603+08:00\ntotal-backoff-times: 6, backoff-detail: pdRPC:6, maxBackoffTimeExceeded: true, maxExcludedTimeExceeded: false\nlongest sleep type: pdRPC, time: 10294ms”]
[2025/05/19 16:31:01.234 +08:00] [INFO] [tidb.go:285] [“rollbackTxn called due to ddl/autocommit failure”]
[2025/05/19 16:31:01.235 +08:00] [WARN] [session.go:2287] [“run statement failed”] [conn=4169200642] [session_alias=] [schemaVersion=2719] [error=“[tikv:9001]PD server timeout: “] [session=”{\n "currDBName": "",\n "id": 4169200642,\n "status": 2,\n "strictMode": true,\n "user": {\n "Username": "root",\n "Hostname": "10.32.9.235",\n "CurrentUser": false,\n "AuthUsername": "root",\n "AuthHostname": "%",\n "AuthPlugin": "mysql_native_password"\n }\n}”]
[2025/05/19 16:31:01.235 +08:00] [INFO] [conn.go:1131] [“command dispatched failed”] [conn=4169200642] [session_alias=] [connInfo=“id:4169200642, addr:10.32.9.235:65357 status:10, collation:utf8mb4_0900_ai_ci, user:root”] [command=Query] [status=“inTxn:0, autocommit:1”] [sql=“/* ApplicationName=DBeaver 24.3.2 - Main / SELECT akfa. FROM one_cycle_data_metas.autogt_key_frame_anno AS akfa LIMIT 0, 200”] [txn_mode=PESSIMISTIC] [timestamp=458134244796923905] [err=“[tikv:9001]PD server timeout: \ngithub.com/pingcap/errors.AddStack\n\t/root/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20240318064555-6bd07397691f/errors.go:178\ngithub.com/pingcap/errors.(*Error).GenWithStackByArgs\n\t/root/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20240318064555-6bd07397691f/normalize.go:175\ngithub.com/pingcap/tidb/pkg/store/driver/error.ToTiDBErr\n\t/workspace/source/tidb/pkg/store/driver/error/error.go:113\ngithub.com/pingcap/tidb/pkg/store/copr.(*RegionCache).SplitKeyRangesByLocationsWithoutBuckets\n\t/workspace/source/tidb/pkg/store/copr/region_cache.go:244\ngithub.com/pingcap/tidb/pkg/store/copr.buildBatchCopTasksCore\n\t/workspace/source/tidb/pkg/store/copr/batch_coprocessor.go:916\ngithub.com/pingcap/tidb/pkg/store/copr.buildBatchCopTasksForNonPartitionedTable\n\t/workspace/source/tidb/pkg/store/copr/batch_coprocessor.go:488\ngithub.com/pingcap/tidb/pkg/store/copr.(*MPPClient).ConstructMPPTasks\n\t/workspace/source/tidb/pkg/store/copr/mpp.go:80\ngithub.com/pingcap/tidb/pkg/planner/core.(*mppTaskGenerator).constructMPPTasksImpl\n\t/workspace/source/tidb/pkg/planner/core/fragment.go:578\ngithub.com/pingcap/tidb/pkg/planner/core.(*mppTaskGenerator).generateMPPTasksForFragment\n\t/workspace/source/tidb/pkg/planner/core/fragment.go:375\ngithub.com/pingcap/tidb/pkg/planner/core.(*mppTaskGenerator).generateMPPTasksForExchangeSender\n\t/workspace/source/tidb/pkg/planner/core/fragment.go:349\ngithub.com/pingcap/tidb/pkg/planner/core.(*mppTaskGenerator).generateMPPTasks\n\t/workspace/s
ource/tidb/pkg/planner/core/fragment.go:153\ngithub.com/pingcap/tidb/pkg/planner/core.GenerateRootMPPTasks\n\t/workspace/source/tidb/pkg/planner/core/fragment.go:121\ngithub.com/pingcap/tidb/pkg/executor/internal/mpp.(*localMppCoordinator).Execute\n\t/workspace/source/tidb/pkg/executor/internal/mpp/local_mpp_coordinator.go:722\ngithub.com/pingcap/tidb/pkg/executor.(*MPPGather).Open\n\t/workspace/source/tidb/pkg/executor/mpp_gather.go:115\ngithub.com/pingcap/tidb/pkg/executor/internal/exec.(*BaseExecutor).Open\n\t/workspace/source/tidb/pkg/executor/internal/exec/executor.go:168\ngithub.com/pingcap/tidb/pkg/executor.(*LimitExec).Open\n\t/workspace/source/tidb/pkg/executor/executor.go:1398\ngithub.com/pingcap/tidb/pkg/executor.(*ExecStmt).openExecutor\n\t/workspace/source/tidb/pkg/executor/adapter.go:1209\ngithub.com/pingcap/tidb/pkg/executor.(*ExecStmt).Exec\n\t/workspace/source/tidb/pkg/executor/adapter.go:545\ngithub.com/pingcap/tidb/pkg/session.runStmt\n\t/workspace/source/tidb/pkg/session/session.go:2416\ngithub.com/pingcap/tidb/pkg/session.(*session).ExecuteStmt\n\t/workspace/source/tidb/pkg/session/session.go:2275\ngithub.com/pingcap/tidb/pkg/server.(*TiDBContext).ExecuteStmt\n\t/workspace/source/tidb/pkg/server/driver_tidb.go:292\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).handleStmt\n\t/workspace/source/tidb/pkg/server/conn.go:2071\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).handleQuery\n\t/workspace/source/tidb/pkg/server/conn.go:1838\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).dispatch\n\t/workspace/source/tidb/pkg/server/conn.go:1325\ngithub.com/pingcap/tidb/pkg/server.(*clientConn).Run\n\t/workspace/source/tidb/pkg/server/conn.go:1098\ngithub.com/pingcap/tidb/pkg/server.(*Server).onConn\n\t/workspace/source/tidb/pkg/server/server.go:737\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650”]
[2025/05/19 16:31:01.474 +08:00] [WARN] [backoff.go:179] [“pdRPC backoffer.maxSleep 10000ms is exceeded, errors:\nregion not found for key "7480000000000000175F728000000000093332", encode_key: "7480000000000000FF175F728000000000FF0933320000000000FA" at 2025-05-19T16:30:54.79985438+08:00\nregion not found for key "7480000000000000175F728000000000093332", encode_key: "7480000000000000FF175F728000000000FF0933320000000000FA" at 2025-05-19T16:30:56.561061969+08:00\nregion not found for key "7480000000000000175F728000000000093332", encode_key: "7480000000000000FF175F728000000000FF0933320000000000FA" at 2025-05-19T16:30:58.973086253+08:00\ntotal-backoff-times: 7, backoff-detail: pdRPC:7, maxBackoffTimeExceeded: true, maxExcludedTimeExceeded: false\nlongest sleep type: pdRPC, time: 12440ms”]
[2025/05/19 16:31:01.474 +08:00] [ERROR] [distsql.go:1476] [“table reader fetch next chunk failed”] [error="[tikv:9001]PD server timeout: "]
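To work out which tables the keys in the log lines above belong to, the keys can be decoded by hand. The following is a minimal sketch, assuming the standard TiDB row key layout (`t` + 8-byte table ID + `_r` + row handle, with integers stored big-endian and the sign bit flipped). The function names are made up for illustration; the two sample keys are copied from the log.

```python
import struct

SIGN_MASK = 0x8000000000000000  # TiDB flips the sign bit so int64 values sort as bytes


def decode_table_id(raw_key: bytes) -> int:
    """Return the table ID from a raw (not PD-encoded) TiDB table key."""
    if len(raw_key) < 9 or raw_key[:1] != b"t":
        raise ValueError("not a table key")
    (encoded,) = struct.unpack(">Q", raw_key[1:9])
    return encoded ^ SIGN_MASK


def decode_int_row_key(raw_key: bytes) -> tuple:
    """Return (table_id, row_handle) for a row key with an integer handle."""
    table_id = decode_table_id(raw_key)
    if len(raw_key) != 19 or raw_key[9:11] != b"_r":
        raise ValueError("not an integer-handle row key")
    (encoded,) = struct.unpack(">Q", raw_key[11:19])
    handle = encoded ^ SIGN_MASK
    if handle >= 1 << 63:  # map back to a signed 64-bit value
        handle -= 1 << 64
    return table_id, handle


# startKey from the "PD returned regions have gaps" message
gap_start = b"t\x80\x00\x00\x00\x00\x00\x05\x1c_r\x011\xf5\xed\xbc\x95\xac2\x80"
# key from the "region not found for key" message (hex form)
missing = bytes.fromhex("7480000000000000175F728000000000093332")

print(decode_table_id(gap_start))   # 1308
print(decode_int_row_key(missing))  # (23, 602930)
```

The resulting table IDs can then be matched against the `TIDB_TABLE_ID` column of `information_schema.tables` to confirm which tables are affected.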
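Neither the kv nor the pd logs contain any ERROR entries.
After checking with AI and searching related community posts:
1. The cluster appears to have region gaps (holes). How can this be repaired?
2. What related information should be collected?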
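As part of collecting information, it may help to list exactly where the holes in the region key space are. The sketch below assumes a region list has already been exported from PD as a list of objects with hex-encoded `start_key` and `end_key` fields, where an empty string means the start or end of the key space. That field layout is an assumption about the export, and `find_region_gaps` is a hypothetical helper, not a PD tool.

```python
from typing import Dict, List, Tuple


def find_region_gaps(regions: List[Dict[str, str]]) -> List[Tuple[str, str]]:
    """Return (gap_start, gap_end) hex pairs not covered by any region.

    An empty string stands for the beginning or the end of the key space.
    """
    ordered = sorted(regions, key=lambda r: bytes.fromhex(r["start_key"]))
    gaps = []
    covered_until = b""  # everything below this key is already covered
    for region in ordered:
        start = bytes.fromhex(region["start_key"])
        if start > covered_until:
            gaps.append((covered_until.hex().upper(), start.hex().upper()))
        if region["end_key"] == "":
            return gaps  # this region extends to the end of the key space
        covered_until = max(covered_until, bytes.fromhex(region["end_key"]))
    # no region reached the end of the key space
    gaps.append((covered_until.hex().upper(), ""))
    return gaps


# Made-up sample data: one hole between 7480 and 7490
sample = [
    {"start_key": "", "end_key": "7480"},
    {"start_key": "7490", "end_key": "74A0"},
    {"start_key": "74A0", "end_key": ""},
]
print(find_region_gaps(sample))
```

Each reported pair is a key range that no region covers, which can be attached to the thread together with the TiDB, PD, and TiKV logs from around the migration.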