tikv:9002 TiKV server timeout

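To help us respond efficiently, please provide the following information when asking a question. Clearly described problems are answered first.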

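  • [TiDB version]: v4.0.0-rc.2
  • [Problem description]: [sql=commit] [txn_mode=PESSIMISTIC] [err="[tikv:9002]TiKV server timeout
    Server configuration: 3 TiKV nodes (8 vCPU, 16 GiB each), 1 TiDB node (8 vCPU, 16 GiB), 1 PD node (8 vCPU, 16 GiB)
    If your question concerns performance tuning or troubleshooting, please download the script and run it. Select all of the terminal output, then copy and paste it here.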

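  1. Check the TiKV logs for error messages to see whether any node is abnormal.
  2. Check the monitoring: in the Grafana Overview dashboard, see whether any machine is under high load.
  3. Check the TiKV monitoring for resource bottlenecks, for example the raftstore CPU being maxed out.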
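
For step 1, a minimal Python sketch for tallying ERROR entries in a TiKV log might look like the following. It assumes the raw log file uses the `[time] [LEVEL] [file:line] ["message"] ...` layout. The function name `count_errors` and the sample lines are illustrative only and are not taken from the poster's cluster.

```python
import re
from collections import Counter

# Assumed raw log layout: [time] [LEVEL] [file:line] ["message"] [field=value]...
# Simplification: messages containing ']' or '"' are not handled.
LINE_RE = re.compile(
    r'^\[(?P<time>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<source>[^\]]+)\] '
    r'\["?(?P<msg>[^"\]]*)"?\]'
)


def count_errors(lines):
    """Count ERROR entries, keyed by (source location, message)."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("level") == "ERROR":
            counts[(m.group("source"), m.group("msg"))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines modeled on the kind of errors TiKV reports.
    sample = [
        '[2021/01/06 02:19:28.000 +08:00] [ERROR] [transport.rs:163] ["send raft msg err"] [err="..."]',
        '[2021/01/06 02:19:29.000 +08:00] [ERROR] [transport.rs:163] ["send raft msg err"] [err="..."]',
        '[2021/01/06 02:19:30.000 +08:00] [INFO] [server.rs:1] ["heartbeat ok"]',
    ]
    for (source, msg), n in count_errors(sample).most_common():
        print(f"{n:4d}  {source}  {msg}")
```

Running this against each node's log and comparing the totals is one way to see whether a single node stands out.
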
instance            max      current
172.20.11.1:20180   22.1%    7.6%
172.20.11.2:20180   191.9%   105.8%
172.20.11.3:20180   173.8%   54.8%

From the information provided, it is not clear which metric these numbers belong to.

Could you provide the monitoring data from the Overview, TiKV-Details, PD, and TiDB dashboards for the time of the incident?


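Steps to export the monitoring data:

  1. Open the monitoring dashboard and select the time range.
  2. Open the Grafana dashboard (press d, then E, to expand the panels of all rows; wait for the page to finish loading).
  3. Use the tool at https://metricstool.pingcap.com/ to export the Grafana data as a snapshot.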

The servers have now been upgraded, but TiKV server timeout still occurs. The Grafana snapshots are attached: tidb-vankle-TiDB_2021-01-04T09_16_41.498Z.json.zip (873.4 KB) tidb-vankle-PD_2021-01-04T09_16_50.686Z.json.zip (699.7 KB) tidb-vankle-TiKV-Details_2021-01-04T09_16_15.068Z.json.zip (2.5 MB) tidb-vankle-Overview_2021-01-04T09_16_58.294Z.json.zip (686.1 KB)


The monitoring shows a fairly large number of TCP packet retransmissions. I suggest investigating the network as a possible cause.
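
To cross-check the retransmissions directly on a node, one option is to sample the kernel's TCP counters twice and compare them. The sketch below is Linux-only and reads `/proc/net/snmp`, whose `Tcp:` rows include the `OutSegs` and `RetransSegs` counters. The helper names and the 10-second interval are arbitrary choices for illustration, not something from this thread.

```python
import time


def parse_tcp_counters(snmp_text):
    """Return the Tcp counters from /proc/net/snmp text as a dict of ints."""
    rows = [line.split() for line in snmp_text.splitlines() if line.startswith("Tcp:")]
    if len(rows) < 2:
        raise ValueError("no Tcp header/value pair found")
    # The first Tcp: row holds the counter names, the second holds the values.
    header, values = rows[0][1:], rows[1][1:]
    return {name: int(value) for name, value in zip(header, values)}


def retransmission_ratio(before, after):
    """Fraction of segments sent between two samples that were retransmissions."""
    sent = after["OutSegs"] - before["OutSegs"]
    retrans = after["RetransSegs"] - before["RetransSegs"]
    return retrans / sent if sent > 0 else 0.0


def read_counters(path="/proc/net/snmp"):
    with open(path) as f:
        return parse_tcp_counters(f.read())


if __name__ == "__main__":
    first = read_counters()
    time.sleep(10)  # arbitrary sampling interval
    second = read_counters()
    print(f"retransmission ratio: {retransmission_ratio(first, second):.4%}")
```

Running it on each TiKV, TiDB, and PD host at roughly the same time may help narrow down whether one machine or link is responsible.
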

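All the servers are Alibaba Cloud virtual machines, and they communicate with each other over the internal network. Our site traffic is currently low, yet TiKV server timeout and Region is unavailable errors still occur from time to time.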

Could you provide the tidb.log and tikv.log covering the period when the errors occurred?

tidb.log

2021-01-06 02:19:28
ERROR
TiDB 172.20.11.79
[conn.go:728] ["command dispatched failed"] 
[conn=10788] 
[connInfo="id:10788, addr:172.20.11.76:47888 status:10, collation:utf8_general_ci, user:venus"] 
[command=Query] [status="inTxn:0, autocommit:1"] 
[sql="select   id as id,  product_id as productId,  store_id as storeId,  discount_amount as discountAmount,  discount_percentage as discountPercentage, 
 discount_start_time as discountStartTime,  discount_end_time as discountEndTime  from catalog_entity_discount  where product_id = 1826 and  deleted_status = 1"] 
 [txn_mode=PESSIMISTIC] 
 [err="[tikv:9002]TiKV server timeout
github.com/pingcap/errors.AddStack
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/errors.go:174
github.com/pingcap/errors.Trace
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/juju_adaptor.go:15
github.com/pingcap/tidb/store/tikv.(*tikvSnapshot).get
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/store/tikv/snapshot.go:361
github.com/pingcap/tidb/store/tikv.(*tikvSnapshot).Get
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/store/tikv/snapshot.go:300
github.com/pingcap/tidb/executor.(*PointGetExecutor).get
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/executor/point_get.go:298
github.com/pingcap/tidb/executor.(*PointGetExecutor).Next
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/executor/point_get.go:164
github.com/pingcap/tidb/executor.Next
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/executor/executor.go:249
github.com/pingcap/tidb/executor.(*SelectionExec).Next
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/executor/executor.go:1216
github.com/pingcap/tidb/executor.Next
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/executor/executor.go:249
github.com/pingcap/tidb/executor.(*ProjectionExec).unParallelExecute
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/executor/projection.go:185
github.com/pingcap/tidb/executor.(*ProjectionExec).Next
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/executor/projection.go:171
github.com/pingcap/tidb/executor.Next
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/executor/executor.go:249
github.com/pingcap/tidb/executor.(*recordSet).Next
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/executor/adapter.go:126
github.com/pingcap/tidb/server.(*tidbResultSet).Next
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/server/driver_tidb.go:386
github.com/pingcap/tidb/server.(*clientConn).writeChunks
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/server/conn.go:1405
github.com/pingcap/tidb/server.(*clientConn).writeResultset
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/server/conn.go:1371
github.com/pingcap/tidb/server.(*clientConn).handleQuery
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/server/conn.go:1279
github.com/pingcap/tidb/server.(*clientConn).dispatch
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/server/conn.go:899
github.com/pingcap/tidb/server.(*clientConn).Run
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/server/conn.go:713
github.com/pingcap/tidb/server.(*Server).onConn
	/home/jenkins/agent/workspace/tidb_v4.0.0-rc.2/go/src/github.com/pingcap/tidb/server/server.go:415
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1357"]

tikv.log
2021-01-06 02:19:28

ERROR

TiKV 172.20.11.99

[transport.rs:163] ["send raft msg err"] [err="Other(\"[src/server/raft_client.rs:208]: RaftClient send fail\")"]

2021-01-06 02:19:29

ERROR

TiKV 172.20.11.99

[transport.rs:163] ["send raft msg err"] [err="Other(\"[src/server/raft_client.rs:208]: RaftClient send fail\")"]

2021-01-06 02:19:30

ERROR

TiKV 172.20.11.99

[transport.rs:163] ["send raft msg err"] [err="Other(\"[src/server/raft_client.rs:208]: RaftClient send fail\")"]

2021-01-06 02:19:31

ERROR

TiKV 172.20.11.99

[transport.rs:163] ["send raft msg err"] [err="Other(\"[src/server/raft_client.rs:208]: RaftClient send fail\")"]

2021-01-06 02:19:31

ERROR

TiKV 172.20.11.98

[status_server.rs:577] ["failed to register addr to pd"] [response="Response { url: \"http://172.20.11.97:2379/pd/api/v1/component\", status: 400, headers: {\"access-control-allow-headers\": \"accept, content-type, authorization\", \"access-control-allow-methods\": \"POST, GET, OPTIONS, PUT, DELETE\", \"access-control-allow-origin\": \"*\", \"content-type\": \"application/json; charset=UTF-8\", \"date\": \"Tue, 05 Jan 2021 18:19:31 GMT\", \"content-length\": \"72\"} }"]

2021-01-06 02:19:31

ERROR

TiKV 172.20.11.98

[status_server.rs:577] ["failed to register addr to pd"] [response="Response { url: \"http://172.20.11.97:2379/pd/api/v1/component\", status: 400, headers: {\"access-control-allow-headers\": \"accept, content-type, authorization\", \"access-control-allow-methods\": \"POST, GET, OPTIONS, PUT, DELETE\", \"access-control-allow-origin\": \"*\", \"content-type\": \"application/json; charset=UTF-8\", \"date\": \"Tue, 05 Jan 2021 18:19:31 GMT\", \"content-length\": \"72\"} }"]

2021-01-06 02:19:31

ERROR

TiKV 172.20.11.98

[status_server.rs:577] ["failed to register addr to pd"] [response="Response { url: \"http://172.20.11.97:2379/pd/api/v1/component\", status: 400, headers: {\"access-control-allow-headers\": \"accept, content-type, authorization\", \"access-control-allow-methods\": \"POST, GET, OPTIONS, PUT, DELETE\", \"access-control-allow-origin\": \"*\", \"content-type\": \"application/json; charset=UTF-8\", \"date\": \"Tue, 05 Jan 2021 18:19:31 GMT\", \"content-length\": \"72\"} }"]

2021-01-06 02:19:31

ERROR

TiKV 172.20.11.98

[status_server.rs:577] ["failed to register addr to pd"] [response="Response { url: \"http://172.20.11.97:2379/pd/api/v1/component\", status: 400, headers: {\"access-control-allow-headers\": \"accept, content-type, authorization\", \"access-control-allow-methods\": \"POST, GET, OPTIONS, PUT, DELETE\", \"access-control-allow-origin\": \"*\", \"content-type\": \"application/json; charset=UTF-8\", \"date\": \"Tue, 05 Jan 2021 18:19:31 GMT\", \"content-length\": \"72\"} }"]

2021-01-06 02:19:31

ERROR

TiKV 172.20.11.98

[status_server.rs:577] ["failed to register addr to pd"] [response="Response { url: \"http://172.20.11.97:2379/pd/api/v1/component\", status: 400, headers: {\"access-control-allow-headers\": \"accept, content-type, authorization\", \"access-control-allow-methods\": \"POST, GET, OPTIONS, PUT, DELETE\", \"access-control-allow-origin\": \"*\", \"content-type\": \"application/json; charset=UTF-8\", \"date\": \"Tue, 05 Jan 2021 18:19:31 GMT\", \"content-length\": \"72\"} }"]

2021-01-06 02:19:31

ERROR

TiKV 172.20.11.98

[status_server.rs:586] ["failed to register addr to pd after 5 tries"]

2021-01-06 02:19:32

ERROR

TiKV 172.20.11.99

[transport.rs:163] ["send raft msg err"] [err="Other(\"[src/server/raft_client.rs:208]: RaftClient send fail\")"]

The RaftClient send fail messages are worth looking at.
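
Since these errors come from TiKV's Raft message sending path (transport.rs and raft_client.rs), a simple first check is whether the nodes can reach each other over TCP at all. The sketch below is only a reachability probe. The TiKV and PD addresses are taken from the log excerpt above, but port 20160 is merely TiKV's default service port and is an assumption about this deployment. The helper name `can_connect` is illustrative.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Targets: the two TiKV hosts and the PD endpoint seen in the logs.
    # 20160 is assumed (TiKV default); 2379 appears in the PD URL in the log.
    targets = [
        ("172.20.11.98", 20160),
        ("172.20.11.99", 20160),
        ("172.20.11.97", 2379),
    ]
    for host, port in targets:
        state = "ok" if can_connect(host, port) else "FAILED"
        print(f"{host}:{port} {state}")
```

A successful connect only shows that the port is reachable at that moment. Intermittent loss, as suggested by the TCP retransmissions, would need repeated probing from each node toward every other node.
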