BCS-DBA
(Liuhuan Happy Study)
April 16, 2020, 03:39
1
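To improve efficiency, please provide the following information when asking a question. Clearly described problems get priority.
[TiDB version]: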
SELECT tidb_version()
Release Version: v3.0.12
Git Commit Hash: 8c4696b3f3408c61dd7454204ddd67c93501609a
Git Branch: heads/refs/tags/v3.0.12
UTC Build Time: 2020-03-16 09:56:22
GoVersion: go version go1.13 linux/amd64
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false
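[Problem description]:
The TiDB logs contain a large number of errors like the ones below. I would like to ask what causes them. Thank you very much.
Cluster architecture:
tidb: 2 nodes
pd: 3 nodes
tikv: 3 nodes
TiDB error logs:
# tidb1 error log: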
2020/04/10 16:22:31.838 terror.go:357: [error] write tcp 172.20.5.143:4000->100.121.100.2:59025: write: connection reset by peer
github.com/pingcap/errors.AddStack
/home/jenkins/agent/workspace/tidb_v3.0.12/go/pkg/mod/github.com/pingcap/errors@v0.11.4/errors.go:174
github.com/pingcap/errors.Trace
/home/jenkins/agent/workspace/tidb_v3.0.12/go/pkg/mod/github.com/pingcap/errors@v0.11.4/juju_adaptor.go:15
github.com/pingcap/tidb/server.(*packetIO).flush
/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/packetio.go:172
github.com/pingcap/tidb/server.(*clientConn).flush
/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/conn.go:967
github.com/pingcap/tidb/server.(*clientConn).writeInitialHandshake
/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/conn.go:261
github.com/pingcap/tidb/server.(*clientConn).handshake
/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/conn.go:169
github.com/pingcap/tidb/server.(*Server).onConn
/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/server.go:345
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1357
# tidb2 error log:
[2020/04/16 10:14:39.513 +08:00] [INFO] [set.go:192] ["set session var"] [conn=28515] [name=transaction_read_only] [val=0]
2020/04/16 10:14:40.120 terror.go:357: [error] EOF
github.com/pingcap/errors.AddStack
/home/jenkins/agent/workspace/tidb_v3.0.12/go/pkg/mod/github.com/pingcap/errors@v0.11.4/errors.go:174
github.com/pingcap/errors.Trace
/home/jenkins/agent/workspace/tidb_v3.0.12/go/pkg/mod/github.com/pingcap/errors@v0.11.4/juju_adaptor.go:15
github.com/pingcap/tidb/server.(*packetIO).readOnePacket
/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/packetio.go:80
github.com/pingcap/tidb/server.(*packetIO).readPacket
/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/packetio.go:105
github.com/pingcap/tidb/server.(*clientConn).readPacket
/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/conn.go:265
github.com/pingcap/tidb/server.(*clientConn).readOptionalSSLRequestAndHandshakeResponse
/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/conn.go:471
github.com/pingcap/tidb/server.(*clientConn).handshake
/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/conn.go:172
github.com/pingcap/tidb/server.(*Server).onConn
/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/server.go:345
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1357
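1 like
1. Does the upstream application connect to TiDB through components such as an LB or keepalive?
2. Do the application servers, the HA components, or TiDB sit behind a firewall or any other connection-recycling policy?
3. Can you point one application server directly at the physical IP of a TiDB server and check whether the errors above still appear?
1 like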
BCS-DBA
(Liuhuan Happy Study)
April 16, 2020, 08:57
3
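Thank you very much for pointing me toward a solution!
Following your suggestions, I tracked it down:
The error log on tidb1 is caused by the health checks of the load balancer (Alibaba Cloud SLB). For details, see the following link:
https://help.aliyun.com/knowledge_detail/55205.html?spm=a2c4g.11186623.6.654.207c112byrzUVU
(Question 11: The TCP port health check of the load balancer succeeds, so why do network connection errors appear in the backend service logs?)
As for the error log on #tidb2, the cause is this: it is the log TiDB emits when a port check is run against it. Later TiDB versions may move this log to the INFO level.
For details, see: https://github.com/pingcap/tidb/pull/15799
Thanks a lot!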
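To check another log for the same pattern, the peer addresses in the "connection reset by peer" lines can be compared against the load balancer's health-check range. The sketch below is a minimal Python illustration, not part of TiDB. It assumes IPv4 peers, and it assumes that the health checks originate from 100.64.0.0/10 (the RFC 6598 shared address space, which Alibaba Cloud is understood to use for SLB health checks; please confirm the range against the current Alibaba Cloud documentation). The names PROBE_NET, reset_peers, and classify are hypothetical.

```python
import ipaddress
import re
from collections import Counter

# Assumption: SLB health checks come from this range; verify before relying on it.
PROBE_NET = ipaddress.ip_network("100.64.0.0/10")

# Matches e.g.
# "write tcp 172.20.5.143:4000->100.121.100.2:59025: write: connection reset by peer"
RESET_RE = re.compile(
    r"write tcp [\d.]+:\d+->([\d.]+):\d+: write: connection reset by peer"
)


def reset_peers(log_lines):
    """Yield the remote IPv4 address of every reset-on-write error line."""
    for line in log_lines:
        match = RESET_RE.search(line)
        if match:
            yield ipaddress.ip_address(match.group(1))


def classify(log_lines, probe_net=PROBE_NET):
    """Count reset errors whose peer is inside or outside the probe range."""
    counts = Counter()
    for peer in reset_peers(log_lines):
        counts["probe" if peer in probe_net else "other"] += 1
    return counts


if __name__ == "__main__":
    # Error lines quoted in this thread (stack frames omitted).
    sample = [
        "2020/04/10 16:22:31.838 terror.go:357: [error] write tcp 172.20.5.143:4000->100.121.100.2:59025: write: connection reset by peer",
        "2020/04/17 11:20:26.873 terror.go:357: [error] write tcp 172.16.0.27:4000->100.116.37.2:19831: write: connection reset by peer",
        "2020/04/17 11:20:27.158 terror.go:357: [error] write tcp 172.16.0.27:4000->100.97.231.1:31464: write: connection reset by peer",
    ]
    print(classify(sample))
```

If every reset comes from the probe range and none from application hosts, that supports the health-check explanation; resets from other peers would need a separate look.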
1 like
elvizlai
(Elvizlai)
April 17, 2020, 03:16
5
This merge does not resolve the ali-slb RST handshake log; it still floods the log with noisy entries.
https://github.com/pingcap/tidb/pull/15799
tidb v3.1.0
2020/04/17 11:20:26.873 terror.go:357: [error] write tcp 172.16.0.27:4000->100.116.37.2:19831: write: connection reset by peer
github.com/pingcap/errors.AddStack
/home/jenkins/agent/workspace/tidb_v3.1.0/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/errors.go:174
github.com/pingcap/errors.Trace
/home/jenkins/agent/workspace/tidb_v3.1.0/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/juju_adaptor.go:15
github.com/pingcap/tidb/server.(*packetIO).flush
/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/packetio.go:172
github.com/pingcap/tidb/server.(*clientConn).flush
/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/conn.go:996
github.com/pingcap/tidb/server.(*clientConn).writeInitialHandshake
/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/conn.go:275
github.com/pingcap/tidb/server.(*clientConn).handshake
/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/conn.go:170
github.com/pingcap/tidb/server.(*Server).onConn
/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/server.go:353
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1357
2020/04/17 11:20:27.158 terror.go:357: [error] write tcp 172.16.0.27:4000->100.97.231.1:31464: write: connection reset by peer
github.com/pingcap/errors.AddStack
/home/jenkins/agent/workspace/tidb_v3.1.0/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/errors.go:174
github.com/pingcap/errors.Trace
/home/jenkins/agent/workspace/tidb_v3.1.0/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/juju_adaptor.go:15
github.com/pingcap/tidb/server.(*packetIO).flush
/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/packetio.go:172
github.com/pingcap/tidb/server.(*clientConn).flush
/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/conn.go:996
github.com/pingcap/tidb/server.(*clientConn).writeInitialHandshake
/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/conn.go:275
github.com/pingcap/tidb/server.(*clientConn).handshake
/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/conn.go:170
github.com/pingcap/tidb/server.(*Server).onConn
/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/server.go:353
runtime.goexit
/usr/local/go/src/runtime/asm_amd64.s:1357
1 like
cfzjywxk
(cfzjywxk)
April 17, 2020, 03:54
6
It seems TiDB sent the handshake packet on the TCP connection but got a connection error; please check the client-side connection management utilities.
We may also need to simplify the error messages here. https://github.com/pingcap/tidb/pull/15799/ has converted the EOF error to a single-line INFO message, but other errors may still print the stack.
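The two error shapes in this thread can be reproduced locally. The sketch below is a minimal Python illustration under stated assumptions, not TiDB code: a local listener stands in for the TiDB server, the greeting bytes stand in for the initial handshake packet, and the function name simulate_probe is hypothetical. The probe connects and then either aborts with an RST (SO_LINGER with a zero timeout) or closes normally with a FIN, after which the server side tries to write and then read, in the same order as the handshake path in the stack traces.

```python
import socket
import struct
import time


def simulate_probe(reset: bool) -> str:
    """Connect a probe, close it (RST or FIN), then let the server write and read."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)
    probe = socket.create_connection(listener.getsockname())
    conn, _ = listener.accept()
    conn.settimeout(2)  # never hang if the stack behaves unexpectedly
    try:
        if reset:
            # l_onoff=1, l_linger=0 makes close() abort the connection with an RST.
            probe.setsockopt(
                socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0)
            )
        probe.close()
        time.sleep(0.2)  # give the RST or FIN time to reach the server socket

        try:
            conn.sendall(b"greeting")  # stand-in for the initial handshake packet
        except OSError:
            return "write failed"
        try:
            data = conn.recv(4)  # stand-in for reading the handshake response
        except OSError:
            return "read failed"
        return "EOF" if data == b"" else "got data"
    finally:
        conn.close()
        listener.close()


if __name__ == "__main__":
    for reset in (True, False):
        print("reset" if reset else "fin", "->", simulate_probe(reset))
```

On a typical Linux host the RST case is expected to fail on the write (the "connection reset by peer" shape) and the FIN case to reach EOF on the read, although the exact result depends on the TCP stack and on timing.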
1 like
elvizlai
(Elvizlai)
April 17, 2020, 04:09
7
The client at 100.97.231.1 is the Alibaba Cloud TCP server load balancer, which we cannot control.
Yes, the point is that the packetio flush produces those annoying errors.
1 like
I am also using Alibaba Cloud SLB for load balancing, and I am getting many errors as well:
[2020/04/27 17:32:24.109 +08:00] [ERROR] [terror.go:363] ["encountered error"] [error="write tcp 10.88.20.84:4000->100.122.211.65:55695: write: connection reset by peer"]
Is there currently any way to suppress or stop this error?
1 like
cfzjywxk
(cfzjywxk)
April 28, 2020, 12:50
11
The change in https://github.com/pingcap/tidb/pull/16877 lowers the errors logged during the handshake phase to debug level; version 4.0.0-rc.1 should no longer print the stack information to the log.
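On versions before that change, the noise can at least be hidden when reading a log, without touching what TiDB writes. The sketch below is a hypothetical Python filter, not a TiDB feature. It assumes the legacy multi-line format shown earlier in this thread (a timestamped [error] header followed by stack frames) and drops an entry only when the header is an EOF or connection-reset error and one of its frames is in the handshake path. All function names are made up for illustration.

```python
import re

# A new entry starts with a timestamp, with or without a leading "[".
ENTRY_START = re.compile(r"^\[?\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}")


def split_entries(lines):
    """Group a header line with the stack-frame lines that follow it."""
    entry = []
    for line in lines:
        if ENTRY_START.match(line) and entry:
            yield entry
            entry = []
        entry.append(line)
    if entry:
        yield entry


def is_handshake_noise(entry):
    """True for legacy [error] entries raised while handshaking with a probe."""
    header = entry[0].rstrip()
    if "[error]" not in header:
        return False
    noisy = header.endswith("[error] EOF") or "connection reset by peer" in header
    # Only drop the entry if the stack shows it came from the handshake path.
    return noisy and any(").handshake" in frame for frame in entry[1:])


def drop_handshake_noise(lines):
    """Yield every log line except those belonging to handshake-noise entries."""
    for entry in split_entries(lines):
        if not is_handshake_noise(entry):
            yield from entry


if __name__ == "__main__":
    import sys

    # Usage (hypothetical): python filter_log.py < tidb.log
    for kept in drop_handshake_noise(line.rstrip("\n") for line in sys.stdin):
        print(kept)
```

Requiring a handshake frame keeps EOF or reset errors from established sessions visible, since those may point at a real client problem.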
1 like
system
(system)
Closed
October 31, 2022, 19:11
12
This topic was automatically closed 1 minute after the last reply. New replies are no longer allowed.