TiDB log outputs a large number of EOF errors

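To help us respond efficiently, please provide the following information when asking a question. Clearly described problems are answered first.

  • [TiDB version]: SELECT tidb_version()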

Release Version: v3.0.12
Git Commit Hash: 8c4696b3f3408c61dd7454204ddd67c93501609a
Git Branch: heads/refs/tags/v3.0.12
UTC Build Time: 2020-03-16 09:56:22
GoVersion: go version go1.13 linux/amd64
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false

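  • [Problem description]: The TiDB logs contain a large number of the errors shown below. I would like to ask what causes them. Thank you very much. Cluster architecture: 2 TiDB nodes, 3 PD nodes, 3 TiKV nodes.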

TiDB error logs:

#tidb1 error log:
2020/04/10 16:22:31.838 terror.go:357: [error] write tcp 172.20.5.143:4000->100.121.100.2:59025: write: connection reset by peer
github.com/pingcap/errors.AddStack
	/home/jenkins/agent/workspace/tidb_v3.0.12/go/pkg/mod/github.com/pingcap/errors@v0.11.4/errors.go:174
github.com/pingcap/errors.Trace
	/home/jenkins/agent/workspace/tidb_v3.0.12/go/pkg/mod/github.com/pingcap/errors@v0.11.4/juju_adaptor.go:15
github.com/pingcap/tidb/server.(*packetIO).flush
	/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/packetio.go:172
github.com/pingcap/tidb/server.(*clientConn).flush
	/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/conn.go:967
github.com/pingcap/tidb/server.(*clientConn).writeInitialHandshake
	/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/conn.go:261
github.com/pingcap/tidb/server.(*clientConn).handshake
	/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/conn.go:169
github.com/pingcap/tidb/server.(*Server).onConn
	/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/server.go:345
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1357

#tidb2 error log:
[2020/04/16 10:14:39.513 +08:00] [INFO] [set.go:192] ["set session var"] [conn=28515] [name=transaction_read_only] [val=0]
2020/04/16 10:14:40.120 terror.go:357: [error] EOF
github.com/pingcap/errors.AddStack
	/home/jenkins/agent/workspace/tidb_v3.0.12/go/pkg/mod/github.com/pingcap/errors@v0.11.4/errors.go:174
github.com/pingcap/errors.Trace
	/home/jenkins/agent/workspace/tidb_v3.0.12/go/pkg/mod/github.com/pingcap/errors@v0.11.4/juju_adaptor.go:15
github.com/pingcap/tidb/server.(*packetIO).readOnePacket
	/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/packetio.go:80
github.com/pingcap/tidb/server.(*packetIO).readPacket
	/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/packetio.go:105
github.com/pingcap/tidb/server.(*clientConn).readPacket
	/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/conn.go:265
github.com/pingcap/tidb/server.(*clientConn).readOptionalSSLRequestAndHandshakeResponse
	/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/conn.go:471
github.com/pingcap/tidb/server.(*clientConn).handshake
	/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/conn.go:172
github.com/pingcap/tidb/server.(*Server).onConn
	/home/jenkins/agent/workspace/tidb_v3.0.12/go/src/github.com/pingcap/tidb/server/server.go:345
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1357

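1. Does the upstream application connect to TiDB through an LB, keepalive, or a similar component?

2. Are there firewalls or other connection-reclaiming policies between the application servers, the HA components, and TiDB?

3. Can you connect one application server directly to the physical IP of a TiDB server and check whether the errors above still appear?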
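One quick way to work through the checks above is to tally which remote addresses trigger the resets; if they all belong to a load balancer or health checker rather than to application servers, that narrows things down. The following is a minimal sketch, not an official tool: the function name `count_reset_peers` and the regular expression are assumptions based only on the log lines quoted in this thread.

```python
import re
from collections import Counter

# Matches the "write tcp <local>-><remote>: write: connection reset by peer"
# text seen in the logs in this thread. The pattern is an assumption made for
# this sketch and only handles IPv4 addresses.
RESET_RE = re.compile(
    r"write tcp (?P<local>[\d.]+:\d+)->(?P<remote_ip>[\d.]+):\d+: "
    r"write: connection reset by peer"
)


def count_reset_peers(lines):
    """Return a Counter mapping remote IP -> number of reset-by-peer errors."""
    counts = Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            counts[match.group("remote_ip")] += 1
    return counts
```

Passing an open log file to `count_reset_peers` and printing `most_common()` shows the noisiest peers first; those addresses can then be compared against the known application servers.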

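Thank you very much for suggesting how to approach the problem!

Following your suggestions, I found the following:

The error log on tidb1 is caused by the health check of the load balancer (Alibaba Cloud SLB). For details, see: https://help.aliyun.com/knowledge_detail/55205.html?spm=a2c4g.11186623.6.654.207c112byrzUVU

(Question 11: "The TCP port health check of the load balancing service succeeds, so why do network connection exceptions appear in the backend application logs?")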
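To reproduce this kind of log entry without a load balancer, a TCP probe can open a connection to the TiDB port and abort it with an RST before the MySQL handshake completes. This is a rough sketch under the assumption that the health check behaves this way (connect, then reset); `tcp_probe_with_reset` is a hypothetical helper name, and the host and port are whatever your TiDB server listens on.

```python
import socket
import struct


def tcp_probe_with_reset(host, port, timeout=2.0):
    """Open a TCP connection, then abort it with RST without reading or writing."""
    sock = socket.create_connection((host, port), timeout=timeout)
    # SO_LINGER with l_onoff=1 and l_linger=0 makes close() discard the
    # connection and send RST instead of the normal FIN sequence.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER, struct.pack("ii", 1, 0))
    sock.close()
    return True
```

Calling `tcp_probe_with_reset("127.0.0.1", 4000)` against a test instance should, if the assumption holds, make the server fail while writing its initial handshake packet to a connection that has already been reset.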

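As for the error log on #tidb2, I found the cause: it is printed when a port check is run against TiDB. Later TiDB versions may move this log to the INFO level.

For details, see: https://github.com/pingcap/tidb/pull/15799

Thank you very much!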

:handshake:

That merge does not resolve the handshake log caused by the Alibaba Cloud SLB RST; TiDB still prints lots of noisy logs.

TiDB v3.1.0

2020/04/17 11:20:26.873 terror.go:357: [error] write tcp 172.16.0.27:4000->100.116.37.2:19831: write: connection reset by peer
github.com/pingcap/errors.AddStack
	/home/jenkins/agent/workspace/tidb_v3.1.0/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/errors.go:174
github.com/pingcap/errors.Trace
	/home/jenkins/agent/workspace/tidb_v3.1.0/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/juju_adaptor.go:15
github.com/pingcap/tidb/server.(*packetIO).flush
	/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/packetio.go:172
github.com/pingcap/tidb/server.(*clientConn).flush
	/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/conn.go:996
github.com/pingcap/tidb/server.(*clientConn).writeInitialHandshake
	/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/conn.go:275
github.com/pingcap/tidb/server.(*clientConn).handshake
	/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/conn.go:170
github.com/pingcap/tidb/server.(*Server).onConn
	/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/server.go:353
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1357
2020/04/17 11:20:27.158 terror.go:357: [error] write tcp 172.16.0.27:4000->100.97.231.1:31464: write: connection reset by peer
github.com/pingcap/errors.AddStack
	/home/jenkins/agent/workspace/tidb_v3.1.0/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/errors.go:174
github.com/pingcap/errors.Trace
	/home/jenkins/agent/workspace/tidb_v3.1.0/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/juju_adaptor.go:15
github.com/pingcap/tidb/server.(*packetIO).flush
	/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/packetio.go:172
github.com/pingcap/tidb/server.(*clientConn).flush
	/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/conn.go:996
github.com/pingcap/tidb/server.(*clientConn).writeInitialHandshake
	/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/conn.go:275
github.com/pingcap/tidb/server.(*clientConn).handshake
	/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/conn.go:170
github.com/pingcap/tidb/server.(*Server).onConn
	/home/jenkins/agent/workspace/tidb_v3.1.0/go/src/github.com/pingcap/tidb/server/server.go:353
runtime.goexit
	/usr/local/go/src/runtime/asm_amd64.s:1357

It seems TiDB sent the handshake packet on the TCP connection but got a connection error; please check the client connection management utilities. We may also need to simplify the error messages here: https://github.com/pingcap/tidb/pull/15799/ converted the EOF error to a single-line INFO log, but other errors may still print the stack.

The client at 100.97.231.1 is the Alibaba Cloud TCP server load balancer, which we cannot control.

Yes, the point is that the packetIO flush is what prints those annoying errors.

I also use Alibaba Cloud SLB for load balancing, and I get many of these errors as well:
[2020/04/27 17:32:24.109 +08:00] [ERROR] [terror.go:363] ["encountered error"] [error="write tcp 10.88.20.84:4000->100.122.211.65:55695: write: connection reset by peer"]
Is there currently any way to suppress or stop this error?

It is not yet clear how to suppress or filter out these logs.

Thanks for the reply.

The change in https://github.com/pingcap/tidb/pull/16877 lowers the error logs from the handshake phase to the debug level. Version 4.0.0-rc.1 should no longer print the stack information to the log.
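For clusters that cannot upgrade yet, one workaround is to filter the noise when reading the logs rather than at the source. The sketch below is only an illustration: `drop_handshake_noise` and both patterns are assumptions derived from the log samples in this thread. Because the header line does not say which phase the error came from, it matches on the message text alone and would also hide genuine `connection reset by peer` or `EOF` errors logged from terror.go.

```python
import re

# A new log entry starts with a timestamp, optionally wrapped in "[" as in the
# newer log format quoted in this thread.
ENTRY_START = re.compile(r"^\[?\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}")
# Noise patterns discussed in this thread (assumed, matched on text only).
NOISE = re.compile(r"terror\.go:\d+.*(connection reset by peer|\bEOF\b)")


def drop_handshake_noise(lines):
    """Yield log lines, skipping noisy entries together with their stack traces."""
    skipping = False
    for line in lines:
        if ENTRY_START.match(line):
            # Decide once per entry; stack-trace lines that follow have no
            # timestamp, so they inherit the decision made for their header.
            skipping = bool(NOISE.search(line))
        if not skipping:
            yield line
```

Treating every line without a timestamp as a continuation of the previous entry keeps the filter short, at the cost of assuming that all real entries begin with a timestamp.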