【TiDB environment】Production
【TiDB version】v6.5.2
【Reproduction steps】What operations led to the problem
【Problem: symptoms and impact】
The application reports the following error:
Cause: java.sql.SQLException: rpc error: code = Unavailable desc = error reading from server: read tcp 172.16.89.80:53590->172.16.89.85:3930: read: connection reset by peer
; uncategorized SQLException; SQL state [HY000]; error code [1105]; rpc error: code = Unavailable desc = error reading from server: read tcp 172.16.89.80:53590->172.16.89.85:3930: read: connection reset by peer; nested exception is java.sql.SQLException: rpc error: code = Unavailable desc = error reading from server: read tcp 172.16.89.80:53590->172.16.89.85:3930: read: connection reset by peer
Node description: 172.16.89.81 (TiDB Server), 172.16.89.85 (TiFlash)
【Attachments: screenshots/logs/monitoring】
TiDB Server (172.16.89.80) error log (a large number of these errors):
[2023/10/12 08:40:00.388 +08:00] [ERROR] [ddl_tiflash_api.go:396] ["get tiflash sync progress failed"] [error="Get \"http://172.16.89.85:20292/tiflash/sync-status/21743\": dial tcp 172.16.89.85:20292: connect: connection refused"] [tableID=21743] [IsPartition=false]
[2023/10/12 08:40:00.389 +08:00] [ERROR] [tiflash_manager.go:93] ["Fail to get peer status from TiFlash."] [tableID=21743]
[2023/10/12 08:40:00.390 +08:00] [ERROR] [tiflash_manager.go:119] ["Fail to get peer count from TiFlash."] [tableID=21743]
[2023/10/12 08:40:00.390 +08:00] [ERROR] [ddl_tiflash_api.go:396] ["get tiflash sync progress failed"] [error="Get \"http://172.16.89.85:20292/tiflash/sync-status/21743\": dial tcp 172.16.89.85:20292: connect: connection refused"] [tableID=21743] [IsPartition=false]
[2023/10/12 08:40:00.391 +08:00] [ERROR] [tiflash_manager.go:93] ["Fail to get peer status from TiFlash."] [tableID=21743]
[2023/10/12 08:40:00.391 +08:00] [ERROR] [tiflash_manager.go:119] ["Fail to get peer count from TiFlash."] [tableID=21743]
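Both the `connection reset by peer` on port 3930 and the `connection refused` on port 20292 involve the TiFlash host, so one quick check is whether those ports currently accept TCP connections from the TiDB Server machine. A minimal Python sketch follows, assuming only the host and ports that appear in the logs; the helper names `port_is_open` and `check_ports` are made up for illustration and are not part of any TiDB tooling.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or unreachable host
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_ports(host: str, ports, timeout: float = 2.0) -> dict:
    """Probe each port once and return a {port: reachable} mapping."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

Run from the TiDB Server host, `check_ports("172.16.89.85", [3930, 20292])` would show whether either port is refusing connections at that moment. A successful connect only shows that something is listening; it says nothing about whether the TiFlash process restarted around 08:40.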
TiFlash node (172.16.89.85) log (tiflash_error.log):
[2023/10/12 08:39:56.457 +08:00] [WARN] [CoprocessorHandler.cpp:143] ["RegionException: region 531389, message: NOT_FOUND"] [source=CoprocessorHandler] [thread_id=98]
[2023/10/12 08:39:56.457 +08:00] [WARN] [CoprocessorHandler.cpp:143] ["RegionException: region 531997, message: NOT_FOUND"] [source=CoprocessorHandler] [thread_id=84]
[2023/10/12 08:39:56.457 +08:00] [WARN] [CoprocessorHandler.cpp:143] ["RegionException: region 533871, message: NOT_FOUND"] [source=CoprocessorHandler] [thread_id=81]
[2023/10/12 08:39:56.458 +08:00] [WARN] [CoprocessorHandler.cpp:143] ["RegionException: region 533055, message: NOT_FOUND"] [source=CoprocessorHandler] [thread_id=83]
[2023/10/12 08:41:17.801 +08:00] [WARN] [ExchangeReceiver.cpp:210] ["MakeReader fail. retry time: 0"] [source="MPPquery:444877052576792581:22,task ExchangeReceiver_339 tunnel20+22"] [thread_id=341]
[2023/10/12 08:41:18.816 +08:00] [WARN] [ExchangeReceiver.cpp:210] ["MakeReader fail. retry time: 1"] [source="MPPquery:444877052576792581:22,task ExchangeReceiver_339 tunnel20+22"] [thread_id=341]
[2023/10/12 08:41:20.236 +08:00] [WARN] [ExchangeReceiver.cpp:210] ["MakeReader fail. retry time: 2"] [source="MPPquery:444877052576792581:22,task ExchangeReceiver_339 tunnel20+22"] [thread_id=341]
[2023/10/12 08:41:23.080 +08:00] [WARN] [ExchangeReceiver.cpp:210] ["MakeReader fail. retry time: 3"] [source="MPPquery:444877052576792581:22,task ExchangeReceiver_339 tunnel20+22"] [thread_id=341]
[2023/10/12 08:41:25.593 +08:00] [WARN] [ExchangeReceiver.cpp:210] ["MakeReader fail. retry time: 4"] [source="MPPquery:444877052576792581:22,task ExchangeReceiver_339 tunnel20+22"] [thread_id=341]
[2023/10/12 08:41:26.621 +08:00] [WARN] [ExchangeReceiver.cpp:210] ["MakeReader fail. retry time: 5"] [source="MPPquery:444877052576792581:22,task ExchangeReceiver_339 tunnel20+22"] [thread_id=341]
[2023/10/12 08:41:27.634 +08:00] [WARN] [ExchangeReceiver.cpp:210] ["MakeReader fail. retry time: 6"] [source="MPPquery:444877052576792581:22,task ExchangeReceiver_339 tunnel20+22"] [thread_id=341]
[2023/10/12 08:41:27.885 +08:00] [WARN] [MPPTaskManager.cpp:152] ["Begin to abort query: 444877052576792581, abort type: ONCANCELLATION, reason: Receive cancel request from TiDB"] [thread_id=97]
[2023/10/12 08:41:27.885 +08:00] [WARN] [MPPTaskManager.cpp:195] ["Remaining task in query 444877052576792581 are: MPPquery:444877052576792581:3,task MPPquery:444877052576792581:6,task MPPquery:444877052576792581:16,task MPPquery:444877052576792581:22,task MPPquery:444877052576792581:19,task MPPquery:444877052576792581:9,task MPPquery:444877052576792581:5,task MPPquery:444877052576792581:18,task MPPquery:444877052576792581:21,task MPPquery:444877052576792581:1,task MPPquery:444877052576792581:13,task "] [thread_id=97]
[2023/10/12 08:41:27.885 +08:00] [WARN] [MPPTask.cpp:471] ["Begin abort task: MPPquery:444877052576792581:3,task, abort type: ONCANCELLATION"] [source=MPPquery:444877052576792581:3,task] [thread_id=97]
[2023/10/12 08:41:27.885 +08:00] [WARN] [MPPTask.cpp:500] ["Finish abort task from running"] [source=MPPquery:444877052576792581:3,task] [thread_id=97]
[2023/10/12 08:41:27.885 +08:00] [WARN] [MPPTask.cpp:471] ["Begin abort task: MPPquery:444877052576792581:6,task, abort type: ONCANCELLATION"] [source=MPPquery:444877052576792581:6,task] [thread_id=97]
[2023/10/12 08:41:27.886 +08:00] [WARN] [TiRemoteBlockInputStream.h:136] ["remote reader meets error: Receiver state: ERROR, error message: Read error message from mpp packet: Receive cancel request from TiDB"] [source="TiRemote(ExchangeReceiver) ExchangeReceiver MPPquery:444877052576792581:22,task ExchangeReceiver_195"] [thread_id=385]
[2023/10/12 08:41:27.886 +08:00] [WARN] [MPPTask.cpp:500] ["Finish abort task from running"] [source=MPPquery:444877052576792581:6,task] [thread_id=97]
[2023/10/12 08:41:27.897 +08:00] [WARN] [MPPTask.cpp:471] ["Begin abort task: MPPquery:444877052576792581:16,task, abort type: ONCANCELLATION"] [source=MPPquery:444877052576792581:16,task] [thread_id=97]
[2023/10/12 08:41:27.897 +08:00] [WARN] [MPPTask.cpp:500] ["Finish abort task from running"] [source=MPPquery:444877052576792581:16,task] [thread_id=97]
[2023/10/12 08:41:27.897 +08:00] [WARN] [MPPTask.cpp:471] ["Begin abort task: MPPquery:444877052576792581:22,task, abort type: ONCANCELLATION"] [source=MPPquery:444877052576792581:22,task] [thread_id=97]
[2023/10/12 08:41:27.897 +08:00] [WARN] [TiRemoteBlockInputStream.h:136] ["remote reader meets error: Receiver state: ERROR, error message: Read error message from mpp packet: Receive cancel request from TiDB"] [source="TiRemote(ExchangeReceiver) ExchangeReceiver MPPquery:444877052576792581:22,task ExchangeReceiver_195"] [thread_id=393]
[2023/10/12 08:41:27.897 +08:00] [WARN] [TiRemoteBlockInputStream.h:136] ["remote reader meets error: Receiver state: CANCELED, error message: Read error message from mpp packet: Receive cancel request from TiDB"] [source="TiRemote(ExchangeReceiver) ExchangeReceiver MPPquery:444877052576792581:22,task ExchangeReceiver_285"] [thread_id=524]
[2023/10/12 08:41:27.898 +08:00] [WARN] [TiRemoteBlockInputStream.h:136] ["remote reader meets error: Receiver state: CANCELED, error message: Read error message from mpp packet: Receive cancel request from TiDB"] [source="TiRemote(ExchangeReceiver) ExchangeReceiver MPPquery:444877052576792581:22,task ExchangeReceiver_285"] [thread_id=533]
[2023/10/12 08:41:27.898 +08:00] [WARN] [TiRemoteBlockInputStream.h:136] ["remote reader meets error: Receiver state: ERROR, error message: Read error message from mpp packet: Receive cancel request from TiDB"] [source="TiRemote(ExchangeReceiver) ExchangeReceiver MPPquery:444877052576792581:22,task ExchangeReceiver_195"] [thread_id=382]
[2023/10/12 08:41:27.898 +08:00] [WARN] [TiRemoteBlockInputStream.h:136] ["remote reader meets error: Receiver state: ERROR, error message: Read error message from mpp packet: Receive cancel request from TiDB"] [source="TiRemote(ExchangeReceiver) ExchangeReceiver MPPquery:444877052576792581:22,task ExchangeReceiver_195"] [thread_id=1827]
[2023/10/12 08:41:27.898 +08:00] [WARN] [TiRemoteBlockInputStream.h:136] ["remote reader meets error: Receiver state: CANCELED, error message: Read error message from mpp packet: Receive cancel request from TiDB"] [source="TiRemote(ExchangeReceiver) ExchangeReceiver MPPquery:444877052576792581:22,task ExchangeReceiver_285"] [thread_id=515]
[2023/10/12 08:41:27.898 +08:00] [WARN] [TiRemoteBlockInputStream.h:136] ["remote reader meets error: Receiver state: ERROR, error message: Read error message from mpp packet: Receive cancel request from TiDB"] [source="TiRemote(ExchangeReceiver) ExchangeReceiver MPPquery:444877052576792581:22,task ExchangeReceiver_216"] [thread_id=497]
[2023/10/12 08:41:27.898 +08:00] [WARN] [TiRemoteBlockInputStream.h:136] ["remote reader meets error: Receiver state: CANCELED, error message: Read error message from mpp packet: Receive cancel request from TiDB"] [source="TiRemote(ExchangeReceiver) ExchangeReceiver MPPquery:444877052576792581:22,task ExchangeReceiver_285"] [thread_id=518]
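Because the log is long and repetitive, it may be easier to read as a per-source summary: how many lines each source location produced and over what time window. The following is a rough Python sketch, assuming every line starts with the `[timestamp] [LEVEL] [file:line]` prefix seen in the excerpts above; the function name `summarize` and the regular expression are illustrative choices, not part of TiDB or TiFlash.

```python
import re
from datetime import datetime

# Matches the "[time] [LEVEL] [file:line]" prefix used in the excerpts above.
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<loc>[^\]]+)\]"
)
TS_FORMAT = "%Y/%m/%d %H:%M:%S.%f %z"  # e.g. 2023/10/12 08:41:17.801 +08:00


def summarize(lines):
    """Group log lines by (level, source location) with count and time span."""
    summary = {}
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # skip anything that does not look like a log line
        ts = datetime.strptime(match["ts"], TS_FORMAT)
        key = (match["level"], match["loc"])
        entry = summary.setdefault(key, {"count": 0, "first": ts, "last": ts})
        entry["count"] += 1
        entry["first"] = min(entry["first"], ts)
        entry["last"] = max(entry["last"], ts)
    return summary
```

For example, `summarize(open("tiflash_error.log", encoding="utf-8"))` could then be printed key by key to see when the `ExchangeReceiver.cpp:210` retries began relative to the cancel messages.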
Request Duration was high during the period when the errors occurred.
【Resource configuration】Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of that page