分区表的一个SQL偶尔变得很慢

  1. 查看 15:27 的日志,region:89705 慢,麻烦反馈下这个 region 信息

  2. 在此之前有 gc ,可以先标记怀疑的方向

  3. 请帮忙收集 15:20 – 15:30 的 tikv.log 日志,多谢。

  4. tidb 慢日志也麻烦反馈下 这个时间段的,多谢。

1:tidb.log只有它的这个日志

[2020/09/08 15:27:31.206 +08:00] [INFO] [coprocessor.go:871] [“[TIME_COP_WAIT] resp_time:324.561384ms txnStartTS:419312448534478892 region_id:89705 store_addr:192.168.0.241:20160 kv_wait_ms:322”] [conn=13706551]
[2020/09/08 15:27:42.358 +08:00] [INFO] [coprocessor.go:871] [“[TIME_COP_WAIT] resp_time:2.46767184s txnStartTS:419312450893774877 region_id:89705 store_addr:192.168.0.241:20160 kv_wait_ms:2466”] [conn=13706768]
3:
logs.tar (62 KB)
4:这个时间段只有这么几个慢SQL

tikv 日志里是 03:xx 不是下午 15:xx 的? 这个时间正确吗?

要+12 时区是-4的 所以03就是15点

  1. 查看 tikv 日志都是报错 No route to 192.168.0.230:20170 ,这个实例是什么?
    [2020/09/08 03:20:04.883 -04:00] [Warn] [raft_client.rs:296] [“RPC batch_raft fail”] [err=“Some(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some(“failed to connect to all addresses”) }))”] [sink_err=“Some(RpcFinished(Some(RpcStatus { status: 14-UNAVAILABLE, details: Some(“failed to connect to all addresses”) })))”] [to_addr=192.168.0.230:20170]
    [2020/09/08 03:20:04.883 -04:00] [Warn] [raft_client.rs:199] [“send to 192.168.0.230:20170 fail, the gRPC connection could be broken”]
    [2020/09/08 03:20:04.883 -04:00] [Error] [transport.rs:163] [“send raft msg err”] [err=“Other(”[src/server/raft_client.rs:208]: RaftClient send fail")"]

这个是tiflash的节点,有问题,但是没去解决,应该不是这个导致慢SQL原因

  1. 先处理一下 tiflash ,如果不使用,可以删除,排查干扰项
  2. 麻烦把这两个 sql 的执行计划也上传下, 在dashboard界面应该可以查看到,多谢。

2:第一个的

Batch_Point_Get_1 root 80 table:GLOBAL_VARIABLES, index:PRIMARY(VARIABLE_NAME), keep order:false, desc:false
第二个的:
Projection_41 root 1 if(gt(Column#46, 0), 1, 0)->Column#47
└─StreamAgg_46 root 1 funcs:count(1)->Column#46
└─HashAgg_239 root 2 group by:Column#45, funcs:firstrow(1)->Column#52
└─Union_50 root 20.36508716558094
├─Projection_51 root 11.114738266026105 1->Column#45
│ └─Union_52 root 11.114738266026105
│ ├─IndexLookUp_59 root 0.5515665615230103
│ │ ├─IndexScan_56 cop 0.5515665615230103 table:t_device_request, partition:p20200615, index:INDEX_REQUEST_REQ_SEQ(REQ_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ │ └─Selection_58 cop 0.5515665615230103 eq(test.t_device_request.merchant_id, 1883)
│ │ └─TableScan_57 cop 0.5515665615230103 table:t_device_request, partition:p20200615, keep order:false
│ ├─IndexLookUp_66 root 0.48214989067030967
│ │ ├─IndexScan_63 cop 0.48214989067030967 table:t_device_request, partition:p20200622, index:INDEX_REQUEST_REQ_SEQ(REQ_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ │ └─Selection_65 cop 0.48214989067030967 eq(test.t_device_request.merchant_id, 1883)
│ │ └─TableScan_64 cop 0.48214989067030967 table:t_device_request, partition:p20200622, keep order:false
│ ├─IndexLookUp_73 root 1.8645133478430012
│ │ ├─IndexScan_70 cop 2.3306416848037514 table:t_device_request, partition:p20200629, index:INDEX_REQUEST_REQ_SEQ(REQ_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ │ └─Selection_72 cop 1.8645133478430012 eq(test.t_device_request.merchant_id, 1883)
│ │ └─TableScan_71 cop 2.3306416848037514 table:t_device_request, partition:p20200629, keep order:false
│ ├─IndexLookUp_80 root 0.01
│ │ ├─IndexScan_77 cop 10 table:t_device_request, partition:p20200706, index:INDEX_REQUEST_REQ_SEQ(REQ_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false, stats:pseudo
│ │ └─Selection_79 cop 0.01 eq(test.t_device_request.merchant_id, 1883)
│ │ └─TableScan_78 cop 10 table:t_device_request, partition:p20200706, keep order:false, stats:pseudo
│ ├─IndexLookUp_87 root 1.060514946092906
│ │ ├─IndexScan_84 cop 1.060514946092906 table:t_device_request, partition:p20200713, index:INDEX_REQUEST_REQ_SEQ(REQ_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ │ └─Selection_86 cop 1.060514946092906 eq(test.t_device_request.merchant_id, 1883)
│ │ └─TableScan_85 cop 1.060514946092906 table:t_device_request, partition:p20200713, keep order:false
│ ├─IndexLookUp_94 root 1.2854483249568542
│ │ ├─IndexScan_91 cop 1.2854483249568542 table:t_device_request, partition:p20200720, index:INDEX_REQUEST_REQ_SEQ(REQ_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ │ └─Selection_93 cop 1.2854483249568542 eq(test.t_device_request.merchant_id, 1883)
│ │ └─TableScan_92 cop 1.2854483249568542 table:t_device_request, partition:p20200720, keep order:false
│ ├─IndexLookUp_101 root 1.2676661332077526
│ │ ├─IndexScan_98 cop 1.2676661332077526 table:t_device_request, partition:p20200727, index:INDEX_REQUEST_REQ_SEQ(REQ_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ │ └─Selection_100 cop 1.2676661332077526 eq(test.t_device_request.merchant_id, 1883)
│ │ └─TableScan_99 cop 1.2676661332077526 table:t_device_request, partition:p20200727, keep order:false
│ ├─IndexLookUp_108 root 0.3361156585291799
│ │ ├─IndexScan_105 cop 0.3361156585291799 table:t_device_request, partition:p20200803, index:INDEX_REQUEST_REQ_SEQ(REQ_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ │ └─Selection_107 cop 0.3361156585291799 eq(test.t_device_request.merchant_id, 1883)
│ │ └─TableScan_106 cop 0.3361156585291799 table:t_device_request, partition:p20200803, keep order:false
│ ├─IndexLookUp_115 root 1.8777953032311259
│ │ ├─IndexScan_112 cop 1.8777953032311259 table:t_device_request, partition:p20200810, index:INDEX_REQUEST_REQ_SEQ(REQ_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ │ └─Selection_114 cop 1.8777953032311259 eq(test.t_device_request.merchant_id, 1883)
│ │ └─TableScan_113 cop 1.8777953032311259 table:t_device_request, partition:p20200810, keep order:false
│ ├─IndexLookUp_122 root 0.09324345853702905
│ │ ├─IndexScan_119 cop 0.09324345853702905 table:t_device_request, partition:p20200817, index:INDEX_REQUEST_REQ_SEQ(REQ_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ │ └─Selection_121 cop 0.09324345853702905 eq(test.t_device_request.merchant_id, 1883)
│ │ └─TableScan_120 cop 0.09324345853702905 table:t_device_request, partition:p20200817, keep order:false
│ ├─IndexLookUp_129 root 1.0247031995358264
│ │ ├─IndexScan_126 cop 1.0247031995358264 table:t_device_request, partition:p20200824, index:INDEX_REQUEST_REQ_SEQ(REQ_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ │ └─Selection_128 cop 1.0247031995358264 eq(test.t_device_request.merchant_id, 1883)
│ │ └─TableScan_127 cop 1.0247031995358264 table:t_device_request, partition:p20200824, keep order:false
│ ├─IndexLookUp_136 root 1.2510214418991095
│ │ ├─IndexScan_133 cop 1.2510214418991095 table:t_device_request, partition:p20200831, index:INDEX_REQUEST_REQ_SEQ(REQ_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ │ └─Selection_135 cop 1.2510214418991095 eq(test.t_device_request.merchant_id, 1883)
│ │ └─TableScan_134 cop 1.2510214418991095 table:t_device_request, partition:p20200831, keep order:false
│ └─IndexLookUp_143 root 0.01
│ ├─IndexScan_140 cop 10 table:t_device_request, partition:p20200907, index:INDEX_REQUEST_REQ_SEQ(REQ_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false, stats:pseudo
│ └─Selection_142 cop 0.01 eq(test.t_device_request.merchant_id, 1883)
│ └─TableScan_141 cop 10 table:t_device_request, partition:p20200907, keep order:false, stats:pseudo
└─Projection_144 root 9.250348899554833 1->Column#45
└─Union_145 root 9.250348899554833
├─IndexLookUp_152 root 0.5515665615230103
│ ├─IndexScan_149 cop 0.5515665615230103 table:t_device_request, partition:p20200615, index:INDEX_REQUEST_ORI_SEQ(ORI_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ └─Selection_151 cop 0.5515665615230103 eq(test.t_device_request.merchant_id, 1883)
│ └─TableScan_150 cop 0.5515665615230103 table:t_device_request, partition:p20200615, keep order:false
├─IndexLookUp_159 root 0.48214989067030967
│ ├─IndexScan_156 cop 0.48214989067030967 table:t_device_request, partition:p20200622, index:INDEX_REQUEST_ORI_SEQ(ORI_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ └─Selection_158 cop 0.48214989067030967 eq(test.t_device_request.merchant_id, 1883)
│ └─TableScan_157 cop 0.48214989067030967 table:t_device_request, partition:p20200622, keep order:false
├─IndexLookUp_166 root 0
│ ├─IndexScan_163 cop 0 table:t_device_request, partition:p20200629, index:INDEX_REQUEST_ORI_SEQ(ORI_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ └─Selection_165 cop 0 eq(test.t_device_request.merchant_id, 1883)
│ └─TableScan_164 cop 0 table:t_device_request, partition:p20200629, keep order:false
├─IndexLookUp_173 root 0.01
│ ├─IndexScan_170 cop 10 table:t_device_request, partition:p20200706, index:INDEX_REQUEST_ORI_SEQ(ORI_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false, stats:pseudo
│ └─Selection_172 cop 0.01 eq(test.t_device_request.merchant_id, 1883)
│ └─TableScan_171 cop 10 table:t_device_request, partition:p20200706, keep order:false, stats:pseudo
├─IndexLookUp_180 root 1.060514946092906
│ ├─IndexScan_177 cop 1.060514946092906 table:t_device_request, partition:p20200713, index:INDEX_REQUEST_ORI_SEQ(ORI_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ └─Selection_179 cop 1.060514946092906 eq(test.t_device_request.merchant_id, 1883)
│ └─TableScan_178 cop 1.060514946092906 table:t_device_request, partition:p20200713, keep order:false
├─IndexLookUp_187 root 1.2854483249568542
│ ├─IndexScan_184 cop 1.2854483249568542 table:t_device_request, partition:p20200720, index:INDEX_REQUEST_ORI_SEQ(ORI_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ └─Selection_186 cop 1.2854483249568542 eq(test.t_device_request.merchant_id, 1883)
│ └─TableScan_185 cop 1.2854483249568542 table:t_device_request, partition:p20200720, keep order:false
├─IndexLookUp_194 root 1.2677901145794817
│ ├─IndexScan_191 cop 1.2677901145794817 table:t_device_request, partition:p20200727, index:INDEX_REQUEST_ORI_SEQ(ORI_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ └─Selection_193 cop 1.2677901145794817 eq(test.t_device_request.merchant_id, 1883)
│ └─TableScan_192 cop 1.2677901145794817 table:t_device_request, partition:p20200727, keep order:false
├─IndexLookUp_201 root 0.3361156585291799
│ ├─IndexScan_198 cop 0.3361156585291799 table:t_device_request, partition:p20200803, index:INDEX_REQUEST_ORI_SEQ(ORI_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ └─Selection_200 cop 0.3361156585291799 eq(test.t_device_request.merchant_id, 1883)
│ └─TableScan_199 cop 0.3361156585291799 table:t_device_request, partition:p20200803, keep order:false
├─IndexLookUp_208 root 1.8777953032311259
│ ├─IndexScan_205 cop 1.8777953032311259 table:t_device_request, partition:p20200810, index:INDEX_REQUEST_ORI_SEQ(ORI_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ └─Selection_207 cop 1.8777953032311259 eq(test.t_device_request.merchant_id, 1883)
│ └─TableScan_206 cop 1.8777953032311259 table:t_device_request, partition:p20200810, keep order:false
├─IndexLookUp_215 root 0.09324345853702905
│ ├─IndexScan_212 cop 0.09324345853702905 table:t_device_request, partition:p20200817, index:INDEX_REQUEST_ORI_SEQ(ORI_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ └─Selection_214 cop 0.09324345853702905 eq(test.t_device_request.merchant_id, 1883)
│ └─TableScan_213 cop 0.09324345853702905 table:t_device_request, partition:p20200817, keep order:false
├─IndexLookUp_222 root 1.0247031995358264
│ ├─IndexScan_219 cop 1.0247031995358264 table:t_device_request, partition:p20200824, index:INDEX_REQUEST_ORI_SEQ(ORI_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ └─Selection_221 cop 1.0247031995358264 eq(test.t_device_request.merchant_id, 1883)
│ └─TableScan_220 cop 1.0247031995358264 table:t_device_request, partition:p20200824, keep order:false
├─IndexLookUp_229 root 1.2510214418991095
│ ├─IndexScan_226 cop 1.2510214418991095 table:t_device_request, partition:p20200831, index:INDEX_REQUEST_ORI_SEQ(ORI_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false
│ └─Selection_228 cop 1.2510214418991095 eq(test.t_device_request.merchant_id, 1883)
│ └─TableScan_227 cop 1.2510214418991095 table:t_device_request, partition:p20200831, keep order:false
└─IndexLookUp_236 root 0.01
├─IndexScan_233 cop 10 table:t_device_request, partition:p20200907, index:INDEX_REQUEST_ORI_SEQ(ORI_SEQ), range:[“8E0AD8887D2020-06-26 22:18:365965”,“8E0AD8887D2020-06-26 22:18:365965”], keep order:false, stats:pseudo
└─Selection_235 cop 0.01 eq(test.t_device_request.merchant_id, 1883)
└─TableScan_234 cop 10 table:t_device_request, partition:p20200907, keep order:false, stats:pseudo

  1. 请问查询的where 条件之前反馈的都是 20083117280710624874 之类的值,这个怎么是这些类似时间一样?这是实际查询的值吗?

  1. 如果不正确,麻烦在slow日志里看下,这个值是确实这样,还是dashboard转换了,多谢。

就是这么写的,我为了方便,直接把时间当字符串处理来查询的

抱歉,已经找其他同事再看了,有进程我再反馈同步,多谢。

106 那台 tikv 在 2020-09-01 14:19 左右的 log 还在吗?麻烦提供一下,谢谢~

好的 ,麻烦了

有的

tikv.log.2020-09-01-15_19_24 (658.2 KB)

14:19:06 有一条大查询,时间戳是 419152827524055063,有 13 次查 106 这个 tikv,每次读 96 万条数据,每次耗时 1s 以上。这导致了后面的 cop task 都等待。
我在你给的 tidb 的 slow log 里没有搜到这个时间戳,麻烦在 172.31.18.96 这台 tidb 里按时间戳 419152827524055063 搜一下 slow log,看看是哪条 SQL。

是这个SQL
slow.xls (13 KB)

但是我并不认为是这个SQL造成的,因为我在本地测试的时候,并没有运行其他SQL,依旧造成了偶尔比较慢

本地测试的特征能否描述一下?
是否也是 TiKV 慢?该 SQL 的 process_time 是否长?
最好也有 slow log 和 tikv log。

本地特征还不太一样,tikv日志并没有异常:

然后我检查了241和242的kv日志:

242:

[2020/09/08 03:27:39.247 -04:00] [INFO] [store.rs:2045] [“broadcasting unreachable”] [unreachable_store_id=81081] [store_id=4]
[2020/09/08 03:27:39.255 -04:00] [WARN] [raft_client.rs:199] [“send to 192.168.0.230:20170 fail, the gRPC connection could be broken”]
[2020/09/08 03:27:39.256 -04:00] [ERROR] [transport.rs:163] [“send raft msg err”] [err=“Other("[src/server/raft_client.rs:208]: RaftClient send fail")”]
[2020/09/08 03:27:39.256 -04:00] [INFO] [transport.rs:144] [“resolve store address ok”] [addr=192.168.0.230:20170] [store_id=81081]
[2020/09/08 03:27:39.256 -04:00] [INFO] [raft_client.rs:48] [“server: new connection with tikv endpoint”] [addr=192.168.0.230:20170]
[2020/09/08 03:27:39.256 -04:00] [INFO] [] [“Connect failed: {"created":"@1599550059.256557422","description":"Failed to connect to remote host: No route to host","errno":113,"file":"/rust/registry/src/github.com-1ecc6299db9ec823/grpcio-sys-0.5.3/grpc/src/core/lib/iomgr/tcp_client_posix.cc","file_line":205,"os_error":"No route to host","syscall":"getsockopt(SO_ERROR)","target_address":"ipv4:192.168.0.230:20170"}”]
[2020/09/08 03:27:39.256 -04:00] [INFO] [] [“Subchannel 0x7f2a764dc0c0: Retry in 999 milliseconds”]
[2020/09/08 03:27:39.256 -04:00] [WARN] [raft_client.rs:296] [“RPC batch_raft fail”] [err=“Some(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some("failed to connect to all addresses") }))”] [sink_err=“Some(RpcFinished(Some(RpcStatus { status: 14-UNAVAILABLE, details: Some("failed to connect to all addresses") })))”] [to_addr=192.168.0.230:20170]
[2020/09/08 03:27:39.286 -04:00] [WARN] [raft_client.rs:199] [“send to 192.168.0.230:20170 fail, the gRPC connection could be broken”]
[2020/09/08 03:27:39.286 -04:00] [ERROR] [transport.rs:163] [“send raft msg err”] [err=“Other("[src/server/raft_client.rs:208]: RaftClient send fail")”]
[2020/09/08 03:27:39.286 -04:00] [INFO] [transport.rs:144] [“resolve store address ok”] [addr=192.168.0.230:20170] [store_id=81081]
[2020/09/08 03:27:39.286 -04:00] [INFO] [raft_client.rs:48] [“server: new connection with tikv endpoint”] [addr=192.168.0.230:20170]
[2020/09/08 03:27:40.287 -04:00] [INFO] [] [“Connect failed: {"created":"@1599550060.287175501","description":"Failed to connect to remote host: No route to host","errno":113,"file":"/rust/registry/src/github.com-1ecc6299db9ec823/grpcio-sys-0.5.3/grpc/src/core/lib/iomgr/tcp_client_posix.cc","file_line":205,"os_error":"No route to host","syscall":"getsockopt(SO_ERROR)","target_address":"ipv4:192.168.0.230:20170"}”]
[2020/09/08 03:27:40.287 -04:00] [INFO] [] [“Subchannel 0x7f2a764dc440: Retry immediately”]
[2020/09/08 03:27:40.287 -04:00] [INFO] [] [“Failed to connect to channel, retrying”]
[2020/09/08 03:27:40.287 -04:00] [WARN] [raft_client.rs:296] [“RPC batch_raft fail”] [err=“Some(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some("failed to connect to all addresses") }))”] [sink_err=“Some(RpcFinished(Some(RpcStatus { status: 14-UNAVAILABLE, details: Some("failed to connect to all addresses") })))”] [to_addr=192.168.0.230:20170]
[2020/09/08 03:27:40.301 -04:00] [WARN] [raft_client.rs:199] [“send to 192.168.0.230:20170 fail, the gRPC connection could be broken”]
[2020/09/08 03:27:40.301 -04:00] [ERROR] [transport.rs:163] [“send raft msg err”] [err=“Other("[src/server/raft_client.rs:208]: RaftClient send fail")”]
[2020/09/08 03:27:40.301 -04:00] [INFO] [transport.rs:144] [“resolve store address ok”] [addr=192.168.0.230:20170] [store_id=81081]
[2020/09/08 03:27:40.301 -04:00] [INFO] [raft_client.rs:48] [“server: new connection with tikv endpoint”] [addr=192.168.0.230:20170]
[2020/09/08 03:27:40.302 -04:00] [INFO] [transport.rs:144] [“resolve store address ok”] [addr=192.168.0.230:20170] [store_id=81081]
[2020/09/08 03:27:43.307 -04:00] [INFO] [] [“Connect failed: {"created":"@1599550063.307198294","description":"Failed to connect to remote host: No route to host","errno":113,"file":"/rust/registry/src/github.com-1ecc6299db9ec823/grpcio-sys-0.5.3/grpc/src/core/lib/iomgr/tcp_client_posix.cc","file_line":205,"os_error":"No route to host","syscall":"getsockopt(SO_ERROR)","target_address":"ipv4:192.168.0.230:20170"}”]
[2020/09/08 03:27:43.307 -04:00] [INFO] [] [“Subchannel 0x7f2a764dcb40: Retry immediately”]
[2020/09/08 03:27:43.307 -04:00] [INFO] [] [“Failed to connect to channel, retrying”]
[2020/09/08 03:27:43.307 -04:00] [WARN] [raft_client.rs:296] [“RPC batch_raft fail”] [err=“Some(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some("failed to connect to all addresses") }))”] [sink_err=“Some(RpcFinished(Some(RpcStatus { status: 14-UNAVAILABLE, details: Some("failed to connect to all addresses") })))”] [to_addr=192.168.0.230:20170]
[2020/09/08 03:27:43.311 -04:00] [WARN] [raft_client.rs:199] [“send to 192.168.0.230:20170 fail, the gRPC connection could be broken”]
[2020/09/08 03:27:43.312 -04:00] [ERROR] [transport.rs:163] [“send raft msg err”] [err=“Other("[src/server/raft_client.rs:208]: RaftClient send fail")”]
[2020/09/08 03:27:43.312 -04:00] [INFO] [transport.rs:144] [“resolve store address ok”] [addr=192.168.0.230:20170] [store_id=81081]
[2020/09/08 03:27:43.312 -04:00] [INFO] [raft_client.rs:48] [“server: new connection with tikv endpoint”] [addr=192.168.0.230:20170]
[2020/09/08 03:27:44.315 -04:00] [INFO] [] [“Connect failed: {"created":"@1599550064.315181528","description":"Failed to connect to remote host: No route to host","errno":113,"file":"/rust/registry/src/github.com-1ecc6299db9ec823/grpcio-sys-0.5.3/grpc/src/core/lib/iomgr/tcp_client_posix.cc","file_line":205,"os_error":"No route to host","syscall":"getsockopt(SO_ERROR)","target_address":"ipv4:192.168.0.230:20170"}”]
[2020/09/08 03:27:44.315 -04:00] [INFO] [] [“Subchannel 0x7f2a764dcec0: Retry immediately”]
[2020/09/08 03:27:44.315 -04:00] [INFO] [] [“Failed to connect to channel, retrying”]
[2020/09/08 03:27:44.315 -04:00] [WARN] [raft_client.rs:296] [“RPC batch_raft fail”] [err=“Some(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some("failed to connect to all addresses") }))”] [sink_err=“Some(RpcFinished(Some(RpcStatus { status: 14-UNAVAILABLE, details: Some("failed to connect to all addresses") })))”] [to_addr=192.168.0.230:20170]

241:

[2020/09/08 03:27:38.282 -04:00] [WARN] [raft_client.rs:199] [“send to 192.168.0.230:20170 fail, the gRPC connection could be broken”]
[2020/09/08 03:27:38.282 -04:00] [ERROR] [transport.rs:163] [“send raft msg err”] [err=“Other("[src/server/raft_client.rs:208]: RaftClient send fail")”]
[2020/09/08 03:27:38.283 -04:00] [INFO] [transport.rs:144] [“resolve store address ok”] [addr=192.168.0.230:20170] [store_id=81081]
[2020/09/08 03:27:38.283 -04:00] [INFO] [raft_client.rs:48] [“server: new connection with tikv endpoint”] [addr=192.168.0.230:20170]
[2020/09/08 03:27:41.289 -04:00] [INFO] [] [“Connect failed: {"created":"@1599550061.289131863","description":"Failed to connect to remote host: No route to host","errno":113,"file":"/rust/registry/src/github.com-1ecc6299db9ec823/grpcio-sys-0.5.3/grpc/src/core/lib/iomgr/tcp_client_posix.cc","file_line":205,"os_error":"No route to host","syscall":"getsockopt(SO_ERROR)","target_address":"ipv4:192.168.0.230:20170"}”]
[2020/09/08 03:27:41.289 -04:00] [INFO] [] [“Subchannel 0x7fd4db05a400: Retry immediately”]
[2020/09/08 03:27:41.289 -04:00] [INFO] [] [“Failed to connect to channel, retrying”]
[2020/09/08 03:27:41.289 -04:00] [WARN] [raft_client.rs:296] [“RPC batch_raft fail”] [err=“Some(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some("failed to connect to all addresses") }))”] [sink_err=“Some(RpcFinished(Some(RpcStatus { status: 14-UNAVAILABLE, details: Some("failed to connect to all addresses") })))”] [to_addr=192.168.0.230:20170]
[2020/09/08 03:27:41.312 -04:00] [INFO] [] [“Connect failed: {"created":"@1599550061.312409626","description":"Failed to connect to remote host: No route to host","errno":113,"file":"/rust/registry/src/github.com-1ecc6299db9ec823/grpcio-sys-0.5.3/grpc/src/core/lib/iomgr/tcp_client_posix.cc","file_line":205,"os_error":"No route to host","syscall":"getsockopt(SO_ERROR)","target_address":"ipv4:192.168.0.230:20170"}”]
[2020/09/08 03:27:41.312 -04:00] [INFO] [] [“Subchannel 0x7fd4db05a400: Retry in 1853 milliseconds”]
[2020/09/08 03:27:42.341 -04:00] [WARN] [store.rs:595] [“[store 1] handle 1 pending peers include 1 ready, 0 entries, 1 messages and 0 snapshots”] [takes=3772]
[2020/09/08 03:27:42.342 -04:00] [WARN] [store.rs:595] [“[store 1] handle 1 pending peers include 1 ready, 1 entries, 1 messages and 0 snapshots”] [takes=3433]
[2020/09/08 03:27:42.351 -04:00] [WARN] [raft_client.rs:199] [“send to 192.168.0.230:20170 fail, the gRPC connection could be broken”]
[2020/09/08 03:27:42.351 -04:00] [ERROR] [transport.rs:163] [“send raft msg err”] [err=“Other("[src/server/raft_client.rs:208]: RaftClient send fail")”]
[2020/09/08 03:27:42.354 -04:00] [INFO] [transport.rs:144] [“resolve store address ok”] [addr=192.168.0.230:20170] [store_id=81081]
[2020/09/08 03:27:42.354 -04:00] [INFO] [raft_client.rs:48] [“server: new connection with tikv endpoint”] [addr=192.168.0.230:20170]
[2020/09/08 03:27:42.354 -04:00] [INFO] [] [“Connect failed: {"created":"@1599550062.354638243","description":"Failed to connect to remote host: No route to host","errno":113,"file":"/rust/registry/src/github.com-1ecc6299db9ec823/grpcio-sys-0.5.3/grpc/src/core/lib/iomgr/tcp_client_posix.cc","file_line":205,"os_error":"No route to host","syscall":"getsockopt(SO_ERROR)","target_address":"ipv4:192.168.0.230:20170"}”]
[2020/09/08 03:27:42.354 -04:00] [INFO] [] [“Subchannel 0x7fd457bcda80: Retry in 1000 milliseconds”]
[2020/09/08 03:27:42.354 -04:00] [WARN] [raft_client.rs:296] [“RPC batch_raft fail”] [err=“Some(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some("failed to connect to all addresses") }))”] [sink_err=“Some(RpcFinished(Some(RpcStatus { status: 14-UNAVAILABLE, details: Some("failed to connect to all addresses") })))”] [to_addr=192.168.0.230:20170]
[2020/09/08 03:27:42.355 -04:00] [WARN] [raft_client.rs:199] [“send to 192.168.0.230:20170 fail, the gRPC connection could be broken”]
[2020/09/08 03:27:42.355 -04:00] [ERROR] [transport.rs:163] [“send raft msg err”] [err=“Other("[src/server/raft_client.rs:208]: RaftClient send fail")”]
[2020/09/08 03:27:42.356 -04:00] [INFO] [transport.rs:144] [“resolve store address ok”] [addr=192.168.0.230:20170] [store_id=81081]
[2020/09/08 03:27:42.356 -04:00] [INFO] [raft_client.rs:48] [“server: new connection with tikv endpoint”] [addr=192.168.0.230:20170]
[2020/09/08 03:27:43.357 -04:00] [INFO] [] [“Connect failed: {"created":"@1599550063.357088229","description":"Failed to connect to remote host: No route to host","errno":113,"file":"/rust/registry/src/github.com-1ecc6299db9ec823/grpcio-sys-0.5.3/grpc/src/core/lib/iomgr/tcp_client_posix.cc","file_line":205,"os_error":"No route to host","syscall":"getsockopt(SO_ERROR)","target_address":"ipv4:192.168.0.230:20170"}”]
[2020/09/08 03:27:43.357 -04:00] [INFO] [] [“Subchannel 0x7fd4db05ab00: Retry immediately”]
[2020/09/08 03:27:43.357 -04:00] [INFO] [] [“Failed to connect to channel, retrying”]
[2020/09/08 03:27:43.357 -04:00] [WARN] [raft_client.rs:296] [“RPC batch_raft fail”] [err=“Some(RpcFailure(RpcStatus { status: 14-UNAVAILABLE, details: Some("failed to connect to all addresses") }))”] [sink_err=“Some(RpcFinished(Some(RpcStatus { status: 14-UNAVAILABLE, details: Some("failed to connect to all addresses") })))”] [to_addr=192.168.0.230:20170]

都是230节点tiflash错误的日志,这个表没加入tiflash。另外时间是要+12的,所以面板的15:27就是日志的03:27

麻烦把所有节点的 slow log 和 tikv log 发上来,我仔细分析下,谢谢!

所有节点该时间段log:
logs.tar (2.8 MB)
slow日志:

tidb_slow_query.log (4.8 MB)