Application times out getting TiDB connections after TiKV scale-out

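After scaling out TiKV, the application times out when getting TiDB connections. The cluster was 3 TiDB, 3 PD, 3 TiKV before and is 3 TiDB, 3 PD, 5 TiKV now. None of the nodes is under any noticeable load. Could you suggest how to troubleshoot this?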

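For connection timeouts, check the network first. Also, did the scale-out actually succeed? Have a look at the logs; they will help with the analysis.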

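The scale-out succeeded and data is being written to the new nodes. I checked the logs and found nothing abnormal. I also checked the network and found nothing abnormal there either.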

Could you paste the exact connection error?

The connection does not throw an error; it just takes a long time, 2 seconds.

I have sent the monitoring data to your mailbox, support@pingcap.com

One moment, let me take a look.

Could this be the gRPC Batch Message bug in the 3.0 GA release? It was fixed in 3.0.1.

How can I tell whether it is this bug?

If it is convenient, you could upgrade to 3.0.1.

What are the symptoms of the gRPC Batch Message bug? Is there any documentation about it? I would like to confirm that this is the problem first.

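When you say the connection is slow, do you mean writes are slow, or only that the application is slow to connect to tidb-server? If it is the connection to tidb-server that is slow, you can test how fast a direct connection to the DB is.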
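To tell connection latency apart from write latency, one option is to time a bare TCP connect to tidb-server from the application host. The following is a minimal sketch, not an official tool: the function name `connect_latencies` and its parameters are made up for illustration, and it measures only the TCP handshake, not the MySQL protocol handshake or any SQL.

```python
import socket
import time


def connect_latencies(host, port, attempts=5, timeout=5.0):
    """Return the time in seconds taken by each TCP connect to host:port."""
    samples = []
    for _ in range(attempts):
        start = time.perf_counter()
        # create_connection raises OSError if the connect fails or times out.
        with socket.create_connection((host, port), timeout=timeout):
            samples.append(time.perf_counter() - start)
    return samples
```

Calling `connect_latencies("<tidb-host>", 4000)` (4000 is the default tidb-server port; substitute your own address) and comparing the result with the 2 seconds seen by the application shows whether the delay is in reaching tidb-server or in the statements executed afterwards.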

I looked at the monitoring. It looks like the responses slow down because of write conflicts.

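It is probably the writes that make the connection slow. However, before the scale-out the same concurrency ran without any timeouts, and the timeouts only appeared after scaling out. How should I start ruling out write conflicts as the cause of the slowness?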

Check the TiKV logs; they will contain the details.

Is this a write conflict?
[2019/08/07 17:27:36.270 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "region epoch is not match" epoch_not_match { current_regions { id: 685 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000001292F4000FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF0000000132B1DF00FE region_epoch { conf_ver: 11 version: 33 } peers { id: 686 store_id: 4 } peers { id: 687 store_id: 5 } peers { id: 688 store_id: 7 } } current_regions { id: 1408 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF000000010E007C00FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000001292F4000FE region_epoch { conf_ver: 11 version: 33 } peers { id: 1409 store_id: 4 } peers { id: 1410 store_id: 5 } peers { id: 1411 store_id: 7 } } })"] [cid=276930]
[2019/08/07 17:27:36.275 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "region epoch is not match" epoch_not_match { current_regions { id: 685 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000001292F4000FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF0000000132B1DF00FE region_epoch { conf_ver: 11 version: 33 } peers { id: 686 store_id: 4 } peers { id: 687 store_id: 5 } peers { id: 688 store_id: 7 } } current_regions { id: 1408 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF000000010E007C00FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000001292F4000FE region_epoch { conf_ver: 11 version: 33 } peers { id: 1409 store_id: 4 } peers { id: 1410 store_id: 5 } peers { id: 1411 store_id: 7 } } })"] [cid=276931]
[2019/08/07 17:27:36.275 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "region epoch is not match" epoch_not_match { current_regions { id: 685 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000001292F4000FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF0000000132B1DF00FE region_epoch { conf_ver: 11 version: 33 } peers { id: 686 store_id: 4 } peers { id: 687 store_id: 5 } peers { id: 688 store_id: 7 } } current_regions { id: 1408 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF000000010E007C00FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000001292F4000FE region_epoch { conf_ver: 11 version: 33 } peers { id: 1409 store_id: 4 } peers { id: 1410 store_id: 5 } peers { id: 1411 store_id: 7 } } })"] [cid=276932]
[2019/08/07 17:27:36.275 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "region epoch is not match" epoch_not_match { current_regions { id: 685 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000001292F4000FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF0000000132B1DF00FE region_epoch { conf_ver: 11 version: 33 } peers { id: 686 store_id: 4 } peers { id: 687 store_id: 5 } peers { id: 688 store_id: 7 } } current_regions { id: 1408 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF000000010E007C00FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000001292F4000FE region_epoch { conf_ver: 11 version: 33 } peers { id: 1409 store_id: 4 } peers { id: 1410 store_id: 5 } peers { id: 1411 store_id: 7 } } })"] [cid=276933]
[2019/08/07 17:27:36.276 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "region epoch is not match" epoch_not_match { current_regions { id: 685 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000001292F4000FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF0000000132B1DF00FE region_epoch { conf_ver: 11 version: 33 } peers { id: 686 store_id: 4 } peers { id: 687 store_id: 5 } peers { id: 688 store_id: 7 } } current_regions { id: 1408 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF000000010E007C00FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000001292F4000FE region_epoch { conf_ver: 11 version: 33 } peers { id: 1409 store_id: 4 } peers { id: 1410 store_id: 5 } peers { id: 1411 store_id: 7 } } })"] [cid=276935]
[2019/08/07 17:27:36.282 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "region epoch is not match" epoch_not_match { current_regions { id: 685 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000001292F4000FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF0000000132B1DF00FE region_epoch { conf_ver: 11 version: 33 } peers { id: 686 store_id: 4 } peers { id: 687 store_id: 5 } peers { id: 688 store_id: 7 } } current_regions { id: 1408 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF000000010E007C00FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000001292F4000FE region_epoch { conf_ver: 11 version: 33 } peers { id: 1409 store_id: 4 } peers { id: 1410 store_id: 5 } peers { id: 1411 store_id: 7 } } })"] [cid=276934]
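The start_key and end_key values in these errors can be decoded to see which table and index the affected regions belong to. The sketch below assumes the usual TiDB key layout (`t` + table ID + `_i` + index ID or `_r` + row handle, with 8-byte integers stored big-endian with the sign bit flipped) wrapped in TiKV's memcomparable encoding (groups of 8 bytes, each followed by a marker byte equal to 0xFF minus the number of padding bytes). The function names are illustrative and not part of any TiDB tool.

```python
SIGN_MASK = 0x8000000000000000

# start_key of region 685, copied from the log above.
START_KEY = (
    "7480000000000000FF315F698000000000FF0000050131383536FF"
    "35373033FF363038FF0000000000FA0380FF00000001292F4000FE"
)


def decode_memcomparable(encoded):
    """Strip the marker byte and padding from each 9-byte group."""
    out = bytearray()
    for offset in range(0, len(encoded), 9):
        group = encoded[offset:offset + 9]
        if len(group) != 9:
            raise ValueError("truncated group")
        pad = 0xFF - group[8]
        if pad > 8:
            raise ValueError("invalid marker byte")
        out += group[:8 - pad]
        if pad:
            break  # a padded group is always the last one
    return bytes(out)


def decode_id(raw8):
    """Decode an 8-byte big-endian ID (IDs are positive, so no sign fix-up)."""
    return int.from_bytes(raw8, "big") ^ SIGN_MASK


def describe_key(hex_key):
    """Return (kind, table_id, index_id) for a table key given as hex."""
    raw = decode_memcomparable(bytes.fromhex(hex_key))
    if len(raw) < 11 or raw[:1] != b"t":
        raise ValueError("not a table key")
    table_id = decode_id(raw[1:9])
    if raw[9:11] == b"_r":
        return ("record", table_id, None)
    if raw[9:11] == b"_i" and len(raw) >= 19:
        return ("index", table_id, decode_id(raw[11:19]))
    raise ValueError("unrecognized key layout")


print(describe_key(START_KEY))
```

Under those assumptions, the key for region 685 decodes to an index key with table ID 49 and index ID 5, which narrows down which table the epoch_not_match errors relate to.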

I looked at the logs for 1000 and for 7000. The 1000 one has no timeouts and the 7000 one does, and the 7000 log clearly contains many errors like these:
[2019/08/07 18:23:44.427 +08:00] [ERROR] [endpoint.rs:454] [error-response] [err="region message: "region 3032 is missing" region_not_found { region_id: 3032 }"]
[2019/08/07 18:24:14.440 +08:00] [ERROR] [endpoint.rs:454] [error-response] [err="region message: "peer is not leader" not_leader { region_id: 3036 leader { id: 3038 store_id: 6 } }"]
[2019/08/07 18:24:23.586 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "region epoch is not match" epoch_not_match { current_regions { id: 2128 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF0000000008CDF800FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000000102CDB00FE region_epoch { conf_ver: 5 version: 34 } peers { id: 2129 store_id: 1 } peers { id: 2130 store_id: 6 } peers { id: 2131 store_id: 4 } } current_regions { id: 3040 start_key: 7480000000000000FF315F698000000000FF0000040133333732FF35333633FF333539FF3638353031FF3736FF303800000000FB03FF8000000002AA3C7BFF0000000000000000F7 end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF0000000008CDF800FE region_epoch { conf_ver: 5 version: 34 } peers { id: 3041 store_id: 1 } peers { id: 3042 store_id: 6 } peers { id: 3043 store_id: 4 } } })"] [cid=1647289]
[2019/08/07 18:24:23.586 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "region epoch is not match" epoch_not_match { current_regions { id: 2128 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF0000000008CDF800FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000000102CDB00FE region_epoch { conf_ver: 5 version: 34 } peers { id: 2129 store_id: 1 } peers { id: 2130 store_id: 6 } peers { id: 2131 store_id: 4 } } current_regions { id: 3040 start_key: 7480000000000000FF315F698000000000FF0000040133333732FF35333633FF333539FF3638353031FF3736FF303800000000FB03FF8000000002AA3C7BFF0000000000000000F7 end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF0000000008CDF800FE region_epoch { conf_ver: 5 version: 34 } peers { id: 3041 store_id: 1 } peers { id: 3042 store_id: 6 } peers { id: 3043 store_id: 4 } } })"] [cid=1647291]
[2019/08/07 18:24:23.586 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "region epoch is not match" epoch_not_match { current_regions { id: 2128 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF0000000008CDF800FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000000102CDB00FE region_epoch { conf_ver: 5 version: 34 } peers { id: 2129 store_id: 1 } peers { id: 2130 store_id: 6 } peers { id: 2131 store_id: 4 } } current_regions { id: 3040 start_key: 7480000000000000FF315F698000000000FF0000040133333732FF35333633FF333539FF3638353031FF3736FF303800000000FB03FF8000000002AA3C7BFF0000000000000000F7 end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF0000000008CDF800FE region_epoch { conf_ver: 5 version: 34 } peers { id: 3041 store_id: 1 } peers { id: 3042 store_id: 6 } peers { id: 3043 store_id: 4 } } })"] [cid=1647292]
[2019/08/07 18:24:23.586 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "region epoch is not match" epoch_not_match { current_regions { id: 2128 start_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF0000000008CDF800FE end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF00000000102CDB00FE region_epoch { conf_ver: 5 version: 34 } peers { id: 2129 store_id: 1 } peers { id: 2130 store_id: 6 } peers { id: 2131 store_id: 4 } } current_regions { id: 3040 start_key: 7480000000000000FF315F698000000000FF0000040133333732FF35333633FF333539FF3638353031FF3736FF303800000000FB03FF8000000002AA3C7BFF0000000000000000F7 end_key: 7480000000000000FF315F698000000000FF0000050131383536FF35373033FF363038FF0000000000FA0380FF0000000008CDF800FE region_epoch { conf_ver: 5 version: 34 } peers { id: 3041 store_id: 1 } peers { id: 3042 store_id: 6 } peers { id: 3043 store_id: 4 } } })"] [cid=1647293]
[2019/08/07 18:24:23.587 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "peer is not leader" not_leader { region_id: 3040 })"] [cid=1647294]
[2019/08/07 18:24:23.587 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "peer is not leader" not_leader { region_id: 3040 })"] [cid=1647301]
[2019/08/07 18:24:23.587 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "peer is not leader" not_leader { region_id: 3040 })"] [cid=1647304]
[2019/08/07 18:24:23.587 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "peer is not leader" not_leader { region_id: 3040 })"] [cid=1647305]
[2019/08/07 18:24:23.587 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "peer is not leader" not_leader { region_id: 3040 })"] [cid=1647306]
[2019/08/07 18:24:23.587 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "peer is not leader" not_leader { region_id: 3040 })"] [cid=1647309]
[2019/08/07 18:24:23.587 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "peer is not leader" not_leader { region_id: 3040 })"] [cid=1647310]
[2019/08/07 18:24:23.587 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "peer is not leader" not_leader { region_id: 3040 })"] [cid=1647311]
[2019/08/07 18:24:23.588 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "peer is not leader" not_leader { region_id: 3040 })"] [cid=1647312]
[2019/08/07 18:24:23.588 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "peer is not leader" not_leader { region_id: 3040 })"] [cid=1647313]
[2019/08/07 18:24:23.588 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "peer is not leader" not_leader { region_id: 3040 })"] [cid=1647314]
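To compare the two logs more systematically, the region errors can be tallied by kind and region ID. This is a rough sketch that relies only on the error text visible in the log above (`epoch_not_match`, `not_leader`, `region_not_found`); the function name `count_region_errors` is made up, and the sample lines are a few entries copied from the log.

```python
import re
from collections import Counter

# Matches the structured part of the error, e.g. "not_leader { region_id: 3040".
# For epoch_not_match, the first region listed under current_regions is used.
ERROR_RE = re.compile(
    r"(epoch_not_match|not_leader|region_not_found) \{ "
    r"(?:current_regions \{ id|region_id): (\d+)"
)

SAMPLE = """\
[2019/08/07 18:23:44.427 +08:00] [ERROR] [endpoint.rs:454] [error-response] [err="region message: "region 3032 is missing" region_not_found { region_id: 3032 }"]
[2019/08/07 18:24:14.440 +08:00] [ERROR] [endpoint.rs:454] [error-response] [err="region message: "peer is not leader" not_leader { region_id: 3036 leader { id: 3038 store_id: 6 } }"]
[2019/08/07 18:24:23.587 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "peer is not leader" not_leader { region_id: 3040 })"] [cid=1647294]
[2019/08/07 18:24:23.587 +08:00] [ERROR] [process.rs:179] ["get snapshot failed"] [err="Request(message: "peer is not leader" not_leader { region_id: 3040 })"] [cid=1647301]
"""


def count_region_errors(text):
    """Count region errors in log text, keyed by (error kind, region ID)."""
    counts = Counter()
    for match in ERROR_RE.finditer(text):
        counts[(match.group(1), int(match.group(2)))] += 1
    return counts


for (kind, region_id), n in count_region_errors(SAMPLE).most_common():
    print(f"{kind:18} region {region_id:<6} x{n}")
```

Running the same function over the full text of each TiKV log shows whether the errors cluster on a handful of regions and whether they appear only in the run that times out.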