After importing with TiDB Lightning, COUNT queries are slow. On some large tables they fail with [1105] [HY000]: Execution terminated due to exceeding the deadline.

To help us resolve your issue efficiently, please provide the following information; a clear problem description gets answered faster:

【TiDB Version】exported from v4.0.3, imported into v4.0.12

【Problem Description】COUNT is much slower than on the original database. For the same table, the original cluster takes a little over 4 s; the new one takes more than 20 s.


If your question concerns performance tuning or troubleshooting, please download and run the diagnostic script, then select all of the terminal output and copy-paste it into your post.

Did you configure Lightning to run ANALYZE during the import? If not, run ANALYZE TABLE manually first, then try again.
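
A minimal sketch of that suggestion, assuming the database and table names that appear in the logs further down (game_perceptio.power_applications); substitute your own:

    -- Rebuild statistics for the imported table.
    ANALYZE TABLE game_perceptio.power_applications;

    -- Then check statistics health; values close to 100 mean fresh stats.
    SHOW STATS_HEALTHY WHERE Db_name = 'game_perceptio' AND Table_name = 'power_applications';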

For the first table, COUNT still takes more than 20 seconds after ANALYZE TABLE, versus a little over 4 seconds on the original database.
For the second table, the one whose COUNT fails with [1105] [HY000]: Execution terminated due to exceeding the deadline, ANALYZE TABLE ran for more than 2,000 seconds and then failed with SQL error [9002] [HY000]: TiKV server timeout.
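
Not from the thread, but a long-running ANALYZE like the one above can be watched from another session; a minimal sketch:

    -- Lists analyze jobs with processed row counts and state, so you can
    -- tell whether a multi-thousand-second job is progressing or stuck.
    SHOW ANALYZE STATUS;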

What logs do you need me to provide?

ql="/* ApplicationName=DBeaver 7.3.1 - SQLEditor <Script-12.sql> / analyze table game_perceptio.power_applications\r"]
[2021/05/17 20:42:51.961 +08:00] [INFO] [region_cache.go:840] [“switch region leader to specific leader due to kv return NotLeader”] [regionID=3314] [currIdx=0] [leaderStoreID=2]
[2021/05/17 21:07:28.259 +08:00] [WARN] [backoff.go:329] [“tikvRPC backoffer.maxSleep 40000ms is exceeded, errors:\ send tikv request error: wait recvLoop: context deadline exceeded, ctx: region ID: 3314, meta: id:3314 start_key:“t\200\000\000\000\000\000\001\177_r\200\000\000\000\004t\376\356” end_key:“t\200\000\000\000\000\000\001\177_r\200\000\000\000\0249\363\357” region_epoch:<conf_ver:5 version:431 > peers:<id:3315 store_id:1 > peers:<id:3316 store_id:2 > peers:<id:3317 store_id:7 > , peer: id:3316 store_id:2 , addr: 10.144.128.217:20160, idx: 1, reqStoreType: TiKvOnly, runStoreType: tikv, try next peer later at 2021-05-17T21:05:24.904417364+08:00\ send tikv request error: wait recvLoop: context deadline exceeded, ctx: region ID: 3314, meta: id:3314 start_key:“t\200\000\000\000\000\000\001\177_r\200\000\000\000\004t\376\356” end_key:“t\200\000\000\000\000\000\001\177_r\200\000\000\000\0249\363\357” region_epoch:<conf_ver:5 version:431 > peers:<id:3315 store_id:1 > peers:<id:3316 store_id:2 > peers:<id:3317 store_id:7 > , peer: id:3316 store_id:2 , addr: 10.144.128.217:20160, idx: 1, reqStoreType: TiKvOnly, runStoreType: tikv, try next peer later at 2021-05-17T21:06:26.605264781+08:00\ send tikv request error: wait recvLoop: context deadline exceeded, ctx: region ID: 3314, meta: id:3314 start_key:“t\200\000\000\000\000\000\001\177_r\200\000\000\000\004t\376\356” end_key:“t\200\000\000\000\000\000\001\177_r\200\000\000\000\0249\363\357” region_epoch:<conf_ver:5 version:431 > peers:<id:3315 store_id:1 > peers:<id:3316 store_id:2 > peers:<id:3317 store_id:7 > , peer: id:3316 store_id:2 , addr: 10.144.128.217:20160, idx: 1, reqStoreType: TiKvOnly, runStoreType: tikv, try next peer later at 2021-05-17T21:07:28.259055395+08:00”]
[2021/05/17 21:07:28.259 +08:00] [ERROR] [analyze.go:108] [“analyze failed”] [conn=48] [error="[tikv:9002]TiKV server timeout"]
[2021/05/17 21:08:36.659 +08:00] [WARN] [backoff.go:329] [“tikvRPC backoffer.maxSleep 40000ms is exceeded, errors:\ send tikv request error: wait recvLoop: context deadline exceeded, ctx: region ID: 4386, meta: id:4386 start_key:“t\200\000\000\000\000\000\001\177_i\200\000\000\000\000\000\000\001\003\200\000\000\000\000-\245\365\003\200\000\000\000n\205L\\” end_key:“t\200\000\000\000\000\000\001\177_i\200\000\000\000\000\000\000\001\003\200\000\000\000\000Bf\027\365\211O \223” region_epoch:<conf_ver:5 version:561 > peers:<id:4387 store_id:1 > peers:<id:4388 store_id:2 > peers:<id:4389 store_id:7 > , peer: id:4387 store_id:1 , addr: 10.144.135.198:20160, idx: 0, reqStoreType: TiKvOnly, runStoreType: tikv, try next peer later at 2021-05-17T21:06:33.697953166+08:00\ send tikv request error: wait recvLoop: context deadline exceeded, ctx: region ID: 4386, meta: id:4386 start_key:“t\200\000\000\000\000\000\001\177_i\200\000\000\000\000\000\000\001\003\200\000\000\000\000-\245\365\003\200\000\000\000n\205L\\” end_key:“t\200\000\000\000\000\000\001\177_i\200\000\000\000\000\000\000\001\003\200\000\000\000\000Bf\027\365\211O \223” region_epoch:<conf_ver:5 version:561 > peers:<id:4387 store_id:1 > peers:<id:4388 store_id:2 > peers:<id:4389 store_id:7 > , peer: id:4387 store_id:1 , addr: 10.144.135.198:20160, idx: 0, reqStoreType: TiKvOnly, runStoreType: tikv, try next peer later at 2021-05-17T21:07:35.59073582+08:00\ send tikv request error: wait recvLoop: context deadline exceeded, ctx: region ID: 4386, meta: id:4386 start_key:“t\200\000\000\000\000\000\001\177_i\200\000\000\000\000\000\000\001\003\200\000\000\000\000-\245\365\003\200\000\000\000n\205L\\” end_key:“t\200\000\000\000\000\000\001\177_i\200\000\000\000\000\000\000\001\003\200\000\000\000\000Bf\027\365\211O \223” region_epoch:<conf_ver:5 version:561 > peers:<id:4387 store_id:1 > peers:<id:4388 store_id:2 > peers:<id:4389 store_id:7 > , peer: id:4387 store_id:1 , addr: 10.144.135.198:20160, idx: 0, reqStoreType: TiKvOnly, runStoreType: tikv, try next peer later at 2021-05-17T21:08:36.659530442+08:00”]
[2021/05/17 21:08:36.662 +08:00] [WARN] [client_batch.go:622] [“send request is cancelled”] [to=10.144.135.56:20160] [cause=“context canceled”]
[2021/05/17 21:08:36.662 +08:00] [ERROR] [analyze.go:108] [“analyze failed”] [conn=48] [error="[tikv:9002]TiKV server timeout"]
[2021/05/17 21:08:36.662 +08:00] [INFO] [tidb.go:219] [“rollbackTxn for ddl/autocommit failed”]
[2021/05/17 21:08:36.662 +08:00] [WARN] [session.go:1383] [“run statement failed”] [conn=48] [schemaVersion=167] [error="[tikv:9002]TiKV server timeout"] [session="{\ “currDBName”: “”,\ “id”: 48,\ “status”: 2,\ “strictMode”: true,\ “user”: {\ “Username”: “root”,\ “Hostname”: “10.114.156.127”,\ “CurrentUser”: false,\ “AuthUsername”: “root”,\ “AuthHostname”: “%”\ }\ }"]
[2021/05/17 21:08:36.662 +08:00] [INFO] [conn.go:797] [“command dispatched failed”] [conn=48] [connInfo=“id:48, addr:10.114.156.127:14175 status:10, collation:utf8_general_ci, user:root”] [command=Query] [status=“inTxn:0, autocommit:1”] [sql="/
ApplicationName=DBeaver 7.3.1 - SQLEditor <Script-12.sql> */ analyze table game_perceptio.power_applications\r\ "] [txn_mode=PESSIMISTIC] [err="[tikv:9002]TiKV server timeout\ngithub.com/pingcap/errors.AddStack\ \t/home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/errors.go:174\ github.com/pingcap/errors.Trace\ \t/home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20201126102027-b0a155152ca3/juju_adaptor.go:15\ github.com/pingcap/tidb/store/tikv.(*RegionRequestSender).onSendFail\ \t/home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/src/github.com/pingcap/tidb/store/tikv/region_request.go:545\ngithub.com/pingcap/tidb/store/tikv.(*RegionRequestSender).sendReqToRegion\ \t/home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/src/github.com/pingcap/tidb/store/tikv/region_request.go:482\ngithub.com/pingcap/tidb/store/tikv.(*RegionRequestSender).SendReqCtx\ \t/home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/src/github.com/pingcap/tidb/store/tikv/region_request.go:313\ngithub.com/pingcap/tidb/store/tikv.(*clientHelper).SendReqCtx\ \t/home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:985\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).handleTaskOnce\ \t/home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:884\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).handleTask\ \t/home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:808\ngithub.com/pingcap/tidb/store/tikv.(*copIteratorWorker).run\ \t/home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/src/github.com/pingcap/tidb/store/tikv/coprocessor.go:538\ runtime.goexit\ \t/usr/local/go/src/runtime/asm_amd64.s:1357"]

Lightning runs ANALYZE automatically at the end of the import. Do you have the execution plans from 4.0.3 and 4.0.12 to compare?
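
A sketch of how such a comparison could be captured; the exact COUNT statement is an assumption here (the table name is taken from the plan posted later in the thread):

    -- Run the same statement on both the v4.0.3 and v4.0.12 clusters and
    -- compare the output. EXPLAIN ANALYZE executes the query and reports
    -- actual row counts and per-operator timings.
    EXPLAIN ANALYZE SELECT COUNT(*) FROM game_perceptio.fps_detail;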

Sorry, uploads to the community site keep failing, so here is the comparison of the two execution plans:
https://blog.csdn.net/damantou321/article/details/116977913?spm=1001.2014.3001.5501

https://blog.csdn.net/damantou321/article/details/116977632?spm=1001.2014.3001.5501

Hello, could you please take a look? The two execution plans are in the links above.

  1. It looks like this is where most of the time is spent (see the plan below).
  2. Are the two deployment environments identical in topology and hardware, including network bandwidth? You can run a ping test to verify.
  3. Were both clusters deployed with TiUP? Please run tiup cluster edit-config <cluster-name> and check the parameter configuration; a SQL sketch for the same comparison follows the plan below.
id | task | estRows | operator info | actRows | execution info | memory | disk
StreamAgg_3052 | root | 1 | funcs:count(Column#1498)->Column#31 | 1 | time:20.5s, loops:2 | 372 Bytes | N/A
└─IndexReader_3053 | root | 1 | index:StreamAgg_3036 | 30 | time:20.5s, loops:2, cop_task: {num: 30, max: 20.5s, min: 187.5ms, avg: 1.86s, p95: 12.4s, max_proc_keys: 60501113, p95_proc_keys: 36606943, tot_proc: 55.8s, tot_wait: 7ms, rpc_num: 30, rpc_time: 55.8s, copr_cache: disabled} | 340 Bytes | N/A
  └─StreamAgg_3036 | cop[tikv] | 1 | funcs:count(1)->Column#1498 | 30 | tikv_task:{proc max:20.5s, min:187ms, p80:668ms, p95:12.4s, iters:157973, tasks:30} | N/A | N/A
    └─IndexFullScan_3051 | cop[tikv] | 161745333 | table:fps_detail, partition:p_max, index:package_id(package_id), keep order:false | 161745333 | tikv_task:{proc max:20.5s, min:187ms, p80:668ms, p95:12.4s, iters:157973, tasks:30} | N/A | N/A
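
As mentioned in point 3, a sketch of pulling the same configuration from SQL on each cluster (SHOW CONFIG is available from v4.0), so the two outputs can be diffed:

    -- List every TiKV configuration item visible to this cluster; run on
    -- both clusters and compare. Adjust the WHERE filter as needed.
    SHOW CONFIG WHERE type = 'tikv';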

Hello, after some testing: when COUNT runs on the old TiDB cluster, monitoring shows the load spread evenly across the three TiKV nodes,
but on the new cluster only one TiKV shows high resource usage.

Splitting Regions didn't help either.
Could you tell me what kind of situation this might be? Network communication is fine and the hardware specs are identical.
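
One way to check where the hot spot comes from, sketched with the database and table names taken from the plan above (adjust to your schema): count how many Region leaders of the table each TiKV store holds.

    -- If most leaders sit on a single store, coprocessor reads for the
    -- COUNT will concentrate on that one TiKV.
    SELECT p.store_id, COUNT(*) AS leader_count
    FROM information_schema.tikv_region_status s
    JOIN information_schema.tikv_region_peers p ON s.region_id = p.region_id
    WHERE s.db_name = 'game_perceptio'
      AND s.table_name = 'fps_detail'
      AND p.is_leader = 1
    GROUP BY p.store_id;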

Please upload the Grafana monitoring data for the Overview, TiDB, and TiKV-Details dashboards.