【TiDB Environment】Production
【TiDB Version】
【Problem: Symptoms and Impact】
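We upgraded from v4.0.14 to v6.5.8 using br, with cdc for real-time replication. After traffic was switched over, GC got stuck: GC on the tidb server advances normally, but on the monitoring dashboard the kv GC time stays at a much earlier point in time. We tried restarting the tidb server that is the GC leader and switching PD, but it still has not recovered.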
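To judge how far behind a safe point is, it can help to convert the TSO into wall-clock time. The sketch below assumes the usual PD TSO layout (physical milliseconds in the high bits, an 18-bit logical counter in the low bits) and uses the safePoint value from the gc_worker log in the attachment section. The function name `decode_tso` and the +08:00 default are illustrative only.

```python
from datetime import datetime, timedelta, timezone

LOGICAL_BITS = 18  # assumption: standard PD TSO layout


def decode_tso(tso: int, utc_offset_hours: int = 8):
    """Split a TSO into (physical_ms, logical, timezone-aware datetime)."""
    physical_ms = tso >> LOGICAL_BITS
    logical = tso & ((1 << LOGICAL_BITS) - 1)
    tz = timezone(timedelta(hours=utc_offset_hours))
    epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
    # Integer timedelta arithmetic avoids float rounding on the milliseconds.
    when = (epoch + timedelta(milliseconds=physical_ms)).astimezone(tz)
    return physical_ms, logical, when


if __name__ == "__main__":
    ms, logical, when = decode_tso(451191123700350976)
    print(ms, logical, when.isoformat())
```

For the safePoint in the log this gives 2024-07-17 03:19:05.854 +08:00, roughly 72 hours before the log timestamp. The same conversion can be applied to whatever safe point the kv GC panel reports, to compare the two.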
【Resource Configuration】Go to TiDB Dashboard - Cluster Info - Hosts and take a screenshot of this page
【Attachments: Screenshots/Logs/Monitoring】
[2024/07/20 03:19:07.682 +08:00] [INFO] [range_task.go:310] ["canceling range task because of error"] [name=resolve-locks-runner] [startKey=7480000000000011285f728000000001d2b53c] [endKey=7480000000000011285f728000000002eb4124] [error="context canceled"]
[2024/07/20 03:19:07.682 +08:00] [INFO] [range_task.go:310] ["canceling range task because of error"] [name=resolve-locks-runner] [startKey=748000000000000f6e5f728000000000b8818c] [endKey=748000000000000f6e5f728000000000c599c4] [error="context canceled"]
[2024/07/20 03:19:07.682 +08:00] [INFO] [range_task.go:310] ["canceling range task because of error"] [name=resolve-locks-runner] [startKey=7480000000000011285f6980000000000000030142786f326b385967ff4846424b356b4870ff615a6e626d375653ff6a35697339456168ff0000000000000000f703800000000186183c] [endKey=7480000000000011285f7280000000009a5800] [error="context canceled"]
[2024/07/20 03:19:07.682 +08:00] [INFO] [region_request.go:1285] ["send request failed, err: context canceled"] [req-ts=451191123700350976] [req-type=ScanLock] [region="{ region id: 76937, ver: 11838, confVer: 124 }"] [region-is-valid=true] [retry-times=0] [replica-read-type=leader] [replica-selector-state=accessKnownLeader] [stale-read=false] [replica-status="peer: 77131, store: 1, isEpochStale: false, attempts: 0, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable; peer: 77132, store: 4, isEpochStale: false, attempts: 1, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable; peer: 77130, store: 5, isEpochStale: false, attempts: 0, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable; peer: 2352384, store: 7, isEpochStale: false, attempts: 0, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable; peer: 2352556, store: 11, isEpochStale: false, attempts: 0, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable; peer: 2419708, store: 2314744, isEpochStale: false, attempts: 0, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable"] [total-backoff-ms=0] [total-backoff-times=0] [total-region-errors=]
[2024/07/20 03:19:07.682 +08:00] [INFO] [range_task.go:310] ["canceling range task because of error"] [name=resolve-locks-runner] [startKey=748000000000000f6e5f728000000000ac81ac] [endKey=748000000000000f6e5f728000000000b8818c] [error="context canceled"]
[2024/07/20 03:19:07.682 +08:00] [INFO] [range_task.go:310] ["canceling range task because of error"] [name=resolve-locks-runner] [startKey=7480000000000011285f7280000000009a5800] [endKey=7480000000000011285f728000000001d2b53c] [error="context canceled"]
[2024/07/20 03:19:07.682 +08:00] [INFO] [region_request.go:1285] ["send request failed, err: context canceled"] [req-ts=451191123700350976] [req-type=ScanLock] [region="{ region id: 2484216, ver: 12517, confVer: 45886 }"] [region-is-valid=true] [retry-times=0] [replica-read-type=leader] [replica-selector-state=accessKnownLeader] [stale-read=false] [replica-status="peer: 2484217, store: 2314744, isEpochStale: false, attempts: 0, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable; peer: 2484218, store: 10, isEpochStale: false, attempts: 0, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable; peer: 2484219, store: 6, isEpochStale: false, attempts: 0, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable; peer: 2484220, store: 248, isEpochStale: false, attempts: 1, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable; peer: 2484221, store: 8, isEpochStale: false, attempts: 0, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable; peer: 2484222, store: 5, isEpochStale: false, attempts: 0, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable"] [total-backoff-ms=0] [total-backoff-times=0] [total-region-errors=]
[2024/07/20 03:19:07.682 +08:00] [INFO] [range_task.go:310] ["canceling range task because of error"] [name=resolve-locks-runner] [startKey=7480000000000011285f698000000000000001038000000003f87c0d] [endKey=7480000000000011285f6980000000000000020419b3906d9a0000000380000000069bd2e7] [error="context canceled"]
[2024/07/20 03:19:07.682 +08:00] [INFO] [region_request.go:1285] ["send request failed, err: context canceled"] [req-ts=451191123700350976] [req-type=ScanLock] [region="{ region id: 4791199, ver: 12370, confVer: 16960 }"] [region-is-valid=true] [retry-times=0] [replica-read-type=leader] [replica-selector-state=accessKnownLeader] [stale-read=false] [replica-status="peer: 4791200, store: 2314744, isEpochStale: false, attempts: 0, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable; peer: 4845168, store: 5, isEpochStale: false, attempts: 0, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable; peer: 4845233, store: 1, isEpochStale: false, attempts: 1, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable; peer: 4845316, store: 12, isEpochStale: false, attempts: 0, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable; peer: 4866911, store: 10, isEpochStale: false, attempts: 0, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable; peer: 4870837, store: 7, isEpochStale: false, attempts: 0, replica-epoch: 0, store-epoch: 0, store-state: resolved, store-liveness-state: reachable"] [total-backoff-ms=0] [total-backoff-times=0] [total-region-errors=]
[2024/07/20 03:19:07.682 +08:00] [INFO] [range_task.go:310] ["canceling range task because of error"] [name=resolve-locks-runner] [startKey=748000000000000f6e5f728000000001156f98] [endKey=74800000000000108e5f72800000000008cff2] [error="context canceled"]
[2024/07/20 03:19:07.682 +08:00] [INFO] [range_task.go:233] ["range task failed"] [name=resolve-locks-runner] [startKey=] [endKey=] ["cost time"=1.788887891s] [error="context canceled"]
[2024/07/20 03:19:07.682 +08:00] [ERROR] [gc_worker.go:1057] ["[gc worker] resolve locks failed"] [uuid=642e69e50b00080] [safePoint=451191123700350976] [error="context canceled"]
[2024/07/20 03:19:07.682 +08:00] [ERROR] [gc_worker.go:628] ["[gc worker] resolve locks returns an error"] [uuid=642e69e50b00080] [error="context canceled"]
[2024/07/20 03:19:07.682 +08:00] [ERROR] [gc_worker.go:226] ["[gc worker] runGCJob"] [error="context canceled"]
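The startKey/endKey values in the range task entries above are hex strings, and decoding them shows which tables the canceled resolve-locks ranges belong to. This is a rough sketch, assuming the logged keys are plain TiDB table keys (prefix `t`, an 8-byte table ID stored big-endian with the sign bit flipped, then `_r` for a row or `_i` for an index). `decode_key` and `decode_int` are hypothetical helper names, not part of any TiDB tool.

```python
import struct

SIGN_MASK = 0x8000000000000000


def decode_int(b: bytes) -> int:
    """Decode a comparable int64: big-endian with the sign bit flipped."""
    (u,) = struct.unpack(">Q", b)
    u ^= SIGN_MASK
    # Reinterpret as signed two's complement.
    return u - (1 << 64) if u & SIGN_MASK else u


def decode_key(hex_key: str) -> dict:
    """Best-effort decode of a TiDB table key given as a hex string."""
    raw = bytes.fromhex(hex_key)
    if len(raw) < 11 or raw[0:1] != b"t":
        return {"kind": "unknown"}
    table_id = decode_int(raw[1:9])
    sep, rest = raw[9:11], raw[11:]
    if sep == b"_r" and len(rest) == 8:
        # Integer row handle; other handle encodings fall through to "unknown".
        return {"kind": "row", "table_id": table_id, "handle": decode_int(rest)}
    if sep == b"_i" and len(rest) >= 8:
        # Only the index ID is decoded; the indexed values are left alone.
        return {"kind": "index", "table_id": table_id,
                "index_id": decode_int(rest[:8])}
    return {"kind": "unknown", "table_id": table_id}


if __name__ == "__main__":
    print(decode_key("7480000000000011285f728000000001d2b53c"))
    print(decode_key("7480000000000011285f698000000000000001038000000003f87c0d"))
```

Under that assumption, the first key in the log decodes to a row key in table ID 4392. The table ID can then be mapped to a table name, for example via `information_schema.tables`.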