【TiDB Environment】Test
【TiDB Version】7.5.0
【Reproduction Path】PD went down for an unknown reason shortly after 1 a.m. The PD log is as follows:
[2024/05/08 01:27:00.398 +08:00] [INFO] [grpc_service.go:1948] ["update service GC safe point"] [service-id=gc_worker] [expire-at=-9223372035139672989] [safepoint=449603756461391872]
[2024/05/08 01:28:40.520 +08:00] [INFO] [grpc_service.go:1893] ["updated gc safe point"] [safe-point=449603756461391872]
[2024/05/08 01:37:00.396 +08:00] [INFO] [grpc_service.go:1948] ["update service GC safe point"] [service-id=gc_worker] [expire-at=-9223372035139672389] [safepoint=449603913747791872]
[2024/05/08 01:38:37.465 +08:00] [INFO] [lease.go:187] ["stop lease keep alive worker"] [purpose="leader election"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [allocator_manager.go:772] ["exit allocator daemon"] []
[2024/05/08 01:38:37.466 +08:00] [INFO] [coordinator.go:160] ["patrol regions has been stopped"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [coordinator.go:344] ["drive slow node scheduler is stopped"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [coordinator.go:326] ["drive push operator has been stopped"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [allocator_manager.go:316] ["exit allocator loop"] []
[2024/05/08 01:38:37.466 +08:00] [INFO] [scheduler_controller.go:364] ["scheduler has been stopped"] [scheduler-name=balance-hot-region-scheduler] [error="context canceled"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [coordinator.go:374] ["coordinator is stopping"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [scheduler_controller.go:364] ["scheduler has been stopped"] [scheduler-name=balance-leader-scheduler] [error="context canceled"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [main.go:284] ["got signal to exit"] [signal=hangup]
[2024/05/08 01:38:37.466 +08:00] [INFO] [server.go:127] ["region syncer has been stopped"]
[2024/05/08 01:38:37.466 +08:00] [INFO] [scheduler_controller.go:364] ["scheduler has been stopped"] [scheduler-name=transfer-witness-leader-scheduler] [error="context canceled"]
All subsequent log entries show the individual modules being stopped. Is the shutdown related to these log messages?
stop lease keep alive worker
drive slow node scheduler is stopped
drive push operator has been stopped
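【Problem: Symptoms and Impact】PD went down in the early morning. How should I investigate the cause further? This does not seem to be the first time it has happened.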
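
One possible starting point: the excerpt above contains a `got signal to exit` entry with `signal=hangup`, which suggests the process was asked to exit by a signal rather than stopping on its own. Since this may have happened before, it could help to scan the older PD logs for the same entry. The following is only a minimal sketch, assuming every entry sits on a single line in the bracketed format shown in the excerpt; `parse_line` and `find_exit_signals` are hypothetical helper names, not part of PD or any TiDB tool.

```python
import re

# Assumed line layout, taken from the excerpt above:
# [timestamp] [LEVEL] [file.go:line] ["message"] [key=value] ...
LINE_RE = re.compile(
    r'^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<src>[^\]]+)\] '
    r'\["(?P<msg>[^"]*)"\](?P<rest>.*)$'
)
# Trailing fields such as [signal=hangup] or [error="context canceled"].
FIELD_RE = re.compile(r'\[([\w.-]+)=("[^"]*"|[^\]]*)\]')


def parse_line(line):
    """Split one log line into timestamp, level, source, message and fields.

    Returns None when the line does not match the assumed layout.
    """
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    fields = {k: v.strip('"') for k, v in FIELD_RE.findall(m.group("rest"))}
    return {
        "ts": m.group("ts"),
        "level": m.group("level"),
        "src": m.group("src"),
        "msg": m.group("msg"),
        "fields": fields,
    }


def find_exit_signals(lines):
    """Return (timestamp, signal) for every "got signal to exit" entry."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry and entry["msg"] == "got signal to exit":
            hits.append((entry["ts"], entry["fields"].get("signal")))
    return hits


# Sample lines copied from the log excerpt in this post.
sample = [
    '[2024/05/08 01:38:37.466 +08:00] [INFO] [scheduler_controller.go:364] '
    '["scheduler has been stopped"] '
    '[scheduler-name=balance-hot-region-scheduler] [error="context canceled"]',
    '[2024/05/08 01:38:37.466 +08:00] [INFO] [coordinator.go:374] '
    '["coordinator is stopping"]',
    '[2024/05/08 01:38:37.466 +08:00] [INFO] [main.go:284] '
    '["got signal to exit"] [signal=hangup]',
]

for ts, sig in find_exit_signals(sample):
    print(f"{ts}: PD received signal {sig}")
```

To run this against a real log, the `sample` list could be replaced with the lines of the PD log file (for example via `open(path)`); the path depends on the deployment and is not shown in this post. If earlier outages show the same `signal=hangup` entry, the next thing to check would be what sends that signal on the host at those times.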