Pump does not produce any binlog

【TiDB environment】
Production environment deployed on k8s, with more than 1 million tables

【TiDB version】
v6.5.0

【Steps to reproduce】
Start the pump process to sync binlog

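【Problem encountered: symptoms and impact】
1. Pump does not produce any binlog in the configured directory.
2. After the pump service was shut down, the pods of all TiDB nodes went down. Why did all of them go down?
3. What does [DBStats] represent, and what processing steps does it involve?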

【Resource configuration】
[image]

【Attachments: screenshots / logs / monitoring】

Status of the TiDB node pods when pump was shut down:
[image]

Startup log of a TiDB node pod:
[2023/04/17 18:25:13.422 +08:00] [INFO] [client.go:328] [“[pumps client] write binlog to available pumps all failed, will try unavailable pumps”]
[2023/04/17 18:25:13.422 +08:00] [WARN] [session.go:2218] [“run statement failed”] [conn=7929359001148457417] [schemaVersion=3496320] [error=“[global:3]critical error write binlog failed, the last error no available pump to write binlog”] [session=“{\n "currDBName": "mysql",\n "id": 7929359001148457417,\n "status": 2,\n "strictMode": true,\n "user": {\n "Username": "root",\n "Hostname": "100.64.32.201",\n "CurrentUser": false,\n "AuthUsername": "root",\n "AuthHostname": "%"\n }\n}”]
[2023/04/17 18:25:13.422 +08:00] [FATAL] [conn.go:1138] [“critical error, stop the server”] [conn=7929359001148457417] [error=“[global:3]critical error write binlog failed, the last error no available pump to write binlog”] [stack=“github.com/pingcap/tidb/server.(*clientConn).Run\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/conn.go:1138\ngithub.com/pingcap/tidb/server.(*Server).onConn\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/server.go:625”]

Pump log:

[2023/04/17 17:31:29.571 +08:00] [INFO] [version.go:50] [“Welcome to Pump”] [“Release Version”=v6.5.0] [“Git Commit Hash”=589d79bcc4f0e9fa982847192baf6dd3eb3a0f41] [“Build TS”=“2022-12-16 08:19:30”] [“Go Version”=go1.19.3] [“Go OS/Arch”=linux/amd64]
[2023/04/17 17:31:29.571 +08:00] [INFO] [main.go:48] [“start pump…”] [config=“{"log-level":"info","node-id":"","addr":"http://10.40.224.48:8250","advertise-addr":"http://10.40.224.48:8250","socket":"","pd-urls":"http://36.0.3.246:2379,http://36.0.14.116:2379,http://36.0.13.229:2379","EtcdDialTimeout":5000000000,"data-dir":"/data/data.pump","heartbeat-interval":2,"gc":"7","log-file":"/data/pump.log","security":{"ssl-ca":"","ssl-cert":"","ssl-key":"","cert-allowed-cn":null},"gen-binlog-interval":3,"MetricsAddr":"","MetricsInterval":15,"storage":{"sync-log":null,"kv_chan_cap":0,"slow_write_threshold":0,"kv":null,"stop-write-at-available-space":null}}”]
[2023/04/17 17:31:29.571 +08:00] [INFO] [client.go:397] [“[pd] create pd client with endpoints”] [pd-address=“[http://36.0.13.229:2379,http://36.0.14.116:2379,http://36.0.3.246:2379]”]
[2023/04/17 17:31:29.578 +08:00] [INFO] [base_client.go:360] [“[pd] update member urls”] [old-urls=“[http://36.0.13.229:2379,http://36.0.14.116:2379,http://36.0.3.246:2379]”] [new-urls=“[http://basic-pd-0.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379,http://basic-pd-1.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379,http://basic-pd-2.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379]”]
[2023/04/17 17:31:29.578 +08:00] [INFO] [base_client.go:378] [“[pd] switch leader”] [new-leader=http://basic-pd-0.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379] [old-leader=]
[2023/04/17 17:31:29.578 +08:00] [INFO] [base_client.go:105] [“[pd] init cluster id”] [cluster-id=7194634827558957893]
[2023/04/17 17:31:29.578 +08:00] [INFO] [client.go:690] [“[pd] tso dispatcher created”] [dc-location=global]
[2023/04/17 17:31:29.578 +08:00] [INFO] [server.go:132] [“get clusterID success”] [clusterID=7194634827558957893]
[2023/04/17 17:31:29.578 +08:00] [INFO] [client.go:397] [“[pd] create pd client with endpoints”] [pd-address=“[http://36.0.13.229:2379,http://36.0.14.116:2379,http://36.0.3.246:2379]”]
[2023/04/17 17:31:29.583 +08:00] [INFO] [base_client.go:360] [“[pd] update member urls”] [old-urls=“[http://36.0.13.229:2379,http://36.0.14.116:2379,http://36.0.3.246:2379]”] [new-urls=“[http://basic-pd-0.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379,http://basic-pd-1.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379,http://basic-pd-2.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379]”]
[2023/04/17 17:31:29.583 +08:00] [INFO] [base_client.go:378] [“[pd] switch leader”] [new-leader=http://basic-pd-0.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379] [old-leader=]
[2023/04/17 17:31:29.583 +08:00] [INFO] [base_client.go:105] [“[pd] init cluster id”] [cluster-id=7194634827558957893]
[2023/04/17 17:31:29.583 +08:00] [INFO] [client.go:690] [“[pd] tso dispatcher created”] [dc-location=global]
[2023/04/17 17:31:29.584 +08:00] [INFO] [store.go:75] [“new store”] [path=“tikv://36.0.13.229:2379,36.0.14.116:2379,36.0.3.246:2379?disableGC=true”]
[2023/04/17 17:31:29.584 +08:00] [INFO] [client.go:397] [“[pd] create pd client with endpoints”] [pd-address=“[36.0.13.229:2379,36.0.14.116:2379,36.0.3.246:2379]”]
[2023/04/17 17:31:29.589 +08:00] [INFO] [base_client.go:360] [“[pd] update member urls”] [old-urls=“[http://36.0.13.229:2379,http://36.0.14.116:2379,http://36.0.3.246:2379]”] [new-urls=“[http://basic-pd-0.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379,http://basic-pd-1.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379,http://basic-pd-2.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379]”]
[2023/04/17 17:31:29.589 +08:00] [INFO] [base_client.go:378] [“[pd] switch leader”] [new-leader=http://basic-pd-0.basic-pd-peer.8b72b77e-dbb1-4f43-ba62-512230eb7668.svc:2379] [old-leader=]
[2023/04/17 17:31:29.589 +08:00] [INFO] [base_client.go:105] [“[pd] init cluster id”] [cluster-id=7194634827558957893]
[2023/04/17 17:31:29.589 +08:00] [INFO] [client.go:690] [“[pd] tso dispatcher created”] [dc-location=global]
[2023/04/17 17:31:29.590 +08:00] [INFO] [store.go:81] [“new store with retry success”]
[2023/04/17 17:31:29.590 +08:00] [INFO] [storage.go:138] [NewAppendWithResolver] [options=“{"ValueLogFileSize":524288000,"Sync":true,"KVChanCapacity":1048576,"SlowWriteThreshold":1,"StopWriteAtAvailableSpace":10737418240,"KVConfig":null}”]
[2023/04/17 17:31:29.590 +08:00] [INFO] [storage.go:1408] [“open metadata db”] [config=“{"block-cache-capacity":8388608,"block-restart-interval":16,"block-size":4096,"compaction-L0-trigger":8,"compaction-table-size":67108864,"compaction-total-size":536870912,"compaction-total-size-multiplier":8,"write-buffer":67108864,"write-L0-pause-trigger":24,"write-L0-slowdown-trigger":17}”]
[2023/04/17 17:31:29.595 +08:00] [INFO] [storage.go:220] [“Append info”] [gcTS=0] [maxCommitTS=0] [headPointer=“{"Fid":0,"Offset":0}”] [handlePointer=“{"Fid":0,"Offset":0}”]
[2023/04/17 17:31:29.600 +08:00] [INFO] [server.go:440] [“register success”] [NodeID=ksc_epc:8250]
[2023/04/17 17:31:29.601 +08:00] [INFO] [server.go:457] [“start to server request”] [addr=http://10.40.224.48:8250]
[2023/04/17 17:31:39.597 +08:00] [INFO] [storage.go:387] [DBStats] [DBStats=“{"WriteDelayCount":0,"WriteDelayDuration":0,"WritePaused":false,"AliveSnapshots":0,"AliveIterators":0,"IOWrite":461,"IORead":0,"BlockCacheSize":0,"OpenedTablesCount":0,"LevelSizes":null,"LevelTablesCounts":null,"LevelRead":null,"LevelWrite":null,"LevelDurations":null}”]
[2023/04/17 17:31:39.598 +08:00] [INFO] [server.go:563] [“server info tick”] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853829665947650]
[2023/04/17 17:31:49.596 +08:00] [INFO] [storage.go:387] [DBStats] [DBStats=“{"WriteDelayCount":0,"WriteDelayDuration":0,"WritePaused":false,"AliveSnapshots":0,"AliveIterators":0,"IOWrite":1038,"IORead":0,"BlockCacheSize":0,"OpenedTablesCount":0,"LevelSizes":null,"LevelTablesCounts":null,"LevelRead":null,"LevelWrite":null,"LevelDurations":null}”]
[2023/04/17 17:31:49.598 +08:00] [INFO] [server.go:563] [“server info tick”] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853832024981509]
[2023/04/17 17:31:59.597 +08:00] [INFO] [storage.go:387] [DBStats] [DBStats=“{"WriteDelayCount":0,"WriteDelayDuration":0,"WritePaused":false,"AliveSnapshots":0,"AliveIterators":0,"IOWrite":1530,"IORead":0,"BlockCacheSize":0,"OpenedTablesCount":0,"LevelSizes":null,"LevelTablesCounts":null,"LevelRead":null,"LevelWrite":null,"LevelDurations":null}”]
[2023/04/17 17:31:59.599 +08:00] [INFO] [server.go:563] [“server info tick”] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853834384277507]
[2023/04/17 17:32:09.596 +08:00] [INFO] [storage.go:387] [DBStats] [DBStats=“{"WriteDelayCount":0,"WriteDelayDuration":0,"WritePaused":false,"AliveSnapshots":0,"AliveIterators":0,"IOWrite":2101,"IORead":0,"BlockCacheSize":0,"OpenedTablesCount":0,"LevelSizes":null,"LevelTablesCounts":null,"LevelRead":null,"LevelWrite":null,"LevelDurations":null}”]
[2023/04/17 17:32:09.598 +08:00] [INFO] [server.go:563] [“server info tick”] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853837530005507]
[2023/04/17 17:32:19.596 +08:00] [INFO] [storage.go:387] [DBStats] [DBStats=“{"WriteDelayCount":0,"WriteDelayDuration":0,"WritePaused":false,"AliveSnapshots":0,"AliveIterators":0,"IOWrite":2678,"IORead":0,"BlockCacheSize":0,"OpenedTablesCount":0,"LevelSizes":null,"LevelTablesCounts":null,"LevelRead":null,"LevelWrite":null,"LevelDurations":null}”]
[2023/04/17 17:32:19.599 +08:00] [INFO] [server.go:563] [“server info tick”] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853839889563652]
[2023/04/17 17:32:29.597 +08:00] [INFO] [storage.go:387] [DBStats] [DBStats=“{"WriteDelayCount":0,"WriteDelayDuration":0,"WritePaused":false,"AliveSnapshots":0,"AliveIterators":0,"IOWrite":3170,"IORead":0,"BlockCacheSize":0,"OpenedTablesCount":0,"LevelSizes":null,"LevelTablesCounts":null,"LevelRead":null,"LevelWrite":null,"LevelDurations":null}”]
[2023/04/17 17:32:29.598 +08:00] [INFO] [server.go:563] [“server info tick”] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853842248859652]
[2023/04/17 17:32:39.596 +08:00] [INFO] [storage.go:387] [DBStats] [DBStats=“{"WriteDelayCount":0,"WriteDelayDuration":0,"WritePaused":false,"AliveSnapshots":0,"AliveIterators":0,"IOWrite":3741,"IORead":0,"BlockCacheSize":0,"OpenedTablesCount":0,"LevelSizes":null,"LevelTablesCounts":null,"LevelRead":null,"LevelWrite":null,"LevelDurations":null}”]
[2023/04/17 17:32:39.599 +08:00] [INFO] [server.go:563] [“server info tick”] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853845407694849]
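The MaxCommitTS values in the "server info tick" lines above are TSO timestamps. In TiDB/PD a TSO packs a physical time in milliseconds into the high bits and an 18-bit logical counter into the low bits. The following is a minimal sketch for decoding them; the function name decode_tso is made up for this example. If the decoded times track the log timestamps, that would suggest pump itself is running and advancing its commit timestamp even though writeBinlogCount stays at 0.

```python
from datetime import datetime, timedelta, timezone
from typing import Tuple

# TSO layout: high bits are a physical timestamp in milliseconds,
# low 18 bits are a logical counter.
LOGICAL_BITS = 18
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def decode_tso(ts: int) -> Tuple[datetime, int]:
    """Split a TSO into (physical time in UTC, logical counter)."""
    physical_ms = ts >> LOGICAL_BITS
    logical = ts & ((1 << LOGICAL_BITS) - 1)
    # Integer-millisecond arithmetic avoids float rounding.
    return EPOCH + timedelta(milliseconds=physical_ms), logical


# MaxCommitTS values copied from the first two tick lines in the pump log.
CST = timezone(timedelta(hours=8))  # the log timestamps use +08:00
for ts in (440853829665947650, 440853832024981509):
    when, logical = decode_tso(ts)
    print(ts, when.astimezone(CST).isoformat(timespec="milliseconds"), logical)
```

Comparing the printed times with the timestamps of the corresponding log lines is a quick way to check whether MaxCommitTS is moving forward.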

The pump log contains a large number of [DBStats] entries.
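The [DBStats] lines repeat every 10 seconds and, judging by their field names, appear to be periodic statistics of pump's local metadata store rather than a processing stage. For the "no binlog produced" symptom, the more telling lines are the "server info tick" entries that follow each of them. Below is a minimal sketch that filters those out of a pump log and reports writeBinlogCount; the helper name summarize_ticks and the regular expression are assumptions written for this example, and the DBStats sample line is abbreviated.

```python
import re

# Matches pump "server info tick" lines and captures the counters of interest.
TICK_RE = re.compile(
    r"^\[(?P<time>[^\]]+)\].*"
    r"\[writeBinlogCount=(?P<writes>\d+)\]\s*"
    r"\[alivePullerCount=(?P<pullers>\d+)\]"
)


def summarize_ticks(lines):
    """Return (time, writeBinlogCount, alivePullerCount) for each tick line."""
    ticks = []
    for line in lines:
        m = TICK_RE.match(line)
        if m:
            ticks.append(
                (m.group("time"), int(m.group("writes")), int(m.group("pullers")))
            )
    return ticks


SAMPLE = [
    # DBStats payload abbreviated for the example.
    '[2023/04/17 17:31:39.597 +08:00] [INFO] [storage.go:387] [DBStats] [DBStats="{...}"]',
    '[2023/04/17 17:31:39.598 +08:00] [INFO] [server.go:563] [“server info tick”] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853829665947650]',
    '[2023/04/17 17:31:49.598 +08:00] [INFO] [server.go:563] [“server info tick”] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853832024981509]',
    '[2023/04/17 17:31:59.599 +08:00] [INFO] [server.go:563] [“server info tick”] [writeBinlogCount=0] [alivePullerCount=0] [MaxCommitTS=440853834384277507]',
]

ticks = summarize_ticks(SAMPLE)
for tick in ticks:
    print(tick)
if ticks and all(writes == 0 for _, writes, _ in ticks):
    print("pump reported no binlog writes in this window")
```

A writeBinlogCount that stays at 0 across the whole window means pump never received a write request from TiDB during that time, which is consistent with no binlog appearing in the data directory.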

Use TiCDC and stop using the binlog components; they are not compatible…

On 6.5, don't use binlog anymore.

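All the TiDB nodes went down because you shut down pump without changing the binlog-related settings on TiDB. TiDB checks whether it can write binlog, so turning off the binlog settings is enough to fix this.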

Set ignore-error in the TiDB configuration.
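The replies above point at two settings in the binlog section of the TiDB configuration: enable and ignore-error. As a rough mental model only (this is not TiDB's actual code, and the class and function names are hypothetical), the outcome of a failed binlog write can be sketched like this:

```python
from dataclasses import dataclass


@dataclass
class BinlogConfig:
    """Illustrative stand-in for the binlog section of the TiDB config."""
    enable: bool = False        # models binlog.enable
    ignore_error: bool = False  # models binlog.ignore-error


def on_write_failure(cfg: BinlogConfig) -> str:
    """Describe what happens when no pump accepts a binlog write."""
    if not cfg.enable:
        # Binlog is off, so TiDB never tries to reach pump.
        return "no-binlog"
    if cfg.ignore_error:
        # TiDB stops writing binlog but keeps serving queries.
        return "skip-binlog"
    # Matches the FATAL "critical error, stop the server" line in the TiDB log.
    return "stop-server"


# The situation in this thread: binlog enabled, ignore-error left at false,
# and pump shut down.
print(on_write_failure(BinlogConfig(enable=True)))
```

In this model, either disabling binlog before stopping pump or setting ignore-error keeps the TiDB pods alive. With ignore-error, binlog output is no longer complete after the first failure.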

Is binlog enabled in the configuration file? How is pump configured?

This looks like an operational mistake: disable binlog first, and only then restart pump.