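To help us respond efficiently, please provide the following information when asking a question; clearly described problems are answered first.
- 【TiDB version】: v4.0.7
- 【Problem description】: drainer suddenly crashed and will not start again.
Kafka has been confirmed to be healthy.
Error from the drainer log: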
[2020/10/16 23:42:26.175 +08:00] [INFO] [pump.go:166] ["receive big size binlog"] [size="108 MB"]
[2020/10/16 23:42:41.757 +08:00] [INFO] [broker.go:212] ["[sarama] Connected to broker at 10.40.14.11:9092 (registered as #1)
"]
[2020/10/16 23:42:41.977 +08:00] [INFO] [async_producer.go:971] ["[sarama] producer/broker/1 state change to [closing] because write tcp 10.40.195.229:59076->10.40.14.11:9092: write: connection reset by peer
"]
[2020/10/16 23:42:41.978 +08:00] [INFO] [broker.go:253] ["[sarama] Closed connection to broker 10.40.14.11:9092
"]
[2020/10/16 23:42:41.978 +08:00] [INFO] [async_producer.go:578] ["[sarama] producer/leader/bi2b_tidb_obinlog/0 state change to [retrying-7]
"]
[2020/10/16 23:42:41.978 +08:00] [INFO] [async_producer.go:588] ["[sarama] producer/leader/bi2b_tidb_obinlog/0 abandoning broker 1
"]
[2020/10/16 23:42:41.978 +08:00] [INFO] [async_producer.go:717] ["[sarama] producer/broker/1 input chan closed
"]
[2020/10/16 23:42:41.978 +08:00] [INFO] [async_producer.go:801] ["[sarama] producer/broker/1 shut down
"]
[2020/10/16 23:42:42.478 +08:00] [INFO] [client.go:772] ["[sarama] client/metadata fetching metadata for [bi2b_tidb_obinlog] from broker 10.40.75.137:9092
"]
[2020/10/16 23:42:42.480 +08:00] [INFO] [async_producer.go:711] ["[sarama] producer/broker/1 starting up
"]
[2020/10/16 23:42:42.480 +08:00] [INFO] [async_producer.go:727] ["[sarama] producer/broker/1 state change to [open] on bi2b_tidb_obinlog/0
"]
[2020/10/16 23:42:42.480 +08:00] [INFO] [async_producer.go:570] ["[sarama] producer/leader/bi2b_tidb_obinlog/0 selected broker 1
"]
[2020/10/16 23:42:42.480 +08:00] [INFO] [async_producer.go:594] ["[sarama] producer/leader/bi2b_tidb_obinlog/0 state change to [flushing-7]
"]
[2020/10/16 23:42:42.480 +08:00] [INFO] [async_producer.go:616] ["[sarama] producer/leader/bi2b_tidb_obinlog/0 state change to [normal]
"]
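Symptom: drainer retries a few dozen times and then exits.
Again, Kafka has been confirmed to be healthy.
From my testing, this appears to be related to ["receive big size binlog"] [size="108 MB"]: if I move the savepoint away and restart, drainer works normally; if I restore the previous savepoint, it fails again.
This looks related to this PR: drainer/pump.go: fix when msg bigger than 4M by july2993 · Pull Request #333 · pingcap/tidb-binlog · GitHub. Could you help take a look?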
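
To check the big-binlog suspicion, one option is to scan the drainer log for "receive big size binlog" entries and compare each reported size with the message size limit of the Kafka broker. The following is only a minimal sketch and not part of TiDB Binlog. The function name `find_big_binlogs`, the 1 MiB limit (an assumed placeholder; read the real `message.max.bytes` from your broker), and the decimal interpretation of the size units are all assumptions. It also expects the straight quotes of the real log file.

```python
import re

# Assumed placeholder limit for illustration only; replace it with the
# actual message.max.bytes value configured on your Kafka brokers.
ASSUMED_BROKER_LIMIT_BYTES = 1 * 1024 * 1024

# Assumption: sizes are printed with decimal (SI) units, e.g. "108 MB".
UNITS = {"B": 1, "kB": 10**3, "MB": 10**6, "GB": 10**9}

BIG_BINLOG_RE = re.compile(
    r'\["receive big size binlog"\] \[size="(?P<num>[\d.]+) (?P<unit>[A-Za-z]+)"\]'
)


def find_big_binlogs(log_text, limit=ASSUMED_BROKER_LIMIT_BYTES):
    """Return (line, size_in_bytes) for each big-binlog entry above limit."""
    hits = []
    for line in log_text.splitlines():
        match = BIG_BINLOG_RE.search(line)
        if match is None:
            continue
        unit = match.group("unit")
        if unit not in UNITS:
            continue  # unknown unit: skip rather than guess
        size = int(float(match.group("num")) * UNITS[unit])
        if size > limit:
            hits.append((line, size))
    return hits


sample = (
    '[2020/10/16 23:42:26.175 +08:00] [INFO] [pump.go:166] '
    '["receive big size binlog"] [size="108 MB"]'
)
for line, size in find_big_binlogs(sample):
    print(f"{size} bytes exceeds the assumed limit: {line}")
```

If an entry like the 108 MB one shows up right before the "connection reset by peer" line, that would be consistent with the broker rejecting an oversized message, though it does not prove it.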