How do I parse the binlog data that TiDB Binlog sends to Kafka?

I am consuming the binlog that TiDB replicates to Kafka, and the messages come out in the following format:

Time: 1566201952000 ms
19-08-2019 16:05:52 CST test INFO - -------------------------------------------
19-08-2019 16:05:52 CST test INFO - �������"|
19-08-2019 16:05:52 CST test INFO - hydeebinlog_testfCREATE TABLE hydee.binlog_test (
19-08-2019 16:05:52 CST test INFO -    id INT,
19-08-2019 16:05:52 CST test INFO -    name VARCHAR(20)
19-08-2019 16:05:52 CST test INFO - ) ENGINE = InnoDB ROW_FORMAT = DEFAULT23333
19-08-2019 16:05:52 CST test INFO - �����¨�E
19-08-2019 16:05:52 CST test INFO - C
19-08-2019 16:05:52 CST test INFO - hydeebinlog_test
19-08-2019 16:05:52 CST test INFO - idint
19-08-2019 16:05:52 CST test INFO - nameavarchar"
19-08-2019 16:05:52 CST test INFO - 	
19-08-2019 16:05:52 CST test INFO - 
19-08-2019 16:05:52 CST test INFO - 2a23333
19-08-2019 16:05:52 CST test INFO - �����E
19-08-2019 16:05:52 CST test INFO - C
19-08-2019 16:05:52 CST test INFO - hydeebinlog_test
19-08-2019 16:05:52 CST test INFO - idint
19-08-2019 16:05:52 CST test INFO - nameavarchar"
19-08-2019 16:05:52 CST test INFO - 	
19-08-2019 16:05:52 CST test INFO - 
19-08-2019 16:05:52 CST test INFO - 2b23333
19-08-2019 16:05:52 CST test INFO - �����Ԩ�T
19-08-2019 16:05:52 CST test INFO - R
19-08-2019 16:05:52 CST test INFO - hydeebinlog_test
19-08-2019 16:05:52 CST test INFO - idint
19-08-2019 16:05:52 CST test INFO - nameavarchar"
19-08-2019 16:05:52 CST test INFO - 	
19-08-2019 16:05:52 CST test INFO - 
19-08-2019 16:05:52 CST test INFO - 2a"
19-08-2019 16:05:52 CST test INFO - 	
19-08-2019 16:05:52 CST test INFO - 
19-08-2019 16:05:52 CST test INFO - 2b23333
19-08-2019 16:05:52 CST test INFO - �����Ԩ�E
19-08-2019 16:05:52 CST test INFO - C
19-08-2019 16:05:52 CST test INFO - hydeebinlog_test
19-08-2019 16:05:52 CST test INFO - idint
19-08-2019 16:05:52 CST test INFO - nameavarchar"
19-08-2019 16:05:52 CST test INFO - 	
19-08-2019 16:05:52 CST test INFO - 
19-08-2019 16:05:52 CST test INFO - 2a23333
19-08-2019 16:05:52 CST test INFO - �����ը�E
19-08-2019 16:05:52 CST test INFO - C
19-08-2019 16:05:52 CST test INFO - hydeebinlog_test
19-08-2019 16:05:52 CST test INFO - idint
19-08-2019 16:05:52 CST test INFO - nameavarchar"
19-08-2019 16:05:52 CST test INFO - 	
19-08-2019 16:05:52 CST test INFO - 
19-08-2019 16:05:52 CST test INFO - 2a23333
19-08-2019 16:05:52 CST test INFO - ���܍ը�E
19-08-2019 16:05:52 CST test INFO - C
19-08-2019 16:05:52 CST test INFO - hydeebinlog_test
19-08-2019 16:05:52 CST test INFO - idint
19-08-2019 16:05:52 CST test INFO - nameavarchar"
19-08-2019 16:05:52 CST test INFO - 	
19-08-2019 16:05:52 CST test INFO - 
19-08-2019 16:05:52 CST test INFO - 2a23333
19-08-2019 16:05:52 CST test INFO - �����ݨ�E
19-08-2019 16:05:52 CST test INFO - C
19-08-2019 16:05:52 CST test INFO - hydeebinlog_test
19-08-2019 16:05:52 CST test INFO - idint
19-08-2019 16:05:52 CST test INFO - nameavarchar"
19-08-2019 16:05:52 CST test INFO - 	
19-08-2019 16:05:52 CST test INFO - 
19-08-2019 16:05:52 CST test INFO - 2a23333
19-08-2019 16:05:52 CST test INFO - �����ݨ�E
19-08-2019 16:05:52 CST test INFO - C
19-08-2019 16:05:52 CST test INFO - hydeebinlog_test
19-08-2019 16:05:52 CST test INFO - idint
19-08-2019 16:05:52 CST test INFO - nameavarchar"
19-08-2019 16:05:52 CST test INFO - 	
19-08-2019 16:05:52 CST test INFO - 
19-08-2019 16:05:52 CST test INFO - 2a23333
19-08-2019 16:05:52 CST test INFO - �������E
19-08-2019 16:05:52 CST test INFO - C
19-08-2019 16:05:52 CST test INFO - hydeebinlog_test
19-08-2019 16:05:52 CST test INFO - idint
19-08-2019 16:05:52 CST test INFO - nameavarchar"
19-08-2019 16:05:52 CST test INFO - 	
19-08-2019 16:05:52 CST test INFO - 
19-08-2019 16:05:52 CST test INFO - 2b23333
19-08-2019 16:05:52 CST test INFO - ...
19-08-2019 16:05:52 CST test INFO - 
19-08-2019 16:05:52 CST test INFO - 2019-08-19 16:05:52 INFO  JobScheduler:54 - Finished job streaming job 1566201952000 ms.0 from job set of time 1566201952000 ms
19-08-2019 16:05:52 CST test INFO - 2019-08-19 16:05:52 INFO  JobScheduler:54 - Total delay: 0.969 s for time 1566201952000 ms (execution: 0.893 s)
19-08-2019 16:05:52 CST test INFO - 2019-08-19 16:05:52 INFO  ReceivedBlockTracker:54 - Deleting batches: 

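How do I parse this into the table's row data? I looked at the example on the official site, but it is in Go and I could not follow it. Is there something simpler, ideally a Java version?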

Which serializer is being used for the binlog? Is it Kafka's StringDeserializer?
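On the StringDeserializer question: the replacement characters in the log above are what binary bytes look like once they have been decoded as text. The following is a minimal sketch in plain Java (no Kafka dependency) of why that decoding cannot be undone. The payload bytes are made up for illustration and are not real drainer output. If the consumer is in fact configured with `StringDeserializer`, switching the value deserializer to Kafka's `ByteArrayDeserializer` would keep the raw bytes intact for later parsing.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class LossyStringDemo {
    // Decode bytes as UTF-8 text and encode them back, which is what happens
    // when a binary payload is read as a String and later turned into bytes again.
    static byte[] roundTripThroughString(byte[] raw) {
        String text = new String(raw, StandardCharsets.UTF_8);
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical payload: two bytes that are not valid UTF-8, then ASCII text.
        byte[] raw = {(byte) 0x80, (byte) 0xFF, 'i', 'd'};
        byte[] back = roundTripThroughString(raw);

        System.out.println("original:   " + Arrays.toString(raw));
        System.out.println("round trip: " + Arrays.toString(back));
        // Each invalid byte became U+FFFD, which encodes as three bytes.
        System.out.println("lossless:   " + Arrays.equals(raw, back));
    }
}
```

The last line prints `lossless:   false`: the two invalid bytes are replaced by U+FFFD, so the original values are gone before any parser sees them.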

https://pingcap.com/docs-cn/v3.0/reference/tools/tidb-binlog/binlog-slave-client/#数据格式 Does this page help?

Thanks, I have already read that article. It only gives the data structure and does not say how the data is serialized.

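There is no official Java demo yet, but you can take a look at this case and see whether it helps: "Java解析drainer发送到kafka中的binlog异常" ("Exception when parsing in Java the binlog that drainer sends to Kafka")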

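Does it serialize the data structure with a general-purpose tool or with a custom format? I read the Go source for a while and could not work it out.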

Solved. The messages can be parsed with Google's protobuf (if you run into a similar problem, see "Java解析drainer发送到kafka中的binlog异常"):

20-08-2019 12:49:25 CST testStream INFO - **********************
20-08-2019 12:49:25 CST testStream INFO - type: DML
20-08-2019 12:49:25 CST testStream INFO - commit_ts: 410588031694143489
20-08-2019 12:49:25 CST testStream INFO - dml_data {
20-08-2019 12:49:25 CST testStream INFO -   tables {
20-08-2019 12:49:25 CST testStream INFO -     schema_name: "hydee"
20-08-2019 12:49:25 CST testStream INFO -     table_name: "binlog_test"
20-08-2019 12:49:25 CST testStream INFO -     column_info {
20-08-2019 12:49:25 CST testStream INFO -       name: "id"
20-08-2019 12:49:25 CST testStream INFO -       mysql_type: "int"
20-08-2019 12:49:25 CST testStream INFO -       is_primary_key: false
20-08-2019 12:49:25 CST testStream INFO -     }
20-08-2019 12:49:25 CST testStream INFO -     column_info {
20-08-2019 12:49:25 CST testStream INFO -       name: "name"
20-08-2019 12:49:25 CST testStream INFO -       mysql_type: "varchar"
20-08-2019 12:49:25 CST testStream INFO -       is_primary_key: false
20-08-2019 12:49:25 CST testStream INFO -     }
20-08-2019 12:49:25 CST testStream INFO -     mutations {
20-08-2019 12:49:25 CST testStream INFO -       type: Insert
20-08-2019 12:49:25 CST testStream INFO -       row {
20-08-2019 12:49:25 CST testStream INFO -         columns {
20-08-2019 12:49:25 CST testStream INFO -           int64_value: 7
20-08-2019 12:49:25 CST testStream INFO -         }
20-08-2019 12:49:25 CST testStream INFO -         columns {
20-08-2019 12:49:25 CST testStream INFO -           string_value: "g"
20-08-2019 12:49:25 CST testStream INFO -         }
20-08-2019 12:49:25 CST testStream INFO -       }
20-08-2019 12:49:25 CST testStream INFO -     }
20-08-2019 12:49:25 CST testStream INFO -   }
20-08-2019 12:49:25 CST testStream INFO - }
20-08-2019 12:49:25 CST testStream INFO - 
20-08-2019 12:49:25 CST testStream INFO - ***************************
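
For readers wondering what "parse with protobuf" means at the byte level, here is a minimal sketch in plain Java that walks the top-level fields of a protobuf-encoded message. It is not the code used in this thread and it does not replace the protobuf library. In practice one would presumably generate Java classes from the binlog `.proto` file with `protoc` and call the generated class's `parseFrom(byte[])`. The class name `WireFormatPeek`, the sample bytes, and the field numbers in the sample are illustrative assumptions.

```java
import java.util.ArrayList;
import java.util.List;

class WireFormatPeek {
    private final byte[] buf;
    private int pos = 0;

    WireFormatPeek(byte[] buf) { this.buf = buf; }

    // Read one base-128 varint: 7 payload bits per byte, lowest group first.
    private long readVarint() {
        long result = 0;
        for (int shift = 0; shift < 64; shift += 7) {
            byte b = buf[pos++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
        }
        throw new IllegalArgumentException("varint longer than 10 bytes");
    }

    // Describe each top-level field without interpreting nested messages.
    List<String> describe() {
        List<String> out = new ArrayList<>();
        while (pos < buf.length) {
            long tag = readVarint();
            int field = (int) (tag >>> 3);   // upper bits: field number
            int wireType = (int) (tag & 7);  // lower 3 bits: wire type
            switch (wireType) {
                case 0: out.add("field=" + field + " varint=" + readVarint()); break;
                case 1: pos += 8; out.add("field=" + field + " fixed64"); break;
                case 2: {
                    int len = (int) readVarint();
                    if (len < 0 || pos + len > buf.length) {
                        throw new IllegalArgumentException("truncated field " + field);
                    }
                    pos += len;              // skip the nested message or string
                    out.add("field=" + field + " bytes=" + len);
                    break;
                }
                case 5: pos += 4; out.add("field=" + field + " fixed32"); break;
                default: throw new IllegalArgumentException("unsupported wire type " + wireType);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Hand-built sample, NOT real drainer output:
        // field 1 = varint 0, field 2 = varint 300, field 3 = 2 payload bytes.
        byte[] sample = {0x08, 0x00, 0x10, (byte) 0xAC, 0x02, 0x1A, 0x02, 0x0A, 0x00};
        for (String line : new WireFormatPeek(sample).describe()) {
            System.out.println(line);
        }
    }
}
```

Running `main` prints `field=1 varint=0`, `field=2 varint=300` and `field=3 bytes=2`. The sample has the same shape as the parsed output above: an enum-like value, an integer, and a nested length-delimited message such as `dml_data`.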