sync-diff-inspector panics when checking multiple tables and the data is inconsistent

[TiDB environment]
Sync-diff-inspector 2.0

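[Overview] Scenario and problem summary

Start command: ./bin/sync_diff_inspector --config=./config.toml
Two tables: one has inconsistent data, the other is consistent. The tool exits with status code 2 and a panic.

[Problem] The issue currently encountered
The tool reports an error, the log output is abnormal, and neither the log nor the summary is printed in full.

Console log: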
{"level":"warn","ts":"2021-12-24T20:11:24.518+0800","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"endpoint://client-6f006c1f-7136-4f26-a45a-69d2efe5893f/172.16.21.227:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
A total of 2 tables need to be compared

Comparing the table structure of rt`.`t1 … equivalent
Comparing the table structure of rt`.`t2 … equivalent
Comparing the table data of rt`.`t1 … equivalent
Comparing the table data of rt`.`t2 … failure


Progress [============================================================>] 100% 0/0
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x576f31d]

goroutine 1 [running]:
github.com/pingcap/tidb-tools/sync_diff_inspector/report.(*Report).Print(0xc000a05860, {0x5f82a20, 0xc000010018})
/Users/yunzhanghu1151/code/tidb-tools/sync_diff_inspector/report/report.go:243 +0x79d
main.(*Diff).PrintSummary(0xc000a059a0, {0x5fbb490, 0xc00005a080})
/Users/yunzhanghu1151/code/tidb-tools/sync_diff_inspector/diff.go:113 +0x228
main.checkSyncState({0x5fbb490, 0xc00005a080}, 0xc0001af800)
/Users/yunzhanghu1151/code/tidb-tools/sync_diff_inspector/main.go:116 +0x3c5
main.main()
/Users/yunzhanghu1151/code/tidb-tools/sync_diff_inspector/main.go:79 +0x52d

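Code modified at the panic location:
for schema, tableMap := range r.TableResults {
    for table, result := range tableMap {
        // Original line at the panic location (report.go:243), commented out:
        //summary.WriteString(fmt.Sprintf("%s error occured in %s\n", result.MeetError.Error(), dbutil.TableName(schema, table)))
        // Replacement: print the whole result instead of calling result.MeetError.Error().
        summary.WriteString(fmt.Sprintf("error occured in %s\n, result=%+v", dbutil.TableName(schema, table), result))
    }
}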

With this change the log prints normally and shows the check failure, but the summary still contains no table information.

Console output:
GO111MODULE=on go build -ldflags '-X "github.com/pingcap/tidb-tools/pkg/utils.Version=v5.3.0-1-g6c8d736-dirty" -X "github.com/pingcap/tidb-tools/pkg/utils.BuildTS=2021-12-24 11:48:50" -X "github.com/pingcap/tidb-tools/pkg/utils.GitHash=6c8d73635bf771dfb5312a47a171aa2e548052ac" -X "github.com/pingcap/tidb-tools/pkg/utils.GitBranch=fix-sun"' -o bin/sync_diff_inspector_mac ./sync_diff_inspector
~/code/tidb-tools (branch fix-sun) $ ./bin/sync_diff_inspector_mac --config=./bin/config_test.toml
{"level":"warn","ts":"2021-12-24T19:49:55.408+0800","caller":"clientv3/retry_interceptor.go:62","msg":"retrying of unary invoker failed","target":"endpoint://client-35b4f806-e09b-4f8e-96d0-bb95e6cc4639/172.16.21.227:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
A total of 2 tables need to be compared

Comparing the table structure of rt`.`t1 … equivalent
Comparing the table data of rt`.`t2 … equivalent
Comparing the table structure of rt`.`t1 … equivalent
Comparing the table data of rt`.`t2 … failure


Progress [============================================================>] 100% 0/0
Error in comparison process:
error occured in rt.t2
, result=&{Schema:rt Table:t2 StructEqual:true DataSkip:false DataEqual:false MeetError:sql: Scan error on column index 1, name "CHECKSUM": converting driver.Value type []uint8 ("17872535405533884590") to a int64: value out of range
github.com/pingcap/errors.AddStack
/usr/local/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20211009033009-93128226aaa3/errors.go:174
github.com/pingcap/errors.Trace
/usr/local/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20211009033009-93128226aaa3/juju_adaptor.go:15
github.com/pingcap/tidb-tools/sync_diff_inspector/utils.GetCountAndCRC32Checksum
/Users/yunzhanghu1151/code/tidb-tools/sync_diff_inspector/utils/utils.go:680
github.com/pingcap/tidb-tools/sync_diff_inspector/source.(*MySQLSources).GetCountAndCrc32.func1
/Users/yunzhanghu1151/code/tidb-tools/sync_diff_inspector/source/mysql_shard.go:104
runtime.goexit
/usr/local/Cellar/go/1.17.2/libexec/src/runtime/asm_amd64.s:1581 ChunkMap:map[0:0-0:1:2:0xc00149a860]}error occured in rt.t2
, result=&{Schema:rt Table:t2 StructEqual:true DataSkip:false DataEqual:true MeetError: ChunkMap:map[]}You can view the comparision details through './checking_1224-1/sync_diff.log'

[TiDB version]

[Attachments] summary.txt (306 bytes)
config_example.toml (856 bytes)



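Additional note: because the CRC32 algorithm is prone to collisions, I replaced the SQL in the code with an MD5-based checksum, which produced the panic above. After fixing the panic by hand, I ran into a type overflow problem: the CRC32 checksum was stored as an int64, whereas MD5 is 128 bits. Switching the type directly to string does not work either, because the code combines checksums with XOR, so I did not change the type. Could you take a look? Would you consider an upgrade or optimization here, for example switching to CRC64 or improving the algorithm?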
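One way to keep an XOR-based aggregate while using a 128-bit hash is to XOR the raw digest bytes instead of an integer or a string. The sketch below assumes that the checksum is an XOR over per-row hashes, as the note above implies. The xorMD5 function and the sample rows are hypothetical, and this is not how sync-diff-inspector computes its checksum.

```go
package main

import (
	"crypto/md5"
	"fmt"
)

// xorMD5 folds the MD5 digests of all rows into one 128-bit value
// by XOR-ing them byte by byte. XOR is commutative and associative,
// so the result does not depend on the order of the rows.
func xorMD5(rows []string) [md5.Size]byte {
	var acc [md5.Size]byte
	for _, row := range rows {
		digest := md5.Sum([]byte(row))
		for i := range acc {
			acc[i] ^= digest[i]
		}
	}
	return acc
}

func main() {
	// Hypothetical rows, serialized as comma-separated column values.
	upstream := []string{"1,alice", "2,bob"}
	downstream := []string{"2,bob", "1,alice"}

	fmt.Printf("%x\n", xorMD5(upstream))
	fmt.Println("equal:", xorMD5(upstream) == xorMD5(downstream))
}
```

A fixed-size byte array is comparable in Go, so the two aggregates can be compared with == without converting them to a numeric type.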

Could you upload the sync-diff.log from the first failure? We will investigate the cause of the error.

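It has been forwarded to the relevant people. The problem most likely comes from switching to the MD5 algorithm; the code change is shown above. CRC32 can easily give an inaccurate check (for example, the upstream value is 0, the downstream value is 1, and the check still reports the data as consistent), so I suggest testing and optimizing this area.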

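One more question: when the source and target tables are configured with different names, is it possible to turn off the default sharding (split schema and table) rule for the upstream MySQL? Otherwise, when MySQL also has a schema and table with the same name as the target, the result reports an error.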

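Disabling the feature that appends the target-side schemas and tables to the check list by default is not supported yet. Support is expected in the next version.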


OK. Please let me know once the issues above are fixed, thanks :pray:
