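Does sync-diff-inspector's sharded-table data check support regex matching in the source database configuration?
The official example shows how to configure a 3-to-1 shard merge. If my source database has many shard tables, how should I configure it? My question is mainly about this part:
[[table-config]]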
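Hello,
If you have sharding middleware, you can run the data comparison through it, so you don't need to list every shard table.
As for whether regular expressions are supported, we're confirming this and will reply shortly.
Thanks, I hope it can be supported.
Hello, we've confirmed that regular expressions are supported. For the format, see how the other sections of the sync-diff configuration use them: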
https://docs.pingcap.com/zh/tidb/v4.0/sync-diff-inspector-overview
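######################### Tables config #########################
# If you need to compare data across many tables with different schema or table names, use table-rule to set up the mapping. You can configure the mapping for only the schema, only the table, or both.
[[table-rules]]
# schema-pattern and table-pattern support the wildcards * and ?
schema-pattern = "test"
table-pattern = "record_20"
target-schema = "test"
target-table = "record"

# Configure the tables in the target database to be compared
[[check-tables]]
# Schema name
schema = "test"
# Names of the tables to check
tables = ["record"]

# Configuration for the shard tables that correspond to this table
[[table-config]]
# Target schema name
schema = "test"
# Table name in the target schema
table = "record"
# Set to true when comparing data in a sharding scenario
is-sharding = true
# Source table configuration
[[table-config.source-tables]]
# Source database instance id
instance-id = "rds-test"
schema = "test"
table = "~^record_20*"

I tried it this way and got an error: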
[2020/06/22 15:19:05.798 +08:00] [ERROR] [config.go:98] [“must have more than one source tables if comparing sharding tables”] [stack=“github.com/pingcap/log.Error
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/pkg/mod/github.com/pingcap/log@v0.0.0-20191012051959-b742a5d432e9/global.go:42
main.(*TableConfig).Valid
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/src/github.com/pingcap/tidb-tools/sync_diff_inspector/config.go:98
main.(*Config).checkConfig
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/src/github.com/pingcap/tidb-tools/sync_diff_inspector/config.go:307
main.main
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/src/github.com/pingcap/tidb-tools/sync_diff_inspector/main.go:54
runtime.main
\t/usr/local/go/src/runtime/proc.go:203”]
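Could you provide an official example? This scenario is fairly common.
# Regular expressions are supported; they must start with '~'.
# The following configuration checks all tables whose names are prefixed with 'test'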
# tables = "~^test.*"
# The following configuration checks all tables in the configured schema
# tables = "~^"
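Hello, I don't think you understood what I meant. What you pasted is from the check-tables section, which specifies which tables in the target database to check. In my case there is a single target table, while the source database has several hundred monthly tables. How should I configure this in table-config? I mean the table-config.source-tables part of the example I posted above.
Hello,
The regular expression syntax is the same throughout the configuration file. The reply above meant the following change; please test whether it works: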
table = "record_20*"
changed to
table = "~^record_20*"
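One detail worth checking before relying on a pattern like this: as a regular expression, `20*` means "a 2 followed by zero or more 0s", not "20 followed by anything". The sketch below uses Python's `re` module to illustrate that difference. The `matches` helper is hypothetical, and it assumes the tool strips the leading `~` and performs an unanchored search; that behaviour is an assumption here, not something confirmed in this thread.

```python
import re


def matches(config_value: str, table_name: str) -> bool:
    """Return True if table_name matches a config value.

    Assumption (not confirmed for sync-diff-inspector): a value starting
    with '~' is treated as a regex and searched without end anchoring;
    any other value is compared as a plain table name.
    """
    if not config_value.startswith("~"):
        return config_value == table_name
    return re.search(config_value[1:], table_name) is not None


# "~^record_20*" accepts any name starting with "record_2".
print(matches("~^record_20*", "record_202006"))   # True
print(matches("~^record_20*", "record_2"))        # True

# "~^record_20.*" requires the literal prefix "record_20".
print(matches("~^record_20.*", "record_2"))       # False
print(matches("~^record_20.*", "record_202006"))  # True
```

If the intent is "every table whose name starts with record_20", the `.*` form states that intent more precisely.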
That was a quick reply!
After making the change I get the error below. Please help confirm whether this case is supported.
[2020/06/23 16:43:55.625 +08:00] [ERROR] [config.go:98] [“must have more than one source tables if comparing sharding tables”] [stack=“github.com/pingcap/log.Error\
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/pkg/mod/github.com/pingcap/log@v0.0.0-20191012051959-b742a5d432e9/global.go:42\
main.(*TableConfig).Valid\
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/src/github.com/pingcap/tidb-tools/sync_diff_inspector/config.go:98\
main.(*Config).checkConfig\
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/src/github.com/pingcap/tidb-tools/sync_diff_inspector/config.go:307\
main.main\
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/src/github.com/pingcap/tidb-tools/sync_diff_inspector/main.go:54\
runtime.main\
\t/usr/local/go/src/runtime/proc.go:203”]
[2020/06/23 16:43:55.625 +08:00] [ERROR] [main.go:56] [“there is something wrong with your config, please check it!”] [stack=“github.com/pingcap/log.Error\
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/pkg/mod/github.com/pingcap/log@v0.0.0-20191012051959-b742a5d432e9/global.go:42\
main.main\
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/src/github.com/pingcap/tidb-tools/sync_diff_inspector/main.go:56\
runtime.main\
\t/usr/local/go/src/runtime/proc.go:203”]
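Please upload the complete configuration file.
OK, we'll verify this on our side.
Hello, we've checked with the developers: regular expressions are currently not supported in the source-tables section.
Also, when is-sharding is true, there must currently be more than one source table.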
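Since source-tables entries have to be written out one by one, a small script can generate them instead of typing several hundred by hand. The following is a minimal sketch in Python, not an official tool. It assumes the shard tables are named `record_YYYYMM`, which is a hypothetical naming scheme; the `instance-id`, schema, and table values are taken from the configuration posted earlier in this thread, and both helper functions are made up for illustration.

```python
from __future__ import annotations

from datetime import date


def monthly_tables(prefix: str, start: date, end: date) -> list[str]:
    """List names of the form prefix + YYYYMM for each month from start to end, inclusive."""
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(f"{prefix}{year:04d}{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names


def render_table_config(target_schema: str, target_table: str, instance_id: str,
                        source_schema: str, source_tables: list[str]) -> str:
    """Render one [[table-config]] block with an explicit entry per source table."""
    lines = [
        "[[table-config]]",
        f'schema = "{target_schema}"',
        f'table = "{target_table}"',
        "is-sharding = true",
    ]
    for name in source_tables:
        lines += [
            "",
            "[[table-config.source-tables]]",
            f'instance-id = "{instance_id}"',
            f'schema = "{source_schema}"',
            f'table = "{name}"',
        ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical date range; adjust to the months that actually exist upstream.
    tables = monthly_tables("record_", date(2019, 11, 1), date(2020, 2, 1))
    print(render_table_config("test", "record", "rds-test", "test", tables))
```

The output can be pasted into the configuration file in place of the single regex entry. If the real table names do not follow a date pattern, the list could instead be read from the source database's `information_schema.tables`.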
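Is there a plan to support this? Data import already supports it, so I hope verification can be brought in line.
If the upstream shard table names follow a consistent pattern, you can try matching them with table-rules. With table-rules you don't need the table-config section at all. Only when the shard names are inconsistent and have to be matched individually do you need to list them one by one: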
[[table-rules]]
schema-pattern = "diff_test"
table-pattern = "t*"
target-schema = "diff_test"
target-table = "t_10"
[[check-tables]]
schema = "diff_test"
tables = ["t_10"]
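Before running a check with a wildcard rule like this, it can help to dry-run the pattern against the list of upstream table names to see what it would pick up. The sketch below uses Python's `fnmatchcase` as a stand-in for the tool's wildcard matching. That is an approximation and not the tool's actual matcher, and the table names and the `routed_tables` helper are hypothetical.

```python
from __future__ import annotations

from fnmatch import fnmatchcase


def routed_tables(table_pattern: str, source_tables: list[str]) -> list[str]:
    """Return the source tables whose names match the wildcard pattern."""
    return [name for name in source_tables if fnmatchcase(name, table_pattern)]


# Hypothetical upstream table names.
candidates = ["t_1", "t_2", "t_10", "tmp_backup", "orders"]

# A broad pattern also catches unrelated tables that happen to start with "t".
print(routed_tables("t*", candidates))   # ['t_1', 't_2', 't_10', 'tmp_backup']

# A tighter pattern limits the match to the shard tables.
print(routed_tables("t_*", candidates))  # ['t_1', 't_2', 't_10']
```

Choosing the narrowest pattern that still covers every shard reduces the chance of comparing an unrelated table against the target.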
I tried it, and after startup it keeps reporting this error:
[2020/06/24 14:29:04.355 +08:00] [WARN] [diff.go:648] [“save table summary info failed”] [schema=assets_finance] [table=record_voucher_detail] [error=“chunks of instanceID target schema assets_finance table record_voucher_detail not found”] [errorVerbose=“chunks of instanceID target schema assets_finance table record_voucher_detail not found
github.com/pingcap/errors.NotFoundf
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/juju_adaptor.go:117
github.com/pingcap/tidb-tools/pkg/diff.getChunkSummary
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/src/github.com/pingcap/tidb-tools/pkg/diff/checkpoint.go:207
github.com/pingcap/tidb-tools/pkg/diff.updateTableSummary
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/src/github.com/pingcap/tidb-tools/pkg/diff/checkpoint.go:227
github.com/pingcap/tidb-tools/pkg/diff.(*TableDiff).UpdateSummaryInfo.func1.1
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/src/github.com/pingcap/tidb-tools/pkg/diff/diff.go:646
github.com/pingcap/tidb-tools/pkg/diff.(*TableDiff).UpdateSummaryInfo.func1
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/src/github.com/pingcap/tidb-tools/pkg/diff/diff.go:667
runtime.goexit
\t/usr/local/go/src/runtime/asm_amd64.s:1357”]
[2020/06/24 14:29:14.355 +08:00] [WARN] [diff.go:648] [“save table summary info failed”] [schema=assets_finance] [table=record_voucher_detail] [error=“chunks of instanceID target schema assets_finance table record_voucher_detail not found”] [errorVerbose=“chunks of instanceID target schema assets_finance table record_voucher_detail not found
github.com/pingcap/errors.NotFoundf
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/juju_adaptor.go:117
github.com/pingcap/tidb-tools/pkg/diff.getChunkSummary
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/src/github.com/pingcap/tidb-tools/pkg/diff/checkpoint.go:207
github.com/pingcap/tidb-tools/pkg/diff.updateTableSummary
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/src/github.com/pingcap/tidb-tools/pkg/diff/checkpoint.go:227
github.com/pingcap/tidb-tools/pkg/diff.(*TableDiff).UpdateSummaryInfo.func1.1
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/src/github.com/pingcap/tidb-tools/pkg/diff/diff.go:646
github.com/pingcap/tidb-tools/pkg/diff.(*TableDiff).UpdateSummaryInfo.func1
\t/home/jenkins/agent/workspace/build_tidb_tools_master/go/src/github.com/pingcap/tidb-tools/pkg/diff/diff.go:667
runtime.goexit
\t/usr/local/go/src/runtime/asm_amd64.s:1357”]
^C
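The downstream target schema assets_finance table record_voucher_detail does exist.
The configuration file is attached below.
Note: the upstream and downstream tables have the same columns, but their primary keys differ. Because every shard table uses an auto-increment id, the downstream table uses a composite primary key over several columns.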
diff-test.toml (2.4 KB)
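Hello,
Please run show create table assets_finance.record_voucher_detail \G in a shell on the downstream
and send back a screenshot, to confirm that the table really exists.
PS: try again with the instance id removed and see whether the error still appears.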
[WARN] [diff.go:648] [“save table summary info failed”]
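There is an existing issue for this problem, https://github.com/pingcap/tidb-tools/issues/354 , which describes the cause. The problem only prints noisy log output and does not affect the check result. It has been fixed in https://github.com/pingcap/tidb-tools/pull/355 .
In short, diff periodically queries the chunk states in the checkpoint table and summarizes them into status information that it prints, so you can follow the check progress. If a table has not yet been split into chunks when the query runs, the query returns no data and this warning is reported.
The instance id must be configured; otherwise the configuration check fails with an error.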