DM 同步分库分表上游两个库的表结构不一致导致 Column count doesn't match value count

为提高效率,提问时请提供以下信息,问题描述清晰可优先响应。

  • 【TiDB 版本】:v1.0.4
  • 【问题描述】: DM 同步分库分表上游两个库的表结构不一致导致 Column count doesn’t match value count

query-status shard_merge { “result”: true, “msg”: “”, “workers”: [ { “result”: true, “worker”: “192.168.179.101:8263”, “msg”: “”, “subTaskStatus”: [ { “name”: “shard_merge”, “stage”: “Paused”, “unit”: “Sync”, “result”: { “isCanceled”: false, “errors”: [ { “Type”: “UnknownError”, “msg”: “[code=36027:class=sync-unit:scope=internal:level=high] current pos (mysql-bin|000001.000009, 3836): gen update sqls failed, schema: test, table: t_test: Column count doesn’t match value count: 5 (columns) vs 4 (values)\ngithub.com/pingcap/dm/pkg/terror.(*Error).Generate\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/terror.go:232\ngithub.com/pingcap/dm/syncer.genUpdateSQLs\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/dml.go:178\ngithub.com/pingcap/dm/syncer.(*Syncer).handleRowsEvent\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:1511\ngithub.com/pingcap/dm/syncer.(*Syncer).Run\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:1306\ngithub.com/pingcap/dm/syncer.(*Syncer).Process\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:599\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357”, “error”: null } ], “detail”: null }, “unresolvedDDLLockID”: “”, “sync”: { “totalEvents”: “0”, “totalTps”: “0”, “recentTps”: “0”, “masterBinlog”: “(mysql-bin.000009, 5535)”, “masterBinlogGtid”: “”, “syncerBinlog”: “(mysql-bin|000001.000009, 3653)”, “syncerBinlogGtid”: “”, “blockingDDLs”: [ ], “unresolvedGroups”: [ ], “synced”: false } } ], “relayStatus”: { “masterBinlog”: “(mysql-bin.000009, 5535)”, “masterBinlogGtid”: “”, “relaySubDir”: “6584405c-57e6-11ea-a55f-000c2948ff10.000001”, “relayBinlog”: “(mysql-bin.000009, 5535)”, “relayBinlogGtid”: “”, “relayCatchUpMaster”: true, “stage”: “Running”, “result”: null }, “sourceID”: “instance-2” }, { “result”: true, “worker”: “192.168.179.101:8264”, “msg”: “”, “subTaskStatus”: [ { “name”: “shard_merge”, “stage”: “Paused”, “unit”: “Sync”, “result”: { “isCanceled”: false, “errors”: [ { “Type”: “UnknownError”, “msg”: “[code=36027:class=sync-unit:scope=internal:level=high] current pos (mysql-bin|000001.000002, 3661): gen update sqls failed, schema: test, table: t_test: Column count doesn’t match value count: 5 (columns) vs 4 (values)\ngithub.com/pingcap/dm/pkg/terror.(*Error).Generate\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/terror.go:232\ngithub.com/pingcap/dm/syncer.genUpdateSQLs\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/dml.go:178\ngithub.com/pingcap/dm/syncer.(*Syncer).handleRowsEvent\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:1511\ngithub.com/pingcap/dm/syncer.(*Syncer).Run\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:1306\ngithub.com/pingcap/dm/syncer.(*Syncer).Process\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:599\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357”, “error”: null } ], “detail”: null }, “unresolvedDDLLockID”: “”, “sync”: { “totalEvents”: “0”, “totalTps”: “0”, “recentTps”: “0”, “masterBinlog”: “(mysql-bin.000002, 6214)”, “masterBinlogGtid”: “”, “syncerBinlog”: “(mysql-bin|000001.000002, 3403)”, “syncerBinlogGtid”: “”, “blockingDDLs”: [ ], “unresolvedGroups”: [ ], “synced”: false } } ], “relayStatus”: { “masterBinlog”: “(mysql-bin.000002, 6214)”, “masterBinlogGtid”: “”, “relaySubDir”: “52a7aad6-6cd6-11ea-8068-f875a467c631.000001”, “relayBinlog”: “(mysql-bin.000002, 6214)”, “relayBinlogGtid”: “”, “relayCatchUpMaster”: true, “stage”: “Running”, “result”: null }, “sourceID”: “instance-3” } ] }

请问这种有什么解决办法?

若提问为性能优化、故障排查类问题,请下载脚本运行。终端输出的打印结果,请务必全选并复制粘贴上传。

您好: 是指上游两个不同的表结构,向下游同一个表同步吗? 如果是这种没有办法解决。 只能支持相同的表结构.

是的,上游出现不同的表结构肯定是有问题的,这个上游已经修复了,但是如何跳过这个错误?

之前出错的就不管了,直接从最新的位置开始恢复执行。这个要怎么操作?

可以通过一下方式进行尝试跳过:

  1. 如果是需要跳过错误可以使用 sql-skip 或者 Binlog event filter 进行跳过,可以参考官方文档: skip-或-replace-异常的-sql-语句 以及 Binlog event filter
  2. 如果需要重置同步任务,可以参考官方文档: 如何重置数据同步任务

如果使用替换或者跳过的方式解决,那么从中断以后的数据将全部丢失, 1、是否可以查看到relay-log 中每行数据的位置信息 2、能否可以指定跳过执行多少条,或者是指定结束的pos位置?

1、如果是使用跳过的方式必定换导致数据丢失。 2、如果是使用替换的方式,不知道出错的sql是什么,通过query-error 无法查询到

通过 query-status 查询到的如下:

{ “result”: true, “msg”: “”, “workers”: [ { “result”: true, “worker”: “192.168.179.101:8263”, “msg”: “”, “subTaskStatus”: [ { “name”: “shard_merge”, “stage”: “Paused”, “unit”: “Sync”, “result”: { “isCanceled”: false, “errors”: [ { “Type”: “UnknownError”, “msg”: “[code=36027:class=sync-unit:scope=internal:level=high] current pos (mysql-bin|000001.000009, 3836): gen update sqls failed, schema: test, table: t_test: Column count doesn’t match value count: 5 (columns) vs 4 (values)\ngithub.com/pingcap/dm/pkg/terror.(*Error).Generate\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/terror.go:232\ngithub.com/pingcap/dm/syncer.genUpdateSQLs\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/dml.go:178\ngithub.com/pingcap/dm/syncer.(*Syncer).handleRowsEvent\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:1511\ngithub.com/pingcap/dm/syncer.(*Syncer).Run\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:1306\ngithub.com/pingcap/dm/syncer.(*Syncer).Process\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:599\ngithub.com/pingcap/dm/syncer.(*Syncer).Resume\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:2320\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357”, “error”: null } ], “detail”: null }, “unresolvedDDLLockID”: “”, “sync”: { “totalEvents”: “0”, “totalTps”: “0”, “recentTps”: “0”, “masterBinlog”: “(mysql-bin.000009, 5535)”, “masterBinlogGtid”: “”, “syncerBinlog”: “(mysql-bin|000001.000009, 3653)”, “syncerBinlogGtid”: “”, “blockingDDLs”: [ ], “unresolvedGroups”: [ ], “synced”: false } } ], “relayStatus”: { “masterBinlog”: “(mysql-bin.000009, 5535)”, “masterBinlogGtid”: “”, “relaySubDir”: “6584405c-57e6-11ea-a55f-000c2948ff10.000001”, “relayBinlog”: “(mysql-bin.000009, 5535)”, “relayBinlogGtid”: “”, “relayCatchUpMaster”: true, “stage”: “Running”, “result”: null }, “sourceID”: “instance-2” }, { “result”: true, “worker”: “192.168.179.101:8264”, “msg”: “”, “subTaskStatus”: [ { “name”: “shard_merge”, “stage”: “Paused”, “unit”: “Sync”, “result”: { “isCanceled”: false, “errors”: [ { “Type”: “UnknownError”, “msg”: “[code=36027:class=sync-unit:scope=internal:level=high] current pos (mysql-bin|000001.000002, 3661): gen update sqls failed, schema: test, table: t_test: Column count doesn’t match value count: 5 (columns) vs 4 (values)\ngithub.com/pingcap/dm/pkg/terror.(*Error).Generate\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/terror.go:232\ngithub.com/pingcap/dm/syncer.genUpdateSQLs\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/dml.go:178\ngithub.com/pingcap/dm/syncer.(*Syncer).handleRowsEvent\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:1511\ngithub.com/pingcap/dm/syncer.(*Syncer).Run\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:1306\ngithub.com/pingcap/dm/syncer.(*Syncer).Process\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:599\ngithub.com/pingcap/dm/syncer.(*Syncer).Resume\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/syncer/syncer.go:2320\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357”, “error”: null } ], “detail”: null }, “unresolvedDDLLockID”: “”, “sync”: { “totalEvents”: “0”, “totalTps”: “0”, “recentTps”: “0”, “masterBinlog”: “(mysql-bin.000002, 6214)”, “masterBinlogGtid”: “”, “syncerBinlog”: “(mysql-bin|000001.000002, 3403)”, “syncerBinlogGtid”: “”, “blockingDDLs”: [ ], “unresolvedGroups”: [ ], “synced”: false } } ], “relayStatus”: { “masterBinlog”: “(mysql-bin.000002, 6214)”, “masterBinlogGtid”: “”, “relaySubDir”: “52a7aad6-6cd6-11ea-8068-f875a467c631.000001”, “relayBinlog”: “(mysql-bin.000002, 6214)”, “relayBinlogGtid”: “”, “relayCatchUpMaster”: true, “stage”: “Running”, “result”: null }, “sourceID”: “instance-3” } ] }

  1. 上下游的表结构不一致,如果需要跳过的话是否能够容忍数据丢失?如果不能的话能否手工在下游同步执行一下之前引起问题的 DDL 。 然后重启 dm-worker 。再进行尝试。
  2. 如果已经找不到之前引起问题的 ddl 的话,建议可以重置同步任务,重新做一次全量 + 增量的同步以保证上下游数据的一致性。