DM reports errors during full data import

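To help us respond efficiently, please provide the following information when asking a question. Clearly described problems get priority.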

  • [TiDB version]: v1.0.3
  • [Problem description]:

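I am testing TiDB as a replica of MySQL, with DM handling real-time replication. The mydumper dump of the MySQL test database is about 380 GB, and the following kinds of errors show up during the load phase: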

1. Unknown error

  • [2019/12/26 10:48:21.193 +08:00] [ERROR] [db.go:173] ["execute statements failed after retry"] [task=test] [unit=load] ["worker ID"=5] [queries="[USE xxxxx; INSERT INTO xxxxxx VALUES\ (… [arguments="[]"] [error="[code=10006:class=database:scope=not-set:level=high] execute statement failed: commit: Error 1105: Error: KV error safe to retry tikv restarts txn: Txn(Mvcc(TxnLockNotFound { start_ts: 413487196643786755, commit_ts: 413487201742487553, key: [116, 128, 0, 0, 0, 0, 0, 5, 87, 95, 114, 128, 0, 0, 0, 0, 0, 8, 30] })) [try again later]"]

  • [2019/12/26 10:48:21.195 +08:00] [ERROR] [loader.go:259] ["fail to initial checkpoint"] [task=test] [unit=load] ["worker ID"=5] ["data file"=/home/tidb/deploy_msg/dumped_data.test/xxxxx.xxxxxx.000000105.sql] [error="[code=10006:class=database:scope=downstream:level=high] initialize checkpoint: execute statement failed: begin: context canceled"]

  • [2019/12/26 10:49:09.676 +08:00] [ERROR] [subtask.go:255] ["unit process error"] [subtask=test] [unit=Load] ["error information"="{"msg":"[code=10006:class=database:scope=downstream:level=high] restore data file (xxxxxx.xxxxxx.000000105.sql) failed: initialize checkpoint: execute statement failed: begin: context canceled\ngithub.com/pingcap/dm/pkg/terror.(*Error).Delegate\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/terror/terror.go:267\ngithub.com/pingcap/dm/pkg/conn.(*BaseConn).ExecuteSQLWithIgnoreError\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/conn/baseconn.go:145\ngithub.com/pingcap/dm/pkg/conn.(*BaseConn).ExecuteSQL\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/conn/baseconn.go:197\ngithub.com/pingcap/dm/loader.(*DBConn).executeSQL.func2\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/loader/db.go:150\ngithub.com/pingcap/dm/pkg/retry.(*FiniteRetryStrategy).Apply\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/retry/strategy.go:71\ngithub.com/pingcap/dm/pkg/conn.(*BaseConn).ApplyRetryStrategy\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/pkg/conn/baseconn.go:203\ngithub.com/pingcap/dm/loader.(*DBConn).executeSQL\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/loader/db.go:145\ngithub.com/pingcap/dm/loader.(*RemoteCheckPoint).Init\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/loader/checkpoint.go:289\ngithub.com/pingcap/dm/loader.(*Worker).dispatchSQL\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/loader/loader.go:257\ngithub.com/pingcap/dm/loader.(*Worker).restoreDataFile\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/loader/loader.go:218\ngithub.com/pingcap/dm/loader.(*Worker).run\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/loader/loader.go:206\ngithub.com/pingcap/dm/loader.(*Loader).initAndStartWorkerPool.func1\n\t/home/jenkins/agent/workspace/build_dm_master/go/src/github.com/pingcap/dm/loader/loader.go:769\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357","error":{"ErrCode":10006,"ErrClass":1,"ErrScope":2,"ErrLevel":3,"Message":"restore data file (dwd_message_center.me_task_params.000000105.sql) failed: initialize checkpoint: execute statement failed: begin: context canceled","RawCause":"context canceled"}}"]

2. Can't create database

  • [2019/12/26 10:49:11.433 +08:00] [ERROR] [baseconn.go:170] ["execute statement failed"] [task=test] [unit=load] [query="CREATE DATABASE xxxxxx;"] [argument="[]"] [error="Error 1007: Can't create database 'xxxxxx'; database exists"]

3. Table 'xxxxx.xxxxx' already exists

4. Duplicate entry '17254071' for key 'PRIMARY'

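I went through the logs and these are roughly the kinds of errors that appear. They are reproducible every time: after deleting the task and the data and rerunning the replication several times, the same errors show up on every run. Every MySQL table has an auto-increment primary key, and there are more than 300 tables.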
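To see how often each kind of error occurs across a large dm-worker log, a small tally script can help. The following is a minimal sketch, not part of DM: it matches on substrings copied from the error messages quoted above, and the category names and the function names `classify` and `summarize` are made up for this example.

```python
import re
from collections import Counter

# (category label, substring to look for). The substrings come from the
# messages quoted in this thread; the labels are invented for this sketch.
# Order matters: the first matching entry wins.
PATTERNS = [
    ("tikv_txn_retry", "TxnLockNotFound"),
    ("context_canceled", "context canceled"),
    ("database_exists", "database exists"),
    ("table_exists", "already exists"),
    ("duplicate_entry", "Duplicate entry"),
]

# The log level appears in square brackets near the start of each line.
LEVEL_RE = re.compile(r"\[(ERROR|WARN|INFO|DEBUG)\]")


def classify(line):
    """Return a category label for an ERROR line, or None for other lines."""
    match = LEVEL_RE.search(line)
    if match is None or match.group(1) != "ERROR":
        return None
    for label, needle in PATTERNS:
        if needle in line:
            return label
    return "other"


def summarize(lines):
    """Count ERROR lines per category over any iterable of log lines."""
    return Counter(c for c in map(classify, lines) if c is not None)
```

For example, `summarize(open(path, encoding="utf-8"))` with `path` pointing at the dm-worker log returns a `Counter` keyed by category. A tally like this makes it easier to tell whether the retryable TiKV errors or the "already exists" errors dominate before posting the log.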

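  1. Please post your inventory.ini, the dm-master, dm-worker, and task configurations, and the dm-worker log.
  2. This error means the database already exists; it can be ignored.
  3. Same here: the table was already imported earlier.
  4. Check whether duplicate values are being imported; look at your metadata.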
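For the fourth point, one way to check the dump itself for repeated key values is to scan a table's data files before loading. This is only a rough sketch under stated assumptions: each row sits on its own line in the form `(value,...)`, and the primary key is an integer in the first column. Neither assumption is confirmed in this thread, and the function name `duplicate_first_column` is made up for this example.

```python
import re
from collections import Counter

# Matches the leading integer of a row tuple such as `(17254071,"abc"),`.
# Assumes the integer primary key is the first column of each row.
ROW_RE = re.compile(r"^\((\d+)[,)]")


def duplicate_first_column(lines):
    """Return {key: count} for first-column values that occur more than once."""
    counts = Counter()
    for line in lines:
        match = ROW_RE.match(line)
        if match:
            counts[int(match.group(1))] += 1
    return {key: n for key, n in counts.items() if n > 1}
```

Feed it the lines of all data files belonging to one table. If the result is empty, the dump probably does not contain the duplicate, and the conflict may instead come from rows already present downstream from an earlier attempt. Note that the counter keeps every key in memory, so for very large tables it is better run one table at a time.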

Looking back at my earlier records, whenever these errors occurred, query-error showed the task as Paused.

So I'll try again and resume the task whenever this happens, then see whether any errors remain once the data load finishes.

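OK. That said, the conflict error should mean there really is duplicate data. If it still fails after you try again, please send the information requested above. Thanks.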