dm数据同步异常

为提高效率,提问时请提供以下信息,问题描述清晰可优先响应。

  • 【TiDB 版本】:tidb 4.0.8 ,dm2.0
  • 【问题描述】:利用dm工具,将分表合并同步到tidb中,task异常终止,信息如下:
    “stage”: “Paused”,
    “unit”: “Load”,
    “result”: {
    “isCanceled”: false,
    “errors”: [
    {
    “ErrCode”: 10006,
    “ErrClass”: “database”,
    “ErrScope”: “downstream”,
    “ErrLevel”: “high”,
    “Message”: “file ccounting.stock_detail10.79.sql: execute statement failed: commit”,
    “RawCause”: “Error 1062: Duplicate entry ‘1275294250175176704’ for key ‘PRIMARY’”,
    “Workaround”: “”
    去源表(mysql)中查询,各分表,此条数据仅存一条,,然后使用tiup dmctl --master-addr 10.0.79.232:8261 resume-task allbasicdata 恢复任务,同步任务 又恢复正常了,想问一下,为什么回出现这种情况

若提问为性能优化、故障排查类问题,请下载脚本运行。终端输出的打印结果,请务必全选并复制粘贴上传。

可以先参考下面的帖子里的方法排查下:

我这个是全量同步的时候出现主键冲突的问题的。按照提供的方法排查,原来是所有分表都在一个task,不知道是否是因为这个原因,现把一个分表一个task方式运行,观察是否还会出现这个问题

你可以先尝试下,一般在全量同步阶段出现主键冲突,问题出在上游有重复数据或者下游已经存在相同数据的可能性较大。

上游的数据主键是雪花算法生成的(查看上游分片,此条数据仅一条,)下游是已经存在,全量同步到一半出现主键冲突

那建议将下游数据清空后重新全量同步下。

下游数据本来是不存在的,整个全量同步同步到一般出现主键冲突

好的,后面如果再出现了,麻烦提供下详细的报错和 dm-worker 日志信息。

感谢xxx

:+1::+1::+1:

具体信息内容:

全量load的时候又出现这个错误了
task具体日志信息;
task相应的内容:
“subTaskStatus”: [
{
“name”: “voucher_header”,
“stage”: “Paused”,
“unit”: “Load”,
“result”: {
“isCanceled”: false,
“errors”: [
{
“ErrCode”: 10006,
“ErrClass”: “database”,
“ErrScope”: “downstream”,
“ErrLevel”: “high”,
“Message”: “file finance.voucher_header_2127.0.sql: execute statement failed: commit”,
“RawCause”: “Error 1062: Duplicate entry ‘C20200221270306830’ for key ‘pack_doc_no_UNIQUE’”,
“Workaround”: “”
}
],
“detail”: null
},
“unresolvedDDLLockID”: “”,
dm_worker.log

[2020/11/19 14:43:29.989 +08:00] [INFO] [checkpoint.go:341] [“initial checkpoint record”] [task=voucher_header] [unit=load] [component=“remote checkpoint”] [sql=“INSERT INTO meta_finance.voucher_header_loader_checkpoint (id, filename, cp_schema, cp_table, offset, end_pos) VALUES(?,?,?,?,?,?)”] [id=mysql-7491supplier] [filename=finance.voucher_header_2127.0.sql] [schema=finance] [table=voucher_header_2127] [offset=0] [“end position”=33554695]
[2020/11/19 14:43:29.998 +08:00] [ERROR] [db.go:175] [“execute statements failed after retry”] [task=voucher_header] [unit=load] [“worker ID”=1] [queries="[INSERT INTO meta_finance.voucher_header_loader_checkpoint (id, filename, cp_schema, cp_table, offset, end_pos) VALUES(?,?,?,?,?,?)]"] [arguments="[[mysql-7491supplier finance.voucher_header_2127.0.sql finance voucher_header_2127 0 33554695]]"] [error="[code=10006:class=database:scope=not-set:level=high], Message: execute statement failed: commit, RawCause: Error 1062: Duplicate entry ‘mysql-7491supplier-finance.voucher_header_2127.0.sql’ for key ‘uk_id_f’"]
[2020/11/19 14:43:29.998 +08:00] [ERROR] [checkpoint.go:355] [“checkpoint record already exists, skip it.”] [task=voucher_header] [unit=load] [component=“remote checkpoint”] [id=mysql-7491supplier] [filename=finance.voucher_header_2127.0.sql]
[2020/11/19 14:43:30.105 +08:00] [INFO] [loader.go:301] [“data are scanned finished.”] [task=voucher_header] [unit=load] [“worker ID”=1] [“data file”=dumped_data.voucher_header/finance.voucher_header_2127.0.sql] [offset=0]
[2020/11/19 14:43:30.367 +08:00] [ERROR] [db.go:175] [“execute statements failed after retry”] [task=voucher_header] [unit=load] [“worker ID”=1] [queries="[USE finance; INSERT INTO voucher_header (id,bill_type,buss_sys,buss_doc_type,buss_doc_no,update_mod,pack_doc_no,doc_type,comp_code,comp_name,ac_doc_no,fisc_year,doc_date,pstng_date,trans_date,header_txt,ref_doc_no,currency,exch_rate,can_doc_no,reason_rev,cc_doc_no,name,city,bank_no,bank_acct,country,street,tax_no_1,entry_date,entry_time,message_id,message_text,outbound_flg,business_detail_Id,create_by,create_time,update_by,update_time,source_type,upload_time,callback_time,write_off,write_off_bill,write_off_id,origin_pack_doc_no,sap_voucher_no,buss_complex_key) VALUES\ (1207346944930770946,2,‘SCC’?绁?,‘TOW0B5000494’,‘A’,‘C20191221270352568’,‘RC’,‘2127’,$.?..?..’,NULL,‘2019’,‘20191218’,‘20191218’,NULL,’?€?.,‘OU191218000544’,‘CNY’,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NULL,NUL?.,‘3’,xxx,’’…"] [arguments="[]"] [error="[code=10006:class=database:scope=not-set:level=high], Message: execute statement failed: commit, RawCause: Error 1062: Duplicate entry ‘C20200221270306830’ for key ‘pack_doc_no_UNIQUE’"]
[2020/11/19 14:43:30.367 +08:00] [INFO] [loader.go:244] [“finish to restore dump sql file”] [task=voucher_header] [unit=load] [“worker ID”=1] [“data file”=dumped_data.voucher_header/finance.voucher_header_2127.0.sql]
[2020/11/19 14:43:30.367 +08:00] [INFO] [loader.go:227] [“start to restore dump sql file”] [task=voucher_header] [unit=load] [“worker ID”=1] [“data file”=dumped_data.voucher_header/finance.voucher_header_2115.26.sql]
[2020/11/19 14:43:30.367 +08:00] [INFO] [status.go:70] [“progress status of load”] [task=voucher_header] [unit=load] [finished_bytes=267325249] [total_bytes=5111111135] [total_file_count=188] [progress=“5.23 %”]
[2020/11/19 14:43:30.368 +08:00] [INFO] [checkpoint.go:341] [“initial checkpoint record”] [task=voucher_header] [unit=load] [component=“remote checkpoint”] [sql=“INSERT INTO meta_finance.voucher_header_loader_checkpoint (id, filename, cp_schema, cp_table, offset, end_pos) VALUES(?,?,?,?,?,?)”] [id=mysql-7491supplier] [filename=finance.voucher_header_2115.26.sql] [schema=finance] [table=voucher_header_2115] [offset=0] [“end position”=33554560]
[2020/11/19 14:43:30.368 +08:00] [INFO] [loader.go:148] [“context canceled, execution goroutine exits”] [task=voucher_header] [unit=load] [“worker ID”=1]
[2020/11/19 14:43:30.368 +08:00] [ERROR] [loader.go:274] [“fail to initial checkpoint”] [task=voucher_header] [unit=load] [“worker ID”=1] [“data file”=dumped_data.voucher_header/finance.voucher_header_2115.26.sql] [offset=0] [error="[code=10006:class=database:scope=downstream:level=high], Message: initialize checkpoint: execute statement failed: begin, RawCause: context canceled"]
[2020/11/19 14:43:30.368 +08:00] [INFO] [loader.go:126] [“start to close…”] [task=voucher_header] [unit=load] [“worker ID”=1]
[2020/11/19 14:43:30.368 +08:00] [INFO] [loader.go:129] [“closed !!!”] [task=voucher_header] [unit=load] [“worker ID”=1]
[2020/11/19 14:43:30.481 +08:00] [INFO] [loader.go:244] [“finish to restore dump sql file”] [task=voucher_header] [unit=load] [“worker ID”=0] [“data file”=dumped_data.voucher_header/finance.voucher_header_2210.14.sql]
[2020/11/19 14:43:30.481 +08:00] [INFO] [loader.go:199] [“context canceled, main goroutine exits”] [task=voucher_header] [unit=load] [“worker ID”=0]
[2020/11/19 14:43:30.481 +08:00] [INFO] [loader.go:126] [“start to close…”] [task=voucher_header] [unit=load] [“worker ID”=0]
[2020/11/19 14:43:30.481 +08:00] [INFO] [loader.go:129] [“closed !!!”] [task=voucher_header] [unit=load] [“worker ID”=0]
[2020/11/19 14:43:30.481 +08:00] [INFO] [loader.go:674] [“all data files have been finished”] [task=voucher_header] [unit=load] [“cost time”=1m39.961854792s]
[2020/11/19 14:43:30.481 +08:00] [INFO] [loader.go:716] [“stop importing data process”] [task=voucher_header] [unit=load]
[2020/11/19 14:43:30.481 +08:00] [INFO] [subtask.go:281] [“unit process returned”] [subtask=voucher_header] [unit=Load] [stage=Paused] [status="{“finishedBytes”:268325574,“totalBytes”:5111111135,“progress”:“5.25 %”,“metaBinlog”:"(my-binlog.001027, 867579075)"}"]
[2020/11/19 14:43:30.481 +08:00] [ERROR] [subtask.go:300] [“unit process error”] [subtask=voucher_header] [unit=Load] [“error information”="{“ErrCode”:10006,“ErrClass”:“database”,“ErrScope”:“downstream”,“ErrLevel”:“high”,“Message”:“file finance.voucher_header_2127.0.sql: execute statement failed: commit”,“RawCause”:“Error 1062: Duplicate entry ‘C20200221270306830’ for key ‘pack_doc_no_UNIQUE’”}"]

[2020/11/19 14:43:32.244 +08:00] [WARN] [task_checker.go:317] [“task can’t auto resume”] [component=“task checker”] [task=voucher_header]
[2020/11/19 14:43:37.244 +08:00] [WARN] [task_checker.go:314] [“task can’t auto resume”] [component=“task checker”] [task=voucher_header] [“paused duration”=4.999960279s]
[2020/11/19 14:43:42.244 +08:00] [WARN] [task_checker.go:314] [“task can’t auto resume”] [component=“task checker”] [task=voucher_header] [“paused duration”=9.99994022s]
[2020/11/19 14:43:47.244 +08:00] [WARN] [task_checker.go:314] [“task can’t auto resume”] [component=“task checker”] [task=voucher_header] [“paused duration”=14.999973116s]

这次报错提示的是唯一键冲突,应该是字段 pack_doc_no 上有重复数据:

“Error 1062: Duplicate entry ‘C20200221270306830’ for key ‘pack_doc_no_UNIQUE’”

你检查下上游是否存在 id 不同但 pack_doc_no 相同的记录。

检查了一遍,仅此一条

你在上游是根据 id 查询的还是 pack_doc_no 字段查询的?

pack_doc_no,pack_doc_no是唯一键

通过resume-task的方式,让全量同步继续同步,增量同步完成之后,tidb和mysql数据总条数是相等的

好的,我们这边再分析下,有结论了反馈给你。