REPLACE INTO sql入库失败

为提高效率,提问时请提供以下信息,问题描述清晰可优先响应。

  • 【TiDB 版本】:v3.0.14
  • 【问题描述】:dm版本为1.0.5

背景: mysql数据通过dm组件实时同步写入tidb集群中。dm配置文件中开启了 safe-mode: true 参数 。 tidb集群环境开启了 tidb_disable_txn_auto_retry = off 参数 。

问题: 帮确认下该sql写入tidb失败的原因是什么呢? 该怎么排查呢 ? 有什么可参数的参数来避免该问题出现吗 ? 多谢

dm-worker.log日志

[2020/06/02 15:46:04.230 +08:00] [WARN] [db.go:285] ["execute transaction"] [task=prod_order_dm] [unit="binlog replication"] [query="[REPLACE INTO `pt_order`.`mt_order` (`pid`,`id`,`union_order`,`flag`,`buyer_id`,`recv_mobile`,`mall_id`,`goods_id`,`goods_icon`,`goods_sku_id`,`goods_spec_desc`,`goods_snap_id`,`goods_group_id`,`goods_group_size`,`out_sku_sn`,`goods_unit_price`,`goods_quantity`,`goods_ship_limit_seconds`,`goods_cat_id`,`success_time`,`pay_success_time`,`process_status`,`finished_status`,`finish_on_refund_end`,`refund_status`,`lock_status`,`lucky_draw_id`,`lucky_deadline`,`group_order_id`,`group_deadline`,`ship_deadline`,`buyer_remark`,`recv_type`,`auto_sign`,`forbid_refund`,`address_id`,`address_snap_id`,`custom_addr_id`,`mall_confirm_deadline`,`mall_confirm_time`,`buyer_confirm_deadline`,`buyer_confirm_time`,`buyer_confirm_delay_count`,`review_deadline`,`buyer_review_count`,`payment_deadline`,`payment_money_type`,`payment_total`,`payment_express`,`payment_system_discount`,`payment_activity_discount`,`payment_mall_discount`,`mall_coins_discount`,`system_coins_discount`,`payment_cash_card`,`payment_method`,`payment_id`,`coupon..."] [argument="[[283241175 1369122624497565732 0 1 19734075 15985319923 140088 41182287 oss/1/202005/f89107659f1d5d7d-4626x3084.jpg 1341848979  1588938194721970006-92049482 0 0  3200 1 172800 22140 2020-06-02 12:21:08 2020-06-02 12:21:08 4 0 0 0 1 0 0000-00-00 00:00:00 0 0000-00-00 00:00:00 0000-00-00 00:00:00  0 0 0 74721946 1591071667866932459-94023338 0 0000-00-00 00:00:00 0000-00-00 00:00:00 0000-00-00 00:00:00 0000-00-00 00:00:00 0 0000-00-00 00:00:00 0 2020-06-02 12:51:07 0 0 0 3200 0 0 0 0 0 0 0 平台券[无门槛券]  0 0 0 0  0 0 0 0 8 0 0  0 60.219.33.107  0 2020-06-02 12:21:07 2020-06-02 12:21:08 19 1250 20200602] [114223982 1369122624497565732 19734075 140088 0  3 2020-06-02 12:21:08 2020-06-02 12:21:08  0   0 0 0  0 0 0 0 0 0] [283241175] [283241175 1369122624497565732 0 1 19734075 15985319923 140088 41182287 oss/1/202005/f89107659f1d5d7d-4626x3084.jpg 1341848979  1588938194721970006-92049482 0 0  3200 1 172800 22140 2020-06-02 12:21:08 2020-06-02 12:21:08 4 0 0 0 1 0 0000-00-00 00:00:00 0 0000-00-00 00:00:00 ..."] ["cost time"=3.473030143s]
tidb.log日志

[2020/06/02 15:46:04.215 +08:00] [INFO] [lock_resolver.go:487] ["resolveLock rollback"] [lock="key: {tableID=373, indexID=7, indexValues={41182287, 15985319923, 4, 2882303761844518947, }}, primary: {tableID=373, indexID=1, indexValues={283241175, }}, txnStartTS: 417093113808683013, ttl: 3002, type: Del"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.213:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.215:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.215:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.214:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.212:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.214:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.215:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.212:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.212:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.212:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.214:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [session.go:478] [sql] [conn=252] [label=general] [error="[kv:9007]Write conflict, txnStartTS=417093113808683011, conflictStartTS=417093113808683013, conflictCommitTS=417093113808683013, key={tableID=373, indexID=7, indexValues={41182287, 15985319923, 4, 2882303761844518947, }} primary={tableID=371, indexID=1, indexValues={114223982, 19734075, }} [try again later]"] [txn="Txn{state=invalid}"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [session.go:678] [retrying] [conn=252] [schemaVersion=206] [retryCnt=0] [queryNum=0] [sql="START TRANSACTION"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [session.go:678] [retrying] [conn=252] [schemaVersion=206] [retryCnt=0] [queryNum=1] [sql="REPLACE INTO `pt_order`.`mt_order` (`pid`,`id`,`union_order`,`flag`,`buyer_id`,`recv_mobile`,`mall_id`,`goods_id`,`goods_icon`,`goods_sku_id`,`goods_spec_desc`,`goods_snap_id`,`goods_group_id`,`goods_group_size`,`out_sku_sn`,`goods_unit_price`,`goods_quantity`,`goods_ship_limit_seconds`,`goods_cat_id`,`success_time`,`pay_success_time`,`process_status`,`finished_status`,`finish_on_refund_end`,`refund_status`,`lock_status`,`lucky_draw_id`,`lucky_deadline`,`group_order_id`,`group_deadline`,`ship_deadline`,`buyer_remark`,`recv_type`,`auto_sign`,`forbid_refund`,`address_id`,`address_snap_id`,`custom_addr_id`,`mall_confirm_deadline`,`mall_confirm_time`,`buyer_confirm_deadline`,`buyer_confirm_time`,`buyer_confirm_delay_count`,`review_deadline`,`buyer_review_count`,`payment_deadline`,`payment_money_type`,`payment_total`,`payment_express`,`payment_system_discount`,`payment_activity_discount`,`payment_mall_discount`,`mall_coins_discount`,`system_coins_discount`,`payment_cash_card`,`payment_method`,`payment_id`,`coupon_(len:2048)"]
[2020/06/02 15:46:04.219 +08:00] [INFO] [lock_resolver.go:487] ["resolveLock rollback"] [lock="key: {tableID=373, indexID=1, indexValues={283241175, }}, primary: {tableID=371, indexID=1, indexValues={114223982, 19734075, }}, txnStartTS: 417093113808683011, ttl: 3005, type: Put"]
[2020/06/02 15:46:04.221 +08:00] [INFO] [2pc.go:1119] ["2PC clean up done"] [conn=252] [txnStartTS=417093113808683011]
[2020/06/02 15:46:04.221 +08:00] [INFO] [lock_resolver.go:487] ["resolveLock rollback"] [lock="key: {tableID=373, handle=2882303761844518947}, primary: {tableID=373, indexID=1, indexValues={283241175, }}, txnStartTS: 417093113808683013, ttl: 3002, type: Del"]
[2020/06/02 15:46:04.223 +08:00] [WARN] [session.go:678] [retrying] [conn=252] [schemaVersion=206] [retryCnt=0] [queryNum=2] [sql="REPLACE INTO `pt_order`.`extra` (`id`,`order_id`,`buyer_id`,`mall_id`,`mall_sign_reward_coin`,`out_order_id`,`source`,`create_at`,`update_at`,`lock_reason`,`lock_handler_id`,`tb_open_id`,`tb_nickname`,`is_prime`,`prime_discount`,`account_type`,`sub_mch_id`,`sep_plat_service_fee`,`sep_mall_amount`,`sep_mt_service_fee`,`real_payment_fee`,`deduct_mall_service_fee`,`deduct_mt_service_fee`) VALUES (114223982,'1369122624497565732','19734075','140088',0,'','3','2020-06-02 12:21:08','2020-06-02 12:21:08','',0,'','',0,'0',0,'',0,0,0,0,0,0)"]
[2020/06/02 15:46:04.224 +08:00] [WARN] [session.go:678] [retrying] [conn=252] [schemaVersion=206] [retryCnt=0] [queryNum=3] [sql="DELETE FROM `pt_order`.`mt_order` WHERE `pid` = '283241175' LIMIT 1"]
[2020/06/02 15:46:04.224 +08:00] [WARN] [session.go:678] [retrying] [conn=252] [schemaVersion=206] [retryCnt=0] [queryNum=4] [sql="REPLACE INTO `pt_order`.`mt_order` (`pid`,`id`,`union_order`,`flag`,`buyer_id`,`recv_mobile`,`mall_id`,`goods_id`,`goods_icon`,`goods_sku_id`,`goods_spec_desc`,`goods_snap_id`,`goods_group_id`,`goods_group_size`,`out_sku_sn`,`goods_unit_price`,`goods_quantity`,`goods_ship_limit_seconds`,`goods_cat_id`,`success_time`,`pay_success_time`,`process_status`,`finished_status`,`finish_on_refund_end`,`refund_status`,`lock_status`,`lucky_draw_id`,`lucky_deadline`,`group_order_id`,`group_deadline`,`ship_deadline`,`buyer_remark`,`recv_type`,`auto_sign`,`forbid_refund`,`address_id`,`address_snap_id`,`custom_addr_id`,`mall_confirm_deadline`,`mall_confirm_time`,`buyer_confirm_deadline`,`buyer_confirm_time`,`buyer_confirm_delay_count`,`review_deadline`,`buyer_review_count`,`payment_deadline`,`payment_money_type`,`payment_total`,`payment_express`,`payment_system_discount`,`payment_activity_discount`,`payment_mall_discount`,`mall_coins_discount`,`system_coins_discount`,`payment_cash_card`,`payment_method`,`payment_id`,`coupon_(len:2048)"]
[2020/06/02 15:46:04.225 +08:00] [WARN] [session.go:678] [retrying] [conn=252] [schemaVersion=206] [retryCnt=0] [queryNum=5] [sql=COMMIT]
[2020/06/02 15:46:04.225 +08:00] [WARN] [session.go:699] ["transaction association"] [conn=252] ["retrying txnStartTS"=417093114713079811] ["original txnStartTS"=417093113808683011]
[2020/06/02 15:46:04.226 +08:00] [INFO] [lock_resolver.go:487] ["resolveLock rollback"] [lock="key: {tableID=373, indexID=12, indexValues={0, 2882303761844518947, }}, primary: {tableID=373, indexID=1, indexValues={283241175, }}, txnStartTS: 417093113808683013, ttl: 3002, type: Del"]
[2020/06/02 15:46:04.226 +08:00] [INFO] [lock_resolver.go:487] ["resolveLock rollback"] [lock="key: {tableID=373, indexID=21, indexValues={140088, 0, 0, 0, 2882303761844518947, }}, primary: {tableID=373, indexID=1, indexValues={283241175, }}, txnStartTS: 417093113808683013, ttl: 3002, type: Del"]
[2020/06/02 15:46:04.226 +08:00] [INFO] [lock_resolver.go:487] ["resolveLock rollback"] [lock="key: {tableID=373, indexID=14, indexValues={15985319923, 2882303761844518947, }}, primary: {tableID=373, indexID=1, indexValues={283241175, }}, txnStartTS: 417093113808683013, ttl: 3002, type: Del"]
[2020/06/02 15:46:04.226 +08:00] [INFO] [lock_resolver.go:487] ["resolveLock rollback"] [lock="key: {tableID=373, indexID=13, indexValues={, 2882303761844518947, }}, primary: {tableID=373, indexID=1, indexValues={283241175, }}, txnStartTS: 417093113808683013, ttl: 3002, type: Del"]
[2020/06/02 15:46:04.226 +08:00] [INFO] [lock_resolver.go:487] ["resolveLock rollback"] [lock="key: {tableID=373, indexID=4, indexValues={0, 4, 2882303761844518947, }}, primary: {tableID=373, indexID=1, indexValues={283241175, }}, txnStartTS: 417093113808683013, ttl: 3002, type: Del"]
[2020/06/02 15:46:04.226 +08:00] [INFO] [lock_resolver.go:487] ["resolveLock rollback"] [lock="key: {tableID=373, indexID=8, indexValues={0, 2882303761844518947, }}, primary: {tableID=373, indexID=1, indexValues={283241175, }}, txnStartTS: 417093113808683013, ttl: 3002, type: Del"]
[2020/06/02 15:46:04.226 +08:00] [INFO] [lock_resolver.go:487] ["resolveLock rollback"] [lock="key: {tableID=373, indexID=9, indexValues={0, 2882303761844518947, }}, primary: {tableID=373, indexID=1, indexValues={283241175, }}, txnStartTS: 417093113808683013, ttl: 3002, type: Del"]
[2020/06/02 15:46:04.226 +08:00] [INFO] [lock_resolver.go:487] ["resolveLock rollback"] [lock="key: {tableID=373, indexID=17, indexValues={140088, 1848310130177736704, 2882303761844518947, }}, primary: {tableID=373, indexID=1, indexValues={283241175, }}, txnStartTS: 417093113808683013, ttl: 3002, type: Del"]
[2020/06/02 15:46:04.226 +08:00] [INFO] [lock_resolver.go:487] ["resolveLock rollback"] [lock="key: {tableID=373, indexID=3, indexValues={0, 4, 2882303761844518947, }}, primary: {tableID=373, indexID=1, indexValues={283241175, }}, txnStartTS: 417093113808683013, ttl: 3002, type: Del"]
[2020/06/02 15:46:04.226 +08:00] [INFO] [lock_resolver.go:487] ["resolveLock rollback"] [lock="key: {tableID=373, indexID=2, indexValues={1369122624497565732, 2882303761844518947, }}, primary: {tableID=373, indexID=1, indexValues={283241175, }}, txnStartTS: 417093113808683013, ttl: 3002, type: Del"]
[2020/06/02 15:46:04.294 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.212:20460] [cause="context canceled"]
[2020/06/02 15:46:04.294 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.211:20460] [cause="context canceled"]
[2020/06/02 15:46:04.294 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.213:20460] [cause="context canceled"]
[2020/06/02 15:46:04.294 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.213:20460] [cause="context canceled"]
[2020/06/02 15:46:04.294 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.212:20460] [cause="context canceled"]
[2020/06/02 15:46:04.294 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.212:20460] [cause="context canceled"]
[2020/06/02 15:46:04.294 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.213:20460] [cause="context canceled"]
[2020/06/02 15:46:04.294 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.214:20460] [cause="context canceled"]
[2020/06/02 15:46:04.294 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.213:20460] [cause="context canceled"]
[2020/06/02 15:46:04.294 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.212:20460] [cause="context canceled"]
[2020/06/02 15:46:04.295 +08:00] [WARN] [session.go:478] [sql] [conn=162] [label=general] [error="[kv:9007]Write conflict, txnStartTS=417093113808683013, conflictStartTS=417093114713079811, conflictCommitTS=417093114713079813, key={tableID=373, indexID=10, indexValues={126427536, 2882303761844518947, }} primary={tableID=373, indexID=1, indexValues={283241175, }} [try again later]"] [txn="Txn{state=invalid}"]
[2020/06/02 15:46:04.295 +08:00] [WARN] [session.go:678] [retrying] [conn=162] [schemaVersion=206] [retryCnt=0] [queryNum=0] [sql="START TRANSACTION"]
[2020/06/02 15:46:04.295 +08:00] [WARN] [session.go:678] [retrying] [conn=162] [schemaVersion=206] [retryCnt=0] [queryNum=1] [sql="DELETE FROM `pt_order`.`mt_order` WHERE `pid` = '283241175' LIMIT 1"]
[2020/06/02 15:46:04.296 +08:00] [WARN] [session.go:678] [retrying] [conn=162] [schemaVersion=206] [retryCnt=0] [queryNum=2] [sql=COMMIT]
[2020/06/02 15:46:04.296 +08:00] [WARN] [session.go:699] ["transaction association"] [conn=162] ["retrying txnStartTS"=417093114726187014] ["original txnStartTS"=417093113808683013]
[2020/06/02 15:46:04.297 +08:00] [INFO] [2pc.go:1119] ["2PC clean up done"] [conn=162] [txnStartTS=417093113808683013]

详见附件

dm-worker.log (9.6 KB)

tidb.log (31.0 KB)

请问,query-status当前是什么状态?

是 running 状态 ,任务正常运行 ,没出现异常 。

谢谢

看报错是提交时,遇到写入冲突了,你又开启了参数,应该会重试。可以观察下query-status,或者查一下下游是否插入数据,多谢。

query-status 目前看都是running状态,没出现过异常 。

一共有3个tidb节点,报错的这个tidb节点只有dm实时同步源mysql时才会有写入,没有业务读;其他2个tidb节点没有写场景只有读场景 。所以这个问题帮看看该如何排查呢 ,多谢

你好,

此 replace into 是否可在 tidb 中执行,可以试下,语句中出现 ‘0000-00-00 00:00:00’ 看下 tidb sql_mode 是否满足。

这个sql在tidb中重试执行应该是没问题的,每天这个表有百万级的更新和写入,断断续续的会出现几次这种报错,导致数据入tidb失败。

目前的处理方案是:从报错的日志中拿到表的主键id,去上游库中人肉触发一次更新,下游tidb即可正常写入成功 , sql逻辑如下 :

update table_name set update_time = 原更新时间+1秒 where id = xxxx ;

感谢

你好,

可以将 tidb_disable_txn_auto_retry 设置为 on,replace into 也会出现写写冲突,

目前是否有 query-status 报错的完整记录,可以上传看下。对应 tidb.log 和 dm-worker.log 是否有 error 字样,或者将日志截取上传看下。

tidb_disable_txn_auto_retry 设置为 on 的话,这个sql在写冲突时就不会重试了吧 。

默认重试10次, 下面这种是重试的日志吗 ?

对应 tidb.log 和 dm-worker.log 在帖子最上方第一次发帖时已进行了上传,请查询 。 感谢


[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.213:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.215:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.215:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.214:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.212:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.214:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.215:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.212:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.212:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:590] ["wait response is cancelled"] [to=192.168.17.212:20460] [cause="context canceled"]
[2020/06/02 15:46:04.217 +08:00] [WARN] [client_batch.go:577] ["send request is cancelled"] [to=192.168.17.214:20460] [cause="context canceled"]

建议关闭重试,dm-worker 这边有自己的重试机制,并降低 syncer 的并发度。

看是否有所缓解,并观察降低并发是否满足同步需求。

嗯 ,这个可以一试 。不过看文档说明,在出现写入冲突时,建议是打开tidb的事务重试吧?

我们目前syncer的并发度是16 ,应该不高吧?

我们一个yml配置文件里有多个mysql-instances做合库合表,那么这里的并发16,是多个mysql-instances对应的dm-worker共用的么? 还是每个dm-worker都是16并发呢 ? 帮确认下

多谢

syncers:
  global:
    worker-count: 16
    batch: 100
    max-retry: 100
    safe-mode: true

如下图,一个yml配置文件下有多个dm-worker子任务

在 dm 环境下不建议开启事务重试,dm 有重试的机制,

所有 source-id 公用 16 线程。

safe_mode 在集群正常状态下不建议一直开启,replace into 还是有性能损耗的。

可以先按照上面你的优化方法做一下,观察下。

好的 我们试一下看看效果 多谢

:ok_hand: