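【TiDB Environment】Production, 4 TiKV nodes
【TiDB Version】TiKV 6.1
【Reproduction Path】
A service listens to CDC events and, based on each event it receives, writes to TiKV using txnkv.
Range delete operations run concurrently with the above.
【Problem: Symptoms and Impact】
When a transaction writes to TiKV, the client reports the following error: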
[ERROR] [commit.go:182] ["2PC failed commit key after primary key committed"] [error="Error(Txn(Error(Mvcc(Error(TxnLockNotFound { start_ts: TimeStamp(439331133747363841), commit_ts: TimeStamp(439331134009508037), key: [0, 0, 0, 0, 0, 0, 0, 11, 0, 0, 88, 195, 0, 0, 1, 132, 128, 217, 196, 16, 151, 5, 146, 83, 118, 61, 0, 0] })))))"] [errorVerbose="Error(Txn(Error(Mvcc(Error(TxnLockNotFound { start_ts: TimeStamp(439331133747363841), commit_ts: TimeStamp(439331134009508037), key: [0, 0, 0, 0, 0, 0, 0, 11, 0, 0, 88, 195, 0, 0, 1, 132, 128, 217, 196, 16, 151, 5, 146, 83, 118, 61, 0, 0] })))))\ngithub.com/tikv/client-go/v2/error.ExtractKeyErr\n\t/go/pkg/mod/github.com/tikv/client-go/v2@v2.0.1-0.20220531092439-efebaeb9fe53/error/error.go:259\ngithub.com/tikv/client-go/v2/txnkv/transaction.actionCommit.handleSingleBatch\n\t/go/pkg/mod/github.com/tikv/client-go/v2@v2.0.1-0.20220531092439-efebaeb9fe53/txnkv/transaction/commit.go:171\ngithub.com/tikv/client-go/v2/txnkv/transaction.(*batchExecutor).startWorker.func1\n\t/go/pkg/mod/github.com/tikv/client-go/v2@v2.0.1-0.20220531092439-efebaeb9fe53/txnkv/transaction/2pc.go:1993\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1571"] [txnStartTS=439331133747363841] [commitTS=439331134009508037] [keys="[000000000000000b00004e1900000186345e80b0fee6f0a456410000,000000000000000b000058c30000018480d9c41097059253763d0000,000000000000000b00005a1f000001863424e69897059253762c0000]"] [stack="github.com/tikv/client-go/v2/txnkv/transaction.actionCommit.handleSingleBatch\n\t/go/pkg/mod/github.com/tikv/client-go/v2@v2.0.1-0.20220531092439-efebaeb9fe53/txnkv/transaction/commit.go:182\ngithub.com/tikv/client-go/v2/txnkv/transaction.(*batchExecutor).startWorker.func1\n\t/go/pkg/mod/github.com/tikv/client-go/v2@v2.0.1-0.20220531092439-efebaeb9fe53/txnkv/transaction/2pc.go:1993"]
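From that moment on, the service stopped receiving CDC events. The monitoring dashboard shows that the min resolved ts of one machine kept lagging and did not advance.
Judging from the Go TiKV client code, the error "2PC failed commit key after primary key committed" looks like it could indicate a very serious bug. Is that the case?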
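To line the error up with the monitoring timeline, the two timestamps in the error can be converted to wall-clock time. The following is a minimal sketch, assuming the usual PD TSO layout in which the low 18 bits hold a logical counter and the remaining high bits hold Unix milliseconds. The helper name `splitTSO` is made up for illustration; client-go's `oracle` package offers `ExtractPhysical` for the same purpose.

```go
package main

import (
	"fmt"
	"time"
)

// logicalBits is the number of low bits assumed to hold the logical counter.
const logicalBits = 18

// splitTSO (hypothetical helper) splits a TSO timestamp into its physical
// part (Unix milliseconds) and its logical counter.
func splitTSO(ts uint64) (physicalMs int64, logical uint64) {
	physicalMs = int64(ts >> logicalBits)
	logical = ts & (1<<logicalBits - 1)
	return physicalMs, logical
}

func main() {
	// Values copied from the TxnLockNotFound error above.
	startTS := uint64(439331133747363841)
	commitTS := uint64(439331134009508037)

	startMs, startLogical := splitTSO(startTS)
	commitMs, commitLogical := splitTSO(commitTS)

	fmt.Println("start_ts: ", time.UnixMilli(startMs).UTC().Format(time.RFC3339Nano), "logical", startLogical)
	fmt.Println("commit_ts:", time.UnixMilli(commitMs).UTC().Format(time.RFC3339Nano), "logical", commitLogical)
	fmt.Println("elapsed:  ", time.Duration(commitMs-startMs)*time.Millisecond)
}
```

Under that assumption, start_ts falls on 2023-02-09 at about 04:01:14 UTC and commit_ts is exactly one second later, which gives a point to look for on the TiCDC and TiKV dashboards.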
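【Resource Configuration】
【Attachments: Screenshots / Logs / Monitoring】
TiCDC monitoring
The green line in the figure is the one that kept lagging; it later recovered on its own.
Related TiKV logs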
[INFO] [commit.rs:67] ["txn conflict (lock not found)"] [commit_ts=439331134009508037] [start_ts=439331133747363841] [key=000000000000000BFF000058C300000184FF80D9C41097059253FF763D000000000000FB]
[WARN] [errors.rs:339] ["txn conflicts"] [err="Error(Txn(Error(Mvcc(Error(TxnLockNotFound { start_ts: TimeStamp(439331133747363841), commit_ts: TimeStamp(439331134009508037), key: [0, 0, 0, 0, 0, 0, 0, 11, 0, 0, 88, 195, 0, 0, 1, 132, 128, 217, 196, 16, 151, 5, 146, 83, 118, 61, 0, 0] })))))"]
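The key in the TiKV `commit.rs` log looks different from the keys in the client log because TiKV appears to print it in its memcomparable-encoded form. Below is a minimal sketch of a decoder, assuming the standard encoding (8-byte groups, each followed by a marker byte equal to 0xFF minus the number of padding bytes). The function names are made up for illustration and are not part of client-go or TiKV.

```go
package main

import (
	"encoding/hex"
	"errors"
	"fmt"
)

const (
	groupSize = 8    // data bytes per encoded group
	padMarker = 0xFF // marker of a full group; each padding byte lowers it by one
)

// decodeMemComparable (hypothetical helper) reverses the assumed
// memcomparable bytes encoding. Bytes after the final group are ignored.
func decodeMemComparable(enc []byte) ([]byte, error) {
	var raw []byte
	for {
		if len(enc) < groupSize+1 {
			return nil, errors.New("encoded key is truncated")
		}
		group, marker := enc[:groupSize], enc[groupSize]
		enc = enc[groupSize+1:]

		pad := int(padMarker - marker)
		if pad > groupSize {
			return nil, fmt.Errorf("invalid marker byte 0x%X", marker)
		}
		raw = append(raw, group[:groupSize-pad]...)
		if pad > 0 {
			// A padded group is the last one; its padding must be zero bytes.
			for _, b := range group[groupSize-pad:] {
				if b != 0 {
					return nil, errors.New("non-zero padding byte")
				}
			}
			return raw, nil
		}
	}
}

// decodeHexKey takes the hex string from the TiKV log and returns the raw
// key as lowercase hex, the format used in the client's keys=[...] field.
func decodeHexKey(encodedHex string) (string, error) {
	enc, err := hex.DecodeString(encodedHex)
	if err != nil {
		return "", err
	}
	raw, err := decodeMemComparable(enc)
	if err != nil {
		return "", err
	}
	return hex.EncodeToString(raw), nil
}

func main() {
	// Key copied from the commit.rs log line above.
	raw, err := decodeHexKey("000000000000000BFF000058C300000184FF80D9C41097059253FF763D000000000000FB")
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Println(raw)
}
```

Decoded this way, the TiKV key equals 000000000000000b000058c30000018480d9c41097059253763d0000, the second entry in the client's keys list and the byte array in the TxnLockNotFound error, so both sides are reporting the same secondary key.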