TiDB went offline after I changed the memory overcommit policy, and now it will not start

OS version & kernel version

CentOS Linux release 7.6.1810 (Core)

4.20.10-1.el7.elrepo.x86_64

TiDB version

3.0.5

Cluster nodes

Component  Machines  Instances
tidb       1         1
tikv       1         3
pd         3         3

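What I did

1. I was importing a large amount of data into TiDB with mydumper.
2. I changed the operating system's memory overcommit policy so that TiDB could not use so much memory that the machine would crash: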
sysctl -w vm.overcommit_ratio=90

sysctl -p

echo 2 > /proc/sys/vm/overcommit_memory

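I changed the memory policy while the import was still running. The TiDB client then went offline on its own, and every attempt to restart TiDB has failed since.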
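
For context on what the two settings do together: with vm.overcommit_memory=2 the kernel refuses an allocation once the total committed address space would exceed CommitLimit, which is swap plus overcommit_ratio percent of physical RAM (huge pages are subtracted first). A process whose allocation is refused gets ENOMEM, and many programs abort at that point. That is one plausible reading of the symptoms here, not a confirmed diagnosis. The sketch below estimates the limit and compares it with what the kernel reports; the function names are hypothetical helpers, not part of any tool.

```shell
#!/bin/sh
# Estimate CommitLimit (kB) as the kernel computes it when
# vm.overcommit_memory=2 and vm.overcommit_kbytes is 0:
#   (MemTotal - huge pages) * overcommit_ratio / 100 + SwapTotal
# Usage: commit_limit_kb MEM_KB SWAP_KB RATIO [HUGEPAGES_KB]
commit_limit_kb() {
    echo $(( ($1 - ${4:-0}) * $3 / 100 + $2 ))
}

# Read one field (in kB) from /proc/meminfo, e.g. MemTotal.
meminfo_kb() {
    awk -v key="$1:" '$1 == key { print $2 }' /proc/meminfo
}

# Print the estimate next to the kernel's own CommitLimit and Committed_AS.
# This helper ignores huge pages, so the two can differ on hosts that use them.
report_commit_limit() {
    ratio=$(cat /proc/sys/vm/overcommit_ratio)
    est=$(commit_limit_kb "$(meminfo_kb MemTotal)" "$(meminfo_kb SwapTotal)" "$ratio")
    echo "estimated CommitLimit: ${est} kB"
    grep -E '^(CommitLimit|Committed_AS):' /proc/meminfo
}
```

Running `report_commit_limit` on the affected host shows how close Committed_AS is to CommitLimit; when the two are nearly equal, new allocations are likely to fail. Note also that `sysctl -w` only changes the running value, while `sysctl -p` reloads /etc/sysctl.conf, so a ratio that is not written to that file will not survive a reboot.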

TiDB Log

[2019/11/11 16:36:47.912 +08:00] [INFO] [region_cache.go:287] ["invalidate current region, because others failed on same store"] [region=820] [store=192.168.181.57:20172]
[2019/11/11 16:36:47.915 +08:00] [INFO] [server.go:413] ["new connection"] [conn=213] [remoteAddr=192.168.180.32:52698]
[2019/11/11 16:36:48.058 +08:00] [INFO] [2pc.go:293] ["[BIG_TXN]"] [con=183] ["table ID"=49] [size=1938292] [keys=17949] [puts=17949] [dels=0] [locks=0] [txnStartTS=412473467242545155]
[2019/11/11 16:36:48.147 +08:00] [INFO] [server.go:416] ["connection closed"] [conn=212]
[2019/11/11 16:36:48.165 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467124842498] ["original txn"=412473439835389962] [error="[kv:9007]Write conflict, txnStartTS=412473467124842498, conflictStartTS=412473467098365955, conflictCommitTS=412473467294973958, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:48.165 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467098365956] ["original txn"=412473463022026756] [error="[kv:9007]Write conflict, txnStartTS=412473467098365956, conflictStartTS=412473467098365955, conflictCommitTS=412473467294973958, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:48.167 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473466901757953] ["original txn"=412473456979607558] [error="[kv:9007]Write conflict, txnStartTS=412473466901757953, conflictStartTS=412473467098365955, conflictCommitTS=412473467294973958, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:48.167 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467124842499] ["original txn"=412473467124842499] [error="[kv:9007]Write conflict, txnStartTS=412473467124842499, conflictStartTS=412473467098365955, conflictCommitTS=412473467294973958, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:48.167 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467294973957] ["original txn"=412473467294973957] [error="[kv:9007]Write conflict, txnStartTS=412473467294973957, conflictStartTS=412473467098365955, conflictCommitTS=412473467294973958, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:48.197 +08:00] [INFO] [2pc.go:293] ["[BIG_TXN]"] [con=171] ["table ID"=49] [size=3064994] [keys=39089] [puts=39089] [dels=0] [locks=0] [txnStartTS=412473467281866755]
[2019/11/11 16:36:48.263 +08:00] [INFO] [server.go:416] ["connection closed"] [conn=213]
[2019/11/11 16:36:48.293 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473467098365956]
[2019/11/11 16:36:48.379 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473467124842498]
[2019/11/11 16:36:48.402 +08:00] [INFO] [region_cache.go:287] ["invalidate current region, because others failed on same store"] [region=72] [store=192.168.181.57:20172]
[2019/11/11 16:36:48.515 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473466901757953]
[2019/11/11 16:36:48.638 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473467294973957]
[2019/11/11 16:36:52.571 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473467124842499]
[2019/11/11 16:36:54.314 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467347402757] ["original txn"=412473467294973957] [error="[kv:6]Error: KV error safe to retry tikv restarts txn: Txn(Mvcc(TxnLockNotFound { start_ts: 412473467347402757, commit_ts: 412473468527050753, key: [109, 68, 66, 58, 57, 53, 0, 0, 0, 252, 0, 0, 0, 0, 0, 0, 0, 72] })) [try again later]"]
[2019/11/11 16:36:54.612 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473467347402757]
[2019/11/11 16:36:54.616 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467347402756] ["original txn"=412473467124842499] [error="[kv:9007]Write conflict, txnStartTS=412473467347402756, conflictStartTS=412473467347402757, conflictCommitTS=412473467347402757, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:54.715 +08:00] [INFO] [coprocessor.go:743] ["[TIME_COP_WAIT] resp_time:304.535824ms txnStartTS:412473468985802753 region_id:72 store_addr:192.168.181.57:20172 kv_wait_ms:299"]
[2019/11/11 16:36:55.180 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473467347402756]
[2019/11/11 16:36:55.259 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473468959588353] ["original txn"=412473467294973957] [error="[kv:9007]Write conflict, txnStartTS=412473468959588353, conflictStartTS=412473467360509954, conflictCommitTS=412473469129981953, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:55.260 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467360509953] ["original txn"=412473456979607558] [error="[kv:9007]Write conflict, txnStartTS=412473467360509953, conflictStartTS=412473467360509954, conflictCommitTS=412473469129981953, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:55.260 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473469038231556] ["original txn"=412473467124842499] [error="[kv:9007]Write conflict, txnStartTS=412473469038231556, conflictStartTS=412473467360509954, conflictCommitTS=412473469129981953, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:55.497 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473468959588353]
[2019/11/11 16:36:55.497 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467373617153] ["original txn"=412473439835389962] [error="[kv:9007]Write conflict, txnStartTS=412473467373617153, conflictStartTS=412473467360509954, conflictCommitTS=412473469129981953, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:55.636 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473467360509953]
[2019/11/11 16:36:55.673 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473469038231556]
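
The txnStartTS, conflictStartTS and conflictCommitTS values in these lines are TSO timestamps issued by PD: the upper bits hold a physical time in milliseconds since the Unix epoch, and the low 18 bits hold a logical counter. Decoding them makes it easier to line transactions up with the wall-clock times in the logs. A minimal sketch, with hypothetical function names:

```shell
#!/bin/sh
# Split a TiDB/PD TSO into its physical part (Unix milliseconds)
# and its 18-bit logical counter.
tso_physical_ms() {
    echo $(( $1 >> 18 ))
}

tso_logical() {
    echo $(( $1 & 262143 ))   # 262143 = 2^18 - 1
}

# Print "<unix_seconds>.<millis> logical=<n>" for one TSO.
tso_describe() {
    ms=$(tso_physical_ms "$1")
    printf '%d.%03d logical=%d\n' $(( ms / 1000 )) $(( ms % 1000 )) "$(tso_logical "$1")"
}
```

For example, `tso_describe 412473467242545155` (the txnStartTS on the 16:36:48.058 BIG_TXN line) gives `1573461407.633 logical=3`. With GNU date, `date -u -d @1573461407` shows 08:36:47 UTC, which is 16:36:47 +08:00, just before the log line was written.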

TiKV Log

[2019/11/11 16:37:41.297 +08:00] [INFO] [raft.rs:1108] ["[region 1584] 1586 [logterm: 6, index: 8, vote: 0] cast MsgRequestVote for 1587 [logterm: 6, index: 8] at term 7"]
[2019/11/11 16:37:41.304 +08:00] [INFO] [pd.rs:566] ["try to transfer leader"] [to_peer="id: 1535 store_id: 5"] [from_peer="id: 1534 store_id: 4"] [region_id=1532]
[2019/11/11 16:37:41.304 +08:00] [INFO] [peer.rs:1762] ["transfer leader"] [peer="id: 1535 store_id: 5"] [peer_id=1534] [region_id=1532]
[2019/11/11 16:37:41.304 +08:00] [INFO] [raft.rs:1294] ["[region 1532] 1534 [term 6] starts to transfer leadership to 1535"]
[2019/11/11 16:37:41.304 +08:00] [INFO] [raft.rs:1304] ["[region 1532] 1534 sends MsgTimeoutNow to 1535 immediately as 1535 already has up-to-date log"]
[2019/11/11 16:37:41.306 +08:00] [INFO] [raft.rs:924] ["[region 1532] 1534 [term: 6] received a MsgRequestVote message with higher term from 1535 [term: 7]"]
[2019/11/11 16:37:41.306 +08:00] [INFO] [raft.rs:723] ["[region 1532] 1534 became follower at term 7"]
[2019/11/11 16:37:41.306 +08:00] [INFO] [raft.rs:1108] ["[region 1532] 1534 [logterm: 6, index: 6, vote: 0] cast MsgRequestVote for 1535 [logterm: 6, index: 6] at term 7"]
[2019/11/11 16:37:41.314 +08:00] [INFO] [pd.rs:566] ["try to transfer leader"] [to_peer="id: 1955 store_id: 5"] [from_peer="id: 1954 store_id: 4"] [region_id=1952]
[2019/11/11 16:37:41.314 +08:00] [INFO] [peer.rs:1762] ["transfer leader"] [peer="id: 1955 store_id: 5"] [peer_id=1954] [region_id=1952]
[2019/11/11 16:37:41.314 +08:00] [INFO] [raft.rs:1294] ["[region 1952] 1954 [term 6] starts to transfer leadership to 1955"]
[2019/11/11 16:37:41.314 +08:00] [INFO] [raft.rs:1304] ["[region 1952] 1954 sends MsgTimeoutNow to 1955 immediately as 1955 already has up-to-date log"]
[2019/11/11 16:37:41.317 +08:00] [INFO] [raft.rs:924] ["[region 1952] 1954 [term: 6] received a MsgRequestVote message with higher term from 1955 [term: 7]"]
[2019/11/11 16:37:41.317 +08:00] [INFO] [raft.rs:723] ["[region 1952] 1954 became follower at term 7"]
[2019/11/11 16:37:41.317 +08:00] [INFO] [raft.rs:1108] ["[region 1952] 1954 [logterm: 6, index: 8, vote: 0] cast MsgRequestVote for 1955 [logterm: 6, index: 8] at term 7"]
[2019/11/11 16:38:13.341 +08:00] [INFO] [pd.rs:566] ["try to transfer leader"] [to_peer="id: 1015 store_id: 5"] [from_peer="id: 1014 store_id: 4"] [region_id=1012]
[2019/11/11 16:38:13.341 +08:00] [INFO] [peer.rs:1762] ["transfer leader"] [peer="id: 1015 store_id: 5"] [peer_id=1014] [region_id=1012]
[2019/11/11 16:38:13.341 +08:00] [INFO] [raft.rs:1294] ["[region 1012] 1014 [term 6] starts to transfer leadership to 1015"]
[2019/11/11 16:38:13.341 +08:00] [INFO] [raft.rs:1304] ["[region 1012] 1014 sends MsgTimeoutNow to 1015 immediately as 1015 already has up-to-date log"]
[2019/11/11 16:38:13.380 +08:00] [INFO] [raft.rs:924] ["[region 1012] 1014 [term: 6] received a MsgRequestVote message with higher term from 1015 [term: 7]"]
[2019/11/11 16:38:13.380 +08:00] [INFO] [raft.rs:723] ["[region 1012] 1014 became follower at term 7"]
[2019/11/11 16:38:13.380 +08:00] [INFO] [raft.rs:1108] ["[region 1012] 1014 [logterm: 6, index: 3415, vote: 0] cast MsgRequestVote for 1015 [logterm: 6, index: 3415] at term 7"]
[2019/11/11 16:38:13.505 +08:00] [INFO] [compact.rs:118] ["compact range finished"] [time_takes=514.809261ms] [cf=lock] [range_end=None] [range_start=None]
[2019/11/11 17:17:12.667 +08:00] [ERROR] [kv.rs:731] ["KvService::batch_raft send response fail"] [err=RemoteStopped]
[2019/11/11 17:17:12.673 +08:00] [INFO] [signal_handler.rs:21] ["receive signal 15, stopping server..."]
[2019/11/11 17:17:12.673 +08:00] [INFO] [mod.rs:374] ["stoping worker"] [worker=snap-handler]
[2019/11/11 17:17:12.675 +08:00] [ERROR] [kv.rs:731] ["KvService::batch_raft send response fail"] [err=RemoteStopped]
[2019/11/11 17:17:12.676 +08:00] [INFO] [status_server.rs:32] ["stopping status server"]
[2019/11/11 17:17:12.676 +08:00] [INFO] [node.rs:357] ["stop raft store thread"] [store_id=4]
[2019/11/11 17:17:12.676 +08:00] [INFO] [mod.rs:374] ["stoping worker"] [worker=split-check]
[2019/11/11 17:17:12.676 +08:00] [INFO] [mod.rs:374] ["stoping worker"] [worker=snapshot-worker]
[2019/11/11 17:17:12.676 +08:00] [INFO] [mod.rs:374] ["stoping worker"] [worker=raft-gc-worker]
[2019/11/11 17:17:12.676 +08:00] [INFO] [mod.rs:374] ["stoping worker"] [worker=compact-worker]
[2019/11/11 17:17:12.676 +08:00] [INFO] [future.rs:170] ["stoping worker"] [worker=pd-worker]
[2019/11/11 17:17:12.676 +08:00] [INFO] [mod.rs:374] ["stoping worker"] [worker=consistency-check]
[2019/11/11 17:17:12.676 +08:00] [INFO] [mod.rs:374] ["stoping worker"] [worker=cleanup-sst]
[2019/11/11 17:17:12.676 +08:00] [INFO] [batch.rs:422] ["shutdown batch system apply"]
[2019/11/11 17:17:12.676 +08:00] [INFO] [router.rs:432] ["broadcasting shutdown"]
[2019/11/11 17:17:12.678 +08:00] [INFO] [batch.rs:428] ["batch system apply is stopped."]
[2019/11/11 17:17:12.678 +08:00] [INFO] [batch.rs:422] ["shutdown batch system raftstore-4"]
[2019/11/11 17:17:12.678 +08:00] [INFO] [router.rs:432] ["broadcasting shutdown"]
[2019/11/11 17:17:12.684 +08:00] [INFO] [batch.rs:428] ["batch system raftstore-4 is stopped."]
[2019/11/11 17:17:12.686 +08:00] [INFO] [mod.rs:374] ["stoping worker"] [worker=region-collector-worker]
[2019/11/11 17:17:12.687 +08:00] [INFO] [mod.rs:374] ["stoping worker"] [worker=addr-resolver]
[2019/11/11 17:17:12.687 +08:00] [INFO] [future.rs:170] ["stoping worker"] [worker=waiter-manager]
[2019/11/11 17:17:12.687 +08:00] [INFO] [future.rs:170] ["stoping worker"] [worker=deadlock-detector]
[2019/11/11 17:17:12.689 +08:00] [WARN] [raft_client.rs:118] ["batch_raft RPC finished fail"] [err="RpcFinished(Some(RpcStatus { status: Internal, details: Some("Received RST_STREAM with error code 2") }))"]
[2019/11/11 17:17:12.689 +08:00] [WARN] [raft_client.rs:118] ["batch_raft RPC finished fail"] [err="RpcFinished(Some(RpcStatus { status: Internal, details: Some("Received RST_STREAM with error code 2") }))"]
[2019/11/11 17:17:12.689 +08:00] [INFO] [gc_worker.rs:730] ["gc-manager is stopped"]
[2019/11/11 17:17:12.689 +08:00] [WARN] [raft_client.rs:132] ["batch_raft/raft RPC finally fail"] [err="RpcFinished(Some(RpcStatus { status: Internal, details: Some("Received RST_STREAM with error code 2") }))"] [to_addr=192.168.181.57:20171]
[2019/11/11 17:17:12.689 +08:00] [WARN] [raft_client.rs:132] ["batch_raft/raft RPC finally fail"] [err="RpcFinished(Some(RpcStatus { status: Internal, details: Some("Received RST_STREAM with error code 2") }))"] [to_addr=192.168.181.57:20172]
[2019/11/11 17:17:12.690 +08:00] [INFO] [mod.rs:374] ["stoping worker"] [worker=gc-worker]
[2019/11/11 17:17:12.690 +08:00] [INFO] [mod.rs:673] ["Storage stopped."]
[2019/11/11 17:17:12.870 +08:00] [ERROR] [readpool_impl.rs:97] ["Failed to send read pool read flow statistics"] [err="channel has been closed"]
[2019/11/11 17:17:12.870 +08:00] [ERROR] [readpool_impl.rs:97] ["Failed to send read pool read flow statistics"] [err="channel has been closed"]
[2019/11/11 17:17:12.870 +08:00] [ERROR] [readpool_impl.rs:97] ["Failed to send read pool read flow statistics"] [err="channel has been closed"]
[2019/11/11 17:17:12.872 +08:00] [ERROR] [readpool_impl.rs:97] ["Failed to send read pool read flow statistics"] [err="channel has been closed"]
[2019/11/11 17:17:12.872 +08:00] [ERROR] [readpool_impl.rs:97] ["Failed to send read pool read flow statistics"] [err="channel has been closed"]
[2019/11/11 17:17:12.872 +08:00] [ERROR] [readpool_impl.rs:97] ["Failed to send read pool read flow statistics"] [err="channel has been closed"]
[2019/11/11 17:17:12.872 +08:00] [ERROR] [readpool_impl.rs:97] ["Failed to send read pool read flow statistics"] [err="channel has been closed"]
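
One detail in this excerpt is worth noting: at 17:17:12 the TiKV instance logs `receive signal 15, stopping server...`. Signal 15 is SIGTERM, so this looks like an orderly stop requested from outside the process rather than an OOM kill, which uses SIGKILL and leaves no shutdown sequence in the log. The sketch below pulls such lines out of a log and names the signal; the function name is hypothetical.

```shell
#!/bin/sh
# Extract the signal number from TiKV "receive signal N" log lines read on
# stdin and print it with its name as reported by the shell's `kill -l`.
signals_in_log() {
    sed -n 's/.*receive signal \([0-9][0-9]*\).*/\1/p' | while read -r num; do
        echo "$num $(kill -l "$num")"
    done
}
```

Typical use would be `signals_in_log < tikv.log`, where the log path is whatever the deployment uses.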

PD Log

[2019/11/11 17:17:18.370 +08:00] [WARN] [grpclog.go:60] ["grpc: addrConn.createTransport failed to connect to {192.168.181.56:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 192.168.181.56:2379: connect: connection refused". Reconnecting..."]
[2019/11/11 17:17:18.370 +08:00] [WARN] [grpclog.go:60] ["grpc: addrConn.createTransport failed to connect to {192.168.181.56:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 192.168.181.56:2379: connect: connection refused". Reconnecting..."]
[2019/11/11 17:17:18.370 +08:00] [WARN] [grpclog.go:60] ["grpc: addrConn.createTransport failed to connect to {192.168.181.56:2379 0  <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp 192.168.181.56:2379: connect: connection refused". Reconnecting..."]
[2019/11/11 17:17:18.844 +08:00] [INFO] [server.go:1410] ["leadership transfer finished"] [local-member-id=9802567ee6ff1939] [old-leader-member-id=9802567ee6ff1939] [new-leader-member-id=89b6ed455660bf8c] [took=500.307658ms]
[2019/11/11 17:17:18.845 +08:00] [INFO] [peer.go:333] ["stopping remote peer"] [remote-peer-id=89b6ed455660bf8c]
[2019/11/11 17:17:18.845 +08:00] [WARN] [stream.go:290] ["closed TCP streaming connection with remote peer"] [stream-writer-type="stream MsgApp v2"] [remote-peer-id=89b6ed455660bf8c]
[2019/11/11 17:17:18.845 +08:00] [WARN] [stream.go:300] ["stopped TCP streaming connection with remote peer"] [stream-writer-type="stream MsgApp v2"] [remote-peer-id=89b6ed455660bf8c]
[2019/11/11 17:17:18.846 +08:00] [WARN] [stream.go:290] ["closed TCP streaming connection with remote peer"] [stream-writer-type="stream Message"] [remote-peer-id=89b6ed455660bf8c]
[2019/11/11 17:17:18.846 +08:00] [WARN] [stream.go:300] ["stopped TCP streaming connection with remote peer"] [stream-writer-type="stream Message"] [remote-peer-id=89b6ed455660bf8c]
[2019/11/11 17:17:18.846 +08:00] [INFO] [pipeline.go:86] ["stopped HTTP pipelining with remote peer"] [local-member-id=9802567ee6ff1939] [remote-peer-id=89b6ed455660bf8c]
[2019/11/11 17:17:18.846 +08:00] [WARN] [stream.go:435] ["lost TCP streaming connection with remote peer"] [stream-reader-type="stream MsgApp v2"] [local-member-id=9802567ee6ff1939] [remote-peer-id=89b6ed455660bf8c] [error="read tcp 192.168.181.54:39966->192.168.181.55:2380: use of closed network connection"]
[2019/11/11 17:17:18.846 +08:00] [INFO] [stream.go:458] ["stopped stream reader with remote peer"] [stream-reader-type="stream MsgApp v2"] [local-member-id=9802567ee6ff1939] [remote-peer-id=89b6ed455660bf8c]
[2019/11/11 17:17:18.846 +08:00] [WARN] [stream.go:435] ["lost TCP streaming connection with remote peer"] [stream-reader-type="stream Message"] [local-member-id=9802567ee6ff1939] [remote-peer-id=89b6ed455660bf8c] [error="read tcp 192.168.181.54:39964->192.168.181.55:2380: use of closed network connection"]
[2019/11/11 17:17:18.846 +08:00] [INFO] [stream.go:458] ["stopped stream reader with remote peer"] [stream-reader-type="stream Message"] [local-member-id=9802567ee6ff1939] [remote-peer-id=89b6ed455660bf8c]
[2019/11/11 17:17:18.846 +08:00] [INFO] [peer.go:340] ["stopped remote peer"] [remote-peer-id=89b6ed455660bf8c]
[2019/11/11 17:17:18.846 +08:00] [INFO] [peer.go:333] ["stopping remote peer"] [remote-peer-id=8f727d06fdb9b401]
[2019/11/11 17:17:18.847 +08:00] [WARN] [stream.go:290] ["closed TCP streaming connection with remote peer"] [stream-writer-type="stream MsgApp v2"] [remote-peer-id=8f727d06fdb9b401]
[2019/11/11 17:17:18.847 +08:00] [WARN] [stream.go:300] ["stopped TCP streaming connection with remote peer"] [stream-writer-type="stream MsgApp v2"] [remote-peer-id=8f727d06fdb9b401]
[2019/11/11 17:17:18.847 +08:00] [WARN] [stream.go:290] ["closed TCP streaming connection with remote peer"] [stream-writer-type="stream Message"] [remote-peer-id=8f727d06fdb9b401]
[2019/11/11 17:17:18.847 +08:00] [WARN] [stream.go:300] ["stopped TCP streaming connection with remote peer"] [stream-writer-type="stream Message"] [remote-peer-id=8f727d06fdb9b401]
[2019/11/11 17:17:18.847 +08:00] [INFO] [pipeline.go:86] ["stopped HTTP pipelining with remote peer"] [local-member-id=9802567ee6ff1939] [remote-peer-id=8f727d06fdb9b401]
[2019/11/11 17:17:18.847 +08:00] [INFO] [stream.go:458] ["stopped stream reader with remote peer"] [stream-reader-type="stream MsgApp v2"] [local-member-id=9802567ee6ff1939] [remote-peer-id=8f727d06fdb9b401]
[2019/11/11 17:17:18.847 +08:00] [INFO] [stream.go:458] ["stopped stream reader with remote peer"] [stream-reader-type="stream Message"] [local-member-id=9802567ee6ff1939] [remote-peer-id=8f727d06fdb9b401]
[2019/11/11 17:17:18.847 +08:00] [INFO] [peer.go:340] ["stopped remote peer"] [remote-peer-id=8f727d06fdb9b401]
[2019/11/11 17:17:18.847 +08:00] [WARN] [http.go:439] ["failed to find remote peer in cluster"] [local-member-id=9802567ee6ff1939] [remote-peer-id-stream-handler=9802567ee6ff1939] [remote-peer-id-from=89b6ed455660bf8c] [cluster-id=a9b179f280244e43]
[2019/11/11 17:17:18.847 +08:00] [WARN] [http.go:439] ["failed to find remote peer in cluster"] [local-member-id=9802567ee6ff1939] [remote-peer-id-stream-handler=9802567ee6ff1939] [remote-peer-id-from=89b6ed455660bf8c] [cluster-id=a9b179f280244e43]
[2019/11/11 17:17:18.849 +08:00] [INFO] [etcd.go:553] ["stopping serving peer traffic"] [address=192.168.181.54:2380]
[2019/11/11 17:17:18.849 +08:00] [INFO] [etcd.go:560] ["stopped serving peer traffic"] [address=192.168.181.54:2380]
[2019/11/11 17:17:18.849 +08:00] [INFO] [etcd.go:362] ["closed etcd server"] [name=pd_dev25] [data-dir=/home/tidb/deploy/data.pd] [advertise-peer-urls="[http://192.168.181.54:2380]"] [advertise-client-urls="[http://192.168.181.54:2379]"]
[2019/11/11 17:17:18.849 +08:00] [INFO] [server.go:283] ["close server"]
  • Did TiDB report any error when it exited? Could you post it?
  • Is there any output in tidb_stderr.log? If so, please post that as well.

tidb_stderr.log

......
goroutine 479541 [select]:
github.com/pingcap/tidb/store/tikv.sendBatchRequest(0x230d1e0, 0xc016b8dd00, 0xc0004ca7c0, 0x14, 0xc000dab1a0, 0xc2449225d0, 0x4a817c800, 0x0, 0x0, 0x0)
        github.com/pingcap/tidb@/store/tikv/client_batch.go:533 +0x6b0
github.com/pingcap/tidb/store/tikv.(*rpcClient).SendRequest(0xc00003ed20, 0x230d1e0, 0xc016b8dd00, 0xc0004ca7c0, 0x14, 0xc0155eb040, 0x4a817c800, 0x0, 0x0, 0x0)
        github.com/pingcap/tidb@/store/tikv/client.go:281 +0x9cb
github.com/pingcap/tidb/store/tikv.(*RegionRequestSender).sendReqToRegion(0xc01c4b3b08, 0xc01c42c460, 0xc00914c780, 0xc0155eb040, 0x4a817c800, 0xc00914c780, 0x0, 0x0, 0xc086214fdb)
        github.com/pingcap/tidb@/store/tikv/region_request.go:145 +0xb4
github.com/pingcap/tidb/store/tikv.(*RegionRequestSender).SendReqCtx(0xc01c4b3b08, 0xc01c42c460, 0xc0155eb040, 0x514, 0x5, 0x13a, 0x4a817c800, 0xc06b235688, 0x13, 0x9fc12, ...)
        github.com/pingcap/tidb@/store/tikv/region_request.go:116 +0xdd
github.com/pingcap/tidb/store/tikv.(*RegionRequestSender).SendReq(...)
        github.com/pingcap/tidb@/store/tikv/region_request.go:72
github.com/pingcap/tidb/store/tikv.(*tikvStore).SendReq(0xc00033e0f0, 0xc01c42c460, 0xc0155eb040, 0x514, 0x5, 0x13a, 0x4a817c800, 0x98b0, 0xc0155eb040, 0xc0908e)
        github.com/pingcap/tidb@/store/tikv/kv.go:367 +0xdf
github.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter).prewriteSingleBatch(0xc004720960, 0xc01c42c460, 0x514, 0x5, 0x13a, 0xc04a29db90, 0xdb, 0x6c2f, 0xc016b8dd00, 0xc0055f6a70)
        github.com/pingcap/tidb@/store/tikv/2pc.go:575 +0xb0f
github.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter).doActionOnBatches.func1(0x1, 0xc2c7142708, 0xc000f5a8a0, 0xc2c71426f8, 0x514, 0x5, 0x13a, 0xc04a29db90, 0xdb, 0x6c2f, ...)
        github.com/pingcap/tidb@/store/tikv/2pc.go:486 +0x122
created by github.com/pingcap/tidb/store/tikv.(*twoPhaseCommitter).doActionOnBatches
        github.com/pingcap/tidb@/store/tikv/2pc.go:469 +0x21f
......
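
The excerpt above is only the middle of a goroutine dump; the part that says why the Go runtime aborted is in the section elided by `......`. When a Go program dies this way, the dump is normally preceded by a line starting with `fatal error:` or `panic:`. A hedged sketch for locating that line follows. The function name is hypothetical, and `grep -m1` assumes GNU or BSD grep.

```shell
#!/bin/sh
# Print the first Go runtime crash header found in a stderr log, with its
# line number, so the reason for the abort can be read without scrolling
# through thousands of goroutine stacks.
go_crash_reason() {
    grep -n -m1 -E '^(fatal error|panic): ' "$1"
}
```

Calling `go_crash_reason tidb_stderr.log` prints the header line if one exists and nothing otherwise; posting that line together with the first goroutine after it is usually enough for others to see what failed.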

tidb.log does not show anything abnormal:

[2019/11/11 16:36:46.023 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473465224036360]
[2019/11/11 16:36:46.023 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473465145393153] ["original txn"=412473439835389962] [error="[kv:9007]Write conflict, txnStartTS=412473465145393153, conflictStartTS=412473465224036360, conflictCommitTS=412473465224036360, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:46.053 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473466757578753] ["original txn"=412473466757578753] [error="[kv:9007]Write conflict, txnStartTS=412473466757578753, conflictStartTS=412473466744471556, conflictCommitTS=412473466757578757, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x32, 0x38, 0x33, 0x0, 0x0, 0xfd, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x32, 0x38, 0x33, 0x0, 0x0, 0xfd, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:46.092 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473466757578753]
[2019/11/11 16:36:46.092 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473465276465154] ["original txn"=412473463022026756] [error="[kv:9007]Write conflict, txnStartTS=412473465276465154, conflictStartTS=412473466665828354, conflictCommitTS=412473466783793153, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:46.092 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473466665828353] ["original txn"=412473439704317967] [error="[kv:9007]Write conflict, txnStartTS=412473466665828353, conflictStartTS=412473466665828354, conflictCommitTS=412473466783793153, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:46.159 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473465145393153]
[2019/11/11 16:36:46.160 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473466757578755] ["original txn"=412473466757578755] [error="[kv:9007]Write conflict, txnStartTS=412473466757578755, conflictStartTS=412473466744471556, conflictCommitTS=412473466757578757, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x32, 0x38, 0x33, 0x0, 0x0, 0xfd, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x32, 0x38, 0x33, 0x0, 0x0, 0xfd, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:46.206 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473465276465154]
[2019/11/11 16:36:46.287 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473466757578755]
[2019/11/11 16:36:46.287 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473466665828353]
[2019/11/11 16:36:46.287 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473466757578760] ["original txn"=412473464712855554] [error="[kv:9007]Write conflict, txnStartTS=412473466757578760, conflictStartTS=412473466665828354, conflictCommitTS=412473466783793153, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:46.287 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473466823114757] ["original txn"=412473466757578755] [error="[kv:9007]Write conflict, txnStartTS=412473466823114757, conflictStartTS=412473466796900355, conflictCommitTS=412473466823114758, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x32, 0x38, 0x33, 0x0, 0x0, 0xfd, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x32, 0x38, 0x33, 0x0, 0x0, 0xfd, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:46.361 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473466823114757]
[2019/11/11 16:36:46.362 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473466705149953] ["original txn"=412473456979607558] [error="[kv:9007]Write conflict, txnStartTS=412473466705149953, conflictStartTS=412473466665828354, conflictCommitTS=412473466783793153, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:46.396 +08:00] [INFO] [2pc.go:293] ["[BIG_TXN]"] [con=163] ["table ID"=49] [size=1670115] [keys=16351] [puts=16351] [dels=0] [locks=0] [txnStartTS=412473466823114755]
[2019/11/11 16:36:46.549 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473466757578760]
[2019/11/11 16:36:46.570 +08:00] [INFO] [2pc.go:293] ["[BIG_TXN]"] [con=184] ["table ID"=49] [size=1363314] [keys=11353] [puts=11353] [dels=0] [locks=0] [txnStartTS=412473466862436355]
[2019/11/11 16:36:46.851 +08:00] [INFO] [region_cache.go:287] ["invalidate current region, because others failed on same store"] [region=472] [store=192.168.181.57:20172]
[2019/11/11 16:36:46.953 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473466705149953]
[2019/11/11 16:36:47.185 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473466810007558] ["original txn"=412473439835389962] [error="[kv:9007]Write conflict, txnStartTS=412473466810007558, conflictStartTS=412473466810007561, conflictCommitTS=412473466875543555, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:47.185 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473466810007559] ["original txn"=412473463022026756] [error="[kv:9007]Write conflict, txnStartTS=412473466810007559, conflictStartTS=412473466810007561, conflictCommitTS=412473466875543555, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:47.185 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473466862436353] ["original txn"=412473464712855554] [error="[kv:9007]Write conflict, txnStartTS=412473466862436353, conflictStartTS=412473466810007561, conflictCommitTS=412473466875543555, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:47.517 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473466810007558]
[2019/11/11 16:36:47.533 +08:00] [INFO] [region_cache.go:287] ["invalidate current region, because others failed on same store"] [region=836] [store=192.168.181.57:20172]
[2019/11/11 16:36:47.612 +08:00] [INFO] [region_cache.go:287] ["invalidate current region, because others failed on same store"] [region=1328] [store=192.168.181.57:20172]
[2019/11/11 16:36:47.631 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473466810007559]
[2019/11/11 16:36:47.725 +08:00] [INFO] [2pc.go:293] ["[BIG_TXN]"] [con=156] ["table ID"=49] [size=1555812] [keys=11917] [puts=11917] [dels=0] [locks=0] [txnStartTS=412473467006615555]
[2019/11/11 16:36:47.758 +08:00] [INFO] [region_cache.go:287] ["invalidate current region, because others failed on same store"] [region=864] [store=192.168.181.57:20172]
[2019/11/11 16:36:47.763 +08:00] [INFO] [server.go:413] ["new connection"] [conn=212] [remoteAddr=192.168.180.32:52696]
[2019/11/11 16:36:47.869 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473466862436353]
[2019/11/11 16:36:47.912 +08:00] [INFO] [region_cache.go:287] ["invalidate current region, because others failed on same store"] [region=820] [store=192.168.181.57:20172]
[2019/11/11 16:36:47.915 +08:00] [INFO] [server.go:413] ["new connection"] [conn=213] [remoteAddr=192.168.180.32:52698]
[2019/11/11 16:36:48.058 +08:00] [INFO] [2pc.go:293] ["[BIG_TXN]"] [con=183] ["table ID"=49] [size=1938292] [keys=17949] [puts=17949] [dels=0] [locks=0] [txnStartTS=412473467242545155]
[2019/11/11 16:36:48.147 +08:00] [INFO] [server.go:416] ["connection closed"] [conn=212]
[2019/11/11 16:36:48.165 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467124842498] ["original txn"=412473439835389962] [error="[kv:9007]Write conflict, txnStartTS=412473467124842498, conflictStartTS=412473467098365955, conflictCommitTS=412473467294973958, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:48.165 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467098365956] ["original txn"=412473463022026756] [error="[kv:9007]Write conflict, txnStartTS=412473467098365956, conflictStartTS=412473467098365955, conflictCommitTS=412473467294973958, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:48.167 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473466901757953] ["original txn"=412473456979607558] [error="[kv:9007]Write conflict, txnStartTS=412473466901757953, conflictStartTS=412473467098365955, conflictCommitTS=412473467294973958, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:48.167 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467124842499] ["original txn"=412473467124842499] [error="[kv:9007]Write conflict, txnStartTS=412473467124842499, conflictStartTS=412473467098365955, conflictCommitTS=412473467294973958, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:48.167 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467294973957] ["original txn"=412473467294973957] [error="[kv:9007]Write conflict, txnStartTS=412473467294973957, conflictStartTS=412473467098365955, conflictCommitTS=412473467294973958, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:48.197 +08:00] [INFO] [2pc.go:293] ["[BIG_TXN]"] [con=171] ["table ID"=49] [size=3064994] [keys=39089] [puts=39089] [dels=0] [locks=0] [txnStartTS=412473467281866755]
[2019/11/11 16:36:48.263 +08:00] [INFO] [server.go:416] ["connection closed"] [conn=213]
[2019/11/11 16:36:48.293 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473467098365956]
[2019/11/11 16:36:48.379 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473467124842498]
[2019/11/11 16:36:48.402 +08:00] [INFO] [region_cache.go:287] ["invalidate current region, because others failed on same store"] [region=72] [store=192.168.181.57:20172]
[2019/11/11 16:36:48.515 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473466901757953]
[2019/11/11 16:36:48.638 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473467294973957]
[2019/11/11 16:36:52.571 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473467124842499]
[2019/11/11 16:36:54.314 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467347402757] ["original txn"=412473467294973957] [error="[kv:6]Error: KV error safe to retry tikv restarts txn: Txn(Mvcc(TxnLockNotFound { start_ts: 412473467347402757, commit_ts: 412473468527050753, key: [109, 68, 66, 58, 57, 53, 0, 0, 0, 252, 0, 0, 0, 0, 0, 0, 0, 72] })) [try again later]"]
[2019/11/11 16:36:54.612 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473467347402757]
[2019/11/11 16:36:54.616 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467347402756] ["original txn"=412473467124842499] [error="[kv:9007]Write conflict, txnStartTS=412473467347402756, conflictStartTS=412473467347402757, conflictCommitTS=412473467347402757, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:54.715 +08:00] [INFO] [coprocessor.go:743] ["[TIME_COP_WAIT] resp_time:304.535824ms txnStartTS:412473468985802753 region_id:72 store_addr:192.168.181.57:20172 kv_wait_ms:299"]
[2019/11/11 16:36:55.180 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473467347402756]
[2019/11/11 16:36:55.259 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473468959588353] ["original txn"=412473467294973957] [error="[kv:9007]Write conflict, txnStartTS=412473468959588353, conflictStartTS=412473467360509954, conflictCommitTS=412473469129981953, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:55.260 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467360509953] ["original txn"=412473456979607558] [error="[kv:9007]Write conflict, txnStartTS=412473467360509953, conflictStartTS=412473467360509954, conflictCommitTS=412473469129981953, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:55.260 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473469038231556] ["original txn"=412473467124842499] [error="[kv:9007]Write conflict, txnStartTS=412473469038231556, conflictStartTS=412473467360509954, conflictCommitTS=412473469129981953, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:55.497 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473468959588353]
[2019/11/11 16:36:55.497 +08:00] [WARN] [txn.go:69] [RunInNewTxn] ["retry txn"=412473467373617153] ["original txn"=412473439835389962] [error="[kv:9007]Write conflict, txnStartTS=412473467373617153, conflictStartTS=412473467360509954, conflictCommitTS=412473469129981953, key=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} primary=[]byte{0x6d, 0x44, 0x42, 0x3a, 0x39, 0x35, 0x0, 0x0, 0x0, 0xfc, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x48} [try again later]"]
[2019/11/11 16:36:55.636 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473467360509953]
[2019/11/11 16:36:55.673 +08:00] [INFO] [2pc.go:1039] ["2PC clean up done"] [txnStartTS=412473469038231556]
  • Please send the tidb_stderr.log file; the real error should be in there.

tidb_stderr.log (3.6 MB)

After this problem occurred,

I ran ansible-playbook bootstrap.yml --extra-vars "dev_mode=True"

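  • According to the error log, TiDB exited because it ran out of memory (OOM).
  • As for the startup failure: the startup process checks the system configuration, and because you changed the vm.overcommit_memory setting, that check failed and startup was aborted.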

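The cause was that I had run echo 2 > /proc/sys/vm/overcommit_memory. TiDB does not allow this value to be set to 2, so I changed the operating system's memory allocation policy back to 0: echo 0 > /proc/sys/vm/overcommit_memory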

After changing it back to 0, did the startup succeed? Are there any remaining problems?

After changing it back to 0 and restarting the TiDB cluster, everything worked. The problem is solved.

:+1::+1::+1:
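
For anyone hitting the same startup failure, here is a minimal sketch of how the setting discussed above could be checked before restarting the cluster. The function name `check_overcommit` is hypothetical and is not part of TiDB or tidb-ansible; it only reports on the value and prints the reset command instead of changing anything.

```shell
#!/bin/sh
# Hypothetical helper (not part of TiDB or tidb-ansible): report whether a
# vm.overcommit_memory value matches the setting that resolved this thread (0).
check_overcommit() {
    # Use the argument if one is given; otherwise read the live kernel setting.
    mode="${1:-$(cat /proc/sys/vm/overcommit_memory)}"
    case "$mode" in
        0)
            echo "ok: heuristic overcommit (0)"
            ;;
        2)
            # Strict accounting, the value that broke startup in this thread.
            echo "problem: strict overcommit (2); reset with: sysctl -w vm.overcommit_memory=0"
            return 1
            ;;
        *)
            echo "note: overcommit_memory=$mode"
            ;;
    esac
}
```

Calling `check_overcommit` with no argument inspects the running system. If the value 2 was also written to /etc/sysctl.conf, it would need to be changed there as well, since `sysctl -p` reloads that file.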