为提高效率,提问时请提供以下信息,问题描述清晰可优先响应。
- 【TiDB 版本】:
3.0.12 - 【问题描述】:
[2020/09/29 08:00:35.196 +08:00] [INFO] [region_cache.go:548] [“switch region leader to specific leader due to kv return NotLeader”] [regionID=421103] [currIdx=1] [leaderStoreID=20530]
为提高效率,提问时请提供以下信息,问题描述清晰可优先响应。
2020/09/29 08:00:56.268 +08:00] [INFO] [coprocessor.go:743] ["[TIME_COP_WAIT] resp_time:1.728901582s txnStartTS:419781058080538629 region_id:539987 store_addr: kv_wait_ms:1728"]
[2020/09/29 08:00:56.268 +08:00] [INFO] [coprocessor.go:743] ["[TIME_COP_WAIT] resp_time:1.728806652s txnStartTS:419781058080538629 region_id:539987 store_addr: kv_wait_ms:1728"]
[2020/09/29 08:00:56.269 +08:00] [INFO] [coprocessor.go:743] ["[TIME_COP_WAIT] resp_time:1.633645594s txnStartTS:419781058067431430 region_id:419943 store_addr: kv_wait_ms:1633"]
[2020/09/29 08:00:56.269 +08:00] [INFO] [coprocessor.go:743] ["[TIME_COP_WAIT] resp_time:1.634110326s txnStartTS:419781058067431430 region_id:419923 store_addr: kv_wait_ms:1633"]
[2020/09/29 08:00:56.269 +08:00] [INFO] [coprocessor.go:743] ["[TIME_COP_PROCESS] resp_time:1.743473423s txnStartTS:419781058080538629 region_id:355992 store_addr: kv_process_ms:269 scan_total_write:70069 scan_processed_write:70068 scan_total_data:70068 scan_processed_data:70068 scan_total_lock:1 scan_processed_lock:0 kv_wait_ms:1474"]
[2020/09/29 08:01:16.044 +08:00] [WARN] [main.go:166] [“config file conf/tidb.toml contained unknown configuration options: pessimistic-txn.ttl”]
[2020/09/29 08:01:16.044 +08:00] [INFO] [printer.go:41] [“Welcome to TiDB.”] [“Release Version”=v3.0.12] [“Git Commit Hash”=8c4696b3f3408c61dd7454204ddd67c93501609a] [“Git Branch”=heads/refs/tags/v3.0.12] [“UTC Build Time”=“2020-03-16 09:56:22”] [GoVersion=“go version go1.13 linux/amd64”] [“Race Enabled”=false] [“Check Table Before Drop”=false] [“TiKV Min Version”=v3.0.0-60965b006877ca7234adaced7890d7b029ed1306]
[2020/09/29 08:01:16.045 +08:00] [INFO] [printer.go:54] [“loaded config”] [config="{“host”:“0.0.0.0”,"advertise-a ]
[2020/09/29 08:01:16.045 +08:00] [INFO] [main.go:255] [“disable Prometheus push client”]
[2020/09/29 08:01:16.045 +08:00] [INFO] [store.go:69] [“new store”] [path=tikv:/]
[2020/09/29 08:01:16.045 +08:00] [INFO] [client.go:162] ["[pd] create pd client with endpoints"] [pd-address="[]"]
[2020/09/29 08:01:16.045 +08:00] [INFO] [systime_mon.go:26] [“start system time monitor”]
[2020/09/29 08:01:16.047 +08:00] [INFO] [client.go:291] ["[pd] switch leader"] [new-leader=http://:2379] [old-leader=]
[2020/09/29 08:01:16.047 +08:00] [INFO] [client.go:186] ["[pd] init cluster id"] [cluster-id=6819249905545873576]
[2020/09/29 08:01:16.049 +08:00] [INFO] [store.go:75] [“new store with retry success”]
[2020/09/29 08:01:16.057 +08:00] [INFO] [tidb.go:72] [“new domain”] [store=tikv-6819249905545873576] [“ddl lease”=45s] [“stats lease”=3s]
[2020/09/29 08:01:16.061 +08:00] [INFO] [ddl.go:434] ["[ddl] start DDL"] [ID=32e1cb7c-c5ee-4320-bb37-d0758c845b2b] [runWorker=true]
[2020/09/29 08:01:16.100 +08:00] [INFO] [ddl.go:422] ["[ddl] start delRangeManager OK"] [“is a emulator”=false]
[2020/09/29 08:01:16.100 +08:00] [INFO] [ddl_worker.go:120] ["[ddl] start DDL worker"] [worker=“worker 1, tp general”]
[2020/09/29 08:01:16.100 +08:00] [INFO] [ddl_worker.go:120] ["[ddl] start DDL worker"] [worker=“worker 2, tp add index”]
[2020/09/29 08:01:17.502 +08:00] [INFO] [domain.go:141] [“full load InfoSchema success”] [usedSchemaVersion=0] [neededSchemaVersion=8209] [“start time”=1.26001042s]
[2020/09/29 08:01:17.516 +08:00] [INFO] [domain.go:350] [“full load and reset schema validator”]
[2020/09/29 08:01:17.555 +08:00] [INFO] [gc_worker.go:155] ["[gc worker] start"] [uuid=5d35ca75b44000c]
[2020/09/29 08:01:17.559 +08:00] [INFO] [server.go:220] [“server is running MySQL protocol”] [addr=0.0.0.0:4000]
[2020/09/29 08:01:17.561 +08:00] [INFO] [http_status.go:255] [“for status and metrics report”] [“listening on addr”=0.0.0.0:10080]
[2020/09/29 08:01:17.578 +08:00] [INFO] [server.go:367] [“new connection”] [conn=1] [remoteAddr=****:46512]
但是后面又发生了 pd switch leader
查了下,没有看到OOM的情况
我们先梳理一下问题:
1、tidb log 中出现 region cache not leader 的信息向上面那个同学的回复,tidb 会 backoff 理论上不影响使用
2、重启的问题具体是指什么?是 tidb server 因某些异常自动重启了吗?
3、pd switch leader 可以 pd-ctl member 确认下当前的 leader,并且过滤下所有 pd 节点的信息看下 raft 选举的日志,确认下当前 pd leader 是否真正的发生过切换~