Cluster fails to start

To help us work efficiently, please provide the following information. A clearly described problem gets resolved faster:

[TiDB version]
tiup v1.3.1
Go Version: go1.13
Git Branch: release-1.3
GitHash: d51bd0c
Cluster version: v4.0.0

[Problem description]
Starting the cluster with tiup cluster start ti-cluster fails with the following error:

Error: failed to start tidb: failed to start: tidb 47.114.34.237:4000, please check the instance's log(/tidb-deploy/tidb-4000/log) for more detail.: timed out waiting for port 4000 to be started after 2m0s

The cluster cannot be started.

The log under /tidb-deploy/tidb-4000/log contains the following:

[2021/01/24 23:34:13.124 +08:00] [INFO] [store.go:74] ["new store with retry success"]
[2021/01/24 23:34:18.125 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:34:23.190 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:34:28.295 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:34:33.503 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:34:39.097 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:34:45.286 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:34:51.864 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:34:58.064 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:35:04.595 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:35:10.725 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:35:17.630 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:35:23.917 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:35:30.320 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:35:36.464 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:35:42.651 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:35:49.470 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:35:56.381 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:36:03.199 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:36:03.199 +08:00] [WARN] [backoff.go:319] ["tikvRPC backoffer.maxSleep 20000ms is exceeded, errors:\nsend tikv request error: context deadline exceeded, ctx: region ID: 5, meta: id:5 region_epoch:<conf_ver:2 version:1 > peers:<id:9 store_id:2 > , peer: id:9 store_id:2 , addr: 47.114.34.237:20162, idx: 0, try next peer later at 2021-01-24T23:35:49.471389903+08:00\nsend tikv request error: context deadline exceeded, ctx: region ID: 5, meta: id:5 region_epoch:<conf_ver:2 version:1 > peers:<id:9 store_id:2 > , peer: id:9 store_id:2 , addr: 47.114.34.237:20162, idx: 0, try next peer later at 2021-01-24T23:35:56.382126184+08:00\nsend tikv request error: context deadline exceeded, ctx: region ID: 5, meta: id:5 region_epoch:<conf_ver:2 version:1 > peers:<id:9 store_id:2 > , peer: id:9 store_id:2 , addr: 47.114.34.237:20162, idx: 0, try next peer later at 2021-01-24T23:36:03.199772158+08:00"]
[2021/01/24 23:36:03.199 +08:00] [FATAL] [session.go:1849] ["check bootstrapped failed"] [error="[tikv:9002]TiKV server timeout"] [stack="github.com/pingcap/tidb/session.getStoreBootstrapVersion\n\t/home/jenkins/agent/workspace/tidb_v4.0.0/go/src/github.com/pingcap/tidb/session/session.go:1849\ngithub.com/pingcap/tidb/session.BootstrapSession\n\t/home/jenkins/agent/workspace/tidb_v4.0.0/go/src/github.com/pingcap/tidb/session/session.go:1649\nmain.createStoreAndDomain\n\t/home/jenkins/agent/workspace/tidb_v4.0.0/go/src/github.com/pingcap/tidb/tidb-server/main.go:295\nmain.main\n\t/home/jenkins/agent/workspace/tidb_v4.0.0/go/src/github.com/pingcap/tidb/tidb-server/main.go:181\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:203"]
[2021/01/24 23:36:18.390 +08:00] [INFO] [printer.go:42] ["Welcome to TiDB."] ["Release Version"=v4.0.0] [Edition=Community] ["Git Commit Hash"=689a6b6439ae7835947fcaccf329a3fc303986cb] ["Git Branch"=heads/refs/tags/v4.0.0] ["UTC Build Time"="2020-05-28 01:37:40"] [GoVersion=go1.13] ["Race Enabled"=false] ["Check Table Before Drop"=false] ["TiKV Min Version"=v3.0.0-60965b006877ca7234adaced7890d7b029ed1306]
[2021/01/24 23:36:18.391 +08:00] [INFO] [printer.go:56] ["loaded config"] [config="{"host":"0.0.0.0","advertise-address":"47.114.34.237","port":4000,"cors":"","store":"tikv","path":"47.114.34.237:2379","socket":"","lease":"45s","run-ddl":true,"split-table":true,"token-limit":1000,"oom-use-tmp-storage":true,"tmp-storage-path":"/tmp/1000_tidb/MC4wLjAuMDo0MDAwLzAuMC4wLjA6MTAwODA=/tmp-storage","oom-action":"log","mem-quota-query":1073741824,"tmp-storage-quota":-1,"enable-streaming":false,"enable-batch-dml":false,"lower-case-table-names":2,"server-version":"","log":{"level":"info","format":"text","disable-timestamp":null,"enable-timestamp":null,"disable-error-stack":null,"enable-error-stack":null,"file":{"filename":"/tidb-deploy/tidb-4000/log/tidb.log","max-size":300,"max-days":0,"max-backups":0},"enable-slow-log":true,"slow-query-file":"log/tidb_slow_query.log","slow-threshold":300,"expensive-threshold":10000,"query-log-max-len":4096,"record-plan-in-slow-log":1},"security":{"skip-grant-table":false,"ssl-ca":"","ssl-cert":"","ssl-key":"","require-secure-transport":false,"cluster-ssl-ca":"","cluster-ssl-cert":"","cluster-ssl-key":"","cluster-verify-cn":null},"status":{"status-host":"0.0.0.0","metrics-addr":"","status-port":10080,"metrics-interval":15,"report-status":true,"record-db-qps":false},"performance":{"max-procs":0,"max-memory":0,"stats-lease":"3s","stmt-count-limit":5000,"feedback-probability":0.05,"query-feedback-limit":1024,"pseudo-estimate-ratio":0.8,"force-priority":"NO_PRIORITY","bind-info-lease":"3s","txn-total-size-limit":104857600,"tcp-keep-alive":true,"cross-join":true,"run-auto-analyze":true,"agg-push-down-join":false,"committer-concurrency":16,"max-txn-ttl":600000},"prepared-plan-cache":{"enabled":false,"capacity":100,"memory-guard-ratio":0.1},"opentracing":{"enable":false,"rpc-metrics":false,"sampler":{"type":"const","param":1,"sampling-server-url":"","max-operations":0,"sampling-refresh-interval":0},"reporter":{"queue-size":0,"buffer-flush-interval":0,"log-spans":false,"local-agent-host-port":""}},"proxy-protocol":{"networks":"","header-timeout":5},"tikv-client":{"grpc-connection-count":4,"grpc-keepalive-time":10,"grpc-keepalive-timeout":3,"commit-timeout":"41s","max-batch-size":128,"overload-threshold":200,"max-batch-wait-time":0,"batch-wait-size":8,"enable-chunk-rpc":true,"region-cache-ttl":600,"store-limit":0,"store-liveness-timeout":"120s","copr-cache":{"enable":false,"capacity-mb":1000,"admission-max-result-mb":10,"admission-min-process-ms":5}},"binlog":{"enable":false,"ignore-error":false,"write-timeout":"15s","binlog-socket":"","strategy":"range"},"compatible-kill-query":false,"plugin":{"dir":"","load":""},"pessimistic-txn":{"enable":true,"max-retry-count":256},"check-mb4-value-in-utf8":true,"max-index-length":3072,"alter-primary-key":false,"treat-old-version-utf8-as-utf8mb4":true,"enable-table-lock":false,"delay-clean-table-lock":0,"split-region-max-num":1000,"stmt-summary":{"enable":true,"enable-internal-query":false,"max-stmt-count":200,"max-sql-length":4096,"refresh-interval":1800,"history-size":24},"repair-mode":false,"repair-table-list":[],"isolation-read":{"engines":["tikv","tiflash","tidb"]},"max-server-connections":0,"new_collations_enabled_on_first_bootstrap":false,"experimental":{"allow-auto-random":false,"allow-expression-index":false}}"]
[2021/01/24 23:36:18.391 +08:00] [INFO] [main.go:341] ["disable Prometheus push client"]
[2021/01/24 23:36:18.391 +08:00] [INFO] [store.go:68] ["new store"] [path=tikv://47.114.34.237:2379]
[2021/01/24 23:36:18.391 +08:00] [INFO] [client.go:149] ["[pd] create pd client with endpoints"] [pd-address="[47.114.34.237:2379]"]
[2021/01/24 23:36:18.391 +08:00] [INFO] [systime_mon.go:25] ["start system time monitor"]
[2021/01/24 23:36:18.394 +08:00] [INFO] [base_client.go:242] ["[pd] switch leader"] [new-leader=http://47.114.34.237:2379] [old-leader=]
[2021/01/24 23:36:18.394 +08:00] [INFO] [base_client.go:92] ["[pd] init cluster id"] [cluster-id=6921348209864108048]
[2021/01/24 23:36:18.394 +08:00] [INFO] [store.go:74] ["new store with retry success"]
[2021/01/24 23:36:23.396 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:36:28.456 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:36:33.594 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:36:38.843 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:36:44.439 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:36:50.692 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:36:56.944 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:37:03.298 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:37:09.699 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:37:16.106 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:37:22.410 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:37:28.716 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
[2021/01/24 23:37:34.870 +08:00] [WARN] [client_batch.go:223] ["init create streaming fail"] [target=47.114.34.237:20162] [error="context deadline exceeded"]
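
Because one warning dominates the log above, it can help to tally entries before reading line by line. The following is a minimal Python sketch and not part of TiDB or tiup. The function name summarize_tidb_log and both regular expressions are assumptions made for illustration, derived only from the line format quoted in this post.

```python
import re
from collections import Counter

# Leading "[timestamp] [LEVEL]" fields of a log line.
# Assumption: the layout matches the lines quoted in this post.
HEADER_RE = re.compile(r'^\[(?P<ts>[^\]]+)\] \[(?P<level>[A-Z]+)\]')
# Optional "[target=host:port]" field carried by some WARN lines.
TARGET_RE = re.compile(r'\[target=(?P<target>[^\]]+)\]')


def summarize_tidb_log(lines):
    """Return (entries per level, WARN entries per target, FATAL timestamps)."""
    levels = Counter()
    warn_targets = Counter()
    fatal_times = []
    for line in lines:
        header = HEADER_RE.match(line)
        if header is None:
            continue  # wrapped continuation of an entry, or not a log line
        level = header.group("level")
        levels[level] += 1
        if level == "WARN":
            target = TARGET_RE.search(line)
            if target is not None:
                warn_targets[target.group("target")] += 1
        elif level == "FATAL":
            fatal_times.append(header.group("ts"))
    return levels, warn_targets, fatal_times


if __name__ == "__main__":
    # Two lines copied from the log in this post, used as a tiny sample.
    sample = [
        '[2021/01/24 23:34:13.124 +08:00] [INFO] [store.go:74] '
        '["new store with retry success"]',
        '[2021/01/24 23:34:18.125 +08:00] [WARN] [client_batch.go:223] '
        '["init create streaming fail"] [target=47.114.34.237:20162] '
        '[error="context deadline exceeded"]',
    ]
    levels, targets, fatals = summarize_tidb_log(sample)
    print(dict(levels), dict(targets), fatals)
```

To run it against the real file, pass an open file object, for example summarize_tidb_log(open("/tidb-deploy/tidb-4000/log/tidb.log", encoding="utf-8")). In the excerpt above, every warning that carries a target field points at 47.114.34.237:20162, the same address that appears in the TiKV timeout preceding the FATAL entry.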


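If your question concerns performance tuning or troubleshooting, please download the script and run it. Be sure to select all of the output printed in the terminal, then copy, paste, and upload it.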

  1. Have you raised the system session limits?
  2. Run tiup cluster display ti-cluster and check whether TiKV and PD are both in a normal state.
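
As a complement to the second suggestion, it may be worth confirming from the TiDB host that the PD and TiKV ports accept TCP connections at all, since the TiDB log reports timeouts against 47.114.34.237:20162. The following is a minimal Python sketch using only the standard library. The helper name port_reachable is an assumption made for illustration, and a successful connection only shows that something is listening, not that the service is healthy.

```python
import socket
import sys


def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__":
    # Each argument is expected in host:port form, for example
    # 47.114.34.237:2379 (PD) or 47.114.34.237:20162 (TiKV), as seen in the log.
    for endpoint in sys.argv[1:]:
        host, _, port = endpoint.rpartition(":")
        state = "reachable" if port_reachable(host, int(port)) else "NOT reachable"
        print(f"{endpoint}: {state}")
```

Saved under a hypothetical name such as check_ports.py, it could be run on the TiDB host as python3 check_ports.py 47.114.34.237:2379 47.114.34.237:20162. If the TiKV port is not reachable while tiup cluster display shows the instance as up, a firewall or cloud security rule between the components is one possibility to rule out.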