v4.0.10 fails to start

Starting v4.0.10 with the command tiup cluster start tidb-test fails with:
Error: failed to start tidb: failed to start: 10.255.194.13 tidb-4000.service, please check the instance’s log(/tidb-deploy/tidb-4000/log) for more detail.: timed out waiting for port 4000 to be started after 2m0s

Verbose debug logs has been written to /root/.tiup/logs/tiup-cluster-debug-2022-05-10-17-56-04.log.
Error: run /root/.tiup/components/cluster/v1.5.0/tiup-cluster (wd:/root/.tiup/data/T5QJvfI) failed: exit status 1

One post suggested running tiup cluster destroy tidb-test and then checking that the pd deploy-dir and data dir have been cleaned up. But tiup cluster destroy tidb-test also fails:
Error: failed to stop: 10.255.194.13 node_exporter-9100.service, please check the instance’s log() for more detail.: timed out waiting for port 9100 to be stopped after 1m0s

Verbose debug logs has been written to /root/.tiup/logs/tiup-cluster-debug-2022-05-10-17-44-43.log.
Error: run /root/.tiup/components/cluster/v1.5.0/tiup-cluster (wd:/root/.tiup/data/T5QHCAv) failed: exit status 1

Running tiup cluster deploy tidb-test v4.0.10 topology.yaml after that fails with:
Error: Cluster name ‘tidb-test’ is duplicated (deploy.name_dup)

Please specify another cluster name
Error: run /root/.tiup/components/cluster/v1.5.0/tiup-cluster (wd:/root/.tiup/data/T5QJZUI) failed: exit status 1

I could not find a deploy-dir with the find command. Does the other one mean /data/dir? There is no /data/dir directory on this machine.

How can I solve this?

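Is the firewall turned off? Was the software only just deployed? What architecture? Is this a single-machine cluster deployment?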

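No firewall is installed. x86, three servers: pd --> 10, tidb --> 13, tikv --> 15. I am using a freshly downloaded v4.0.10 offline package.

pd and tikv both start normally, but pd writes these errors to its log during startup: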

[WARN] [tidb_requests.go:47] [“failed to get tidb schema version”] []
[ERROR] [heartbeat_streams.go:122] [“send keepalive message fail”] [target-store-id=1] [error=EOF]
[WARN] [tidb_requests.go:47] [“failed to get tidb schema version”] []

How did you deploy, and which nodes is tidb deployed on? Check the logs under /tidb-deploy/tidb-4000/log.

The latest entries in tidb.log:

[2022/05/11 10:07:40.548 +08:00] [INFO] [region_cache.go:600] [“mark store’s regions need be refill”] [store=10.255.194.15:20160]
[2022/05/11 10:07:40.548 +08:00] [INFO] [region_cache.go:619] [“switch region peer to next due to send request fail”] [current=“region ID: 2, meta: id:2 region_epoch:<conf_ver:1 version:1 > peers:<id:3 store_id:1 > , peer: id:3 store_id:1 , addr: 10.255.194.15:20160, idx: 0, reqStoreType: TiKvOnly, runStoreType: tikv”] [needReload=true] [error=“context deadline exceeded”]
[2022/05/11 10:07:41.589 +08:00] [INFO] [region_cache.go:414] [“invalidate current region, because others failed on same store”] [region=2] [store=10.255.194.15:20160]
[2022/05/11 10:07:47.091 +08:00] [WARN] [client_batch.go:228] [“init create streaming fail”] [target=10.255.194.15:20160] [error=“context deadline exceeded”]
[2022/05/11 10:07:52.091 +08:00] [INFO] [region_cache.go:1640] ["[liveness] request kv status fail"] [store=10.255.194.15:20180] [error=“Get http://10.255.194.15:20180/status: context deadline exceeded”]
[2022/05/11 10:07:52.091 +08:00] [INFO] [region_cache.go:600] [“mark store’s regions need be refill”] [store=10.255.194.15:20160]
[2022/05/11 10:07:52.091 +08:00] [INFO] [region_cache.go:619] [“switch region peer to next due to send request fail”] [current=“region ID: 2, meta: id:2 region_epoch:<conf_ver:1 version:1 > peers:<id:3 store_id:1 > , peer: id:3 store_id:1 , addr: 10.255.194.15:20160, idx: 0, reqStoreType: TiKvOnly, runStoreType: tikv”] [needReload=true] [error=“context deadline exceeded”]
[2022/05/11 10:07:53.700 +08:00] [INFO] [region_cache.go:414] [“invalidate current region, because others failed on same store”] [region=2] [store=10.255.194.15:20160]
[2022/05/11 10:07:53.700 +08:00] [WARN] [backoff.go:329] [“regionMiss backoffer.maxSleep 40000ms is exceeded, errors:\ epoch_not_match:<> at 2022-05-11T10:07:41.589573921+08:00\ send tikv request error: context deadline exceeded, ctx: region ID: 2, meta: id:2 region_epoch:<conf_ver:1 version:1 > peers:<id:3 store_id:1 > , peer: id:3 store_id:1 , addr: 10.255.194.15:20160, idx: 0, reqStoreType: TiKvOnly, runStoreType: tikv, try next peer later at 2022-05-11T10:07:52.092029831+08:00\ epoch_not_match:<> at 2022-05-11T10:07:53.700484171+08:00”]
[2022/05/11 10:07:53.700 +08:00] [FATAL] [session.go:1980] [“check bootstrapped failed”] [error="[tikv:9002]TiKV server timeout"] [stack=“github.com/pingcap/tidb/session.getStoreBootstrapVersion\ \t/home/jenkins/agent/workspace/tidb_v4.0.10/go/src/github.com/pingcap/tidb/session/session.go:1980\ github.com/pingcap/tidb/session.BootstrapSession\ \t/home/jenkins/agent/workspace/tidb_v4.0.10/go/src/github.com/pingcap/tidb/session/session.go:1765\ main.createStoreAndDomain\ \t/home/jenkins/agent/workspace/tidb_v4.0.10/go/src/github.com/pingcap/tidb/tidb-server/main.go:258\ main.main\ \t/home/jenkins/agent/workspace/tidb_v4.0.10/go/src/github.com/pingcap/tidb/tidb-server/main.go:179\ runtime.main\ \t/usr/local/go/src/runtime/proc.go:203”]
[2022/05/11 10:08:08.840 +08:00] [INFO] [printer.go:33] [“Welcome to TiDB.”] [“Release Version”=v4.0.10] [Edition=Community] [“Git Commit Hash”=dbade8cda4c5a329037746e171449e0a1dfdb8b3] [“Git Branch”=heads/refs/tags/v4.0.10] [“UTC Build Time”=“2021-01-15 02:59:27”] [GoVersion=go1.13] [“Race Enabled”=false] [“Check Table Before Drop”=false] [“TiKV Min Version”=v3.0.0-60965b006877ca7234adaced7890d7b029ed1306]
[2022/05/11 10:08:08.842 +08:00] [INFO] [printer.go:47] [“loaded config”] [config="{“host”:“0.0.0.0”,“advertise-address”:“10.255.194.13”,“port”:4000,“cors”:"",“store”:“tikv”,“path”:“10.255.194.10:2379”,“socket”:"",“lease”:“45s”,“run-ddl”:true,“split-table”:true,“token-limit”:1000,“oom-use-tmp-storage”:true,“tmp-storage-path”:"/tmp/1001_tidb/MC4wLjAuMDo0MDAwLzAuMC4wLjA6MTAwODA=/tmp-storage",“oom-action”:“log”,“mem-quota-query”:1073741824,“tmp-storage-quota”:-1,“enable-streaming”:false,“enable-batch-dml”:false,“lower-case-table-names”:2,“server-version”:"",“log”:{“level”:“info”,“format”:“text”,“disable-timestamp”:null,“enable-timestamp”:null,“disable-error-stack”:null,“enable-error-stack”:null,“file”:{“filename”:"/tidb-deploy/tidb-4000/log/tidb.log",“max-size”:300,“max-days”:0,“max-backups”:0},“enable-slow-log”:true,“slow-query-file”:“log/tidb_slow_query.log”,“slow-threshold”:300,“expensive-threshold”:10000,“query-log-max-len”:4096,“record-plan-in-slow-log”:1},“security”:{“skip-grant-table”:false,“ssl-ca”:"",“ssl-cert”:"",“ssl-key”:"",“require-secure-transport”:false,“cluster-ssl-ca”:"",“cluster-ssl-cert”:"",“cluster-ssl-key”:"",“cluster-verify-cn”:null},“status”:{“status-host”:“0.0.0.0”,“metrics-addr”:"",“status-port”:10080,“metrics-interval”:15,“report-status”:true,“record-db-qps”:false},“performance”:{“max-procs”:0,“max-memory”:0,“server-memory-quota”:0,“memory-usage-alarm-ratio”:0.8,“stats-lease”:“3s”,“stmt-count-limit”:5000,“feedback-probability”:0,“query-feedback-limit”:512,“pseudo-estimate-ratio”:0.8,“force-priority”:“NO_PRIORITY”,“bind-info-lease”:“3s”,“txn-entry-size-limit”:6291456,“txn-total-size-limit”:104857600,“tcp-keep-alive”:true,“cross-join”:true,“run-auto-analyze”:true,“agg-push-down-join”:false,“committer-concurrency”:16,“max-txn-ttl”:600000,“gogc”:100},“prepared-plan-cache”:{“enabled”:false,“capacity”:100,“memory-guard-ratio”:0.1},“opentracing”:{“enable”:false,“rpc-metrics”:false,“sampler”:{“type”:“const”,“param”:1,“sampling-server-url”:"",“max-oper
ations”:0,“sampling-refresh-interval”:0},“reporter”:{“queue-size”:0,“buffer-flush-interval”:0,“log-spans”:false,“local-agent-host-port”:""}},“proxy-protocol”:{“networks”:"",“header-timeout”:5},“tikv-client”:{“grpc-connection-count”:4,“grpc-keepalive-time”:10,“grpc-keepalive-timeout”:3,“commit-timeout”:“41s”,“max-batch-size”:128,“overload-threshold”:200,“max-batch-wait-time”:0,“batch-wait-size”:8,“enable-chunk-rpc”:true,“region-cache-ttl”:600,“store-limit”:0,“store-liveness-timeout”:“5s”,“copr-cache”:{“enable”:false,“capacity-mb”:1000,“admission-max-ranges”:500,“admission-max-result-mb”:10,“admission-min-process-ms”:5}},“binlog”:{“enable”:false,“ignore-error”:false,“write-timeout”:“15s”,“binlog-socket”:"",“strategy”:“range”},“compatible-kill-query”:false,“plugin”:{“dir”:"",“load”:""},“pessimistic-txn”:{“enable”:true,“max-retry-count”:256},“check-mb4-value-in-utf8”:true,“max-index-length”:3072,“graceful-wait-before-shutdown”:0,“alter-primary-key”:false,“treat-old-version-utf8-as-utf8mb4”:true,“enable-table-lock”:false,“delay-clean-table-lock”:0,“split-region-max-num”:1000,“stmt-summary”:{“enable”:true,“enable-internal-query”:false,“max-stmt-count”:200,“max-sql-length”:4096,“refresh-interval”:1800,“history-size”:24},“repair-mode”:false,“repair-table-list”:[],“isolation-read”:{“engines”:[“tikv”,“tiflash”,“tidb”]},“max-server-connections”:0,“new_collations_enabled_on_first_bootstrap”:false,“experimental”:{“allow-expression-index”:false},“enable-collect-execution-info”:true,“skip-register-to-dashboard”:false,“enable-telemetry”:true}"]
[2022/05/11 10:08:08.842 +08:00] [INFO] [main.go:304] [“disable Prometheus push client”]
[2022/05/11 10:08:08.842 +08:00] [INFO] [store.go:68] [“new store”] [path=tikv://10.255.194.10:2379]
[2022/05/11 10:08:08.842 +08:00] [INFO] [client.go:167] ["[pd] create pd client with endpoints"] [pd-address="[10.255.194.10:2379]"]
[2022/05/11 10:08:08.842 +08:00] [INFO] [systime_mon.go:25] [“start system time monitor”]
[2022/05/11 10:08:08.851 +08:00] [INFO] [base_client.go:252] ["[pd] switch leader"] [new-leader=http://10.255.194.10:2379] [old-leader=]
[2022/05/11 10:08:08.851 +08:00] [INFO] [base_client.go:102] ["[pd] init cluster id"] [cluster-id=7096036012931376407]
[2022/05/11 10:08:08.857 +08:00] [INFO] [store.go:74] [“new store with retry success”]
[2022/05/11 10:08:13.860 +08:00] [WARN] [client_batch.go:228] [“init create streaming fail”] [target=10.255.194.15:20160] [error=“context deadline exceeded”]
[2022/05/11 10:08:18.861 +08:00] [INFO] [region_cache.go:1640] ["[liveness] request kv status fail"] [store=10.255.194.15:20180] [error=“Get http://10.255.194.15:20180/status: context deadline exceeded”]
[2022/05/11 10:08:18.861 +08:00] [INFO] [region_cache.go:600] [“mark store’s regions need be refill”] [store=10.255.194.15:20160]
[2022/05/11 10:08:18.861 +08:00] [INFO] [region_cache.go:619] [“switch region peer to next due to send request fail”] [current=“region ID: 2, meta: id:2 region_epoch:<conf_ver:1 version:1 > peers:<id:3 store_id:1 > , peer: id:3 store_id:1 , addr: 10.255.194.15:20160, idx: 0, reqStoreType: TiKvOnly, runStoreType: tikv”] [needReload=true] [error=“context deadline exceeded”]
[2022/05/11 10:08:18.932 +08:00] [INFO] [region_cache.go:414] [“invalidate current region, because others failed on same store”] [region=2] [store=10.255.194.15:20160]
[2022/05/11 10:08:23.936 +08:00] [WARN] [client_batch.go:228] [“init create streaming fail”] [target=10.255.194.15:20160] [error=“context deadline exceeded”]
[2022/05/11 10:08:28.937 +08:00] [INFO] [region_cache.go:1640] ["[liveness] request kv status fail"] [store=10.255.194.15:20180] [error=“Get http://10.255.194.15:20180/status: context deadline exceeded”]
[2022/05/11 10:08:28.937 +08:00] [INFO] [region_cache.go:600] [“mark store’s regions need be refill”] [store=10.255.194.15:20160]
[2022/05/11 10:08:28.937 +08:00] [INFO] [region_cache.go:619] [“switch region peer to next due to send request fail”] [current=“region ID: 2, meta: id:2 region_epoch:<conf_ver:1 version:1 > peers:<id:3 store_id:1 > , peer: id:3 store_id:1 , addr: 10.255.194.15:20160, idx: 0, reqStoreType: TiKvOnly, runStoreType: tikv”] [needReload=true] [error=“context deadline exceeded”]
[2022/05/11 10:08:29.097 +08:00] [INFO] [region_cache.go:414] [“invalidate current region, because others failed on same store”] [region=2] [store=10.255.194.15:20160]
[2022/05/11 10:08:34.103 +08:00] [WARN] [client_batch.go:228] [“init create streaming fail”] [target=10.255.194.15:20160] [error=“context deadline exceeded”]
[2022/05/11 10:08:39.103 +08:00] [INFO] [region_cache.go:1640] ["[liveness] request kv status fail"] [store=10.255.194.15:20180] [error=“Get http://10.255.194.15:20180/status: context deadline exceeded”]
[2022/05/11 10:08:39.104 +08:00] [INFO] [region_cache.go:600] [“mark store’s regions need be refill”] [store=10.255.194.15:20160]
[2022/05/11 10:08:39.104 +08:00] [INFO] [region_cache.go:619] [“switch region peer to next due to send request fail”] [current=“region ID: 2, meta: id:2 region_epoch:<conf_ver:1 version:1 > peers:<id:3 store_id:1 > , peer: id:3 store_id:1 , addr: 10.255.194.15:20160, idx: 0, reqStoreType: TiKvOnly, runStoreType: tikv”] [needReload=true] [error=“context deadline exceeded”]
[2022/05/11 10:08:39.326 +08:00] [INFO] [region_cache.go:414] [“invalidate current region, because others failed on same store”] [region=2] [store=10.255.194.15:20160]
[2022/05/11 10:08:44.336 +08:00] [WARN] [client_batch.go:228] [“init create streaming fail”] [target=10.255.194.15:20160] [error=“context deadline exceeded”]
[2022/05/11 10:08:49.336 +08:00] [INFO] [region_cache.go:1640] ["[liveness] request kv status fail"] [store=10.255.194.15:20180] [error=“Get http://10.255.194.15:20180/status: context deadline exceeded”]
[2022/05/11 10:08:49.336 +08:00] [INFO] [region_cache.go:600] [“mark store’s regions need be refill”] [store=10.255.194.15:20160]
[2022/05/11 10:08:49.336 +08:00] [INFO] [region_cache.go:619] [“switch region peer to next due to send request fail”] [current=“region ID: 2, meta: id:2 region_epoch:<conf_ver:1 version:1 > peers:<id:3 store_id:1 > , peer: id:3 store_id:1 , addr: 10.255.194.15:20160, idx: 0, reqStoreType: TiKvOnly, runStoreType: tikv”] [needReload=true] [error=“context deadline exceeded”]
[2022/05/11 10:08:50.098 +08:00] [INFO] [region_cache.go:414] [“invalidate current region, because others failed on same store”] [region=2] [store=10.255.194.15:20160]
[2022/05/11 10:08:55.115 +08:00] [WARN] [client_batch.go:228] [“init create streaming fail”] [target=10.255.194.15:20160] [error=“context deadline exceeded”]
[2022/05/11 10:09:00.116 +08:00] [INFO] [region_cache.go:1640] ["[liveness] request kv status fail"] [store=10.255.194.15:20180] [error=“Get http://10.255.194.15:20180/status: context deadline exceeded”]
[2022/05/11 10:09:00.116 +08:00] [INFO] [region_cache.go:600] [“mark store’s regions need be refill”] [store=10.255.194.15:20160]
[2022/05/11 10:09:00.116 +08:00] [INFO] [region_cache.go:619] [“switch region peer to next due to send request fail”] [current=“region ID: 2, meta: id:2 region_epoch:<conf_ver:1 version:1 > peers:<id:3 store_id:1 > , peer: id:3 store_id:1 , addr: 10.255.194.15:20160, idx: 0, reqStoreType: TiKvOnly, runStoreType: tikv”] [needReload=true] [error=“context deadline exceeded”]
[2022/05/11 10:09:01.273 +08:00] [INFO] [region_cache.go:414] [“invalidate current region, because others failed on same store”] [region=2] [store=10.255.194.15:20160]
[2022/05/11 10:09:06.307 +08:00] [WARN] [client_batch.go:228] [“init create streaming fail”] [target=10.255.194.15:20160] [error=“context deadline exceeded”]
[2022/05/11 10:09:11.307 +08:00] [INFO] [region_cache.go:1640] ["[liveness] request kv status fail"] [store=10.255.194.15:20180] [error=“Get http://10.255.194.15:20180/status: context deadline exceeded”]
[2022/05/11 10:09:11.307 +08:00] [INFO] [region_cache.go:600] [“mark store’s regions need be refill”] [store=10.255.194.15:20160]
[2022/05/11 10:09:11.307 +08:00] [INFO] [region_cache.go:619] [“switch region peer to next due to send request fail”] [current=“region ID: 2, meta: id:2 region_epoch:<conf_ver:1 version:1 > peers:<id:3 store_id:1 > , peer: id:3 store_id:1 , addr: 10.255.194.15:20160, idx: 0, reqStoreType: TiKvOnly, runStoreType: tikv”] [needReload=true] [error=“context deadline exceeded”]
[2022/05/11 10:09:12.742 +08:00] [INFO] [region_cache.go:414] [“invalidate current region, because others failed on same store”] [region=2] [store=10.255.194.15:20160]
[2022/05/11 10:09:17.808 +08:00] [WARN] [client_batch.go:228] [“init create streaming fail”] [target=10.255.194.15:20160] [error=“context deadline exceeded”]
[2022/05/11 10:09:22.808 +08:00] [INFO] [region_cache.go:1640] ["[liveness] request kv status fail"] [store=10.255.194.15:20180] [error=“Get http://10.255.194.15:20180/status: context deadline exceeded”]
[2022/05/11 10:09:22.808 +08:00] [INFO] [region_cache.go:600] [“mark store’s regions need be refill”] [store=10.255.194.15:20160]
[2022/05/11 10:09:22.809 +08:00] [INFO] [region_cache.go:619] [“switch region peer to next due to send request fail”] [current=“region ID: 2, meta: id:2 region_epoch:<conf_ver:1 version:1 > peers:<id:3 store_id:1 > , peer: id:3 store_id:1 , addr: 10.255.194.15:20160, idx: 0, reqStoreType: TiKvOnly, runStoreType: tikv”] [needReload=true] [error=“context deadline exceeded”]
[2022/05/11 10:09:24.002 +08:00] [INFO] [region_cache.go:414] [“invalidate current region, because others failed on same store”] [region=2] [store=10.255.194.15:20160]
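
A log this repetitive is easier to read once the failures are grouped by address. The following is a minimal sketch, not part of the original thread, that counts "context deadline exceeded" entries per target= or store= field. The function name count_timeouts and the regular expression are illustrative assumptions based only on the log format pasted above.

```python
import re
from collections import Counter

# Matches the [target=host:port] or [store=host:port] field seen in the
# tidb.log lines above. The field names are taken from that paste only.
ADDR_FIELD = re.compile(r"\[(?:target|store)=([^\]]+)\]")


def count_timeouts(lines):
    """Count 'context deadline exceeded' log lines per target/store address."""
    counts = Counter()
    for line in lines:
        if "context deadline exceeded" not in line:
            continue
        match = ADDR_FIELD.search(line)
        if match:  # lines without a target/store field are skipped
            counts[match.group(1)] += 1
    return counts
```

Calling count_timeouts(open("/tidb-deploy/tidb-4000/log/tidb.log", encoding="utf-8")) on the tidb host would list which addresses time out. In the excerpt above every such line points at 10.255.194.15, on ports 20160 and 20180.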

tidb, tikv, and pd each run on their own server, three servers in total.

[root@server-31a2b6b2-bc1e-454c-9564-cff1e078326b log]# tiup cluster display tidb-test
Starting component cluster: /root/.tiup/components/cluster/v1.5.0/tiup-cluster display tidb-test
Cluster type: tidb
Cluster name: tidb-test
Cluster version: v4.0.10
Deploy user: tidb
SSH type: builtin
Dashboard URL: http://10.255.194.10:2379/dashboard
ID                   Role  Host           Ports        OS/Arch       Status   Data Dir               Deploy Dir
10.255.194.10:2379   pd    10.255.194.10  2379/2380    linux/x86_64  Up|L|UI  /tidb-data/pd-2379     /tidb-deploy/pd-2379
10.255.194.13:4000   tidb  10.255.194.13  4000/10080   linux/x86_64  Down     -                      /tidb-deploy/tidb-4000
10.255.194.15:20160  tikv  10.255.194.15  20160/20180  linux/x86_64  Up       /tidb-data/tikv-20160  /tidb-deploy/tikv-20160
Total nodes: 3

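pd and tikv each need to be deployed on 3 nodes, and you have only one node of each. Change the deployment topology and redeploy.

What if I want a single-machine deployment, the simplest possible setup, with all of these services on one server and only one node each? Would that work?

And why does tiup cluster destroy tidb-test fail too? I cannot destroy this cluster instance.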

Error: failed to stop: 10.255.194.13 node_exporter-9100.service, please check the instance’s log() for more detail.: timed out waiting for port 9100 to be stopped after 1m0s

Verbose debug logs has been written to /root/.tiup/logs/tiup-cluster-debug-2022-05-11-10-40-51.log.
Error: run /root/.tiup/components/cluster/v1.5.0/tiup-cluster (wd:/root/.tiup/data/T5UP7AU) failed: exit status 1

To destroy it, try running tiup cluster stop tidb-test first. If that reports errors, kill all tidb-related processes and then run destroy.

/////
Error: Cluster name ‘tidb-test’ is duplicated (deploy.name_dup)
////
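You get this error on the second deploy because the destroy failed and never completed, so the cluster name tidb-test is still registered.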

Both pd and tidb failed to reach tikv. Check whether node 15 has a network problem or whether access to its ports is restricted.
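
One way to test this suggestion is to try opening a TCP connection from the tidb host to the tikv ports. The following is a minimal sketch, not part of the original thread. It uses only the Python standard library, and the helper names port_open and check_ports are hypothetical. The host and ports to pass in (10.255.194.15, 20160 and 20180) come from the tiup cluster display output above.

```python
import socket


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def check_ports(host, ports, timeout=3.0):
    """Map each port to whether it accepted a TCP connection."""
    return {port: port_open(host, port, timeout) for port in ports}
```

Running check_ports("10.255.194.15", [20160, 20180]) on 10.255.194.13 would show whether either port is blocked. A False result only says the connection could not be opened. It does not tell a firewall rule apart from a tikv process that is not listening, so it is worth running the same check on the tikv host itself against 127.0.0.1.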

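I added --force and the cluster was forcibly destroyed.

It was a port restriction problem. After switching to another server it worked.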
