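I added the mount options to every entry in /etc/fstab, but it still doesn't work.
You don't need to add them to every entry, only to the file systems that have to meet TiDB's mount requirements. Every environment is different, so you'll have to work through this yourself.
Normally you don't use the / root directory at all. Everything is installed under a mounted file system, so fill in your yaml configuration file properly: set both data_dir and deploy_dir, and don't use the / root directory.
Do you mean that only the disks holding data_dir and deploy_dir need to be mounted this way?
Yes. You have three disks, mounted as three file systems: /data1, /data2, and /data, don't you?
Put both the installation directory and the data directory on the disks you mounted. Setting the nodelalloc and noatime options on those disks is enough. Leave the root directory alone.
Specify data_dir and deploy_dir in your yaml file, and tiup should then stop checking /.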
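For anyone following along, here is a minimal Python sketch of the check described above: find which mounted file system a directory lives on, then confirm that it is not / and that an ext4 mount carries nodelalloc and noatime. The sample mount output, the device names and the /data/tidb-* paths are made up for illustration, the function names are hypothetical, and this is not what tiup itself runs.

```python
import posixpath
import re

REQUIRED_EXT4_OPTS = {"nodelalloc", "noatime"}

# Matches lines printed by `mount`, e.g.
# "/dev/sdb1 on /data type ext4 (rw,noatime,nodelalloc)"
MOUNT_LINE = re.compile(r"^(\S+) on (\S+) type (\S+) \((.*)\)$")


def parse_mount_output(text):
    """Return {mount_point: (fstype, set_of_options)} from `mount` output."""
    mounts = {}
    for line in text.splitlines():
        m = MOUNT_LINE.match(line.strip())
        if m:
            _device, mount_point, fstype, opts = m.groups()
            mounts[mount_point] = (fstype, set(opts.split(",")))
    return mounts


def mount_point_for(path, mounts):
    """Return the longest mount point that contains `path` (default "/")."""
    path = posixpath.normpath(path)
    best = "/"
    for mp in mounts:
        prefix = mp.rstrip("/") + "/"
        if (path == mp or path.startswith(prefix)) and len(mp) > len(best):
            best = mp
    return best


def check_dir(path, mounts):
    """Return a list of problems found for a deploy or data directory."""
    mp = mount_point_for(path, mounts)
    if mp == "/":
        return [f"{path} is on the root file system"]
    fstype, opts = mounts[mp]
    missing = sorted(REQUIRED_EXT4_OPTS - opts)
    if fstype == "ext4" and missing:
        return [f"{mp} is missing mount options: {', '.join(missing)}"]
    return []


# Hypothetical sample; on a real host, paste the output of `mount -t ext4`.
SAMPLE = """\
/dev/sda2 on / type ext4 (rw,relatime)
/dev/sdb1 on /data type ext4 (rw,noatime,nodelalloc)
/dev/sdc1 on /data1 type ext4 (rw,relatime)
"""

mounts = parse_mount_output(SAMPLE)
for d in ("/data/tidb-deploy", "/data1/tidb-data", "/tidb-data"):
    print(d, check_dir(d, mounts) or "ok")
```

The sketch only enforces the two options on ext4, since nodelalloc is an ext4 option; other file systems would need their own rule.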
After running apply, the output is as follows. Starting the cluster still reports check bootstrapped failed.
Node Check Result Message
---- ----- ------ -------
172.17.0.157 memory Pass memory size is 32768MB
172.17.0.157 selinux Pass SELinux is disabled
172.17.0.157 command Pass numactl: policy: default
172.17.0.157 timezone Pass time zone is the same as the first PD machine: Etc/UTC
172.17.0.157 cpu-cores Pass number of CPU cores / threads: 24
172.17.0.157 cpu-governor Pass CPU frequency governor is performance
172.17.0.157 service Pass service firewalld not found, ignore
172.17.0.157 os-version Warn OS is Ubuntu 18.04.6 LTS 18.04.6 (ubuntu support is not fully tested, be careful), auto fixing not supported
172.17.0.157 network Pass network speed of eno1 is 1000MB
172.17.0.157 network Pass network speed of eno2 is 1000MB
172.17.0.157 network Pass network speed of eno3 is 1000MB
172.17.0.157 network Pass network speed of eno4 is 1000MB
172.17.0.157 thp Pass THP is disabled
172.17.0.158 cpu-governor Pass CPU frequency governor is performance
172.17.0.158 network Pass network speed of eno1 is 1000MB
172.17.0.158 network Pass network speed of eno2 is 1000MB
172.17.0.158 network Pass network speed of eno3 is 1000MB
172.17.0.158 network Pass network speed of eno4 is 1000MB
172.17.0.158 selinux Pass SELinux is disabled
172.17.0.158 command Pass numactl: policy: default
172.17.0.158 timezone Pass time zone is the same as the first PD machine: Etc/UTC
172.17.0.158 os-version Warn OS is Ubuntu 18.04.6 LTS 18.04.6 (ubuntu support is not fully tested, be careful), auto fixing not supported
172.17.0.158 cpu-cores Pass number of CPU cores / threads: 24
172.17.0.158 memory Pass memory size is 32768MB
172.17.0.158 thp Pass THP is disabled
172.17.0.158 service Pass service firewalld not found, ignore
172.17.0.159 os-version Warn OS is Ubuntu 18.04.6 LTS 18.04.6 (ubuntu support is not fully tested, be careful), auto fixing not supported
172.17.0.159 selinux Pass SELinux is disabled
172.17.0.159 thp Pass THP is disabled
172.17.0.159 service Pass service firewalld not found, ignore
172.17.0.159 command Pass numactl: policy: default
172.17.0.159 timezone Pass time zone is the same as the first PD machine: Etc/UTC
172.17.0.159 cpu-cores Pass number of CPU cores / threads: 24
172.17.0.159 cpu-governor Pass CPU frequency governor is performance
172.17.0.159 memory Pass memory size is 32768MB
172.17.0.159 network Pass network speed of eno4 is 1000MB
172.17.0.159 network Pass network speed of eno1 is 1000MB
172.17.0.159 network Pass network speed of eno2 is 1000MB
172.17.0.159 network Pass network speed of eno3 is 1000MB
172.17.0.154 network Pass network speed of eno1 is 1000MB
172.17.0.154 network Pass network speed of eno2 is 1000MB
172.17.0.154 network Pass network speed of eno3 is 1000MB
172.17.0.154 network Pass network speed of eno4 is 1000MB
172.17.0.154 selinux Pass SELinux is disabled
172.17.0.154 thp Pass THP is disabled
172.17.0.154 command Pass numactl: policy: default
172.17.0.154 os-version Warn OS is Ubuntu 18.04.6 LTS 18.04.6 (ubuntu support is not fully tested, be careful), auto fixing not supported
172.17.0.154 cpu-cores Pass number of CPU cores / threads: 24
172.17.0.154 cpu-governor Pass CPU frequency governor is performance
172.17.0.154 memory Pass memory size is 32768MB
172.17.0.155 cpu-cores Pass number of CPU cores / threads: 24
172.17.0.155 network Pass network speed of eno2 is 1000MB
172.17.0.155 network Pass network speed of eno3 is 1000MB
172.17.0.155 network Pass network speed of eno4 is 1000MB
172.17.0.155 network Pass network speed of eno1 is 1000MB
172.17.0.155 thp Pass THP is disabled
172.17.0.155 selinux Pass SELinux is disabled
172.17.0.155 service Pass service firewalld not found, ignore
172.17.0.155 command Pass numactl: policy: default
172.17.0.155 timezone Pass time zone is the same as the first PD machine: Etc/UTC
172.17.0.155 os-version Warn OS is Ubuntu 18.04.6 LTS 18.04.6 (ubuntu support is not fully tested, be careful), auto fixing not supported
172.17.0.155 cpu-governor Pass CPU frequency governor is performance
172.17.0.155 memory Pass memory size is 32768MB
172.17.0.156 timezone Pass time zone is the same as the first PD machine: Etc/UTC
172.17.0.156 cpu-cores Pass number of CPU cores / threads: 24
172.17.0.156 selinux Pass SELinux is disabled
172.17.0.156 thp Pass THP is disabled
172.17.0.156 command Pass numactl: policy: default
172.17.0.156 os-version Warn OS is Ubuntu 18.04.6 LTS 18.04.6 (ubuntu support is not fully tested, be careful), auto fixing not supported
172.17.0.156 cpu-governor Pass CPU frequency governor is performance
172.17.0.156 memory Pass memory size is 32768MB
172.17.0.156 network Pass network speed of eno1 is 1000MB
172.17.0.156 network Pass network speed of eno2 is 1000MB
172.17.0.156 network Pass network speed of eno3 is 1000MB
172.17.0.156 network Pass network speed of eno4 is 1000MB
172.17.0.156 service Pass service firewalld not found, ignore
+ Try to apply changes to fix failed checks
- Applying changes on 172.17.0.159 ... Done
- Applying changes on 172.17.0.154 ... Done
- Applying changes on 172.17.0.155 ... Done
- Applying changes on 172.17.0.156 ... Done
- Applying changes on 172.17.0.157 ... Done
- Applying changes on 172.17.0.158 ... Done
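The check table above is long, and every row in it is Pass except the os-version warnings. A rough Python sketch for pulling the non-Pass rows out of pasted check output might look like the following. It assumes the whitespace-separated Node / Check / Result / Message layout shown above and the result values Pass, Warn and Fail; the function name and the two sample rows are only for illustration.

```python
from collections import defaultdict

KNOWN_RESULTS = {"Pass", "Warn", "Fail"}


def non_passing(check_output):
    """Group non-Pass rows of the check table by (check, result)."""
    grouped = defaultdict(list)
    for line in check_output.splitlines():
        parts = line.split(None, 3)
        # Skip the header, the dashed separator and the "Applying changes" lines.
        if len(parts) < 3 or parts[2] not in KNOWN_RESULTS:
            continue
        node, check, result = parts[:3]
        if result != "Pass":
            grouped[(check, result)].append(node)
    return dict(grouped)


# A few rows in the layout of the table above, for illustration only.
SAMPLE = """\
Node Check Result Message
---- ----- ------ -------
172.17.0.157 memory Pass memory size is 32768MB
172.17.0.157 os-version Warn OS is Ubuntu 18.04.6 LTS 18.04.6
172.17.0.158 os-version Warn OS is Ubuntu 18.04.6 LTS 18.04.6
+ Try to apply changes to fix failed checks
- Applying changes on 172.17.0.159 ... Done
"""

for (check, result), nodes in non_passing(SAMPLE).items():
    print(f"{check} {result}: {', '.join(nodes)}")
```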
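Is it a problem with my OS? But the documentation says 6.1.0 supports Ubuntu 16.04 and later.
It is probably unrelated to the OS; your disk configuration is still the problem. Here is my configuration for reference. Adapt it to your environment.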
vi /etc/fstab
UUID=3c7e03d9-1ce3-46a6-b982-3c201e127673 /tidb6.1 ext4 defaults,nodelalloc,noatime 0 2
mkdir /tidb6.1 && mount -a
[root@dbserver ~]$ mount -t ext4
/dev/mapper/vg-root on / type ext4 (rw,relatime,data=ordered)
/dev/sda1 on /boot type ext4 (rw,relatime,data=ordered)
/dev/sdb on /tidb6.1 type ext4 (rw,noatime,nodelalloc,data=ordered)
tiup cluster template > topology.yaml
global:
  user: "tidb"
  ssh_port: 22
  deploy_dir: "/tidb6.1/tidb-deploy"
  data_dir: "/tidb6.1/tidb-data"
server_configs: {}
pd_servers:
  - host: 192.168.80.174
tidb_servers:
  - host: 192.168.80.179
tikv_servers:
  - host: 192.168.80.176
  - host: 192.168.80.177
  - host: 192.168.80.178
monitoring_servers:
  - host: 192.168.80.174
grafana_servers:
  - host: 192.168.80.174
alertmanager_servers:
  - host: 192.168.80.174
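The stack frame github.com/pingcap/tidb/session.getStoreBootstrapVersion means the store is not up, that is, TiKV is not up. The cluster starts its components in the order pd, tikv, tidb.
But doesn't this show that tikv did start?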
pan@admin:~$ tiup cluster start tidb-test --init
tiup is checking updates for component cluster ...
Starting component `cluster`: /home/pan/.tiup/components/cluster/v1.11.0/tiup-cluster start tidbt --init
Starting cluster tidb-test...
+ [ Serial ] - SSHKeySet: privateKey=/home/pan/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rpublicKey=/home/pan/.tiup/storage/cluster/clusters/tidb-test/ssh/id_rsa.pub
+ [Parallel] - UserSSH: user=tidb, host=172.17.0.154
+ [Parallel] - UserSSH: user=tidb, host=172.17.0.154
+ [Parallel] - UserSSH: user=tidb, host=172.17.0.156
+ [Parallel] - UserSSH: user=tidb, host=172.17.0.159
+ [Parallel] - UserSSH: user=tidb, host=172.17.0.155
+ [Parallel] - UserSSH: user=tidb, host=172.17.0.158
+ [Parallel] - UserSSH: user=tidb, host=172.17.0.158
+ [Parallel] - UserSSH: user=tidb, host=172.17.0.157
+ [Parallel] - UserSSH: user=tidb, host=172.17.0.158
+ [ Serial ] - StartCluster
Starting component pd
Starting instance 172.17.0.154:2379
Start instance 172.17.0.154:2379 success
Starting component tikv
Starting instance 172.17.0.157:20160
Starting instance 172.17.0.155:20160
Starting instance 172.17.0.156:20160
Start instance 172.17.0.155:20160 success
Start instance 172.17.0.157:20160 success
Start instance 172.17.0.156:20160 success
Starting component tidb
Starting instance 172.17.0.154:4000
The details from tidb.log:
tidb.log (64 KB)
[2022/11/01 17:06:13.297 +08:00] [FATAL] [session.go:3052] ["check bootstrapped failed"] [error="context deadline exceeded"] [stack="github.com/pingcap/tidb/session.getStoreBootstrapVersion\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/session/session.go:3052\ngithub.com/pingcap/tidb/session.BootstrapSession\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/session/session.go:2827\nmain.createStoreAndDomain\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/tidb-server/main.go:296\nmain.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/tidb-server/main.go:202\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"]
[2022/11/01 17:02:47.500 +08:00] [FATAL] [session.go:3052] ["check bootstrapped failed"] [error="[tikv:9002]TiKV server timeout"] [stack="github.com/pingcap/tidb/session.getStoreBootstrapVersion\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/session/session.go:3052\ngithub.com/pingcap/tidb/session.BootstrapSession\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/session/session.go:2827\nmain.createStoreAndDomain\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/tidb-server/main.go:296\nmain.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/tidb-server/main.go:202\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"]
The request to TiKV timed out on the network. Check the network between the TiDB host and the TiKV hosts.
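One quick way to do that check is a plain TCP connect from the TiDB host to each PD and TiKV port. Below is a minimal Python sketch, using the addresses and ports from the start log earlier in the thread (PD on 172.17.0.154:2379, TiKV on 172.17.0.155 to 157, port 20160). The helper names are hypothetical, and a successful connect only shows that the port is reachable, not that TiKV is healthy.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


# Endpoints taken from the `tiup cluster start` output in this thread.
PD_ENDPOINTS = [("172.17.0.154", 2379)]
TIKV_ENDPOINTS = [
    ("172.17.0.155", 20160),
    ("172.17.0.156", 20160),
    ("172.17.0.157", 20160),
]


def report(endpoints, timeout=3.0):
    """Map "host:port" to reachability for each endpoint."""
    return {f"{h}:{p}": can_connect(h, p, timeout) for h, p in endpoints}
```

Run `print(report(PD_ENDPOINTS + TIKV_ENDPOINTS))` on the TiDB host (172.17.0.154 here). Any False entry points at a firewall, routing or listen-address problem on that node.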
It no longer times out, but now it reports Region is unavailable.
[2022/11/02 10:55:59.137 +08:00] [FATAL] [session.go:3068] ["check bootstrapped failed"] [error="[tikv:9005]Region is unavailable"] [stack="github.com/pingcap/tidb/session.getStoreBootstrapVersion\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/session/session.go:3068\ngithub.com/pingcap/tidb/session.BootstrapSession\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/session/session.go:2843\nmain.createStoreAndDomain\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/tidb-server/main.go:296\nmain.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/tidb-server/main.go:202\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"]
Try a restart again and see.
Still failing with the same error.
[2022/11/02 16:20:38.205 +08:00] [FATAL] [session.go:3068] ["check bootstrapped failed"] [error="[tikv:9005]Region is unavailable"] [stack="github.com/pingcap/tidb/session.getStoreBootstrapVersion\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/session/session.go:3068\ngithub.com/pingcap/tidb/session.BootstrapSession\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/session/session.go:2843\nmain.createStoreAndDomain\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/tidb-server/main.go:296\nmain.main\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/tidb-server/main.go:202\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"]
[2022/11/02 16:18:34.720 +08:00] [WARN] [backoff.go:158] ["regionMiss backoffer.maxSleep 40000ms is exceeded, errors:\nepoch_not_match:<> at 2022-11-02T16:18:33.211197558+08:00\nepoch_not_match:<> at 2022-11-02T16:18:33.714191748+08:00\nepoch_not_match:<> at 2022-11-02T16:18:34.216897365+08:00\nlongest sleep type: regionMiss, time: 40010ms"]
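Since the same FATAL line keeps coming back with a different error field (context deadline exceeded, then [tikv:9002], then [tikv:9005]), it can help to pull out just the time, level, message and error when comparing attempts. Here is a small Python sketch that assumes the bracketed layout of the lines pasted above. The function name is hypothetical, and the regex also accepts curly quotes because forum software sometimes converts straight quotes.

```python
import re

# [time] [LEVEL] [source] ["message"] followed by optional [key="value"] fields
LINE_RE = re.compile(
    r'^\[(?P<time>[^\]]+)\] \[(?P<level>[A-Z]+)\] \[(?P<source>[^\]]+)\] '
    r'\[["“](?P<message>.*?)["”]\]'
)
ERROR_RE = re.compile(r'\[error=["“](?P<error>.*?)["”]\]')
CODE_RE = re.compile(r"\[tikv:(\d+)\]")


def parse_tidb_log_line(line):
    """Return time, level, source, message, error and TiKV error code, or None."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    record = m.groupdict()
    err = ERROR_RE.search(line)
    record["error"] = err.group("error") if err else None
    code = CODE_RE.search(record["error"] or "")
    record["code"] = int(code.group(1)) if code else None
    return record


# One of the lines from this thread, with the stack field shortened here.
SAMPLE = (
    '[2022/11/01 17:02:47.500 +08:00] [FATAL] [session.go:3052] '
    '["check bootstrapped failed"] [error="[tikv:9002]TiKV server timeout"] '
    '[stack="github.com/pingcap/tidb/session.getStoreBootstrapVersion"]'
)

rec = parse_tidb_log_line(SAMPLE)
print(rec["time"], rec["level"], rec["message"], rec["error"], rec["code"])
```

Sorting the parsed records by time makes it easier to see whether the error changed after a restart or a network fix.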
I don't quite follow. Have you gotten past check bootstrapped failed now? Presumably not yet? How about just posting the relevant logs here?