Reloading configuration is very slow on a single-machine cluster

[TiDB environment] Test
[TiDB version] v8.5.1
[Operating system]

[root@tidbcluster soft]# hostnamectl
   Static hostname: tidbcluster
         Icon name: computer-vm
           Chassis: vm
        Machine ID: 17b2ead1d66b43f3b633fb09f016042e
           Boot ID: eaa877c9baa243f280cf66a8e02f4055
    Virtualization: vmware
  Operating System: CentOS Linux 7 (Core)
       CPE OS Name: cpe:/o:centos:centos:7
            Kernel: Linux 3.10.0-1160.118.1.el7.x86_64
      Architecture: x86-64

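[Deployment method]
VMware virtual machine with an SSD disk; system load is low, and CPU and memory are mostly idle.
Deployment follows the guide "Simulate production deployment on a single machine":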
https://docs.pingcap.com/zh/tidb/stable/quick-start-with-tidb/#在单机上模拟部署生产环境集群

[Cluster data volume]
[Number of cluster nodes]
[Steps to reproduce] Which operations led to the problem
Running a command like:

tiup cluster reload tidb-test

On a standard cluster deployed across multiple virtual machines, the same kind of operation is fairly fast.
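
To put a number on "slow" versus "fast", the reload can be wrapped in a small timing helper and run on both kinds of deployment. This is only a sketch assuming bash; `time_cmd` is a hypothetical helper name, not part of tiup.

```shell
#!/usr/bin/env bash
# time_cmd (hypothetical helper): run any command, then report
# wall-clock seconds and the command's exit code.
time_cmd() {
  local start end rc
  start=$(date +%s)
  "$@"
  rc=$?
  end=$(date +%s)
  echo "elapsed_seconds=$((end - start)) exit_code=${rc}"
  return "${rc}"
}

# Example on the control machine (cluster name taken from the post above):
#   time_cmd tiup cluster reload tidb-test
# To see which component takes the time, reload one role at a time:
#   time_cmd tiup cluster reload tidb-test -R tidb
```

Comparing the per-role timings between the slow host and a fast one should show whether every component is slow or only one of them.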

:thinking: What does server resource usage look like during the reload?

Resources are not under pressure.


I redeployed a single-machine v8.5.1 cluster on Rocky Linux 9.5, and both startup and configuration reload are fast there.

I suspect the cause is poor compatibility between CentOS 7.9 and v8.5.1.
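
Since the suspicion is about the operating system, it may help to record the same OS facts on both hosts before comparing them. A minimal sketch, assuming bash and a standard `/etc/os-release` file; `os_summary` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# os_summary (hypothetical helper): print "<ID> <VERSION_ID>" from an
# os-release file. The file is sourced in a subshell so its variables
# do not leak into the calling shell.
os_summary() {
  local file="${1:-/etc/os-release}"
  ( . "$file" && printf '%s %s\n' "${ID:-unknown}" "${VERSION_ID:-unknown}" )
}

# Run on both hosts and compare, together with kernel and glibc versions:
#   os_summary
#   uname -r
#   getconf GNU_LIBC_VERSION
```

This only documents the differences between the two systems; it does not by itself show which of them causes the slow reload.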


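The official site does have a passage along these lines. At that point they had probably already planned to stop supporting version 7 Linux distributions.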

Check the resource usage; if it is not enough, scale up.

It could be a resource issue.

I checked at the system level, and resources are not under pressure.
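
For anyone who wants to verify "not under pressure" with a number, one rough indicator is the 1-minute load average divided by the CPU count, sampled while the reload runs. A sketch assuming bash and Linux `/proc`; `load_per_cpu` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# load_per_cpu (hypothetical helper): 1-minute load average divided by
# the number of online CPUs. Values well below 1.00 suggest the CPUs
# are not saturated.
load_per_cpu() {
  local loadavg_file="${1:-/proc/loadavg}"
  local cpus="${2:-$(getconf _NPROCESSORS_ONLN)}"
  LC_ALL=C awk -v n="$cpus" '{ printf "%.2f\n", $1 / n }' "$loadavg_file"
}

# In a second terminal while the reload is running:
#   while sleep 1; do load_per_cpu; done
```

Load average does not cover disk latency or memory pressure, so a low value here narrows things down but does not rule resources out completely.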

Is the configuration file formatted correctly?

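I installed following the minimal configuration in the official documentation and copied the configuration file from there.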

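Please post it so we can take a look. Some settings do need to be changed, such as certain addresses. In principle this should not happen.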

Both systems are virtual machines on the same platform with identical CPU, memory, and disk allocations, and both run v8.5.1.

Configuration file on CentOS 7.9:


[root@tidbcluster ~]# cat topo_tidbcluster.yaml


# # Global variables are applied to all deployments and used as the default value of
# # the deployments if a specific deployment value is missing.
global:
 user: "tidb"
 ssh_port: 22
 deploy_dir: "/tidb-deploy"
 data_dir: "/tidb-data"

# # Monitored variables are applied to all the machines.
monitored:
 node_exporter_port: 9100
 blackbox_exporter_port: 9115

server_configs:
 tidb:
   instance.tidb_slow_log_threshold: 300
 tikv:
   readpool.storage.use-unified-pool: false
   readpool.coprocessor.use-unified-pool: true
 pd:
   replication.enable-placement-rules: true
   replication.location-labels: ["host"]
 tiflash:
   logger.level: "info"

pd_servers:
 - host: 192.168.169.41

tidb_servers:
 - host: 192.168.169.41

tikv_servers:
 - host: 192.168.169.41
   port: 20160
   status_port: 20180
   config:
     server.labels: { host: "logic-host-1" }

 - host: 192.168.169.41
   port: 20161
   status_port: 20181
   config:
     server.labels: { host: "logic-host-2" }

 - host: 192.168.169.41
   port: 20162
   status_port: 20182
   config:
     server.labels: { host: "logic-host-3" }

tiflash_servers:
 - host: 192.168.169.41

monitoring_servers:
 - host: 192.168.169.41

grafana_servers:
 - host: 192.168.169.41

Configuration file on Rocky Linux 9.5:


[root@tidb40 soft]# cat topo.yaml

# # Global variables are applied to all deployments and used as the default value of
# # the deployments if a specific deployment value is missing.
global:
 user: "tidb"
 ssh_port: 22
 deploy_dir: "/tidb-deploy"
 data_dir: "/tidb-data"

# # Monitored variables are applied to all the machines.
monitored:
 node_exporter_port: 9100
 blackbox_exporter_port: 9115

server_configs:
 tidb:
   instance.tidb_slow_log_threshold: 300
 tikv:
   readpool.storage.use-unified-pool: false
   readpool.coprocessor.use-unified-pool: true
 pd:
   replication.enable-placement-rules: true
   replication.location-labels: ["host"]
 tiflash:
   logger.level: "info"

pd_servers:
 - host: 192.168.169.40

tidb_servers:
 - host: 192.168.169.40

tikv_servers:
 - host: 192.168.169.40
   port: 20160
   status_port: 20180
   config:
     server.labels: { host: "logic-host-1" }

 - host: 192.168.169.40
   port: 20161
   status_port: 20181
   config:
     server.labels: { host: "logic-host-2" }

 - host: 192.168.169.40
   port: 20162
   status_port: 20182
   config:
     server.labels: { host: "logic-host-3" }

tiflash_servers:
 - host: 192.168.169.40

monitoring_servers:
 - host: 192.168.169.40

grafana_servers:
 - host: 192.168.169.40
[root@tidb40 soft]#
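
The two files look identical apart from the host IP. One way to check that mechanically is to mask every IPv4 address and compare what remains. A sketch assuming bash and that both files have been copied to one machine; `normalize_topology` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# normalize_topology (hypothetical helper): print a topology file with
# every IPv4 address replaced by the placeholder <HOST>.
normalize_topology() {
  sed -E 's/[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/<HOST>/g' "$1"
}

# Compare the two files from this thread (no output means no difference):
#   diff <(normalize_topology topo_tidbcluster.yaml) <(normalize_topology topo.yaml)
```

If the diff is empty, the topology itself can be ruled out and the difference has to come from the hosts.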

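Try using IP addresses for the hosts, comment out the less commonly used settings first, then re-enable them one at a time and see what happens.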

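Start with the simplest configuration and check whether it is still slow, then add settings back step by step. It is best to use IPs first; with hostnames, name resolution also comes into play.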
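
On the name-resolution point: the topology above already uses IPs, but a slow or failing lookup of the machine's own hostname is still easy to rule out. This is an assumption worth testing, not a confirmed cause. A sketch assuming bash and `getent`; `resolve_check` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# resolve_check (hypothetical helper): look up a name through the
# system resolver and report whether it succeeded and how long it took.
resolve_check() {
  local name="${1:-$(hostname)}"
  local start end status
  start=$(date +%s)
  if getent hosts "$name" >/dev/null 2>&1; then
    status=ok
  else
    status=failed
  fi
  end=$(date +%s)
  echo "name=${name} status=${status} seconds=$((end - start))"
}

# On the slow host, check the machine's own hostname:
#   resolve_check
```

A result of `status=failed`, or a lookup that takes several seconds, would point toward `/etc/hosts` or DNS settings rather than TiDB itself.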