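The initial TiDB cluster deployment succeeded, without any TiFlash nodes. Afterwards I tried to scale out by adding one TiFlash node. Sometimes this works and sometimes it fails with an error. Why?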

The scale-out was run as follows:
Starting component cluster: /home/tidb/.tiup/components/cluster/v1.4.1/tiup-cluster scale-out -y tidb /opt/app/current/conf/tidb/scale-out.yaml

The error output is:
 - Copy node_exporter → 172.24.0.25 … MonitoredConfig: cluster=tidb, user=tidb, node_exporter_port=9100, blackbox_exporter_port=9115, deploy_dir=/data/tidb-deploy/monitor-9100, data_dir=[/data/tidb-data/monitor-9100], log_dir=/data/tidb-deploy/monitor-9100/log, cache_dir=/home/tidb/.tiup/storage/cluster/clusters/tidb/config-cache
 - Copy node_exporter → 172.24.0.25 … Done
Error: stderr: tar (child): /data/tidb-deploy/tiflash-9000/bin/tiflash-v5.0.1-linux-amd64.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
: executor.ssh.execute_failed: Failed to execute command over SSH for ‘tidb@172.24.0.25:22’ {ssh_stderr: tar (child): /data/tidb-deploy/tiflash-9000/bin/tiflash-v5.0.1-linux-amd64.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
, ssh_stdout: , ssh_command: export LANG=C; PATH=$PATH:/usr/bin:/usr/sbin tar --no-same-owner -zxf /data/tidb-deploy/tiflash-9000/bin/tiflash-v5.0.1-linux-amd64.tar.gz -C /data/tidb-deploy/tiflash-9000/bin && rm /data/tidb-deploy/tiflash-9000/bin/tiflash-v5.0.1-linux-amd64.tar.gz}, cause: Process exited with status 2

Verbose debug logs has been written to /home/tidb/.tiup/logs/tiup-cluster-debug-2021-06-25-14-12-25.log.
Error: run /home/tidb/.tiup/components/cluster/v1.4.1/tiup-cluster (wd:/home/tidb/.tiup/data/SbKAG7H) failed: exit status 1

tiup cluster display tidb shows everything as normal.


The disk view at http://172.24.0.61:2379/dashboard/#/cluster_info/disk, however, is abnormal.

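I then scaled out again with two more TiFlash nodes. This time the scale-out reported no error and tiup cluster display tidb showed everything as normal, but the disk view was still abnormal.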

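Next I added six more TiFlash nodes. This time the scale-out failed with the same error as in the title, while tiup cluster display tidb still showed everything as normal. The disk view was abnormal again (the same as in the disk screenshot above).
The error output is: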

  • Copy node_exporter → 172.24.0.38 … Done
    Error: stderr: tar (child): /data/tidb-deploy/tiflash-9000/bin/tiflash-v5.0.1-linux-amd64.tar.gz: Cannot open: No such file or directory
    tar (child): Error is not recoverable: exiting now
    tar: Child returned status 2
    tar: Error is not recoverable: exiting now
    : executor.ssh.execute_failed: Failed to execute command over SSH for ‘tidb@172.24.0.37:22’ {ssh_stderr: tar (child): /data/tidb-deploy/tiflash-9000/bin/tiflash-v5.0.1-linux-amd64.tar.gz: Cannot open: No such file or directory
    tar (child): Error is not recoverable: exiting now
    tar: Child returned status 2
    tar: Error is not recoverable: exiting now
    , ssh_stdout: , ssh_command: export LANG=C; PATH=$PATH:/usr/bin:/usr/sbin tar --no-same-owner -zxf /data/tidb-deploy/tiflash-9000/bin/tiflash-v5.0.1-linux-amd64.tar.gz -C /data/tidb-deploy/tiflash-9000/bin && rm /data/tidb-deploy/tiflash-9000/bin/tiflash-v5.0.1-linux-amd64.tar.gz}, cause: Process exited with status 2

Verbose debug logs has been written to /home/tidb/.tiup/logs/tiup-cluster-debug-2021-06-25-15-53-19.log.
Error: run /home/tidb/.tiup/components/cluster/v1.4.1/tiup-cluster (wd:/home/tidb/.tiup/data/SbKZfUO) failed: exit status 1
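
Note that in the second failure the last progress line mentions 172.24.0.38, while the failing SSH command targets 172.24.0.37, so the host to inspect has to be read from the error itself. A minimal Python sketch for pulling the failing host and the missing file out of such output follows. The function name `summarize_failure` is hypothetical, and the patterns are written only against the error text quoted in this thread, so other tiup versions may need different patterns.

```python
import re

# Assumption: patterns match the error text quoted in this thread only.
# \W? tolerates an opening quote character before the user name.
SSH_TARGET = re.compile(
    r"Failed to execute command over SSH for\s+\W?(\w+)@([\d.]+):(\d+)"
)
MISSING_FILE = re.compile(
    r"tar \(child\): (\S+): Cannot open: No such file or directory"
)


def summarize_failure(output):
    """Return user, host, port and missing file from the error text, or None."""
    target = SSH_TARGET.search(output)
    missing = MISSING_FILE.search(output)
    if target is None or missing is None:
        return None
    user, host, port = target.groups()
    return {
        "user": user,
        "host": host,
        "port": int(port),
        "missing_file": missing.group(1),
    }
```

With the host and path in hand, the existence check suggested later in the thread can be done on exactly that machine.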

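  1. First check whether the TiFlash logs contain any errors.
  2. Please upload the tiup cluster audit log. (Run tiup cluster audit to find the ID of the command that scaled out TiFlash, then run tiup cluster audit <id> > audit.log.)
  3. Which version of tiup cluster are you using? We recommend using the latest version.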

1. root@i-xj1g5wsc:/data/tidb-deploy/tiflash-9000/log# cat tiflash_error.log
2021.06.25 15:53:36.736026 [ 1 ] Application: The configuration “path” is deprecated. Check [storage] section for new style.
2021.06.25 15:57:22.973581 [ 14 ] DiagnosticsService: Cannot find mounted disk of path: /data/tidb-data/tiflash-9000/data
2021.06.25 16:14:30.509690 [ 13 ] DiagnosticsService: Cannot find mounted disk of path: /data/tidb-data/tiflash-9000/data

2. audit.log (442.7 KB)
3. tiup version: /home/tidb/.tiup/components/cluster/v1.4.1/tiup-cluster

Is TiUP running in an online or an offline environment? Also, since the error is No such file or directory, please check manually whether the file or directory exists.

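To summarize: the TiUP scale-out reports an error, but display shows everything as normal, and the only visible symptom is that the dashboard shows a disk problem?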

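Yes, exactly as you describe: the TiUP scale-out reports an error, display shows everything as normal, and only the dashboard shows a disk problem.
TiUP is deployed offline, and the version is v1.4.1.
I checked manually and the package does exist on the monitor node, as shown in the screenshot.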

The dashboard shows a disk problem because tiflash_error.log reports Cannot find mounted disk of path: /data/tidb-data/tiflash-9000/data, yet my disk is in fact mounted.
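
To cross-check the claim that the disk is mounted, independently of TiFlash, one can ask which entry of the operating system's mount table covers the data directory. The Python sketch below does a longest-prefix match against mount points. It is only an illustration and makes no claim about how TiFlash's DiagnosticsService performs its own lookup. The function names are hypothetical, and reading /proc/mounts assumes a Linux host.

```python
import posixpath


def find_mount_point(path, mount_points):
    """Return the longest mount point that contains path, or None."""
    path = posixpath.normpath(path)
    best = None
    for mp in mount_points:
        mp = posixpath.normpath(mp)
        if mp == "/":
            covers = path.startswith("/")
        else:
            # Require a full path component match so /data does not
            # match /database.
            covers = path == mp or path.startswith(mp + "/")
        if covers and (best is None or len(mp) > len(best)):
            best = mp
    return best


def read_mount_points(mounts_file="/proc/mounts"):
    """Read mount points (second column) from a Linux mount table.

    Spaces in mount points appear escaped as \\040 in this file and
    are not decoded here.
    """
    points = []
    with open(mounts_file) as f:
        for line in f:
            fields = line.split()
            if len(fields) >= 2:
                points.append(fields[1])
    return points
```

For example, `find_mount_point("/data/tidb-data/tiflash-9000/data", read_mount_points())` run on the TiFlash host shows which mount the directory actually lives on.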

Please confirm whether TiFlash itself can be used normally.

This looks quite similar to another topic: "tiflash的磁盘信息在dashboard不显示" (TiFlash disk information is not displayed in the dashboard).

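Currently, the status shown by tiup cluster display is taken from the store status reported by pd-ctl store. TiFlash may, however, be crashing and getting restarted over and over, in which case PD can still report the store as Up. It is therefore better to judge whether TiFlash has a problem from the uptime shown in monitoring and from the TiFlash logs.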
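
The uptime advice can be illustrated with a small Python sketch: a process that keeps crashing and being restarted shows an uptime that repeatedly falls back toward zero, even while its store status reads Up. The function `count_restarts` and its input, a list of uptime readings in seconds taken at successive sampling times, are assumptions for illustration. How the readings are exported from monitoring is not shown.

```python
def count_restarts(uptime_samples):
    """Count how often uptime dropped between consecutive samples.

    uptime_samples: uptime readings in seconds, oldest first
    (hypothetical input, e.g. exported from the monitoring system).
    A drop means the process restarted between the two readings.
    Several restarts inside one sampling interval count as one.
    """
    restarts = 0
    for previous, current in zip(uptime_samples, uptime_samples[1:]):
        if current < previous:
            restarts += 1
    return restarts
```

A result of zero over a long window suggests a stable process. A steadily growing count points to the crash-and-restart loop described above, and the TiFlash logs are the next place to look.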

OK, thanks.
