Production environment: upgrading from v4.0.2 to v4.0.4 with tiup, TiFlash start times out with "timed out waiting for port 9000 to be started after 2m0s"

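tiup cluster display jiuji-tidb-cluster-v2 still shows v4.0.4 and v4.0.2. How can this be fixed? The dashboard already shows v4.0.5 and v4.0.4.

If this is not resolved, I suspect I will not be able to continue upgrading later.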

Can the tables simply be dropped and recreated?

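Dropping and recreating the affected tables, or running TRUNCATE TABLE on them, does resolve the "Storage engine DeltaMerge doesn't support lossy data type modify …" problem.
As for the tiup cluster display issue, my understanding is that it should not be much of an obstacle to later upgrades. You can open a separate thread describing the situation and ask about it there.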

I followed the official documentation step by step, and the cluster fails to start with an error.
Starting the cluster:
[root@7ef24bdea1ab tidb]# tiup cluster start hx-tidb-cluster
Starting component cluster: /root/.tiup/components/cluster/v1.1.1/tiup-cluster start hx-tidb-cluster
Starting cluster hx-tidb-cluster…

  • [ Serial ] - SSHKeySet: privateKey=/root/.tiup/storage/cluster/clusters/hx-tidb-cluster/ssh/id_rsa, publicKey=/root/.tiup/storage/cluster/clusters/hx-tidb-cluster/ssh/id_rsa.pub
  • [Parallel] - UserSSH: user=hxtidb, host=172.17.0.2
  • [Parallel] - UserSSH: user=hxtidb, host=172.17.0.2
  • [Parallel] - UserSSH: user=hxtidb, host=172.17.0.2
  • [Parallel] - UserSSH: user=hxtidb, host=172.17.0.2
  • [Parallel] - UserSSH: user=hxtidb, host=172.17.0.2
  • [Parallel] - UserSSH: user=hxtidb, host=172.17.0.2
  • [Parallel] - UserSSH: user=hxtidb, host=172.17.0.2
  • [Parallel] - UserSSH: user=hxtidb, host=172.17.0.2
  • [ Serial ] - StartCluster
    Starting component pd
    Starting instance pd 172.17.0.2:2379
    Start pd 172.17.0.2:2379 success
    Starting component node_exporter
    Starting instance 172.17.0.2
    Start 172.17.0.2 success
    Starting component blackbox_exporter
    Starting instance 172.17.0.2
    Start 172.17.0.2 success
    Starting component tikv
    Starting instance tikv 172.17.0.2:20162
    Starting instance tikv 172.17.0.2:20160
    Starting instance tikv 172.17.0.2:20161
    Start tikv 172.17.0.2:20160 success
    Start tikv 172.17.0.2:20162 success
    Start tikv 172.17.0.2:20161 success
    Starting component tidb
    Starting instance tidb 172.17.0.2:4008
    Start tidb 172.17.0.2:4008 success
    Starting component tiflash
    Starting instance tiflash 172.17.0.2:9000
    retry error: operation timed out after 2m0s
    tiflash 172.17.0.2:9000 failed to start: timed out waiting for port 9000 to be started after 2m0s, please check the log of the instance

Error: failed to start tiflash: tiflash 172.17.0.2:9000 failed to start: timed out waiting for port 9000 to be started after 2m0s, please check the log of the instance: timed out waiting for port 9000 to be started after 2m0s

Verbose debug logs has been written to /root/tidb/logs/tiup-cluster-debug-2020-09-08-02-37-12.log.
Error: run /root/.tiup/components/cluster/v1.1.1/tiup-cluster (wd:/root/.tiup/data/S9xbJET) failed: exit status 1
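
To make the error message more concrete: "timed out waiting for port 9000 to be started after 2m0s" means the starter polled for a listener on the TiFlash TCP port and gave up after two minutes. The sketch below is only an approximation of that kind of check, not TiUP's actual implementation; the function name `wait_for_port` and its parameters are hypothetical.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 120.0, interval: float = 1.0) -> bool:
    """Return True once a TCP connection to host:port succeeds, False on timeout.

    Hypothetical helper for illustration; the default 120 s budget mirrors "2m0s".
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            # A successful connect means some process is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: nothing is listening yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_for_port("172.17.0.2", 9000)` from a machine that can reach the host would return False in the situation shown above. A False result only says that nothing is listening yet; the reason has to come from the TiFlash instance log, as the error message itself suggests.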

Listing the clusters:
[root@7ef24bdea1ab tidb]# tiup cluster list
Starting component cluster: /root/.tiup/components/cluster/v1.1.1/tiup-cluster list
Name             User    Version  Path                                                   PrivateKey
hx-tidb-cluster  hxtidb  v4.0.5   /root/.tiup/storage/cluster/clusters/hx-tidb-cluster  /root/.tiup/storage/cluster/clusters/hx-tidb-cluster/ssh/id_rsa

YAML configuration file:

global:
  user: "hxtidb"
  deploy_dir: /home/nscoffee/tidb/hxtidb/deploy
  data_dir: /home/nscoffee/tidb/hxtidb/data

monitored:
  node_exporter_port: 9100
  blackbox_exporter_port: 9115

server_configs:
  tidb:
    log.slow-threshold: 300
  tikv:
    readpool.storage.use-unified-pool: false
    readpool.coprocessor.use-unified-pool: true
  pd:
    replication.enable-placement-rules: true
  tiflash:
    logger.level: "info"

pd_servers:
  - host: 172.17.0.2

tidb_servers:
  - host: 172.17.0.2
    port: 4008

tikv_servers:
  - host: 172.17.0.2
    port: 20160
    status_port: 20170

  - host: 172.17.0.2
    port: 20161
    status_port: 20171

  - host: 172.17.0.2
    port: 20162
    status_port: 20172

tiflash_servers:
  - host: 172.17.0.2

monitoring_servers:
  - host: 172.17.0.2

grafana_servers:
  - host: 172.17.0.2
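
Because every component in this topology runs on the single host 172.17.0.2, it can be worth ruling out port collisions before digging further. The sketch below is a minimal check under stated assumptions: the explicit ports are copied from the configuration above, while the TiFlash ports other than 9000 are defaults as I recall them from the TiUP topology reference and should be verified against the documentation for your version. The helper name `find_port_conflicts` is hypothetical, and the list is not exhaustive (PD, Prometheus and Grafana ports are left out).

```python
from collections import defaultdict


def find_port_conflicts(instances):
    """Map each (host, port) claimed more than once to the labels claiming it."""
    claims = defaultdict(list)
    for label, host, port in instances:
        claims[(host, port)].append(label)
    return {key: labels for key, labels in claims.items() if len(labels) > 1}


HOST = "172.17.0.2"

# Ports written explicitly in the topology above.
explicit = [
    ("monitored.node_exporter_port", HOST, 9100),
    ("monitored.blackbox_exporter_port", HOST, 9115),
    ("tidb.port", HOST, 4008),
    ("tikv-0.port", HOST, 20160),
    ("tikv-0.status_port", HOST, 20170),
    ("tikv-1.port", HOST, 20161),
    ("tikv-1.status_port", HOST, 20171),
    ("tikv-2.port", HOST, 20162),
    ("tikv-2.status_port", HOST, 20172),
]

# ASSUMPTION: TiFlash defaults used when tiflash_servers sets no ports.
# Only 9000 is confirmed by the log in this thread; verify the rest.
assumed_tiflash_defaults = [
    ("tiflash.tcp_port", HOST, 9000),
    ("tiflash.http_port", HOST, 8123),
    ("tiflash.flash_service_port", HOST, 3930),
    ("tiflash.flash_proxy_port", HOST, 20170),
    ("tiflash.flash_proxy_status_port", HOST, 20292),
    ("tiflash.metrics_port", HOST, 8234),
]

conflicts = find_port_conflicts(explicit + assumed_tiflash_defaults)
for (host, port), labels in sorted(conflicts.items()):
    print(f"{host}:{port} is claimed by {', '.join(labels)}")
```

If the assumed defaults are right, this reports 20170 as claimed by both the first TiKV status_port and the TiFlash flash_proxy_port. Whether that has anything to do with the timeout in this thread is not established here; the TiFlash instance log remains the place to confirm the actual cause.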

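Problems like this one, where TiFlash fails to start after upgrading to v4.0.4/v4.0.5, have been fixed in v4.0.6. They can be resolved by upgrading the cluster to v4.0.6 or a later version.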
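
As a small illustration of the version rule in this reply, the sketch below checks whether a version string, such as the one shown in the Version column of tiup cluster list, is at or past v4.0.6. The function names are hypothetical, and the parser only handles plain vMAJOR.MINOR.PATCH tags; pre-release tags are rejected rather than guessed at.

```python
import re


def parse_version(tag: str) -> tuple:
    """Turn a tag such as 'v4.0.5' into (4, 0, 5) for numeric comparison."""
    match = re.fullmatch(r"v?(\d+)\.(\d+)\.(\d+)", tag.strip())
    if match is None:
        raise ValueError(f"unrecognised version tag: {tag!r}")
    return tuple(int(part) for part in match.groups())


def has_tiflash_startup_fix(cluster_version: str, fixed_in: str = "v4.0.6") -> bool:
    """True if cluster_version is at or after the release named in the reply."""
    # Tuples compare element by element, so v4.0.10 correctly sorts after v4.0.6.
    return parse_version(cluster_version) >= parse_version(fixed_in)


for version in ("v4.0.4", "v4.0.5", "v4.0.6"):
    state = "includes" if has_tiflash_startup_fix(version) else "predates"
    print(f"{version} {state} the fix")
```

For the hx-tidb-cluster listed earlier at v4.0.5 this returns False, which matches the advice to upgrade. The upgrade itself would normally be run with a command of the form `tiup cluster upgrade <cluster-name> v4.0.6`.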