Scaling out TiFlash with tiup fails

After deploying pd, tidb, and tikv on a single node with tiup cluster, I tried to scale out one tiflash instance and got this error:
2020-11-10T19:18:16.720+0800 DEBUG TaskFinish {"task": "StartCluster", "error": "failed to start tiflash: \ttiflash 172.31.6.18:9543 failed to start: timed out waiting for port 9543 to be started after 2m0s, please check the log of the instance: timed out waiting for port 9543 to be started after 2m0s", "errorVerbose": "timed out waiting for port 9543 to be started after 2m0s\ngithub.com/pingcap/tiup/pkg/cluster/module.(*WaitFor).Execute\ \tgithub.com/pingcap/tiup@/pkg/cluster/module/wait_for.go:90\ github.com/pingcap/tiup/pkg/cluster/spec.PortStarted\ \tgithub.com/pingcap/tiup@/pkg/cluster/spec/instance.go:101\ github.com/pingcap/tiup/pkg/cluster/spec.(*BaseInstance).Ready\ \tgithub.com/pingcap/tiup@/pkg/cluster/spec/instance.go:132\ github.com/pingcap/tiup/pkg/cluster/operation.startInstance\ \tgithub.com/pingcap/tiup@/pkg/cluster/operation/action.go:564\ github.com/pingcap/tiup/pkg/cluster/operation.StartComponent.func1\ \tgithub.com/pingcap/tiup@/pkg/cluster/operation/action.go:679\ golang.org/x/sync/errgroup.(*Group).Go.func1\ \tgolang.org/x/sync@v0.0.0-20190911185100-cd5d95a43a6e/errgroup/errgroup.go:57\ runtime.goexit\ \truntime/asm_amd64.s:1357\ \ttiflash 172.31.6.18:9543 failed to start: timed out waiting for port 9543 to be started after 2m0s, please check the log of the instance\ failed to start tiflash"}
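
The error says that tiup waited 2 minutes for port 9543 to open and never saw it. One way to confirm this on the host is to look for a listener on that port. The helper below is only a sketch: `port_listening` is a hypothetical name, and it assumes the usual column layout of `ss -ltn` (local address and port in the fourth column).

```shell
# Succeeds (exit 0) if the `ss -ltn` output on stdin shows a listener on the given port.
# Assumes the standard `ss -ltn` layout: a header line, then "Local Address:Port" in column 4.
port_listening() {
    awk -v port="$1" '
        NR > 1 {
            n = split($4, parts, ":")      # the port is the last ":"-separated field
            if (parts[n] == port) found = 1
        }
        END { exit !found }
    '
}

# Usage on the TiFlash host:
#   ss -ltn | port_listening 9543 && echo "listening" || echo "not listening"
```

If nothing is listening, the process most likely exited during startup, so the next step is to find out why it did not start.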

The output of ./pd-ctl -u http://172.31.6.18:2379 config show is as follows:

{
  "replication": {
    "enable-placement-rules": "true",
    "location-labels": "",
    "max-replicas": 3,
    "strictly-match-label": "false"
  },
  "schedule": {
    "enable-cross-table-merge": "false",
    "enable-debug-metrics": "false",
    "enable-location-replacement": "true",
    "enable-make-up-replica": "true",
    "enable-one-way-merge": "false",
    "enable-remove-down-replica": "true",
    "enable-remove-extra-replica": "true",
    "enable-replace-offline-replica": "true",
    "high-space-ratio": 0.7,
    "hot-region-cache-hits-threshold": 3,
    "hot-region-schedule-limit": 4,
    "leader-schedule-limit": 4,
    "leader-schedule-policy": "count",
    "low-space-ratio": 0.8,
    "max-merge-region-keys": 200000,
    "max-merge-region-size": 20,
    "max-pending-peer-count": 16,
    "max-snapshot-count": 3,
    "max-store-down-time": "30m0s",
    "merge-schedule-limit": 8,
    "patrol-region-interval": "100ms",
    "region-schedule-limit": 2048,
    "replica-schedule-limit": 64,
    "scheduler-max-waiting-operator": 5,
    "split-merge-interval": "1h0m0s",
    "store-limit-mode": "manual",
    "tolerant-size-ratio": 0
  }
}

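Please check the store output as well. It looks like the scale-out did not succeed. Could you package and upload the edit config output, the scale-out topology file, and the files under the tiflash log directory?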

Hello,
Thank you for your professional feedback.

Please provide all the files under log here.

Hi~ In the tiflash deploy directory /data/tidb/tidb-deploy/tiflash-9543, please run bin/tiflash/tiflash server --config-file conf/tiflash.toml as the root user and see what the error output is.

Running bin/tiflash/tiflash server --config-file conf/tiflash.toml gives the following error:
bin/tiflash/tiflash: error while loading shared libraries: libtiflash_proxy.so: cannot open shared object file: No such file or directory

What does "Please provide all the files under log here" mean?
There are no logs at all in the /data/tidb/tidb-deploy/tiflash-9543/log folder.

Following lucklove's suggestion, I ran bin/tiflash/tiflash server --config-file conf/tiflash.toml and got the following error:
bin/tiflash/tiflash: error while loading shared libraries: libtiflash_proxy.so: cannot open shared object file: No such file or directory
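
The loader stops at the first shared library it cannot resolve, so one run shows only one missing library. `ldd` lists every unresolved dependency at once. The filter below is a minimal sketch: `missing_libs` is a hypothetical helper name, and it assumes the usual `name => not found` format of `ldd` output.

```shell
# Print the names of shared libraries that the dynamic loader cannot resolve.
# Reads `ldd` output on stdin; unresolved entries look like "libfoo.so => not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Usage, from the TiFlash deploy directory mentioned in this thread:
#   ldd bin/tiflash/tiflash | missing_libs
```

Note that a library can show up as missing here only because its directory is not on the search path in the current shell.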

Then what happens if you run sh scripts/run_tiflash.sh directly?

sh scripts/run_tiflash.sh gives the following error:

sync ... 0.00user 0.17system 0:01.76elapsed 9%CPU (0avgtext+0avgdata 1764maxresident)k
0inputs+0outputs (0major+70minor)pagefaults 0swaps
ok

bin/tiflash/tiflash: error while loading shared libraries: libcurl.so.4: cannot open shared object file: No such file or directory

For the dynamic library issue, see:
https://docs.pingcap.com/zh/tidb/stable/maintain-tiflash#查看-tiflash-版本

Please share the files under log/ so we can take a look.
Also, could you show the remaining memory and disk space on this server?

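It was indeed caused by the missing libcurl.so.4. After installing libcurl4-openssl-dev and then adding the directory that contains the dynamic library libtiflash_proxy.so to the LD_LIBRARY_PATH environment variable, it works.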

In theory, installing libcurl4-openssl-dev alone should be enough. The startup script adds libtiflash_proxy.so to the LD search path.
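
For running the binary by hand, outside the startup script, the two steps can be sketched as below. This is an illustration under assumptions: `prepend_ld_path` is a hypothetical helper, the `apt-get` line assumes a Debian/Ubuntu host (which the package name suggests), and the library directory is assumed to be bin/tiflash under the deploy directory, which the thread does not confirm.

```shell
# Prepend a directory to LD_LIBRARY_PATH without leaving a stray ':' when it was empty.
# (An empty entry in LD_LIBRARY_PATH is treated as the current directory.)
prepend_ld_path() {
    dir=$1
    if [ -n "${LD_LIBRARY_PATH:-}" ]; then
        LD_LIBRARY_PATH="$dir:$LD_LIBRARY_PATH"
    else
        LD_LIBRARY_PATH="$dir"
    fi
    export LD_LIBRARY_PATH
}

# Usage (library directory is an assumption; adjust to where libtiflash_proxy.so actually lives):
#   sudo apt-get install -y libcurl4-openssl-dev
#   cd /data/tidb/tidb-deploy/tiflash-9543
#   prepend_ld_path "$PWD/bin/tiflash"
#   bin/tiflash/tiflash server --config-file conf/tiflash.toml
```

When the instance is started through tiup, only the libcurl installation should be needed, as noted above.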

Is there still a problem now?

Sorry for the late reply. It's working now, thanks for the help~

:+1: