【TiDB environment】 Test environment
【TiDB version】 v6.1.0
【Problem】 Deployed with tiup; tiflash fails to start
【Steps to reproduce】 tiup cluster start
【Symptoms and impact】
Please post tiflash_error.log so we can take a look.
Also check the ports and the firewall.
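For the port check, here is a minimal sketch. It assumes `ss` from iproute2 is available on the host; `is_listening` is a made-up helper name, not part of tiup or TiFlash. It reads `ss -ltn` output on stdin, so it also works on output saved from another machine.

```shell
# Sketch: succeed if some socket is listening on the given TCP port.
# Expects `ss -ltn` style output on stdin (header line first).
# is_listening is a hypothetical helper name used only for illustration.
is_listening() {
    port="$1"
    # Column 4 of `ss -ltn` is "Local Address:Port"; match a trailing ":<port>".
    awk -v p=":$port" 'NR > 1 && $4 ~ (p "$") { found = 1 } END { exit !found }'
}
```

Usage would be something like `ss -ltn | is_listening 9000 && echo "9000 is listening" || echo "nothing on 9000"`. On CentOS 7 with firewalld, `firewall-cmd --list-ports` shows which ports are open. Note that the tiup error above only says port 9000 never came up within 2 minutes, which also happens when the process crashes before it binds the port.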
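The files generated under /tidb-deploy/tiflash-9000 kept growing rapidly and filled up my 100 GB disk, so I deleted everything in that folder. Now nothing gets written there after startup. What exactly are the files in this folder, and how do I set things up so that logs are written there again? tiflash_error.log can no longer be found.
Creating an empty tiflash_error.log in the original directory should be enough.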
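A rough sketch of that suggestion, assuming the log directory is the one named in the tiup message (/tidb-deploy/tiflash-9000/log). `ensure_error_log` is a hypothetical helper name. Which user should own the file depends on how the cluster was deployed and is not confirmed in this thread.

```shell
# Sketch: recreate an empty error log inside a TiFlash log directory.
# ensure_error_log is a hypothetical helper; the directory is passed in.
ensure_error_log() {
    log_dir="$1"
    # Recreate the directory too, in case it was deleted along with the files.
    mkdir -p "$log_dir" || return 1
    # touch creates the file if it is missing and leaves existing content alone.
    touch "$log_dir/tiflash_error.log"
}
```

Run it as the deployment user, for example `ensure_error_log /tidb-deploy/tiflash-9000/log`. This only restores the log file; anything else that was deleted from the deploy directory is not brought back by it.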
d6634ee9e75d26", “func”: “github.com/pingcap/tiup/pkg/cluster/executor.(*CheckPointExecutor).Execute”, “hit”: false}
2022-08-22T09:58:19.649+0800 DEBUG retry error {“error”: “operation timed out after 2m0s”}
2022-08-22T09:58:19.649+0800 DEBUG TaskFinish {“task”: “StartCluster”, “error”: “failed to start tiflash: failed to start: 182.92.101.109 tiflash-9000.service, please check the instance’s log(/tidb-deploy/tiflash-9000/log) for more detail.: timed out waiting for port 9000 to be started after 2m0s”, “errorVerbose”: “timed out waiting for port 9000 to be started after 2m0s\ngithub.com/pingcap/tiup/pkg/cluster/module.(*WaitFor).Execute\
\tgithub.com/pingcap/tiup/pkg/cluster/module/wait_for.go:91\
github.com/pingcap/tiup/pkg/cluster/spec.PortStarted\
\tgithub.com/pingcap/tiup/pkg/cluster/spec/instance.go:116\
github.com/pingcap/tiup/pkg/cluster/spec.(*TiFlashInstance).Ready\
\tgithub.com/pingcap/tiup/pkg/cluster/spec/tiflash.go:803\
github.com/pingcap/tiup/pkg/cluster/operation.startInstance\
\tgithub.com/pingcap/tiup/pkg/cluster/operation/action.go:404\
github.com/pingcap/tiup/pkg/cluster/operation.StartComponent.func1\
\tgithub.com/pingcap/tiup/pkg/cluster/operation/action.go:533\
golang.org/x/sync/errgroup.(*Group).Go.func1\
\tgolang.org/x/sync@v0.0.0-20220513210516-0976fa681c29/errgroup/errgroup.go:74\
runtime.goexit\
\truntime/asm_amd64.s:1571\
failed to start: 182.92.101.109 tiflash-9000.service, please check the instance’s log(/tidb-deploy/tiflash-9000/log) for more detail.\
failed to start tiflash”}
2022-08-22T09:58:19.649+0800 INFO Execute command finished {“code”: 1, “error”: “failed to start tiflash: failed to start: 182.92.101.109 tiflash-9000.service, please check the instance’s log(/tidb-deploy/tiflash-9000/log) for more detail.: timed out waiting for port 9000 to be started after 2m0s”, “errorVerbose”: “timed out waiting for port 9000 to be started after 2m0s\ngithub.com/pingcap/tiup/pkg/cluster/module.(*WaitFor).Execute\
\tgithub.com/pingcap/tiup/pkg/cluster/module/wait_for.go:91\
github.com/pingcap/tiup/pkg/cluster/spec.PortStarted\
\tgithub.com/pingcap/tiup/pkg/cluster/spec/instance.go:116\
github.com/pingcap/tiup/pkg/cluster/spec.(*TiFlashInstance).Ready\
\tgithub.com/pingcap/tiup/pkg/cluster/spec/tiflash.go:803\
github.com/pingcap/tiup/pkg/cluster/operation.startInstance\
\tgithub.com/pingcap/tiup/pkg/cluster/operation/action.go:404\
github.com/pingcap/tiup/pkg/cluster/operation.StartComponent.func1\
\tgithub.com/pingcap/tiup/pkg/cluster/operation/action.go:533\
golang.org/x/sync/errgroup.(*Group).Go.func1\
\tgolang.org/x/sync@v0.0.0-20220513210516-0976fa681c29/errgroup/errgroup.go:74\
runtime.goexit\
\truntime/asm_amd64.s:1571\
failed to start: 182.92.101.109 tiflash-9000.service, please check the instance’s log(/tidb-deploy/tiflash-9000/log) for more detail.\
failed to start tiflash”}
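I deleted all the files under /tidb-deploy/tiflash-9000, and now the tiflash node cannot be removed. How can I fix this?
Judging from the error messages, TiFlash crashed during startup: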
[2022/08/22 14:53:58.426 +08:00] [ERROR] [BaseDaemon.cpp:420] ["BaseDaemon:Attempted access has violated the permissions assigned to the memory area."] [thread_id=5]
[2022/08/22 14:53:59.947 +08:00] [ERROR] [BaseDaemon.cpp:570] ["BaseDaemon:\
0x1ed2661\tfaultSignalHandler(int, siginfo_t*, void*) [tiflash+32319073]\
\tlibs/libdaemon/src/BaseDaemon.cpp:221\
0x7f2cbae9f5d0\t<unknown symbol> [libpthread.so.0+62928]\
0x85d11e0\tgrpc_server_request_registered_call [tiflash+140317152]\
\tcontrib/grpc/src/core/lib/surface/server.cc:0\
0x855fbb6\tgrpc::ServerInterface::RegisteredAsyncRequest::IssueRequest(void*, grpc_byte_buffer**, grpc_impl::ServerCompletionQueue*) [tiflash+139852726]\
\tcontrib/grpc/src/cpp/server/server_cc.cc:209\
0x7ac6f6f\tgrpc::ServerInterface::PayloadAsyncRequest<mpp::EstablishMPPConnectionRequest>::PayloadAsyncRequest(grpc::internal::RpcServiceMethod*, grpc::ServerInterface*, grpc_impl::ServerContext*, grpc::internal::ServerAsyncStreamingInterface*, grpc_impl::CompletionQueue*, grpc_impl::ServerCompletionQueue*, void*, mpp::EstablishMPPConnectionRequest*) [tiflash+128741231]\
\tcontrib/grpc/include/grpcpp/impl/codegen/server_interface.h:270\
0x7ac5a40\tDB::EstablishCallData::EstablishCallData(DB::AsyncFlashService*, grpc_impl::ServerCompletionQueue*, grpc_impl::ServerCompletionQueue*, std::__1::shared_ptr<std::__1::atomic<bool> > const&) [tiflash+128735808]\
\tdbms/src/Flash/EstablishCall.cpp:34\
0x7ac5d3b\tDB::EstablishCallData::spawn(DB::AsyncFlashService*, grpc_impl::ServerCompletionQueue*, grpc_impl::ServerCompletionQueue*, std::__1::shared_ptr<std::__1::atomic<bool> > const&) [tiflash+128736571]\
\tdbms/src/Flash/EstablishCall.cpp:44\
0x1d638c5\tDB::Server::FlashGrpcServerHolder::FlashGrpcServerHolder(DB::Server&, DB::TiFlashRaftConfig const&, Poco::Logger*) [tiflash+30816453]\
\tdbms/src/Server/Server.cpp:643\
0x1d5ab9e\tDB::Server::main(std::__1::vector<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::allocator<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > > const&) [tiflash+30780318]\
\tdbms/src/Server/Server.cpp:1401\
0x7fe644a\tPoco::Util::Application::run() [tiflash+134112330]\
\tcontrib/poco/Util/src/Application.cpp:335\
0x7ff5c0c\tPoco::Util::ServerApplication::run(int, char**) [tiflash+134175756]\
\tcontrib/poco/Util/src/ServerApplication.cpp:618\
0x1d5e4ad\tmainEntryClickHouseServer(int, char**) [tiflash+30794925]\
\tdbms/src/Server/Server.cpp:1549\
0x1d1061e\tmain [tiflash+30475806]\
\tdbms/src/Server/main.cpp:167\
0x7f2cba8cf495\t__libc_start_main [libc.so.6+140437]"] [thread_id=5]
Could you share your hardware architecture and operating system information?
Distributor ID: CentOS
Description: CentOS Linux release 7.6.1810 (Core)
Release: 7.6.1810
Codename: Core
Architecture: x86-64
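The machine has 4 CPU cores and 8 GB of memory. Is that too low? What is the minimum configuration for a test environment?
Another question: when tiflash fails to start, why does it keep writing files into its deploy directory? It fills my 100 GB disk in no time.
As a temporary workaround, you can stop TiFlash from generating core dump files.
TiFlash crashes every time it starts, and each crash produces a core dump file. You can search for how to disable core dumps globally on CentOS, and then your disk will no longer fill up.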
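To confirm that core dumps are what is eating the disk, a small sketch like the following can list the biggest files under the deploy directory. `largest_files` is a made-up helper name, and it assumes the usual `find`, `du`, `sort`, and `head` tools.

```shell
# Sketch: print the N largest regular files under a directory (sizes in KiB).
# largest_files is a hypothetical helper name; N defaults to 5.
largest_files() {
    dir="$1"
    count="${2:-5}"
    # du -k prints "<size-in-KiB> <path>"; sort numerically, biggest first.
    find "$dir" -type f -exec du -k {} + | sort -rn | head -n "$count"
}
```

For example, `largest_files /tidb-deploy/tiflash-9000 10`. Where core files land and what they are called depends on the kernel setting shown by `cat /proc/sys/kernel/core_pattern`, so the names on your machine may differ from the plain `core.<pid>` form.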
Does tiflash have a parameter similar to abort-on-panic? I don't see one in the official docs:
https://docs.pingcap.com/zh/tidb/stable/tikv-configuration-file#abort-on-panic
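tiflash automatically generates a core dump when it crashes. Is there currently no parameter to turn that off from the tiflash side?
Whether a core dump is generated is mainly controlled by the operating system (ulimit). TiFlash always aborts on panic.
So how do I set that parameter?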
ulimit -c 0
For details, search for ulimit / core dump.
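A small sketch of what `ulimit -c 0` does, in case it helps. `core_limit_after_disable` is a made-up helper name. It applies the limit inside a subshell and prints the value in effect there, so the shell you run it from keeps its own setting.

```shell
# Sketch: print the core-file size limit after disabling core dumps.
# Runs in a subshell, so the calling shell's own limit is not changed.
# core_limit_after_disable is a hypothetical helper name.
core_limit_after_disable() {
    (
        ulimit -c 0    # 0 means: do not write core files
        ulimit -c      # print the soft limit now in effect
    )
}
```

`ulimit -c` on its own prints the current limit. One caveat: ulimit only affects the current shell and the processes started from it. tiup runs TiFlash as a systemd unit (tiflash-9000.service in the log above), so a limit set in an interactive shell probably does not reach it. On a standard systemd setup, the equivalent would be `LimitCORE=0` in the `[Service]` section of the unit, followed by a daemon reload. That is an assumption about this deployment and has not been verified in this thread.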
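One more question: is there any plan for tiflash to add a parameter that controls abort on panic? Like tikv, so that on top of the OS-level control, whether a core dump is produced could also be controlled at the tiflash layer.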