TiDB v6.6: TiKV goes from Disconnected to Down while running

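【TiDB environment】Test / PoC
【TiDB version】6.6
【Steps to reproduce】Normal use
【Problem encountered】TiKV went from Disconnected to Down and cannot be started
【Resource configuration】
【Attachments: screenshots / logs / monitoring】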

[2023/03/29 11:28:46.813 +08:00] [FATAL] [lib.rs:497] [“called Result::unwrap() on an Err value: Other(Os { code: 2, kind: NotFound, message: "No such file or directory" })”] [backtrace=" 0: tikv_util::set_panic_hook::{{closure}}\n at /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/tikv_util/src/lib.rs:496:18\n 1: <alloc::boxed::Box<F,A> as core::ops::function::Fn>::call\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:2032:9\n std::panicking::rust_panic_with_hook\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:692:13\n 2: std::panicking::begin_panic_handler::{{closure}}\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:579:13\n 3: std::sys_common::backtrace::__rust_end_short_backtrace\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:137:18\n 4: rust_begin_unwind\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:575:5\n 5: core::panicking::panic_fmt\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panicking.rs:65:14\n 6: core::result::unwrap_failed\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:1791:5\n 7: core::result::Result<T,E>::unwrap\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/result.rs:1113:23\n server::server::TikvServer::init_storage_stats_task::{{closure}}\n at /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/server/src/server.rs:1563:33\n tikv_util::worker::pool::Worker::spawn_interval_task::{{closure}}\n at 
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/tikv_util/src/worker/pool.rs:384:17\n <core::future::from_generator::GenFuture as core::future::future::Future>::poll\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/future/mod.rs:91:19\n yatp::task::future::RawTask::poll\n at /rust/git/checkouts/yatp-e704b73c3ee279b6/bcf431a/src/task/future.rs:59:9\n 8: yatp::task::future::TaskCell::poll\n at /rust/git/checkouts/yatp-e704b73c3ee279b6/bcf431a/src/task/future.rs:103:9\n <yatp::task::future::Runner as yatp::pool::runner::Runner>::handle\n at /rust/git/checkouts/yatp-e704b73c3ee279b6/bcf431a/src/task/future.rs:387:20\n 9: <tikv_util::yatp_pool::YatpPoolRunner as yatp::pool::runner::Runner>::handle\n at /home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tikv/components/tikv_util/src/yatp_pool/mod.rs:122:24\n yatp::pool::worker::WorkerThread<T,R>::run\n at /rust/git/checkouts/yatp-e704b73c3ee279b6/bcf431a/src/pool/worker.rs:48:13\n yatp::pool::builder::LazyBuilder::build::{{closure}}\n at /rust/git/checkouts/yatp-e704b73c3ee279b6/bcf431a/src/pool/builder.rs:114:25\n std::sys_common::backtrace::rust_begin_short_backtrace\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:121:18\n 10: std::thread::Builder::spawn_unchecked::{{closure}}::{{closure}}\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/mod.rs:551:17\n <core::panic::unwind_safe::AssertUnwindSafe as core::ops::function::FnOnce<()>>::call_once\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/panic/unwind_safe.rs:271:9\n std::panicking::try::do_call\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:483:40\n std::panicking::try\n at 
/rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:447:19\n std::panic::catch_unwind\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:137:14\n std::thread::Builder::spawn_unchecked::{{closure}}\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/mod.rs:550:30\n core::ops::function::FnOnce::call_once{{vtable.shim}}\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:513:5\n 11: <alloc::boxed::Box<F,A> as core::ops::function::FnOnce>::call_once\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:2000:9\n <alloc::boxed::Box<F,A> as core::ops::function::FnOnce>::call_once\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/alloc/src/boxed.rs:2000:9\n std::sys::unix::thread::thread::new::thread_start\n at /rust/toolchains/nightly-2022-11-15-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys/unix/thread.rs:108:17\n 12: start_thread\n 13: __clone\n"] [location=components/server/src/server.rs:1563] [thread_name=background-0]
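The fields that matter most in a panic entry like the one above are the trailing `location` and `thread_name`, which here point at `components/server/src/server.rs:1563` on thread `background-0`, inside `init_storage_stats_task`. As a rough sketch, they can be pulled out of a TiKV log file with `sed`. The function name `summarize_fatal` and the log path are hypothetical, and the sketch assumes each log entry sits on a single line, as it does in the original log file rather than in the wrapped paste above.

```shell
#!/bin/sh
# Print "timestamp location=... thread=..." for every FATAL entry in a log file.
# Assumes one log entry per line in the bracketed-field format shown above.
summarize_fatal() {
    sed -n 's/^\[\([^]]*\)] \[FATAL].*\[location=\([^]]*\)] \[thread_name=\([^]]*\)].*/\1 location=\2 thread=\3/p' "$1"
}

# Example call (hypothetical path):
# summarize_fatal /path/to/tikv.log
```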

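Why is the storage directory under /tmp? Check whether that directory has been cleaned up automatically. By default the system cleans /tmp on a schedule, so putting data there is hard to understand.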


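The storage directory still exists, but some files in it have indeed disappeared. /tmp is mounted on its own separate SSD. Is the /tmp name itself what causes the problem?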

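Nobody sets it up this way. Change the mount point to a different directory. The lost files cannot be recovered, so if the data is not important, reinitialize the store under another path. The cleanup rules for /tmp differ from one distribution to another. This is basic knowledge worth learning first.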
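To gauge how exposed a data directory is: age-based cleaners such as tmpwatch and systemd-tmpfiles remove files that have not been touched for a configured number of days, and cold data files may well be that old. The sketch below lists files whose access and modification times are both older than a threshold, which only approximates what a real cleaner checks. The function name `stale_files`, the example path, and the 10-day default are assumptions for illustration, not values taken from this cluster.

```shell
#!/bin/sh
# List regular files under DIR whose mtime and atime are both older than AGE_DAYS.
# -xdev keeps find on the same filesystem as DIR.
stale_files() {
    dir="$1"
    age_days="${2:-10}"   # assumed default; use the age from your cleanup rule
    find "$dir" -xdev -type f -mtime "+$age_days" -atime "+$age_days"
}

# Example call (hypothetical data directory):
# stale_files /tmp/tikv-data 10
```

Any file this prints would be a candidate for deletion under a rule with the same age, whatever process owns it.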

Run cat /etc/cron.daily/tmpwatch and see whether a cleanup rule is configured there.
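The tmpwatch cron job is only one place a rule can live. On systemd-based distributions the cleanup is usually driven by tmpfiles.d entries instead. The following sketch checks both; the function name `list_tmp_cleanup_rules` and its optional root-prefix argument are assumptions added for illustration, and the exact files present depend on the distribution.

```shell
#!/bin/sh
# Print /tmp cleanup rules found in the usual locations.
# The optional first argument is a root prefix (default: empty, i.e. the real system).
list_tmp_cleanup_rules() {
    root="${1:-}"
    # Older RHEL/CentOS style: a daily cron job that calls tmpwatch.
    if [ -f "$root/etc/cron.daily/tmpwatch" ]; then
        echo "cron: $root/etc/cron.daily/tmpwatch"
    fi
    # systemd style: tmpfiles.d entries whose path field is exactly /tmp.
    for dir in "$root/etc/tmpfiles.d" "$root/run/tmpfiles.d" "$root/usr/lib/tmpfiles.d"; do
        [ -d "$dir" ] || continue
        for f in "$dir"/*.conf; do
            [ -f "$f" ] || continue
            awk -v file="$f" '$1 !~ /^#/ && $2 == "/tmp" { print "tmpfiles: " file ": " $0 }' "$f"
        done
    done
}

list_tmp_cleanup_rules
```

If a tmpfiles.d line is printed, its last field is the age after which untouched files are removed. `systemctl status systemd-tmpfiles-clean.timer` shows whether the periodic cleanup is active.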