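TiKV crashed unexpectedly

To help us work efficiently, please provide the following information; a clearly described problem gets resolved faster:
【TiDB environment】 Production
【Overview】
【Background】
【Symptoms】
【Business impact】 6 TiKV instances crashed at the same time
【TiDB version】 3.0.3
【Attachments】

Log output around the crash: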
[2021/12/28 19:25:36.136 +08:00] [ERROR] [endpoint.rs:454] [error-response] [err="locked primary_lock: 7480000000000014FF5F728000000001E68AE3 lock_version: 430097232318693664 key: 7480000000000014FF5F728000000001E68AE3 lock_ttl: 3001 txn_size: 1"]
[2021/12/28 19:25:42.119 +08:00] [FATAL] [lib.rs:499] ["index out of bounds: the len is 6 but the index is 6"] [backtrace="stack backtrace:
0: 0x55df826cd51d - backtrace::backtrace::libunwind::trace::h0500f4f2825a5d17
at /rust/registry/src/github.com-1ecc6299db9ec823/backtrace-0.2.3/src/backtrace/libunwind.rs:54
- backtrace::backtrace::trace::h4187244de1605a06
at /rust/registry/src/github.com-1ecc6299db9ec823/backtrace-0.2.3/src/backtrace/mod.rs:70
1: 0x55df826c1cd0 - tikv_util::set_panic_hook::{{closure}}::h195100b0bbd49cfb
at /home/jenkins/.target/release/build/backtrace-e20a32a05fd0b8fe/out/capture.rs:79
2: 0x55df8286664f - std::panicking::rust_panic_with_hook::h8d2408723e9a2bd4
at src/libstd/panicking.rs:479
3: 0x55df8286642d - std::panicking::continue_panic_fmt::hb2aaa9386c4e5e80
at src/libstd/panicking.rs:382
4: 0x55df82875aa5 - rust_begin_unwind
at src/libstd/panicking.rs:309
5: 0x55df8288064b - core::panicking::panic_fmt::h79e840586f23493b
at src/libcore/panicking.rs:85
6: 0x55df82880153 - core::panicking::panic_bounds_check::h9293ee7846bbb139
at src/libcore/panicking.rs:61
7: 0x55df826c9dfe - <usize as core::slice::SliceIndex<[T]>>::index_mut::h3f8e40c7987fdba6
at /rustc/0e4a56b4b04ea98bb16caada30cb2418dd06e250/src/libcore/slice/mod.rs:2700
- core::slice::<impl core::ops::index::IndexMut for [T]>::index_mut::h48f86041f006f396
at /rustc/0e4a56b4b04ea98bb16caada30cb2418dd06e250/src/libcore/slice/mod.rs:2561
- <alloc::vec::Vec as core::ops::index::IndexMut>::index_mut::hf0fc8b65337841f2
at /rustc/0e4a56b4b04ea98bb16caada30cb2418dd06e250/src/liballoc/vec.rs:1768
- tokio_timer::wheel::Wheel::insert::ha8cb74cd0d278df8
at /rust/registry/src/github.com-1ecc6299db9ec823/tokio-timer-0.2.8/src/wheel/mod.rs:116
- tokio_timer::timer::Timer<T,N>::add_entry::h5a58797332f8d8e6
at /rust/registry/src/github.com-1ecc6299db9ec823/tokio-timer-0.2.8/src/timer/mod.rs:321
8: 0x55df826c8ebf - tokio_timer::timer::Timer<T,N>::process_queue::hb7228da9f479e6ea
at /rust/registry/src/github.com-1ecc6299db9ec823/tokio-timer-0.2.8/src/timer/mod.rs:0
9: 0x55df826c860b - <tokio_timer::timer::Timer<T,N> as tokio_executor::park::Park>::park::ha63fa46e376ecc4d
at /rust/registry/src/github.com-1ecc6299db9ec823/tokio-timer-0.2.8/src/timer/mod.rs:357
- tokio_timer::timer::Timer<T,N>::turn::hf4b082b1a5da8166
at /rust/registry/src/github.com-1ecc6299db9ec823/tokio-timer-0.2.8/src/timer/mod.rs:252
10: 0x55df826c7cb3 - tikv_util::timer::start_global_timer::{{closure}}::h611afd0bd89fe34b
at components/tikv_util/src/timer.rs:94
11: 0x55df826c7985 - std::sys_common::backtrace::__rust_begin_short_backtrace::h2e738e4c2425f8b4
at /rustc/0e4a56b4b04ea98bb16caada30cb2418dd06e250/src/libstd/sys_common/backtrace.rs:77
12: 0x55df826c7975 - std::thread::Builder::spawn_unchecked::{{closure}}::{{closure}}::h28410926e23bd20a
at /rustc/0e4a56b4b04ea98bb16caada30cb2418dd06e250/src/libstd/thread/mod.rs:470
13: 0x55df826c7965 - <std::panic::AssertUnwindSafe as core::ops::function::FnOnce<()>>::call_once::h776c18757b5aa5cc
at /rustc/0e4a56b4b04ea98bb16caada30cb2418dd06e250/src/libstd/panic.rs:309
14: 0x55df826c7958 - std::panicking::try::do_call::h0688d0354df20dde
at /rustc/0e4a56b4b04ea98bb16caada30cb2418dd06e250/src/libstd/panicking.rs:294
- std::panicking::try::hd581c077b089dd61
at /rustc/0e4a56b4b04ea98bb16caada30cb2418dd06e250//src/libpanic_abort/lib.rs:29
- std::panic::catch_unwind::h4cd22e498b45e284
at /rustc/0e4a56b4b04ea98bb16caada30cb2418dd06e250/src/libstd/panic.rs:388
- std::thread::Builder::spawn_unchecked::{{closure}}::had4d02192eec1fae
at /rustc/0e4a56b4b04ea98bb16caada30cb2418dd06e250/src/libstd/thread/mod.rs:469
- core::ops::function::FnOnce::call_once{{vtable.shim}}::h3b094705c62eb5fd
at /rustc/0e4a56b4b04ea98bb16caada30cb2418dd06e250/src/libcore/ops/function.rs:231
15: 0x55df82874aae - <alloc::boxed::Box as core::ops::function::FnOnce>::call_once::he71721d2d956d451
at /rustc/0e4a56b4b04ea98bb16caada30cb2418dd06e250/src/liballoc/boxed.rs:746
16: 0x55df82876ddb - <alloc::boxed::Box as core::ops::function::FnOnce>::call_once::he520045b8d28ce5c
at /rustc/0e4a56b4b04ea98bb16caada30cb2418dd06e250/src/liballoc/boxed.rs:746
- std::sys_common::thread::start_thread::h2e98d1272dc6d74b
at src/libstd/sys_common/thread.rs:13
- std::sys::unix::thread::thread::new::thread_start::h18485805666ccd3c
at src/libstd/sys/unix/thread.rs:79
17: 0x7ff5b876edd4 - start_thread
18: 0x7ff5b7e75eac - __clone
19: 0x0 - "] [location=/rustc/0e4a56b4b04ea98bb16caada30cb2418dd06e250/src/libcore/slice/mod.rs:2700] [thread_name=timer]
[2021/12/28 19:26:06.314 +08:00] [INFO] [mod.rs:26] ["Welcome to TiKV."]
[2021/12/28 19:26:06.315 +08:00] [INFO] [mod.rs:28] []

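This version is quite old, and I am going only by what the log shows, so my diagnosis may not be accurate.
All I can see is a lock conflict. 3.0.3 should only support optimistic transactions; a lock conflict like this hurts write performance, but it would not bring TiKV down.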

You could try upgrading; this may be a bug that has already been fixed.

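I searched online for the message:
index out of bounds: the len is 6 but the index is 6
The same error has also been reported on 5.0, so it does not seem to be specific to this version.
Reference: https://github.com/tikv/tikv/issues/9970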
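
For readers wondering how a Vec of length 6 can end up being indexed with 6 inside `Wheel::insert`, here is a minimal sketch. It is a simplified, hypothetical model of a 6-level timer wheel, not the tokio-timer 0.2.8 source: `level_for`, `try_insert`, and the 6-bits-per-level constant are my own assumptions for illustration. The thread does not confirm that this is the actual trigger.

```rust
// Hypothetical model of a hierarchical timer wheel with 6 levels.
// Names and constants are assumptions for illustration only; this is
// not the tokio-timer implementation.

const NUM_LEVELS: usize = 6; // matches "the len is 6" in the panic message
const BITS_PER_LEVEL: u32 = 6; // assumed: 64 slots per level

/// Pick a level from the highest bit in which `elapsed_ms` and `when_ms` differ.
fn level_for(elapsed_ms: u64, when_ms: u64) -> usize {
    // OR-ing in the low 6 bits keeps the value non-zero.
    let masked = (elapsed_ms ^ when_ms) | 0x3f;
    let significant = 63 - masked.leading_zeros();
    (significant / BITS_PER_LEVEL) as usize
}

/// Bounds-checked insert: returns the level used, or None if the computed
/// level does not exist. Writing `levels[level]` instead would panic with
/// "index out of bounds" in the None case.
fn try_insert(levels: &mut [Vec<u64>], elapsed_ms: u64, when_ms: u64) -> Option<usize> {
    let level = level_for(elapsed_ms, when_ms);
    let slots = levels.get_mut(level)?;
    slots.push(when_ms);
    Some(level)
}

fn main() {
    let mut levels: Vec<Vec<u64>> = vec![Vec::new(); NUM_LEVELS];
    // 2^36 ms is the span this 6-level model can represent.
    let boundary: u64 = 1 << (BITS_PER_LEVEL * NUM_LEVELS as u32);

    // A timer one second ahead, early in the process lifetime.
    println!("{:?}", try_insert(&mut levels, 1_000, 2_000)); // Some(1)

    // A timer only 1 ms ahead, but straddling the 2^36 ms boundary.
    println!("{:?}", try_insert(&mut levels, boundary - 1, boundary)); // None
}
```

In this model, a deadline only 1 ms away still maps to level 6 once the elapsed time crosses 2^36 ms, which is roughly 795 days. Whether the real timer behaves the same way should be checked against the linked issue.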

I took a look on GitHub. Folks at PingCAP found this bug during testing and fixed it after 5.0.
You can upgrade to 5.0 or a later version.

The latest release is now 5.3.0; you can evaluate it before upgrading.

https://docs.pingcap.com/zh/tidb/stable/

OK. Can 3.0.3 be upgraded directly to 5.3.0?

Upgrade from 3.X to 4.X first, then from 4.X to 5.X.

OK.
We have been running this version for several years. It really is time to upgrade.

I have seen someone use tiup to go 3.0 → 4.0 → 5.0, and the whole upgrade finished in half an hour.

Nice. :tulip:
I will look into it and then schedule the upgrade.
