TiKV crash

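To speed things up, please provide the following information; a clearly described problem gets resolved faster:
[TiDB environment]
[Overview] Scenario and problem summary
One TiKV node went down.
[Background] Operations performed
The disk capacity configured for this node is 2.1 TB, and actual disk usage has already reached that limit.
[Symptoms] Application and database symptoms
[Business impact]
"region is unavailable" errors are being returned.
[TiDB version]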
4.0.14
[Attachments]
dmesg shows the following:
[4391439.848316] 33743226 total pagecache pages
[4391439.849261] 0 pages in swap cache
[4391439.850203] Swap cache stats: add 0, delete 0, find 0/0
[4391439.851149] Free swap = 0kB
[4391439.852093] Total swap = 0kB
[4391439.853033] SLUB: Unable to allocate memory on node -1 (gfp=0x20)
[4391439.853979] cache: kmalloc-8192, object size: 8192, buffer size: 8192, default order: 3, min order: 1
[4391439.854943] node 0: slabs: 119, objs: 476, free: 23
[4391439.855908] node 1: slabs: 112, objs: 442, free: 24
[4416247.423310] addr-resolver[105685]: segfault at 30 ip 00005621e4441803 sp 00007fc83edf69f8 error 4 in tikv-server[5621e2a04000+2594000]
[4485207.541259] raftstore-18663[236418]: segfault at 48 ip 0000555d94e80a31 sp 00007faa79df49f0 error 4 in tikv-server[555d9343b000+2594000]
[4548531.933668] systemd-fstab-generator[115718]: Ignoring "nofail" for root device
[4548549.572748] systemd-fstab-generator[115962]: Ignoring "nofail" for root device
[4548794.871453] systemd-fstab-generator[126244]: Ignoring "nofail" for root device
[4549038.253627] systemd-fstab-generator[136240]: Ignoring "nofail" for root device
[4549055.412662] systemd-fstab-generator[137937]: Ignoring "nofail" for root device
[4549055.710195] systemd-fstab-generator[137998]: Ignoring "nofail" for root device
[4549084.507056] systemd-fstab-generator[138433]: Ignoring "nofail" for root device
[4549100.583780] systemd-fstab-generator[138664]: Ignoring "nofail" for root device
[4549118.877138] systemd-fstab-generator[140489]: Ignoring "nofail" for root device
[4549119.169820] systemd-fstab-generator[140552]: Ignoring "nofail" for root device
[4655403.917655] grpc-server-2[139048]: segfault at 7ff24ff99790 ip 00005650b7519b47 sp 00007ff2307f9a00 error 6 in tikv-server[5650b5afb000+2594000]
[4832465.364556] raftstore-18663[106245]: segfault at 80 ip 000055bc9225d158 sp 00007ff0ce9f4a70 error 4
[4832465.364569] raftstore-18663[106246]: segfault at 18 ip 000055bc9225d158 sp 00007ff0ce7f3a70 error 4
[4832465.364572] in tikv-server[55bc912ab000+2594000]

[4832465.366648] in tikv-server[55bc912ab000+2594000]

[4835791.828306] raftstore-18663[8294]: segfault at 48 ip 00005565330d3a31 sp 00007f71053f39f0 error 4
[4835791.828312] raftstore-18663[8293]: segfault at 48 ip 00005565330d3a31 sp 00007f71055f49f0 error 4 in tikv-server[55653168e000+2594000]
[4835791.831084] in tikv-server[55653168e000+2594000]
[4863096.040945] traps: steady-timer[108846] general protection ip:5575e85c9d81 sp:7f4ac41f88b0 error:0 in tikv-server[5575e7f54000+2594000]
[5032471.268722] cat[101742]: segfault at 7fff63bce3f0 ip 00007f15599b8f20 sp 00007fff63b8e3a0 error 6 in ld-2.17.so[7f15599b7000+21000]
[5085722.257097] raftstore-18663[18721]: segfault at e28 ip 000056341b3dce54 sp 00007f5b36ff6080 error 4
[5085722.257119] grpc-server-0[18591]: segfault at 78 ip 000056341ba897b1 sp 00007f5bf51f9f10 error 4
[5085722.257122] grpc-server-2[18593]: segfault at 2e ip 000056341ba897b1 sp 00007f5bf47f8f10 error 4
[5085722.257124] grpc-server-3[18594]: segfault at 0 ip 000056341ba897c6 sp 00007f5bf3ff9f10 error 4
[5085722.257125] in tikv-server[56341aabe000+2594000]
[5085722.257127] in tikv-server[56341aabe000+2594000]

[5085722.257128] in tikv-server[56341aabe000+2594000]
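
For anyone reading the segfault lines above, here is a minimal sketch of how they might be decoded. It is not part of the original report, and the helper names `decode_error` and `parse_segfault` are made up for illustration. It assumes the standard x86 page-fault error-code bits (bit 0: protection violation vs. page not present, bit 1: write vs. read, bit 2: user vs. kernel mode) and that the kernel prints the error code in hex. The offset is simply `ip` minus the mapping base printed in `tikv-server[base+size]`; lines where the kernel output was interleaved and the `in tikv-server[...]` part landed on a separate line yield no offset.

```python
import re

# Matches lines such as:
#   addr-resolver[105685]: segfault at 30 ip 00005621e4441803 sp ... error 4 in tikv-server[5621e2a04000+2594000]
# The trailing "in module[base+size]" part is optional because it is sometimes
# printed on a separate line when several threads fault at once.
SEGFAULT_RE = re.compile(
    r"(?P<thread>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>[0-9a-f]+)"
    r"(?: in (?P<module>[^\[\s]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\])?"
)


def decode_error(code):
    """Describe the low three bits of an x86 page-fault error code."""
    cause = "protection violation" if code & 1 else "page not present"
    access = "write" if code & 2 else "read"
    mode = "user" if code & 4 else "kernel"
    return f"{mode}-mode {access}, {cause}"


def parse_segfault(line):
    """Return a dict describing one dmesg segfault line, or None if it does not match."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        return None
    info = {
        "thread": m["thread"],
        "pid": int(m["pid"]),
        "fault_addr": int(m["addr"], 16),
        "error": decode_error(int(m["err"], 16)),
        "module": m["module"],
        "offset": None,
    }
    if m["base"] is not None:
        # Offset of the faulting instruction from the start of the mapping.
        info["offset"] = int(m["ip"], 16) - int(m["base"], 16)
    return info


if __name__ == "__main__":
    sample = (
        "[4416247.423310] addr-resolver[105685]: segfault at 30 "
        "ip 00005621e4441803 sp 00007fc83edf69f8 error 4 "
        "in tikv-server[5621e2a04000+2594000]"
    )
    result = parse_segfault(sample)
    print(result["thread"], hex(result["offset"]), result["error"])
```

Most of the tikv-server faults in this log are `error 4` at very small addresses (0x18, 0x30, 0x48, 0x80), which under the assumptions above means a user-mode read of an unmapped page close to address zero. If the reported mapping starts at file offset 0 of a position-independent binary (an assumption, not something the log confirms), the computed offset could be looked up with a symbolizer such as `addr2line -e tikv-server <offset>` against the exact same build.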

  1. TiUP Cluster Display output

  2. TiUP Cluster Edit Config output

  3. TiDB-Overview monitoring

  • Logs from the relevant components (including 1 hour before and after the problem)

You already know the disk is full, so fix that. What exactly are you asking?


The disk is 3 TB with 2.1 TB in use; the disk capacity configured for the cluster is 2.1 TB.

The dmesg output shows a failed memory allocation that led to the crash. That is what I am asking about.

Scale out, or clean up some of the data.

Yes, we are scaling out. What I mean is: in this scenario, is the memory allocation failure something we do not need to worry about?

Finish scaling out first, then keep an eye on it.

OK, thanks for the reply.
