leojiang
(leojiang)
22
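1. Memory usage on one of the TiKV nodes has been consistently high. It is not a spike that lasts a few hours, as the screenshot above shows; it has been like this ever since the cluster was configured.
2. I ran tiup ctl tikv --host 127.0.0.1:20160 metrics -t jemalloc > jemalloc.stat
(with 127.0.0.1 replaced by the IP of the node whose memory usage is high)
Error: -bash: jemalloc.stat: 权限不够 (Permission denied)
3. As for the logs for the corresponding time range that were requested, I cannot provide them: in all the monitoring data currently retained, that one TiKV node has high memory usage the whole time, not just during a particular period.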
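A note on the error in item 2: the message comes from bash, not from tiup. The > redirection is opened by the shell in the current working directory, so it fails when that directory is not writable by the current user, and prefixing the command with sudo would not change who opens the file. A minimal sketch of one workaround, assuming /tmp (or $TMPDIR) is writable; the helper name writable_dir and the TIKV_ADDR placeholder are made up for illustration:

```shell
#!/usr/bin/env bash
# Pick a directory the current user can write to: the current directory if
# possible, otherwise $TMPDIR or /tmp (an assumed fallback).
writable_dir() {
  if [ -w "$PWD" ]; then
    printf '%s\n' "$PWD"
  else
    printf '%s\n' "${TMPDIR:-/tmp}"
  fi
}

out="$(writable_dir)/jemalloc.stat"
echo "would write to: $out"

# Command from the post; TIKV_ADDR is a placeholder for the affected node,
# e.g. <ip>:20160. Uncomment to actually collect the metrics.
# tiup ctl tikv --host "$TIKV_ADDR" metrics -t jemalloc > "$out"
```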
yilong
(yi888long)
24
Take a look at this post for reference. Is this on ARM, and is THP enabled?
leojiang
(leojiang)
25
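1. I checked, and all the TiKV servers in my cluster are x86_64. CentOS 7 enables THP by default and it had not been turned off,
so I disabled it and restarted the cluster: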
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
Checking the memory in use afterwards, it has not dropped to 0:
grep AnonHugePages /proc/meminfo
Output:
AnonHugePages: 2048 kB
But the setting does appear to have taken effect:
cat /proc/sys/vm/nr_hugepages && sysctl vm.nr_hugepages
Output:
0
vm.nr_hugepages = 0
2. Does disabling THP have any impact on the TiDB cluster?
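A note on the check in item 1: vm.nr_hugepages counts explicitly reserved HugeTLB pages, which is a separate mechanism from THP, so a value of 0 there does not by itself confirm that THP is off. The THP state is reported by the same sysfs files that were written to, with the active mode shown in square brackets. A small non-zero AnonHugePages can remain because the setting only affects future allocations, so any process started before the change may still hold huge pages. A minimal sketch of the check (the helper name thp_mode is made up for illustration):

```shell
#!/usr/bin/env bash
# Print the active THP mode, i.e. the bracketed word in a line such as
# "always madvise [never]". Reads the file given as $1.
thp_mode() {
  sed -n 's/.*\[\([a-z+_]*\)\].*/\1/p' "$1"
}

# Report the active mode of both THP switches when they are readable.
for f in /sys/kernel/mm/transparent_hugepage/enabled \
         /sys/kernel/mm/transparent_hugepage/defrag; do
  if [ -r "$f" ]; then
    echo "$f: $(thp_mode "$f")"
  fi
done
```

If both lines report never, THP is disabled system-wide for new allocations.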
leojiang
(leojiang)
27
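Thanks.
1. I have configured THP to be disabled by default at boot.
2. Since this is a production environment, I have one more question: will disabling THP fix the persistently high memory usage on that one TiKV node?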
yilong
(yi888long)
28
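Hello. The post I pointed you to mainly concerns a problem that occurs on ARM; the goal was to rule that possibility out. If you are on x86, this is probably not the cause. Is memory usage on that TiKV node still high? Disabling THP on x86 is also a good idea and causes no problems.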
leojiang
(leojiang)
29
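Thanks. I looked at the monitoring for the last seven days and that one TiKV node is still using too much memory. Are there any other adjustments I could try?
Please post the graph of the grpc count panel, under grpc in the TiKV-Details dashboard, after changing its query to the following: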
sum(rate(tikv_grpc_msg_duration_seconds_count{instance=~"$instance", type="coprocessor"}[1m])) by (instance,type)
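If editing the Grafana panel is inconvenient, the same expression can in principle be sent to Prometheus directly through its HTTP query endpoint (/api/v1/query). The sketch below only builds the query string; the Prometheus address is a hypothetical placeholder, and the default of matching all instances with ".*" stands in for Grafana's $instance variable:

```shell
#!/usr/bin/env bash
# Build the per-instance coprocessor request-rate query from the post above.
# $1 is a regex of instances; ".*" (all instances) is an assumed default.
cop_rate_query() {
  local instance_re="${1:-.*}"
  printf 'sum(rate(tikv_grpc_msg_duration_seconds_count{instance=~"%s", type="coprocessor"}[1m])) by (instance,type)' "$instance_re"
}

# Hypothetical Prometheus address; replace with the one in your cluster.
PROM_URL="${PROM_URL:-http://127.0.0.1:9090}"

# Uncomment to run an instant query against Prometheus:
# curl -sG "$PROM_URL/api/v1/query" --data-urlencode "query=$(cop_rate_query)"
```

Comparing the per-instance values in the result shows whether one TiKV node receives noticeably more coprocessor requests than the others.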
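It looks like the cop (coprocessor) requests are also somewhat unevenly distributed. Your cluster mostly serves reads, so there is reason to suspect that cop requests are the main cause.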
tiup ctl tikv --host 127.0.0.1:20160 metrics -t jemalloc > jemalloc.stat
Can you run this with elevated permissions and get the output?
leojiang
(leojiang)
33
Thanks for helping with the analysis :grinning:
jemalloc.stat (1004.0 KB)
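Please also grab: ps -T -p tikv-pid
For the tikv process, use the same one you just captured the stats from.
PS: it has to be the node with the highest memory usage. Was that jemalloc.stat taken from the highest one?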
leojiang
(leojiang)
35
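Yes, the screenshot above shows that 192.168.192.40 has the highest memory usage.
This command, ps -T -p tikv-pid,
should be run on the TiKV node with the highest usage, right?
Yes. Was the jemalloc output you just captured also from that machine?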
leojiang
(leojiang)
37
The jemalloc output I just captured is from the machine with the highest memory usage; I replaced the IP with 192.168.192.40.
leojiang
(leojiang)
38
[tidb@prdtikv40 ~]$ ps -ef|grep 1930
tidb 1930 1 99 9月09 ? 57-06:27:18 bin/tikv-server --addr 0.0.0.0:20160 --advertise-addr 192.168.192.40:20160 --status-addr 192.168.192.40:20180 --pd 192.168.192.31:2379,192.168.192.32:2379,192.168.192.33:2379 --data-dir /home/tidb/deploy/tikv-20160/data --config conf/tikv.toml --log-file /home/tidb/deploy/tikv-20160/log/tikv.log
tidb 15733 15674 0 15:20 pts/0 00:00:00 grep --color=auto 1930
[tidb@prdtikv40 ~]$ ps -ef|grep 1930^C
[tidb@prdtikv40 ~]$ ps -T -p 1930
PID SPID TTY TIME CMD
1930 1930 ? 00:00:01 tikv-server
1930 1948 ? 00:00:00 tikv-server
1930 1949 ? 00:00:00 tikv-server
1930 1950 ? 00:00:00 tikv-server
1930 1951 ? 00:00:00 tikv-server
1930 1952 ? 00:00:00 tikv-server
1930 1953 ? 00:00:00 tikv-server
1930 1954 ? 00:00:00 tikv-server
1930 1955 ? 00:00:00 tikv-server
1930 1956 ? 00:00:00 tikv-server
1930 1957 ? 00:00:00 tikv-server
1930 1958 ? 00:00:00 tikv-server
1930 1959 ? 00:00:00 tikv-server
1930 1960 ? 00:00:00 tikv-server
1930 1961 ? 00:00:00 tikv-server
1930 1962 ? 00:00:00 tikv-server
1930 1963 ? 00:00:00 tikv-server
1930 1971 ? 00:06:45 slogger
1930 1974 ? 00:00:05 default-executo
1930 1975 ? 00:00:00 resolver-execut
1930 1976 ? 00:05:52 grpc_global_tim
1930 1977 ? 00:28:46 pd-0
1930 1978 ? 00:10:40 timer
1930 1996 ? 00:00:00 addr-resolver
1930 1997 ? 00:06:25 region-collecto
1930 1998 ? 00:02:04 time-monitor
1930 1999 ? 01:52:32 rocksdb:low0
1930 2000 ? 01:55:19 rocksdb:low1
1930 2001 ? 01:53:22 rocksdb:low2
1930 2002 ? 00:11:58 rocksdb:high0
1930 2003 ? 00:00:00 tikv-server
1930 2005 ? 00:05:47 grpc_global_tim
1930 2020 ? 00:00:02 rocksdb:dump_st
1930 2021 ? 00:00:00 rocksdb:pst_st
1930 2022 ? 01:57:04 rocksdb:low3
1930 2023 ? 01:57:02 rocksdb:low4
1930 2024 ? 01:50:18 rocksdb:low5
1930 2025 ? 00:12:34 rocksdb:high1
1930 2026 ? 00:00:00 tikv-server
1930 2062 ? 00:00:05 rocksdb:dump_st
1930 2063 ? 00:00:00 rocksdb:pst_st
1930 2064 ? 01:12:44 gc-worker
1930 2065 ? 00:00:00 lock-collector
1930 2066 ? 4-08:51:43 unified-read-po
1930 2067 ? 4-08:54:34 unified-read-po
1930 2068 ? 4-08:49:46 unified-read-po
1930 2069 ? 4-08:51:12 unified-read-po
1930 2070 ? 4-08:55:04 unified-read-po
1930 2071 ? 4-08:53:09 unified-read-po
1930 2072 ? 4-08:52:53 unified-read-po
1930 2073 ? 4-08:50:17 unified-read-po
1930 2074 ? 4-08:51:57 unified-read-po
1930 2075 ? 00:00:00 gc-worker
1930 2076 ? 4-08:52:58 unified-read-po
1930 2077 ? 4-08:50:11 unified-read-po
1930 2078 ? 4-08:51:36 unified-read-po
1930 2079 ? 00:00:00 store-read-low-
1930 2080 ? 00:00:00 store-read-low-
1930 2081 ? 00:00:00 store-read-low-
1930 2082 ? 00:00:00 store-read-low-
1930 2083 ? 00:00:00 store-read-low-
1930 2084 ? 00:00:00 store-read-low-
1930 2085 ? 00:00:00 store-read-low-
1930 2086 ? 00:00:00 store-read-low-
1930 2087 ? 00:00:57 store-read-norm
1930 2088 ? 00:00:57 store-read-norm
1930 2089 ? 00:00:57 store-read-norm
1930 2090 ? 00:00:58 store-read-norm
1930 2091 ? 00:00:57 store-read-norm
1930 2092 ? 00:00:57 store-read-norm
1930 2093 ? 00:00:58 store-read-norm
1930 2094 ? 00:00:58 store-read-norm
1930 2095 ? 00:00:20 store-read-high
1930 2096 ? 00:00:20 store-read-high
1930 2097 ? 00:00:20 store-read-high
1930 2098 ? 00:00:20 store-read-high
1930 2099 ? 00:00:20 store-read-high
1930 2100 ? 00:00:20 store-read-high
1930 2101 ? 00:00:20 store-read-high
1930 2102 ? 00:00:20 store-read-high
1930 2103 ? 00:06:06 sched-worker-po
1930 2104 ? 00:06:07 sched-worker-po
1930 2105 ? 00:06:07 sched-worker-po
1930 2106 ? 00:06:07 sched-worker-po
1930 2107 ? 00:06:06 sched-worker-po
1930 2108 ? 00:06:07 sched-worker-po
1930 2109 ? 00:06:06 sched-worker-po
1930 2110 ? 00:06:05 sched-worker-po
1930 2111 ? 00:00:54 sched-high-pri-
1930 2112 ? 00:00:54 sched-high-pri-
1930 2113 ? 00:00:54 sched-high-pri-
1930 2114 ? 00:00:54 sched-high-pri-
1930 2115 ? 12:09:59 grpc-server-0
1930 2116 ? 11:55:48 grpc-server-1
1930 2117 ? 10:04:14 grpc-server-2
1930 2118 ? 04:20:24 grpc-server-3
1930 2119 ? 00:14:11 split-check
1930 2121 ? 05:53:56 steady-timer
1930 2122 ? 17:40:48 raftstore-16640
1930 2123 ? 17:41:04 raftstore-16640
1930 2124 ? 09:57:09 future-poller0
1930 2126 ? 01:08:46 apply-0
1930 2127 ? 01:08:31 apply-1
1930 2128 ? 00:21:17 snap-generator-
1930 2129 ? 00:21:51 snap-generator-
1930 2130 ? 01:13:38 snapshot-worker
1930 2131 ? 00:02:42 raft-gc-worker
1930 2132 ? 00:00:05 cleanup-worker
1930 2133 ? 00:11:48 stats-monitor
1930 2134 ? 00:46:04 pd-worker
1930 2135 ? 00:00:00 consistency-che
1930 2136 ? 00:00:00 pd-worker
1930 2137 ? 00:00:25 gc-manager
1930 2138 ? 00:04:51 tso0
1930 2139 ? 00:01:27 cdc
1930 2140 ? 00:00:00 sst-importer0
1930 2141 ? 00:00:00 sst-importer1
1930 2142 ? 00:00:00 sst-importer2
1930 2143 ? 00:00:00 sst-importer3
1930 2144 ? 00:00:00 sst-importer4
1930 2145 ? 00:00:00 sst-importer5
1930 2146 ? 00:00:00 sst-importer6
1930 2147 ? 00:00:00 sst-importer7
1930 2148 ? 00:00:16 debugger0
1930 2149 ? 00:00:00 waiter-manager
1930 2150 ? 00:00:00 deadlock-0
1930 2151 ? 00:00:00 deadlock-detect
1930 2152 ? 00:00:00 waiter-manager
1930 2153 ? 00:00:00 backup-endpoint
1930 2154 ? 00:05:29 rocksdb-metrics
1930 2155 ? 00:01:13 snap-sender0
1930 2156 ? 00:01:13 snap-sender1
1930 2157 ? 00:01:14 snap-sender2
1930 2158 ? 00:01:13 snap-sender3
1930 2159 ? 00:01:19 snap-handler
1930 2161 ? 00:00:00 deadlock-detect
1930 2162 ? 00:38:30 transport-stats
1930 2163 ? 00:20:37 status-server
1930 2183 ? 00:00:05 default-executo
1930 2184 ? 00:01:01 coarsetime
1930 2185 ? 00:00:01 futures-timer
1930 2190 ? 00:00:10 default-executo
1930 21133 ? 00:00:03 default-executo
1930 21134 ? 00:00:02 default-executo
1930 24906 ? 00:00:05 default-executo
1930 27065 ? 00:00:01 default-executo
1930 27160 ? 00:00:01 default-executo
[tidb@prdtikv40 ~]$
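The listing above is easier to read when CPU time is summed per thread name. In it, the twelve unified-read-po threads (ps truncates thread names to 15 characters) each show more than four days of CPU time, far more than any other group, which fits the read-heavy coprocessor suspicion raised earlier. A minimal sketch of that aggregation, assuming the five-column PID SPID TTY TIME CMD layout shown above and thread names without spaces; the function name is made up for illustration:

```shell
#!/usr/bin/env bash
# Sum cumulative CPU time (in seconds) per thread name.
# Reads `ps -T -p <pid>` output on stdin; TIME is [DD-]HH:MM:SS.
thread_cpu_by_name() {
  awk 'NR > 1 && NF >= 5 {
         t = $4; days = 0
         # Split off an optional leading day count ("4-08:51:43").
         if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }
         split(t, hms, ":")
         secs[$5] += days * 86400 + hms[1] * 3600 + hms[2] * 60 + hms[3]
       }
       END { for (name in secs) printf "%d %s\n", secs[name], name }' |
    sort -rn
}

# Usage (1930 is the TiKV pid from the listing above):
# ps -T -p 1930 | thread_cpu_by_name | head
```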
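Is the 192.168.192.40 machine in the Up state according to PD? Please get the store information via pdctl >> store.
Also, please grab the output of cat /proc/${pid}/status so we can cross-check it: according to jemalloc, memory usage is only a little over 30 GB.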
leojiang
(leojiang)
41
It is in the Up state and has never been down.
[tidb@prdtikv40 ~]$ cat /proc/1930/status
Name: tikv-server
Umask: 0022
State: S (sleeping)
Tgid: 1930
Ngid: 0
Pid: 1930
PPid: 1
TracerPid: 0
Uid: 1000 1000 1000 1000
Gid: 1000 1000 1000 1000
FDSize: 4096
Groups: 1000
NStgid: 1930
NSpid: 1930
NSpgid: 1930
NSsid: 1930
VmPeak: 100658824 kB
VmSize: 100648572 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 63507268 kB
VmRSS: 57065104 kB
RssAnon: 57056408 kB
RssFile: 8696 kB
RssShmem: 0 kB
VmData: 100593456 kB
VmStk: 236 kB
VmExe: 36400 kB
VmLib: 3184 kB
VmPTE: 181548 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 0
THP_enabled: 1
Threads: 147
SigQ: 0/256546
SigPnd: 0000000000000000
ShdPnd: 0000000000000000
SigBlk: 0000000000000000
SigIgn: 0000000000001000
SigCgt: 0000000180004e43
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 000000ffffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: thread vulnerable
Cpus_allowed: ffff
Cpus_allowed_list: 0-15
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 488
nonvoluntary_ctxt_switches: 7
[tidb@prdtikv40 ~]$
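For the cross-check requested above, the relevant fields are VmRSS, VmHWM, RssAnon and RssFile, which the kernel reports in kB. In this output VmRSS is 57065104 kB, about 54.4 GiB, and almost all of it is anonymous memory (RssAnon), compared with the 30-odd GB that jemalloc reports. The thread has not yet established where the difference comes from. A minimal sketch for pulling out these fields in GiB (the function name is made up for illustration):

```shell
#!/usr/bin/env bash
# Print selected memory fields of a /proc/<pid>/status file in GiB.
# $1 is the path to the status file; values in the file are in kB.
status_mem_gib() {
  awk '$1 ~ /^(VmRSS|VmHWM|RssAnon|RssFile):$/ {
         printf "%s %.2f GiB\n", $1, $2 / 1048576
       }' "$1"
}

# Usage (1930 is the TiKV pid from the output above):
# status_mem_gib /proc/1930/status
```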