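To help us work efficiently, please provide the following information. A clearly described problem gets resolved faster:
[TiDB environment] E-commerce and advertising workloads
[Overview] Scenario + problem summary
[Background] Operations performed beforehand
[Symptoms] Business and database symptoms
[Business impact] Service was interrupted for roughly 10 minutes
[TiDB version] v5.3.0
[Attachments]
- Relevant logs and monitoring
- TiUP Cluster Display output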
[root@ip-192-168-0-135 ~]# tiup cluster display tidb-bz-live
Starting component `cluster`: /root/.tiup/components/cluster/v1.8.0/tiup-cluster display tidb-bz-live
Cluster type: tidb
Cluster name: tidb-bz-live
Cluster version: v5.3.0
Deploy user: tidb
SSH type: builtin
Dashboard URL: http://192.168.0.186:2379/dashboard
ID Role Host Ports OS/Arch Status Data Dir Deploy Dir
192.168.0.135:9093 alertmanager 192.168.0.135 9093/9094 linux/x86_64 Up /data/local/tidb-data/alertmanager-9093 /data/local/tidb-deploy/alertmanager-9093
192.168.0.150:8300 cdc 192.168.0.150 8300 linux/x86_64 Up /data/local/tidb-data/cdc-8300 /data/local/tidb-deploy/cdc-8300
192.168.0.135:3000 grafana 192.168.0.135 3000 linux/x86_64 Up - /data/local/tidb-deploy/grafana-3000
192.168.0.100:2379 pd 192.168.0.100 2379/2380 linux/x86_64 Up /data/local/tidb-data/pd-2379 /data/local/tidb-deploy/pd-2379
192.168.0.150:2379 pd 192.168.0.150 2379/2380 linux/x86_64 Up /data/local/tidb-data/pd-2379 /data/local/tidb-deploy/pd-2379
192.168.0.186:2379 pd 192.168.0.186 2379/2380 linux/x86_64 Up|L|UI /data/local/tidb-data/pd-2379 /data/local/tidb-deploy/pd-2379
192.168.0.135:9090 prometheus 192.168.0.135 9090 linux/x86_64 Up /data/local/tidb-data/prometheus-8249 /data/local/tidb-deploy/prometheus-8249
192.168.0.186:8250 pump 192.168.0.186 8250 linux/x86_64 Up /data/local/tidb-data/pump-8249 /data/local/tidb-deploy/pump-8249
192.168.0.100:4000 tidb 192.168.0.100 4000/10080 linux/x86_64 Up - /data/local/tidb-deploy/tidb-4000
192.168.0.150:4000 tidb 192.168.0.150 4000/10080 linux/x86_64 Up - /data/local/tidb-deploy/tidb-4000
192.168.0.186:4000 tidb 192.168.0.186 4000/10080 linux/x86_64 Up - /data/local/tidb-deploy/tidb-4000
192.168.0.131:20160 tikv 192.168.0.131 20160/20180 linux/x86_64 Up /data/local/tidb-data/tikv-20160 /data/local/tidb-deploy/tikv-20160
192.168.0.19:20160 tikv 192.168.0.19 20160/20180 linux/x86_64 Up /data/local/tidb-data/tikv-20160 /data/local/tidb-deploy/tikv-20160
192.168.0.71:20160 tikv 192.168.0.71 20160/20180 linux/x86_64 Up /data/local/tidb-data/tikv-20160 /data/local/tidb-deploy/tikv-20160
Total nodes: 14
- TiUP Cluster Edit Config output
- TiDB-Overview monitoring
- Logs of the relevant components (covering 1 hour before and after the issue)
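Our TiDB cluster had a service interruption today that lasted roughly 10 minutes. The tidb-server log contains a large number of INFO messages like the first line below, and it also shows that the service restarted:
[2021/12/30 09:53:07.645 +08:00] [INFO] [coprocessor.go:1135] ["memory exceeds quota, rateLimitAction delegate to fallback action"] ["total token count"=1]
tidb-server has the risk of OOM. Running SQLs and heap profile will be recorded in record path
["Welcome to TiDB."] ["Release Version"=v5.3.0]
The full logs are in the attachment. What caused this, and why did the error appear so suddenly? Attachment: tidb.tar.gz (16.8 MB)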
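To make the timeline easier to read, here is a rough sketch that counts the three messages quoted above per minute. It assumes each log line starts with a timestamp in the same `[YYYY/MM/DD HH:MM:SS.mmm +ZZ:ZZ]` form as the excerpt. The function name `count_per_minute`, the labels in `PATTERNS`, and the `tidb.log` path are my own placeholders, not anything provided by TiDB.

```python
import re
from collections import Counter

# Leading timestamp of a log line, e.g. [2021/12/30 09:53:07.645 +08:00];
# the capture group keeps only the date and HH:MM so lines bucket by minute.
TS_RE = re.compile(
    r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}):\d{2}\.\d{3} [+-]\d{2}:\d{2}\]"
)

# Message fragments copied from the log excerpts in this post.
# The labels on the left are hypothetical names chosen for this sketch.
PATTERNS = {
    "quota_fallback": "memory exceeds quota, rateLimitAction delegate to fallback action",
    "oom_risk": "tidb-server has the risk of OOM",
    "restart": "Welcome to TiDB.",
}


def count_per_minute(lines):
    """Return {label: Counter({minute: hits})} for the fragments above."""
    counts = {label: Counter() for label in PATTERNS}
    for line in lines:
        match = TS_RE.match(line)
        if not match:
            continue  # skip lines without a leading timestamp
        minute = match.group(1)
        for label, fragment in PATTERNS.items():
            if fragment in line:
                counts[label][minute] += 1
    return counts


# Example usage (the path is an assumption; point it at the real log file):
#
# with open("tidb.log", encoding="utf-8", errors="replace") as f:
#     for label, per_minute in count_per_minute(f).items():
#         for minute, hits in sorted(per_minute.items()):
#             print(minute, label, hits)
```

Lining up the minute where `quota_fallback` spikes against the first `oom_risk` and `restart` entries should show whether the restart followed directly from the memory pressure, though it will not by itself identify the SQL responsible.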