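To speed things up, please provide the following information; a clearly described problem gets resolved faster:
【TiDB Environment】
【Overview】 Scenario + problem summary
The following alerts fire frequently: overwritten-nodes, NODE_disk_read_latency_more_than_32ms, disk_read_latency_more_than_32ms,
as well as tidb_tikvclient_backoff_seconds_count and TiDB tikvclient_backoff_count error.
【Background】 Operations performed
No changes on the business side.
【Symptoms】 Business and database symptoms
TiDB_memory_abnormal, TiDB heap memory usage is over 10 GB
TiDB_monitor_keep_alive, TiDB monitor_keep_alive error
【Problem】 The issue currently encountered
I cannot find the cause of the TiDB OOM.
【Business Impact】
TiDB runs out of memory (OOM) and becomes unresponsive.
【TiDB Version】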
4.0.9
【Application Software and Version】
【Attachments】 Relevant logs and configuration
The TiDB log shows binlog warnings:
[2021/10/15 23:03:04.722 +08:00] [WARN] [binloginfo.go:127] ["[binloginfo] disable the skipBinlog flag"]
[2021/10/15 23:03:04.722 +08:00] [WARN] [binloginfo.go:153] ["[binloginfo] start waiting for binlog recovering"]
[2021/10/15 23:03:05.222 +08:00] [WARN] [binloginfo.go:161] ["[binloginfo] binlog recovered"]
[2021/10/15 23:03:05.223 +08:00] [WARN] [binloginfo.go:127] ["[binloginfo] disable the skipBinlog flag"]
[2021/10/15 23:03:05.223 +08:00] [WARN] [binloginfo.go:153] ["[binloginfo] start waiting for binlog recovering"]
[2021/10/15 23:03:05.723 +08:00] [WARN] [binloginfo.go:161] ["[binloginfo] binlog recovered"]
[2021/10/15 23:03:05.923 +08:00] [INFO] [coprocessor.go:1034] ["[TIME_COP_PROCESS] resp_time:607.061625ms txnStartTS:428424609216069635 region_id:97140 store_addr:172.18.41.13:20160 kv_process_ms:605 scan_total_write:644049 scan_processed_write:644048 scan_total_data:0 scan_processed_data:0 scan_total_lock:1 scan_processed_lock:0"] [conn=5540497]
[2021/10/15 23:03:18.457 +08:00] [ERROR] [terror.go:271] ["encountered error"] [error="write tcp xxxxx.11:10080->172.18.251.4:35918: write: broken pipe"] [stack="github.com/pingcap/parser/terror.Log
\t/home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/pkg/mod/github.com/pingcap/parser@v0.0.0-20201130080042-c3ddfec58248/terror/terror.go:271
github.com/pingcap/tidb/server.writeData
\t/home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/src/github.com/pingcap/tidb/server/http_handler.go:115
github.com/pingcap/tidb/server.ddlHistoryJobHandler.ServeHTTP
\t/home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/src/github.com/pingcap/tidb/server/http_handler.go:1054
github.com/gorilla/mux.(*Router).ServeHTTP
\t/home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/pkg/mod/github.com/gorilla/mux@v1.7.4/mux.go:210
net/http.(*ServeMux).ServeHTTP
\t/usr/local/go/src/net/http/server.go:2387
github.com/pingcap/tidb/server.CorsHandler.ServeHTTP
\t/home/jenkins/agent/workspace/optimization-build-tidb-linux-amd/go/src/github.com/pingcap/tidb/server/util.go:432
net/http.serverHandler.ServeHTTP
\t/usr/local/go/src/net/http/server.go:2802
net/http.(*conn).serve
\t/usr/local/go/src/net/http/server.go:1890"]
After this warning appeared, SQL statements started to block and TiDB ran out of memory.
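The [TIME_COP_PROCESS] entry in the log above is the only line in this excerpt that carries scan statistics. To look for similar heavy coprocessor scans in the full log, a minimal Python sketch such as the one below could pull the key:value fields out of each such line. The helper name `parse_cop_fields` and the regular expression are illustrative assumptions written for this post, not part of any TiDB tooling, and the field layout is taken only from the single line shown above.

```python
import re

# Hypothetical helper: matches "key:value" tokens such as "kv_process_ms:605".
FIELD_RE = re.compile(r"(\w+):(\S+)")
MARKER = "[TIME_COP_PROCESS]"


def parse_cop_fields(line):
    """Return the key:value fields of a [TIME_COP_PROCESS] log line as a dict."""
    start = line.find(MARKER)
    if start == -1:
        return {}  # not a coprocessor timing line
    # Keep only the message body, up to the closing quote (straight or curly).
    body = re.split(r'["”]', line[start + len(MARKER):], maxsplit=1)[0]
    fields = {}
    for key, value in FIELD_RE.findall(body):
        # Purely numeric values become ints; everything else stays a string.
        fields[key] = int(value) if value.isdecimal() else value
    return fields
```

Applied to every line of tidb.log, the resulting dicts could then be filtered on fields such as `scan_total_write` or `kv_process_ms` to see which regions and stores were being scanned heavily before the OOM.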
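- TiUP Cluster Display information
- TiUP Cluster Edit config information
Monitoring (https://metricstool.pingcap.com/)
- TiDB-Overview Grafana monitoring
- TiDB Grafana monitoring
- TiKV Grafana monitoring
- PD Grafana monitoring
- Logs of the relevant modules (covering 1 hour before and after the issue)
For performance tuning or troubleshooting questions, please download the script and run it. Be sure to select all of the terminal output and copy and paste it into your post.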