【 TiDB 使用环境】生产环境
【 TiDB 版本】
【遇到的问题:问题现象及影响】
突然连接数连接数和内存使用暴涨起来了,查看oom日志,只有里面的语句内存使用只有几MB而已。两台TiDB同时OOM, 求大神们帮忙分析一下
【 TiDB 使用环境】生产环境
【 TiDB 版本】
【遇到的问题:问题现象及影响】
突然连接数连接数和内存使用暴涨起来了,查看oom日志,只有里面的语句内存使用只有几MB而已。两台TiDB同时OOM, 求大神们帮忙分析一下
看看慢查询和当时sql语句分析页面,按时间倒序
可以在日志中搜索下 expensive query 关键字看看, oom_record 有可能有时候没记录到
问题 SQL 定位
INFORMATION_SCHEMA
中的 SLOW_QUERY
和 CLUSTER_SLOW_QUERY 。
也可以通过 tidb dashboard的 慢SQL和 SQL语句找到问题附近的 SQL并分析,定位到具体的问题 SQL,从 SQL 语句分析、慢查询的具体内容,可以详细查看内存使用量。
同时 grep expensive_query tidb.log 辅助定位问题时间点的 SQL 情况。
好的,谢谢
好的 谢谢,我看看
感觉还是有点奇怪,确认是只有这1个expensive sql吗?因为看到plan cache的plan num也是同步上升的,这个sql有使用plan cache吧,那应该能重用plan cache才对呀,为什么plan num会在同时段升高呢?
能再看一下如下面板的指标吗?
performance overview–>Queries Using Plan Cache OPS
Tidb–>Plan Cache Miss OPS
Tidb–>Queries Using Plan Cache OPS
另外还可以看一下cluster_slow_query中该sql的plan_from_cache的值是什么,如果为0,则表示没有使用plan cache,则可能还有其他的硬解析SQL增加。
看起来没问题,时间拉长点看看
对呢,过滤了日志,OOM当天的只有1个expensive sql
Master|root@172.16.5.174|INFORMATION_SCHEMA>SELECT * FROM cluster_slow_query WHERE Query LIKE 'SELECT b.url download_url,c.strategy_forced forced,b.version_name,b.version_code,CASE%' LIMIT 1\G
*************************** 1. row ***************************
INSTANCE: 172.16.5.220:10080
Time: 2024-11-19 10:10:19.949910
Txn_start_ts: 454028746531537035
User: mapgoo-paas
Host: 10.1.9.168
Conn_ID: 3368179780
Session_alias:
Exec_retry_count: 0
Exec_retry_time: 0
Query_time: 2.02739531
Parse_time: 0.000224666
Compile_time: 0.003303374
Rewrite_time: 0.000415837
Preproc_subqueries: 0
Preproc_subqueries_time: 0
Optimize_time: 0.002614232
Wait_TS: 0.000013228
Prewrite_time: 0
Wait_prewrite_binlog_time: 0
Commit_time: 0
Get_commit_ts_time: 0
Commit_backoff_time: 0
Backoff_types:
Resolve_lock_time: 0
Local_latch_wait_time: 0
Write_keys: 0
Write_size: 0
Prewrite_region: 0
Txn_retry: 0
Cop_time: 2.017427683
Process_time: 0.006957334
Wait_time: 0.009076069
Backoff_time: 0
LockKeys_time: 0
Request_count: 9
Total_keys: 312
Process_keys: 224
Rocksdb_delete_skipped_count: 0
Rocksdb_key_skipped_count: 224
Rocksdb_block_cache_hit_count: 1496
Rocksdb_block_read_count: 0
Rocksdb_block_read_byte: 0
DB: manager_admin
Index_names: [std_appinfo:idx_app_packname]
Is_internal: 0
Digest: 0a2afe584e4d2d6a51e2434119dd0ee4dc5a2d56efec67ac847125b08735c982
Stats: std_app_version:454028591237431600[3401;833],std_appinfo:451951293512810812[50;1],std_version_strategy:454028591486468255[5173;1949]
Cop_proc_avg: 0.000773037
Cop_proc_p90: 0.006950168
Cop_proc_max: 0.006950168
Cop_proc_addr: 172.16.5.1:20160
Cop_wait_avg: 0.001008452
Cop_wait_p90: 0.006300307
Cop_wait_max: 0.006300307
Cop_wait_addr: 172.16.5.1:20160
Mem_max: 987948
Disk_max: 0
KV_total: 2.017902085
PD_total: 0.000011824
Backoff_total: 0
Write_sql_response_total: 0.000519431
Result_rows: 236
Warnings: [{"Level":"Note","Message":"[b] remain after pruning paths for b given Prop{SortItems: [{manager_admin.std_app_version.versionid asc}], TaskTp: rootTask}","IsExtra":true},{"Level":"Note","Message":"[c,idx_versionid_is_delete] remain after pruning paths for c given Prop{SortItems: [{manager_admin.std_version_strategy.versionid asc}], TaskTp: rootTask}","IsExtra":true},{"Level":"Note","Message":"[b,idx_muti] remain after pruning paths for b given Prop{SortItems: [{manager_admin.std_app_version.appid asc}], TaskTp: rootTask}","IsExtra":true},{"Level":"Note","Message":"[b] remain after pruning paths for b given Prop{SortItems: [], TaskTp: rootTask}","IsExtra":true},{"Level":"Note","Message":"[c] remain after pruning paths for c given Prop{SortItems: [], TaskTp: rootTask}","IsExtra":true}]
Backoff_Detail:
Prepared: 0
Succ: 1
IsExplicitTxn: 0
IsWriteCacheTable: 0
Plan_from_cache: 0
Plan_from_binding: 0
Has_more_results: 0
Resource_group: default
Request_unit_read: 6.782099614583331
Request_unit_write: 0
Time_queued_by_rc: 0
Plan_digest: 7db1f57a9e9200529f8d8736a7e1b00b8f45d35fb57f1b87dfdf6503f6df509a
Binary_plan: gkSACv1DCg1Qcm9qZWN0aW9uXzEyErs6CgdTb3J0XzEzEs05Nh4AYDUSxS4KEEluZGV4SGFzaEpvaW5fMjMSqh0+FQAYNjAS2AsKDgUqNExvb2tVcF85OBKCBAoRBRPwYVJhbmdlU2Nhbl85NRoBASFkfWisDx1sQCkgGutDY30IQDADOAJAAko2CjQKDW1hbmFnZXJfYWRtaW4SAWEaIAoQaWR4X2FwcF9wYWNrbmFtZRIMYXBwX3BhY2tuYW1lUj9yAWFQOlsiY29tLm1hcGdvby5tbGluayIsRhMA8GFdLCBrZWVwIG9yZGVyOmZhbHNlWhZ0aW1lOjY0MS40wrVzLCBsb29wczozYsoBY29wX3Rhc2s6IHtudW06IDEsIG1heDogNTU4LjbCtXMsIHByb2Nfa2V5czogMCwgdG90XwESGDogNDc2bnMJESB3YWl0OiAxMTUNXwxycGNfEU4BDAV+FCA1MjMuOAV/gGNvcHJfY2FjaGVfaGl0X3JhdGlvOiAxLjAwLCBidWlsZAWTCF9kdQUaGG46IDI4LjUFOoBtYXhfZGlzdHNxbF9jb25jdXJyZW5jeTogMX1qXnRpa3YFOgQ6ewVyBDRtFesQMX0sIHMhoVBkZXRhaWw6IHtnZXRfc25hcHNob3QNoQg5LjAJoDRyb2Nrc2RiOiB7YmxvYyEcEH19fXD/EQEEAXgRCiT/ARKNBQoMU2VsTXFAOTcSiAEKEVRhYmxlUm93SURJFiA2IS54OQUvU49CEwIIFAoSRhMCBFIQPsIBBGodPt0AADgu3QBWnABYGgECIXw5BS+nnI9AKW/T4fov/gBAMAFBixBSamVxKDKKAgwuc3RkQYQYaW5mby5hcFmMBCwgRmQCCCksIHJAADhpc19kZWxldGUsIDApWhMpjgQuMRmzCDJi32J7Agg0LjAFKkFnBF9rRnkCCDk3NDZ5AggzOC5Jq0FsRngCDDMuOTkBegBjvnYCBDMwgnYCZS4YX2V4dHJhX4KQAjKzAYqQAgQxMRHYppACgX8YbmMYWjGbVTL0ARABQAFaFCWICDUuMH0ZEDNi8QFpoQFpORgge3RvdGFsbREQNjU2LjlleEBmZXRjaF9oYW5kbGU6IDY0NW3KZa4MOiA3NiWTjQgkMC4zwrVzfSwgdGEHTlkADDQuMjMhkzGtMjsBKDh9LCBuZXh0OiB7AVEAXwWfDDogNzmRvgUWCVwYbG9va3VwXxGHCDQuMgWvRiIAPHJlc3A6IDQuMTNtc31woBIuwwMEvg422wUQNTcS7AVC2wUENTQhXEwwPTPPrvPTQCkoE0bpgTyDQDDuAmFRDEo9CjtCyQMQYhonChOt3AhpZF91FiASBWFwcGlkEgkVEgxS0gFypeQwIGRlY2lkZWQgYnkgW15kAxxfdmVyc2lvbmGoBGlkQYTdUz7JAwhpZClirQMVSRWJYbEoIG5vdChpc251bGxWJAQVNw2IDGlkKSlOdwZp+QgxLjUh7ABszXQA5kL5AwAyAcIUeDogODYwTW8IbWluwTAIOS4xJc8YYXZnOiA2OckDCHA5NS4rAAB01ZEIMS45qb8BE0ECATRNqJUaBDIsBQxlCgwgMS4zBZvCGgQEMjURqWqQBgSSAb20gc3BvsmVAeBlAQXaADMpIQhwODAJHghwOTUJCRhpdGVyczo2YUkQc2tzOjKONQQMMjEuNyU1qjUEDBKPBgrdxQw1NhK/RsUGJDU1ISrk0C5cBf9G/QJOxgYAYkrGBgBTThIBADGFjyETADZBKCUTBDEwAQshEgkeIRMJCkIUAWb8BkjmKNR3sjv/QCmIdV6vIodmQDBoYasEUnyCWAMcaXNfYXZhaWyBwABkYRMgLCBndChjYXN0lhEDaF9jb2RlLCBkb3VibGUgQklOQVJZKSwgNTQ1Ka2GCDEuOAXuaRYIMmLgYhYDDDEuNTchDSEhDCAxLjLFzCklBC40RUUhHRUnQf1BbQUSPhADCDU4NaWcaQWlfjIRAwQyLsWkwhADBDU3DVdiEAMMMn1qlFIQAynWAd0lEAnZpv4BikcHADIV9K5HByACIeYO0qcM470yPwL5RwgzLjlFuDHBBOoBxahO7gYIMS41BSs2RQcIMS41Jb7NvAQ5NBJPC4m4CDIuOA6pCwB9YfoEYmxSQwcIMi4xRQolxQQxLJJDBwAxDbQBZVJBBwg2LjGpZUYiAOlBCDEuOeFACJD7Av1B4RgYJheqS7aD0D4tAThSzwFpbm5lciBqb2luLCAFDAA6EjQNGkcNNDU3LCBvdXRlciBrZXk6cpIKweoFQWYrANWZ5WAOpQoIdWFsIRsEZDpe9AMe8AoBXlJ/ADZUAB7OCgQ5LkWvMf4AaQWZADoWPgkMOjQuMmGyLtYIHDgsIHRhc2s6Kaokc3RydWN0OjYuMamsRRoMOjQuMF0SCDYuNCmMITsMOjE2MMXLEH1wn/UKPXwIEukNNr0IEDIwEuAGQr0IBDE3IaVI+XdyzXJgYEApwNDAJbHudEAwhha9CAhFCkNCvQgQYxovChcOvQj15SrBCAAJFRUAEirFCACzosUIDT8gX3N0cmF0ZWd5JkoIDgwISs4IETIZKVrWCEJaADLbCGamCAwyLjAxIi8OCDNi/2KQBQ0pgbIMIDIuM6UxibYOLQ8OnQgRIgRheKWKGsQMDDIyNCyhq0IUALmyCDYuORKQCAESYYIIOiA4GikLVsAIDWlWrwUIMC41Xk8PBDE1GlkJAaNeTw8EhAJOrQcIMnMsBfupoxn4BDgwBRshAAEj6aYANUn1TroIFmcMwakIZXNzLiIBRhkALF9zaXplOiAxMjMyMBJwEARhbA0xEDMxMiwgSskPADMeaRC58kRrZXlfc2tpcHBlZF9jb3VudDopkR7hDyaOEA0eDDE0OTZm9g8ExwQuMQkQMTkSiQFC9g8kMTghDPG8kXXuokZxA04xCQBjSjEJQvcPnXYANGr7CEhpE5UAnPmiQClAAg1TqZxxQDDsFvgPADdSCAVONQM2AQkitwgAMhKNCJH4AMlCJwMWEw8QOiA5NjYaiQoh5S2xKcAl9xA6IDkwMTrIDwA5GnAPVe4AMRapCBauCwg5NDga3w1W8QJqQBIEMzQNjG7xAkJAEjJsAYppCAAxHr8MWXdRX2JKAg6kERg9rI0VlpVkNq0BDmoIFkANbeYxcwDpZmkIDSrFTiKtDw0VGmcIBDc5JXWJXgAzyYBuZwgIMS40heKuZwgNcgFkUmYICDEwMhovCkYjAB5nCAQzMg5oCAiYvgvd7BwhMU5x3k2M50IpAQBSDsQQbmYIgTkAb3ZmCHEE2WEarwhabghCMwMNEQ5ICABlhncIEUAuNwBGfwgNVEonBx6ICAgyLjChcFEdAGcyiAgtiYKICAQxMA5uCgAsSTkRNkkxBDMzbToShggENjcW4xUQfXD9ixQumgEUtOFt3JioSpoBBKUKRtYADl4UDdoMLnVybE7yBw0fGfkeKQgQX2ZvcmMOlA12VwARQA5NF05gAC6DABEsFrQNNGNhc2UoZXEobGVuZ3Roeu8NHGNvbnRlbnQpFiMOACB+agARL0AtPkNvbHVtbiM0OCwgZGF0ZQHWBG1hfl0OCGNyZQEsoQ5ILCAlWS0lbS0lZCAlSDolaTolUx1WADl+iQDhUn4kAAhtZDVaIwAeMwsioRhaKAAxxjFiYRuStwGGKQAa4AtGTgAxzVEwWTkwaW1laV9saW1pdF9lbg5WDVJrCpJAAAFLDggJQqkKZj4ADRoAc+6CADpEABEPTn4BDRxKBwEMaGlnaBEfujoACG9sZPLAAC5BAAUMkoAAPssAAF9WHgI+JwA6+QEQYmxhY2tu+QFBUg77GDhkZXZpY2VfcGxhdGZvcm2SowBRbQFFDmkKamACLnkEfdD56y7OBQQPQyrjGhhPRkZwyKQOLnYFGEDifOCjBOtGEAcALkabAlp5ABA6ZGVzY1J+AChwnIMPIcN1ed7uIEpiAATfCHZjAP7YBf7YBULYBbVytX0INTQsFRYAORkWXvYEUWL+PQX+PQX+PQX+PQX+PQX+PQX+PQX+PQX+PQX+PQX+PQX+PQX+PQWaPQVSlARGEgU88PIPeP///////////wEYAQ==
Prev_stmt:
Query: SELECT b.url download_url,c.strategy_forced forced,b.version_name,b.version_code,CASE WHEN LENGTH(b.content)=0 THEN ' ' ELSE b.content END content , DATE_FORMAT(b.create_time,'%Y-%m-%d %H:%i:%S') create_time,b.size,b.md5,a.app_packname package_name,b.versionid package_id,b.versionid Versionid,b.appid Appid, c.strategy_imei_limit_enabled StrategyImeiLimitEnabled,c.strategy_imei_limit_imeis StrategyImeiLimitImeis,c.strategy_versions_limit_enabled StrategyVersionsLimitEnabled,c.strategy_versions_limit_versions StrategyVersionsLimitVersions,c.strategy_high_version StrategyHighVersion, c.strategy_holds_limit_enabled StrategyHoldsLimitEnabled,c.strategy_holds_limit_holds StrategyHoldsLimitHolds,c.version_strategy_id,c.strategy_imei_black_enabled,a.device_platform,c.strategy_appkey_enabled FROM std_appinfo a LEFT JOIN std_app_version b ON a.appid =b.appid LEFT JOIN std_version_strategy c ON b.versionid=c.versionid WHERE a.is_delete=0 AND b.is_delete=0 AND c.is_delete=0 AND b.is_availabled=0 AND c.is_availabled=0 AND a.app_packname = 'com.000.mlink' AND b.version_code>545 order by b.create_time desc ;
1 row in set (0.07 sec)
看了一下Plan_from_cache:为0呢
我之前是因为虚拟内存彻底关闭后导致了,后面设置vm.swappiness = 1 暂时没再发生了
这是什么,发错了吗?
你开 swap,只是为了让tidb 无内存可申请的时候用 swap。但是生产环境是关 swap 的,因为用 swap 性能抖动会非常厉害。
在哪里看出我这边使用了swap?