OOM问题排查,查看日志中的SQL内存耗用不大

【 TiDB 使用环境】生产环境
【 TiDB 版本】
【遇到的问题:问题现象及影响】

突然连接数连接数和内存使用暴涨起来了,查看oom日志,只有里面的语句内存使用只有几MB而已。两台TiDB同时OOM, 求大神们帮忙分析一下

看看慢查询和当时sql语句分析页面,按时间倒序

1 个赞

可以在日志中搜索下 expensive query 关键字看看, oom_record 有可能有时候没记录到

1 个赞
  • 问题 SQL 定位

    • 获取可能的问题 SQL,可以查看 INFORMATION_SCHEMA 中的 SLOW_QUERYCLUSTER_SLOW_QUERY 。
    • 也可以通过 tidb dashboard的 慢SQL和 SQL语句找到问题附近的 SQL并分析,定位到具体的问题 SQL,从 SQL 语句分析、慢查询的具体内容,可以详细查看内存使用量。

    • 同时 grep expensive_query tidb.log 辅助定位问题时间点的 SQL 情况。

看了一下,发现这条SQL在数据库状态正常的情况下,执行挺快的

好的,谢谢

好的 谢谢,我看看

在dashboards的sql语句分析,找出来这条sql,看看执行计划有个,每个执行计划执行的时间统计
举个例子


这个是吧?

感觉还是有点奇怪,确认是只有这1个expensive sql吗?因为看到plan cache的plan num也是同步上升的,这个sql有使用plan cache吧,那应该能重用plan cache才对呀,为什么plan num会在同时段升高呢?
能再看一下如下面板的指标吗?
performance overview–>Queries Using Plan Cache OPS
Tidb–>Plan Cache Miss OPS
Tidb–>Queries Using Plan Cache OPS

另外还可以看一下cluster_slow_query中该sql的plan_from_cache的值是什么,如果为0,则表示没有使用plan cache,则可能还有其他的硬解析SQL增加。

看起来没问题,时间拉长点看看


image

我们oom的时间是17:17


你好,请问是拉长这个时间吗

对呢,过滤了日志,OOM当天的只有1个expensive sql


Master|root@172.16.5.174|INFORMATION_SCHEMA>SELECT * FROM  cluster_slow_query WHERE Query LIKE 'SELECT b.url download_url,c.strategy_forced forced,b.version_name,b.version_code,CASE%' LIMIT 1\G
*************************** 1. row ***************************
                     INSTANCE: 172.16.5.220:10080
                         Time: 2024-11-19 10:10:19.949910
                 Txn_start_ts: 454028746531537035
                         User: mapgoo-paas
                         Host: 10.1.9.168
                      Conn_ID: 3368179780
                Session_alias: 
             Exec_retry_count: 0
              Exec_retry_time: 0
                   Query_time: 2.02739531
                   Parse_time: 0.000224666
                 Compile_time: 0.003303374
                 Rewrite_time: 0.000415837
           Preproc_subqueries: 0
      Preproc_subqueries_time: 0
                Optimize_time: 0.002614232
                      Wait_TS: 0.000013228
                Prewrite_time: 0
    Wait_prewrite_binlog_time: 0
                  Commit_time: 0
           Get_commit_ts_time: 0
          Commit_backoff_time: 0
                Backoff_types: 
            Resolve_lock_time: 0
        Local_latch_wait_time: 0
                   Write_keys: 0
                   Write_size: 0
              Prewrite_region: 0
                    Txn_retry: 0
                     Cop_time: 2.017427683
                 Process_time: 0.006957334
                    Wait_time: 0.009076069
                 Backoff_time: 0
                LockKeys_time: 0
                Request_count: 9
                   Total_keys: 312
                 Process_keys: 224
 Rocksdb_delete_skipped_count: 0
    Rocksdb_key_skipped_count: 224
Rocksdb_block_cache_hit_count: 1496
     Rocksdb_block_read_count: 0
      Rocksdb_block_read_byte: 0
                           DB: manager_admin
                  Index_names: [std_appinfo:idx_app_packname]
                  Is_internal: 0
                       Digest: 0a2afe584e4d2d6a51e2434119dd0ee4dc5a2d56efec67ac847125b08735c982
                        Stats: std_app_version:454028591237431600[3401;833],std_appinfo:451951293512810812[50;1],std_version_strategy:454028591486468255[5173;1949]
                 Cop_proc_avg: 0.000773037
                 Cop_proc_p90: 0.006950168
                 Cop_proc_max: 0.006950168
                Cop_proc_addr: 172.16.5.1:20160
                 Cop_wait_avg: 0.001008452
                 Cop_wait_p90: 0.006300307
                 Cop_wait_max: 0.006300307
                Cop_wait_addr: 172.16.5.1:20160
                      Mem_max: 987948
                     Disk_max: 0
                     KV_total: 2.017902085
                     PD_total: 0.000011824
                Backoff_total: 0
     Write_sql_response_total: 0.000519431
                  Result_rows: 236
                     Warnings: [{"Level":"Note","Message":"[b] remain after pruning paths for b given Prop{SortItems: [{manager_admin.std_app_version.versionid asc}], TaskTp: rootTask}","IsExtra":true},{"Level":"Note","Message":"[c,idx_versionid_is_delete] remain after pruning paths for c given Prop{SortItems: [{manager_admin.std_version_strategy.versionid asc}], TaskTp: rootTask}","IsExtra":true},{"Level":"Note","Message":"[b,idx_muti] remain after pruning paths for b given Prop{SortItems: [{manager_admin.std_app_version.appid asc}], TaskTp: rootTask}","IsExtra":true},{"Level":"Note","Message":"[b] remain after pruning paths for b given Prop{SortItems: [], TaskTp: rootTask}","IsExtra":true},{"Level":"Note","Message":"[c] remain after pruning paths for c given Prop{SortItems: [], TaskTp: rootTask}","IsExtra":true}]
               Backoff_Detail: 
                     Prepared: 0
                         Succ: 1
                IsExplicitTxn: 0
            IsWriteCacheTable: 0
              Plan_from_cache: 0
            Plan_from_binding: 0
             Has_more_results: 0
               Resource_group: default
            Request_unit_read: 6.782099614583331
           Request_unit_write: 0
            Time_queued_by_rc: 0
                  Plan_digest: 7db1f57a9e9200529f8d8736a7e1b00b8f45d35fb57f1b87dfdf6503f6df509a
                  Binary_plan: gkSACv1DCg1Qcm9qZWN0aW9uXzEyErs6CgdTb3J0XzEzEs05Nh4AYDUSxS4KEEluZGV4SGFzaEpvaW5fMjMSqh0+FQAYNjAS2AsKDgUqNExvb2tVcF85OBKCBAoRBRPwYVJhbmdlU2Nhbl85NRoBASFkfWisDx1sQCkgGutDY30IQDADOAJAAko2CjQKDW1hbmFnZXJfYWRtaW4SAWEaIAoQaWR4X2FwcF9wYWNrbmFtZRIMYXBwX3BhY2tuYW1lUj9yAWFQOlsiY29tLm1hcGdvby5tbGluayIsRhMA8GFdLCBrZWVwIG9yZGVyOmZhbHNlWhZ0aW1lOjY0MS40wrVzLCBsb29wczozYsoBY29wX3Rhc2s6IHtudW06IDEsIG1heDogNTU4LjbCtXMsIHByb2Nfa2V5czogMCwgdG90XwESGDogNDc2bnMJESB3YWl0OiAxMTUNXwxycGNfEU4BDAV+FCA1MjMuOAV/gGNvcHJfY2FjaGVfaGl0X3JhdGlvOiAxLjAwLCBidWlsZAWTCF9kdQUaGG46IDI4LjUFOoBtYXhfZGlzdHNxbF9jb25jdXJyZW5jeTogMX1qXnRpa3YFOgQ6ewVyBDRtFesQMX0sIHMhoVBkZXRhaWw6IHtnZXRfc25hcHNob3QNoQg5LjAJoDRyb2Nrc2RiOiB7YmxvYyEcEH19fXD/EQEEAXgRCiT/ARKNBQoMU2VsTXFAOTcSiAEKEVRhYmxlUm93SURJFiA2IS54OQUvU49CEwIIFAoSRhMCBFIQPsIBBGodPt0AADgu3QBWnABYGgECIXw5BS+nnI9AKW/T4fov/gBAMAFBixBSamVxKDKKAgwuc3RkQYQYaW5mby5hcFmMBCwgRmQCCCksIHJAADhpc19kZWxldGUsIDApWhMpjgQuMRmzCDJi32J7Agg0LjAFKkFnBF9rRnkCCDk3NDZ5AggzOC5Jq0FsRngCDDMuOTkBegBjvnYCBDMwgnYCZS4YX2V4dHJhX4KQAjKzAYqQAgQxMRHYppACgX8YbmMYWjGbVTL0ARABQAFaFCWICDUuMH0ZEDNi8QFpoQFpORgge3RvdGFsbREQNjU2LjlleEBmZXRjaF9oYW5kbGU6IDY0NW3KZa4MOiA3NiWTjQgkMC4zwrVzfSwgdGEHTlkADDQuMjMhkzGtMjsBKDh9LCBuZXh0OiB7AVEAXwWfDDogNzmRvgUWCVwYbG9va3VwXxGHCDQuMgWvRiIAPHJlc3A6IDQuMTNtc31woBIuwwMEvg422wUQNTcS7AVC2wUENTQhXEwwPTPPrvPTQCkoE0bpgTyDQDDuAmFRDEo9CjtCyQMQYhonChOt3AhpZF91FiASBWFwcGlkEgkVEgxS0gFypeQwIGRlY2lkZWQgYnkgW15kAxxfdmVyc2lvbmGoBGlkQYTdUz7JAwhpZClirQMVSRWJYbEoIG5vdChpc251bGxWJAQVNw2IDGlkKSlOdwZp+QgxLjUh7ABszXQA5kL5AwAyAcIUeDogODYwTW8IbWluwTAIOS4xJc8YYXZnOiA2OckDCHA5NS4rAAB01ZEIMS45qb8BE0ECATRNqJUaBDIsBQxlCgwgMS4zBZvCGgQEMjURqWqQBgSSAb20gc3BvsmVAeBlAQXaADMpIQhwODAJHghwOTUJCRhpdGVyczo2YUkQc2tzOjKONQQMMjEuNyU1qjUEDBKPBgrdxQw1NhK/RsUGJDU1ISrk0C5cBf9G/QJOxgYAYkrGBgBTThIBADGFjyETADZBKCUTBDEwAQshEgkeIRMJCkIUAWb8BkjmKNR3sjv/QCmIdV6vIodmQDBoYasEUnyCWAMcaXNfYXZhaWyBwABkYRMgLCBndChjYXN0lhEDaF9jb2RlLCBkb3VibGUgQklOQVJZKSwgNTQ1Ka2GCDEuOAXuaRYIMmLgYhYDDDEuNTchDSEhDCAxLjLFzCklBC40RUUhHRUnQf1BbQUSPhADCDU4NaWcaQWlfjIRAwQyLsWkwhADBDU3DVdiEAMMMn1qlFIQAynWAd0lEAnZpv4BikcHADIV9K5HByACIeYO0qcM470yPwL5RwgzLjlFuDHBBOoBxahO7gYIMS41BSs2RQcIMS41Jb7NvAQ5NBJPC4m4CDIuOA6pCwB9YfoEYmxSQwcIMi4xRQolxQQxLJJDBwAxDbQBZVJBBwg2LjGpZUYiAOlBCDEuOeFACJD7Av1B4RgYJheqS7aD0D4tAThSzwFpbm5lciBqb2luLCAFDAA6EjQNGkcNNDU3LCBvdXRlciBrZXk6cpIKweoFQWYrANWZ5WAOpQoIdWFsIRsEZDpe9AMe8AoBXlJ/ADZUAB7OCgQ5LkWvMf4AaQWZADoWPgkMOjQuMmGyLtYIHDgsIHRhc2s6Kaokc3RydWN0OjYuMamsRRoMOjQuMF0SCDYuNCmMITsMOjE2MMXLEH1wn/UKPXwIEukNNr0IEDIwEuAGQr0IBDE3IaVI+XdyzXJgYEApwNDAJbHudEAwhha9CAhFCkNCvQgQYxovChcOvQj15SrBCAAJFRUAEirFCACzosUIDT8gX3N0cmF0ZWd5JkoIDgwISs4IETIZKVrWCEJaADLbCGamCAwyLjAxIi8OCDNi/2KQBQ0pgbIMIDIuM6UxibYOLQ8OnQgRIgRheKWKGsQMDDIyNCyhq0IUALmyCDYuORKQCAESYYIIOiA4GikLVsAIDWlWrwUIMC41Xk8PBDE1GlkJAaNeTw8EhAJOrQcIMnMsBfupoxn4BDgwBRshAAEj6aYANUn1TroIFmcMwakIZXNzLiIBRhkALF9zaXplOiAxMjMyMBJwEARhbA0xEDMxMiwgSskPADMeaRC58kRrZXlfc2tpcHBlZF9jb3VudDopkR7hDyaOEA0eDDE0OTZm9g8ExwQuMQkQMTkSiQFC9g8kMTghDPG8kXXuokZxA04xCQBjSjEJQvcPnXYANGr7CEhpE5UAnPmiQClAAg1TqZxxQDDsFvgPADdSCAVONQM2AQkitwgAMhKNCJH4AMlCJwMWEw8QOiA5NjYaiQoh5S2xKcAl9xA6IDkwMTrIDwA5GnAPVe4AMRapCBauCwg5NDga3w1W8QJqQBIEMzQNjG7xAkJAEjJsAYppCAAxHr8MWXdRX2JKAg6kERg9rI0VlpVkNq0BDmoIFkANbeYxcwDpZmkIDSrFTiKtDw0VGmcIBDc5JXWJXgAzyYBuZwgIMS40heKuZwgNcgFkUmYICDEwMhovCkYjAB5nCAQzMg5oCAiYvgvd7BwhMU5x3k2M50IpAQBSDsQQbmYIgTkAb3ZmCHEE2WEarwhabghCMwMNEQ5ICABlhncIEUAuNwBGfwgNVEonBx6ICAgyLjChcFEdAGcyiAgtiYKICAQxMA5uCgAsSTkRNkkxBDMzbToShggENjcW4xUQfXD9ixQumgEUtOFt3JioSpoBBKUKRtYADl4UDdoMLnVybE7yBw0fGfkeKQgQX2ZvcmMOlA12VwARQA5NF05gAC6DABEsFrQNNGNhc2UoZXEobGVuZ3Roeu8NHGNvbnRlbnQpFiMOACB+agARL0AtPkNvbHVtbiM0OCwgZGF0ZQHWBG1hfl0OCGNyZQEsoQ5ILCAlWS0lbS0lZCAlSDolaTolUx1WADl+iQDhUn4kAAhtZDVaIwAeMwsioRhaKAAxxjFiYRuStwGGKQAa4AtGTgAxzVEwWTkwaW1laV9saW1pdF9lbg5WDVJrCpJAAAFLDggJQqkKZj4ADRoAc+6CADpEABEPTn4BDRxKBwEMaGlnaBEfujoACG9sZPLAAC5BAAUMkoAAPssAAF9WHgI+JwA6+QEQYmxhY2tu+QFBUg77GDhkZXZpY2VfcGxhdGZvcm2SowBRbQFFDmkKamACLnkEfdD56y7OBQQPQyrjGhhPRkZwyKQOLnYFGEDifOCjBOtGEAcALkabAlp5ABA6ZGVzY1J+AChwnIMPIcN1ed7uIEpiAATfCHZjAP7YBf7YBULYBbVytX0INTQsFRYAORkWXvYEUWL+PQX+PQX+PQX+PQX+PQX+PQX+PQX+PQX+PQX+PQX+PQX+PQX+PQWaPQVSlARGEgU88PIPeP///////////wEYAQ==
                    Prev_stmt: 
                        Query: SELECT b.url download_url,c.strategy_forced forced,b.version_name,b.version_code,CASE WHEN LENGTH(b.content)=0 THEN ' ' ELSE b.content END content ,    DATE_FORMAT(b.create_time,'%Y-%m-%d %H:%i:%S') create_time,b.size,b.md5,a.app_packname package_name,b.versionid package_id,b.versionid Versionid,b.appid Appid,    c.strategy_imei_limit_enabled StrategyImeiLimitEnabled,c.strategy_imei_limit_imeis StrategyImeiLimitImeis,c.strategy_versions_limit_enabled StrategyVersionsLimitEnabled,c.strategy_versions_limit_versions StrategyVersionsLimitVersions,c.strategy_high_version StrategyHighVersion,    c.strategy_holds_limit_enabled StrategyHoldsLimitEnabled,c.strategy_holds_limit_holds StrategyHoldsLimitHolds,c.version_strategy_id,c.strategy_imei_black_enabled,a.device_platform,c.strategy_appkey_enabled   FROM std_appinfo a   LEFT JOIN std_app_version b ON a.appid =b.appid   LEFT JOIN std_version_strategy c ON b.versionid=c.versionid WHERE a.is_delete=0 AND b.is_delete=0 AND c.is_delete=0      AND b.is_availabled=0      AND c.is_availabled=0  AND a.app_packname = 'com.000.mlink'  AND b.version_code>545  order by b.create_time desc ;
1 row in set (0.07 sec)

看了一下Plan_from_cache:为0呢

我之前是因为虚拟内存彻底关闭后导致了,后面设置vm.swappiness = 1 暂时没再发生了

这是什么,发错了吗?

你开 swap,只是为了让tidb 无内存可申请的时候用 swap。但是生产环境是关 swap 的,因为用 swap 性能抖动会非常厉害。

在哪里看出我这边使用了swap?