TiDB keeps restarting in a loop because of out-of-memory errors

goroutine 479 [select]:
github.com/pingcap/tidb/domain.(*Domain).handleEvolvePlanTasksLoop.func1(0xc0003d3440, 0x3707560, 0xc00270c200)
/home/jenkins/agent/workspace/tidb_v4.0.0/go/src/github.com/pingcap/tidb/domain/domain.go:913 +0x1c1
created by github.com/pingcap/tidb/domain.(*Domain).handleEvolvePlanTasksLoop
/home/jenkins/agent/workspace/tidb_v4.0.0/go/src/github.com/pingcap/tidb/domain/domain.go:908 +0x73

goroutine 464 [select, 3 minutes]:
go.etcd.io/etcd/clientv3.(*lessor).keepAliveCtxCloser({"level":"warn","ts":"2022-08-09T11:41:47.308+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-e1f52210-d6f8-495d-a667-b2c01ae462e3/192.168.0.41:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2022-08-09T11:41:48.196+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-48902fbb-0317-4aa5-815b-a8c309ad3ed1/192.168.0.41:2379","attempt":0,"error":"rpc error: code = Unavailable desc = transport is closing"}
{"level":"warn","ts":"2022-08-09T11:41:48.312+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-e1f52210-d6f8-495d-a667-b2c01ae462e3/192.168.0.41:2379","attempt":0,"error":"rpc error: code = Unavailable desc = transport is closing"}
{"level":"warn","ts":"2022-08-09T11:57:42.942+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-b7122453-edc5-401e-9c15-f3b19a6957d0/192.168.0.41:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2022-08-09T11:57:50.062+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-b318aef7-2b8e-4d96-8c70-01cc8176f889/192.168.0.41:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2022-08-09T11:57:51.967+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-b318aef7-2b8e-4d96-8c70-01cc8176f889/192.168.0.41:2379","attempt":0,"error":"rpc error: code = Unavailable desc = transport is closing"}
{"level":"warn","ts":"2022-08-09T12:01:54.049+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-804d5f0a-1041-4a8f-b792-604a99da1e6e/192.168.0.41:2379","attempt":0,"error":"rpc error: code = DeadlineExceeded desc = context deadline exceeded"}
{"level":"warn","ts":"2022-08-09T12:01:54.934+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-68b7b320-1768-45ce-ae37-2607d87a9a83/192.168.0.41:2379","attempt":0,"error":"rpc error: code = Unavailable desc = transport is closing"}
{"level":"warn","ts":"2022-08-09T12:01:55.055+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-804d5f0a-1041-4a8f-b792-604a99da1e6e/192.168.0.41:2379","attempt":0,"error":"rpc error: code = Unavailable desc = transport is closing"}
fatal error: runtime: out of memory

runtime stack:
fatal error: runtime: out of memory

runtime stack:
runtime.throw(0x318b551, 0x16)
/usr/local/go/src/runtime/panic.go:774 +0x72
runtime.sysMap(0xc720000000, 0x4000000, 0x51aae98)
/usr/local/go/src/runtime/mem_linux.go:169 +0xc5
runtime.(*mheap).sysAlloc(0x5190d60, 0x30000, 0xffffffff010dd5c9, 0x29dbff6)
/usr/local/go/src/runtime/malloc.go:701 +0x1cd
runtime.(*mheap).grow(0x5190d60, 0x18, 0xffffffff)
/usr/local/go/src/runtime/mheap.go:1252 +0x42
runtime.(*mheap).allocSpanLocked(0x5190d60, 0x18, 0x51aaea8, 0x4d15320)
/usr/local/go/src/runtime/mheap.go:1163 +0x291
runtime.(*mheap).alloc_m(0x5190d60, 0x18, 0x101, 0x0)
/usr/local/go/src/runtime/mheap.go:1015 +0xc2
runtime.(*mheap).alloc.func1()
/usr/local/go/src/runtime/mheap.go:1086 +0x4c
runtime.(*mheap).alloc(0x5190d60, 0x18, 0x7fdba5000101, 0x112b580)
/usr/local/go/src/runtime/mheap.go:1085 +0x8a
runtime.largeAlloc(0x30000, 0x1150100, 0xc71ffd4000)
/usr/local/go/src/runtime/malloc.go:1138 +0x97
runtime.mallocgc.func1()
/usr/local/go/src/runtime/malloc.go:1033 +0x46
runtime.systemstack(0x0)
/usr/local/go/src/runtime/asm_amd64.s:370 +0x66
runtime.mstart()
/usr/local/go/src/runtime/proc.go:1146

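TiDB keeps running out of memory and restarting, roughly every 3 minutes. I really can't find a fix. Could someone point me in the right direction?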

Is SPM automatic plan evolution turned on? Check how the tidb_evolve_plan_baselines variable is set.


That variable is off.

tidb_allow_remove_auto_inc 0
tidb_auto_analyze_end_time 23:59 +0000
tidb_auto_analyze_ratio 0.5
tidb_auto_analyze_start_time 00:00 +0000
tidb_backoff_lock_fast 100
tidb_backoff_weight 2
tidb_batch_commit 0
tidb_batch_delete 0
tidb_batch_insert 0
tidb_build_stats_concurrency 4
tidb_capture_plan_baselines off
tidb_check_mb4_value_in_utf8 1
tidb_checksum_table_concurrency 4
tidb_config
tidb_constraint_check_in_place 0
tidb_current_ts 0
tidb_ddl_error_count_limit 512
tidb_ddl_reorg_batch_size 256
tidb_ddl_reorg_priority PRIORITY_LOW
tidb_ddl_reorg_worker_cnt 4
tidb_disable_txn_auto_retry 1
tidb_distsql_scan_concurrency 15
tidb_dml_batch_size 20000
tidb_enable_cascades_planner 0
tidb_enable_chunk_rpc 1
tidb_enable_fast_analyze 0
tidb_enable_index_merge 0
tidb_enable_noop_functions 0
tidb_enable_radix_join 0
tidb_enable_slow_log 1
tidb_enable_stmt_summary 1
tidb_enable_streaming 0
tidb_enable_table_partition on
tidb_enable_vectorized_expression 1
tidb_enable_window_function 1
tidb_evolve_plan_baselines off
tidb_evolve_plan_task_end_time 23:59 +0000
tidb_evolve_plan_task_max_time 600
tidb_evolve_plan_task_start_time 00:00 +0000
tidb_expensive_query_time_threshold 60
tidb_force_priority NO_PRIORITY
tidb_general_log 0
tidb_hash_join_concurrency 5
tidb_hashagg_final_concurrency 4
tidb_hashagg_partial_concurrency 4
tidb_index_join_batch_size 25000
tidb_index_lookup_concurrency 4
tidb_index_lookup_join_concurrency 4
tidb_index_lookup_size 20000
tidb_index_serial_scan_concurrency 1
tidb_init_chunk_size 32
tidb_isolation_read_engines tikv, tiflash, tidb
tidb_low_resolution_tso 0
tidb_max_chunk_size 1024
tidb_max_delta_schema_count 1024
tidb_mem_quota_hashjoin 34359738368
tidb_mem_quota_indexlookupjoin 34359738368
tidb_mem_quota_indexlookupreader 34359738368
tidb_mem_quota_mergejoin 34359738368
tidb_mem_quota_nestedloopapply 34359738368
tidb_mem_quota_query 1073741824
tidb_mem_quota_sort 34359738368
tidb_mem_quota_topn 34359738368
tidb_metric_query_range_duration 60
tidb_metric_query_step 60
tidb_opt_agg_push_down 0
tidb_opt_concurrency_factor 3
tidb_opt_copcpu_factor 3
tidb_opt_correlation_exp_factor 1
tidb_opt_correlation_threshold 0.9
tidb_opt_cpu_factor 3
tidb_opt_desc_factor 3
tidb_opt_disk_factor 1.5
tidb_opt_distinct_agg_push_down 0
tidb_opt_insubq_to_join_and_agg 1
tidb_opt_join_reorder_threshold 0
tidb_opt_memory_factor 0.001
tidb_opt_network_factor 1
tidb_opt_scan_factor 1.5
tidb_opt_seek_factor 20
tidb_opt_write_row_id 0
tidb_optimizer_selectivity_level 0
tidb_pprof_sql_cpu 0
tidb_projection_concurrency 4
tidb_query_log_max_len 4096
tidb_record_plan_in_slow_log 1
tidb_replica_read leader
tidb_retry_limit 10
tidb_row_format_version 2
tidb_scatter_region 0
tidb_skip_isolation_level_check 0
tidb_skip_utf8_check 0
tidb_slow_log_threshold 300
tidb_slow_query_file log/tidb_slow_query.log
tidb_snapshot
tidb_stmt_summary_history_size 24
tidb_stmt_summary_internal_query 0
tidb_stmt_summary_max_sql_length 4096
tidb_stmt_summary_max_stmt_count 200
tidb_stmt_summary_refresh_interval 1800
tidb_store_limit 0
tidb_txn_mode pessimistic
tidb_use_plan_baselines on
tidb_wait_split_region_finish 1
tidb_wait_split_region_timeout 300
tidb_window_concurrency 4
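
For scale, the byte values in the list above can be converted to GiB. A minimal Python sketch: the variable names and values are copied from the output above, and the helper name `to_gib` is made up for illustration.

```python
# Values copied from the variable list above (all in bytes).
QUOTAS = {
    "tidb_mem_quota_query": 1073741824,
    "tidb_mem_quota_hashjoin": 34359738368,
    "tidb_mem_quota_sort": 34359738368,
}

GIB = 1024 ** 3


def to_gib(n_bytes: int) -> float:
    """Convert a byte count to GiB."""
    return n_bytes / GIB


for name, value in QUOTAS.items():
    print(f"{name}: {to_gib(value):g} GiB")
```

So tidb_mem_quota_query works out to 1 GiB, while the per-operator quotas work out to 32 GiB each, the same figure as the 32G of RAM reported for each server in this thread.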

Restart-loop log:
Aug 10 08:56:29 ecs-d3f7-0001 kernel: lowmem_reserve[]: 0 0 0 0
Aug 10 08:56:29 ecs-d3f7-0001 kernel: Node 0 DMA: 1*4kB (U) 0*8kB 0*16kB 1*32kB (U) 2*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB (U) 1*2048kB (M) 3*4096kB (M) = 15908kB
Aug 10 08:56:29 ecs-d3f7-0001 kernel: Node 0 DMA32: 1453*4kB (UEM) 782*8kB (UEM) 330*16kB (UE) 131*32kB (UE) 58*64kB (UE) 22*128kB (U) 7*256kB (U) 9*512kB (UM) 86*1024kB (UM) 0*2048kB 0*4096kB = 122532kB
Aug 10 08:56:29 ecs-d3f7-0001 kernel: Node 0 Normal: 2689*4kB (UEM) 676*8kB (UEM) 524*16kB (UEM) 190*32kB (UEM) 56*64kB (UEM) 35*128kB (UEM) 11*256kB (U) 2*512kB (U) 20*1024kB (M) 0*2048kB 0*4096kB = 63012kB
Aug 10 08:56:29 ecs-d3f7-0001 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=1048576kB
Aug 10 08:56:29 ecs-d3f7-0001 kernel: Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Aug 10 08:56:29 ecs-d3f7-0001 kernel: 2860 total pagecache pages
Aug 10 08:56:29 ecs-d3f7-0001 kernel: 0 pages in swap cache
Aug 10 08:56:29 ecs-d3f7-0001 kernel: Swap cache stats: add 0, delete 0, find 0/0
Aug 10 08:56:29 ecs-d3f7-0001 kernel: Free swap = 0kB
Aug 10 08:56:29 ecs-d3f7-0001 kernel: Total swap = 0kB
Aug 10 08:56:29 ecs-d3f7-0001 kernel: 8388382 pages RAM
Aug 10 08:56:29 ecs-d3f7-0001 kernel: 0 pages HighMem/MovableOnly
Aug 10 08:56:29 ecs-d3f7-0001 kernel: 193862 pages reserved
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ pid ] uid tgid total_vm rss nr_ptes swapents oom_score_adj name
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 984] 0 984 26254 2464 29 0 0 systemd-journal
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1017] 0 1017 11184 140 24 0 -1000 systemd-udevd
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1051] 0 1051 13882 112 28 0 -1000 auditd
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1711] 81 1711 14692 273 34 0 -900 dbus-daemon
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1715] 0 1715 699273 4802 101 0 0 uniagent
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1716] 999 1716 153061 2411 62 0 0 polkitd
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1717] 0 1717 5419 76 16 0 0 irqbalance
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1718] 0 1718 6595 74 18 0 0 systemd-logind
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1720] 0 1720 136959 1027 84 0 0 NetworkManager
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1722] 998 1722 29451 109 30 0 0 chronyd
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1745] 0 1745 143558 2896 97 0 0 tuned
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1746] 0 1746 46403 6939 28 0 0 node_exporter
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1749] 0 1749 33012 7866 56 0 0 blackbox_export
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1750] 0 1750 3349767 218394 634 0 0 pd-server
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1752] 0 1752 28296 47 11 0 0 run_blackbox_ex
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1753] 0 1753 28296 46 11 0 0 run_node_export
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1755] 0 1755 26989 26 10 0 0 tee
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1756] 0 1756 26989 26 9 0 0 tee
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1808] 0 1808 25724 513 51 0 0 dhclient
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 1856] 0 1856 54637 1247 40 0 0 rsyslogd
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 2169] 0 2169 30522 152 11 0 0 wrapper
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 2398] 0 2398 22425 259 42 0 0 master
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 2403] 89 2403 22468 252 45 0 0 qmgr
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 2493] 0 2493 3135972 40789 171 0 0 java
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 2730] 0 2730 28230 256 57 0 -1000 sshd
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 2733] 0 2733 31573 155 18 0 0 crond
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 2735] 0 2735 6477 52 17 0 0 atd
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 2742] 0 2742 27527 33 10 0 0 agetty
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 2743] 0 2743 27527 34 9 0 0 agetty
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 2799] 0 2799 12261 155 26 0 0 hostguard
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 2801] 0 2801 311790 1865 70 0 0 hostguard
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 2945] 0 2945 455062 5898 68 0 0 telescope
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 3266] 0 3266 39329 348 79 0 0 sshd
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [ 3268] 0 3268 28862 101 13 0 0 bash
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [11347] 89 11347 22451 251 45 0 0 pickup
Aug 10 08:56:29 ecs-d3f7-0001 kernel: [11698] 0 11698 8175881 7730983 15297 0 0 tidb-server
Aug 10 08:56:29 ecs-d3f7-0001 kernel: Out of memory: Kill process 11698 (tidb-server) score 916 or sacrifice child
Aug 10 08:56:29 ecs-d3f7-0001 kernel: Killed process 11698 (tidb-server), UID 0, total-vm:32703524kB, anon-rss:30923932kB, file-rss:0kB, shmem-rss:0kB
Aug 10 08:56:30 ecs-d3f7-0001 systemd: tidb-4000.service: main process exited, code=killed, status=9/KILL
Aug 10 08:56:30 ecs-d3f7-0001 systemd: Unit tidb-4000.service entered failed state.
Aug 10 08:56:30 ecs-d3f7-0001 systemd: tidb-4000.service failed.
Aug 10 08:56:45 ecs-d3f7-0001 systemd: tidb-4000.service holdoff time over, scheduling restart.
Aug 10 08:56:45 ecs-d3f7-0001 systemd: Stopped tidb service.
Aug 10 08:56:45 ecs-d3f7-0001 systemd: Started tidb service.
Aug 10 09:00:41 ecs-d3f7-0001 systemd: tidb-4000.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Aug 10 09:00:41 ecs-d3f7-0001 systemd: Unit tidb-4000.service entered failed state.
Aug 10 09:00:41 ecs-d3f7-0001 systemd: tidb-4000.service failed.
Aug 10 09:00:56 ecs-d3f7-0001 systemd: tidb-4000.service holdoff time over, scheduling restart.
Aug 10 09:00:56 ecs-d3f7-0001 systemd: Stopped tidb service.
Aug 10 09:00:56 ecs-d3f7-0001 systemd: Started tidb service.
Aug 10 09:01:01 ecs-d3f7-0001 systemd: Started Session 15 of user root.
Aug 10 09:04:36 ecs-d3f7-0001 systemd: tidb-4000.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Aug 10 09:04:36 ecs-d3f7-0001 systemd: Unit tidb-4000.service entered failed state.
Aug 10 09:04:36 ecs-d3f7-0001 systemd: tidb-4000.service failed.
Aug 10 09:04:51 ecs-d3f7-0001 systemd: tidb-4000.service holdoff time over, scheduling restart.
Aug 10 09:04:51 ecs-d3f7-0001 systemd: Stopped tidb service.
Aug 10 09:04:51 ecs-d3f7-0001 systemd: Started tidb service.
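
The rss column in the kernel's process table above is counted in pages, not bytes. A small Python sketch of the conversion, assuming 4 KiB pages (typical for x86-64 Linux; the assumption can be checked against the anon-rss figure in the "Killed process" line). The function name is made up for illustration.

```python
PAGE_KIB = 4  # assumption: 4 KiB pages


def pages_to_kib(pages: int) -> int:
    """Convert a page count from the OOM report to KiB."""
    return pages * PAGE_KIB


tidb_rss_pages = 7730983   # rss column for tidb-server (pid 11698)
total_ram_pages = 8388382  # from the "8388382 pages RAM" line

tidb_kib = pages_to_kib(tidb_rss_pages)
total_kib = pages_to_kib(total_ram_pages)

print(tidb_kib)  # 30923932, equal to anon-rss in the kill message
print(f"{tidb_kib / total_kib:.1%} of RAM held by tidb-server")
```

Under that assumption, tidb-server alone was holding roughly 92% of the machine's memory when the kernel killed it.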

That is all of the configuration information. Any guidance would be much appreciated :pray::pray::pray:

Which version?
What are the steps to reproduce the problem?

Version v4.0.0.
Steps:


Error log: identical to the goroutine dump, etcd client warnings, and out-of-memory runtime stack in the first post.

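When memory usage gets high, tidb-server restarts automatically (on a random one of the three servers).
Server configuration:
pd_servers and tidb_servers: 16 cores, 32 GB RAM, 500 GB SSD (3 servers)
tikv_servers: 16 cores, 32 GB RAM, 1 TB SSD (3 servers)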

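Take a look at the thread below:

The version you are running is quite old, so please upgrade to a newer release if at all possible.

Is the error log you posted complete?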

tidb_stderr.log (10.0 MB)

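Common causes of OOM:
1. Large data volumes or high concurrency drive memory usage too high.
2. Check whether slow SQL statements touching large amounts of data ran before the OOM. You can look at the slow queries in the Dashboard, and check STATEMENTS_SUMMARY and STATEMENTS_SUMMARY_HISTORY, aggregating (sum) on max_mem to see which statements use the most memory.
3. analyze_version=2: adjust the parameter value and, following the steps in the link below, delete the existing version=2 statistics.
https://docs.pingcap.com/zh/tidb/stable/statistics#统计信息简介

4. Other bugs.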
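
Point 2 above could be approached along these lines. This is only a sketch: the SQL assumes the information_schema.statements_summary table with DIGEST_TEXT, EXEC_COUNT, AVG_MEM and MAX_MEM columns (please verify against your TiDB version), the helper name is made up, and the rows are invented sample data, not taken from this cluster.

```python
# Query text to run with any MySQL-compatible client.
# Table and column names are assumptions; check them on your version.
TOP_MEM_SQL = """
SELECT digest_text, exec_count, avg_mem, max_mem
FROM information_schema.statements_summary
ORDER BY max_mem DESC
LIMIT 10
"""


def top_by_max_mem(rows, n=3):
    """Return the n rows with the largest max_mem, largest first."""
    return sorted(rows, key=lambda r: r["max_mem"], reverse=True)[:n]


# Hypothetical rows shaped like the query result (max_mem in bytes).
sample = [
    {"digest_text": "select * from t1 where a = ?", "max_mem": 4096},
    {"digest_text": "select * from t1 join t2 on ...", "max_mem": 8_000_000_000},
    {"digest_text": "insert into t3 values (...)", "max_mem": 1024},
]

for row in top_by_max_mem(sample, n=2):
    print(row["max_mem"], row["digest_text"])
```

Sorting in SQL is usually enough; the Python helper only shows the same ranking applied to rows that have already been exported, for example from STATEMENTS_SUMMARY_HISTORY.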

Could you try changing this configuration item to 0?

Set it to 0 on every tidb-server.

[performance]
feedback-probability = 0.0

I added the configuration, but the frequent restarts still happen.

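  1. Please capture a DEBUG package shortly before a restart happens. It can be fetched from http://{TiDBIP}:10080/debug/zip?seconds=60
  2. Please post the trend of the process memory in the memory usage panel, under server in the tidb group, before and after the OOM.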
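
Since the process only lives for a few minutes, fetching the debug package from a script may be easier than doing it by hand. A minimal Python sketch using only the standard library: the URL shape is the one given in step 1, while the function names, the output path, and the 127.0.0.1 address are placeholders for illustration.

```python
from urllib.parse import urlencode
from urllib.request import urlretrieve


def debug_zip_url(tidb_ip: str, seconds: int = 60, port: int = 10080) -> str:
    """Build the /debug/zip URL for one tidb-server status port."""
    query = urlencode({"seconds": seconds})
    return f"http://{tidb_ip}:{port}/debug/zip?{query}"


def fetch_debug_zip(tidb_ip: str, out_path: str, seconds: int = 60) -> str:
    """Download the debug package to out_path.

    The request may take roughly `seconds` to complete, so start it
    early enough before the expected restart.
    """
    urlretrieve(debug_zip_url(tidb_ip, seconds), out_path)
    return out_path


# Placeholder address; replace with the IP of the tidb-server to inspect.
print(debug_zip_url("127.0.0.1"))
```

Calling `fetch_debug_zip("127.0.0.1", "tidb_debug.zip")` would then save the package locally, assuming the status port is reachable from where the script runs.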