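【TiDB environment】Production
【TiDB version】v6.5.0
【Problem: symptoms and impact】
A single TiDB node restarted after its memory usage had been high for some time. It does not look like it was killed by the oom-killer; it appears to have exited on its own. The cluster does not use resource_control. Does TiDB exit on its own when there is no external limit on its memory?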
journalctl -u tidb-4000
Oct 25 08:03:08 i-tjdhcwxv systemd[1]: Started tidb service.
Nov 13 11:09:42 i-tjdhcwxv systemd[1]: tidb-4000.service: main process exited, code=exited, status=2/INVALIDARGUMENT
Nov 13 11:09:42 i-tjdhcwxv systemd[1]: Unit tidb-4000.service entered failed state.
Nov 13 11:09:42 i-tjdhcwxv systemd[1]: tidb-4000.service failed.
Nov 13 11:09:57 i-tjdhcwxv systemd[1]: tidb-4000.service holdoff time over, scheduling restart.
Nov 13 11:09:57 i-tjdhcwxv systemd[1]: Stopped tidb service.
Nov 13 11:09:57 i-tjdhcwxv systemd[1]: Started tidb service.
less -S /tidb-deploy/tidb-4000/log/tidb.log
[2024/11/13 06:14:46.565 +08:00] [ERROR] [systime_mon.go:34] ["system time jump backward"] [last=1731449686569834829]
[2024/11/13 10:04:48.426 +08:00] [ERROR] [domain.go:1966] ["handle ddl event failed"] [event="(Event Type: truncate table, Table ID: 1577487, Table Name mr_xs_sale_acct_custmanager"] [error="[kv:1062]Duplicate entry '1577487-0-1' for key 'stats_histograms.tbl'"]
[2024/11/13 11:09:57.652 +08:00] [INFO] [printer.go:34] ["Welcome to TiDB."] ["Release Version"=v6.5.0] [Edition=Community] ["Git Commit Hash"=706c3fa3c526cdba5b3e9f066b1a568fb96c56e3] ["Git Branch"=heads/refs/tags/v6.5.0] ["UTC Build Time"="2022-12-27 03:50:44"] [GoV
[2024/11/13 11:09:57.654 +08:00] [INFO] [printer.go:48] ["loaded config"] [config="{\"host\":\"0.0.0.0\",\"advertise-address\":\"10.120.33.68\",\"port\":4000,\"cors\":\"\",\"store\":\"tikv\",\"path\":\"10.120.33.68:2379,10.120.33.42:2379,10.120.33.151:2379\",\"socket\
[2024/11/13 14:15:12.942 +08:00] [ERROR] [distsql.go:1331] ["table reader fetch next chunk failed"] [conn=7718824514661921075] [error="loadRegion from PD failed, key: \"74800000000003A0B45F728000000000170E7C\", err: rpc error: code = Canceled desc = context canceled"]
[2024/11/13 14:15:12.969 +08:00] [ERROR] [distsql.go:1331] ["table reader fetch next chunk failed"] [conn=7718824514661921075] [error="loadRegion from PD failed, key: \"74800000000003A0B45F728000000000171052\", err: rpc error: code = Canceled desc = context canceled"]
[2024/11/13 14:16:36.936 +08:00] [ERROR] [systime_mon.go:34] ["system time jump backward"] [last=1731478596955239458]
[2024/11/13 17:17:14.929 +08:00] [ERROR] [systime_mon.go:34] ["system time jump backward"] [last=1731489434936772496]
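A side note on the "system time jump backward" entries: the `last=` value looks like a Unix timestamp in nanoseconds. The sketch below decodes it on that assumption so it can be compared with the timestamp of the log line itself. `decodeLast` is a hypothetical helper name, not something from TiDB.

```go
package main

import (
	"fmt"
	"time"
)

// decodeLast interprets the last= field as Unix nanoseconds (an assumption)
// and renders it in UTC+8 using the same layout as the tidb.log timestamps.
func decodeLast(ns int64) string {
	utc8 := time.FixedZone("UTC+8", 8*60*60)
	return time.Unix(0, ns).In(utc8).Format("2006/01/02 15:04:05.000")
}

func main() {
	// Value taken from the 06:14:46.565 log entry.
	fmt.Println(decodeLast(1731449686569834829)) // prints 2024/11/13 06:14:46.569
}
```

The decoded value is a few milliseconds later than the 06:14:46.565 timestamp on the log line, which suggests these entries reflect small clock adjustments rather than anything related to the restart at 11:09.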
less -S /tidb-deploy/tidb-4000/log/tidb_stderr.log
panic: Out Of Memory Quota![conn_id=5393317839498368223]
goroutine 1452969363 [running]:
github.com/pingcap/tidb/util/memory.(*PanicOnExceed).Action(0xc117ca6080, 0xc32c4f5360)
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/util/memory/action.go:148 +0x11f
github.com/pingcap/tidb/util/memory.(*Tracker).Consume.func2(0x42d45a0?, 0xc00024f470?)
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/util/memory/tracker.go:449 +0xff
github.com/pingcap/tidb/util/memory.(*Tracker).Consume(0xc0802e28c0?, 0x1)
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/util/memory/tracker.go:461 +0x2c2
github.com/pingcap/tidb/util/chunk.(*SortedRowContainer).keyColumnsLess(0xc0802e28c0, 0x17b2302, 0x17b2301)
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/util/chunk/row_container.go:497 +0x48
sort.partialInsertionSort_func({0xc00db47f88?, 0xc084074c20?}, 0x17b0c90, 0x17b42a7)
/usr/local/go/src/sort/zsortfunc.go:202 +0x112
sort.pdqsort_func({0xc00db47f88?, 0xc084074c20?}, 0x17b0c90?, 0x17baed7?, 0xc017ae7310?)
/usr/local/go/src/sort/zsortfunc.go:101 +0x190
sort.pdqsort_func({0xc00db47f88?, 0xc084074c20?}, 0x17aa060?, 0x17e01df?, 0xc6abf0c1e0?)
/usr/local/go/src/sort/zsortfunc.go:121 +0x2a8
sort.pdqsort_func({0xc00db47f88?, 0xc084074c20?}, 0x1795bd0?, 0x20e0007?, 0x6e696d5f315f646e?)
/usr/local/go/src/sort/zsortfunc.go:121 +0x2a8
sort.pdqsort_func({0xc00fa7bf88?, 0xc084074c20?}, 0xc0754a4c30?, 0x0?, 0xc0c951b4f0?)
/usr/local/go/src/sort/zsortfunc.go:121 +0x2a8
sort.Slice({0x4135620, 0xc0754a4c30}, 0x33ee400?)
/usr/local/go/src/sort/slice.go:23 +0x97
github.com/pingcap/tidb/util/chunk.(*SortedRowContainer).Sort(0xc0802e28c0)
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/util/chunk/row_container.go:529 +0x1c5
github.com/pingcap/tidb/util/chunk.(*SortedRowContainer).sortAndSpillToDisk(0xc0802e28c0)
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/util/chunk/row_container.go:533 +0x1e
created by github.com/pingcap/tidb/util/chunk.(*SortAndSpillDiskAction).Action.func1
/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/util/chunk/row_container.go:609 +0x32f
[Screenshot: Grafana TiDB dashboard]
【Resource configuration】
server_configs:
  tidb:
    graceful-wait-before-shutdown: 120
    log.level: error
    performance.txn-total-size-limit: 1099511627776
    pessimistic-txn.pessimistic-auto-commit: true
  pd:
    dashboard.enable-telemetry: false
    log.level: warn
    schedule.low-space-ratio: 0.9
Attachments: goroutine (992.7 KB), heap (6.9 MB), running_sql (41.9 KB)