TiKV access anomalies, with abnormal monitoring metrics

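【TiDB environment】Production
【TiDB version】v7.5.0
【Reproduction path】None
【Problem: symptoms and impact】TiKV access became abnormal, and the monitoring shows TiKV metrics such as CPU and IO all dropping suddenly. We currently suspect region merge is the cause, but we are not sure. 1. Which monitoring panels can corroborate that merge is responsible? 2. If merge is the cause, is there a way to tune it to reduce the impact?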
"max-merge-region-keys": 200000,
"max-merge-region-size": 20,
"enable-cross-table-merge": "true",
"merge-schedule-limit": 8,
【Resource configuration】Three physical machines, each running 1 PD node and 2 TiKV nodes
【Attachments: screenshots/logs/monitoring】
TiKV-Summary:Cluster



Please provide the underlying logs.

Isn't lower resource usage a good thing? As long as the cluster has no problems and runs normally, that's fine.

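If you don't want regions to merge, you can turn off enable-cross-table-merge and then raise max-merge-region-keys and max-merge-region-size a bit. merge-schedule-limit should already be fairly small; if you think it has an impact, you can lower it further.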
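
As a rough sketch of how such changes could be applied, the snippet below only builds the `pd-ctl` `config set` command lines (run through `tiup ctl`) and prints them for review. The PD address and the value `4` are placeholders chosen for illustration, not recommendations, and the helper name `pd_ctl_commands` is hypothetical.

```python
import shlex


def pd_ctl_commands(pd_addr, settings, version="v7.5.0"):
    """Build one `config set` command line per setting (nothing is executed)."""
    commands = []
    for key, value in settings.items():
        # pd-ctl expects lowercase true/false for boolean options.
        if isinstance(value, bool):
            value = "true" if value else "false"
        parts = ["tiup", f"ctl:{version}", "pd", "-u", pd_addr,
                 "config", "set", key, str(value)]
        commands.append(" ".join(shlex.quote(p) for p in parts))
    return commands


# Placeholder PD address and an illustrative value; adjust before use.
for cmd in pd_ctl_commands("http://127.0.0.1:2379", {"merge-schedule-limit": 4}):
    print(cmd)
```

Printing the commands instead of running them keeps the change reviewable; `config show` in the same tool can be used afterwards to confirm the current values.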

About the access anomalies: what errors does the application side show?

How did you trace this to merge?

Are data queries working normally?

Take a look at the TiKV logs.

The TiKV logs are all INFO-level with no obvious errors. An excerpt follows; the rest is similar. Which monitoring panels could give some clues?
[2024/05/13 17:41:36.960 +08:00] [INFO] [raft.rs:2660] [“switched to configuration”] [config=“Configuration { voters: Configuration { incoming: Configuration { voters: {175277, 175276, 175275} }, outgoing: Configuration { voters: {} } }, learners: {167480, 167178, 165966}, learners_next: {}, auto_leave: false }”] [raft_id=167178]
[region_id=155344] [thread_id=0x5]
[2024/05/13 17:41:36.965 +08:00] [INFO] [apply.rs:1689] [“execute admin command”] [command=“cmd_type: ChangePeerV2 change_peer_v2 { changes { change_type: RemoveNode peer { id: 167178 store_id: 2 role: Learner } } }”] [index=1273] [term=72] [peer_id=167178] [region_id=155344] [thread_id=0x5]
[2024/05/13 17:41:36.965 +08:00] [INFO] [apply.rs:2283] [“exec ConfChangeV2”] [epoch=“conf_ver: 5462 version: 435”] [kind=Simple] [peer_id=167178] [region_id=155344] [thread_id=0x5]
[2024/05/13 17:41:36.965 +08:00] [INFO] [apply.rs:2464] [“conf change successfully”] [“current region”=“id: 155344 start_key: ? end_key: ? region_epoch { conf_ver: 5463 version: 435 } peers { id: 165966 store_id: 12 role: Learner } peers { id: 167480 store_id: 10 role: Learner } peers { id: 175277 store_id: 1 } peers { id: 175275
store_id: 3 } peers { id: 175276 store_id: 11 }”] [“original region”=“id: 155344 start_key: ? end_key: ? region_epoch { conf_ver: 5462 version: 435 } peers { id: 165966 store_id: 12 role: Learner } peers { id: 167178 store_id: 2 role: Learner } peers { id: 167480 store_id: 10 role: Learner } peers { id: 175277 store_id: 1 } peer
s { id: 175275 store_id: 3 } peers { id: 175276 store_id: 11 }”] [changes=“[change_type: RemoveNode peer { id: 167178 store_id: 2 role: Learner }]”] [peer_id=167178] [region_id=155344] [thread_id=0x5]
[2024/05/13 17:41:36.965 +08:00] [INFO] [router.rs:283] [“shutdown mailbox”] [region_id=155344] [thread_id=0x5]
[2024/05/13 17:41:36.966 +08:00] [INFO] [raft.rs:2660] [“switched to configuration”] [config=“Configuration { voters: Configuration { incoming: Configuration { voters: {175277, 175276, 175275} }, outgoing: Configuration { voters: {} } }, learners: {167480, 165966}, learners_next: {}, auto_leave: false }”] [raft_id=167178] [region
_id=155344] [thread_id=0x5]
[2024/05/13 17:41:36.966 +08:00] [INFO] [peer.rs:3706] [“delays destroy”] [reason=UnFlushLogGc] [merged_by_target=false] [peer_id=167178] [region_id=155344] [thread_id=0x5]
[2024/05/13 17:41:36.968 +08:00] [INFO] [peer.rs:3759] [“starts destroy”] [is_latest_initialized=true] [is_peer_initialized=true] [merged_by_target=false] [peer_id=167178] [region_id=155344] [thread_id=0x5]
[2024/05/13 17:41:36.968 +08:00] [INFO] [peer.rs:1289] [“begin to destroy”] [peer_id=167178] [region_id=155344] [thread_id=0x5]
[2024/05/13 17:41:36.968 +08:00] [INFO] [pd.rs:1752] [“remove peer statistic record in pd”] [region_id=155344] [thread_id=0x5]
[2024/05/13 17:41:36.968 +08:00] [INFO] [peer_storage.rs:1049] [“finish clear peer meta”] [takes=7.225µs] [raft_key=1] [apply_key=1] [meta_key=1] [region_id=155344] [thread_id=0x5]
[2024/05/13 17:41:36.970 +08:00] [INFO] [peer.rs:1393] [“peer destroy itself”] [keep_data=false] [clean=true] [takes=1.56587ms] [peer_id=167178] [region_id=155344] [thread_id=0x5]
[2024/05/13 17:41:36.970 +08:00] [INFO] [router.rs:283] [“shutdown mailbox”] [region_id=155344] [thread_id=0x5]
[2024/05/13 17:41:36.970 +08:00] [INFO] [region.rs:624] [“register deleting data in range”] [end_key=?] [start_key=?] [region_id=155344] [thread_id=0x5]
[2024/05/13 17:41:36.971 +08:00] [INFO] [raft.rs:1362] [“received a message with higher term from 156327”] [“msg type”=MsgAppend] [message_term=18] [term=17] [from=156327] [raft_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:36.971 +08:00] [INFO] [raft.rs:1127] [“became follower at term 18”] [term=18] [raft_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:36.973 +08:00] [INFO] [apply.rs:1689] [“execute admin command”] [command=“cmd_type: ChangePeerV2 change_peer_v2 { changes { peer { id: 175283 store_id: 2 } } changes { peer { id: 175282 store_id: 12 } } changes { change_type: AddLearnerNode peer { id: 156326 store_id: 1 role: Learner } } changes { change_type: A
ddLearnerNode peer { id: 164800 store_id: 3 role: Learner } } }”] [index=1172] [term=18] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:36.973 +08:00] [INFO] [apply.rs:2283] [“exec ConfChangeV2”] [epoch=“conf_ver: 5041 version: 473”] [kind=EnterJoint] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:36.973 +08:00] [INFO] [apply.rs:2464] [“conf change successfully”] [“current region”=“id: 156324 start_key: ? end_key: ? region_epoch { conf_ver: 5045 version: 473 } peers { id: 156326 store_id: 1 role: DemotingVoter } peers { id: 156327 store_id: 10 } peers { id: 164800 store_id: 3 role: DemotingVoter } peers {
id: 175283 store_id: 2 role: IncomingVoter } peers { id: 175282 store_id: 12 role: IncomingVoter }”] [“original region”=“id: 156324 start_key: ? end_key: ? region_epoch { conf_ver: 5041 version: 473 } peers { id: 156326 store_id: 1 } peers { id: 156327 store_id: 10 } peers { id: 164800 store_id: 3 } peers { id: 175283 store_id:
2 role: Learner } peers { id: 175282 store_id: 12 role: Learner }”] [changes=“[peer { id: 175283 store_id: 2 }, peer { id: 175282 store_id: 12 }, change_type: AddLearnerNode peer { id: 156326 store_id: 1 role: Learner }, change_type: AddLearnerNode peer { id: 164800 store_id: 3 role: Learner }]”] [peer_id=175283] [region_id=15632
4] [thread_id=0x5]
[2024/05/13 17:41:36.974 +08:00] [INFO] [raft.rs:2660] [“switched to configuration”] [config=“Configuration { voters: Configuration { incoming: Configuration { voters: {175282, 156327, 175283} }, outgoing: Configuration { voters: {164800, 156326, 156327} } }, learners: {}, learners_next: {164800, 156326}, auto_leave: false }”] [r
aft_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:36.975 +08:00] [INFO] [apply.rs:1689] [“execute admin command”] [command=“cmd_type: ChangePeerV2 change_peer_v2 {}”] [index=1173] [term=18] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:36.975 +08:00] [INFO] [apply.rs:2283] [“exec ConfChangeV2”] [epoch=“conf_ver: 5045 version: 473”] [kind=LeaveJoint] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:36.975 +08:00] [INFO] [apply.rs:2494] [“leave joint state successfully”] [region=“id: 156324 start_key: ? end_key: ? region_epoch { conf_ver: 5049 version: 473 } peers { id: 156326 store_id: 1 role: Learner } peers { id: 156327 store_id: 10 } peers { id: 164800 store_id: 3 role: Learner } peers { id: 175283 stor
e_id: 2 } peers { id: 175282 store_id: 12 }”] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:36.976 +08:00] [INFO] [raft.rs:2660] [“switched to configuration”] [config=“Configuration { voters: Configuration { incoming: Configuration { voters: {175282, 156327, 175283} }, outgoing: Configuration { voters: {} } }, learners: {164800, 156326}, learners_next: {}, auto_leave: false }”] [raft_id=175283] [region
_id=156324] [thread_id=0x5]
[2024/05/13 17:41:36.976 +08:00] [INFO] [apply.rs:1689] [“execute admin command”] [command=“cmd_type: ChangePeerV2 change_peer_v2 { changes { change_type: RemoveNode peer { id: 156326 store_id: 1 role: Learner } } }”] [index=1174] [term=18] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:36.976 +08:00] [INFO] [apply.rs:2283] [“exec ConfChangeV2”] [epoch=“conf_ver: 5049 version: 473”] [kind=Simple] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:36.976 +08:00] [INFO] [apply.rs:2464] [“conf change successfully”] [“current region”=“id: 156324 start_key: ? end_key: ? region_epoch { conf_ver: 5050 version: 473 } peers { id: 156327 store_id: 10 } peers { id: 164800 store_id: 3 role: Learner } peers { id: 175283 store_id: 2 } peers { id: 175282 store_id: 12 }
“] [“original region”=“id: 156324 start_key: ? end_key: ? region_epoch { conf_ver: 5049 version: 473 } peers { id: 156326 store_id: 1 role: Learner } peers { id: 156327 store_id: 10 } peers { id: 164800 store_id: 3 role: Learner } peers { id: 175283 store_id: 2 } peers { id: 175282 store_id: 12 }”] [changes=”[change_type: RemoveN
ode peer { id: 156326 store_id: 1 role: Learner }]”] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:36.977 +08:00] [INFO] [raft.rs:2660] [“switched to configuration”] [config=“Configuration { voters: Configuration { incoming: Configuration { voters: {175282, 156327, 175283} }, outgoing: Configuration { voters: {} } }, learners: {164800}, learners_next: {}, auto_leave: false }”] [raft_id=175283] [region_id=1563
24] [thread_id=0x5]
[2024/05/13 17:41:36.978 +08:00] [INFO] [apply.rs:1689] [“execute admin command”] [command=“cmd_type: ChangePeerV2 change_peer_v2 { changes { change_type: RemoveNode peer { id: 164800 store_id: 3 role: Learner } } }”] [index=1175] [term=18] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:36.978 +08:00] [INFO] [apply.rs:2283] [“exec ConfChangeV2”] [epoch=“conf_ver: 5050 version: 473”] [kind=Simple] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:36.978 +08:00] [INFO] [apply.rs:2464] [“conf change successfully”] [“current region”=“id: 156324 start_key: ? end_key: ? region_epoch { conf_ver: 5051 version: 473 } peers { id: 156327 store_id: 10 } peers { id: 175283 store_id: 2 } peers { id: 175282 store_id: 12 }”] [“original region”=“id: 156324 start_key: ?
end_key: ? region_epoch { conf_ver: 5050 version: 473 } peers { id: 156327 store_id: 10 } peers { id: 164800 store_id: 3 role: Learner } peers { id: 175283 store_id: 2 } peers { id: 175282 store_id: 12 }”] [changes=“[change_type: RemoveNode peer { id: 164800 store_id: 3 role: Learner }]”] [peer_id=175283] [region_id=156324] [thre
ad_id=0x5]
[2024/05/13 17:41:36.978 +08:00] [INFO] [raft.rs:2660] [“switched to configuration”] [config=“Configuration { voters: Configuration { incoming: Configuration { voters: {175282, 156327, 175283} }, outgoing: Configuration { voters: {} } }, learners: {}, learners_next: {}, auto_leave: false }”] [raft_id=175283] [region_id=156324] [t
hread_id=0x5]
[2024/05/13 17:41:38.745 +08:00] [INFO] [endpoint.rs:575] [“the max gap of leader resolved-ts is large”] [last_resolve_attempt=None] [duration_to_last_update_safe_ts=15571ms] [min_memory_lock=None] [txn_num=0] [lock_num=0] [min_lock=None] [applied_index=922] [read_state=“ReadState { idx: 920, ts: 449732477813784625 }”] [gap=45324
ms] [region_id=155240] [thread_id=0x5]
[2024/05/13 17:41:39.505 +08:00] [INFO] [pd.rs:1674] [“try to merge”] [merge=“target { id: 157442 start_key: ? end_key: ? region_epoch { conf_ver: 5363 version: 496 } peers { id: 157443 store_id: 3 } peers { id: 157444 store_id: 11 } peers { id: 157445 store_id: 2 } }”] [region_id=174913] [thread_id=0x5]
[2024/05/13 17:41:39.505 +08:00] [INFO] [apply.rs:1689] [“execute admin command”] [command=“cmd_type: PrepareMerge prepare_merge { min_index: 196 target { id: 157442 start_key: ? end_key: ? region_epoch { conf_ver: 5363 version: 496 } peers { id: 157443 store_id: 3 } peers { id: 157444 store_id: 11 } peers { id: 157445 store_id:
2 } } }”] [index=196] [term=6] [peer_id=174916] [region_id=174913] [thread_id=0x5]
[2024/05/13 17:41:39.513 +08:00] [INFO] [endpoint.rs:277] [“region met split/merge command, stop tracking since key range changed, wait for re-register”] [req_type=PrepareMerge] [thread_id=0x5]
[2024/05/13 17:41:39.514 +08:00] [INFO] [endpoint.rs:750] [“deregister observe region”] [observe_id=ObserveId(19880)] [region_id=174913] [store_id=Some(2)] [thread_id=0x5]
[2024/05/13 17:41:39.514 +08:00] [INFO] [endpoint.rs:703] [“register observe region”] [region=“id: 174913 start_key: ? end_key: ? region_epoch { conf_ver: 5364 version: 497 } peers { id: 174914 store_id: 3 } peers { id: 174915 store_id: 11 } peers { id: 174916 store_id: 2 }”] [thread_id=0x5]
[2024/05/13 17:41:39.514 +08:00] [INFO] [apply.rs:2848] [“asking delegate to stop”] [source_region_id=174913] [peer_id=157445] [region_id=157442] [thread_id=0x5]
[2024/05/13 17:41:39.515 +08:00] [INFO] [apply.rs:4141] [“source logs are all applied now”] [peer_id=174916] [region_id=174913] [thread_id=0x5]
[2024/05/13 17:41:39.515 +08:00] [INFO] [apply.rs:4066] [“remove delegate from apply delegates”] [peer_id=174916] [region_id=174913] [thread_id=0x5]
[2024/05/13 17:41:39.515 +08:00] [INFO] [router.rs:283] [“shutdown mailbox”] [region_id=174913] [thread_id=0x5]
[2024/05/13 17:41:39.515 +08:00] [INFO] [apply.rs:2870] [“execute CommitMerge”] [source_region=“id: 174913 start_key: ? end_key: ? region_epoch { conf_ver: 5364 version: 497 } peers { id: 174914 store_id: 3 } peers { id: 174915 store_id: 11 } peers { id: 174916 store_id: 2 }”] [index=1137] [term=10] [entries=0] [commit=196] [peer
_id=157445] [region_id=157442] [thread_id=0x5]
[2024/05/13 17:41:39.521 +08:00] [INFO] [endpoint.rs:277] [“region met split/merge command, stop tracking since key range changed, wait for re-register”] [req_type=CommitMerge] [thread_id=0x5]
[2024/05/13 17:41:39.521 +08:00] [INFO] [util.rs:1575] [“reset safe_ts due to merge”] [peer_id=157445] [region_id=157442] [safe_ts=449732485612568775] [target_safe_ts=449732485612568775] [source_safe_ts=449732485612568775] [thread_id=0x5]
[2024/05/13 17:41:39.521 +08:00] [INFO] [peer.rs:5502] [“require updating max ts”] [initial_status=42949673956] [region_id=157442] [thread_id=0x5]
[2024/05/13 17:41:39.521 +08:00] [INFO] [endpoint.rs:750] [“deregister observe region”] [observe_id=ObserveId(19879)] [region_id=157442] [store_id=Some(2)] [thread_id=0x5]
[2024/05/13 17:41:39.521 +08:00] [INFO] [endpoint.rs:703] [“register observe region”] [region=“id: 157442 start_key: ? end_key: ? region_epoch { conf_ver: 5363 version: 498 } peers { id: 157443 store_id: 3 } peers { id: 157444 store_id: 11 } peers { id: 157445 store_id: 2 }”] [thread_id=0x5]
[2024/05/13 17:41:39.522 +08:00] [INFO] [peer.rs:4797] [“notify pd with merge”] [target_region=“id: 157442 start_key: ? end_key: ? region_epoch { conf_ver: 5363 version: 498 } peers { id: 157443 store_id: 3 } peers { id: 157444 store_id: 11 } peers { id: 157445 store_id: 2 }”] [source_region=“id: 174913 start_key: ? end_key: ? re
gion_epoch { conf_ver: 5364 version: 497 } peers { id: 174914 store_id: 3 } peers { id: 174915 store_id: 11 } peers { id: 174916 store_id: 2 }”] [peer_id=157445] [region_id=157442] [thread_id=0x5]
[2024/05/13 17:41:39.522 +08:00] [INFO] [peer.rs:4924] [“merge finished”] [target_region=“Some(id: 157442 start_key: ? end_key: ? region_epoch { conf_ver: 5363 version: 496 } peers { id: 157443 store_id: 3 } peers { id: 157444 store_id: 11 } peers { id: 157445 store_id: 2 })”] [peer_id=174916] [region_id=174913] [thread_id=0x5]
[2024/05/13 17:41:39.522 +08:00] [INFO] [peer.rs:3706] [“delays destroy”] [reason=UnFlushLogGc] [merged_by_target=true] [peer_id=174916] [region_id=174913] [thread_id=0x5]
[2024/05/13 17:41:39.522 +08:00] [INFO] [pd.rs:1828] [“succeed to update max timestamp”] [region_id=157442] [thread_id=0x5]
[2024/05/13 17:41:39.522 +08:00] [INFO] [endpoint.rs:356] [“Resolver initialized”] [pending_data_index=0] [snapshot_index=1137] [observe_id=ObserveId(19960)] [region=157442] [thread_id=0x5]
[2024/05/13 17:41:39.522 +08:00] [INFO] [peer.rs:3759] [“starts destroy”] [is_latest_initialized=false] [is_peer_initialized=true] [merged_by_target=true] [peer_id=174916] [region_id=174913] [thread_id=0x5]
[2024/05/13 17:41:39.522 +08:00] [INFO] [peer.rs:1289] [“begin to destroy”] [peer_id=174916] [region_id=174913] [thread_id=0x5]
[2024/05/13 17:41:39.522 +08:00] [INFO] [endpoint.rs:750] [“deregister observe region”] [observe_id=ObserveId(19959)] [region_id=174913] [store_id=Some(2)] [thread_id=0x5]
[2024/05/13 17:41:39.522 +08:00] [INFO] [pd.rs:1752] [“remove peer statistic record in pd”] [region_id=174913] [thread_id=0x5]
[2024/05/13 17:41:39.522 +08:00] [INFO] [peer_storage.rs:1049] [“finish clear peer meta”] [takes=1.994µs] [raft_key=1] [apply_key=1] [meta_key=1] [region_id=174913] [thread_id=0x5]
[2024/05/13 17:41:39.523 +08:00] [INFO] [peer.rs:1393] [“peer destroy itself”] [keep_data=true] [clean=true] [takes=812.372µs] [peer_id=174916] [region_id=174913] [thread_id=0x5]
[2024/05/13 17:41:39.523 +08:00] [INFO] [router.rs:283] [“shutdown mailbox”] [region_id=174913] [thread_id=0x5]
[2024/05/13 17:41:39.615 +08:00] [WARN] [scanner.rs:137] [“resolved_ts scan get snapshot failed”] [err=“Other("[components/resolved_ts/src/scanner.rs:193]: scan task cancelled")”] [thread_id=0x5]
[2024/05/13 17:41:40.455 +08:00] [INFO] [apply.rs:1689] [“execute admin command”] [command=“cmd_type: PrepareMerge prepare_merge { min_index: 1176 target { id: 168495 start_key: ? end_key: ? region_epoch { conf_ver: 5039 version: 483 } peers { id: 168496 store_id: 12 } peers { id: 168497 store_id: 10 } peers { id: 168498 store_id
: 2 } } }”] [index=1176] [term=18] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:40.469 +08:00] [INFO] [apply.rs:2848] [“asking delegate to stop”] [source_region_id=156324] [peer_id=168498] [region_id=168495] [thread_id=0x5]
[2024/05/13 17:41:40.469 +08:00] [INFO] [peer.rs:5489] [“failed to propose”] [err=“peer is not leader for region 168495, leader may Some(id: 168496 store_id: 12)”] [message=“header { region_id: 168495 peer { id: 168498 store_id: 2 } region_epoch { conf_ver: 5039 version: 483 } } admin_request { cmd_type: CommitMerge commit_merge
{ source { id: 156324 start_key: ? end_key: ? region_epoch { conf_ver: 5052 version: 474 } peers { id: 156327 store_id: 10 } peers { id: 175283 store_id: 2 } peers { id: 175282 store_id: 12 } } commit: 1176 entries { term: 18 index: 1176 data: ? context: ? } } }”] [peer_id=168498] [region_id=168495] [thread_id=0x5]
[2024/05/13 17:41:40.469 +08:00] [INFO] [apply.rs:4141] [“source logs are all applied now”] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:40.469 +08:00] [INFO] [apply.rs:4066] [“remove delegate from apply delegates”] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:40.469 +08:00] [INFO] [router.rs:283] [“shutdown mailbox”] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:40.469 +08:00] [INFO] [apply.rs:2870] [“execute CommitMerge”] [source_region=“id: 156324 start_key: ? end_key: ? region_epoch { conf_ver: 5052 version: 474 } peers { id: 156327 store_id: 10 } peers { id: 175283 store_id: 2 } peers { id: 175282 store_id: 12 }”] [index=388] [term=6] [entries=1] [commit=1176] [peer
_id=168498] [region_id=168495] [thread_id=0x5]
[2024/05/13 17:41:40.470 +08:00] [INFO] [util.rs:1575] [“reset safe_ts due to merge”] [peer_id=168498] [region_id=168495] [safe_ts=0] [target_safe_ts=449732478888575289] [source_safe_ts=0] [thread_id=0x5]
[2024/05/13 17:41:40.470 +08:00] [INFO] [peer.rs:5502] [“require updating max ts”] [initial_status=25769804744] [region_id=168495] [thread_id=0x5]
[2024/05/13 17:41:40.470 +08:00] [INFO] [peer.rs:4924] [“merge finished”] [target_region=“Some(id: 168495 start_key: ? end_key: ? region_epoch { conf_ver: 5039 version: 483 } peers { id: 168496 store_id: 12 } peers { id: 168497 store_id: 10 } peers { id: 168498 store_id: 2 })”] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:40.470 +08:00] [INFO] [peer.rs:3706] [“delays destroy”] [reason=UnFlushLogGc] [merged_by_target=true] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:40.470 +08:00] [INFO] [peer.rs:3759] [“starts destroy”] [is_latest_initialized=false] [is_peer_initialized=true] [merged_by_target=true] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:40.470 +08:00] [INFO] [peer.rs:1289] [“begin to destroy”] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:40.470 +08:00] [INFO] [pd.rs:1828] [“succeed to update max timestamp”] [region_id=168495] [thread_id=0x5]
[2024/05/13 17:41:40.470 +08:00] [INFO] [peer_storage.rs:1049] [“finish clear peer meta”] [takes=2.096µs] [raft_key=1] [apply_key=1] [meta_key=1] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:40.470 +08:00] [INFO] [peer.rs:1393] [“peer destroy itself”] [keep_data=true] [clean=true] [takes=712.361µs] [peer_id=175283] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:40.470 +08:00] [INFO] [router.rs:283] [“shutdown mailbox”] [region_id=156324] [thread_id=0x5]
[2024/05/13 17:41:41.052 +08:00] [INFO] [region.rs:658] [“delete data in range because of stale”] [end_key=?] [start_key=?] [region_id=155344] [thread_id=0x5]
[2024/05/13 17:41:48.746 +08:00] [INFO] [endpoint.rs:597] [“the max gap of follower safe-ts is large”] [oldest_candidate=None] [latest_candidate=None] [applied_index=224] [duration_to_last_consume_leader=15353ms] [resolved_ts=449732477813784625] [safe_ts=449732477813784625] [gap=55324ms] [region_id=174333] [thread_id=0x5]
[2024/05/13 17:41:48.746 +08:00] [INFO] [endpoint.rs:618] [“the max gap of follower resolved-ts is large; it’s the same region that has the min safe-ts”] [thread_id=0x5]
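
One way to check whether merge activity lines up with the metric drop is to count merge-related log entries per minute and compare the timeline with the Grafana panels. The sketch below is a minimal example that assumes the message texts seen in the excerpt above (`cmd_type: PrepareMerge prepare_merge`, `execute CommitMerge`, `exec ConfChangeV2`); the sample lines are abbreviated stand-ins, and lines wrapped mid-token by the terminal would be missed.

```python
import re
from collections import Counter, defaultdict

# Timestamp at the start of a TiKV log line, truncated to the minute.
TS_RE = re.compile(r"^\[(\d{4}/\d{2}/\d{2} \d{2}:\d{2}):\d{2}\.\d{3}")

# Substrings taken from the log excerpt, mapped to short labels.
EVENTS = {
    "cmd_type: PrepareMerge prepare_merge": "prepare_merge",
    "execute CommitMerge": "commit_merge",
    "exec ConfChangeV2": "conf_change",
}


def count_events(lines):
    """Return {minute: Counter(label -> occurrences)} for the given log lines."""
    buckets = defaultdict(Counter)
    for line in lines:
        match = TS_RE.match(line)
        if not match:
            continue  # continuation of a wrapped line, no timestamp
        minute = match.group(1)
        for needle, label in EVENTS.items():
            if needle in line:
                buckets[minute][label] += 1
    return buckets


# Abbreviated sample lines modeled on the excerpt (not real output).
SAMPLE = [
    '[2024/05/13 17:41:36.965 +08:00] [INFO] [apply.rs:2283] ["exec ConfChangeV2"] [kind=Simple]',
    '[2024/05/13 17:41:39.505 +08:00] [INFO] [apply.rs:1689] ["execute admin command"] '
    '[command="cmd_type: PrepareMerge prepare_merge { min_index: 196 }"]',
    '[2024/05/13 17:41:39.515 +08:00] [INFO] [apply.rs:2870] ["execute CommitMerge"] [region_id=157442]',
    '[2024/05/13 17:42:01.000 +08:00] [INFO] [apply.rs:2870] ["execute CommitMerge"] [region_id=1]',
]

for minute, counter in sorted(count_events(SAMPLE).items()):
    print(minute, dict(counter))
```

To run it against a real file, pass `open("tikv.log", encoding="utf-8")` to `count_events` instead of `SAMPLE` (the file name is a placeholder). A burst of `prepare_merge` or `conf_change` entries in the same minute as the metric drop would be consistent with the merge suspicion, though it would not prove causation on its own.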

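A sudden drop to the bottom followed by a recovery like this actually looks like the cluster hung and could not respond. However, the application error logs were not kept, so I want to find evidence in the monitoring panels.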

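Right, there is no need to disable it. If this configuration does not normally cause a performance impact, I will keep it as is and see whether the issue recurs. Thanks for the suggestion.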

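It is only a suspicion, because I happened to see the number of empty regions drop. The logs don't reveal anything either, and I'm not sure where else to look.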

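Please post the topology diagram again; the description is not very clear. The main thing is to check whether TiDB and TiKV are co-located. If they are, look at the memory usage.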

I can't tell from this.

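Looking at the monitoring, your region count has not dropped much. I'm on 5.4 and had not enabled region merge before; I enabled it just the day before yesterday, and the region count has already dropped by 600,000. Resource usage did not drop suddenly, nor by much; the change is barely noticeable. My cluster has more than 4 million regions in total.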