select count(*) brought the cluster down and it will not start

Please provide the following information:

  1. The cluster's hardware configuration
  2. The current cluster status and topology from tiup cluster display
  3. The tikv log from the failed startup
  1. The cluster's hardware configuration
    Memory: 125G
    CPU: 2 physical sockets with 8 cores each, 32 logical CPUs
    NIC: 10 Gigabit
    Disks:
    Filesystem Size Used Avail Use% Mounted on
    /dev/sdd1 879G 160G 675G 20% /data1
    /dev/sde1 879G 14G 821G 2% /data2
    /dev/sdc1 1.8T 1.9G 1.7T 1% /data

  2. The current cluster status and topology from tiup cluster display
    Starting component cluster: /root/.tiup/components/cluster/v1.2.3/tiup-cluster display sd-pro-cluster
    Cluster type: tidb
    Cluster name: sd-pro-cluster
    Cluster version: v4.0.8
    SSH type: builtin
    ID Role Host Ports OS/Arch Status Data Dir Deploy Dir
    10.18.0.182:9093 alertmanager 10.18.0.182 9093/9094 linux/x86_64 inactive /data1/tidb-data/alertmanager /data1/tidb-deploy/alertmanager
    10.18.0.182:8300 cdc 10.18.0.182 8300 linux/x86_64 Down - /home/tidb/deploy/cdc-8300
    10.18.0.182:3000 grafana 10.18.0.182 3000 linux/x86_64 inactive - /data1/tidb-deploy/grafana
    10.18.0.181:2379 pd 10.18.0.181 2379/2380 linux/x86_64 Down /data2/tidb-data/pd01 /data2/tidb-deploy/pd01
    10.18.0.182:2379 pd 10.18.0.182 2379/2380 linux/x86_64 Down /data2/tidb-data/pd01 /data2/tidb-deploy/pd01
    10.18.0.184:2379 pd 10.18.0.184 2379/2380 linux/x86_64 Down /data2/tidb-data/pd01 /data2/tidb-deploy/pd01
    10.18.0.182:9090 prometheus 10.18.0.182 9090 linux/x86_64 inactive /data1/tidb-data/prometheus /data1/tidb-deploy/prometheus
    10.18.0.181:4000 tidb 10.18.0.181 4000/10080 linux/x86_64 Down - /data2/tidb-deploy
    10.18.0.183:4000 tidb 10.18.0.183 4000/10080 linux/x86_64 Down - /data1/tidb-deploy
    10.18.0.184:9000 tiflash 10.18.0.184 9000/8123/3930/20170/20292/8234 linux/x86_64 Down /data1/tidb-data/tiflash01 /data1/tidb-deploy/tiflash01
    10.18.0.182:20160 tikv 10.18.0.182 20160/20180 linux/x86_64 Down /data/tidb-data/tikv01 /data/tidb-deploy/tikv01
    10.18.0.183:20160 tikv 10.18.0.183 20160/20180 linux/x86_64 Down /data/tidb-data/tikv01 /data/tidb-deploy/tikv01
    10.18.0.184:20160 tikv 10.18.0.184 20160/20180 linux/x86_64 Down /data/tidb-data/tikv01 /data/tidb-deploy/tikv01

  3. The tikv log from the failed startup
    [2020/12/14 19:18:02.407 +08:00] [INFO] [lib.rs:92] [“Welcome to TiKV”]
    [2020/12/14 19:18:02.408 +08:00] [INFO] [lib.rs:94] []
    [2020/12/14 19:18:02.408 +08:00] [INFO] [lib.rs:94] [“Release Version: 4.0.8”]
    [2020/12/14 19:18:02.408 +08:00] [INFO] [lib.rs:94] [“Edition: Community”]
    [2020/12/14 19:18:02.408 +08:00] [INFO] [lib.rs:94] [“Git Commit Hash: 83091173e960e5a0f5f417e921a0801d2f6635ae”]
    [2020/12/14 19:18:02.408 +08:00] [INFO] [lib.rs:94] [“Git Commit Branch: heads/refs/tags/v4.0.8”]
    [2020/12/14 19:18:02.408 +08:00] [INFO] [lib.rs:94] [“UTC Build Time: 2020-10-30 08:40:33”]
    [2020/12/14 19:18:02.408 +08:00] [INFO] [lib.rs:94] [“Rust Version: rustc 1.42.0-nightly (0de96d37f 2019-12-19)”]
    [2020/12/14 19:18:02.408 +08:00] [INFO] [lib.rs:94] [“Enable Features: jemalloc mem-profiling portable sse protobuf-codec”]
    [2020/12/14 19:18:02.408 +08:00] [INFO] [lib.rs:94] [“Profile: dist_release”]
    [2020/12/14 19:18:02.421 +08:00] [INFO] [mod.rs:46] [“memory limit in bytes: 134661169152, cpu cores quota: 32”]
    [2020/12/14 19:18:02.421 +08:00] [WARN] [lib.rs:530] [“environment variable TZ is missing, using /etc/localtime“]
    [2020/12/14 19:18:02.421 +08:00] [WARN] [server.rs:852] [“check: kernel”] [err=“kernel parameters net.core.somaxconn got 128, expect 32768”]
    [2020/12/14 19:18:02.421 +08:00] [WARN] [server.rs:852] [“check: kernel”] [err=“kernel parameters net.ipv4.tcp_syncookies got 1, expect 0”]
    [2020/12/14 19:18:02.421 +08:00] [WARN] [server.rs:852] [“check: kernel”] [err=“kernel parameters vm.swappiness got 60, expect 0”]
    [2020/12/14 19:18:02.422 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/14 19:18:02.423 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483a180 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:04.423 +08:00] [INFO] [util.rs:378] [“PD failed to respond”] [err=“Grpc(RpcFailure(RpcStatus { status: 4-DEADLINE_EXCEEDED, details: Some(“Deadline Exceeded”) }))”] [endpoints=10.18.0.181:2379]
    [2020/12/14 19:18:04.423 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/14 19:18:04.423 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483a240 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:06.424 +08:00] [INFO] [util.rs:378] [“PD failed to respond”] [err=“Grpc(RpcFailure(RpcStatus { status: 4-DEADLINE_EXCEEDED, details: Some(“Deadline Exceeded”) }))”] [endpoints=10.18.0.182:2379]
    [2020/12/14 19:18:06.424 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/14 19:18:06.424 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483a300 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:08.425 +08:00] [INFO] [util.rs:378] [“PD failed to respond”] [err=“Grpc(RpcFailure(RpcStatus { status: 4-DEADLINE_EXCEEDED, details: Some(“Deadline Exceeded”) }))”] [endpoints=10.18.0.184:2379]
    [2020/12/14 19:18:08.425 +08:00] [WARN] [client.rs:104] [“validate PD endpoints failed”] [err=“Other(”[components/pd_client/src/util.rs:410]: PD cluster failed to respond”)”]
    [2020/12/14 19:18:08.726 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/14 19:18:08.726 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483a3c0 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:09.141 +08:00] [INFO] [util.rs:378] [“PD failed to respond”] [err=“Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(“server not started”) }))”] [endpoints=10.18.0.181:2379]
    [2020/12/14 19:18:09.141 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/14 19:18:09.141 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483a480 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:09.142 +08:00] [INFO] [util.rs:378] [“PD failed to respond”] [err=“Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(“server not started”) }))”] [endpoints=10.18.0.182:2379]
    [2020/12/14 19:18:09.142 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/14 19:18:09.142 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483a540 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:09.142 +08:00] [INFO] [util.rs:378] [“PD failed to respond”] [err=“Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(“server not started”) }))”] [endpoints=10.18.0.184:2379]
    [2020/12/14 19:18:09.443 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/14 19:18:09.443 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483a600 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:09.444 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/14 19:18:09.444 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483a6c0 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:09.444 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/14 19:18:09.444 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483a780 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:09.445 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/14 19:18:09.445 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483a840 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:09.746 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/14 19:18:09.746 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483a900 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:09.747 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/14 19:18:09.747 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483a9c0 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:09.747 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/14 19:18:09.748 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483aa80 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:09.748 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/14 19:18:09.748 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483ab40 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.049 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/14 19:18:10.049 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483ac00 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.050 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/14 19:18:10.050 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483acc0 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.051 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/14 19:18:10.051 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483ad80 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.052 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/14 19:18:10.052 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483ae40 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.353 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/14 19:18:10.353 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483af00 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.354 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/14 19:18:10.354 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483afc0 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.355 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/14 19:18:10.355 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483b080 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.355 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/14 19:18:10.356 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483b140 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.657 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/14 19:18:10.657 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483b200 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.657 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/14 19:18:10.657 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483a180 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.658 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/14 19:18:10.659 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483b350 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.659 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/14 19:18:10.659 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483b410 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.961 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/14 19:18:10.961 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483b4d0 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.961 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/14 19:18:10.962 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483b590 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.963 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/14 19:18:10.963 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483b650 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:10.963 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/14 19:18:10.963 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483b710 for subchannel 0x7f42f6419a00”]
    [2020/12/14 19:18:11.264 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/14 19:18:11.264 +08:00] [INFO] [] [“New connected subchannel at 0x7f42e483b7d0 for subchannel 0x7f42f6419a00”]
    [2020/12/15 10:35:09.351 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/15 10:35:09.351 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3c820 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:09.351 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/15 10:35:09.352 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3a000 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:09.352 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/15 10:35:09.352 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3a1b0 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:09.353 +08:00] [WARN] [client.rs:104] [“validate PD endpoints failed”] [err=“Other(”[components/pd_client/src/util.rs:490]: failed to connect to [name: \“pd-10.18.0.184-2379\” member_id: 4139599881588812741 peer_urls: \“http://10.18.0.184:2380\” client_urls: \“http://10.18.0.184:2379\”, name: \“pd-10.18.0.182-2379\” member_id: 9310591878211497108 peer_urls: \“http://10.18.0.182:2380\” client_urls: \“http://10.18.0.182:2379\”, name: \“pd-10.18.0.181-2379\” member_id: 15963526235950833951 peer_urls: \“http://10.18.0.181:2380\” client_urls: \“http://10.18.0.181:2379\”]”)"]
    [2020/12/15 10:35:09.653 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/15 10:35:09.653 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3a2d0 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:09.654 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/15 10:35:09.654 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3a450 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:09.654 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/15 10:35:09.655 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3a690 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:09.655 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/15 10:35:09.655 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3a780 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:09.956 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/15 10:35:09.956 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3aed0 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:09.957 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/15 10:35:09.957 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3b140 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:09.957 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/15 10:35:09.958 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3b200 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:09.958 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/15 10:35:09.958 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3b2c0 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:10.259 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/15 10:35:10.260 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3b380 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:10.260 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/15 10:35:10.260 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3b440 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:10.261 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/15 10:35:10.261 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3b500 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:10.262 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/15 10:35:10.262 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3b5c0 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:10.563 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/15 10:35:10.563 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3b6b0 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:10.563 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/15 10:35:10.564 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3b770 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:10.564 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/15 10:35:10.564 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3b920 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:10.565 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/15 10:35:10.565 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3ba70 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:10.866 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/15 10:35:10.867 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3bb60 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:10.867 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/15 10:35:10.867 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3bc80 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:10.868 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/15 10:35:10.868 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3bd40 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:10.870 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/15 10:35:10.870 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3be00 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:11.171 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/15 10:35:11.172 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3bec0 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:11.172 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/15 10:35:11.172 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3bf80 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:11.173 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/15 10:35:11.173 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3c040 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:11.174 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/15 10:35:11.174 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3a060 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:11.474 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/15 10:35:11.475 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3a360 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:11.475 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/15 10:35:11.475 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3a480 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:11.476 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/15 10:35:11.476 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3a570 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:11.477 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/15 10:35:11.477 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3a6c0 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:11.778 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/15 10:35:11.778 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3a870 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:11.778 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/15 10:35:11.779 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3a990 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:11.779 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/15 10:35:11.779 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3aa50 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:11.780 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/15 10:35:11.780 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3ab10 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:12.081 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/15 10:35:12.081 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3abd0 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:12.082 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/15 10:35:12.082 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3ac90 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:12.082 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/15 10:35:12.083 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3ad50 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:12.083 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/15 10:35:12.083 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3ae10 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:12.384 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/15 10:35:12.384 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3af00 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:12.385 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/15 10:35:12.385 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3aff0 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:12.386 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/15 10:35:12.386 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3b0b0 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:12.386 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/15 10:35:12.387 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3b890 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:12.387 +08:00] [WARN] [client.rs:104] [“validate PD endpoints failed”] [err=“Other(”[components/pd_client/src/util.rs:490]: failed to connect to [name: \“pd-10.18.0.184-2379\” member_id: 4139599881588812741 peer_urls: \“http://10.18.0.184:2380\” client_urls: \“http://10.18.0.184:2379\”, name: \“pd-10.18.0.182-2379\” member_id: 9310591878211497108 peer_urls: \“http://10.18.0.182:2380\” client_urls: \“http://10.18.0.182:2379\”, name: \“pd-10.18.0.181-2379\” member_id: 15963526235950833951 peer_urls: \“http://10.18.0.181:2380\” client_urls: \“http://10.18.0.181:2379\”]")"]
    [2020/12/15 10:35:12.687 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
    [2020/12/15 10:35:12.687 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3ba40 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:12.688 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.182:2379]
    [2020/12/15 10:35:12.688 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3c100 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:12.689 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.184:2379]
    [2020/12/15 10:35:12.689 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3c1c0 for subchannel 0x7f71e2419a00”]
    [2020/12/15 10:35:12.690 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=http://10.18.0.184:2379]
    [2020/12/15 10:35:12.690 +08:00] [INFO] [] [“New connected subchannel at 0x7f71d0a3c280 for subchannel 0x7f71e2419a00”]

Thank you very much. Please help take a look.

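The normal startup order is pd, then tikv, then tidb. The errors above show that tikv can no longer connect to pd. Please confirm that the start command is tiup cluster start <cluster_name>, then check the pd logs for errors.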
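To see at a glance which PD endpoints rejected tikv and with which gRPC status, the "PD failed to respond" lines in a tikv log can be tallied. The following is only a minimal sketch in Python, assuming the log lines keep the layout pasted in this thread (including the typographic quotes the forum introduced). The names `SAMPLE` and `pd_failures` are hypothetical and not part of any TiDB tooling.

```python
import re
from collections import Counter

# A few lines copied from the tikv log in this thread, quotes as pasted.
SAMPLE = """\
[2020/12/14 19:18:04.423 +08:00] [INFO] [util.rs:378] [“PD failed to respond”] [err=“Grpc(RpcFailure(RpcStatus { status: 4-DEADLINE_EXCEEDED, details: Some(“Deadline Exceeded”) }))”] [endpoints=10.18.0.181:2379]
[2020/12/14 19:18:09.141 +08:00] [INFO] [util.rs:378] [“PD failed to respond”] [err=“Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(“server not started”) }))”] [endpoints=10.18.0.181:2379]
[2020/12/14 19:18:09.142 +08:00] [INFO] [util.rs:378] [“PD failed to respond”] [err=“Grpc(RpcFailure(RpcStatus { status: 2-UNKNOWN, details: Some(“server not started”) }))”] [endpoints=10.18.0.184:2379]
[2020/12/14 19:18:09.443 +08:00] [INFO] [util.rs:419] [“connecting to PD endpoint”] [endpoints=10.18.0.181:2379]
"""

# Assumed patterns, derived only from the lines shown in this thread.
STATUS_RE = re.compile(r"status: (\d+-[A-Z_]+)")
ENDPOINT_RE = re.compile(r"\[endpoints=([^\]]+)\]")


def pd_failures(log_text):
    """Count 'PD failed to respond' lines per (endpoint, gRPC status)."""
    counts = Counter()
    for line in log_text.splitlines():
        if "PD failed to respond" not in line:
            continue
        status = STATUS_RE.search(line)
        endpoint = ENDPOINT_RE.search(line)
        if status and endpoint:
            counts[(endpoint.group(1), status.group(1))] += 1
    return counts


for (endpoint, status), n in sorted(pd_failures(SAMPLE).items()):
    print(endpoint, status, n)
```

If every endpoint shows failures, as in the log above, the problem is more likely on the pd side than in tikv, which is why the pd logs are the next thing to check.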

  1. Started with the normal command, tiup cluster start <cluster_name>

  2. All three PDs started successfully:
    Starting component pd
    Starting instance pd 10.18.0.184:2379
    Starting instance pd 10.18.0.182:2379
    Starting instance pd 10.18.0.181:2379
    Start pd 10.18.0.182:2379 success
    Start pd 10.18.0.184:2379 success
    Start pd 10.18.0.181:2379 success

  3. PD error log printed during startup:
    [2020/12/15 12:08:52.361 +08:00] [ERROR] [server.go:1117] [“campaign leader meet error”] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.362 +08:00] [ERROR] [server.go:1117] [“campaign leader meet error”] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.364 +08:00] [ERROR] [server.go:1117] [“campaign leader meet error”] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.365 +08:00] [ERROR] [server.go:1117] [“campaign leader meet error”] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.366 +08:00] [ERROR] [server.go:1117] [“campaign leader meet error”] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.367 +08:00] [ERROR] [server.go:1117] [“campaign leader meet error”] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.368 +08:00] [ERROR] [server.go:1117] [“campaign leader meet error”] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.369 +08:00] [ERROR] [server.go:1117] ["campaign leader meet error"] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.370 +08:00] [ERROR] [server.go:1117] ["campaign leader meet error"] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.371 +08:00] [ERROR] [server.go:1117] ["campaign leader meet error"] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.372 +08:00] [ERROR] [server.go:1117] ["campaign leader meet error"] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.373 +08:00] [ERROR] [server.go:1117] ["campaign leader meet error"] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.374 +08:00] [ERROR] [server.go:1117] ["campaign leader meet error"] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.375 +08:00] [ERROR] [server.go:1117] ["campaign leader meet error"] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.377 +08:00] [ERROR] [server.go:1117] ["campaign leader meet error"] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.378 +08:00] [ERROR] [server.go:1117] ["campaign leader meet error"] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.379 +08:00] [ERROR] [server.go:1117] ["campaign leader meet error"] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.380 +08:00] [ERROR] [server.go:1117] ["campaign leader meet error"] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.381 +08:00] [ERROR] [server.go:1117] ["campaign leader meet error"] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.382 +08:00] [ERROR] [server.go:1117] ["campaign leader meet error"] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.383 +08:00] [ERROR] [server.go:1117] ["campaign leader meet error"] [error="[PD:etcd:ErrEtcdGrantLease]etcdserver: mvcc: database space exceeded"]
    [2020/12/15 12:08:52.384 +08:00] [ERROR] [server.go:1117] ["campaign leader meet error"] [error="
    ...

===pd_stderr.log
{"level":"warn","ts":"2020-12-15T12:08:52.212+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-9d3f220b-898b-4e8c-a321-a5d3c580ad9b/10.18.0.182:2379","attempt":0,"error":"rpc error: code = ResourceExhausted desc = etcdserver: mvcc: database space exceeded"}
{"level":"warn","ts":"2020-12-15T12:08:52.212+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-9d3f220b-898b-4e8c-a321-a5d3c580ad9b/10.18.0.182:2379","attempt":0,"error":"rpc error: code = NotFound desc = etcdserver: requested lease not found"}
{"level":"warn","ts":"2020-12-15T12:08:52.213+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-9d3f220b-898b-4e8c-a321-a5d3c580ad9b/10.18.0.182:2379","attempt":0,"error":"rpc error: code = ResourceExhausted desc = etcdserver: mvcc: database space exceeded"}
{"level":"warn","ts":"2020-12-15T12:08:52.213+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-9d3f220b-898b-4e8c-a321-a5d3c580ad9b/10.18.0.182:2379","attempt":0,"error":"rpc error: code = NotFound desc = etcdserver: requested lease not found"}
{"level":"warn","ts":"2020-12-15T12:08:52.214+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-9d3f220b-898b-4e8c-a321-a5d3c580ad9b/10.18.0.182:2379","attempt":0,"error":"rpc error: code = ResourceExhausted desc = etcdserver: mvcc: database space exceeded"}
{"level":"warn","ts":"2020-12-15T12:08:52.214+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-9d3f220b-898b-4e8c-a321-a5d3c580ad9b/10.18.0.182:2379","attempt":0,"error":"rpc error: code = NotFound desc = etcdserver: requested lease not found"}
{"level":"warn","ts":"2020-12-15T12:08:52.215+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-9d3f220b-898b-4e8c-a321-a5d3c580ad9b/10.18.0.182:2379","attempt":0,"error":"rpc error: code = ResourceExhausted desc = etcdserver: mvcc: database space exceeded"}
{"level":"warn","ts":"2020-12-15T12:08:52.215+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-9d3f220b-898b-4e8c-a321-a5d3c580ad9b/10.18.0.182:2379","attempt":0,"error":"rpc error: code = NotFound desc = etcdserver: requested lease not found"}
{"level":"warn","ts":"2020-12-15T12:08:52.267+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-9d3f220b-898b-4e8c-a321-a5d3c580ad9b/10.18.0.182:2379","attempt":0,"error":"rpc error: code = NotFound desc = etcdserver: requested lease not found"}
{"level":"warn","ts":"2020-12-15T12:08:52.319+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-9d3f220b-898b-4e8c-a321-a5d3c580ad9b/10.18.0.182:2379","attempt":0,"error":"rpc error: code = NotFound desc = etcdserver: requested lease not found"}
{"level":"warn","ts":"2020-12-15T12:08:52.372+0800","caller":"clientv3/retry_interceptor.go:61","msg":"retrying of unary invoker failed","target":"endpoint://client-9d3f220b-898b-4e8c-a321-a5d3c580ad9b/10.18.0.182:2379","attempt":0,"error":"rpc error: code = NotFound desc = etcdserver: requested lease not found"}

"etcdserver: mvcc: database space exceeded" means etcd has exceeded its db size quota. You can first follow the etcd documentation to free up space and recover the cluster:

# get current revision
$ rev=$(ETCDCTL_API=3 etcdctl --endpoints=:2379 endpoint status --write-out="json" | egrep -o '"revision":[0-9]*' | egrep -o '[0-9].*')
# compact away all old revisions
$ ETCDCTL_API=3 etcdctl compact $rev
compacted revision 1516
# defragment away excessive space
$ ETCDCTL_API=3 etcdctl defrag
Finished defragmenting etcd member[127.0.0.1:2379]
# disarm alarm
$ ETCDCTL_API=3 etcdctl alarm disarm
memberID:13803658152347727308 alarm:NOSPACE
# test puts are allowed again
$ ETCDCTL_API=3 etcdctl put newkey 123
OK

The cause may be https://github.com/pingcap/ticdc/issues/993

The cluster is currently on 4.0.8. Was it upgraded from an earlier version?

In versions before v4.0.8, TiCDC could fill up the PD etcd db size (default 8GB) and cause PD to crash

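  • Case1: The upstream TiDB creates tables frequently, for example 1000 tables/1h, and these tables are replicated downstream through TiCDC. Versions after v4.0.8 support 5000 tables/1h at a 4GB db size watermark
  • Case2: TiCDC is used to replicate a large number of tables
    For example, replicating 1400 tables within 2 hours takes up about 7.51GB of db size: N * a * (time window / key flushed interval) = 1400 * 40 * (2 * 3600 / 0.05). The v4.0.8 fix stops task status from being flushed to etcd continuously, so this kind of case should not occur again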
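
The Case2 estimate can be checked with a few lines of arithmetic. This is only a sketch: the thread does not define the symbols, so it assumes N is the number of replicated tables, a is the number of bytes written per table on each flush, and the key flushed interval is in seconds. The function and parameter names are made up for illustration.

```python
def etcd_growth_bytes(tables, bytes_per_flush, window_s, flush_interval_s):
    """Estimate db size growth as N * a * (time window / key flushed interval).

    Assumed meaning of the symbols (not stated in the thread):
    tables = N, bytes_per_flush = a, both durations in seconds.
    """
    flushes = window_s / flush_interval_s  # flushes per table within the window
    return tables * bytes_per_flush * flushes


# Numbers quoted in Case2: 1400 tables, a = 40, 2-hour window, 0.05 s interval
total = etcd_growth_bytes(1400, 40, 2 * 3600, 0.05)
print(f"{total / 2**30:.2f} GiB")  # prints "7.51 GiB"
```

Under these assumptions the result matches the 7.51GB quoted above, which is close to the default 8GB quota.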

It was not upgraded from an earlier version; the cluster was installed directly at 4.0.8. I will recover the cluster first.

Has the problem been resolved? Was it caused by TiCDC?
