After quickly deploying a TiDB cluster with Docker Compose, the PD nodes stay in the Restarting state and report an "invalid argument" error

  • 【TiDB version】:

  • 【Problem description】: I followed "Deploy a TiDB cluster quickly with Docker Compose" (https://pingcap.com/docs-cn/stable/how-to/get-started/deploy-tidb-from-docker-compose/) to deploy a single-machine cluster, but PD fails to start.

          $ docker-compose.exe  ps
                     Name                             Command                 State                            Ports
      ------------------------------------------------------------------------------------------------------------------------------------
      tidbdockercompose_grafana_1          /run.sh                          Up           0.0.0.0:3000->3000/tcp
      tidbdockercompose_pd0_1              /pd-server --name=pd0 --cl ...   Restarting
      tidbdockercompose_pd1_1              /pd-server --name=pd1 --cl ...   Restarting
      tidbdockercompose_pd2_1              /pd-server --name=pd2 --cl ...   Restarting
      tidbdockercompose_prometheus_1       /bin/prometheus --log.leve ...   Exit 0
      tidbdockercompose_pushgateway_1      /bin/pushgateway --log.lev ...   Up           9091/tcp
      tidbdockercompose_tidb-vision_1      /bin/sh -c sed -i -e "s/PD ...   Up           2015/tcp, 443/tcp, 80/tcp, 0.0.0.0:8010->8010/tcp
      tidbdockercompose_tidb_1             /tidb-server --store=tikv  ...   Up           0.0.0.0:10080->10080/tcp, 0.0.0.0:4000->4000/tcp
      tidbdockercompose_tikv0_1            /tikv-server --addr=0.0.0. ...   Up           20160/tcp
      tidbdockercompose_tikv1_1            /tikv-server --addr=0.0.0. ...   Up           20160/tcp
      tidbdockercompose_tikv2_1            /tikv-server --addr=0.0.0. ...   Up           20160/tcp
      tidbdockercompose_tispark-master_1   /opt/spark/sbin/start-mast ...   Up           0.0.0.0:7077->7077/tcp, 0.0.0.0:8080->8080/tcp
      tidbdockercompose_tispark-slave0_1   /opt/spark/sbin/start-slav ...   Up           0.0.0.0:38081->38081/tcp
    

The contents of pd1.log are as follows:

  [2020/02/14 06:52:45.467 +00:00] [INFO] [util.go:50] ["Welcome to Placement Driver (PD)"]
[2020/02/14 06:52:45.468 +00:00] [INFO] [util.go:51] [PD] [release-version=v4.0.0-beta-19-gc5d36ddf]
[2020/02/14 06:52:45.468 +00:00] [INFO] [util.go:52] [PD] [git-hash=c5d36ddfeeb27a6c43b0eb20637d372ab5704491]
[2020/02/14 06:52:45.468 +00:00] [INFO] [util.go:53] [PD] [git-branch=master]
[2020/02/14 06:52:45.468 +00:00] [INFO] [util.go:54] [PD] [utc-build-time="2020-02-13 11:45:41"]
[2020/02/14 06:52:45.468 +00:00] [INFO] [metricutil.go:85] ["start Prometheus push client"]
[2020/02/14 06:52:45.468 +00:00] [INFO] [server.go:202] ["PD Config"] [config="{\"client-urls\":\"http://0.0.0.0:2379\",\"peer-urls\":\"http://0.0.0.0:2380\",\"advertise-client-urls\":\"http://pd2:2379\",\"advertise-peer-urls\":\"http://pd2:2380\",\"name\":\"pd2\",\"data-dir\":\"/data/pd2\",\"force-new-cluster\":false,\"enable-grpc-gateway\":true,\"initial-cluster\":\"pd0=http://pd0:2380,pd1=http://pd1:2380,pd2=http://pd2:2380\",\"initial-cluster-state\":\"new\",\"join\":\"\",\"lease\":3,\"log\":{\"level\":\"debug\",\"format\":\"text\",\"disable-timestamp\":false,\"file\":{\"filename\":\"/logs/pd2.log\",\"max-size\":300,\"max-days\":0,\"max-backups\":0},\"development\":false,\"disable-caller\":false,\"disable-stacktrace\":false,\"disable-error-verbose\":true,\"sampling\":null},\"log-file\":\"\",\"log-level\":\"\",\"tso-save-interval\":\"3s\",\"metric\":{\"job\":\"pd2\",\"address\":\"pushgateway:9091\",\"interval\":\"15s\"},\"schedule\":{\"max-snapshot-count\":3,\"max-pending-peer-count\":16,\"max-merge-region-size\":0,\"max-merge-region-keys\":200000,\"split-merge-interval\":\"1h0m0s\",\"enable-one-way-merge\":\"false\",\"enable-cross-table-merge\":\"false\",\"patrol-region-interval\":\"100ms\",\"max-store-down-time\":\"30m0s\",\"leader-schedule-limit\":4,\"leader-schedule-policy\":\"count\",\"region-schedule-limit\":4,\"replica-schedule-limit\":8,\"merge-schedule-limit\":8,\"hot-region-schedule-limit\":4,\"hot-region-cache-hits-threshold\":3,\"store-balance-rate\":15,\"tolerant-size-ratio\":5,\"low-space-ratio\":0.8,\"high-space-ratio\":0.6,\"scheduler-max-waiting-operator\":3,\"enable-remove-down-replica\":\"true\",\"enable-replace-offline-replica\":\"true\",\"enable-make-up-replica\":\"true\",\"enable-remove-extra-replica\":\"true\",\"enable-location-replacement\":\"true\",\"enable-debug-metrics\":\"false\",\"schedulers-v2\":[{\"type\":\"balance-region\",\"args\":null,\"disable\":false,\"args-payload\":\"\"},{\"type\":\"balance-leader\",\"args\":null,\"disable\":
false,\"args-payload\":\"\"},{\"type\":\"hot-region\",\"args\":null,\"disable\":false,\"args-payload\":\"\"},{\"type\":\"label\",\"args\":null,\"disable\":false,\"args-payload\":\"\"}],\"store-limit-mode\":\"manual\"},\"replication\":{\"max-replicas\":3,\"location-labels\":\"\",\"strictly-match-label\":\"false\",\"enable-placement-rules\":\"false\"},\"pd-server\":{\"use-region-storage\":\"true\",\"max-reset-ts-gap\":86400000000000,\"key-type\":\"table\",\"runtime-services\":\"\",\"metric-storage\":\"\"},\"cluster-version\":\"0.0.0\",\"quota-backend-bytes\":\"8GiB\",\"auto-compaction-mode\":\"periodic\",\"auto-compaction-retention-v2\":\"1h\",\"TickInterval\":\"500ms\",\"ElectionInterval\":\"3s\",\"PreVote\":true,\"security\":{\"cacert-path\":\"\",\"cert-path\":\"\",\"key-path\":\"\"},\"label-property\":{},\"WarningMsgs\":null,\"DisableStrictReconfigCheck\":false,\"HeartbeatStreamBindInterval\":\"1m0s\",\"LeaderPriorityCheckInterval\":\"1m0s\",\"EnableDynamicConfig\":false,\"EnableDashboard\":true}"]
[2020/02/14 06:52:45.470 +00:00] [INFO] [server.go:184] ["register REST path"] [path=/pd/api/v1]
[2020/02/14 06:52:45.470 +00:00] [INFO] [server.go:238] ["Enabled Dashboard API"] [path=/dashboard/api/]
[2020/02/14 06:52:45.473 +00:00] [INFO] [server.go:239] ["Enabled Dashboard UI"] [path=/dashboard/]
[2020/02/14 06:52:45.473 +00:00] [INFO] [etcd.go:117] ["configuring peer listeners"] [listen-peer-urls="[http://0.0.0.0:2380]"]
[2020/02/14 06:52:45.473 +00:00] [INFO] [etcd.go:127] ["configuring client listeners"] [listen-client-urls="[http://0.0.0.0:2379]"]
[2020/02/14 06:52:45.473 +00:00] [INFO] [etcd.go:602] ["pprof is enabled"] [path=/debug/pprof]
[2020/02/14 06:52:45.473 +00:00] [INFO] [systime_mon.go:26] ["start system time monitor"]
[2020/02/14 06:52:45.474 +00:00] [INFO] [etcd.go:299] ["starting an etcd server"] [etcd-version=3.4.3] [git-sha="Not provided (use ./build instead of go build)"] [go-version=go1.13] [go-os=linux] [go-arch=amd64] [max-cpu-set=1] [max-cpu-available=1] [member-initialized=false] [name=pd2] [data-dir=/data/pd2] [wal-dir=] [wal-dir-dedicated=] [member-dir=/data/pd2/member] [force-new-cluster=false] [heartbeat-interval=500ms] [election-timeout=3s] [initial-election-tick-advance=true] [snapshot-count=100000] [snapshot-catchup-entries=5000] [initial-advertise-peer-urls="[http://pd2:2380]"] [listen-peer-urls="[http://0.0.0.0:2380]"] [advertise-client-urls="[http://pd2:2379]"] [listen-client-urls="[http://0.0.0.0:2379]"] [listen-metrics-urls="[]"] [cors="[*]"] [host-whitelist="[*]"] [initial-cluster="pd0=http://pd0:2380,pd1=http://pd1:2380,pd2=http://pd2:2380"] [initial-cluster-state=new] [initial-cluster-token=etcd-cluster] [quota-size-bytes=8589934592] [pre-vote=true] [initial-corrupt-check=false] [corrupt-check-time-interval=0s] [auto-compaction-mode=periodic] [auto-compaction-retention=1h0m0s] [auto-compaction-interval=1h0m0s] [discovery-url=] [discovery-proxy=]
[2020/02/14 06:52:45.482 +00:00] [PANIC] [backend.go:157] ["failed to open database"] [path=/data/pd2/member/snap/db] [error="invalid argument"]

However, it is not clear which argument is wrong. Any pointers would be much appreciated. Thanks.
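When a PD log is this long, it can help to filter it down to the high-severity entries first. Below is a minimal sketch, assuming the log uses the bracketed level format shown above ([INFO], [PANIC], and so on). The function name pd_log_errors is made up for illustration, and the log path is whatever your setup writes to.

```shell
#!/bin/sh
# Print only the PANIC, FATAL, and ERROR entries of a PD log file.
# Assumption: each entry looks like "[timestamp] [LEVEL] [file:line] ...".
# pd_log_errors is a hypothetical helper name, not part of any TiDB tooling.
pd_log_errors() {
    # -F matches fixed strings, so the square brackets need no escaping.
    grep -F -e '] [PANIC]' -e '] [FATAL]' -e '] [ERROR]' "$1"
}
```

Called as pd_log_errors path/to/pd2.log, this should leave only lines such as the "failed to open database" PANIC entry at the end of the log above.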

The etcd data file appears to be missing. My guess is that data-dir was not cleaned out completely; could you please check?

After deleting the tidb-docker-compose\data directory and restarting docker-compose, the behavior is still the same as before.

[data-dir=/data/pd1] Under tidb-docker-compose\data\pd1\ there is only a single folder path, namely member\snap\db

Could you let me know what other information I should provide, or where else I should look? Thanks.

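Because this TiDB cluster is being re-created under Docker Compose, I suggest removing all of its containers listed by docker ps -a, and also the data files in the host file system directory that Docker Compose maps into the containers.
Then try creating the cluster again.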

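After deleting the containers and the entire tidb-docker-compose directory, re-creating the cluster still gives the same problem. Is there anything else I can try?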

  Cheng@JIANGXIUQIANG MINGW64 ~/tidb-docker-compose (master)
    $ docker-compose.exe ps
                   Name                             Command                 State                            Ports
    ------------------------------------------------------------------------------------------------------------------------------------
    tidbdockercompose_grafana_1          /run.sh                          Up           0.0.0.0:3000->3000/tcp
    tidbdockercompose_pd0_1              /pd-server --name=pd0 --cl ...   Restarting
    tidbdockercompose_pd1_1              /pd-server --name=pd1 --cl ...   Restarting
    tidbdockercompose_pd2_1              /pd-server --name=pd2 --cl ...   Restarting
    tidbdockercompose_prometheus_1       /bin/prometheus --log.leve ...   Exit 0
    tidbdockercompose_pushgateway_1      /bin/pushgateway --log.lev ...   Up           9091/tcp
    tidbdockercompose_tidb-vision_1      /bin/sh -c sed -i -e "s/PD ...   Up           2015/tcp, 443/tcp, 80/tcp, 0.0.0.0:8010->8010/tcp
    tidbdockercompose_tidb_1             /tidb-server --store=tikv  ...   Up           0.0.0.0:10080->10080/tcp, 0.0.0.0:4000->4000/tcp
    tidbdockercompose_tikv0_1            /tikv-server --addr=0.0.0. ...   Up           20160/tcp
    tidbdockercompose_tikv1_1            /tikv-server --addr=0.0.0. ...   Up           20160/tcp
    tidbdockercompose_tikv2_1            /tikv-server --addr=0.0.0. ...   Up           20160/tcp
    tidbdockercompose_tispark-master_1   /opt/spark/sbin/start-mast ...   Up           0.0.0.0:7077->7077/tcp, 0.0.0.0:8080->8080/tcp
    tidbdockercompose_tispark-slave0_1   /opt/spark/sbin/start-slav ...   Up           0.0.0.0:38081->38081/tcp

P.S.
The environment was also freshly created when I first asked this question.

Please post the steps and the exact commands you used to clean up the docker-compose containers.

I used docker rm <container id> -f; the final result is shown below. I then deleted the whole tidb-docker-compose directory and pulled it again from git.

Cheng@JIANGXIUQIANG MINGW64 ~/tidb-docker-compose (master)
$ docker ps
time="2020-02-17T14:06:46+08:00" level=info msg="Unable to use system certificate pool: crypto/x509: system root pool is not available on Windows"
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES

Cheng@JIANGXIUQIANG MINGW64 ~/tidb-docker-compose (master)

Have you confirmed that your machine has sufficient resources?
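One data point already in the thread: the "starting an etcd server" line in the PD log above reports max-cpu-available=1, so the container may be seeing only a single CPU. A minimal sketch for pulling that value out of a PD log, assuming the field appears in the bracketed form shown in that line; the function name pd_cpu_available is hypothetical.

```shell
#!/bin/sh
# Print the max-cpu-available value from the most recent etcd startup
# line in a PD log file. pd_cpu_available is a hypothetical helper name.
pd_cpu_available() {
    # Capture the digits inside "[max-cpu-available=N]"; keep the last match.
    sed -n 's/.*\[max-cpu-available=\([0-9][0-9]*\)].*/\1/p' "$1" | tail -n 1
}
```

For the host side, docker info --format '{{.NCPU}} {{.MemTotal}}' prints the CPU count and total memory (in bytes) that the Docker host reports.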