Error when upgrading TiDB

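To help us respond efficiently, please provide the following information when asking. Clearly described questions are answered first.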

  • [TiDB version]: 2.0.6
  • [Problem description]: While upgrading from 2.0.6 to 2.1, the PD config check fails with the following error:

TASK [check_config_pd : Check PD config] ******************************************************************************************************************************************************
fatal: [10.240.17.203]: FAILED! => {"changed": true, "cmd": "cd /tmp/tidb_check_config && ./pd-server -config ./pd.toml -config-check", "delta": "0:00:00.318676", "end": "2019-12-27 00:59:02.183263", "msg": "non-zero return code", "rc": 1, "start": "2019-12-27 00:59:01.864587", "stderr": "flag provided but not defined: -config-check\ Usage of pd:\ -L string\ \tlog level: debug, info, warn, error, fatal (default 'info')\ -V\tprint version information and exit\ -advertise-client-urls string\ \tadvertise url for client traffic (default '${client-urls}')\ -advertise-peer-urls string\ \tadvertise url for peer traffic (default '${peer-urls}')\ -cacert string\ \tPath of file that contains list of trusted TLS CAs\ -cert string\ \tPath of file that contains X509 certificate in PEM format\ -client-urls string\ \turl for client traffic (default "http://127.0.0.1:2379")\ -config string\ \tConfig file\ -data-dir string\ \tpath to the data directory (default 'default.${name}')\ -initial-cluster string\ \tinitial cluster configuration for bootstrapping, e,g. pd=http://127.0.0.1:2380\ -join string\ \tjoin to an existing cluster (usage: cluster's '${advertise-client-urls}'\ -key string\ \tPath of file that contains X509 key in PEM format\ -log-file string\ \tlog file path\ -log-rotate\ \trotate log (default true)\ -name string\ \thuman-readable name for this pd member (default "pd")\ -namespace-classifier string\ \tnamespace classifier (default 'table') (default "table")\ -peer-urls string\ \turl for peer traffic (default "http://127.0.0.1:2380")\ -version\ \tprint version information and exit\ time="2019-12-27T00:59:02+08:00" level=fatal msg="parse cmd flags error: flag provided but not defined: -config-check\ngithub.com/pingcap/pd/server.(*Config).Parse\ \t/home/jenkins/workspace/release_tidb_2.1-ga/go/src/github.com/pingcap/pd/server/config.go:227\ main.main\ \t/home/jenkins/workspace/release_tidb_2.1-ga/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:40\ runtime.main\ \t/usr/local/go/src/runtime/proc.go:200\ runtime.goexit\ \t/usr/local/go/src/runtime/asm_amd64.s:1337\ "", "stderr_lines": ["flag provided but not defined: -config-check", "Usage of pd:", " -L string", " \tlog level: debug, info, warn, error, fatal (default 'info')", " -V\tprint version information and exit", " -advertise-client-urls string", " \tadvertise url for client traffic (default '${client-urls}')", " -advertise-peer-urls string", " \tadvertise url for peer traffic (default '${peer-urls}')", " -cacert string", " \tPath of file that contains list of trusted TLS CAs", " -cert string", " \tPath of file that contains X509 certificate in PEM format", " -client-urls string", " \turl for client traffic (default "http://127.0.0.1:2379")", " -config string", " \tConfig file", " -data-dir string", " \tpath to the data directory (default 'default.${name}')", " -initial-cluster string", " \tinitial cluster configuration for bootstrapping, e,g. pd=http://127.0.0.1:2380", " -join string", " \tjoin to an existing cluster (usage: cluster's '${advertise-client-urls}'", " -key string", " \tPath of file that contains X509 key in PEM format", " -log-file string", " \tlog file path", " -log-rotate", " \trotate log (default true)", " -name string", " \thuman-readable name for this pd member (default "pd")", " -namespace-classifier string", " \tnamespace classifier (default 'table') (default "table")", " -peer-urls string", " \turl for peer traffic (default "http://127.0.0.1:2380")", " -version", " \tprint version information and exit", "time="2019-12-27T00:59:02+08:00" level=fatal msg="parse cmd flags error: flag provided but not defined: -config-check\ngithub.com/pingcap/pd/server.(*Config).Parse\ \t/home/jenkins/workspace/release_tidb_2.1-ga/go/src/github.com/pingcap/pd/server/config.go:227\ main.main\ \t/home/jenkins/workspace/release_tidb_2.1-ga/go/src/github.com/pingcap/pd/cmd/pd-server/main.go:40\ runtime.main\ \t/usr/local/go/src/runtime/proc.go:200\ runtime.goexit\ \t/usr/local/go/src/runtime/asm_amd64.s:1337\ ""], "stdout": "", "stdout_lines": []}
to retry, use: --limit @/home/tidb/tidb-ansible/retry_files/rolling_update.retry

Please help me find the cause. Let me know which configuration files you need to see and I will post them.

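If your question is about performance tuning or troubleshooting, please download and run the script, then select all of the terminal output and paste it here.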

TiKV fails with the same kind of error:

TASK [check_config_tikv : Check TiKV config] **************************************************************************************************************************************************
fatal: [TiKV1-1]: FAILED! => {"changed": true, "cmd": "cd /tmp/tidb_check_config && ./tikv-server --pd-endpoints pd:port --config ./tikv.toml --config-check", "delta": "0:00:00.352911", "end": "2019-12-27 01:09:33.149723", "msg": "non-zero return code", "rc": 1, "start": "2019-12-27 01:09:32.796812", "stderr": "error: Found argument '--config-check' which wasn't expected, or isn't valid in this context\ \tDid you mean \u001b[32m--\u001b[0m\u001b[32mconfig\u001b[0m?\ \ USAGE:\ tikv-server --config --pd-endpoints <PD_URL>…\ \ For more information try --help", "stderr_lines": ["error: Found argument '--config-check' which wasn't expected, or isn't valid in this context", "\tDid you mean \u001b[32m--\u001b[0m\u001b[32mconfig\u001b[0m?", "", "USAGE:", " tikv-server --config --pd-endpoints <PD_URL>…", "", "For more information try --help"], "stdout": "", "stdout_lines": []}
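Both config-check failures follow one pattern: the binary in /tmp/tidb_check_config rejects the config-check flag that the playbook passes to it. That may be worth reading as a hint that the playbook and the binaries come from different releases, though the logs alone do not prove it. A minimal Python sketch for spotting this pattern in a captured stderr string, assuming the raw ASCII form of the two messages; the helper name `rejected_flag` is hypothetical and not part of tidb-ansible:

```python
import re

# Patterns based on the two stderr messages in this thread.
# pd-server:   "flag provided but not defined: -config-check"
# tikv-server: "Found argument '--config-check' which wasn't expected"
_PATTERNS = [
    re.compile(r"flag provided but not defined: (-{1,2}[\w-]+)"),
    re.compile(r"Found argument '(-{1,2}[\w-]+)' which wasn't expected"),
]


def rejected_flag(stderr):
    """Return the flag the binary refused, or None for any other failure."""
    for pattern in _PATTERNS:
        match = pattern.search(stderr)
        if match:
            return match.group(1)
    return None
```

If this returns a flag, the check itself never ran, so the failure says nothing about whether pd.toml or tikv.toml is valid.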

After commenting out the config checks above, a different error appears:

TASK [Check pd cluster status] ****************************************************************************************************************************************************************
fatal: [10.240.17.203]: FAILED! => {"changed": false, "content": "", "msg": "Status code was -1 and not [200]: Request failed: <urlopen error [Errno 111] Connection refused>", "redirected": false, "status": -1, "url": "http://10.240.17.203:2379/pd/health"}

NO MORE HOSTS LEFT ****************************************************************************************************************************************************************************
to retry, use: --limit @/home/tidb/tidb-ansible/retry_files/rolling_update.retry

PLAY RECAP ************************************************************************************************************************************************************************************
10.240.17.200 : ok=10 changed=1 unreachable=0 failed=0
10.240.17.201 : ok=9 changed=1 unreachable=0 failed=0
10.240.17.203 : ok=8 changed=0 unreachable=0 failed=1
10.240.17.204 : ok=8 changed=0 unreachable=0 failed=0
10.240.17.205 : ok=8 changed=0 unreachable=0 failed=0
10.240.17.206 : ok=8 changed=0 unreachable=0 failed=0
10.240.17.207 : ok=8 changed=0 unreachable=0 failed=0
10.240.17.208 : ok=8 changed=0 unreachable=0 failed=0
TiKV1-1 : ok=3 changed=0 unreachable=0 failed=0
TiKV1-2 : ok=3 changed=0 unreachable=0 failed=0
TiKV2-1 : ok=3 changed=0 unreachable=0 failed=0
TiKV2-2 : ok=3 changed=0 unreachable=0 failed=0
TiKV3-1 : ok=3 changed=0 unreachable=0 failed=0
TiKV3-2 : ok=3 changed=0 unreachable=0 failed=0
localhost : ok=7 changed=4 unreachable=0 failed=0
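With this many hosts it is easy to miss the one row that failed. A small Python sketch that picks out hosts with a non-zero `failed` or `unreachable` counter from pasted recap text; the function name `hosts_with_problems` is made up for illustration, and the row format is assumed to match the recap above:

```python
import re

# One recap row looks like:
#   "10.240.17.203 : ok=8 changed=0 unreachable=0 failed=1"
_RECAP_ROW = re.compile(r"^\s*(\S+)\s*:\s*((?:\w+=\d+\s*)+)$")


def hosts_with_problems(recap_text):
    """Return hosts whose recap row has failed > 0 or unreachable > 0."""
    bad = []
    for line in recap_text.splitlines():
        match = _RECAP_ROW.match(line)
        if not match:
            continue  # skips the "PLAY RECAP ***" header and blank lines
        host, counters = match.group(1), match.group(2)
        stats = {}
        for pair in counters.split():
            key, value = pair.split("=")
            stats[key] = int(value)
        if stats.get("failed", 0) > 0 or stats.get("unreachable", 0) > 0:
            bad.append(host)
    return bad
```

For the recap in this post, the only row with a non-zero counter is 10.240.17.203 (failed=1).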

ERROR MESSAGE SUMMARY *************************************************************************************************************************************************************************
[10.240.17.200]: Ansible UNREACHABLE! => playbook: stop.yml; TASK: wait until the pushgateway port is down; message: {"changed": false, "msg": "Failed to connect to the host via ssh: Shared connection to 10.240.17.200 closed.", "unreachable": true}

[10.240.17.203]: Ansible FAILED! => playbook: rolling_update.yml; TASK: Check pd cluster status; message: {"changed": false, "content": "", "msg": "Status code was -1 and not [200]: Request failed: <urlopen error [Errno 111] Connection refused>", "redirected": false, "status": -1, "url": "http://10.240.17.203:2379/pd/health"}
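Errno 111 on Linux means the TCP connection was refused, so nothing was listening on port 2379 of that host when the playbook polled it. A minimal Python sketch for repeating the same probe by hand from the control machine; `probe_pd_health` is a hypothetical helper, and the only thing taken from the log is the URL:

```python
import urllib.error
import urllib.request


def probe_pd_health(url, timeout=3.0):
    """Return (ok, detail) for an HTTP GET against a PD health URL."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200, "HTTP %d" % response.status
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return False, "HTTP %d" % exc.code
    except (urllib.error.URLError, OSError) as exc:
        # No HTTP answer at all, for example a refused connection
        # when no process is listening on the port.
        return False, str(getattr(exc, "reason", exc))
```

Calling `probe_pd_health("http://10.240.17.203:2379/pd/health")` while pd-server is down should return `False` with the connection error as the detail. In that case the next place to look is probably whether pd-server on that host started at all, rather than the playbook.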

1. Please share the detailed upgrade steps you followed.

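2. You are upgrading from 2.0.6 to 2.1.x: which exact 2.1 version is the target? Please confirm that the tidb-ansible you downloaded is the package that matches that target 2.1 version.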

3. Check whether the pushgateway service is reachable.

Thank you very much for your answer.

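We upgraded to version 2.1.6, following the steps at https://pingcap.com/docs-cn/stable/how-to/upgrade/from-previous-version/. In the end we commented out the following content in rolling_update.yml, and the upgrade then completed successfully.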

[Screenshot: the section of rolling_update.yml that was commented out]

:+1::+1::+1:
