TiDB monitoring component upgrade fails with an error

pangyana · June 23, 2020, 08:49

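To help us respond efficiently, please provide the following information when asking a question. Clearly described problems get priority.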

[TiDB version]: upgrading from 3.0.9 to 4.0.1
[Problem description]:
Upgrading the monitoring components fails with an error:

ansible-playbook rolling_update_monitor.yml

ERROR MESSAGE SUMMARY **********************************************************************************************************************************************************************************************************************
[172.16.xxx.xx]: Ansible Failed! ==>
changed=false
connection: close
content: '{"message":"Invalid username or password"}'
content_length: '42'
content_type: application/json; charset=UTF-8
date: Tue, 23 Jun 2020 08:43:26 GMT
json:
message: Invalid username or password
msg: 'Status code was 401 and not [200]: HTTP Error 401: Unauthorized'
redirected: false
status: 401
url: http://172.16.xxx.xx:3000/api/auth/keys

yilong · June 23, 2020, 09:03

The Grafana password was probably changed at some point. Please update the Grafana password in inventory.ini to match the current one. Thanks.
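
Before rerunning the playbook, it can help to confirm that the credentials in inventory.ini are the ones Grafana currently accepts. The following is a minimal sketch, not part of tidb-ansible: the function names and the arguments (base URL, user, password) are placeholders. It queries the same /api/auth/keys endpoint that appears in the error output, using HTTP basic authentication. In tidb-ansible the credentials are typically stored as grafana_admin_user and grafana_admin_password in inventory.ini, but check your own file.

```shell
# Hypothetical helper: print the HTTP status Grafana returns for the given credentials.
# Arguments (all placeholders): base URL, user name, password.
grafana_auth_status() {
  local base_url="$1" user="$2" pass="$3"
  # -s: silent, -o /dev/null: discard the body, -w: print only the status code
  curl -s -o /dev/null -w '%{http_code}' -u "${user}:${pass}" "${base_url}/api/auth/keys"
}

# Hypothetical helper: turn a status code into a short hint.
explain_status() {
  case "$1" in
    200) echo "credentials accepted" ;;
    401) echo "invalid username or password" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}
```

For example, explain_status "$(grafana_auth_status http://172.16.xxx.xx:3000 admin 'your-password')" should report "credentials accepted" once inventory.ini holds the right password; a 401 means the playbook will fail the same way as above.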

pangyana · June 23, 2020, 09:06

One more question:
TASK [wait until the node_exporter port is down]

This task always hangs, and the only way I can get past it is to kill the process directly. Why does this happen?

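yilong · June 23, 2020, 10:02

Check whether the process can be stopped with the stop script. The scripts are in the /scripts directory under the installation directory.
Also check whether another program is occupying the port. Thanks.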
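
One way to see which program holds the node_exporter port is to list the listening TCP sockets together with their owning processes. This is a rough sketch that assumes the ss utility from iproute2 is available and that the port is 9100; the function names are made up for illustration.

```shell
# Hypothetical helper: keep only the `ss -ltnp` lines whose local address ends in :<port>.
# Reads ss output on stdin so the filtering can be checked without a live system.
filter_port() {
  awk -v pat=":$1\$" '$4 ~ pat'
}

# Hypothetical helper: show listeners on a port, including PID and program name.
# Run as root, otherwise processes owned by other users are not shown.
listeners_on_port() {
  ss -ltnp | filter_port "$1"
}
```

Running listeners_on_port 9100 prints the socket line with a users:(("name",pid=...)) field. If the PID or binary path there differs from what stop_node_exporter.sh targets, the script is not managing the process that actually holds the port.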

pangyana · June 24, 2020, 06:24

bash stop_node_exporter.sh
This does not stop the process; nothing happens.

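yilong · June 24, 2020, 07:28

Please check whether the script actually corresponds to the running process; use cat to inspect it.
Also check the /etc/systemd/system directory: is there another exporter service using the same IP and port, which would prevent the stop?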

pangyana · June 24, 2020, 08:26

ps -ef|grep node
root 10345 1 0 Jun23 ? 00:03:58 /usr/local/services/prometheus_exporters/node_exporter-0.14.0.linux-amd64/node_exporter &

Is this different from the node_exporter in the script?

bash run_node_exporter.sh
fails with:
time="2020-06-24T16:25:30+08:00" level=fatal msg="listen tcp :9100: bind: address already in use" source="node_exporter.go:114"

yilong · June 24, 2020, 09:16

Use the checks above to compare them yourself and see whether there is a duplicate.

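pangyana · June 24, 2020, 09:35

Right. What I mean is that there is a node_exporter process, but it was started automatically and is not the one started by the script, so the script cannot stop it.
What I don't know is why this one gets started automatically: /usr/local/services/prometheus_exporters/node_exporter-0.14.0.linux-amd64/node_exporter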
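
To find out what launched a stray process, one option on a systemd host is to read the process's cgroup membership, which names the owning unit if systemd started it. This is a sketch under that assumption; the function names are hypothetical, and the PID 10345 comes from the ps output above.

```shell
# Hypothetical helper: extract the first *.service name from /proc/<pid>/cgroup-style
# lines read on stdin. Prints nothing if the process does not belong to a service unit.
unit_from_cgroup() {
  grep -o '[^/]*\.service' | head -n 1
}

# Hypothetical helper: print the systemd service unit (if any) that owns a PID.
unit_of_pid() {
  unit_from_cgroup < "/proc/$1/cgroup"
}
```

For example, unit_of_pid 10345 prints a unit name if the process was started by a service. Empty output suggests it was launched some other way, for instance from a boot script or cron entry, which would have to be checked separately. On systemd hosts, systemctl status 10345 gives similar information.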

yilong · June 24, 2020, 10:31

Look in the /etc/systemd/system directory and see whether another service file is starting it. If so, stop it with systemd, then continue.

pangyana · June 28, 2020, 07:31

It doesn't look like it:

cd /etc/systemd/system/

ll|grep node
-rw-r--r-- 1 root root 318 Jan 18 15:27 node_exporter-9100.service

yilong · June 28, 2020, 09:32

Then stop it with systemd and try the operation again. Thanks.
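
Stopping the unit through systemd and then confirming it is really down could look like the sketch below. The unit name node_exporter-9100.service is taken from the directory listing above; the function name and the SYSTEMCTL variable are made up here so the command can be swapped out, and stopping a unit normally requires root.

```shell
# Command used to talk to systemd; overridable, defaults to the real systemctl.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

# Hypothetical helper: stop a unit, then verify that it is no longer active.
stop_and_verify() {
  local unit="$1"
  "$SYSTEMCTL" stop "$unit" || return 1
  # `is-active --quiet` exits 0 only while the unit is still running
  if "$SYSTEMCTL" is-active --quiet "$unit"; then
    echo "$unit is still active"
    return 1
  fi
  echo "$unit stopped"
}
```

After stop_and_verify node_exporter-9100.service reports that the unit stopped, rerun ansible-playbook rolling_update_monitor.yml. If port 9100 is still in use at that point, the listener is not managed by this unit and needs to be tracked down separately.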
