TidbH
June 15, 2020 07:18
1
tidb:v4.0.0-rc
Error description:
Running tiup cluster import -d /home/tidb/tidb-ansible as the tidb user fails with the following permission error.
Error: can not detect dir paths of tiflash 192.168.1.1:9000, grep: /etc/systemd/system/tiflash-9000.service: Permission denied
Verbose debug logs has been written to /home/tidb/logs/tiup-cluster-debug-2020-06-15-15-06-17.log.
Error: run /home/tidb/.tiup/components/cluster/v1.0.4/tiup-cluster (wd:/home/tidb/.tiup/data/S1xgy64) failed: exit status 1
Manually setting the permissions of /etc/systemd/system/tiflash-9000.service to 777 did not help; the same error is reported.
Debug log, from cat /home/tidb/logs/tiup-cluster-debug-2020-06-15-15-06-17.log:
2020-06-15T14:38:52.101+0800 DEBUG Detecting deploy paths on 192.168.1.1...
2020-06-15T14:38:52.215+0800 INFO SSHCommand {"host": "192.168.1.1", "port": "22", "cmd": "PATH=$PATH:/usr/bin:/usr/sbin cat grep 'ExecStart' /etc/systemd/system/tiflash-9000.service | sed 's/ExecStart=//'", "stdout": "", "stderr": "grep: /etc/systemd/system/tiflash-9000.service: Permission denied\n"}
2020-06-15T14:38:52.215+0800 INFO Execute command finished {"code": 1, "error": "can not detect dir paths of tiflash 192.168.1.1:9000, grep: /etc/systemd/system/tiflash-9000.service: Permission denied\n", "errorVerbose": "can not detect dir paths of tiflash 192.168.1.1:9000, grep: /etc/systemd/system/tiflash-9000.service: Permission denied\n\ngithub.com/pingcap/tiup/pkg/cluster/ansible.readStartScript\n\tgithub.com/pingcap/tiup@/pkg/cluster/ansible/dirs.go:239\ngithub.com/pingcap/tiup/pkg/cluster/ansible.parseDirs\n\tgithub.com/pingcap/tiup@/pkg/cluster/ansible/dirs.go:44\ngithub.com/pingcap/tiup/pkg/cluster/ansible.ParseAndImportInventory\n\tgithub.com/pingcap/tiup@/pkg/cluster/ansible/inventory.go:82\ngithub.com/pingcap/tiup/components/cluster/command.newImportCmd.func1\n\tgithub.com/pingcap/tiup@/components/cluster/command/import.go:100\ngithub.com/spf13/cobra.(*Command).execute\n\tgithub.com/spf13/cobra@v1.0.0/command.go:842\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tgithub.com/spf13/cobra@v1.0.0/command.go:950\ngithub.com/spf13/cobra.(*Command).Execute\n\tgithub.com/spf13/cobra@v1.0.0/command.go:887\ngithub.com/pingcap/tiup/components/cluster/command.Execute\n\tgithub.com/pingcap/tiup@/components/cluster/command/root.go:220\nmain.main\n\tgithub.com/pingcap/tiup@/components/cluster/main.go:19\nruntime.main\n\truntime/proc.go:203\nruntime.goexit\n\truntime/asm_amd64.s:1357"}
来了老弟
June 15, 2020 07:24
2
Hello,
Could you upload the debug log so we can look at the surrounding context, and also post the output of ll /etc/systemd/system/tiflash-9000.service?
TidbH
June 16, 2020 06:19
3
Hello, the log has been uploaded:
tiup-cluster-debug-2020-06-15-15-06-17.log (10.8 KB)
ll -lht /etc/systemd/system/tiflash-9000.service
-rwxrwxrwx 1 tidb tidb 304 May 19 11:51 /etc/systemd/system/tiflash-9000.service
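The listing shows mode 777 and owner tidb on the unit file itself, yet the read is still denied. One possibility the thread does not test is that a parent directory (/etc/systemd or /etc/systemd/system) lacks search (x) permission for the tidb user; SELinux or ACLs would be other candidates. The sketch below is a diagnostic under that assumption, not something taken from tiup: show_path_perms is a hypothetical helper that prints mode and ownership for every component of a path, and it assumes GNU stat.

```shell
# Hypothetical helper (not part of tiup): print mode, owner:group, and name for
# each component of an absolute path, from the top-level directory down to the file.
# Assumes GNU coreutils stat (the -c format option).
show_path_perms() (
    IFS='/'            # split the path on slashes; the subshell keeps IFS local
    set -f             # disable globbing while splitting
    current=""
    for part in $1; do
        [ -z "$part" ] && continue
        current="$current/$part"
        stat -c '%A %U:%G %n' "$current"
    done
)

# Example: run as the tidb user on the host that reports the error.
# show_path_perms /etc/systemd/system/tiflash-9000.service
```

A directory line that lacks x for the tidb user would be consistent with the Permission denied error even though the file itself is world-readable. namei -l /etc/systemd/system/tiflash-9000.service from util-linux prints a comparable listing.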
来了老弟
June 16, 2020 06:43
4
Hello,
From the control machine, log in with ssh tidb@192.168.1.1 and run the following command to see whether it hits a permission problem.
cat grep 'ExecStart' /etc/systemd/system/tiflash-9000.service | sed 's/ExecStart=//'
TidbH
June 16, 2020 06:48
5
tidb>$ cat grep 'ExecStart' /etc/systemd/system/tiflash-9000.service | sed 's/ExecStart=//'
cat: grep: No such file or directory
cat: ExecStart: No such file or directory
cat: /etc/systemd/system/tiflash-9000.service: Permission denied
That did not work.
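The three cat errors above are what a shell produces when cat is handed grep, ExecStart, and the unit path as three file names. One plausible reading, offered as an assumption rather than something the thread confirms, is that the logged command originally wrapped the grep pipeline in backticks (command substitution) and the forum rendering dropped them. The sketch below uses a throwaway unit file and start script under a temporary directory (all paths hypothetical) to show how the two forms differ.

```shell
# Build a throwaway unit file and start script (hypothetical stand-ins for
# tiflash-9000.service and run_tiflash.sh).
workdir=$(mktemp -d)
printf '#!/bin/bash\necho starting tiflash\n' > "$workdir/run_tiflash.sh"
printf '[Service]\nUser=tidb\nExecStart=%s\n' "$workdir/run_tiflash.sh" > "$workdir/tiflash-9000.service"

# Pipeline only: prints the path stored in ExecStart.
script_path=$(grep 'ExecStart' "$workdir/tiflash-9000.service" | sed 's/ExecStart=//')
echo "$script_path"

# With command substitution: cat reads the start script that ExecStart points to.
script_body=$(cat "$(grep 'ExecStart' "$workdir/tiflash-9000.service" | sed 's/ExecStart=//')")
echo "$script_body"
```

Either form has to read the unit file first, so both fail with Permission denied for as long as the tidb user cannot read /etc/systemd/system/tiflash-9000.service.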
来了老弟
June 16, 2020 06:53
6
Does ssh tidb@192.168.1.1 log in without asking for a password?
Check whether ssh -i /home/tidb/.tiup/storage/cluster/clusters/qh/ssh/id_rsa tidb@172.16.4.107 succeeds, then run the import command.
TidbH
June 16, 2020 07:03
7
I found that the file /home/tidb/.tiup/storage/cluster/clusters/yourClusterName/ssh/id_rsa does not exist under the tidb user:
1.1.tidb.com <2020-06-16 15:02:43> ~/.tiup/storage/cluster
tidb>$ ls
audit
Could there be a problem with my tiup installation? Reference document: https://pingcap.com/docs-cn/dev/upgrade-tidb-using-tiup/
TidbH
June 16, 2020 07:15
8
192.168.1.1 is itself the control machine; I did not make that clear earlier. The permission problem occurs on the control machine.
来了老弟
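June 16, 2020 07:29
9
The current line of investigation is passwordless sudo. The steps so far were meant to verify that logging in to the remote server over ssh works normally.
Ah, please replace that field with the name of your cluster.
Thanks for the feedback. Write /home/tidb/.ssh/id_rsa.pub into /home/tidb/.ssh/authorized_keys, then run import -d.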
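Writing the public key into authorized_keys can be done idempotently, so that repeated runs do not duplicate the entry. A minimal sketch follows; add_authorized_key is a hypothetical helper name, and the file modes are the ones sshd commonly expects under its default StrictModes setting.

```shell
# Hypothetical helper: append a public key to an authorized_keys file only if
# that exact line is not already present, then tighten permissions.
add_authorized_key() {
    pubkey_file=$1
    auth_file=$2
    mkdir -p "$(dirname "$auth_file")"
    touch "$auth_file"
    # -x: match whole lines, -F: fixed strings, -f: read the pattern from the key file
    if ! grep -qxFf "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 700 "$(dirname "$auth_file")"
    chmod 600 "$auth_file"
}

# Example for the paths mentioned in the thread:
# add_authorized_key /home/tidb/.ssh/id_rsa.pub /home/tidb/.ssh/authorized_keys
```

ssh-copy-id does the same job when password login to the target host is available.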
TidbH
June 16, 2020 07:41
10
yourClusterName
Ah, please replace that field with the name of your cluster.
1.1.tidb.com <2020-06-16 15:02:43> ~/.tiup/storage/cluster
tidb>$ ls
audit
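That is the only entry in my directory.
As for writing /home/tidb/.ssh/id_rsa.pub into /home/tidb/.ssh/authorized_keys: I checked, and the key strings are identical.
So that is probably not the problem, is it?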
来了老弟
June 16, 2020 07:51
11
Please post the contents of the service file:
cat /etc/systemd/system/tiflash-9000.service
TidbH
June 16, 2020 07:53
12
cat /etc/systemd/system/tiflash-9000.service
[Unit]
Description=tiflash-9000 service
After=syslog.target network.target remote-fs.target nss-lookup.target
[Service]
LimitNOFILE=1000000
#LimitCORE=infinity
LimitSTACK=10485760
User=tidb
ExecStart=/data0/tidb/scripts/run_tiflash.sh
Restart=always
RestartSec=15s
[Install]
WantedBy=multi-user.target
来了老弟
June 16, 2020 10:05
13
Hello,
Please run the following command to check the sudo privileges of the tidb user:
tiup cluster exec qh --command="sudo echo success"
P.S. Replace qh with the name of your own cluster.
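Passwordless sudo can also be checked directly on the host, without going through tiup. A small sketch, assuming sudo is installed: the -n (non-interactive) option makes sudo fail instead of prompting when a password would be required. check_nopasswd_sudo is a hypothetical helper name.

```shell
# Hypothetical helper: report whether the current user can run sudo without a
# password prompt. sudo -n exits non-zero instead of prompting.
check_nopasswd_sudo() {
    if sudo -n true 2>/dev/null; then
        echo "passwordless sudo: ok"
    else
        echo "passwordless sudo: not available"
    fi
}

check_nopasswd_sudo
```

Run it as the tidb user on each host listed in the inventory.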
TidbH
June 17, 2020 01:33
14
Hello:
tidb>$ tiup cluster exec test-cluster --command=“sudo echo success”
Starting component cluster: /home/tidb/.tiup/components/cluster/v1.0.4/tiup-cluster exec test-cluster --command=“sudo echo success”
Run shell command on host in the tidb cluster
Usage:
  tiup cluster exec [flags]
Flags:
      --command string   the command run on cluster host (default "ls")
  -h, --help             help for exec
  -N, --node strings     Only exec on host with specified nodes
  -R, --role strings     Only exec on host with specified roles
      --sudo             use root permissions (default false)
Global Flags:
      --ssh-timeout int    Timeout in seconds to connect host via SSH, ignored for operations that don't need an SSH connection. (default 5)
      --wait-timeout int   Timeout in seconds to wait for an operation to complete, ignored for operations that don't fit. (default 60)
  -y, --yes                Skip all confirmations and assumes 'yes'
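Here tiup printed its usage text instead of running the command. A likely cause, stated as an assumption because the thread does not confirm it, is that the command was pasted with typographic quotes (“ ”). The shell treats those as ordinary characters, so the value is split on spaces into several arguments. The sketch below uses a hypothetical count_args function to show the difference.

```shell
# Hypothetical helper: print how many arguments the shell passed in.
count_args() {
    echo "$#"
}

# Typographic quotes are not shell quoting, so the words are split on spaces.
curly=$(count_args --command=“sudo echo success”)
# ASCII double quotes keep the whole value in one argument.
straight=$(count_args --command="sudo echo success")

echo "curly quotes: $curly argument(s), straight quotes: $straight argument(s)"
```

Retyping the quotes as plain ASCII ' or " avoids the split.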
TidbH
June 17, 2020 01:53
16
tidb>$ tiup cluster exec test-cluster --command='sudo echo success'
Starting component cluster: /home/tidb/.tiup/components/cluster/v1.0.4/tiup-cluster exec test-cluster --command=sudo echo success
Error: cannot execute command on non-exists cluster test-cluster
Verbose debug logs has been written to /home/tidb/logs/tiup-cluster-debug-2020-06-17-09-52-17.log.
Error: run /home/tidb/.tiup/components/cluster/v1.0.4/tiup-cluster (wd:/home/tidb/.tiup/data/S286z6L) failed: exit status 1
Debug log:
tidb>$ more /home/tidb/logs/tiup-cluster-debug-2020-06-17-09-52-17.log
2020-06-17T09:52:17.593+0800 INFO Execute command {"command": "tiup cluster exec test-cluster --command=sudo echo success"}
2020-06-17T09:52:17.593+0800 DEBUG Environment variables {"env": ["TIUP_HOME=/home/tidb/.tiup", "TIUP_WORK_DIR=/home/tidb", "TIUP_INSTANCE_DATA_DIR=/home/tidb/.tiup/data/S286z6L", "TIUP_COMPONENT_DATA_DIR=/home/tidb/.tiup/storage/cluster", "TIUP_COMPONENT_INSTALL_DIR=/home/tidb/.tiup/components/cluster/v1.0.4", "TIUP_TELEMETRY_STATUS=enable", "TIUP_TELEMETRY_UUID=bd0663f6-9535-4351-84aa-dd2ccde2496e", "TIUP_TAG=S286z6L", "XDG_SESSION_ID=3918", "HOSTNAME=1.1.tidb.com", "SHELL=/bin/bash", "TERM=linux", "HISTSIZE=1000", "USER=tidb", "LS_COLORS=rs=0:di=01;34:ln=01;36:mh=00:pi=40;33:so=01;35:do=01;35:bd=40;33;01:cd=40;33;01:or=40;31;01:mi=01;05;37;41:su=37;41:sg=30;43:ca=30;41:tw=30;42:ow=34;42:st=37;44:ex=01;32:*.tar=01;31:*.tgz=01;31:*.arc=01;31:*.arj=01;31:*.taz=01;31:*.lha=01;31:*.lz4=01;31:*.lzh=01;31:*.lzma=01;31:*.tlz=01;31:*.txz=01;31:*.tzo=01;31:*.t7z=01;31:*.zip=01;31:*.z=01;31:*.Z=01;31:*.dz=01;31:*.gz=01;31:*.lrz=01;31:*.lz=01;31:*.lzo=01;31:*.xz=01;31:*.bz2=01;31:*.bz=01;31:*.tbz=01;31:*.tbz2=01;31:*.tz=01;31:*.deb=01;31:*.rpm=01;31:*.jar=01;31:*.war=01;31:*.ear=01;31:*.sar=01;31:*.rar=01;31:*.alz=01;31:*.ace=01;31:*.zoo=01;31:*.cpio=01;31:*.7z=01;31:*.rz=01;31:*.cab=01;31:*.jpg=01;35:*.jpeg=01;35:*.gif=01;35:*.bmp=01;35:*.pbm=01;35:*.pgm=01;35:*.ppm=01;35:*.tga=01;35:*.xbm=01;35:*.xpm=01;35:*.tif=01;35:*.tiff=01;35:*.png=01;35:*.svg=01;35:*.svgz=01;35:*.mng=01;35:*.pcx=01;35:*.mov=01;35:*.mpg=01;35:*.mpeg=01;35:*.m2v=01;35:*.mkv=01;35:*.webm=01;35:*.ogm=01;35:*.mp4=01;35:*.m4v=01;35:*.mp4v=01;35:*.vob=01;35:*.qt=01;35:*.nuv=01;35:*.wmv=01;35:*.asf=01;35:*.rm=01;35:*.rmvb=01;35:*.flc=01;35:*.avi=01;35:*.fli=01;35:*.flv=01;35:*.gl=01;35:*.dl=01;35:*.xcf=01;35:*.xwd=01;35:*.yuv=01;35:*.cgm=01;35:*.emf=01;35:*.axv=01;35:*.anx=01;35:*.ogv=01;35:*.ogx=01;35:*.aac=01;36:*.au=01;36:*.flac=01;36:*.mid=01;36:*.midi=01;36:*.mka=01;36:*.mp3=01;36:*.mpc=01;36:*.ogg=01;36:*.ra=01;36:*.wav=01;36:*.axa=01;36:*.oga=01;36:*.spx=01;36:*.xspf=01;36:", "MAVEN_HOME=/usr/local/maven/apache-maven-3.6.3", "MAIL=/var/spool/mail/tidb", "PATH=/home/tidb/.tiup/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/usr/local/maven/apache-maven-3.6.3/bin:/home/tidb/.local/bin:/home/tidb/bin:/usr/local/maven/apache-maven-3.6.3/bin", "PWD=/home/tidb", "LANG=en_US.UTF-8", "TZ=Asia/Shanghai", "PS1=\[\e]0;\a\]\n\[\e[1;32m\]\[\e[1;33m\]\H \[\e[1;35m\]<$(date +"%Y-%m-%d %T")> \[\e[32m\]\w\[\e[0m\]\n\u>\$ ", "HISTCONTROL=ignoredups", "SHLVL=1", "HOME=/home/tidb", "LOGNAME=tidb", "LESSOPEN=||/usr/bin/lesspipe.sh %s", "_=/home/tidb/.tiup/bin/tiup", "OLDPWD=/home/tidb/tidb-ansible"]}
2020-06-17T09:52:17.599+0800 INFO Execute command finished {"code": 1, "error": "cannot execute command on non-exists cluster test-cluster", "errorVerbose": "cannot execute command on non-exists cluster test-cluster\ngithub.com/pingcap/tiup/components/cluster/command.newExecCmd.func1\n\tgithub.com/pingcap/tiup@/components/cluster/command/exec.go:47\ngithub.com/spf13/cobra.(*Command).execute\n\tgithub.com/spf13/cobra@v1.0.0/command.go:842\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\tgithub.com/spf13/cobra@v1.0.0/command.go:950\ngithub.com/spf13/cobra.(*Command).Execute\n\tgithub.com/spf13/cobra@v1.0.0/command.go:887\ngithub.com/pingcap/tiup/components/cluster/command.Execute\n\tgithub.com/pingcap/tiup@/components/cluster/command/root.go:220\nmain.main\n\tgithub.com/pingcap/tiup@/components/cluster/main.go:19\nruntime.main\n\truntime/proc.go:203\nruntime.goexit\n\truntime/asm_amd64.s:1357"}
来了老弟
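June 17, 2020 02:10
17
The cluster name you used does not exist:
Error: cannot execute command on non-exists cluster test-cluster
That is, the command cannot run because tiup knows of no cluster named test-cluster.
The hyphen in the name may be causing the problem; try wrapping the cluster name in single quotes.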
TidbH
June 17, 2020 03:47
18
Still the same; the error is identical to the one above. cluster_name = test-cluster, so the name is correct as well.
来了老弟
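June 17, 2020 04:11
19
Sorry, the cluster has not been imported successfully yet, so tiup does not recognize that cluster name.
Back to the original problem: in the tidb-ansible directory, run ansible-playbook -i hosts.ini create_users.yml -u root -k
Make sure hosts.ini contains every IP listed in the inventory file, then reconfigure SSH mutual trust and the sudo rules.
The error message still points to a permission problem.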
TidbH
June 17, 2020 05:47
20
tidb>$ ssh root@192.168.1.1
Last login: Wed Jun 17 13:45:18 2020 from 192.168.1.1
root># grep 'ExecStart' /etc/systemd/system/tiflash-9000.service | sed 's/ExecStart=//'
/data0/tidb/scripts/run_tiflash.sh
Logging in as root over ssh from the tidb user works fine. The import still fails with the same error.