TiDB BR backup: PermissionDenied

Linux permission inheritance: TiDB BR backup

BR, the TiDB backup tool, needs shared storage as its backup directory. Here CephFS is used, mounted on each machine at /data/local_backup.

BR architecture diagram:

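The BR component is deployed on the PD node, because BR goes through PD to schedule the TiKV nodes, and the TiKV nodes are the ones that write files into the directory.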

Every node mounts CephFS at /data/local_backup.

Backup

/data/tidb-install/tidb-toolkit/bin/br backup full --pd $pd  --storage "local:///data/local_backup/backup/$time/$cluster_name/data" --ratelimit 120 --log-file /data/local_backup/backup/$time/$cluster_name/logs/backupfull_$2-$time.log

Backup error

[2020/09/23 01:01:02.242 +08:00] [INFO] [collector.go:172] ["Full backup Failed summary : total backup ranges: 1, total success: 0, total failed: 1"] ["backup total regions"=238348] [unitName="range start:7480000000000001035f720000000000000000 end:7480000000000001035f72ffffffffffffffff00"] [error="rpc error: code = Unknown desc = Io(Os { code: 13, kind: PermissionDenied, message: \"Permission denied\" })"] [errorVerbose="rpc error: code = Unknown desc = Io(Os { code: 13, kind: PermissionDenied, message: \"Permission denied\" })\ngithub.com/pingcap/errors.AddStack\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/errors.go:174\ngithub.com/pingcap/errors.Trace\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20190809092503-95897b64e011/juju_adaptor.go:15\ngithub.com/pingcap/br/pkg/backup.SendBackup\n\t/home/jenkins/agent/workspace/build_br_multi_branch_v4.0.0/go/src/github.com/pingcap/br/pkg/backup/client.go:792\ngithub.com/pingcap/br/pkg/backup.(*pushDown).pushBackup.func1\n\t/home/jenkins/agent/workspace/build_br_multi_branch_v4.0.0/go/src/github.com/pingcap/br/pkg/backup/push.go:61\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1357"]

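PermissionDenied means that at least one machine has no permission on this directory.

The tidb accounts were created by hand early on, so the UID of the tidb user differs from machine to machine. Once the machines share a single directory, the mismatched UIDs lead to inconsistent permissions.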
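To find out which machines are affected, the UID of tidb can be collected from each host and compared. The following is a minimal sketch, not the procedure used in this post: the function name find_uid_mismatch and the host names in the usage comment are hypothetical, and it assumes SSH access to every node.

```shell
#!/usr/bin/env bash
# Sketch: report hosts whose tidb UID differs from the first host listed.
# Input on stdin: one "host uid" pair per line.
find_uid_mismatch() {
    awk 'NR == 1 { ref = $2; next }   # first line sets the reference UID
         $2 != ref { print $1 }'      # print every host that disagrees
}

# Collecting the pairs (hypothetical host names, assumes SSH access):
# for h in tikv-01 tikv-02 tikv-03; do
#     printf '%s %s\n' "$h" "$(ssh "$h" id -u tidb)"
# done | find_uid_mismatch
```

The first host is only a reference point. If that host is itself the odd one out, every other host is reported instead.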

Solutions

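1. Change the directory mode to 777

Result: not feasible

The mode change only applies to the directory itself. Files that TiKV creates inside it are owned by the creating machine's UID, so not every machine can append data to them.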

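2. Change the UID of the tidb account

Result: not feasible

Changing the UID requires stopping every process currently running under the tidb account, and taking the service down is not an option.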

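3. Create a dedicated backup account

Result: not feasible

The idea is to create an account with the same UID on every machine and then change the owner and group of the directory to it. However, TiKV writes the backup under the tidb account, so the new account is never used.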

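4. Migrate the cluster

Result: feasible, but far too costly. Kept as the fallback plan.

Find the machines with mismatched UIDs, scale out the cluster, then take the mismatched machines offline. Cycling dozens of machines this way takes too long.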
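The reason option 1 fails can be reproduced locally: a 777 mode on a directory is not passed on to files created inside it. This is a small sketch under stated assumptions (Linux with GNU coreutils stat, a temporary directory with no default ACL, and a umask of 022 chosen for illustration); the file name sst_file is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: a 777 mode on a directory is not inherited by files created in it.
demo_dir=$(mktemp -d)
chmod 777 "$demo_dir"

# A new file takes its mode from the creating process's umask,
# not from the parent directory.
(umask 022 && touch "$demo_dir/sst_file")

dir_mode=$(stat -c '%a' "$demo_dir")
file_mode=$(stat -c '%a' "$demo_dir/sst_file")
echo "directory: $dir_mode, new file: $file_mode"
rm -rf "$demo_dir"
```

Under these assumptions the directory reports 777 while the new file reports 644, so a process running under a different UID can read the file but cannot write to it.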

What if the permissions of the directory could be inherited?

setfacl

chmod 777 backup
setfacl -R -d -m u:tidb:rwx backup
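The -d flag sets a default ACL, which new files and subdirectories created under backup receive automatically. Before rerunning the backup, it may be worth confirming that the entry is in place. The helper below is a hedged sketch: the function name has_default_rwx is hypothetical, and it only inspects the text that getfacl prints.

```shell
#!/usr/bin/env bash
# Sketch: check getfacl output (read from stdin) for a default rwx entry
# belonging to the user named in the first argument.
has_default_rwx() {
    grep -q "^default:user:$1:rwx"
}

# Usage against the directory from the commands above:
# getfacl backup | has_default_rwx tidb && echo "default ACL present"
```

The check relies on the default:user:NAME:PERMS line format that getfacl uses for default ACL entries.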

Now run the backup command again and see what happens.

/data/tidb-install/tidb-toolkit/bin/br backup full --pd $pd  --storage "local:///data/local_backup/backup/$time/$cluster_name/data" --ratelimit 120 --log-file /data/local_backup/backup/$time/$cluster_name/logs/backupfull_$2-$time.log

Backup succeeded

[2020/09/23 15:25:24.076 +08:00] [INFO] [collector.go:59] ["Full backup Success summary: total backup ranges: 20, total success: 20, total failed: 0, total take(s): 84.59, total kv: 25172, total size(MB): 1.81, avg speed(MB/s): 0.02"] ["backup fast checksum"=4.391094ms] ["backup checksum"=40.133102ms] ["backup total regions"=20]