BR backup to an overseas S3 bucket bound to an IAM role

[TiDB Environment] Production / Test / POC
[TiDB Version]
[Problem Encountered]
[Reproduction Steps] What operations led to the problem
[Symptoms and Impact]
I back up with BR, using the following command:

timeout --preserve-status 31500 /home/tidb/bin/br backup full --pd 10.xx.xx.xx:2379 --storage s3://x-xx-backup-tokyo/90736/20220513/2022051311545000101 --s3.region 'ap-northeast-1' --send-credentials-to-tikv=false --s3.endpoint http://s3.ap-northeast-1.amazonaws.com --ratelimit 100 --concurrency 1 --check-requirements=false --log-file br_backup.log

Run directly, it succeeds. But when I invoke the same command from a script, it fails with the error below:

[2022/05/13 17:23:08.576 +08:00] [INFO] [info.go:49] ["Welcome to Backup & Restore (BR)"] [release-version=v6.1.0-alpha-361-gadebe4452-dirty] [git-hash=adebe4452b4dd23366bad0032a6b64cc136d93ac] [git-branch=master] [go-version=go1.18.2] [utc-build-time="2022-05-13 07:02:13"] [race-enabled=false]
[2022/05/13 17:23:08.576 +08:00] [INFO] [common.go:692] [arguments] [__command="br backup full"] [check-requirements=false] [concurrency=1] [log-file=/home/dumbo/var/log/br_backup.log] [pd="[10.67.103.201:2379]"] [ratelimit=100] [s3.endpoint=http://s3.ap-northeast-1.amazonaws.com] [s3.region='ap-northeast-1'] [send-credentials-to-tikv=false] [storage=s3://dumbo-dumbo-backup-tokyo/90736/20220513/2022051311545000101]
[2022/05/13 17:23:08.577 +08:00] [WARN] [logging.go:251] ["setting `--ratelimit` and `--concurrency` at the same time, ignoring `--concurrency`: `--ratelimit` forces sequential (i.e. concurrency = 1) backup"] [ratelimit=104.9MB/s] [concurrency-specified=1]
[2022/05/13 17:23:08.577 +08:00] [INFO] [conn.go:245] ["new mgr"] [pdAddrs=10.67.103.201:2379]
[2022/05/13 17:23:08.580 +08:00] [INFO] [client.go:392] ["[pd] create pd client with endpoints"] [pd-address="[10.67.103.201:2379]"]
[2022/05/13 17:23:08.583 +08:00] [INFO] [base_client.go:332] ["[pd] update member urls"] [old-urls="[http://10.67.103.201:2379]"] [new-urls="[http://10.67.103.201:2379,http://10.67.103.235:2379,http://10.67.103.236:2379]"]
[2022/05/13 17:23:08.583 +08:00] [INFO] [base_client.go:350] ["[pd] switch leader"] [new-leader=http://10.67.103.235:2379] [old-leader=]
[2022/05/13 17:23:08.584 +08:00] [INFO] [base_client.go:105] ["[pd] init cluster id"] [cluster-id=7095557296043268125]
[2022/05/13 17:23:08.584 +08:00] [INFO] [client.go:687] ["[pd] tso dispatcher created"] [dc-location=global]
[2022/05/13 17:23:08.586 +08:00] [INFO] [conn.go:220] ["checked alive KV stores"] [aliveStores=3] [totalStores=3]
[2022/05/13 17:23:08.587 +08:00] [INFO] [client.go:392] ["[pd] create pd client with endpoints"] [pd-address="[10.67.103.201:2379]"]
[2022/05/13 17:23:08.589 +08:00] [INFO] [base_client.go:332] ["[pd] update member urls"] [old-urls="[http://10.67.103.201:2379]"] [new-urls="[http://10.67.103.201:2379,http://10.67.103.235:2379,http://10.67.103.236:2379]"]
[2022/05/13 17:23:08.589 +08:00] [INFO] [base_client.go:350] ["[pd] switch leader"] [new-leader=http://10.67.103.235:2379] [old-leader=]
[2022/05/13 17:23:08.589 +08:00] [INFO] [base_client.go:105] ["[pd] init cluster id"] [cluster-id=7095557296043268125]
[2022/05/13 17:23:08.589 +08:00] [INFO] [client.go:687] ["[pd] tso dispatcher created"] [dc-location=global]
[2022/05/13 17:23:08.594 +08:00] [INFO] [tidb.go:74] ["new domain"] [store=tikv-7095557296043268125] ["ddl lease"=1s] ["stats lease"=-1ns] ["index usage sync lease"=0s]
[2022/05/13 17:23:08.611 +08:00] [WARN] [info.go:241] ["init TiFlashPlacementManager"] ["pd addrs"="[10.67.103.235:2379,10.67.103.236:2379,10.67.103.201:2379]"]
[2022/05/13 17:23:08.656 +08:00] [INFO] [domain.go:176] ["full load InfoSchema success"] [currentSchemaVersion=0] [neededSchemaVersion=26] ["start time"=20.997078ms]
[2022/05/13 17:23:08.660 +08:00] [INFO] [domain.go:437] ["full load and reset schema validator"]
[2022/05/13 17:23:08.660 +08:00] [INFO] [ddl.go:382] ["[ddl] start DDL"] [ID=457e0034-f3c9-4dd2-b137-3ddb6dc83217] [runWorker=false]
[2022/05/13 17:23:08.666 +08:00] [INFO] [backup.go:273] ["get new_collations_enabled_on_first_bootstrap config from system table"] [new_collation_enabled=True]
[2022/05/13 17:23:08.666 +08:00] [INFO] [client.go:95] ["new backup client"]
[2022/05/13 17:23:10.890 +08:00] [INFO] [client.go:768] ["[pd] stop fetching the pending tso requests due to context canceled"] [dc-location=global]
[2022/05/13 17:23:10.890 +08:00] [INFO] [client.go:706] ["[pd] exit tso dispatcher"] [dc-location=global]
[2022/05/13 17:23:10.890 +08:00] [INFO] [collector.go:204] ["units canceled"] [cancel-unit=0]
[2022/05/13 17:23:10.891 +08:00] [INFO] [collector.go:70] ["Full Backup failed summary"] [total-ranges=0] [ranges-succeed=0] [ranges-failed=0]
[2022/05/13 17:23:10.890 +08:00] [INFO] [client.go:768] ["[pd] stop fetching the pending tso requests due to context canceled"] [dc-location=global]
[2022/05/13 17:23:10.891 +08:00] [INFO] [client.go:706] ["[pd] exit tso dispatcher"] [dc-location=global]
[2022/05/13 17:23:10.891 +08:00] [ERROR] [backup.go:40] ["failed to backup"] [error="error occurred when checking backupmeta file: BadRequest: Bad Request\n\tstatus code: 400, request id: JF2FW72H09XKY2MM, host id: jKL+hsg4RqgoQZfdRfDWeF4sFTnJcRR/al4x8Y0zOzW1QqGlkVUPO04bXkfun9E8WtyS2nqLQrE="] [errorVerbose="BadRequest: Bad Request\n\tstatus code: 400, request id: JF2FW72H09XKY2MM, host id: jKL+hsg4RqgoQZfdRfDWeF4sFTnJcRR/al4x8Y0zOzW1QqGlkVUPO04bXkfun9E8WtyS2nqLQrE=\ngithub.com/pingcap/errors.AddStack\n\t/root/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20211224045212-9687c2b0f87c/errors.go:174\ngithub.com/pingcap/errors.Trace\n\t/root/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20211224045212-9687c2b0f87c/juju_adaptor.go:15\ngithub.com/pingcap/tidb/br/pkg/storage.(*S3Storage).FileExists\n\t/root/dev/tidb/br/pkg/storage/s3.go:433\ngithub.com/pingcap/tidb/br/pkg/backup.(*Client).SetStorage\n\t/root/dev/tidb/br/pkg/backup/client.go:181\ngithub.com/pingcap/tidb/br/pkg/task.RunBackup\n\t/root/dev/tidb/br/pkg/task/backup.go:284\nmain.runBackupCommand\n\t/root/dev/tidb/br/cmd/br/backup.go:39\nmain.newFullBackupCommand.func1\n\t/root/dev/tidb/br/cmd/br/backup.go:108\ngithub.com/spf13/cobra.(*Command).execute\n\t/root/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/root/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\t/root/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:902\nmain.main\n\t/root/dev/tidb/br/cmd/br/main.go:57\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1571\nerror occurred when checking backupmeta file"] 
[stack="main.runBackupCommand\n\t/root/dev/tidb/br/cmd/br/backup.go:40\nmain.newFullBackupCommand.func1\n\t/root/dev/tidb/br/cmd/br/backup.go:108\ngithub.com/spf13/cobra.(*Command).execute\n\t/root/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:856\ngithub.com/spf13/cobra.(*Command).ExecuteC\n\t/root/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:974\ngithub.com/spf13/cobra.(*Command).Execute\n\t/root/go/pkg/mod/github.com/spf13/cobra@v1.4.0/command.go:902\nmain.main\n\t/root/dev/tidb/br/cmd/br/main.go:57\nruntime.main\n\t/usr/local/go/src/runtime/proc.go:250"]
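Not from the thread itself, but a general debugging sketch for this symptom: when the same command succeeds interactively and fails from a script, the usual suspects are the environment (cron and other schedulers start with an almost empty environment, so any `AWS_*` variables exported in a login shell disappear) and quoting. Tracing the script shows the exact argument list br receives:

```shell
#!/usr/bin/env bash
set -x    # print every command after expansion, so the real argv is visible

# Confirm the credentials BR relies on are visible in this context;
# a scheduler such as cron will not inherit variables from your login shell.
env | grep '^AWS_' || echo "no AWS_* variables in this environment"
```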
I've read the relevant part of the BR code, but I'm still not sure how BR passes the credentials/keys. I want to run scheduled backup jobs with BR from a script.


Per https://docs.pingcap.com/zh/tidb/stable/backup-and-restore-storages#s3-的-url-参数, BR looks up S3 credentials in the following order:

  1. The $AWS_ACCESS_KEY_ID and $AWS_SECRET_ACCESS_KEY environment variables.
  2. The $AWS_ACCESS_KEY and $AWS_SECRET_KEY environment variables.
  3. A shared credentials file on the tool node, at the path given by the $AWS_SHARED_CREDENTIALS_FILE environment variable.
  4. A shared credentials file on the tool node at ~/.aws/credentials.
  5. The IAM role of the current Amazon EC2 instance.
  6. The IAM role of the current Amazon ECS task.

You can check each of these credential sources, in the order BR tries them.
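For example, a quick check of these sources on the tool node might look like this (a sketch; the instance-metadata URL applies to EC2 only):

```shell
#!/usr/bin/env bash
# Walk the credential sources BR tries, in order.

echo "1/2) environment variables:"
env | grep -E '^AWS_(ACCESS_KEY(_ID)?|SECRET_(ACCESS_)?KEY)=' \
    || echo "   (none set)"

echo "3/4) shared credentials file:"
cred_file="${AWS_SHARED_CREDENTIALS_FILE:-$HOME/.aws/credentials}"
ls -l "$cred_file" 2>/dev/null || echo "   (not found: $cred_file)"

echo "5) EC2 instance IAM role (instance metadata service):"
curl -s --max-time 2 \
    http://169.254.169.254/latest/meta-data/iam/security-credentials/ \
    || echo "   (no role, or not running on EC2)"
```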

The S3 400 error is a generic Bad Request; it could be a path problem.

Manually check the backup directory s3://x-xx-backup-tokyo/90736/20220513/2022051311545000101 and see whether the backupmeta file exists there.
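BR's pre-flight check probes `backupmeta` under the storage prefix; the same check can be reproduced with the AWS CLI (assuming it is installed on the tool node and picks up the same IAM role):

```shell
#!/usr/bin/env bash
# List the backup prefix and probe for the backupmeta object. If this
# fails, BR's own check against the same path will fail too.
prefix="s3://x-xx-backup-tokyo/90736/20220513/2022051311545000101"
aws s3 ls "$prefix/" --region ap-northeast-1
aws s3 ls "$prefix/backupmeta" --region ap-northeast-1
```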


OK, thanks a lot for the help. I did consult the official docs and read the related code, but I still couldn't find where the keys come from.


After reading up on S3 IAM, it seems only a token is needed here. I invoke the br command from a shell script: the command is assembled into a string variable and then run as ${cmd}, which fails; but when I write the command directly in the script, it works fine. Very strange.
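One likely explanation (a sketch, not verified against the original script): when a command is stored in a string and expanded as ${cmd}, the shell performs word splitting but not quote removal, so the single quotes around 'ap-northeast-1' become literal characters of the argument. The arguments log earlier in the thread showing [s3.region='ap-northeast-1'] with the quote characters preserved is consistent with this.

```shell
#!/usr/bin/env bash
# Pitfall: quotes inside a string variable are passed literally.
cmd="echo --s3.region 'ap-northeast-1'"
${cmd}    # prints: --s3.region 'ap-northeast-1'  (quotes kept in the argv!)

# Safer pattern: build the argument list as a bash array, so each word
# survives intact and no stray quotes leak into the arguments.
args=(backup full
      --s3.region ap-northeast-1
      --send-credentials-to-tikv=false)
echo br "${args[@]}"    # expands to the exact argument list, quote-free
```

An alternative is `eval "$cmd"`, which re-parses the embedded quotes, but the array form avoids eval's injection risks.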


Thanks!
