tidb v5.4.1 集群使用br 备份出来的文件无法恢复

【 TiDB 使用环境】生产\测试环境\ POC
【 TiDB 版本】
【遇到的问题】
在tidb v5.4.1 集群使用br v5.4.1 备份出的数据无法恢复!!


【复现路径】做过哪些操作出现的问题

在tidb v5.4.1 集群使用br v5.4.1 备份恢复, 必现 !!!
【问题现象及影响】

集群备份文件无法使用, 集群无法备份恢复 ! 参考下下面的帖子, 有人遇到同样的问题
BR备份报错:[pd] fetch pending tso requests error

若提问为性能优化、故障排查类问题,请下载脚本运行。终端输出的打印结果,请务必全选并复制粘贴上传。

BR 备份这个有人帮忙看下嘛

备份处理的文件放在什么位置? 集群中的所有tikv 都能访问备份文件吗?

us3, 所有的tikv 都可以访问的

v4.0.14 集群是没有问题的

image 看着是没法读这个目录文件

不是没法读取这个目录, 是无法读取.sst 文件, 因为备份出来的.sst文件 size 是0
image
image

但是下面的这个temp 文件是有size 的,而且刚好是br expect size

我把 temp 文件和sst文件的文件名互换下就可以恢复成功

那个我建议下你们不要从文件夹目录无法访问这种问题开始给建议,这些基础的东西我们早都测试过了。

v4.0.14 我们BR测试就没问题, 线上所有环境 v5.4.1 只要备份就绝对出这个问题, 你们最好自己测试下,在我这里是必现的

测试没有复现。
备份文件大小是 0,备份log里 有错误信息吗?

重新跑一遍备份. 还是一样的问题么?sst 大小还是 0 吗?

br 备份显示的是success

同样的结果

能换个备份目录试一下吗?

每次备份都是新的备份目录

你们能自己在v5.4.1 上测试下br备份吗?看看有没有问题

测试了没问题,
能把 备份 和回复 的日志完整发一下吗?
这个集群是低版本升级上来的吗?

不是升级上来的,是直接创建的

那 能把 备份 和恢复 的日志完整发一下吗?

稍等下,我在整理日志

br 备份

备份命令

v5.4.1/br backup full --pd=10.x.139.xxx:10003 -f '*.*' -f '!mysql.*'  -f '!INFORMATION_SCHEMA.*' -f '!METRICS_SCHEMA.*' -f '!PERFORMANCE_SCHEMA.*' -f '!tidb_loader.*' --checksum=true --storage=local:///mnt/positive_volume/peter_test_20220623001 --log-file=scripts/bac
kup1.log

log输出

[2022/06/23 15:49:52.165 +08:00] [INFO] [info.go:49] ["Welcome to Backup & Restore (BR)"] [release-version=v5.4.1] [git-hash=be484475e76b3aaa2cad520111353c3327db35fa] [git-branch=heads/refs/tags/v5.4.1] [go-version=go1.16.4] [utc-build-time="2022-05-07 01:26:52"] [race-enabled=false]
[2022/06/23 15:49:52.165 +08:00] [INFO] [common.go:634] [arguments] [__command="br backup full"] [checksum=true] [filter="[*.*,!mysql.*,!grafana_ucloud.*,!INFORMATION_SCHEMA.*,!METRICS_SCHEMA.*,!PERFORMANCE_SCHEMA.*,!tidb_loader.*]"] [log-file=/root/wakanda/action/scripts/backup1.log] [pd="[10.9.139.114:10003]"] [storage=local:///mnt/positive_volume/peter_test_20220623001]
[2022/06/23 15:49:52.165 +08:00] [INFO] [conn.go:244] ["new mgr"] [pdAddrs=10.9.139.114:10003]
[2022/06/23 15:49:52.166 +08:00] [INFO] [client.go:352] ["[pd] create pd client with endpoints"] [pd-address="[10.9.139.114:10003]"]
[2022/06/23 15:49:52.168 +08:00] [INFO] [base_client.go:331] ["[pd] update member urls"] [old-urls="[http://10.9.139.114:10003]"] [new-urls="[http://10.9.139.114:10003,http://10.9.59.162:10001,http://10.9.98.126:10001]"]
[2022/06/23 15:49:52.168 +08:00] [INFO] [base_client.go:349] ["[pd] switch leader"] [new-leader=http://10.9.139.114:10003] [old-leader=]
[2022/06/23 15:49:52.168 +08:00] [INFO] [base_client.go:104] ["[pd] init cluster id"] [cluster-id=7111513361699399115]
[2022/06/23 15:49:52.168 +08:00] [INFO] [client.go:648] ["[pd] tso dispatcher created"] [dc-location=global]
[2022/06/23 15:49:52.168 +08:00] [INFO] [conn.go:219] ["checked alive KV stores"] [aliveStores=3] [totalStores=3]
[2022/06/23 15:49:52.168 +08:00] [INFO] [client.go:352] ["[pd] create pd client with endpoints"] [pd-address="[10.9.139.114:10003]"]
[2022/06/23 15:49:52.170 +08:00] [INFO] [base_client.go:331] ["[pd] update member urls"] [old-urls="[http://10.9.139.114:10003]"] [new-urls="[http://10.9.139.114:10003,http://10.9.59.162:10001,http://10.9.98.126:10001]"]
[2022/06/23 15:49:52.170 +08:00] [INFO] [base_client.go:349] ["[pd] switch leader"] [new-leader=http://10.9.139.114:10003] [old-leader=]
[2022/06/23 15:49:52.170 +08:00] [INFO] [base_client.go:104] ["[pd] init cluster id"] [cluster-id=7111513361699399115]
[2022/06/23 15:49:52.170 +08:00] [INFO] [client.go:648] ["[pd] tso dispatcher created"] [dc-location=global]
[2022/06/23 15:49:52.171 +08:00] [INFO] [tidb.go:72] ["new domain"] [store=tikv-7111513361699399115] ["ddl lease"=1s] ["stats lease"=-1ns] ["index usage sync lease"=0s]
[2022/06/23 15:49:52.201 +08:00] [INFO] [domain.go:169] ["full load InfoSchema success"] [currentSchemaVersion=0] [neededSchemaVersion=90] ["start time"=19.366697ms]
[2022/06/23 15:49:52.203 +08:00] [INFO] [domain.go:432] ["full load and reset schema validator"]
[2022/06/23 15:49:52.203 +08:00] [INFO] [ddl.go:347] ["[ddl] start DDL"] [ID=42873ba9-210a-4e16-a578-263031accb45] [runWorker=false]
[2022/06/23 15:49:52.205 +08:00] [INFO] [backup.go:261] ["get newCollationEnable for check during restore"] [newCollationEnable=False]
[2022/06/23 15:49:52.205 +08:00] [INFO] [client.go:93] ["new backup client"]
[2022/06/23 15:49:53.552 +08:00] [INFO] [client.go:137] ["backup encode timestamp"] [BackupTS=434102755263774721]
[2022/06/23 15:49:53.552 +08:00] [INFO] [backup.go:295] ["current backup safePoint job"] [safePoint="{ID=br-678461d8-cca3-48bd-9503-0bd2cba2fe2e,TTL=5m0s,BackupTime=\"2022-06-23 15:49:53.505 +0800 CST\",BackupTS=434102755263774721}"]
[2022/06/23 15:49:53.559 +08:00] [INFO] [client.go:505] ["backup started"] [range-sn=2] [startKey=7480000000000000A55F720000000000000000] [endKey=7480000000000000A55F72FFFFFFFFFFFFFFFF00] [rateLimit=0] [concurrency=4]
[2022/06/23 15:49:53.559 +08:00] [INFO] [client.go:505] ["backup started"] [range-sn=0] [startKey=7480000000000000A85F720000000000000000] [endKey=7480000000000000A85F72FFFFFFFFFFFFFFFF00] [rateLimit=0] [concurrency=4]
[2022/06/23 15:49:53.559 +08:00] [INFO] [client.go:505] ["backup started"] [range-sn=1] [startKey=7480000000000000A85F69800000000000000100] [endKey=7480000000000000A85F698000000000000001FB] [rateLimit=0] [concurrency=4]
[2022/06/23 15:49:53.561 +08:00] [INFO] [client.go:918] ["try backup"] [range-sn=0] [store-id=4] ["retry time"=0]
[2022/06/23 15:49:53.562 +08:00] [INFO] [client.go:918] ["try backup"] [range-sn=0] [store-id=5] ["retry time"=0]
[2022/06/23 15:49:53.562 +08:00] [INFO] [client.go:918] ["try backup"] [range-sn=0] [store-id=1] ["retry time"=0]
[2022/06/23 15:49:53.562 +08:00] [INFO] [client.go:918] ["try backup"] [range-sn=1] [store-id=4] ["retry time"=0]
[2022/06/23 15:49:53.562 +08:00] [INFO] [client.go:918] ["try backup"] [range-sn=1] [store-id=5] ["retry time"=0]
[2022/06/23 15:49:53.562 +08:00] [INFO] [client.go:918] ["try backup"] [range-sn=2] [store-id=1] ["retry time"=0]
[2022/06/23 15:49:53.562 +08:00] [INFO] [client.go:918] ["try backup"] [range-sn=2] [store-id=4] ["retry time"=0]
[2022/06/23 15:49:53.562 +08:00] [INFO] [client.go:918] ["try backup"] [range-sn=1] [store-id=1] ["retry time"=0]
[2022/06/23 15:49:53.562 +08:00] [INFO] [client.go:918] ["try backup"] [range-sn=2] [store-id=5] ["retry time"=0]
[2022/06/23 15:49:53.730 +08:00] [INFO] [client.go:968] ["backup streaming finish"] [range-sn=0] [store-id=4] [retry-time=0]
[2022/06/23 15:49:53.731 +08:00] [INFO] [client.go:968] ["backup streaming finish"] [range-sn=1] [store-id=4] [retry-time=0]
[2022/06/23 15:49:53.731 +08:00] [INFO] [client.go:968] ["backup streaming finish"] [range-sn=2] [store-id=4] [retry-time=0]
[2022/06/23 15:49:53.737 +08:00] [INFO] [client.go:968] ["backup streaming finish"] [range-sn=0] [store-id=5] [retry-time=0]
[2022/06/23 15:49:53.737 +08:00] [INFO] [client.go:968] ["backup streaming finish"] [range-sn=1] [store-id=5] [retry-time=0]
[2022/06/23 15:49:53.737 +08:00] [INFO] [client.go:968] ["backup streaming finish"] [range-sn=2] [store-id=5] [retry-time=0]
[2022/06/23 15:49:54.295 +08:00] [INFO] [client.go:986] ["range backed up"] [range-sn=1] [store-id=1] [small-range-start-key=7480000000000000A85F69800000000000000100] [small-range-end-key=7480000000000000A85F698000000000000001FB]
[2022/06/23 15:49:54.295 +08:00] [INFO] [client.go:968] ["backup streaming finish"] [range-sn=1] [store-id=1] [retry-time=0]
[2022/06/23 15:49:54.295 +08:00] [INFO] [client.go:527] ["finish backup push down"] [range-sn=1] [small-range-count=1]
[2022/06/23 15:49:54.295 +08:00] [INFO] [client.go:547] ["time range backed up"] [range-sn=1] [StartVersion=0] [EndVersion=434102755263774721]
[2022/06/23 15:49:54.295 +08:00] [INFO] [client.go:499] ["backup range finished"] [range-sn=1] [take=735.420384ms]
[2022/06/23 15:49:54.303 +08:00] [INFO] [client.go:986] ["range backed up"] [range-sn=0] [store-id=1] [small-range-start-key=7480000000000000A85F720000000000000000] [small-range-end-key=7480000000000000A85F72FFFFFFFFFFFFFFFF00]
[2022/06/23 15:49:54.303 +08:00] [INFO] [client.go:968] ["backup streaming finish"] [range-sn=0] [store-id=1] [retry-time=0]
[2022/06/23 15:49:54.303 +08:00] [INFO] [client.go:527] ["finish backup push down"] [range-sn=0] [small-range-count=1]
[2022/06/23 15:49:54.303 +08:00] [INFO] [client.go:547] ["time range backed up"] [range-sn=0] [StartVersion=0] [EndVersion=434102755263774721]
[2022/06/23 15:49:54.303 +08:00] [INFO] [client.go:499] ["backup range finished"] [range-sn=0] [take=744.182395ms]
[2022/06/23 15:49:54.311 +08:00] [INFO] [client.go:986] ["range backed up"] [range-sn=2] [store-id=1] [small-range-start-key=7480000000000000A55F720000000000000000] [small-range-end-key=7480000000000000A55F72FFFFFFFFFFFFFFFF00]
[2022/06/23 15:49:54.311 +08:00] [INFO] [client.go:968] ["backup streaming finish"] [range-sn=2] [store-id=1] [retry-time=0]
[2022/06/23 15:49:54.311 +08:00] [INFO] [client.go:527] ["finish backup push down"] [range-sn=2] [small-range-count=1]
[2022/06/23 15:49:54.311 +08:00] [INFO] [client.go:547] ["time range backed up"] [range-sn=2] [StartVersion=0] [EndVersion=434102755263774721]
[2022/06/23 15:49:54.311 +08:00] [INFO] [client.go:499] ["backup range finished"] [range-sn=2] [take=752.129861ms]
[2022/06/23 15:49:54.311 +08:00] [INFO] [client.go:484] ["Backup Ranges"] [take=2.173µs]
[2022/06/23 15:49:54.311 +08:00] [INFO] [metafile.go:562] ["write metas finished"] [type=datafile]
[2022/06/23 15:49:54.311 +08:00] [INFO] [metafile.go:602] ["finish the write metas"] [item=3] [type=datafile] [costs=752.409758ms]
[2022/06/23 15:49:54.312 +08:00] [INFO] [worker.go:73] ["table checksum start"] [db=peter] [table=runoob_tbl]
[2022/06/23 15:49:54.312 +08:00] [INFO] [worker.go:73] ["table checksum start"] [db=test] [table=tidb_demo]
[2022/06/23 15:49:54.312 +08:00] [INFO] [worker.go:73] ["table checksum finished"] [db=peter] [table=runoob_tbl] [Crc64Xor=8328212611495905372] [TotalKvs=266] [TotalBytes=17024] [take=548.769µs]
[2022/06/23 15:49:54.312 +08:00] [INFO] [worker.go:73] ["table checksum finished"] [db=test] [table=tidb_demo] [Crc64Xor=2923342626563082014] [TotalKvs=4] [TotalBytes=186] [take=827.985µs]
[2022/06/23 15:49:54.312 +08:00] [INFO] [schema.go:137] ["backup checksum"] [take=949.132µs]
[2022/06/23 15:49:54.312 +08:00] [INFO] [metafile.go:562] ["write metas finished"] [type=schema]
[2022/06/23 15:49:54.312 +08:00] [INFO] [metafile.go:602] ["finish the write metas"] [item=5] [type=schema] [costs=965.519µs]
[2022/06/23 15:49:54.313 +08:00] [INFO] [metafile.go:622] ["save backup meta"] [size=5887]
[2022/06/23 15:49:54.364 +08:00] [INFO] [validate.go:78] ["checksum success"] [db=test] [table=tidb_demo]
[2022/06/23 15:49:54.364 +08:00] [INFO] [validate.go:78] ["checksum success"] [db=peter] [table=runoob_tbl]
[2022/06/23 15:49:54.364 +08:00] [ERROR] [client.go:752] ["[pd] fetch pending tso requests error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]context canceled: context canceled"] [stack="github.com/tikv/pd/client.(*client).handleDispatcher\n\t/go/pkg/mod/github.com/tikv/pd@v1.1.0-beta.0.20211118054146-02848d2660ee/client/client.go:752"]
[2022/06/23 15:49:54.364 +08:00] [INFO] [client.go:666] ["[pd] exit tso dispatcher"] [dc-location=global]
[2022/06/23 15:49:54.364 +08:00] [ERROR] [client.go:752] ["[pd] fetch pending tso requests error"] [dc-location=global] [error="[PD:client:ErrClientGetTSO]context canceled: context canceled"] [stack="github.com/tikv/pd/client.(*client).handleDispatcher\n\t/go/pkg/mod/github.com/tikv/pd@v1.1.0-beta.0.20211118054146-02848d2660ee/client/client.go:752"]
[2022/06/23 15:49:54.364 +08:00] [INFO] [client.go:666] ["[pd] exit tso dispatcher"] [dc-location=global]
[2022/06/23 15:49:54.364 +08:00] [INFO] [collector.go:68] ["Full backup success summary"] [total-ranges=3] [ranges-succeed=3] [ranges-failed=0] [backup-checksum=959.257µs] [backup-fast-checksum=398.25µs] [backup-total-regions=3] [backup-total-ranges=3] [total-take=2.199371094s] [BackupTS=434102755263774721] [total-kv=270] [total-kv-size=17.21kB] [average-speed=7.825kB/s] [backup-data-size(after-compressed)=7.109kB] [Size=7109]

控制台输出

Full backup <----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
Checksum <-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------> 100.00%
[2022/06/23 15:49:54.364 +08:00] [INFO] [collector.go:67] ["Full backup success summary"] [total-ranges=3] [ranges-succeed=3] [ranges-failed=0] [backup-checksum=959.257µs] [backup-fast-checksum=398.25µs] [backup-total-regions=3] [backup-total-ranges=3] [total-take=2.199371094s] [BackupTS=434102755263774721] [total-kv=270] [total-kv-size=17.21kB] [average-speed=7.825kB/s] [backup-data-size(after-compressed)=7.109kB] [Size=7109]

备份的文件结果

image