About br log metadata

I have br log backing up to S3, and querying the metadata returns:

[2026/01/06 10:35:50.378 +08:00] [INFO] [collector.go:77] ["log metadata"] [log-min-ts=463383019267293186] [log-min-date="2026-01-06 10:18:56.536 +0800"] [log-max-ts=463383019267293186] [log-max-date="2026-01-06 10:18:56.536 +0800"]

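Here log-max-date never changes, so I suspect something is wrong with S3. When I upload to minio instead, the value does advance as expected. So my question is: which file does the metadata command actually read?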

1 like

The getGlobalCheckpointFromStorage function walks every .ts file under the v1/global_checkpoint directory, reads the timestamp stored in each one, and takes the maximum:

    func getGlobalCheckpointFromStorage(ctx context.Context, s storage.ExternalStorage) (uint64, error) {
        var globalCheckPointTS uint64 = 0
        // Only look under the global checkpoint prefix (v1/global_checkpoint).
        opt := storage.WalkOption{SubDir: stream.GetStreamBackupGlobalCheckpointPrefix()}
        err := s.WalkDir(ctx, &opt, func(path string, size int64) error {
            // Skip anything that is not a checkpoint file.
            if !strings.HasSuffix(path, ".ts") {
                return nil
            }
            buff, err := s.ReadFile(ctx, path)
            if err != nil {
                return errors.Trace(err)
            }
            // Each .ts file holds one little-endian uint64 timestamp.
            ts := binary.LittleEndian.Uint64(buff)
            globalCheckPointTS = max(ts, globalCheckPointTS)
            return nil
        })
        return globalCheckPointTS, errors.Trace(err)
    }

I read the source code, and these are the files it looks up.
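To make the lookup easier to reason about, here is a minimal standalone sketch of the same idea, assuming each .ts file holds a single little-endian uint64. It is not the BR implementation: an in-memory map stands in for the external storage, and the names maxCheckpointTS, splitTSO, encodeTS and the file names are made up for illustration. The split relies on the TSO layout of physical milliseconds in the high bits and an 18-bit logical counter in the low bits.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"strings"
	"time"
)

// encodeTS is a hypothetical helper that builds the 8-byte payload
// this sketch assumes a checkpoint .ts file contains.
func encodeTS(ts uint64) []byte {
	buf := make([]byte, 8)
	binary.LittleEndian.PutUint64(buf, ts)
	return buf
}

// maxCheckpointTS keeps only ".ts" entries, decodes each one as a
// little-endian uint64, and returns the largest value seen.
func maxCheckpointTS(files map[string][]byte) (uint64, error) {
	var maxTS uint64
	for path, buf := range files {
		if !strings.HasSuffix(path, ".ts") {
			continue
		}
		if len(buf) < 8 {
			return 0, fmt.Errorf("%s: expected 8 bytes, got %d", path, len(buf))
		}
		if ts := binary.LittleEndian.Uint64(buf); ts > maxTS {
			maxTS = ts
		}
	}
	return maxTS, nil
}

// splitTSO splits a TSO into physical milliseconds (high bits)
// and the 18-bit logical counter (low bits).
func splitTSO(ts uint64) (physicalMs int64, logical uint64) {
	return int64(ts >> 18), ts & ((1 << 18) - 1)
}

func main() {
	const logMaxTS uint64 = 463383019267293186 // log-max-ts from the log line above
	files := map[string][]byte{
		"v1/global_checkpoint/1.ts":      encodeTS(logMaxTS - (1 << 18)), // one millisecond older
		"v1/global_checkpoint/2.ts":      encodeTS(logMaxTS),
		"v1/global_checkpoint/notes.txt": []byte("ignored"), // filtered out by suffix
	}
	ts, err := maxCheckpointTS(files)
	if err != nil {
		panic(err)
	}
	ms, logical := splitTSO(ts)
	fmt.Println(ts, logical, time.UnixMilli(ms).UTC())
}
```

For the log-max-ts in the log above, the physical part decodes to 1767665936536 ms, which is 2026-01-06 02:18:56.536 UTC and matches the reported log-max-date in +0800. So if log-max-date is stuck, the largest value stored in those .ts files is not advancing, or the listing is not returning the newer files.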

1 like
	for {
		// FIXME: We can't use ListObjectsV2, it is not universally supported.
		// (Ceph RGW supported ListObjectsV2 since v15.1.0, released 2020 Jan 30th)
		// (as of 2020, DigitalOcean Spaces still does not support V2 - https://developers.digitalocean.com/documentation/spaces/#list-bucket-contents)
		res, err := rs.svc.ListObjects(ctx, req)
		if err != nil {
			return errors.Trace(err)
		}
		for _, r := range res.Contents {
			// https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObjects.html#AmazonS3-ListObjects-response-NextMarker -
			//
			// `res.NextMarker` is populated only if we specify req.Delimiter.
			// Aliyun OSS and minio will populate NextMarker no matter what,
			// but this documented behavior does apply to AWS S3:
			//
			// "If response does not include the NextMarker and it is truncated,
			// you can use the value of the last Key in the response as the marker
			// in the subsequent request to get the next set of object keys."
			req.Marker = r.Key

			// when walk on specify directory, the result include storage.Prefix,
			// which can not be reuse in other API(Open/Read) directly.
			// so we use TrimPrefix to filter Prefix for next Open/Read.
			path := strings.TrimPrefix(*r.Key, rs.options.Prefix)
			// trim the prefix '/' to ensure that the path returned is consistent with the local storage
			path = strings.TrimPrefix(path, "/")
			itemSize := *r.Size

			// filter out s3's empty directory items
			if itemSize <= 0 && strings.HasSuffix(path, "/") {
				log.Info("this path is an empty directory and cannot be opened in S3.  Skip it", zap.String("path", path))
				continue
			}
			if err = fn(path, itemSize); err != nil {
				return errors.Trace(err)
			}
		}
		if !aws.ToBool(res.IsTruncated) {
			break
		}
	}

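Also, the list of .ts files under v1/global_checkpoint is obtained by calling ListObjects (the code is in func WalkDir in tidb/br/pkg/storage/s3.go). If the storage service does not support this API, the metadata cannot be retrieved either.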
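The marker handling in that loop can be shown with a small standalone sketch. It assumes a marker-based listing that returns keys in sorted order and never fills in NextMarker, so the last key of each page is reused as the next marker. The page type and the fakeList and walk functions are hypothetical stand-ins, not the AWS SDK or BR types.

```go
package main

import (
	"fmt"
	"sort"
)

// page is a trimmed-down, hypothetical stand-in for a ListObjects response.
type page struct {
	Keys        []string
	IsTruncated bool
}

// fakeList returns up to maxKeys keys that sort strictly after marker.
// all must be sorted. No NextMarker is returned, on purpose.
func fakeList(all []string, marker string, maxKeys int) page {
	start := sort.SearchStrings(all, marker)
	if start < len(all) && all[start] == marker {
		start++ // the marker itself is excluded from the next page
	}
	end := start + maxKeys
	if end > len(all) {
		end = len(all)
	}
	return page{Keys: all[start:end], IsTruncated: end < len(all)}
}

// walk visits every key by using the last key of each page as the next marker.
func walk(all []string, maxKeys int, fn func(key string) error) error {
	marker := ""
	for {
		res := fakeList(all, marker, maxKeys)
		for _, k := range res.Keys {
			marker = k
			if err := fn(k); err != nil {
				return err
			}
		}
		// Stop on the last page, or on an empty page to avoid looping forever.
		if !res.IsTruncated || len(res.Keys) == 0 {
			return nil
		}
	}
}

func main() {
	keys := []string{
		"v1/global_checkpoint/1.ts",
		"v1/global_checkpoint/2.ts",
		"v1/global_checkpoint/3.ts",
	}
	err := walk(keys, 2, func(k string) error {
		fmt.Println(k)
		return nil
	})
	if err != nil {
		panic(err)
	}
}
```

If a storage service returns keys out of order or handles the marker differently, a loop like this could skip or repeat objects, which is one thing worth checking when the listing result differs between S3-compatible services.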

2 likes

This is pretty deep; you have dug all the way down to the code level.

2 likes

The main issue is that the task status shows as normal, which is what makes this so confusing. I can't tell where the problem is.

1 like

Learning from this thread; it is quite advanced.