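It may have an impact. Test it in a test environment first to see how large the impact is.
After switching the PD leader and setting use-region-storage, the following log is still being emitted: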
[2021/04/12 15:41:36.780 +08:00] [INFO] [process.rs:145] ["get snapshot failed"] [err="Request(message: "EpochNotMatch current epoch of region 15141 is conf_ver: 5 version: 1244, but you sent conf_ver: 5 version: 1246" epoch_not_match { current_regions { id: 15141 start_key: 7480000000000001FFB35F698000000000FF0000010131323132FF33643530FF2D3066FF66362D3430FF3965FF2D383965612DFF37FF63303136666333FFFF6432326537326333FFFF323137662D6231FF64FF362D34613039FF2D62FF3235612D36FF326638FF35346233FF31626662FF000000FF0000000000F70000FD end_key: 7480000000000001FFB35F698000000000FF0000010132316531FF31373536FF2D6333FF30312D3463FF6137FF2D623866322DFF35FF63633238633730FFFF3432393836653064FFFF396136612D3663FF32FF302D34623064FF2D39FF3665382D38FF353330FF38353030FF32313934FF000000FF0000000000F70000FD region_epoch { conf_ver: 5 version: 1244 } peers { id: 15142 store_id:1 } peers { id: 15143 store_id: 4 } peers { id: 15144 store_id: 5 } } current_regions { id: 14585 start_key: 7480000000000001FFB35F698000000000FF0000010130623761FF32366537FF2D3531FF33642D3465FF6537FF2D393838622DFF65FF36343964656166FFFF3530363561653066FFFF323633332D6130FF64FF302D34393766FF2D62FF6135382D30FF633631FF38633232FF36316239FF000000FF0000000000F70000FD end_key: 7480000000000001FFB35F698000000000FF0000010131323132FF33643530FF2D3066FF66362D3430FF3965FF2D383965612DFF37FF63303136666333FFFF6432326537326333FFFF323137662D6231FF64FF362D34613039FF2D62FF3235612D36FF326638FF35346233FF31626662FF000000FF0000000000F70000FD region_epoch { conf_ver: 5 version: 1243 } peers { id: 14586 store_id: 1 } peers { id: 14587 store_id: 4 } peers { id: 14588 store_id: 5 } } })"] [cid=3496516605]
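The part of this log that matters is the short summary at the start of the message: which region is affected, which epoch TiKV currently holds, and which epoch the request carried. Below is a minimal Go sketch for pulling those numbers out of such a line. The function and type names (parseEpochMismatch, epochMismatch) are made up for illustration, and the regular expression assumes the message wording stays exactly as in the log above.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// epochRe matches the summary sentence of an EpochNotMatch message.
// It assumes the wording seen in the log above.
var epochRe = regexp.MustCompile(
	`current epoch of region (\d+) is conf_ver: (\d+) version: (\d+), but you sent conf_ver: (\d+) version: (\d+)`)

// epochMismatch is a hypothetical holder for the parsed numbers.
type epochMismatch struct {
	RegionID    uint64
	CurConfVer  uint64 // conf_ver currently held by TiKV
	CurVersion  uint64 // version currently held by TiKV
	SentConfVer uint64 // conf_ver carried by the request
	SentVersion uint64 // version carried by the request
}

// parseEpochMismatch extracts the region ID and both epochs from a log line.
// The second return value is false when the line does not match.
func parseEpochMismatch(line string) (epochMismatch, bool) {
	m := epochRe.FindStringSubmatch(line)
	if m == nil {
		return epochMismatch{}, false
	}
	n := make([]uint64, 5)
	for i := range n {
		v, err := strconv.ParseUint(m[i+1], 10, 64)
		if err != nil {
			return epochMismatch{}, false
		}
		n[i] = v
	}
	return epochMismatch{n[0], n[1], n[2], n[3], n[4]}, true
}

func main() {
	line := `EpochNotMatch current epoch of region 15141 is conf_ver: 5 version: 1244, but you sent conf_ver: 5 version: 1246`
	if e, ok := parseEpochMismatch(line); ok {
		fmt.Printf("region %d: TiKV holds version %d, request carried version %d\n",
			e.RegionID, e.CurVersion, e.SentVersion)
	}
}
```

For the line above, the request carried version 1246 while TiKV reports 1244 for region 15141, so the two sides disagree about this region's epoch.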
Did you set use-region-storage to false first and then switch the PD leader?
Yes.
package main

import (
	"context"
	"flag"
	"fmt"
	"os"
	"path"
	"strconv"
	"strings"
	"time"

	"go.etcd.io/etcd/clientv3"
	"go.etcd.io/etcd/pkg/transport"
)

var (
	clusterID = flag.Uint64("cluster-id", 0, "please make cluster ID match with TiKV")
	endpoints = flag.String("endpoints", "http://127.0.0.1:2379", "endpoints urls")
	filePath  = flag.String("file", "regions.dump", "dump file path and name")
	caPath    = flag.String("cacert", "", "path of file that contains list of trusted SSL CAs")
	certPath  = flag.String("cert", "", "path of file that contains X509 certificate in PEM format")
	keyPath   = flag.String("key", "", "path of file that contains X509 key in PEM format")
)

const (
	etcdTimeout = 1200 * time.Second
	pdRootPath  = "/pd"
)

// rootPath is "/pd/<cluster-id>"; main sets it after the flags are parsed.
var rootPath = ""

// checkErr prints the error and exits with a non-zero status.
func checkErr(err error) {
	if err != nil {
		fmt.Println(err.Error())
		os.Exit(1)
	}
}

func main() {
	flag.Parse()
	rootPath = path.Join(pdRootPath, strconv.FormatUint(*clusterID, 10))

	// Connect to the etcd embedded in PD; the TLS fields stay empty for a plain HTTP cluster.
	urls := strings.Split(*endpoints, ",")
	tlsInfo := transport.TLSInfo{
		CertFile:      *certPath,
		KeyFile:       *keyPath,
		TrustedCAFile: *caPath,
	}
	tlsConfig, err := tlsInfo.ClientConfig()
	checkErr(err)
	client, err := clientv3.New(clientv3.Config{
		Endpoints:   urls,
		DialTimeout: etcdTimeout,
		TLS:         tlsConfig,
	})
	checkErr(err)
	defer client.Close()

	deleteRegions(client)
	fmt.Println("successful!")
}

// regions lists the IDs of the regions whose metadata is deleted from PD.
var regions = []uint64{15141, 15149, 52021}

// regionPath returns the full etcd key of a region, for example
// /pd/<cluster-id>/raft/r/00000000000000015141. The key must include rootPath,
// otherwise the delete targets a key that does not exist in PD.
func regionPath(regionID uint64) string {
	return path.Join(rootPath, "raft", "r", fmt.Sprintf("%020d", regionID))
}

// deleteRegions removes the metadata key of every region in regions.
func deleteRegions(client *clientv3.Client) {
	kv := clientv3.NewKV(client)
	for _, regionID := range regions {
		key := regionPath(regionID)
		resp, err := kv.Delete(context.TODO(), key)
		if err != nil {
			fmt.Println("delete key failed", key, err)
			continue
		}
		// Delete does not return an error for a missing key, so check the count.
		if resp.Deleted == 0 {
			fmt.Println("key not found, nothing deleted:", key)
			continue
		}
		fmt.Println("delete key succ:", key)
	}
}
Replace the region IDs in the regions slice of the script above (15141, 15149, 52021) with the IDs of the regions that remain on store 4, then run the script to delete those regions' information from PD.
How to use the script:
./regions-delete --endpoints http://${pd_ip}:${pd_port} --cluster-id=${cluster_id}
If the script runs successfully, it should print output similar to:
delete key succ: /pd/6637380734426264413/raft/r/00000000000000161945
If the cluster still does not recover after the script has deleted the region information, set use-region-storage to false again, switch the PD leader, and check once more.
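To check by hand which key belongs to a region, the layout visible in the sample output above can be rebuilt as follows: the PD root /pd, the cluster ID in decimal, then raft/r/ and the region ID zero-padded to 20 digits. This is only a sketch of that layout as it appears in the output line; the helper name regionKey is made up for illustration.

```go
package main

import (
	"fmt"
	"path"
	"strconv"
)

// regionKey is a hypothetical helper that rebuilds the key layout seen in the
// sample output: /pd/<cluster-id>/raft/r/<region ID padded to 20 digits>.
func regionKey(clusterID, regionID uint64) string {
	return path.Join(
		"/pd",
		strconv.FormatUint(clusterID, 10),
		"raft", "r",
		fmt.Sprintf("%020d", regionID),
	)
}

func main() {
	// Cluster ID and region ID are taken from the sample output line.
	fmt.Println(regionKey(6637380734426264413, 161945))
	// prints /pd/6637380734426264413/raft/r/00000000000000161945
}
```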
Are the regions to delete the region IDs from the output of region store 4 in pd-ctl, redirected to a text file?
Yes. Run region store 4 in pd-ctl to get the latest list of regions that remain on the store.
Also cross-check with tikv-ctl to confirm that the information for these regions really is inconsistent between PD and TiKV:
./tikv-ctl --host ${store4_ip}:${store4_tikv_port} raft region -r 15141
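When I run it, it says the command does not exist. Do I need to install some environment, or put the file in a particular directory?
Do you have a Go environment installed? The script has to be compiled into a binary in your Go environment before it can be used.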
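Does this delete data, or only the records stored in PD?
It clears the region metadata stored in PD, not the data.
Is there a command for deleting region information from PD?
As far as I know, there is no corresponding API.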
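I see that someone has already replied in the corresponding thread.
Has the problem in this thread, the store staying in pending offline, been resolved?
The pending offline store has now gone offline. The store used values have all become smaller, though.
OK. For the shrinking store used values, please follow the replies in the other thread.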