TIKV_REGION_STATUS can only be queried through the first PD

  1. I tested this and the problem is reproducible. The following error stack is printed.
    [2022/06/24 10:27:47.896 +08:00] [INFO] [conn.go:1115] ["command dispatched failed"] [conn=7] [connInfo="id:7, addr:172.xxx.xx.136:55676 status:10, collation:utf8_general_ci, user:root"] [command=Query] [status="inTxn:0, autocommit:1"] [sql="select * from TIKV_REGION_STATUS"] [txn_mode=PESSIMISTIC] [err="Get \"http://172.xxx.xx.162:18279/pd/api/v1/regions\": dial tcp 172.xxx.xx.162:18279: connect: connection refused\ngithub.com/pingcap/errors.AddStack\n\t/nfs/cache/mod/github.com/pingcap/errors@v0.11.5-0.20211224045212-9687c2b0f87c/errors.go:174\ngithub.com/pingcap/errors.Trace\n\t/nfs/cache/mod/github.com/pingcap/errors@v0.11.5-0.20211224045212-9687c2b0f87c/juju_adaptor.go:15\ngithub.com/pingcap/tidb/store/helper.(*Helper).requestPD\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/store/helper/helper.go:813\ngithub.com/pingcap/tidb/store/helper.(*Helper).GetRegionsInfo\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/store/helper/helper.go:771\ngithub.com/pingcap/tidb/executor.(*memtableRetriever).setDataForTiKVRegionStatus\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/infoschema_reader.go:1449\ngithub.com/pingcap/tidb/executor.(*memtableRetriever).retrieve\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/infoschema_reader.go:141\ngithub.com/pingcap/tidb/executor.(*MemTableReaderExec).Next\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/memtable_reader.go:118\ngithub.com/pingcap/tidb/executor.Next\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/executor.go:286\ngithub.com/pingcap/tidb/executor.(*recordSet).Next\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/executor/adapter.go:149\ngithub.com/pingcap/tidb/server.(*tidbResultSet).Next\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/driver_tidb.go:312\ngithub.com/pingcap/tidb/server.(*clientConn).writeChunks\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/conn.go:2165\ngithub.com/pingcap/tidb/server.(*clientConn).writeResultset\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/conn.go:2116\ngithub.com/pingcap/tidb/server.(*clientConn).handleStmt\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/conn.go:1994\ngithub.com/pingcap/tidb/server.(*clientConn).handleQuery\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/conn.go:1841\ngithub.com/pingcap/tidb/server.(*clientConn).dispatch\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/conn.go:1336\ngithub.com/pingcap/tidb/server.(*clientConn).Run\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/conn.go:1091\ngithub.com/pingcap/tidb/server.(*Server).onConn\n\t/home/jenkins/agent/workspace/build-common/go/src/github.com/pingcap/tidb/server/server.go:548\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1371"]
  2. At first glance, the code below can still end up sending the request to a PD node that is down.
    for _, host := range pdHosts {
        req, err = http.NewRequest(method, util.InternalHTTPSchema()+"://"+host+uri, body)
        if err != nil {
            // Try to request from another PD node when some nodes may down.
            if strings.Contains(err.Error(), "connection refused") {
                continue
            }
            return errors.Trace(err)
        }
    }
    if err != nil {
        return err
    }
    start := time.Now()
    resp, err := util.InternalHTTPClient().Do(req)
    if err != nil {
        return errors.Trace(err)
    }
  3. Filed an issue: https://github.com/pingcap/tidb/issues/35708