Deploying TiDB on K8s fails, with no error reported

[TiDB Environment] Production / Test / PoC
[TiDB Version]
[Reproduction Path] Operations performed before the problem appeared
[Problem Encountered: symptoms and impact]
[Resource Configuration]
[TiDB Operator Version]: v1.5.2
[K8s Version]: v1.29.2
[Attachments: screenshots / logs / monitoring]
tidb-operator tidb-admin 1 2024-03-13 13:36:47.522594135 +0800 CST deployed tidb-operator-v1.5.2 v1.5.2
tidb-operator pod logs:
E0313 06:17:24.355669 1 reflector.go:138] k8s.io/client-go@v0.20.15/tools/cache/reflector.go:167: Failed to watch *v1alpha1.TidbDashboard: failed to list *v1alpha1.TidbDashboard: the server could not find the requested resource (get tidbdashboards.pingcap.com)
E0313 06:18:09.206798 1 reflector.go:138] k8s.io/client-go@v0.20.15/tools/cache/reflector.go:167: Failed to watch *v1alpha1.TidbDashboard: failed to list *v1alpha1.TidbDashboard: the server could not find the requested resource (get tidbdashboards.pingcap.com)
E0313 06:19:03.449916 1 reflector.go:138] k8s.io/client-go@v0.20.15/tools/cache/reflector.go:167: Failed to watch *v1alpha1.TidbDashboard: failed to list *v1alpha1.TidbDashboard: the server could not find the requested resource (get tidbdashboards.pingcap.com)
E0313 06:19:52.099712 1 reflector.go:138] k8s.io/client-go@v0.20.15/tools/cache/reflector.go:167: Failed to watch *v1alpha1.TidbDashboard: failed to list *v1alpha1.TidbDashboard: the server could not find the requested resource (get tidbdashboards.pingcap.com)
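For reference, a quick way to check whether the CRD those errors refer to is actually installed (just a sanity check):

kubectl get crd tidbdashboards.pingcap.com
# or list every CRD from the operator
kubectl get crd | grep pingcap.com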

kubectl apply -f tidb-test.yaml
tidbcluster.pingcap.com/tidb-test created

But no pods were created, and there is no error message anywhere.

Where should I look to troubleshoot this?

kubectl get po -n ${namespace} -l app.kubernetes.io/instance=${cluster_name}

What does this return?

As shown in the attached screenshot.

Run kubectl get tc -n ns -oyaml and show how your TidbCluster (tc) is defined.

Did you follow the documentation step by step? It usually goes smoothly when you follow the docs.

The log messages suggest the Operator is not working properly, which in turn blocks the deployment of tidb-test.

It may be a version incompatibility.

Without a clear error message, this is hard to debug.
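A quick sanity check on the Operator itself may help here (assuming it was installed via Helm into the tidb-admin namespace, as the helm output above shows):

kubectl get pods -n tidb-admin
# the tidb-controller-manager (and tidb-scheduler, if enabled) pods should be Running;
# CrashLoopBackOff or frequent restarts would point to an Operator-side problem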

apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: basic
spec:
  version: v7.1.1
  timezone: UTC
  pvReclaimPolicy: Retain
  enableDynamicConfiguration: true
  configUpdateStrategy: RollingUpdate
  discovery: {}
  helper:
    image: alpine:3.16.0
  pd:
    baseImage: pingcap/pd
    maxFailoverCount: 0
    replicas: 1
    # if storageClassName is not set, the default Storage Class of the Kubernetes cluster will be used
    storageClassName: nfs-client
    requests:
      storage: "1Gi"
    config: {}
  tikv:
    baseImage: pingcap/tikv
    maxFailoverCount: 0
    # If only 1 TiKV is deployed, the TiKV region leader
    # cannot be transferred during upgrade, so we have
    # to configure a short timeout
    evictLeaderTimeout: 1m
    replicas: 1
    # if storageClassName is not set, the default Storage Class of the Kubernetes cluster will be used
    storageClassName: nfs-client
    requests:
      storage: "1Gi"
    config:
      storage:
        # In basic examples, we set this to avoid using too much storage.
        reserve-space: "0MB"
      rocksdb:
        # In basic examples, we set this to avoid the following error in some Kubernetes clusters:
        # "the maximum number of open file descriptors is too small, got 1024, expect greater or equal to 82920"
        max-open-files: 256
      raftdb:
        max-open-files: 256
  tidb:
    baseImage: pingcap/tidb
    maxFailoverCount: 0
    replicas: 1
    service:
      type: ClusterIP
    config: {}

This is the minimal-resource template copied from the docs, with only storageClassName changed to nfs-client.
(https://github.com/pingcap/tidb-operator/blob/v1.5.2/examples/advanced/tidb-cluster.yaml)
Applying the full advanced example shows the same symptom.

All of your storage uses nfs-client; do you actually have a provisioner for it?
Run kubectl get pvc -n xxx to check whether the PVCs were created and whether the corresponding PVs were provisioned.
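For example (a sketch; nfs-client is the StorageClass name from the YAML above, and <namespace> is wherever tidb-test was applied):

# does the StorageClass exist, and which provisioner backs it?
kubectl get storageclass nfs-client
# were the PVCs created, and were matching PVs provisioned?
kubectl get pvc -n <namespace>
kubectl get pv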


Those two are just for testing; the PVCs for the TiDB cluster were not created.

The PVCs were not created at all.
Check the Operator logs, the controller-manager one, filter by this namespace, and paste a section here.
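A sketch of that, assuming the Operator was installed into tidb-admin with the default deployment name:

# deployment name assumes a default Helm install of tidb-operator
kubectl logs -n tidb-admin deployment/tidb-controller-manager --since=1h | grep <namespace>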

Neither tidb-controller-manager nor kube-controller-manager-node has any matching log lines.
I deleted and re-applied the YAML several times, and the kube-controller-manager-node logs did not change at all.


kubectl get crd
Indeed, there is no TidbDashboard CRD.

Re-downloaded the v1.5.2 CRD manifest:
kubectl create -f https://raw.githubusercontent.com/pingcap/tidb-operator/v1.5.2/manifests/crd.yaml
kubectl apply -f crd.yaml
customresourcedefinition.apiextensions.k8s.io/backups.pingcap.com configured
customresourcedefinition.apiextensions.k8s.io/backupschedules.pingcap.com configured
customresourcedefinition.apiextensions.k8s.io/dmclusters.pingcap.com configured
customresourcedefinition.apiextensions.k8s.io/restores.pingcap.com configured
customresourcedefinition.apiextensions.k8s.io/tidbclusterautoscalers.pingcap.com configured
customresourcedefinition.apiextensions.k8s.io/tidbdashboards.pingcap.com configured
customresourcedefinition.apiextensions.k8s.io/tidbinitializers.pingcap.com configured
customresourcedefinition.apiextensions.k8s.io/tidbmonitors.pingcap.com configured
customresourcedefinition.apiextensions.k8s.io/tidbngmonitorings.pingcap.com configured


The CustomResourceDefinition "tidbclusters.pingcap.com" is invalid: metadata.annotations: Too long: must have at most 262144 bytes
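That error is the known size limit on the kubectl.kubernetes.io/last-applied-configuration annotation: client-side kubectl apply stores the entire previous object in that annotation, and the tidbclusters CRD alone exceeds the 262144-byte cap. The usual workarounds (using the same v1.5.2 manifest URL as above):

# create/replace do not write the last-applied-configuration annotation
kubectl create -f https://raw.githubusercontent.com/pingcap/tidb-operator/v1.5.2/manifests/crd.yaml
# or, if the CRDs already exist:
kubectl replace -f https://raw.githubusercontent.com/pingcap/tidb-operator/v1.5.2/manifests/crd.yaml
# or server-side apply, which skips the annotation entirely:
kubectl apply --server-side -f https://raw.githubusercontent.com/pingcap/tidb-operator/v1.5.2/manifests/crd.yaml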


Now tidbdashboards is there, but tidbclusters is gone.

Then I downloaded the dev version:
kubectl create -f https://raw.githubusercontent.com/pingcap/tidb-operator/master/manifests/crd.yaml
NAME CREATED AT
backups.pingcap.com 2024-03-13T11:23:54Z
backupschedules.pingcap.com 2024-03-13T11:23:57Z
dmclusters.pingcap.com 2024-03-13T11:23:57Z
restores.pingcap.com 2024-03-13T11:23:57Z
tidbclusterautoscalers.pingcap.com 2024-03-13T11:23:57Z
tidbclusters.pingcap.com 2024-03-13T11:24:14Z
tidbdashboards.pingcap.com 2024-03-13T11:24:21Z
tidbinitializers.pingcap.com 2024-03-13T11:24:21Z
tidbmonitors.pingcap.com 2024-03-13T11:24:22Z
tidbngmonitorings.pingcap.com 2024-03-13T11:24:23Z
No errors this time, and 10 CRD names are listed.

After applying the YAML again, the pods were created normally.
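To confirm the cluster actually comes up healthy (namespace and cluster name assumed from the earlier apply):

kubectl get pods -n <namespace> -l app.kubernetes.io/instance=tidb-test
# once the PD/TiKV/TiDB pods are all Running, the TidbCluster status should report the components as ready
kubectl get tidbcluster tidb-test -n <namespace>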

Thanks, everyone.

That CRD must have been added recently; the version we had before did not include it, and when I saw the error earlier I just ignored it out of habit.