TidbCluster deployment fails with TiKVStoreNotUp

[TiDB Environment] Test environment
[TiDB Version] v6.1.0
[Problem Encountered] Cluster deployment fails
[Symptoms and Impact]
While deploying the cluster, it keeps reporting "TiKV store(s) are not up", even though the TiKV pods are in Running state.


+ kubectl get pod
NAME                                                                   READY   STATUS             RESTARTS   AGE
advanced-tidb-discovery-6c65bf49fb-kmmmn                          1/1     Running            0          62m
advanced-tidb-pd-0                                                1/1     Running            0          62m
advanced-tidb-tikv-0                                              1/1     Running            0          62m
advanced-tidb-tikv-1                                              1/1     Running            0          22m
advanced-tidb-tikv-2                                              1/1     Running            0          22m

I also went into the TiKV pods and found nothing abnormal in their logs.

The tidb-controller-manager log keeps printing the following:

I1101 08:44:09.534955       1 tikv_member_manager.go:808] TiKV of Cluster frs-dev/advanced-tidb not bootstrapped yet
I1101 08:44:09.555285       1 tikv_member_manager.go:906] TiKV of Cluster frs-dev/advanced-tidb is not bootstrapped yet, no need to set store labels
I1101 08:44:09.555981       1 tidb_cluster_controller.go:127] TidbCluster: frs-dev/advanced-tidb, still need sync: TidbCluster: [frs-dev/advanced-tidb], waiting for TiKV cluster running, requeuing

Does anyone know what the cause is?

TiKV has no labels set.

TiKV of Cluster frs-dev/advanced-tidb is not bootstrapped yet, no need to set store labels
need to set store labels

How do I set these labels?

The way I read it, the log says "is not bootstrapped yet, no need to set store labels", i.e. the labels do not need to be set.
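For reference, if store labels ever did need to be set with TiDB Operator, they are usually not set by hand: you configure location-labels in the PD config of the TidbCluster spec and put matching labels on the Kubernetes nodes, and the operator then attaches those node labels to each TiKV store. A minimal sketch (assuming node labels named zone and host exist):

  pd:
    config: |
      [replication]
        location-labels = ["zone", "host"]

As the log says, though, that is not what is blocking this deployment.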

Then have you tried accessing it?

Do you mean accessing the database? I can't: the TiDB service hasn't even started. The deployment is stuck at this point, so the TiDB service further down the line was never created.

+ kubectl get tidbclusters -n frs-dev
NAME            READY   PD                  STORAGE   READY   DESIRE   TIKV   STORAGE   READY   DESIRE   TIDB   READY   DESIRE   AGE
advanced-tidb   False   pingcap/pd:v6.1.0   10Gi      1       1               50Gi      3       3                       2        62m

Check the scheduling events. Where is it failing?

Er, what do you mean by scheduling events? How do I check them?

You're already using Kubernetes... shouldn't you look into how pods get created?

Put another way: you should at least check the pod creation logs to see why it keeps getting stuck here.
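For example, a pod's scheduling and creation events show up in the Events section of kubectl describe, and you can also list the events for the whole namespace (a quick sketch, assuming namespace frs-dev):

+ kubectl describe pod advanced-tidb-tikv-0 -n frs-dev
+ kubectl get events -n frs-dev --sort-by=.metadata.creationTimestamp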

I've checked the pod logs and there's nothing abnormal; it's only tidb-controller-manager reporting that the TiKV cluster isn't up.

The TiDB StatefulSet was never created. My guess is that TiDB Operator sees TiKV as not ready and therefore doesn't move on to creating the next pods. What I can't figure out is why it considers the TiKV cluster unhealthy; I've checked the logs of all the TiKV pods and they look normal.

+ kubectl get statefulsets -n frs-dev
NAME                 READY   AGE
advanced-tidb-pd     1/1     62m
advanced-tidb-tikv   3/3     62m

I'd check the network first: can tidb-controller-manager reach the TiKV pods?
And can the PD and TiKV pods reach each other?
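A quick connectivity check from inside the cluster could look like this (just a sketch; it assumes curl is available in the TiKV image and uses the default PD client port 2379):

+ kubectl exec -n frs-dev advanced-tidb-tikv-0 -- curl -s http://advanced-tidb-pd-0.advanced-tidb-pd-peer.frs-dev.svc:2379/pd/api/v1/members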

The network is fine. The TiKV logs show it has already connected to PD.

frs-dev/advanced-tidb not bootstrapped yet

Then that's exactly the problem.

TiKV of Cluster frs-dev/advanced-tidb not bootstrapped yet

Er, I just don't know why it thinks TiKV hasn't finished bootstrapping, or what that bootstrap step actually involves. The TiKV pods have no error logs either, only a heartbeat log entry every 10 minutes.
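One way to see why the operator still treats TiKV as not bootstrapped is to ask PD which stores it has registered; the "not bootstrapped yet" message generally means PD does not yet have any TiKV store in the Up state. A minimal check, assuming pd-ctl is shipped in the pingcap/pd image at /pd-ctl:

+ kubectl exec -n frs-dev advanced-tidb-pd-0 -- /pd-ctl -u http://127.0.0.1:2379 store
+ kubectl exec -n frs-dev advanced-tidb-pd-0 -- /pd-ctl -u http://127.0.0.1:2379 health

If the store list is empty, or the stores are not Up, the problem is on the TiKV-to-PD registration side rather than in the Kubernetes objects themselves.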

Look into advanced-tidb then. That's the TiDB instance name, right?

That's the cluster name; there is no pod with that name.

https://docs.pingcap.com/zh/tidb-in-kubernetes/stable/deploy-failures#pod-处于-pending-状态

Check the logs of all the pods; otherwise there's no way to tell.

The requirements on the underlying Kubernetes environment are fairly involved, too.
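If it helps, here is a quick way to dump the recent logs of every pod in the namespace plus the operator itself (a sketch, assuming the operator is installed in the tidb-admin namespace):

+ for p in $(kubectl get pods -n frs-dev -o name); do echo "== $p"; kubectl logs -n frs-dev "$p" --tail=50; done
+ kubectl logs -n tidb-admin deploy/tidb-controller-manager --tail=100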

Well, every pod that was created is in Running state, and their logs show no notable errors. But the TiDB pods were never created, and neither was their corresponding StatefulSet.

Here is the detailed status of the cluster:

+ kubectl describe tidbclusters -n frs-dev

Name:         advanced-tidb
Namespace:    frs-dev
Labels:       <none>
Annotations:  <none>
API Version:  pingcap.com/v1alpha1
Kind:         TidbCluster
Metadata:
  Creation Timestamp:  2022-11-01T07:32:45Z
  Generation:          17
  Resource Version:    3871439862
  Self Link:           /apis/pingcap.com/v1alpha1/namespaces/frs-dev/tidbclusters/advanced-tidb
  UID:                 dae58b62-7e57-4ac1-96ae-8c9b1680bb57
Spec:
  Config Update Strategy:  RollingUpdate
  Discovery:
  Enable Dynamic Configuration:  true
  Enable PV Reclaim:             false
  Helper:
    Image:            alpine:3.16.0
  Image Pull Policy:  IfNotPresent
  Node Selector:
    Project:  RECONPLFM
  Pd:
    Base Image:  pingcap/pd
    Config:      lease = 1800

[dashboard]
  internal-proxy = true

    Max Failover Count:           0
    Mount Cluster Client Secret:  false
    Replicas:                     1
    Requests:
      Storage:           10Gi
    Storage Class Name:  cna-reconplfm-dev-nas
  Pv Reclaim Policy:     Retain
  Tidb:
    Base Image:  pingcap/tidb
    Config:      [log]
  [log.file]
    max-backups = 3

[performance]
  tcp-keep-alive = true

    Max Failover Count:  0
    Replicas:            2
    Service:
      Type:              ClusterIP
    Storage Class Name:  cna-reconplfm-dev-nas
  Tikv:
    Base Image:  pingcap/tikv
    Config:      log-level = "info"

    Max Failover Count:           0
    Mount Cluster Client Secret:  false
    Replicas:                     3
    Requests:
      Storage:           50Gi
    Storage Class Name:  cna-reconplfm-dev-nas
  Timezone:              UTC
  Tls Cluster:
  Tolerations:
    Effect:    NoSchedule
    Key:       RECONPLFM
    Operator:  Equal
  Version:     v6.1.0
Status:
  Cluster ID:  7160932248881483001
  Conditions:
    Last Transition Time:  2022-11-01T07:32:45Z
    Last Update Time:      2022-11-01T07:33:07Z
    Message:               TiKV store(s) are not up
    Reason:                TiKVStoreNotUp
    Status:                False
    Type:                  Ready
  Pd:
    Image:  pingcap/pd:v6.1.0
    Leader:
      Client URL:            http://advanced-tidb-pd-0.advanced-tidb-pd-peer.frs-dev.svc:2379
      Health:                true
      Id:                    11005745135123337789
      Last Transition Time:  2022-11-01T07:33:06Z
      Name:                  advanced-tidb-pd-0
    Members:
      advanced-tidb-pd-0:
        Client URL:            http://advanced-tidb-pd-0.advanced-tidb-pd-peer.frs-dev.svc:2379
        Health:                true
        Id:                    11005745135123337789
        Last Transition Time:  2022-11-01T07:33:06Z
        Name:                  advanced-tidb-pd-0
    Phase:                     Normal
    Stateful Set:
      Collision Count:      0
      Current Replicas:     1
      Current Revision:     advanced-tidb-pd-59466586bc
      Observed Generation:  1
      Ready Replicas:       1
      Replicas:             1
      Update Revision:      advanced-tidb-pd-59466586bc
      Updated Replicas:     1
    Synced:                 true
  Pump:
  Ticdc:
  Tidb:
  Tiflash:
  Tikv:
    Phase:  Normal
    Stateful Set:
      Collision Count:      0
      Current Replicas:     3
      Current Revision:     advanced-tidb-tikv-66f457c77b
      Observed Generation:  3
      Ready Replicas:       3
      Replicas:             3
      Update Revision:      advanced-tidb-tikv-66f457c77b
      Updated Replicas:     3
    Synced:                 true
Events:                     <none>

Here is my cluster's configuration file. Could someone take a look and see whether anything is wrong?

apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: advanced-tidb
  namespace: frs-dev

spec:
  #######################
  # Basic Configuration #
  #######################

  ## TiDB cluster version
  version: "v6.1.0"

  ## Time zone of TiDB cluster Pods
  timezone: UTC

  ## serviceAccount specifies the service account for PD/TiDB/TiKV/TiFlash/Pump/TiCDC components in this TidbCluster
  # serviceAccount: advanced-tidb

  ## ConfigUpdateStrategy determines how the configuration change is applied to the cluster.
  ## Valid values are `InPlace` and `RollingUpdate`
  ##   UpdateStrategy `InPlace` will update the ConfigMap of configuration in-place and an extra rolling update of the
  ##   cluster component is needed to reload the configuration change.
  ##   UpdateStrategy `RollingUpdate` will create a new ConfigMap with the new configuration and rolling update the
  ##   related components to use the new ConfigMap, that is, the new configuration will be applied automatically.
  configUpdateStrategy: RollingUpdate

  ## ImagePullPolicy of TiDB cluster Pods
  ## Ref: https://kubernetes.io/docs/concepts/configuration/overview/#container-images
  # imagePullPolicy: IfNotPresent

  ## If private registry is used, imagePullSecrets may be set
  ## You can also set this in service account
  ## Ref: https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
  # imagePullSecrets:
  # - name: secretName

  ## Image used to do miscellaneous tasks as sidecar container, such as:
  ## - execute sysctls when PodSecurityContext is set for some components, requires `sysctl` installed
  ## - tail slow log for tidb, requires `tail` installed
  ## - fill tiflash config template file based on pod ordinal
  helper:
    image: alpine:3.16.0
  # imagePullPolicy: IfNotPresent

  ## Enable PVC/PV reclaim for orphan PVC/PV left by statefulset scale-in.
  ## When set to `true`, PVC/PV that are not used by any tidb cluster pods will be deleted automatically.
  # enablePVReclaim: false

  ## Persistent volume reclaim policy applied to the PV consumed by the TiDB cluster, default to `Retain`.
  ## Note that the reclaim policy Recycle may not be supported by some storage types, e.g. local.
  ## Ref: https://kubernetes.io/docs/tasks/administer-cluster/change-pv-reclaim-policy/
  pvReclaimPolicy: Retain

  ##########################
  # Advanced Configuration #
  ##########################

  ## when deploying a heterogeneous TiDB cluster, you MUST specify the cluster name to join here
  # cluster:
  #   namespace: default
  #   name: tidb-cluster-to-join
  #   clusterDomain: cluster.local

  ## specifying pdAddresses will make PD in this TiDB cluster to join another existing PD cluster
  ## PD will then start with arguments --join= instead of --initial-cluster=
  # pdAddresses:
  #   - http://cluster1-pd-0.cluster1-pd-peer.default.svc:2379
  #   - http://cluster1-pd-1.cluster1-pd-peer.default.svc:2379

  ## Enable mutual TLS connection between TiDB cluster components
  ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/enable-tls-between-components/
  # tlsCluster:
  #   enabled: true

  ## Annotations of TiDB cluster pods, will be merged with component annotation settings.
  # annotations:
  #   node.kubernetes.io/instance-type: some-vm-type
  #   topology.kubernetes.io/region: some-region

  ## NodeSelector of TiDB cluster pods, will be merged with component nodeSelector settings.
  ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
  nodeSelector:
    project: RECONPLFM

  ## Tolerations are applied to TiDB cluster pods, allowing (but do not require) pods to be scheduled onto nodes with matching taints.
  ## This cluster-level `tolerations` only takes effect when no component-level `tolerations` are set.
  ## e.g. if `pd.tolerations` is not empty, `tolerations` here will be ignored.
  ## Ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
  tolerations:
    - effect: NoSchedule
      key: RECONPLFM
      operator: Equal
    # value: RECONPLFM

  ## Use the node network namespace, default to false
  ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#host-namespaces
  # hostNetwork: false

  ## specify resource requirements for discovery deployment
  # discovery:
  #   requests:
  #     cpu: 1000m
  #     memory: 256Mi
  #   limits:
  #     cpu: 2000m
  #     memory: 1Gi
  #   ## The following block overwrites TiDB cluster-level configurations in `spec`
  #   imagePullPolicy: IfNotPresent
  #   imagePullSecrets: secretName
  #   hostNetwork: false
  #   priorityClassName: system-cluster-critical
  #   schedulerName: default-scheduler
  #   nodeSelector:
  #     app.kubernetes.io/component: discovery
  #   annotations:
  #     node.kubernetes.io/instance-type: some-vm-type
  #   labels: {}
  #   env:
  #     - name: MY_ENV_1
  #       value: value1
  #   affinity: {}
  #   tolerations:
  #     - effect: NoSchedule
  #       key: dedicated
  #       operator: Equal
  #       value: discovery

  ## if true, this tidb cluster is paused and will not be synced by the controller
  # paused: false

  ## SchedulerName of TiDB cluster pods.
  ## If specified, the pods will be scheduled by the specified scheduler.
  ## Can be overwritten by component settings.
  # schedulerName: default-scheduler

  ## PodManagementPolicy default `OrderedReady` for Pump
  ## and default `Parallel` for the other components.
  ## Ref: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#pod-management-policies
  # podManagementPolicy: Parallel

  ## Affinity for pod scheduling, will be overwritten by each cluster component's specific affinity setting
  ## Can refer to PD/TiDB/TiKV affinity settings, and ensure only cluster-scope general settings here
  ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
  # affinity: {}

  ## Specify pod priorities of pods in TidbCluster, default to empty.
  ## Can be overwritten by component settings.
  ## Ref: https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/
  # priorityClassName: system-cluster-critical

  ## If set to `true`, `--advertise-status-addr` will be appended to the startup parameters of TiKV
  enableDynamicConfiguration: true

  ## Set update strategy of StatefulSet, can be overwritten by the setting of each component.
  ## defaults to RollingUpdate
  ## Ref: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#update-strategies
  # statefulSetUpdateStrategy: RollingUpdate

  ## The identifier of the Pod will be `$(podName).$(serviceName).$(namespace).svc.$(clusterDomain)` when `clusterDomain` is set.
  ## Set this in the case where a TiDB cluster is deployed across multiple Kubernetes clusters. default to empty.
  # clusterDomain: cluster.local

  ## TopologySpreadConstraints for pod scheduling, will be overwritten by each cluster component's specific spread constraints setting
  ## Can refer to PD/TiDB/TiKV/TiCDC/TiFlash/Pump topologySpreadConstraints settings, and ensure only cluster-scope general settings here
  ## Ref: pkg/apis/pingcap/v1alpha1/types.go#TopologySpreadConstraint
  # topologySpreadConstraints:
  # - topologyKey: topology.kubernetes.io/zone

  ###########################
  # TiDB Cluster Components #
  ###########################

  pd:
    ##########################
    # Basic PD Configuration #
    ##########################

    ## Base image of the component
    baseImage: pingcap/pd

    ## pd-server configuration
    ## Ref: https://docs.pingcap.com/tidb/stable/pd-configuration-file
    config: |
      lease = 1800
      [dashboard]
        internal-proxy = true

    ## The desired replicas
    replicas: 1

    ## max inprogress failover PD pod counts
    maxFailoverCount: 0

    ## describes the compute resource requirements and limits.
    ## Ref: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
    requests:
    #   cpu: 1000m
    #   memory: 1Gi
      storage: 10Gi
    # limits:
    #   cpu: 2000m
    #   memory: 2Gi

    ## defines Kubernetes service for pd-server
    ## Ref: https://kubernetes.io/docs/concepts/services-networking/service/
    # service:
    #   type: ClusterIP
    #   annotations:
    #     foo: bar
    #   portName: client

    #############################
    # Advanced PD Configuration #
    #############################

    ## The following block overwrites TiDB cluster-level configurations in `spec`
    # version: "v6.1.0"
    # imagePullPolicy: IfNotPresent
    # imagePullSecrets:
    # - name: secretName
    # hostNetwork: false
    # serviceAccount: advanced-tidb-pd
    # priorityClassName: system-cluster-critical
    # schedulerName: default-scheduler
    # nodeSelector:
    #   app.kubernetes.io/component: pd
    # annotations:
    #   node.kubernetes.io/instance-type: some-vm-type
    # tolerations:
    #   - effect: NoSchedule
    #     key: dedicated
    #     operator: Equal
    #     value: pd
    # configUpdateStrategy: RollingUpdate
    # statefulSetUpdateStrategy: RollingUpdate

    ## List of environment variables to set in the container
    ## Note that the following env names cannot be used and will be overwritten by TiDB Operator builtin envs
    ##   - NAMESPACE
    ##   - TZ
    ##   - SERVICE_NAME
    ##   - PEER_SERVICE_NAME
    ##   - HEADLESS_SERVICE_NAME
    ##   - SET_NAME
    ##   - HOSTNAME
    ##   - CLUSTER_NAME
    ##   - POD_NAME
    ##   - BINLOG_ENABLED
    ##   - SLOW_LOG_FILE
    ## Ref: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
    # env:
    # - name: MY_ENV_1
    #   value: value1
    # - name: MY_ENV_2
    #   valueFrom:
    #     fieldRef:
    #       fieldPath: status.myEnv2

    ## Custom sidecar containers can be injected into the PD pods,
    ## which can act as a logging/tracing agent or for any other use case
    # additionalContainers:
    # - name: myCustomContainer
    #   image: ubuntu

    ## custom additional volumes in PD pods
    # additionalVolumes:
    # # specify volume types that are supported by Kubernetes, Ref: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#types-of-persistent-volumes
    # - name: nfs
    #   nfs:
    #     server: 192.168.0.2
    #     path: /nfs

    ## custom additional volume mounts in PD pods
    # additionalVolumeMounts:
    # # this must match `name` in `additionalVolumes`
    # - name: nfs
    #   mountPath: /nfs

    ## Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request.
    ## Ref: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#hook-handler-execution
    # terminationGracePeriodSeconds: 30

    ## PodSecurityContext holds pod-level security attributes and common container settings.
    ## Ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
    # podSecurityContext:
    #   sysctls:
    #   - name: net.core.somaxconn
    #     value: "32768"

    ## when TLS cluster feature is enabled, TiDB Operator will automatically mount the cluster client certificates if mountClusterClientSecret is set to true
    ## Defaults to false
    ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/configure-a-tidb-cluster#mountclusterclientsecret
    mountClusterClientSecret: false

    ## The storageClassName of the persistent volume for PD data storage.
    storageClassName: "cna-reconplfm-dev-nas"

    ## defines additional volumes for which PVCs will be created by StatefulSet controller
    # storageVolumes:
    #   # this will be suffix of PVC names in VolumeClaimTemplates of PD StatefulSet
    # - name: volumeName
    #   # specify this to use special storageClass for this volume, default to component-level `storageClassName`
    #   storageClassName: local-storage
    #   # storage request of PVC
    #   storageSize: 1Gi
    #   # mount path of the PVC
    #   mountPath: /some/path

    ## Subdirectory within the volume to store PD Data. By default, the data
    ## is stored in the root directory of volume which is mounted at
    ## /var/lib/pd. Specifying this will change the data directory to a subdirectory,
    ## e.g. /var/lib/pd/data if you set the value to "data".
    ## It's dangerous to change this value for a running cluster as it will
    ## upgrade your cluster to use a new storage directory.
    ## Defaults to "" (volume's root).
    # dataSubDir: ""

    ## Affinity for pod scheduling
    ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
    # affinity:
    #   podAntiAffinity:
    #     # prefer not to run pd pods on the same node which runs tidb/tikv pods
    #     preferredDuringSchedulingIgnoredDuringExecution:
    #     - podAffinityTerm:
    #         labelSelector:
    #           matchExpressions:
    #           - key: app.kubernetes.io/component
    #             operator: In
    #             values:
    #             - tidb
    #             - tikv
    #         topologyKey: kubernetes.io/hostname
    #       weight: 100
    #     # require not to run PD pods on nodes where there's already a PD pod running
    #     # if setting this, you must ensure that at least `replicas` nodes are available in the cluster
    #     requiredDuringSchedulingIgnoredDuringExecution:
    #     - labelSelector:
    #         matchExpressions:
    #         - key: app.kubernetes.io/component
    #           operator: In
    #           values:
    #           - pd
    #       topologyKey: kubernetes.io/hostname

    ## set a different tidb client TLS cert secret name for TiDB Dashboard than the default ${clusterName}-tidb-client-secret
    ## only useful when TLS is enabled for TiDB server
    ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/enable-tls-for-mysql-client
    # tlsClientSecretName: custom-tidb-client-secret-name

    ## TopologySpreadConstraints for pod scheduling, will overwrite cluster level spread constraints setting
    ## Ref: pkg/apis/pingcap/v1alpha1/types.go#TopologySpreadConstraint
    # topologySpreadConstraints:
    # - topologyKey: topology.kubernetes.io/zone

  tidb:
    ############################
    # Basic TiDB Configuration #
    ############################

    ## Base image of the component
    baseImage: pingcap/tidb

    ## tidb-server Configuration
    ## Ref: https://docs.pingcap.com/tidb/stable/tidb-configuration-file
    config: |
      [performance]
        tcp-keep-alive = true

    ## The desired replicas
    replicas: 2

    ## max inprogress failover TiDB pod counts
    maxFailoverCount: 0

    ## describes the compute resource requirements.
    ## Ref: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
    # requests:
    #   cpu: 1000m
    #   memory: 1Gi
    # limits:
    #   cpu: 2000m
    #   memory: 2Gi

    ## defines Kubernetes service for tidb-server
    ## If you are in a public cloud environment, you can use cloud LoadBalancer to access the TiDB service
    ## if you are in a private cloud environment, you can use Ingress or NodePort, or ClusterIP and port forward (only for development/test)
    ## you can set mysqlNodePort and statusNodePort to expose server/status service to the given NodePort
    service:
      type: ClusterIP
      # Ref: https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip
      # externalTrafficPolicy: Local
      # # which NodePort to expose 4000 (mysql) port of tidb-server, only effective when type=LoadBalancer/NodePort
      # mysqlNodePort: 30020
      # # whether to export the status port, defaults to true
      # exposeStatus: true
      # # which NodePort to expose 10080 (status) port of tidb-server, only effective when type=LoadBalancer/NodePort and exposeStatus=true
      # statusNodePort: 30040

    ###############################
    # Advanced TiDB Configuration #
    ###############################

    ## The following block overwrites TiDB cluster-level configurations in `spec`
    # version: "v6.1.0"
    # imagePullPolicy: IfNotPresent
    # imagePullSecrets:
    # - name: secretName
    # hostNetwork: false
    # serviceAccount: advanced-tidb-tidb
    # priorityClassName: system-cluster-critical
    # schedulerName: default-scheduler
    # nodeSelector:
    #   app.kubernetes.io/component: tidb
    # annotations:
    #   node.kubernetes.io/instance-type: some-vm-type
    # tolerations:
    #   - effect: NoSchedule
    #     key: dedicated
    #     operator: Equal
    #     value: tidb
    # configUpdateStrategy: RollingUpdate
    # statefulSetUpdateStrategy: RollingUpdate

    ## List of environment variables to set in the container
    ## Note that the following env names cannot be used and will be overwritten by TiDB Operator builtin envs
    ##   - NAMESPACE
    ##   - TZ
    ##   - SERVICE_NAME
    ##   - PEER_SERVICE_NAME
    ##   - HEADLESS_SERVICE_NAME
    ##   - SET_NAME
    ##   - HOSTNAME
    ##   - CLUSTER_NAME
    ##   - POD_NAME
    ##   - BINLOG_ENABLED
    ##   - SLOW_LOG_FILE
    ## Ref: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
    # env:
    # - name: MY_ENV_1
    #   value: value1
    # - name: MY_ENV_2
    #   valueFrom:
    #     fieldRef:
    #       fieldPath: status.myEnv2

    ## Custom sidecar containers can be injected into the TiDB pods,
    ## which can act as a logging/tracing agent or for any other use case
    # additionalContainers:
    # - name: myCustomContainer
    #   image: ubuntu

    ## custom additional volumes in TiDB pods
    # additionalVolumes:
    # # specify volume types that are supported by Kubernetes, Ref: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#types-of-persistent-volumes
    # - name: nfs
    #   nfs:
    #     server: 192.168.0.2
    #     path: /nfs

    ## custom additional volume mounts in TiDB pods
    # additionalVolumeMounts:
    # # this must match `name` in `additionalVolumes`
    # - name: nfs
    #   mountPath: /nfs

    ## Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request.
    ## Ref: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#hook-handler-execution
    # terminationGracePeriodSeconds: 30

    ## PodSecurityContext holds pod-level security attributes and common container settings.
    ## Ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
    # podSecurityContext:
    #   sysctls:
    #   - name: net.ipv4.tcp_keepalive_time
    #     value: "300"
    #   - name: net.ipv4.tcp_keepalive_intvl
    #     value: "75"
    #   - name: net.core.somaxconn
    #     value: "32768"

    ## probe tidb-server readiness
    ## valid type values:
    ##   - `tcp`, which uses Kubernetes TCPSocketAction to probe the 4000 tcp port of the pod
    ##   - `command`, which uses curl to access the /status path on port 10080 of the pod
    ## This is supported from TiDB Operator v1.1.7
    # readinessProbe:
    #   # The `command` type is only supported after tidb v4.0.9, ref: https://github.com/pingcap/tidb/pull/20694
    #   type: command

    ## when enabled, TiDB will accept TLS encrypted connections from MySQL client
    ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/enable-tls-for-mysql-client
    # tlsClient:
    #   enabled: true
    #   disableClientAuthn: false
    #   skipInternalClientCA: false

    ## binlogEnabled will automatically be true if Pump is enabled, otherwise false
    ## set this manually only if you really know what you are doing
    ## MANUAL CONFIG NOT RECOMMENDED
    # binlogEnabled: false

    ## if enabled, the slow log will be shown in a separate sidecar container
    # separateSlowLog: true
    # slowLogVolumeName: ""
    ## configures separate sidecar container, where `image` & `imagePullPolicy` will be overwritten by
    ## the same field in `TidbCluster.helper`
    # slowLogTailer:
    #   requests:
    #     cpu: 1000m
    #     memory: 1Gi
    #   limits:
    #     cpu: 2000m
    #     memory: 2Gi
    #   image: busybox
    #   imagePullPolicy: IfNotPresent

    ## The storageClassName of the persistent volume for TiDB data storage.
    storageClassName: "cna-reconplfm-dev-nas"

    ## defines additional volumes for which PVCs will be created by StatefulSet controller
    # storageVolumes:
    #   # this will be suffix of PVC names in VolumeClaimTemplates of TiDB StatefulSet
    # - name: volumeName
    #   # specify this to use special storageClass for this volume, default to component-level `storageClassName`
    #   storageClassName: local-storage
    #   # storage request of PVC
    #   storageSize: 1Gi
    #   # mount path of the PVC
    #   mountPath: /some/path

    ## config Kubernetes container lifecycle hooks for tidb-server pods
    ## Ref: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
    # lifecycle:
    #   postStart:
    #     exec:
    #       command:
    #       - echo
    #       - "postStart"
    #   preStop:
    #     exec:
    #       command:
    #       - echo
    #       - "preStop"

    ## Affinity for pod scheduling
    ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
    # affinity:
    #   podAntiAffinity:
    #     preferredDuringSchedulingIgnoredDuringExecution:
    #     - podAffinityTerm:
    #         labelSelector:
    #           matchExpressions:
    #           - key: app.kubernetes.io/component
    #             operator: In
    #             values:
    #             - pd
    #             - tikv
    #         topologyKey: kubernetes.io/hostname
    #       weight: 100
    #     # require not to run TiDB pods on nodes where there's already a TiDB pod running
    #     # if setting this, you must ensure that at least `replicas` nodes are available in the cluster
    #     requiredDuringSchedulingIgnoredDuringExecution:
    #     - labelSelector:
    #         matchExpressions:
    #         - key: app.kubernetes.io/component
    #           operator: In
    #           values:
    #           - tidb
    #       topologyKey: kubernetes.io/hostname

    ## TopologySpreadConstraints for pod scheduling, will overwrite cluster level spread constraints setting
    ## Ref: pkg/apis/pingcap/v1alpha1/types.go#TopologySpreadConstraint
    # topologySpreadConstraints:
    # - topologyKey: topology.kubernetes.io/zone

  tikv:
    ############################
    # Basic TiKV Configuration #
    ############################

    ## Base image of the component
    baseImage: pingcap/tikv

    ## tikv-server configuration
    ## Ref: https://docs.pingcap.com/tidb/stable/tikv-configuration-file
    config: |
      log-level = "info"

    ## The desired replicas
    replicas: 3

    ## max inprogress failover TiKV pod counts
    maxFailoverCount: 0

    ## describes the compute resource requirements.
    ## Ref: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
    requests:
    #   cpu: 1000m
    #   memory: 1Gi
      storage: 50Gi
    # limits:
    #   cpu: 2000m
    #   memory: 2Gi
    #   # setting `storage` here will add `--capacity` arg to tikv-server
    #   storage: 10Gi

    ###############################
    # Advanced TiKV Configuration #
    ###############################

    ## The following block overwrites TiDB cluster-level configurations in `spec`
    # version: "v6.1.0"
    # imagePullPolicy: IfNotPresent
    # imagePullSecrets:
    # - name: secretName
    # hostNetwork: false
    # serviceAccount: advanced-tidb-tikv
    # priorityClassName: system-cluster-critical
    # schedulerName: default-scheduler
    # nodeSelector:
    #   app.kubernetes.io/component: tikv
    # annotations:
    #   node.kubernetes.io/instance-type: some-vm-type
    # tolerations:
    #   - effect: NoSchedule
    #     key: dedicated
    #     operator: Equal
    #     value: tikv
    # configUpdateStrategy: RollingUpdate
    # statefulSetUpdateStrategy: RollingUpdate

    ## List of environment variables to set in the container
    ## Note that the following env names cannot be used and will be overwritten by TiDB Operator builtin envs
    ##   - NAMESPACE
    ##   - TZ
    ##   - SERVICE_NAME
    ##   - PEER_SERVICE_NAME
    ##   - HEADLESS_SERVICE_NAME
    ##   - SET_NAME
    ##   - HOSTNAME
    ##   - CLUSTER_NAME
    ##   - POD_NAME
    ##   - BINLOG_ENABLED
    ##   - SLOW_LOG_FILE
    ## Ref: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
    # env:
    # - name: MY_ENV_1
    #   value: value1
    # - name: MY_ENV_2
    #   valueFrom:
    #     fieldRef:
    #       fieldPath: status.myEnv2

    ## Custom sidecar containers can be injected into the TiKV pods,
    ## which can act as a logging/tracing agent or for any other use case
    # additionalContainers:
    # - name: myCustomContainer
    #   image: ubuntu

    ## custom additional volumes in TiKV pods
    # additionalVolumes:
    # # specify volume types that are supported by Kubernetes, Ref: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#types-of-persistent-volumes
    # - name: nfs
    #   nfs:
    #     server: 192.168.0.2
    #     path: /nfs

    ## custom additional volume mounts in TiKV pods
    # additionalVolumeMounts:
    # # this must match `name` in `additionalVolumes`
    # - name: nfs
    #   mountPath: /nfs

    ## Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request.
    ## Ref: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#hook-handler-execution
    # terminationGracePeriodSeconds: 30

    ## PodSecurityContext holds pod-level security attributes and common container settings.
    ## Ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
    # podSecurityContext:
    #   sysctls:
    #   - name: net.ipv4.tcp_keepalive_time
    #     value: "300"
    #   - name: net.ipv4.tcp_keepalive_intvl
    #     value: "75"
    #   - name: net.core.somaxconn
    #     value: "32768"

    ## when TLS cluster feature is enabled, TiDB Operator will automatically mount the cluster client certificates if mountClusterClientSecret is set to true
    ## Defaults to false
    ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/configure-a-tidb-cluster#mountclusterclientsecret
    mountClusterClientSecret: false

    ## if enabled, the RocksDB log will be shown in a separate sidecar container
    # separateRocksDBLog: true
    ## rocksDBLogVolumeName is only applicable when separateRocksDBLog is enabled.
    ## if separateRocksDBLog is enabled, rocksDBLogVolumeName can be set or not.
    ## if rocksDBLogVolumeName is not set, the rocksdb log is saved in the PV of TiKV data, which may increase the IO pressure of the data disk.
    ## if rocksDBLogVolumeName is set, rocksDBLogVolumeName should be defined in storageVolumes or additionalVolumes to use another PV.
    ## you may need to change the `rocksdb.info-log-dir` to save the logs in the separate PV.
    ## Ref: https://github.com/tikv/tikv/blob/master/etc/config-template.toml
    # rocksDBLogVolumeName: ""

    ## if enabled, the Raft log will be shown in a separate sidecar container
    # separateRaftLog: true
    ## raftLogVolumeName is only applicable when separateRaftLog is enabled.
    ## if separateRaftLog is enabled, raftLogVolumeName can be set or not.
    ## if raftLogVolumeName is not set, the separated raftdb log is saved in the PV of TiKV data, which may increase the IO pressure of the data disk.
    ## if raftLogVolumeName is set, raftLogVolumeName should be defined in storageVolumes or additionalVolumes to use another PV.
    ## you may need to change the `raftdb.info-log-dir` to save the logs in the separate PV.
    ## Ref: https://github.com/tikv/tikv/blob/master/etc/config-template.toml
    # raftLogVolumeName: ""

    ## configures RocksDB/Raft log sidecar container resource requirements
    # logTailer:
    #   requests:
    #     cpu: 1000m
    #     memory: 1Gi
    #   limits:
    #     cpu: 2000m
    #     memory: 2Gi

    ## The storageClassName of the persistent volume for TiKV data storage.
    storageClassName: "cna-reconplfm-dev-nas"

    ## defines additional volumes for which PVCs will be created by StatefulSet controller
    # storageVolumes:
    #   # this will be suffix of PVC names in VolumeClaimTemplates of TiKV StatefulSet
    # - name: volumeName
    #   # specify this to use special storageClass for this volume, default to component-level `storageClassName`
    #   storageClassName: local-storage
    #   # storage request of PVC
    #   storageSize: 1Gi
    #   # mount path of the PVC
    #   mountPath: /some/path

    ## run TiKV container in privileged mode
    ## Processes in privileged containers are essentially equivalent to root on the host
    ## NOT RECOMMENDED in production environment
    ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#privileged
    # privileged: false

    ## if set to true, TiDB Operator will recover newly created TiKV pods due to automatic failover
    # recoverFailover: true

    ## Subdirectory within the volume to store TiKV Data. By default, the data
    ## is stored in the root directory of volume which is mounted at /var/lib/tikv.
    ## Specifying this will change the data directory to a subdirectory, e.g.
    ## /var/lib/tikv/data if you set the value to "data".
    ## It's dangerous to change this value for a running cluster as it will
    ## upgrade your cluster to use a new storage directory.
    ## Defaults to "" (volume's root).
    # dataSubDir: ""

    ## defines the timeout for region leader eviction in golang `Duration` format, if raft region leaders are not transferred to other stores after this duration, TiDB Operator will delete the Pod forcibly.
    # evictLeaderTimeout: 3m

    ## Affinity for pod scheduling
    ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
    # affinity:
    #   podAntiAffinity:
    #     preferredDuringSchedulingIgnoredDuringExecution:
    #     - podAffinityTerm:
    #         labelSelector:
    #           matchExpressions:
    #           - key: app.kubernetes.io/component
    #             operator: In
    #             values:
    #             - tidb
    #             - pd
    #         topologyKey: kubernetes.io/hostname
    #       weight: 100
    #     # require not to run TiKV pods on nodes where there's already a TiKV pod running
    #     # if setting this, you must ensure that at least `replicas` nodes are available in the cluster
    #     requiredDuringSchedulingIgnoredDuringExecution:
    #     - labelSelector:
    #         matchExpressions:
    #         - key: app.kubernetes.io/component
    #           operator: In
    #           values:
    #           - tikv
    #       topologyKey: kubernetes.io/hostname

    ## TopologySpreadConstraints for pod scheduling, will overwrite cluster level spread constraints setting
    ## Ref: pkg/apis/pingcap/v1alpha1/types.go#TopologySpreadConstraint
    # topologySpreadConstraints:
    # - topologyKey: topology.kubernetes.io/zone

  ## Deploy TiDB Binlog of a TiDB cluster
  ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/deploy-tidb-binlog/#deploy-pump
  # pump:
  #   baseImage: pingcap/tidb-binlog
  #   version: "v6.1.0"
  #   replicas: 1
  #   storageClassName: local-storage
  #   requests:
  #     cpu: 1000m
  #     memory: 1Gi
  #     storage: 1Gi
  #   limits:
  #     cpu: 2000m
  #     memory: 2Gi
  #   imagePullPolicy: IfNotPresent
  #   imagePullSecrets:
  #   - name: secretName
  #   hostNetwork: false
  #   serviceAccount: advanced-tidb-pump
  #   priorityClassName: system-cluster-critical
  #   schedulerName: default-scheduler
  #   nodeSelector:
  #     app.kubernetes.io/component: pump
  #   annotations:
  #     node.kubernetes.io/instance-type: some-vm-type
  #   tolerations: {}
  #   configUpdateStrategy: RollingUpdate
  #   statefulSetUpdateStrategy: RollingUpdate
  #   podSecurityContext: {}
  #   env: []
  #   additionalContainers: []
  #   additionalVolumes: []
  #   additionalVolumeMounts: []
  #   terminationGracePeriodSeconds: 30
  #   # Ref: https://docs.pingcap.com/tidb/stable/tidb-binlog-configuration-file#pump
  #   config: |
  #     gc = 7
  #   # TopologySpreadConstraints for pod scheduling, will overwrite cluster level spread constraints setting
  #   # Ref: pkg/apis/pingcap/v1alpha1/types.go#TopologySpreadConstraint
  #   topologySpreadConstraints:
  #   - topologyKey: topology.kubernetes.io/zone

  ## TiCDC is a tool for replicating the incremental data of TiDB
  ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/deploy-ticdc/
  # ticdc:
  #   baseImage: pingcap/ticdc
  #   version: "v6.1.0"
  #   replicas: 3
  #   storageClassName: local-storage
  #   requests:
  #     cpu: 1000m
  #     memory: 1Gi
  #   limits:
  #     cpu: 2000m
  #     memory: 2Gi
  #   imagePullPolicy: IfNotPresent
  #   imagePullSecrets:
  #   - name: secretName
  #   hostNetwork: false
  #   serviceAccount: advanced-tidb-ticdc
  #   priorityClassName: system-cluster-critical
  #   schedulerName: default-scheduler
  #   nodeSelector:
  #     app.kubernetes.io/component: ticdc
  #   annotations:
  #     node.kubernetes.io/instance-type: some-vm-type
  #   tolerations: {}
  #   configUpdateStrategy: RollingUpdate
  #   statefulSetUpdateStrategy: RollingUpdate
  #   podSecurityContext: {}
  #   env: []
  #   additionalContainers: []
  #   storageVolumes: []
  #   additionalVolumes: []
  #   additionalVolumeMounts: []
  #   terminationGracePeriodSeconds: 30
  #   # Ref: https://docs.pingcap.com/tidb/stable/deploy-ticdc#add-ticdc-to-an-existing-tidb-cluster-using-binary-not-recommended
  #   config:
  #     timezone: UTC
  #     gcTTL: 86400
  #     logLevel: info
  #     logFile: /dev/stderr
  #   # TopologySpreadConstraints for pod scheduling, will overwrite cluster level spread constraints setting
  #   # Ref: pkg/apis/pingcap/v1alpha1/types.go#TopologySpreadConstraint
  #   topologySpreadConstraints:
  #   - topologyKey: topology.kubernetes.io/zone

  ## TiFlash is the columnar storage extension of TiKV
  ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/deploy-tiflash/
  # tiflash:
  #   ###############################
  #   # Basic TiFlash Configuration #
  #   ###############################
  #   baseImage: pingcap/tiflash
  #   version: "v6.1.0"
  #   replicas: 1
  #   # limits:
  #   #   cpu: 2000m
  #   #   memory: 2Gi
  #   imagePullPolicy: IfNotPresent
  #   imagePullSecrets:
  #   - name: secretName

  #   ##################################
  #   # Advanced TiFlash Configuration #
  #   ##################################
  #   maxFailoverCount: 0
  #   hostNetwork: false
  #   serviceAccount: advanced-tidb-tiflash
  #   priorityClassName: system-cluster-critical
  #   schedulerName: default-scheduler
  #   nodeSelector:
  #     app.kubernetes.io/component: tiflash
  #   annotations:
  #     node.kubernetes.io/instance-type: some-vm-type
  #   tolerations: {}
  #   configUpdateStrategy: RollingUpdate
  #   statefulSetUpdateStrategy: RollingUpdate
  #   podSecurityContext: {}
  #   env: []
  #   additionalContainers: []
  #   additionalVolumes: []
  #   additionalVolumeMounts: []
  #   terminationGracePeriodSeconds: 30
  #   storageClaims:
  #     - resources:
  #         requests:
  #           # specify PVC storage used for TiFlash
  #           storage: 1Gi
  #       # specify PVC storage class
  #       storageClassName: local-storage
  #   # run TiFlash container in privileged mode
  #   # Processes in privileged containers are essentially equivalent to root on the host
  #   # NOT RECOMMENDED in production environment
  #   # Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#privileged
  #   privileged: false

  #   # if set to true, TiDB Operator will recover newly created TiFlash pods due to automatic failover
  #   recoverFailover: true

  #   # configures serverlog/errorlog/clusterlog sidecar container resource requirements
  #   # logTailer:
  #   #   requests:
  #   #     cpu: 1000m
  #   #     memory: 1Gi
  #   #   limits:
  #   #     cpu: 2000m
  #   #     memory: 2Gi
  #   # configures init container resource requirements
  #   initializer:
  #     requests:
  #       cpu: 1000m
  #       memory: 1Gi
  #     limits:
  #       cpu: 2000m
  #       memory: 2Gi

  #   # TOML format configuration
  #   # Ref: https://docs.pingcap.com/tidb/dev/tiflash-configuration
  #   config:
  #     # configure the configuration file for TiFlash process
  #     config: |
  #       [logger]
  #         log = "/data0/logs/somelog"
  #     # configure the configuration file for TiFlash Proxy process
  #     proxy: |
  #       [security]
  #         cert-allowed-cn = "CNNAME"
  #   # TopologySpreadConstraints for pod scheduling, will overwrite the cluster level spread constraints setting
  #   # Ref: pkg/apis/pingcap/v1alpha1/types.go#TopologySpreadConstraint
  #   topologySpreadConstraints:
  #   - topologyKey: topology.kubernetes.io/zone

How did you deploy it? I'd recommend trying Helm.

I don't see anything wrong with the configuration file.

I followed the steps in the official documentation: TiDB Operator was deployed with Helm,
and the TiDB cluster itself was deployed with kubectl apply -f ${cluster_name} -n ${namespace}. Do you mean deploying the TiDB cluster with Helm as well? Could you share a link to the instructions?

https://docs.pingcap.com/zh/tidb-in-kubernetes/v1.4/deploy-tidb-operator
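That doc covers installing TiDB Operator itself with Helm; roughly, the steps are as follows (a sketch only, with Operator v1.4.0 assumed here — adjust to the release you actually use):

+ kubectl create -f https://raw.githubusercontent.com/pingcap/tidb-operator/v1.4.0/manifests/crd.yaml
+ helm repo add pingcap https://charts.pingcap.org/
+ helm repo update
+ helm install --namespace tidb-admin --create-namespace tidb-operator pingcap/tidb-operator --version v1.4.0

The TidbCluster itself is still applied with kubectl the way you already did.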

You need to get the Kubernetes environment right first; everything has to line up with the prerequisites.
https://docs.pingcap.com/zh/tidb-in-kubernetes/v1.4/prerequisites