TidbCluster deployment fails with TiKVStoreNotUp

[TiDB Environment] Test environment
[TiDB Version] v6.1.0
[Problem Encountered] Cluster deployment fails
[Symptoms and Impact]
While deploying the cluster, it keeps reporting "TiKV store(s) are not up", even though the TiKV pods are in Running state.


+ kubectl get pod
NAME                                                                   READY   STATUS             RESTARTS   AGE
advanced-tidb-discovery-6c65bf49fb-kmmmn                          1/1     Running            0          62m
advanced-tidb-pd-0                                                1/1     Running            0          62m
advanced-tidb-tikv-0                                              1/1     Running            0          62m
advanced-tidb-tikv-1                                              1/1     Running            0          22m
advanced-tidb-tikv-2                                              1/1     Running            0          22m

I also went into the TiKV pods and found nothing abnormal in their logs.

The tidb-controller-manager log keeps printing the following:

I1101 08:44:09.534955       1 tikv_member_manager.go:808] TiKV of Cluster frs-dev/advanced-tidb not bootstrapped yet
I1101 08:44:09.555285       1 tikv_member_manager.go:906] TiKV of Cluster frs-dev/advanced-tidb is not bootstrapped yet, no need to set store labels
I1101 08:44:09.555981       1 tidb_cluster_controller.go:127] TidbCluster: frs-dev/advanced-tidb, still need sync: TidbCluster: [frs-dev/advanced-tidb], waiting for TiKV cluster running, requeuing

Does anyone know what the cause is?

TiKV has no labels set.

TiKV of Cluster frs-dev/advanced-tidb is not bootstrapped yet, no need to set store labels
need to set store labels

How do I set these labels?

The way I read it, the log says "is not bootstrapped yet, no need to set store labels", i.e. the labels do not need to be set.
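For reference, if store labels ever did need to be set with TiDB Operator, they are usually not set by hand: you configure location-labels in the PD config of the TidbCluster spec and put matching labels on the Kubernetes nodes, and the operator then attaches those node labels to each TiKV store. A minimal sketch (assuming node labels named zone and host exist):

  pd:
    config: |
      [replication]
        location-labels = ["zone", "host"]

As the log says, though, that is not what is blocking this deployment.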

Then have you tried accessing it?

Do you mean accessing the database? I can't: the TiDB service hasn't even started. The deployment is stuck at this point, so the TiDB service further down the line was never created.

+ kubectl get tidbclusters -n frs-dev
NAME            READY   PD                  STORAGE   READY   DESIRE   TIKV   STORAGE   READY   DESIRE   TIDB   READY   DESIRE   AGE
advanced-tidb   False   pingcap/pd:v6.1.0   10Gi      1       1               50Gi      3       3                       2        62m

Check the scheduling events. Where is it failing?

Er, what do you mean by scheduling events? How do I check them?

You're already using Kubernetes... shouldn't you look into how pods get created?

Put another way: you should at least check the pod creation logs to see why it keeps getting stuck here.
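For example, a pod's scheduling and creation events show up in the Events section of kubectl describe, and you can also list the events for the whole namespace (a quick sketch, assuming namespace frs-dev):

+ kubectl describe pod advanced-tidb-tikv-0 -n frs-dev
+ kubectl get events -n frs-dev --sort-by=.metadata.creationTimestamp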

I've checked the pod logs and there's nothing abnormal; it's only tidb-controller-manager reporting that the TiKV cluster isn't up.

The TiDB StatefulSet was never created. My guess is that TiDB Operator sees TiKV as not ready and therefore doesn't move on to creating the next pods. What I can't figure out is why it considers the TiKV cluster unhealthy; I've checked the logs of all the TiKV pods and they look normal.

+ kubectl get statefulsets -n frs-dev
NAME                 READY   AGE
advanced-tidb-pd     1/1     62m
advanced-tidb-tikv   3/3     62m

I'd check the network first: can tidb-controller-manager reach the TiKV pods?
And can the PD and TiKV pods reach each other?
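A quick connectivity check from inside the cluster could look like this (just a sketch; it assumes curl is available in the TiKV image and uses the default PD client port 2379):

+ kubectl exec -n frs-dev advanced-tidb-tikv-0 -- curl -s http://advanced-tidb-pd-0.advanced-tidb-pd-peer.frs-dev.svc:2379/pd/api/v1/members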

The network is fine. The TiKV logs show it has already connected to PD.

frs-dev/advanced-tidb not bootstrapped yet

Then that's exactly the problem.

TiKV of Cluster frs-dev/advanced-tidb not bootstrapped yet

Er, I just don't know why it thinks TiKV hasn't finished bootstrapping, or what that bootstrap step actually involves. The TiKV pods have no error logs either, only a heartbeat log entry every 10 minutes.
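One way to see why the operator still treats TiKV as not bootstrapped is to ask PD which stores it has registered; the "not bootstrapped yet" message generally means PD does not yet have any TiKV store in the Up state. A minimal check, assuming pd-ctl is shipped in the pingcap/pd image at /pd-ctl:

+ kubectl exec -n frs-dev advanced-tidb-pd-0 -- /pd-ctl -u http://127.0.0.1:2379 store
+ kubectl exec -n frs-dev advanced-tidb-pd-0 -- /pd-ctl -u http://127.0.0.1:2379 health

If the store list is empty, or the stores are not Up, the problem is on the TiKV-to-PD registration side rather than in the Kubernetes objects themselves.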

Look into advanced-tidb then. That's the TiDB instance name, right?

That's the cluster name; there is no pod with that name.

https://docs.pingcap.com/zh/tidb-in-kubernetes/stable/deploy-failures#pod-处于-pending-状态

Check the logs of all the pods; otherwise there's no way to tell.

The requirements on the underlying Kubernetes environment are fairly involved, too.
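If it helps, here is a quick way to dump the recent logs of every pod in the namespace plus the operator itself (a sketch, assuming the operator is installed in the tidb-admin namespace):

+ for p in $(kubectl get pods -n frs-dev -o name); do echo "== $p"; kubectl logs -n frs-dev "$p" --tail=50; done
+ kubectl logs -n tidb-admin deploy/tidb-controller-manager --tail=100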

Well, every pod that was created is in Running state, and their logs show no notable errors. But the TiDB pods were never created, and neither was their corresponding StatefulSet.

Here is the detailed status of the cluster:

+ kubectl describe tidbclusters -n frs-dev

Name:         advanced-tidb
Namespace:    frs-dev
Labels:       <none>
Annotations:  <none>
API Version:  pingcap.com/v1alpha1
Kind:         TidbCluster
Metadata:
  Creation Timestamp:  2022-11-01T07:32:45Z
  Generation:          17
  Resource Version:    3871439862
  Self Link:           /apis/pingcap.com/v1alpha1/namespaces/frs-dev/tidbclusters/advanced-tidb
  UID:                 dae58b62-7e57-4ac1-96ae-8c9b1680bb57
Spec:
  Config Update Strategy:  RollingUpdate
  Discovery:
  Enable Dynamic Configuration:  true
  Enable PV Reclaim:             false
  Helper:
    Image:            alpine:3.16.0
  Image Pull Policy:  IfNotPresent
  Node Selector:
    Project:  RECONPLFM
  Pd:
    Base Image:  pingcap/pd
    Config:      lease = 1800

[dashboard]
  internal-proxy = true

    Max Failover Count:           0
    Mount Cluster Client Secret:  false
    Replicas:                     1
    Requests:
      Storage:           10Gi
    Storage Class Name:  cna-reconplfm-dev-nas
  Pv Reclaim Policy:     Retain
  Tidb:
    Base Image:  pingcap/tidb
    Config:      [log]
  [log.file]
    max-backups = 3

[performance]
  tcp-keep-alive = true

    Max Failover Count:  0
    Replicas:            2
    Service:
      Type:              ClusterIP
    Storage Class Name:  cna-reconplfm-dev-nas
  Tikv:
    Base Image:  pingcap/tikv
    Config:      log-level = "info"

    Max Failover Count:           0
    Mount Cluster Client Secret:  false
    Replicas:                     3
    Requests:
      Storage:           50Gi
    Storage Class Name:  cna-reconplfm-dev-nas
  Timezone:              UTC
  Tls Cluster:
  Tolerations:
    Effect:    NoSchedule
    Key:       RECONPLFM
    Operator:  Equal
  Version:     v6.1.0
Status:
  Cluster ID:  7160932248881483001
  Conditions:
    Last Transition Time:  2022-11-01T07:32:45Z
    Last Update Time:      2022-11-01T07:33:07Z
    Message:               TiKV store(s) are not up
    Reason:                TiKVStoreNotUp
    Status:                False
    Type:                  Ready
  Pd:
    Image:  pingcap/pd:v6.1.0
    Leader:
      Client URL:            http://advanced-tidb-pd-0.advanced-tidb-pd-peer.frs-dev.svc:2379
      Health:                true
      Id:                    11005745135123337789
      Last Transition Time:  2022-11-01T07:33:06Z
      Name:                  advanced-tidb-pd-0
    Members:
      advanced-tidb-pd-0:
        Client URL:            http://advanced-tidb-pd-0.advanced-tidb-pd-peer.frs-dev.svc:2379
        Health:                true
        Id:                    11005745135123337789
        Last Transition Time:  2022-11-01T07:33:06Z
        Name:                  advanced-tidb-pd-0
    Phase:                     Normal
    Stateful Set:
      Collision Count:      0
      Current Replicas:     1
      Current Revision:     advanced-tidb-pd-59466586bc
      Observed Generation:  1
      Ready Replicas:       1
      Replicas:             1
      Update Revision:      advanced-tidb-pd-59466586bc
      Updated Replicas:     1
    Synced:                 true
  Pump:
  Ticdc:
  Tidb:
  Tiflash:
  Tikv:
    Phase:  Normal
    Stateful Set:
      Collision Count:      0
      Current Replicas:     3
      Current Revision:     advanced-tidb-tikv-66f457c77b
      Observed Generation:  3
      Ready Replicas:       3
      Replicas:             3
      Update Revision:      advanced-tidb-tikv-66f457c77b
      Updated Replicas:     3
    Synced:                 true
Events:                     <none>

Here is my cluster's configuration file. Could someone take a look and see whether anything is wrong?

apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: advanced-tidb
  namespace: frs-dev

spec:
  #######################
  # Basic Configuration #
  #######################

  ## TiDB cluster version
  version: "v6.1.0"

  ## Time zone of TiDB cluster Pods
  timezone: UTC

  ## serviceAccount specifies the service account for PD/TiDB/TiKV/TiFlash/Pump/TiCDC components in this TidbCluster
  # serviceAccount: advanced-tidb

  ## ConfigUpdateStrategy determines how the configuration change is applied to the cluster.
  ## Valid values are `InPlace` and `RollingUpdate`
  ##   UpdateStrategy `InPlace` will update the ConfigMap of configuration in-place and an extra rolling update of the
  ##   cluster component is needed to reload the configuration change.
  ##   UpdateStrategy `RollingUpdate` will create a new ConfigMap with the new configuration and rolling update the
  ##   related components to use the new ConfigMap, that is, the new configuration will be applied automatically.
  configUpdateStrategy: RollingUpdate

  ## ImagePullPolicy of TiDB cluster Pods
  ## Ref: https://kubernetes.io/docs/concepts/configuration/overview/#container-images
  # imagePullPolicy: IfNotPresent

  ## If private registry is used, imagePullSecrets may be set
  ## You can also set this in service account
  ## Ref: https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
  # imagePullSecrets:
  # - name: secretName

  ## Image used to do miscellaneous tasks as sidecar container, such as:
  ## - execute sysctls when PodSecurityContext is set for some components, requires `sysctl` installed
  ## - tail slow log for tidb, requires `tail` installed
  ## - fill tiflash config template file based on pod ordinal
  helper:
    image: alpine:3.16.0
  # imagePullPolicy: IfNotPresent

  ## Enable PVC/PV reclaim for orphan PVC/PV left by statefulset scale-in.
  ## When set to `true`, PVC/PV that are not used by any tidb cluster pods will be deleted automatically.
  # enablePVReclaim: false

  ## Persistent volume reclaim policy applied to the PV consumed by the TiDB cluster, default to `Retain`.
  ## Note that the reclaim policy Recycle may not be supported by some storage types, e.g. local.
  ## Ref: https://kubernetes.io/docs/tasks/administer-cluster/change-pv-reclaim-policy/
  pvReclaimPolicy: Retain

  ##########################
  # Advanced Configuration #
  ##########################

  ## when deploying a heterogeneous TiDB cluster, you MUST specify the cluster name to join here
  # cluster:
  #   namespace: default
  #   name: tidb-cluster-to-join
  #   clusterDomain: cluster.local

  ## specifying pdAddresses will make PD in this TiDB cluster to join another existing PD cluster
  ## PD will then start with arguments --join= instead of --initial-cluster=
  # pdAddresses:
  #   - http://cluster1-pd-0.cluster1-pd-peer.default.svc:2379
  #   - http://cluster1-pd-1.cluster1-pd-peer.default.svc:2379

  ## Enable mutual TLS connection between TiDB cluster components
  ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/enable-tls-between-components/
  # tlsCluster:
  #   enabled: true

  ## Annotations of TiDB cluster pods, will be merged with component annotation settings.
  # annotations:
  #   node.kubernetes.io/instance-type: some-vm-type
  #   topology.kubernetes.io/region: some-region

  ## NodeSelector of TiDB cluster pods, will be merged with component nodeSelector settings.
  ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/
  nodeSelector:
    project: RECONPLFM

  ## Tolerations are applied to TiDB cluster pods, allowing (but do not require) pods to be scheduled onto nodes with matching taints.
  ## This cluster-level `tolerations` only takes effect when no component-level `tolerations` are set.
  ## e.g. if `pd.tolerations` is not empty, `tolerations` here will be ignored.
  ## Ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
  tolerations:
    - effect: NoSchedule
      key: RECONPLFM
      operator: Equal
    # value: RECONPLFM

  ## Use the node network namespace, default to false
  ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#host-namespaces
  # hostNetwork: false

  ## specify resource requirements for discovery deployment
  # discovery:
  #   requests:
  #     cpu: 1000m
  #     memory: 256Mi
  #   limits:
  #     cpu: 2000m
  #     memory: 1Gi
  #   ## The following block overwrites TiDB cluster-level configurations in `spec`
  #   imagePullPolicy: IfNotPresent
  #   imagePullSecrets: secretName
  #   hostNetwork: false
  #   priorityClassName: system-cluster-critical
  #   schedulerName: default-scheduler
  #   nodeSelector:
  #     app.kubernetes.io/component: discovery
  #   annotations:
  #     node.kubernetes.io/instance-type: some-vm-type
  #   labels: {}
  #   env:
  #     - name: MY_ENV_1
  #       value: value1
  #   affinity: {}
  #   tolerations:
  #     - effect: NoSchedule
  #       key: dedicated
  #       operator: Equal
  #       value: discovery

  ## if true, this tidb cluster is paused and will not be synced by the controller
  # paused: false

  ## SchedulerName of TiDB cluster pods.
  ## If specified, the pods will be scheduled by the specified scheduler.
  ## Can be overwritten by component settings.
  # schedulerName: default-scheduler

  ## PodManagementPolicy default `OrderedReady` for Pump
  ## and default `Parallel` for the other components.
  ## Ref: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#pod-management-policies
  # podManagementPolicy: Parallel

  ## Affinity for pod scheduling, will be overwritten by each cluster component's specific affinity setting
  ## Can refer to PD/TiDB/TiKV affinity settings, and ensure only cluster-scope general settings here
  ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
  # affinity: {}

  ## Specify pod priorities of pods in TidbCluster, default to empty.
  ## Can be overwritten by component settings.
  ## Ref: https://kubernetes.io/docs/concepts/configuration/pod-priority-preemption/
  # priorityClassName: system-cluster-critical

  ## If set to `true`, `--advertise-status-addr` will be appended to the startup parameters of TiKV
  enableDynamicConfiguration: true

  ## Set update strategy of StatefulSet, can be overwritten by the setting of each component.
  ## defaults to RollingUpdate
  ## Ref: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#update-strategies
  # statefulSetUpdateStrategy: RollingUpdate

  ## The identifier of the Pod will be `$(podName).$(serviceName).$(namespace).svc.$(clusterDomain)` when `clusterDomain` is set.
  ## Set this in the case where a TiDB cluster is deployed across multiple Kubernetes clusters. default to empty.
  # clusterDomain: cluster.local

  ## TopologySpreadConstraints for pod scheduling, will be overwritten by each cluster component's specific spread constraints setting
  ## Can refer to PD/TiDB/TiKV/TiCDC/TiFlash/Pump topologySpreadConstraints settings, and ensure only cluster-scope general settings here
  ## Ref: pkg/apis/pingcap/v1alpha1/types.go#TopologySpreadConstraint
  # topologySpreadConstraints:
  # - topologyKey: topology.kubernetes.io/zone

  ###########################
  # TiDB Cluster Components #
  ###########################

  pd:
    ##########################
    # Basic PD Configuration #
    ##########################

    ## Base image of the component
    baseImage: pingcap/pd

    ## pd-server configuration
    ## Ref: https://docs.pingcap.com/tidb/stable/pd-configuration-file
    config: |
      lease = 1800
      [dashboard]
        internal-proxy = true

    ## The desired replicas
    replicas: 1

    ## max inprogress failover PD pod counts
    maxFailoverCount: 0

    ## describes the compute resource requirements and limits.
    ## Ref: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
    requests:
    #   cpu: 1000m
    #   memory: 1Gi
      storage: 10Gi
    # limits:
    #   cpu: 2000m
    #   memory: 2Gi

    ## defines Kubernetes service for pd-server
    ## Ref: https://kubernetes.io/docs/concepts/services-networking/service/
    # service:
    #   type: ClusterIP
    #   annotations:
    #     foo: bar
    #   portName: client

    #############################
    # Advanced PD Configuration #
    #############################

    ## The following block overwrites TiDB cluster-level configurations in `spec`
    # version: "v6.1.0"
    # imagePullPolicy: IfNotPresent
    # imagePullSecrets:
    # - name: secretName
    # hostNetwork: false
    # serviceAccount: advanced-tidb-pd
    # priorityClassName: system-cluster-critical
    # schedulerName: default-scheduler
    # nodeSelector:
    #   app.kubernetes.io/component: pd
    # annotations:
    #   node.kubernetes.io/instance-type: some-vm-type
    # tolerations:
    #   - effect: NoSchedule
    #     key: dedicated
    #     operator: Equal
    #     value: pd
    # configUpdateStrategy: RollingUpdate
    # statefulSetUpdateStrategy: RollingUpdate

    ## List of environment variables to set in the container
    ## Note that the following env names cannot be used and will be overwritten by TiDB Operator builtin envs
    ##   - NAMESPACE
    ##   - TZ
    ##   - SERVICE_NAME
    ##   - PEER_SERVICE_NAME
    ##   - HEADLESS_SERVICE_NAME
    ##   - SET_NAME
    ##   - HOSTNAME
    ##   - CLUSTER_NAME
    ##   - POD_NAME
    ##   - BINLOG_ENABLED
    ##   - SLOW_LOG_FILE
    ## Ref: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
    # env:
    # - name: MY_ENV_1
    #   value: value1
    # - name: MY_ENV_2
    #   valueFrom:
    #     fieldRef:
    #       fieldPath: status.myEnv2

    ## Custom sidecar containers can be injected into the PD pods,
    ## which can act as a logging/tracing agent or for any other use case
    # additionalContainers:
    # - name: myCustomContainer
    #   image: ubuntu

    ## custom additional volumes in PD pods
    # additionalVolumes:
    # # specify volume types that are supported by Kubernetes, Ref: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#types-of-persistent-volumes
    # - name: nfs
    #   nfs:
    #     server: 192.168.0.2
    #     path: /nfs

    ## custom additional volume mounts in PD pods
    # additionalVolumeMounts:
    # # this must match `name` in `additionalVolumes`
    # - name: nfs
    #   mountPath: /nfs

    ## Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request.
    ## Ref: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#hook-handler-execution
    # terminationGracePeriodSeconds: 30

    ## PodSecurityContext holds pod-level security attributes and common container settings.
    ## Ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
    # podSecurityContext:
    #   sysctls:
    #   - name: net.core.somaxconn
    #     value: "32768"

    ## when TLS cluster feature is enabled, TiDB Operator will automatically mount the cluster client certificates if mountClusterClientSecret is set to true
    ## Defaults to false
    ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/configure-a-tidb-cluster#mountclusterclientsecret
    mountClusterClientSecret: false

    ## The storageClassName of the persistent volume for PD data storage.
    storageClassName: "cna-reconplfm-dev-nas"

    ## defines additional volumes for which PVCs will be created by StatefulSet controller
    # storageVolumes:
    #   # this will be suffix of PVC names in VolumeClaimTemplates of PD StatefulSet
    # - name: volumeName
    #   # specify this to use special storageClass for this volume, default to component-level `storageClassName`
    #   storageClassName: local-storage
    #   # storage request of PVC
    #   storageSize: 1Gi
    #   # mount path of the PVC
    #   mountPath: /some/path

    ## Subdirectory within the volume to store PD Data. By default, the data
    ## is stored in the root directory of volume which is mounted at
    ## /var/lib/pd. Specifying this will change the data directory to a subdirectory,
    ## e.g. /var/lib/pd/data if you set the value to "data".
    ## It's dangerous to change this value for a running cluster as it will
    ## upgrade your cluster to use a new storage directory.
    ## Defaults to "" (volume's root).
    # dataSubDir: ""

    ## Affinity for pod scheduling
    ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
    # affinity:
    #   podAntiAffinity:
    #     # prefer not to run pd pods on the same node which runs tidb/tikv pods
    #     preferredDuringSchedulingIgnoredDuringExecution:
    #     - podAffinityTerm:
    #         labelSelector:
    #           matchExpressions:
    #           - key: app.kubernetes.io/component
    #             operator: In
    #             values:
    #             - tidb
    #             - tikv
    #         topologyKey: kubernetes.io/hostname
    #       weight: 100
    #     # require not to run PD pods on nodes where there's already a PD pod running
    #     # if setting this, you must ensure that at least `replicas` nodes are available in the cluster
    #     requiredDuringSchedulingIgnoredDuringExecution:
    #     - labelSelector:
    #         matchExpressions:
    #         - key: app.kubernetes.io/component
    #           operator: In
    #           values:
    #           - pd
    #       topologyKey: kubernetes.io/hostname

    ## set a different tidb client TLS cert secret name for TiDB Dashboard than the default ${clusterName}-tidb-client-secret
    ## only useful when TLS is enabled for TiDB server
    ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/enable-tls-for-mysql-client
    # tlsClientSecretName: custom-tidb-client-secret-name

    ## TopologySpreadConstraints for pod scheduling, will overwrite cluster level spread constraints setting
    ## Ref: pkg/apis/pingcap/v1alpha1/types.go#TopologySpreadConstraint
    # topologySpreadConstraints:
    # - topologyKey: topology.kubernetes.io/zone

  tidb:
    ############################
    # Basic TiDB Configuration #
    ############################

    ## Base image of the component
    baseImage: pingcap/tidb

    ## tidb-server Configuration
    ## Ref: https://docs.pingcap.com/tidb/stable/tidb-configuration-file
    config: |
      [performance]
        tcp-keep-alive = true

    ## The desired replicas
    replicas: 2

    ## max inprogress failover TiDB pod counts
    maxFailoverCount: 0

    ## describes the compute resource requirements.
    ## Ref: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
    # requests:
    #   cpu: 1000m
    #   memory: 1Gi
    # limits:
    #   cpu: 2000m
    #   memory: 2Gi

    ## defines Kubernetes service for tidb-server
    ## If you are in a public cloud environment, you can use cloud LoadBalancer to access the TiDB service
    ## if you are in a private cloud environment, you can use Ingress or NodePort, or ClusterIP and port forward (only for development/test)
    ## you can set mysqlNodePort and statusNodePort to expose server/status service to the given NodePort
    service:
      type: ClusterIP
      # Ref: https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/#preserving-the-client-source-ip
      # externalTrafficPolicy: Local
      # # which NodePort to expose 4000 (mysql) port of tidb-server, only effective when type=LoadBalancer/NodePort
      # mysqlNodePort: 30020
      # # whether to export the status port, defaults to true
      # exposeStatus: true
      # # which NodePort to expose 10080 (status) port of tidb-server, only effective when type=LoadBalancer/NodePort and exposeStatus=true
      # statusNodePort: 30040

    ###############################
    # Advanced TiDB Configuration #
    ###############################

    ## The following block overwrites TiDB cluster-level configurations in `spec`
    # version: "v6.1.0"
    # imagePullPolicy: IfNotPresent
    # imagePullSecrets:
    # - name: secretName
    # hostNetwork: false
    # serviceAccount: advanced-tidb-tidb
    # priorityClassName: system-cluster-critical
    # schedulerName: default-scheduler
    # nodeSelector:
    #   app.kubernetes.io/component: tidb
    # annotations:
    #   node.kubernetes.io/instance-type: some-vm-type
    # tolerations:
    #   - effect: NoSchedule
    #     key: dedicated
    #     operator: Equal
    #     value: tidb
    # configUpdateStrategy: RollingUpdate
    # statefulSetUpdateStrategy: RollingUpdate

    ## List of environment variables to set in the container
    ## Note that the following env names cannot be used and will be overwritten by TiDB Operator builtin envs
    ##   - NAMESPACE
    ##   - TZ
    ##   - SERVICE_NAME
    ##   - PEER_SERVICE_NAME
    ##   - HEADLESS_SERVICE_NAME
    ##   - SET_NAME
    ##   - HOSTNAME
    ##   - CLUSTER_NAME
    ##   - POD_NAME
    ##   - BINLOG_ENABLED
    ##   - SLOW_LOG_FILE
    ## Ref: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
    # env:
    # - name: MY_ENV_1
    #   value: value1
    # - name: MY_ENV_2
    #   valueFrom:
    #     fieldRef:
    #       fieldPath: status.myEnv2

    ## Custom sidecar containers can be injected into the TiDB pods,
    ## which can act as a logging/tracing agent or for any other use case
    # additionalContainers:
    # - name: myCustomContainer
    #   image: ubuntu

    ## custom additional volumes in TiDB pods
    # additionalVolumes:
    # # specify volume types that are supported by Kubernetes, Ref: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#types-of-persistent-volumes
    # - name: nfs
    #   nfs:
    #     server: 192.168.0.2
    #     path: /nfs

    ## custom additional volume mounts in TiDB pods
    # additionalVolumeMounts:
    # # this must match `name` in `additionalVolumes`
    # - name: nfs
    #   mountPath: /nfs

    ## Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request.
    ## Ref: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#hook-handler-execution
    # terminationGracePeriodSeconds: 30

    ## PodSecurityContext holds pod-level security attributes and common container settings.
    ## Ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
    # podSecurityContext:
    #   sysctls:
    #   - name: net.ipv4.tcp_keepalive_time
    #     value: "300"
    #   - name: net.ipv4.tcp_keepalive_intvl
    #     value: "75"
    #   - name: net.core.somaxconn
    #     value: "32768"

    ## probe tidb-server readiness
    ## valid type values:
    ##   - `tcp`, which uses Kubernetes TCPSocketAction to probe the 4000 tcp port of the pod
    ##   - `command`, which uses curl to access the /status path on port 10080 of the pod
    ## This is supported from TiDB Operator v1.1.7
    # readinessProbe:
    #   # The `command` type is only supported after tidb v4.0.9, ref: https://github.com/pingcap/tidb/pull/20694
    #   type: command

    ## when enabled, TiDB will accept TLS encrypted connections from MySQL client
    ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/enable-tls-for-mysql-client
    # tlsClient:
    #   enabled: true
    #   disableClientAuthn: false
    #   skipInternalClientCA: false

    ## binlogEnabled will automatically be true if Pump is enabled, otherwise false
    ## set this manually only if you really know what you are doing
    ## MANUAL CONFIG NOT RECOMMENDED
    # binlogEnabled: false

    ## if enabled, the slow log will be shown in a separate sidecar container
    # separateSlowLog: true
    # slowLogVolumeName: ""
    ## configures separate sidecar container, where `image` & `imagePullPolicy` will be overwritten by
    ## the same field in `TidbCluster.helper`
    # slowLogTailer:
    #   requests:
    #     cpu: 1000m
    #     memory: 1Gi
    #   limits:
    #     cpu: 2000m
    #     memory: 2Gi
    #   image: busybox
    #   imagePullPolicy: IfNotPresent

    ## The storageClassName of the persistent volume for TiDB data storage.
    storageClassName: "cna-reconplfm-dev-nas"

    ## defines additional volumes for which PVCs will be created by StatefulSet controller
    # storageVolumes:
    #   # this will be suffix of PVC names in VolumeClaimTemplates of TiDB StatefulSet
    # - name: volumeName
    #   # specify this to use special storageClass for this volume, default to component-level `storageClassName`
    #   storageClassName: local-storage
    #   # storage request of PVC
    #   storageSize: 1Gi
    #   # mount path of the PVC
    #   mountPath: /some/path

    ## config Kubernetes container lifecycle hooks for tidb-server pods
    ## Ref: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
    # lifecycle:
    #   postStart:
    #     exec:
    #       command:
    #       - echo
    #       - "postStart"
    #   preStop:
    #     exec:
    #       command:
    #       - echo
    #       - "preStop"

    ## Affinity for pod scheduling
    ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
    # affinity:
    #   podAntiAffinity:
    #     preferredDuringSchedulingIgnoredDuringExecution:
    #     - podAffinityTerm:
    #         labelSelector:
    #           matchExpressions:
    #           - key: app.kubernetes.io/component
    #             operator: In
    #             values:
    #             - pd
    #             - tikv
    #         topologyKey: kubernetes.io/hostname
    #       weight: 100
    #     # require not to run TiDB pods on nodes where there's already a TiDB pod running
    #     # if setting this, you must ensure that at least `replicas` nodes are available in the cluster
    #     requiredDuringSchedulingIgnoredDuringExecution:
    #     - labelSelector:
    #         matchExpressions:
    #         - key: app.kubernetes.io/component
    #           operator: In
    #           values:
    #           - tidb
    #       topologyKey: kubernetes.io/hostname

    ## TopologySpreadConstraints for pod scheduling, will overwrite cluster level spread constraints setting
    ## Ref: pkg/apis/pingcap/v1alpha1/types.go#TopologySpreadConstraint
    # topologySpreadConstraints:
    # - topologyKey: topology.kubernetes.io/zone

  tikv:
    ############################
    # Basic TiKV Configuration #
    ############################

    ## Base image of the component
    baseImage: pingcap/tikv

    ## tikv-server configuration
    ## Ref: https://docs.pingcap.com/tidb/stable/tikv-configuration-file
    config: |
      log-level = "info"

    ## The desired replicas
    replicas: 3

    ## max inprogress failover TiKV pod counts
    maxFailoverCount: 0

    ## describes the compute resource requirements.
    ## Ref: https://kubernetes.io/docs/concepts/configuration/manage-compute-resources-container/
    requests:
    #   cpu: 1000m
    #   memory: 1Gi
      storage: 50Gi
    # limits:
    #   cpu: 2000m
    #   memory: 2Gi
    #   # setting `storage` here will add `--capacity` arg to tikv-server
    #   storage: 10Gi

    ###############################
    # Advanced TiKV Configuration #
    ###############################

    ## The following block overwrites TiDB cluster-level configurations in `spec`
    # version: "v6.1.0"
    # imagePullPolicy: IfNotPresent
    # imagePullSecrets:
    # - name: secretName
    # hostNetwork: false
    # serviceAccount: advanced-tidb-tikv
    # priorityClassName: system-cluster-critical
    # schedulerName: default-scheduler
    # nodeSelector:
    #   app.kubernetes.io/component: tikv
    # annotations:
    #   node.kubernetes.io/instance-type: some-vm-type
    # tolerations:
    #   - effect: NoSchedule
    #     key: dedicated
    #     operator: Equal
    #     value: tikv
    # configUpdateStrategy: RollingUpdate
    # statefulSetUpdateStrategy: RollingUpdate

    ## List of environment variables to set in the container
    ## Note that the following env names cannot be used and will be overwritten by TiDB Operator builtin envs
    ##   - NAMESPACE
    ##   - TZ
    ##   - SERVICE_NAME
    ##   - PEER_SERVICE_NAME
    ##   - HEADLESS_SERVICE_NAME
    ##   - SET_NAME
    ##   - HOSTNAME
    ##   - CLUSTER_NAME
    ##   - POD_NAME
    ##   - BINLOG_ENABLED
    ##   - SLOW_LOG_FILE
    ## Ref: https://kubernetes.io/docs/tasks/inject-data-application/environment-variable-expose-pod-information/
    # env:
    # - name: MY_ENV_1
    #   value: value1
    # - name: MY_ENV_2
    #   valueFrom:
    #     fieldRef:
    #       fieldPath: status.myEnv2

    ## Custom sidecar containers can be injected into the TiKV pods,
    ## which can act as a logging/tracing agent or for any other use case
    # additionalContainers:
    # - name: myCustomContainer
    #   image: ubuntu

    ## custom additional volumes in TiKV pods
    # additionalVolumes:
    # # specify volume types that are supported by Kubernetes, Ref: https://kubernetes.io/docs/concepts/storage/persistent-volumes/#types-of-persistent-volumes
    # - name: nfs
    #   nfs:
    #     server: 192.168.0.2
    #     path: /nfs

    ## custom additional volume mounts in TiKV pods
    # additionalVolumeMounts:
    # # this must match `name` in `additionalVolumes`
    # - name: nfs
    #   mountPath: /nfs

    ## Optional duration in seconds the pod needs to terminate gracefully. May be decreased in delete request.
    ## Ref: https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#hook-handler-execution
    # terminationGracePeriodSeconds: 30

    ## PodSecurityContext holds pod-level security attributes and common container settings.
    ## Ref: https://kubernetes.io/docs/tasks/configure-pod-container/security-context/
    # podSecurityContext:
    #   sysctls:
    #   - name: net.ipv4.tcp_keepalive_time
    #     value: "300"
    #   - name: net.ipv4.tcp_keepalive_intvl
    #     value: "75"
    #   - name: net.core.somaxconn
    #     value: "32768"

    ## when TLS cluster feature is enabled, TiDB Operator will automatically mount the cluster client certificates if mountClusterClientSecret is set to true
    ## Defaults to false
    ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/configure-a-tidb-cluster#mountclusterclientsecret
    mountClusterClientSecret: false

    ## if enabled, the RocksDB log will be shown in a separate sidecar container
    # separateRocksDBLog: true
    ## rocksDBLogVolumeName is only applicable when separateRocksDBLog is enabled.
    ## if separateRocksDBLog is enabled, rocksDBLogVolumeName can be set or not.
    ## if rocksDBLogVolumeName is not set, the rocksdb log is saved in the PV of TiKV data, which may increase the IO pressure of the data disk.
    ## if rocksDBLogVolumeName is set, rocksDBLogVolumeName should be defined in storageVolumes or additionalVolumes to use another PV.
    ## you may need to change the `rocksdb.info-log-dir` to save the logs in the separate PV.
    ## Ref: https://github.com/tikv/tikv/blob/master/etc/config-template.toml
    # rocksDBLogVolumeName: ""

    ## if enabled, the Raft log will be shown in a separate sidecar container
    # separateRaftLog: true
    ## raftLogVolumeName is only applicable when separateRaftLog is enabled.
    ## if separateRaftLog is enabled, raftLogVolumeName can be set or not.
    ## if raftLogVolumeName is not set, the separated raftdb log is saved in the PV of TiKV data, which may increase the IO pressure of the data disk.
    ## if raftLogVolumeName is set, raftLogVolumeName should be defined in storageVolumes or additionalVolumes to use another PV.
    ## you may need to change the `raftdb.info-log-dir` to save the logs in the separate PV.
    ## Ref: https://github.com/tikv/tikv/blob/master/etc/config-template.toml
    # raftLogVolumeName: ""

    ## configures RocksDB/Raft log sidecar container resource requirements
    # logTailer:
    #   requests:
    #     cpu: 1000m
    #     memory: 1Gi
    #   limits:
    #     cpu: 2000m
    #     memory: 2Gi

    ## The storageClassName of the persistent volume for TiKV data storage.
    storageClassName: "cna-reconplfm-dev-nas"

    ## defines additional volumes for which PVCs will be created by StatefulSet controller
    # storageVolumes:
    #   # this will be suffix of PVC names in VolumeClaimTemplates of TiKV StatefulSet
    # - name: volumeName
    #   # specify this to use special storageClass for this volume, default to component-level `storageClassName`
    #   storageClassName: local-storage
    #   # storage request of PVC
    #   storageSize: 1Gi
    #   # mount path of the PVC
    #   mountPath: /some/path

    ## run TiKV container in privileged mode
    ## Processes in privileged containers are essentially equivalent to root on the host
    ## NOT RECOMMENDED in production environment
    ## Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#privileged
    # privileged: false

    ## if set to true, TiDB Operator will recover newly created TiKV pods due to automatic failover
    # recoverFailover: true

    ## Subdirectory within the volume to store TiKV Data. By default, the data
    ## is stored in the root directory of volume which is mounted at /var/lib/tikv.
    ## Specifying this will change the data directory to a subdirectory, e.g.
    ## /var/lib/tikv/data if you set the value to "data".
    ## It's dangerous to change this value for a running cluster as it will
    ## upgrade your cluster to use a new storage directory.
    ## Defaults to "" (volume's root).
    # dataSubDir: ""

    ## defines the timeout for region leader eviction in golang `Duration` format, if raft region leaders are not transferred to other stores after this duration, TiDB Operator will delete the Pod forcibly.
    # evictLeaderTimeout: 3m

    ## Affinity for pod scheduling
    ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
    # affinity:
    #   podAntiAffinity:
    #     preferredDuringSchedulingIgnoredDuringExecution:
    #     - podAffinityTerm:
    #         labelSelector:
    #           matchExpressions:
    #           - key: app.kubernetes.io/component
    #             operator: In
    #             values:
    #             - tidb
    #             - pd
    #         topologyKey: kubernetes.io/hostname
    #       weight: 100
    #     # require not to run TiKV pods on nodes where there's already a TiKV pod running
    #     # if setting this, you must ensure that at least `replicas` nodes are available in the cluster
    #     requiredDuringSchedulingIgnoredDuringExecution:
    #     - labelSelector:
    #         matchExpressions:
    #         - key: app.kubernetes.io/component
    #           operator: In
    #           values:
    #           - tikv
    #       topologyKey: kubernetes.io/hostname

    ## TopologySpreadConstraints for pod scheduling, will overwrite cluster level spread constraints setting
    ## Ref: pkg/apis/pingcap/v1alpha1/types.go#TopologySpreadConstraint
    # topologySpreadConstraints:
    # - topologyKey: topology.kubernetes.io/zone

  ## Deploy TiDB Binlog of a TiDB cluster
  ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/deploy-tidb-binlog/#deploy-pump
  # pump:
  #   baseImage: pingcap/tidb-binlog
  #   version: "v6.1.0"
  #   replicas: 1
  #   storageClassName: local-storage
  #   requests:
  #     cpu: 1000m
  #     memory: 1Gi
  #     storage: 1Gi
  #   limits:
  #     cpu: 2000m
  #     memory: 2Gi
  #   imagePullPolicy: IfNotPresent
  #   imagePullSecrets:
  #   - name: secretName
  #   hostNetwork: false
  #   serviceAccount: advanced-tidb-pump
  #   priorityClassName: system-cluster-critical
  #   schedulerName: default-scheduler
  #   nodeSelector:
  #     app.kubernetes.io/component: pump
  #   annotations:
  #     node.kubernetes.io/instance-type: some-vm-type
  #   tolerations: {}
  #   configUpdateStrategy: RollingUpdate
  #   statefulSetUpdateStrategy: RollingUpdate
  #   podSecurityContext: {}
  #   env: []
  #   additionalContainers: []
  #   additionalVolumes: []
  #   additionalVolumeMounts: []
  #   terminationGracePeriodSeconds: 30
  #   # Ref: https://docs.pingcap.com/tidb/stable/tidb-binlog-configuration-file#pump
  #   config: |
  #     gc = 7
  #   # TopologySpreadConstraints for pod scheduling, will overwrite cluster level spread constraints setting
  #   # Ref: pkg/apis/pingcap/v1alpha1/types.go#TopologySpreadConstraint
  #   topologySpreadConstraints:
  #   - topologyKey: topology.kubernetes.io/zone

  ## TiCDC is a tool for replicating the incremental data of TiDB
  ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/deploy-ticdc/
  # ticdc:
  #   baseImage: pingcap/ticdc
  #   version: "v6.1.0"
  #   replicas: 3
  #   storageClassName: local-storage
  #   requests:
  #     cpu: 1000m
  #     memory: 1Gi
  #   limits:
  #     cpu: 2000m
  #     memory: 2Gi
  #   imagePullPolicy: IfNotPresent
  #   imagePullSecrets:
  #   - name: secretName
  #   hostNetwork: false
  #   serviceAccount: advanced-tidb-ticdc
  #   priorityClassName: system-cluster-critical
  #   schedulerName: default-scheduler
  #   nodeSelector:
  #     app.kubernetes.io/component: ticdc
  #   annotations:
  #     node.kubernetes.io/instance-type: some-vm-type
  #   tolerations: {}
  #   configUpdateStrategy: RollingUpdate
  #   statefulSetUpdateStrategy: RollingUpdate
  #   podSecurityContext: {}
  #   env: []
  #   additionalContainers: []
  #   storageVolumes: []
  #   additionalVolumes: []
  #   additionalVolumeMounts: []
  #   terminationGracePeriodSeconds: 30
  #   # Ref: https://docs.pingcap.com/tidb/stable/deploy-ticdc#add-ticdc-to-an-existing-tidb-cluster-using-binary-not-recommended
  #   config:
  #     timezone: UTC
  #     gcTTL: 86400
  #     logLevel: info
  #     logFile: /dev/stderr
  #   # TopologySpreadConstraints for pod scheduling, will overwrite cluster level spread constraints setting
  #   # Ref: pkg/apis/pingcap/v1alpha1/types.go#TopologySpreadConstraint
  #   topologySpreadConstraints:
  #   - topologyKey: topology.kubernetes.io/zone

  ## TiFlash is the columnar storage extension of TiKV
  ## Ref: https://docs.pingcap.com/tidb-in-kubernetes/stable/deploy-tiflash/
  # tiflash:
  #   ###############################
  #   # Basic TiFlash Configuration #
  #   ###############################
  #   baseImage: pingcap/tiflash
  #   version: "v6.1.0"
  #   replicas: 1
  #   # limits:
  #   #   cpu: 2000m
  #   #   memory: 2Gi
  #   imagePullPolicy: IfNotPresent
  #   imagePullSecrets:
  #   - name: secretName

  #   ##################################
  #   # Advanced TiFlash Configuration #
  #   ##################################
  #   maxFailoverCount: 0
  #   hostNetwork: false
  #   serviceAccount: advanced-tidb-tiflash
  #   priorityClassName: system-cluster-critical
  #   schedulerName: default-scheduler
  #   nodeSelector:
  #     app.kubernetes.io/component: tiflash
  #   annotations:
  #     node.kubernetes.io/instance-type: some-vm-type
  #   tolerations: {}
  #   configUpdateStrategy: RollingUpdate
  #   statefulSetUpdateStrategy: RollingUpdate
  #   podSecurityContext: {}
  #   env: []
  #   additionalContainers: []
  #   additionalVolumes: []
  #   additionalVolumeMounts: []
  #   terminationGracePeriodSeconds: 30
  #   storageClaims:
  #     - resources:
  #         requests:
  #           # specify PVC storage used for TiFlash
  #           storage: 1Gi
  #       # specify PVC storage class
  #       storageClassName: local-storage
  #   # run TiFlash container in privileged mode
  #   # Processes in privileged containers are essentially equivalent to root on the host
  #   # NOT RECOMMENDED in production environment
  #   # Ref: https://kubernetes.io/docs/concepts/policy/pod-security-policy/#privileged
  #   privileged: false

  #   # if set to true, TiDB Operator will recover newly created TiFlash pods due to automatic failover
  #   recoverFailover: true

  #   # configures serverlog/errorlog/clusterlog sidecar container resource requirements
  #   # logTailer:
  #   #   requests:
  #   #     cpu: 1000m
  #   #     memory: 1Gi
  #   #   limits:
  #   #     cpu: 2000m
  #   #     memory: 2Gi
  #   # configures init container resource requirements
  #   initializer:
  #     requests:
  #       cpu: 1000m
  #       memory: 1Gi
  #     limits:
  #       cpu: 2000m
  #       memory: 2Gi

  #   # TOML format configuration
  #   # Ref: https://docs.pingcap.com/tidb/dev/tiflash-configuration
  #   config:
  #     # configure the configuration file for TiFlash process
  #     config: |
  #       [logger]
  #         log = "/data0/logs/somelog"
  #     # configure the configuration file for TiFlash Proxy process
  #     proxy: |
  #       [security]
  #         cert-allowed-cn = "CNNAME"
  #   # TopologySpreadConstraints for pod scheduling, will overwrite the cluster level spread constraints setting
  #   # Ref: pkg/apis/pingcap/v1alpha1/types.go#TopologySpreadConstraint
  #   topologySpreadConstraints:
  #   - topologyKey: topology.kubernetes.io/zone

How did you deploy it? I'd recommend trying Helm.

I don't see anything wrong with the configuration file.

I followed the steps in the official documentation: TiDB Operator was deployed with Helm,
and the TiDB cluster itself was deployed with kubectl apply -f ${cluster_name} -n ${namespace}. Do you mean deploying the TiDB cluster with Helm as well? Could you share a link to the instructions?

https://docs.pingcap.com/zh/tidb-in-kubernetes/v1.4/deploy-tidb-operator
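That doc covers installing TiDB Operator itself with Helm; roughly, the steps are as follows (a sketch only, with Operator v1.4.0 assumed here — adjust to the release you actually use):

+ kubectl create -f https://raw.githubusercontent.com/pingcap/tidb-operator/v1.4.0/manifests/crd.yaml
+ helm repo add pingcap https://charts.pingcap.org/
+ helm repo update
+ helm install --namespace tidb-admin --create-namespace tidb-operator pingcap/tidb-operator --version v1.4.0

The TidbCluster itself is still applied with kubectl the way you already did.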

You need to get the Kubernetes environment right first; everything has to line up with the prerequisites.
https://docs.pingcap.com/zh/tidb-in-kubernetes/v1.4/prerequisites