老蒋的知识库

Adding a New Master to a K8S Cluster

Posted on 2023-05-22 | Category: K8S Deployment

Adding a master

On the existing master node, get the node join command

kubeadm token create --print-join-command
# Output:
# kubeadm join 172.16.92.196:6443 --token nknwoo.b2psf52tkqnntty7 --discovery-token-ca-cert-hash sha256:bb6546ceaee7948ab57789aaaedb7ba1211cb4ca9c05e1855407a7c033d683e3

On the existing master node, upload the certificates and generate the certificate key

kubeadm init phase upload-certs --upload-certs
# Output:
# I0520 23:15:19.456755 1323595 version.go:255] remote version is much newer: v1.27.2; falling back to: stable-1.24
# [upload-certs] Storing the certificates in Secret "kubeadm-certs" in the "kube-system" Namespace
# [upload-certs] Using certificate key:
# bb2d5bbc1467ae5931282e4afc7bce98a179796d7c692dedb961003cd0785f4e

On the new node, append --control-plane --certificate-key and the certificate key to the join command to form the master join command

# First reset the node (clears any earlier kubeadm state)
kubeadm reset
# Then join the cluster
kubeadm join 172.16.92.196:6443 --token nknwoo.b2psf52tkqnntty7 --discovery-token-ca-cert-hash sha256:bb6546ceaee7948ab57789aaaedb7ba1211cb4ca9c05e1855407a7c033d683e3 --control-plane --certificate-key  bb2d5bbc1467ae5931282e4afc7bce98a179796d7c692dedb961003cd0785f4e
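
The command above is simply the worker join command with two extra flags appended. As a minimal sketch, assuming a POSIX shell, the assembly could be scripted as below. The helper name build_control_plane_join is hypothetical, and the commented usage assumes the certificate key is the last line that upload-certs writes to stdout, as in the output shown earlier.

```shell
# build_control_plane_join is a hypothetical helper, not a kubeadm command.
# $1: worker join command, as printed by `kubeadm token create --print-join-command`
# $2: certificate key, as printed by `kubeadm init phase upload-certs --upload-certs`
build_control_plane_join() {
  join_cmd="$1"
  cert_key="$2"
  # The key shown in this article is 64 lowercase hex characters; reject anything else.
  case "$cert_key" in
    ''|*[!0-9a-f]*) echo "invalid certificate key" >&2; return 1 ;;
  esac
  if [ "${#cert_key}" -ne 64 ]; then
    echo "invalid certificate key length" >&2
    return 1
  fi
  printf '%s --control-plane --certificate-key %s\n' "$join_cmd" "$cert_key"
}

# On the existing master (assumption: the key is the last line of stdout):
#   JOIN_CMD="$(kubeadm token create --print-join-command)"
#   CERT_KEY="$(kubeadm init phase upload-certs --upload-certs | tail -n 1)"
#   build_control_plane_join "$JOIN_CMD" "$CERT_KEY"
```

The function only prints the command; it is meant to be reviewed and then run by hand on the new node.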

Troubleshooting

A node that has previously joined the cluster reports an error when it joins again

[root@k8s-0 ~]# kubeadm join 172.16.92.196:6443 --token nknwoo.b2psf52tkqnntty7 --discovery-token-ca-cert-hash sha256:bb6546ceaee7948ab57789aaaedb7ba1211cb4ca9c05e1855407a7c033d683e3 --control-plane --certificate-key  bb2d5bbc1467ae5931282e4afc7bce98a179796d7c692dedb961003cd0785f4e
[preflight] Running pre-flight checks
        [WARNING SystemVerification]: missing optional cgroups: blkio
error execution phase preflight: [preflight] Some fatal errors occurred:
        [ERROR FileAvailable--etc-kubernetes-kubelet.conf]: /etc/kubernetes/kubelet.conf already exists
        [ERROR Port-10250]: Port 10250 is in use
[preflight] If you know what you are doing, you can make a check non-fatal with `--ignore-preflight-errors=...`
To see the stack trace of this error execute with --v=5 or higher

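Cause: the node has already joined the cluster, so running the join command again fails the preflight checks (kubelet.conf already exists and port 10250 is in use).
Fix: delete the node from the cluster, run kubeadm reset on it, then join again.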

  1. On the master node, delete the worker node
kubectl get node
kubectl delete nodes k8s-0
  2. On the worker node, run kubeadm reset
kubeadm reset
  3. Join again
kubeadm join 172.16.92.196:6443 --token nknwoo.b2psf52tkqnntty7 --discovery-token-ca-cert-hash sha256:bb6546ceaee7948ab57789aaaedb7ba1211cb4ca9c05e1855407a7c033d683e3 --control-plane --certificate-key  bb2d5bbc1467ae5931282e4afc7bce98a179796d7c692dedb961003cd0785f4e
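
Before rejoining, it can help to check whether the file named in the preflight error is still present. The following is a rough sketch, assuming a POSIX shell; has_leftover_kubeadm_state is a hypothetical helper, and its optional path prefix exists only so the check can be tried against a scratch directory.

```shell
# has_leftover_kubeadm_state is a hypothetical helper name.
# Returns 0 when a kubelet.conf from an earlier join is still present.
# $1: optional path prefix (empty means the real filesystem root).
has_leftover_kubeadm_state() {
  root="${1:-}"
  [ -e "$root/etc/kubernetes/kubelet.conf" ]
}

if has_leftover_kubeadm_state; then
  echo "found /etc/kubernetes/kubelet.conf: run 'kubeadm reset' before joining"
fi
```

This covers only the FileAvailable error from the log above; the Port 10250 error is not checked here.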

Error when adding another master to a non-HA cluster

[root@k8s-0 ~]# kubeadm join 172.16.92.196:6443 --token nknwoo.b2psf52tkqnntty7 --discovery-token-ca-cert-hash sha256:bb6546ceaee7948ab57789aaaedb7ba1211cb4ca9c05e1855407a7c033d683e3 --control-plane --certificate-key  bb2d5bbc1467ae5931282e4afc7bce98a179796d7c692dedb961003cd0785f4e
[preflight] Running pre-flight checks
        [WARNING SystemVerification]: missing optional cgroups: blkio
[preflight] Reading configuration from the cluster...
[preflight] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
error execution phase preflight:
One or more conditions for hosting a new control plane instance is not satisfied.

unable to add a new control plane instance to a cluster that doesn't have a stable controlPlaneEndpoint address

Please ensure that:
* The cluster has a stable controlPlaneEndpoint address.
* The certificates that must be shared among control plane instances are provided.


To see the stack trace of this error execute with --v=5 or higher

Cause: the existing K8S cluster is a non-HA cluster with no controlPlaneEndpoint configured; one must be added.
Fix:

  1. Inspect the kubeadm-config ConfigMap
kubectl -n kube-system get cm kubeadm-config -oyaml
  2. Add controlPlaneEndpoint
# Edit
kubectl -n kube-system edit cm kubeadm-config

# Add
...
    kind: ClusterConfiguration
    kubernetesVersion: v1.24.0
    controlPlaneEndpoint: 172.16.92.196:6443 # the IP here is master0's IP
    networking:
... 
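
To confirm whether the edit took effect, the endpoint can be read back from the ConfigMap. This is a small sketch, assuming a POSIX shell with sed; get_control_plane_endpoint is a hypothetical helper that does a plain text match and is not a YAML parser.

```shell
# get_control_plane_endpoint is a hypothetical helper: it reads YAML text on
# stdin and prints the value of the first controlPlaneEndpoint key, if any.
# Trailing comments are dropped and double quotes are stripped.
get_control_plane_endpoint() {
  sed -n 's/^[[:space:]]*controlPlaneEndpoint:[[:space:]]*\([^[:space:]#]*\).*$/\1/p' \
    | head -n 1 | tr -d '"'
}

# On a master node (ClusterConfiguration is the data key of kubeadm-config):
#   kubectl -n kube-system get cm kubeadm-config \
#     -o jsonpath='{.data.ClusterConfiguration}' | get_control_plane_endpoint
```

Empty output means no endpoint is set, which is the condition behind the error above.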

kubectl reports an error on the new master node

[root@k8s-0 ~]# kubectl get node
The connection to the server localhost:8080 was refused - did you specify the right host or port?

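Cause: kubectl has not been configured on the new master node, so it falls back to localhost:8080.
Fix: set up a kubeconfig so that a regular user can access the K8S cluster with kubectl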

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
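
When working as root, as in the prompt shown above, an alternative is to point KUBECONFIG at admin.conf directly instead of copying it. A minimal sketch, assuming a POSIX shell; pick_kubeconfig is a hypothetical helper introduced here for illustration.

```shell
# pick_kubeconfig is a hypothetical helper: it prints the kubeconfig path to
# use for the given numeric user id (defaults to the current user).
pick_kubeconfig() {
  uid="${1:-$(id -u)}"
  if [ "$uid" -eq 0 ]; then
    # root can read the admin kubeconfig in place
    echo /etc/kubernetes/admin.conf
  else
    # regular users use the copy created with the commands above
    echo "$HOME/.kube/config"
  fi
}

KUBECONFIG="$(pick_kubeconfig)"
export KUBECONFIG
```

The export only lasts for the current shell session; the copy to $HOME/.kube/config is the persistent option.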

After the new master joins the cluster, all pods are in Pending state

Checking the pods shows that the new master node carries a taint

# All pods are in Pending state
kubectl get pods --all-namespaces -o wide
NAMESPACE              NAME                                         READY   STATUS     RESTARTS      AGE    IP               NODE     NOMINATED NODE   READINESS GATES
ai-nav                 ai-nav-nginx-75b6df9c64-qlb5c                0/1     Pending    0             24m    <none>           <none>   <none>           <none>
apisix                 apisix-dashboard-7b6cdd75d6-mp7zk            0/1     Pending    0             24m    <none>           <none>   <none>           <none>
apisix                 apisix-etcd-0                                0/1     Pending    0             24m    <none>           <none>   <none>           <none>

# Describing a pod shows that there is no node it can be scheduled on
kubectl -n ai-nav describe pods ai-nav-nginx-75b6df9c64-mtbwt
...
 0/1 nodes are available: 1 node(s) had untolerated taint {node-role.kubernetes.io/control-plane: }. preemption: 0/1 nodes are available: 1 Preemption is not helpful for scheduling.
...

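Fix: remove the taint so that ordinary pods can be scheduled on the master node

# Check the master node for taints; it does have one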
kubectl describe node k8s-0 | grep Taint

Taints:             node-role.kubernetes.io/control-plane:NoSchedule
# Remove the taint
kubectl taint nodes k8s-0  node-role.kubernetes.io/control-plane:NoSchedule-
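
The check and the removal can be combined so that the taint is only removed when it is actually present. A minimal sketch, assuming a POSIX shell with grep; has_control_plane_taint is a hypothetical helper that scans the text output of kubectl describe node.

```shell
# has_control_plane_taint is a hypothetical helper: it reads
# `kubectl describe node` text on stdin and returns 0 when the
# control-plane NoSchedule taint is listed.
has_control_plane_taint() {
  grep -q 'node-role\.kubernetes\.io/control-plane:NoSchedule'
}

# On a master node (k8s-0 is the node name used in this article):
#   kubectl describe node k8s-0 | has_control_plane_taint \
#     && kubectl taint nodes k8s-0 node-role.kubernetes.io/control-plane:NoSchedule-
```

Guarding the removal this way avoids the "not found" error that kubectl taint returns when the taint is already gone.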

  • Author: jagger
  • Link: /archives/k8s-xin-zeng-master-jia-ru-ji-qun
  • License: Unless otherwise stated, all articles on this blog are licensed under CC BY-NC-SA 3.0. Please credit the source when reposting!