Kubernetes Advanced

五十岚 · October 6, 2022
About 25 minutes

Scheduling, networking, and authentication & authorization in k8s

Kubernetes Advanced

1. etcd

Viewing data in etcd

Copy the etcdctl command-line tool out of the etcd container:

$ docker exec -ti etcd_container which etcdctl
$ docker cp etcd_container:/usr/local/bin/etcdctl /usr/bin/etcdctl

List all keys:

$ ETCDCTL_API=3 etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key get / --prefix --keys-only

View the data stored under a specific key:

$ ETCDCTL_API=3 etcdctl --endpoints=https://[127.0.0.1]:2379 --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/healthcheck-client.crt --key=/etc/kubernetes/pki/etcd/healthcheck-client.key get /registry/pods/jenkins/sonar-postgres-7fc5d748b6-gtmsb

2. Kubernetes Scheduling

Why control how Pods are scheduled?
  • Some machines in the cluster have better specs (SSDs, more memory, etc.), and we want core services (such as databases) to run on them
  • Two services exchange a lot of network traffic, and we would prefer them to run on the same machine
  • ...
The scheduling process

The Scheduler's workflow consists of two steps, filtering (Predicates) and scoring (Priorities):

  • Predicates: K8S iterates over all Nodes in the cluster and filters out those that meet the requirements as candidates

  • Priorities: K8S scores the candidate Nodes

After filtering and scoring, K8S picks the Node with the highest score to run the Pod; if several Nodes tie for the highest score, the Scheduler picks one of them at random.
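
A quick way to see the Scheduler's decision for a particular Pod is to look at its events; the demo namespace and <pod-name> below are placeholders for your own environment:

## The "Scheduled" event names the Node that won the scoring step;
## for a Pending Pod, a "FailedScheduling" event explains which predicates failed.
$ kubectl -n demo describe pod <pod-name> | grep -A5 Events
$ kubectl -n demo get events --field-selector reason=FailedScheduling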

Cordon

Cordon marks a node as unschedulable so that no new Pods are placed on it; drain goes one step further and also evicts the Pods already running there (typically done before node maintenance).

$ kubectl cordon k8s-slave2
$ kubectl drain k8s-slave2
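
A minimal sketch of the usual follow-up once maintenance is done (assuming the same node name as above): uncordon re-enables scheduling on the node.

## A cordoned/drained node shows SchedulingDisabled in its STATUS column
$ kubectl get nodes
## Allow Pods to be scheduled onto the node again
$ kubectl uncordon k8s-slave2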
NodeSelector

Labels are a very important concept in k8s: users can manage cluster resources flexibly with labels, and Pod scheduling can target specific nodes based on their labels.

View node labels:

$ kubectl get nodes --show-labels

Label a node:

$ kubectl label node k8s-master disktype=ssd

Once a node carries the relevant labels, they can be used during scheduling: simply add a nodeSelector field under spec containing the labels of the nodes we want the Pod scheduled to.

...
spec:
  hostNetwork: true   # use host network mode for the Pod, same effect as docker run --net=host
  volumes:
  - name: mysql-data
    hostPath:
      path: /opt/mysql/data
  nodeSelector:       # use the node selector to schedule the Pod onto nodes with the given label
    component: mysql
  containers:
  - name: mysql
    image: 172.21.32.6:5000/demo/mysql:5.7
...
nodeAffinity

Node affinity is more flexible than the nodeSelector above: it supports simple logical combinations rather than just exact matches. There are two kinds of policies, soft and hard.

preferredDuringSchedulingIgnoredDuringExecution: the soft policy. If no node satisfies the rule, the Pod ignores it and scheduling proceeds anyway; in other words, meeting the condition is preferred, but not required.

requiredDuringSchedulingIgnoredDuringExecution: the hard policy. If no node satisfies the condition, scheduling keeps retrying until one does; in short, the requirement must be met or the Pod will not be scheduled.

# Require that the Pod not run on nodes 128 or 132; if a node with disktype=ssd exists, prefer scheduling to it
...
spec:
  containers:
  - name: demo
    image: 172.21.32.6:5000/demo/myblog
    ports:
    - containerPort: 8002
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/hostname
            operator: NotIn
            values:
            - 192.168.136.128
            - 192.168.136.132
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        preference:
          matchExpressions:
          - key: disktype
            operator: In
            values:
            - ssd
            - sas
...

The matching logic here is "the label's value is in a given list"; the operators Kubernetes currently provides are:

  • In: the label's value is in a given list
  • NotIn: the label's value is not in a given list
  • Gt: the label's value is greater than a given value
  • Lt: the label's value is less than a given value
  • Exists: the label exists
  • DoesNotExist: the label does not exist

If there are multiple entries under nodeSelectorTerms, the Pod can be scheduled as long as any one of them is satisfied; if a single matchExpressions contains multiple entries, all of them must be satisfied for the Pod to be scheduled, as the sketch below illustrates.
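
A minimal sketch of these semantics (the team label is made up for illustration):

affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:            # the two terms below are OR-ed: matching either one is enough
      - matchExpressions:           # expressions inside one term are AND-ed: both must match
        - key: disktype
          operator: In
          values: ["ssd"]
        - key: team
          operator: Exists
      - matchExpressions:
        - key: disktype
          operator: In
          values: ["sas"]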

Taints and Tolerations

Both the hard and the soft nodeAffinity policies steer Pods toward desired nodes. Taints do the opposite: if a node is marked with a taint, Pods will not be scheduled onto it unless they are also marked as tolerating that taint.

Taints are a property of a Node. Once a Node has taints, Kubernetes will not schedule Pods onto it. Kubernetes therefore gives Pods a Tolerations property: as long as a Pod tolerates the taints on a Node, Kubernetes ignores those taints and can (but is not obliged to) schedule the Pod there.

Taints are useful, for example, when you want to reserve the Master nodes for Kubernetes system components, or reserve a group of nodes with special resources for certain Pods: Pods are no longer scheduled onto tainted nodes. Tainting a node looks like this:

Set a taint:

$ kubectl taint node [node_name] key=value:[effect]
where [effect] can be one of: [ NoSchedule | PreferNoSchedule | NoExecute ]
NoSchedule: Pods will definitely not be scheduled onto the node.
PreferNoSchedule: try to avoid scheduling Pods onto the node.
NoExecute: new Pods are not scheduled, and existing Pods on the Node are evicted.
Example: kubectl taint node k8s-master smoke=true:NoSchedule
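
Before or after tainting, you can check which taints a node currently carries; a small sketch (the output shown is what you would expect after the example above):

$ kubectl describe node k8s-master | grep -i taint
Taints: smoke=true:NoSchedule
## or, for all nodes at once
$ kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.taints}{"\n"}{end}'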

Remove taints:

Remove a given key with a given effect:
kubectl taint nodes [node_name] key:[effect]-   # no value needs to be given for the key here

Remove all effects of a given key:
kubectl taint nodes node_name key-

Examples:
kubectl taint node k8s-master smoke=true:NoSchedule
kubectl taint node k8s-master smoke:NoExecute-
kubectl taint node k8s-master smoke-

Taint demo:

## Taint k8s-slave1 with smoke=true:NoSchedule
$ kubectl taint node k8s-slave1 smoke=true:NoSchedule
$ kubectl taint node k8s-slave2 drunk=true:NoSchedule


## Scale up the myblog Pods and watch how the new Pod is scheduled
$ kubectl -n demo scale deploy myblog --replicas=3
$ kubectl -n demo get po -w    ## stays Pending

Example of a Pod tolerating the taints: myblog/deployment/deploy-myblog-taint.yaml

...
spec:
  containers:
  - name: demo
    image: 172.21.32.6:5000/demo/myblog
  tolerations: # set the tolerations
  - key: "smoke"
    operator: "Equal"  # with the Exists operator the value field can be omitted; if operator is not set it defaults to Equal
    value: "true"
    effect: "NoSchedule"
  - key: "drunk"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
# This means the Pod tolerates Nodes tainted with key smoke, operator Equal, value true and effect NoSchedule;
# the values under tolerations must be quoted, and the tolerated values are the ones used when the Node was tainted.
$ kubectl apply -f deploy-myblog-taint.yaml
spec:
  containers:
  - name: demo
    image: 172.21.32.6:5000/demo/myblog
  tolerations:
  - operator: "Exists"

3. Kubernetes Authentication and Authorization

Creating users

API Server security controls
  • Authentication: who you are
  1. The input to this stage is the entire HTTP request. It verifies the identity of the client making the request; supported methods include:
  • client certificate verification (mutual TLS)

  • basic auth

  • plain tokens

  • JWT tokens (used by service accounts)

  2. When the APIServer starts, one or more Authentication methods can be configured. If several are configured, the APIServer tries them one by one; as soon as the request passes any one of them, Authentication is considered successful.

  3. In the initial apiserver configuration of a kubeadm-bootstrapped k8s cluster, client certificate verification and serviceaccount authentication are supported by default. Certificate authentication is enabled by setting the --client-ca-file root certificate together with --tls-cert-file and --tls-private-key-file (a quick way to inspect these flags is shown right after this list).

  4. At this stage the apiserver identifies the requesting user from the client certificate or from fields in the HTTP headers (e.g. a serviceaccount's JWT token), including "user", "group" and so on; this information is used later in the authorization stage.

  • Authorization: which resources you may access
  1. The input to this stage is the set of attributes in the HTTP request context, including user, group, request path (e.g. /api/v1, /healthz, /version) and request verb (e.g. get, list, create).

  2. The APIServer compares these attributes against pre-configured access policies. It supports several authorization modes, including Node, RBAC and Webhook.

  3. When the APIServer starts, one or more authorization modes can be specified; in the latter case, the request is authorized as soon as any one mode allows it. In recent kubeadm-bootstrapped clusters the default authorization-mode is "Node,RBAC".

  • Admission Control: a chain of controls (one checkpoint after another), oriented toward cluster security and management. Why is it considered a security mechanism?
  • Take NamespaceLifecycle as an example: this plugin ensures that a Namespace in Terminating state no longer accepts new object-creation requests and rejects requests for non-existent Namespaces. It also prevents deletion of the system-reserved Namespaces default, kube-system and kube-public.
  • NodeRestriction: this plugin limits the Node and Pod objects a kubelet may modify; such kubelets may only modify Pod API objects bound to their own Node, and later versions may add further restrictions.
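
A quick way to see which authentication, authorization and admission settings your cluster actually uses is to look at the apiserver's static pod manifest (the path below assumes a kubeadm setup):

$ grep -E 'client-ca-file|authorization-mode|enable-admission-plugins' \
    /etc/kubernetes/manifests/kube-apiserver.yaml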

Why can we manage k8s cluster resources directly just by running kubectl commands?

Authentication and authorization for kubectl

kubectl log verbosity levels:

Verbosity   Description
v=0         Generally useful information; always visible to an operator.
v=1         A reasonable default log level when you don't want very verbose output.
v=2         Useful steady-state information about the service and important log messages that may correlate with significant changes in the system. This is the recommended default log level for most systems.
v=3         Extended information about changes.
v=4         Debug-level information.
v=6         Display the requested resources.
v=7         Display HTTP request headers.
v=8         Display HTTP request contents.
v=9         Display HTTP request contents without truncating them.
$ kubectl get nodes -v=7
I0329 20:20:08.633065 3979 loader.go:359] Config loaded from file /root/.kube/config
I0329 20:20:08.633797 3979 round_trippers.go:416] GET https://192.168.136.128:6443/api/v1/nodes?limit=500

After kubeadm init finishes bringing up the master node, it prints a hint similar to the following by default:

... ...
Your Kubernetes master has initialized successfully!

To start using your cluster, you need to run the following as a regular user:
mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config
... ...

This output tells us how to set up the kubeconfig file. After running the commands above, kubectl on the master node can access the k8s cluster directly with the information in $HOME/.kube/config, and through this configuration kubectl also holds administrator (root) privileges over the whole cluster.

Many k8s beginners have two questions at this point:

  • When kubectl accesses the cluster with this kubeconfig, how does the Kubernetes kube-apiserver authenticate and authorize the requests coming from kubectl?
  • Why do requests from kubectl carry the highest, administrator-level privileges?

Look at the /root/.kube/config file:

As mentioned earlier, apiserver authentication can verify client requests via TLS client certificates, basic auth, tokens and so on. Judging from the kubeconfig, kubectl clearly uses the TLS client certificate method, i.e. a client certificate.
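
A small sketch for inspecting exactly which credentials kubectl is sending (the jsonpath below assumes a kubeconfig with a single user entry):

## Show the effective kubeconfig, including the embedded certificate data
$ kubectl config view --minify --raw
## Extract just the client certificate, base64-decoded
$ kubectl config view --minify --raw -o jsonpath='{.users[0].user.client-certificate-data}' | base64 -d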

Base64-decode the certificate:

$ echo xxxxxxxxxxxxxx |base64 -d > kubectl.crt

This means that during authentication the apiserver first uses the CA certificate configured with --client-ca-file to validate the certificate presented by kubectl; in essence:

$ openssl verify -CAfile /etc/kubernetes/pki/ca.crt kubectl.crt
kubectl.crt: OK

Besides verifying the identity, the apiserver also extracts the information needed by the authorization stage. View the certificate as text:

$ openssl x509 -in kubectl.crt -text
Certificate:
Data:
Version: 3 (0x2)
Serial Number: 4736260165981664452 (0x41ba9386f52b74c4)
Signature Algorithm: sha256WithRSAEncryption
Issuer: CN=kubernetes
Validity
Not Before: Feb 10 07:33:39 2020 GMT
Not After : Feb 9 07:33:40 2021 GMT
Subject: O=system:masters, CN=kubernetes-admin
...

After authentication succeeds, the CN (Common Name) specified when the certificate was signed, kubernetes-admin, is taken as the request's user name (User Name), and the O (Organization) field is taken as the group the user belongs to (Group), group = system:masters; both are then passed to the authorization modules.
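
If you only care about these two fields, openssl can print the subject directly (the exact output format varies slightly with the openssl version):

$ openssl x509 -in kubectl.crt -noout -subject
subject=O = system:masters, CN = kubernetes-admin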

During kubeadm init, while bootstrapping the cluster, many default roles, clusterroles, rolebindings and clusterrolebindings are created. In the official k8s RBAC documentation we can find a list of default clusterroles like the following:

The first of them, the cluster-admin clusterrolebinding, is bound to the system:masters group, which matches exactly the identity information handed over from the authentication stage. Following the cluster-admin clusterrolebinding associated with the system:masters group, the answer surfaces.

Let's inspect that binding:

$ kubectl describe clusterrolebinding cluster-admin
Name:         cluster-admin
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate: true
Role:
  Kind:  ClusterRole
  Name:  cluster-admin
Subjects:
  Kind   Name            Namespace
  ----   ----            ---------
  Group  system:masters

We can see that a clusterrolebinding named cluster-admin binds the cluster-admin cluster role to the system:masters Group, granting every user in the system:masters Group the permissions held by the cluster-admin role.

Now let's look at the concrete permissions held by the cluster-admin role:

$ kubectl describe clusterrole cluster-admin
Name:         cluster-admin
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate: true
PolicyRule:
  Resources  Non-Resource URLs  Resource Names  Verbs
  ---------  -----------------  --------------  -----
  *.*        []                 []              [*]
             [*]                []              [*]

The second rule covers non-resource URLs, for example checking the cluster's health.
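
For example, this is the rule that allows requests against non-resource URLs such as /healthz through the apiserver:

$ kubectl get --raw='/healthz'
ok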

RBAC

Role-Based Access Control. It is enabled by adding --authorization-mode=RBAC to the apiserver start-up arguments; clusters installed with kubeadm have it enabled by default.

Official documentation

Check on the k8s-master node whether the apiserver process has it enabled:

$ ps aux |grep apiserver

RBAC mode introduces 4 resources:

  • Role

A Role only grants access to resources within a single namespace.

## Example: define a role named pod-reader with permission to read pods in the default namespace
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: default
  name: pod-reader
rules:
- apiGroups: [""] # "" indicates the core API group
  resources: ["pods"]
  verbs: ["get", "watch", "list"]

## apiGroups: "", "apps", "autoscaling", "batch", ...   see kubectl api-versions
## resources: "services", "pods", "deployments", ...    see kubectl api-resources
## verbs: "get", "list", "watch", "create", "update", "patch", "delete", "exec"
  • ClusterRole

A ClusterRole can grant the same permissions as a Role, but its scope is cluster-wide.

## Define a cluster role named secret-reader that can read secrets in all namespaces
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  # "namespace" omitted since ClusterRoles are not namespaced
  name: secret-reader
rules:
- apiGroups: [""]
  resources: ["secrets"]
  verbs: ["get", "watch", "list"]
  • RoleBinding

Grants the permissions defined in a role to users or groups. A RoleBinding contains subjects (users, groups, or service accounts) and a reference to the role being granted. Use a RoleBinding for authorization within a namespace and a ClusterRoleBinding for cluster-wide authorization.

## Define a role binding that grants the pod-reader role to the user jane, so that jane can read all Pods in the default namespace
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: read-pods
  namespace: default
subjects:
- kind: User   # can be User, Group or ServiceAccount
  name: jane
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: Role   # can be Role or ClusterRole; if a ClusterRole, the permissions are still limited to the RoleBinding's namespace
  name: pod-reader # match the name of the Role or ClusterRole you wish to bind to
  apiGroup: rbac.authorization.k8s.io

Note: a RoleBinding can bind either a Role or a ClusterRole. When it binds a ClusterRole, the subjects' permissions are still confined to the namespace of the RoleBinding; to grant access across namespaces, use a ClusterRoleBinding.

## Define a role binding that binds the user dave to the secret-reader cluster role; although secret-reader is a cluster role, because it is bound with a RoleBinding, dave's permissions are limited to the development namespace
apiVersion: rbac.authorization.k8s.io/v1
# This role binding allows "dave" to read secrets in the "development" namespace.
# You need to already have a ClusterRole named "secret-reader".
kind: RoleBinding
metadata:
  name: read-secrets
  #
  # The namespace of the RoleBinding determines where the permissions are granted.
  # This only grants permissions within the "development" namespace.
  namespace: development
subjects:
- kind: User
  name: dave # Name is case sensitive
  apiGroup: rbac.authorization.k8s.io
- kind: ServiceAccount
  name: dave # Name is case sensitive
  namespace: demo
roleRef:
  kind: ClusterRole
  name: secret-reader
  apiGroup: rbac.authorization.k8s.io

Consider a scenario: several namespaces in the cluster are handed to different administrators, and the permissions in each namespace are identical. In that case you can define a single ClusterRole and bind each administrator in their own namespace with a RoleBinding; otherwise you would have to define a Role in every namespace and create a RoleBinding for each of them.

  • ClusterRoleBinding

Grants permissions across namespaces.

apiVersion: rbac.authorization.k8s.io/v1
# This cluster role binding allows anyone in the "manager" group to read secrets in any namespace.
kind: ClusterRoleBinding
metadata:
  name: read-secrets-global
subjects:
- kind: Group
  name: manager # Name is case sensitive
  apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: secret-reader
  apiGroup: rbac.authorization.k8s.io
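
The same objects can also be created imperatively, and kubectl auth can-i is a convenient way to verify the result; a small sketch re-using the secret-reader ClusterRole from the examples above:

## Grant secret-reader inside a single namespace (RoleBinding referencing a ClusterRole)
$ kubectl create rolebinding read-secrets --clusterrole=secret-reader --user=dave -n development
## Grant it cluster-wide (ClusterRoleBinding)
$ kubectl create clusterrolebinding read-secrets-global --clusterrole=secret-reader --group=manager
## Verify what a subject is allowed to do
$ kubectl auth can-i list secrets -n development --as=dave     # yes
$ kubectl auth can-i list secrets -n kube-system --as=dave     # no, only within the RoleBinding's namespace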
Authentication and authorization for the kubelet

Check the kubelet process:

$ systemctl status kubelet
● kubelet.service - kubelet: The Kubernetes Node Agent
Loaded: loaded (/etc/systemd/system/kubelet.service; enabled; vendor preset: disabled)
Drop-In: /etc/systemd/system/kubelet.service.d
└─10-kubeadm.conf
Active: active (running) since Wed 2020-04-01 02:34:13 CST; 1 day 14h ago
Docs: https://kubernetes.io/docs/
Main PID: 851 (kubelet)
Tasks: 21
Memory: 127.1M
CGroup: /system.slice/kubelet.service
└─851 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf

Look at /etc/kubernetes/kubelet.conf and decode the certificate:

$ echo xxxxx |base64 -d >kubelet.crt
$ openssl x509 -in kubelet.crt -text
Certificate:
Data:
Version: 3 (0x2)
Serial Number: 9059794385454520113 (0x7dbadafe23185731)
Signature Algorithm: sha256WithRSAEncryption
Issuer: CN=kubernetes
Validity
Not Before: Feb 10 07:33:39 2020 GMT
Not After : Feb 9 07:33:40 2021 GMT
Subject: O=system:nodes, CN=system:node:master-1

We get the content we expected:

Subject: O=system:nodes, CN=system:node:k8s-master

We know that k8s uses O as the Group of the request, so if any permissions are bound to this group, they must show up in some clusterrolebinding definition. Let's look for clusterrolebindings that bind the system:nodes group:

$ kubectl get clusterrolebinding|awk 'NR>1{print $1}'|xargs kubectl get clusterrolebinding -oyaml|grep -n10 system:nodes
98- roleRef:
99- apiGroup: rbac.authorization.k8s.io
100- kind: ClusterRole
101- name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
102- subjects:
103- - apiGroup: rbac.authorization.k8s.io
104- kind: Group
105: name: system:nodes
106-- apiVersion: rbac.authorization.k8s.io/v1
107- kind: ClusterRoleBinding
108- metadata:
109- creationTimestamp: "2020-02-10T07:34:02Z"
110- name: kubeadm:node-proxier
111- resourceVersion: "213"
112- selfLink: /apis/rbac.authorization.k8s.io/v1/clusterrolebindings/kubeadm%3Anode-proxier

$ kubectl describe clusterrole system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
Name:         system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
Labels:       kubernetes.io/bootstrapping=rbac-defaults
Annotations:  rbac.authorization.kubernetes.io/autoupdate: true
PolicyRule:
  Resources                                                      Non-Resource URLs  Resource Names  Verbs
  ---------                                                      -----------------  --------------  -----
  certificatesigningrequests.certificates.k8s.io/selfnodeclient  []                 []              [create]

The result is a little surprising: apart from system:certificates.k8s.io:certificatesigningrequests:selfnodeclient, there are no system:nodes-related rolebindings, which is clearly not what we expected. Digging through the documentation turns up the following:

Default ClusterRole              Default ClusterRoleBinding            Description
system:kube-scheduler            system:kube-scheduler user            Allows access to the resources required by the scheduler component.
system:volume-scheduler          system:kube-scheduler user            Allows access to the volume resources required by the kube-scheduler component.
system:kube-controller-manager   system:kube-controller-manager user   Allows access to the resources required by the controller manager component. The permissions required by individual controllers are detailed in the controller roles.
system:node                      None                                  Allows access to resources required by the kubelet, including read access to all secrets, and write access to all pod status objects. You should use the Node authorizer and NodeRestriction admission plugin instead of the system:node role, and allow granting API access to kubelets based on the Pods scheduled to run on them. The system:node role only exists for compatibility with Kubernetes clusters upgraded from versions prior to v1.8.
system:node-proxier              system:kube-proxy user                Allows access to the resources required by the kube-proxy component.

Roughly, this says that the system:node role used to be defined so that the kubelet could access the resources it needs, including read access to all secrets and write access to pod status. Since v1.8 it is recommended to use the Node authorizer and the NodeRestriction admission plugin instead of this role.

We are currently on 1.16; check the authorization policy in use:

$ ps axu|grep apiserver
kube-apiserver --authorization-mode=Node,RBAC --enable-admission-plugins=NodeRestriction

The official documentation introduces the Node authorizer like this:

Node authorization is a special-purpose authorization mode that specifically authorizes API requests made by kubelets.

In future releases, the node authorizer may add or remove permissions to ensure kubelets have the minimal set of permissions required to operate correctly.

In order to be authorized by the Node authorizer, kubelets must use a credential that identifies them as being in the system:nodes group, with a username of system:node:<nodeName>

Service Account

As mentioned earlier, authentication can be done with certificates or with a ServiceAccount. Most of the time, when building on top of k8s, we choose the serviceaccount approach. How did we do it earlier when setting up access to the dashboard?

## Create a serviceaccount named admin and grant it the permissions of the cluster-admin cluster role
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin
  namespace: kubernetes-dashboard
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1beta1
metadata:
  name: admin
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
roleRef:
  kind: ClusterRole
  name: cluster-admin
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: admin
  namespace: kubernetes-dashboard

Let's check:

$ kubectl -n kubernetes-dashboard get sa admin -o yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  creationTimestamp: "2020-04-01T11:59:21Z"
  name: admin
  namespace: kubernetes-dashboard
  resourceVersion: "1988878"
  selfLink: /api/v1/namespaces/kubernetes-dashboard/serviceaccounts/admin
  uid: 639ecc3e-74d9-11ea-a59b-000c29dfd73f
secrets:
- name: admin-token-lfsrf

Notice that a secret named admin-token-lfsrf is attached to the serviceaccount by default. Let's look at that secret:

$ kubectl -n kubernetes-dashboard describe secret admin-token-lfsrf
Name: admin-token-lfsrf
Namespace: kubernetes-dashboard
Labels: <none>
Annotations: kubernetes.io/service-account.name: admin
kubernetes.io/service-account.uid: 639ecc3e-74d9-11ea-a59b-000c29dfd73f

Type: kubernetes.io/service-account-token
Data
====
ca.crt: 1025 bytes
namespace: 4 bytes
token: eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJkZW1vIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6ImFkbWluLXRva2VuLWxmc3JmIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiNjM5ZWNjM2UtNzRkOS0xMWVhLWE1OWItMDAwYzI5ZGZkNzNmIiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50OmRlbW86YWRtaW4ifQ.ffGCU4L5LxTsMx3NcNixpjT6nLBi-pmstb4I-W61nLOzNaMmYSEIwAaugKMzNR-2VwM14WbuG04dOeO67niJeP6n8-ALkl-vineoYCsUjrzJ09qpM3TNUPatHFqyjcqJ87h4VKZEqk2qCCmLxB6AGbEHpVFkoge40vHs56cIymFGZLe53JZkhu3pwYuS4jpXytV30Ad-HwmQDUu_Xqcifni6tDYPCfKz2CZlcOfwqHeGIHJjDGVBKqhEeo8PhStoofBU6Y4OjObP7HGuTY-Foo4QindNnpp0QU6vSb7kiOiQ4twpayybH8PTf73dtdFt46UF6mGjskWgevgolvmO8A

How to call the k8s API when developing against it:

  1. curl demo
$ curl -k -H "Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IiJ9.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJkZW1vIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZWNyZXQubmFtZSI6ImFkbWluLXRva2VuLWxmc3JmIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQubmFtZSI6ImFkbWluIiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9zZXJ2aWNlLWFjY291bnQudWlkIjoiNjM5ZWNjM2UtNzRkOS0xMWVhLWE1OWItMDAwYzI5ZGZkNzNmIiwic3ViIjoic3lzdGVtOnNlcnZpY2VhY2NvdW50OmRlbW86YWRtaW4ifQ.ffGCU4L5LxTsMx3NcNixpjT6nLBi-pmstb4I-W61nLOzNaMmYSEIwAaugKMzNR-2VwM14WbuG04dOeO67niJeP6n8-ALkl-vineoYCsUjrzJ09qpM3TNUPatHFqyjcqJ87h4VKZEqk2qCCmLxB6AGbEHpVFkoge40vHs56cIymFGZLe53JZkhu3pwYuS4jpXytV30Ad-HwmQDUu_Xqcifni6tDYPCfKz2CZlcOfwqHeGIHJjDGVBKqhEeo8PhStoofBU6Y4OjObP7HGuTY-Foo4QindNnpp0QU6vSb7kiOiQ4twpayybH8PTf73dtdFt46UF6mGjskWgevgolvmO8A" https://62.234.214.206:6443/api/v1/namespaces/demo/pods?limit=500
  2. Postman
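
The token itself can be pulled out of the secret non-interactively, which is handy for scripting; a sketch assuming the admin-token-lfsrf secret shown above:

$ TOKEN=$(kubectl -n kubernetes-dashboard get secret admin-token-lfsrf -o jsonpath='{.data.token}' | base64 -d)
$ curl -k -H "Authorization: Bearer $TOKEN" https://62.234.214.206:6443/api/v1/namespaces/demo/pods?limit=500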

Building a logging platform for the kubernetes cluster with EFK (extension)

Introduction to EFK

How EFK works (diagram)

  • Elasticsearch

An open-source, distributed, RESTful search and analytics engine built on top of the open-source library Apache Lucene. It can be described precisely as:

  • a distributed real-time document store in which every field is indexed and searchable;

  • a distributed real-time analytics search engine;

  • able to scale out to hundreds of service nodes and handle petabytes of structured or unstructured data.

  • Fluentd

A system for collecting, processing and forwarding logs. Through its rich plugin ecosystem it can collect logs from all kinds of systems and applications, convert them into a user-specified format, and forward them to the log store of the user's choice.

Fluentd scrapes log data from a given set of sources, processes it (converting it into a structured data format) and forwards it to services such as Elasticsearch, object storage, Kafka and so on. Fluentd supports more than 300 log storage and analytics services, so it is very flexible in this respect. Its main steps are:

  1. Fluentd first obtains data from multiple log sources
  2. structures and tags the data
  3. then sends the data to multiple target services according to the matching tags
  • Kibana

Kibana is an open-source analytics and visualization platform designed to work with Elasticsearch. You can use Kibana to search, view and interact with the data stored in Elasticsearch indices, easily perform advanced data analysis, and visualize the data as a variety of charts, tables and maps.

Deploying the es service
Deployment notes
  1. In production, es is deployed as a cluster, usually with a StatefulSet; because of the limited resources of the demo environment, this example deploys a single node
  2. Data is stored on a hostPath volume mounted from the node
  3. es runs its process as the elasticsearch user by default, and because the data directory is mounted from a host path, its permissions are whatever the host directory has; an init container is therefore used to fix the directory permissions before the es process starts. Note that the init container must run in privileged mode.
Deploy and verify

efk/elasticsearch.yaml

apiVersion: apps/v1
kind: StatefulSet
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    k8s-app: elasticsearch-logging
    version: v7.4.2
  name: elasticsearch-logging
  namespace: logging
spec:
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: elasticsearch-logging
      version: v7.4.2
  serviceName: elasticsearch-logging
  template:
    metadata:
      labels:
        k8s-app: elasticsearch-logging
        version: v7.4.2
    spec:
      nodeSelector:
        log: "true"  ## choose which node to run on; adjust for your environment
      containers:
      - env:
        - name: NAMESPACE
          valueFrom:
            fieldRef:
              apiVersion: v1
              fieldPath: metadata.namespace
        - name: cluster.initial_master_nodes
          value: elasticsearch-logging-0
        - name: ES_JAVA_OPTS
          value: "-Xms512m -Xmx512m"
        image: 172.21.32.6:5000/elasticsearch/elasticsearch:7.4.2
        name: elasticsearch-logging
        ports:
        - containerPort: 9200
          name: db
          protocol: TCP
        - containerPort: 9300
          name: transport
          protocol: TCP
        volumeMounts:
        - mountPath: /usr/share/elasticsearch/data
          name: elasticsearch-logging
      dnsConfig:
        options:
        - name: single-request-reopen
      initContainers:
      - command:
        - /sbin/sysctl
        - -w
        - vm.max_map_count=262144
        image: alpine:3.6
        imagePullPolicy: IfNotPresent
        name: elasticsearch-logging-init
        resources: {}
        securityContext:
          privileged: true
      - name: fix-permissions
        image: alpine:3.6
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: elasticsearch-logging
          mountPath: /usr/share/elasticsearch/data
      volumes:
      - name: elasticsearch-logging
        hostPath:
          path: /esdata
---
apiVersion: v1
kind: Service
metadata:
  labels:
    k8s-app: elasticsearch-logging
  name: elasticsearch
  namespace: logging
spec:
  ports:
  - port: 9200
    protocol: TCP
    targetPort: db
  selector:
    k8s-app: elasticsearch-logging
  type: ClusterIP
$ kubectl create namespace logging
## Label the slave1 node so that the es service gets scheduled onto it
$ kubectl label node k8s-slave1 log=true
## Deploy the service; you can pre-pull the image on the target node first
$ kubectl create -f elasticsearch.yaml
statefulset.apps/elasticsearch-logging created
service/elasticsearch created

## Wait a moment, then check that the es pod has been scheduled to the k8s-slave1 node and its status has become Running
$ kubectl -n logging get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE
elasticsearch-logging-0 1/1 Running 0 69m 10.244.1.104 k8s-slave1
# Then curl the service to verify that es was deployed successfully
$ kubectl -n logging get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
elasticsearch ClusterIP 10.109.174.58 <none> 9200/TCP 71m
$ curl 10.109.174.58:9200
{
  "name" : "elasticsearch-logging-0",
  "cluster_name" : "docker-cluster",
  "cluster_uuid" : "uic8xOyNSlGwvoY9DIBT1g",
  "version" : {
    "number" : "7.4.2",
    "build_flavor" : "default",
    "build_type" : "docker",
    "build_hash" : "2f90bbf7b93631e52bafb59b3b049cb44ec25e96",
    "build_date" : "2019-10-28T20:40:44.881551Z",
    "build_snapshot" : false,
    "lucene_version" : "8.2.0",
    "minimum_wire_compatibility_version" : "6.8.0",
    "minimum_index_compatibility_version" : "6.0.0-beta1"
  },
  "tagline" : "You Know, for Search"
}
Deploying kibana
Deployment notes
  1. kibana needs to expose a web UI to users, so an ingress with a domain name is used to access it
  2. kibana is stateless, so a plain Deployment is used to run it
  3. kibana needs to reach es; thanks to k8s service discovery it can simply use the address http://elasticsearch:9200
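
A quick way to confirm that this service discovery works before deploying kibana is to query the elasticsearch Service from a throw-away Pod (a sketch, assuming the es Service from the previous step is up in the logging namespace):

$ kubectl -n logging run es-test --rm -it --restart=Never --image=alpine:3.6 -- \
    wget -qO- http://elasticsearch:9200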
Deploy and verify

Resource file: efk/kibana.yaml

apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: 172.21.32.6:5000/kibana/kibana:7.4.2
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
        - name: ELASTICSEARCH_URL
          value: http://elasticsearch:9200
        ports:
        - containerPort: 5601
---
apiVersion: v1
kind: Service
metadata:
  name: kibana
  namespace: logging
  labels:
    app: kibana
spec:
  ports:
  - port: 5601
    protocol: TCP
    targetPort: 5601
  type: ClusterIP
  selector:
    app: kibana
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: kibana
  namespace: logging
spec:
  rules:
  - host: kibana.devops.cn
    http:
      paths:
      - path: /
        backend:
          serviceName: kibana
          servicePort: 5601
$ kubectl create -f kibana.yaml
deployment.apps/kibana created
service/kibana created
ingress/kibana created

# Then watch the pods until the status becomes Running
$ kubectl -n logging get po
NAME READY STATUS RESTARTS AGE
elasticsearch-logging-0 1/1 Running 0 88m
kibana-944c57766-ftlcw 1/1 Running 0 15m

## Configure DNS resolution for kibana.devops.cn and open the service to verify; if the page loads, the connection to es is working
Deploying fluentd
Deployment notes
  1. fluentd is the log collection service; every workload node in the kubernetes cluster produces logs, so it is deployed as a DaemonSet
  2. to further control resource usage, the DaemonSet uses a node selector label, fluentd=true, as an extra filter: only nodes carrying this label run fluentd
  3. log collection needs to know which directories to collect from and where to ship the logs (es), so there is a fair amount of configuration; we mount the whole configuration from ConfigMaps
Deploy the service

Configuration file efk/fluentd-es-main.yaml

apiVersion: v1
data:
  fluent.conf: |-
    # This is the root config file, which only includes components of the actual configuration
    #
    # Do not collect fluentd's own logs to avoid infinite loops.
    <match fluent.**>
      @type null
    </match>

    @include /fluentd/etc/config.d/*.conf
kind: ConfigMap
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
  name: fluentd-es-config-main
  namespace: logging

Configuration file fluentd-config.yaml; points to note:

  1. the source configuration: by default k8s redirects containers' stdout and stderr logs to the host
  2. the kubernetes_metadata_filter plugin is included by default to parse the log format and enrich records with k8s metadata (raw.kubernetes)
  3. the flush settings of the match section that outputs to es
kind: ConfigMap
apiVersion: v1
metadata:
  name: fluentd-config
  namespace: logging
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
data:
  system.conf: |-
    <system>
      root_dir /tmp/fluentd-buffers/
    </system>
  containers.input.conf: |-
    <source>
      @id fluentd-containers.log
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/es-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      localtime
      tag raw.kubernetes.*
      format json
      read_from_head true
    </source>
    # Detect exceptions in the log output and forward them as one log entry.
    <match raw.kubernetes.**>
      @id raw.kubernetes
      @type detect_exceptions
      remove_tag_prefix raw
      message log
      stream stream
      multiline_flush_interval 5
      max_bytes 500000
      max_lines 1000
    </match>
  forward.input.conf: |-
    # Takes the messages sent over TCP
    <source>
      @type forward
    </source>
  output.conf: |-
    # Enriches records with Kubernetes metadata
    <filter kubernetes.**>
      @type kubernetes_metadata
    </filter>
    <match **>
      @id elasticsearch
      @type elasticsearch
      @log_level info
      include_tag_key true
      host elasticsearch
      port 9200
      logstash_format true
      request_timeout 30s
      <buffer>
        @type file
        path /var/log/fluentd-buffers/kubernetes.system.buffer
        flush_mode interval
        retry_type exponential_backoff
        flush_thread_count 2
        flush_interval 5s
        retry_forever
        retry_max_interval 30
        chunk_limit_size 2M
        queue_limit_length 8
        overflow_action block
      </buffer>
    </match>

DaemonSet definition file fluentd.yaml; points to note:

  1. RBAC rules are required, because fluentd needs to query the k8s API for metadata about the logs
  2. the /var/log/containers/ directory must be mounted into the container
  3. the configuration files from the fluentd ConfigMaps must be mounted into the container
  4. nodes that should run fluentd must carry the fluentd=true label

efk/fluentd.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd-es
  namespace: logging
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd-es
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
rules:
- apiGroups:
  - ""
  resources:
  - "namespaces"
  - "pods"
  verbs:
  - "get"
  - "watch"
  - "list"
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd-es
  labels:
    k8s-app: fluentd-es
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
subjects:
- kind: ServiceAccount
  name: fluentd-es
  namespace: logging
  apiGroup: ""
roleRef:
  kind: ClusterRole
  name: fluentd-es
  apiGroup: ""
---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    k8s-app: fluentd-es
  name: fluentd-es
  namespace: logging
spec:
  selector:
    matchLabels:
      k8s-app: fluentd-es
  template:
    metadata:
      labels:
        k8s-app: fluentd-es
    spec:
      containers:
      - env:
        - name: FLUENTD_ARGS
          value: --no-supervisor -q
        image: 172.21.32.6:5000/fluentd-es-root:v1.6.2-1.0
        imagePullPolicy: IfNotPresent
        name: fluentd-es
        resources:
          limits:
            memory: 500Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - mountPath: /var/log
          name: varlog
        - mountPath: /var/lib/docker/containers
          name: varlibdockercontainers
          readOnly: true
        - mountPath: /home/docker/containers
          name: varlibdockercontainershome
          readOnly: true
        - mountPath: /fluentd/etc/config.d
          name: config-volume
        - mountPath: /fluentd/etc/fluent.conf
          name: config-volume-main
          subPath: fluent.conf
      nodeSelector:
        fluentd: "true"
      securityContext: {}
      serviceAccount: fluentd-es
      serviceAccountName: fluentd-es
      volumes:
      - hostPath:
          path: /var/log
          type: ""
        name: varlog
      - hostPath:
          path: /var/lib/docker/containers
          type: ""
        name: varlibdockercontainers
      - hostPath:
          path: /home/docker/containers
          type: ""
        name: varlibdockercontainershome
      - configMap:
          defaultMode: 420
          name: fluentd-config
        name: config-volume
      - configMap:
          defaultMode: 420
          items:
          - key: fluent.conf
            path: fluent.conf
          name: fluentd-es-config-main
        name: config-volume-main
## Label slave1 and slave2 so that the fluentd log collector is deployed on them
$ kubectl label node k8s-slave1 fluentd=true
node/k8s-slave1 labeled
$ kubectl label node k8s-slave2 fluentd=true
node/k8s-slave2 labeled

# Create the services
$ kubectl create -f fluentd-es-config-main.yaml
configmap/fluentd-es-config-main created
$ kubectl create -f fluentd-configmap.yaml
configmap/fluentd-config created
$ kubectl create -f fluentd.yaml
serviceaccount/fluentd-es created
clusterrole.rbac.authorization.k8s.io/fluentd-es created
clusterrolebinding.rbac.authorization.k8s.io/fluentd-es created
daemonset.extensions/fluentd-es created

## Then check whether the pods have started successfully on k8s-slave1 and k8s-slave2
$ kubectl -n logging get po -o wide
NAME READY STATUS RESTARTS AGE
elasticsearch-logging-0 1/1 Running 0 123m
fluentd-es-246pl 1/1 Running 0 2m2s
fluentd-es-4e21w 1/1 Running 0 2m10s
kibana-944c57766-ftlcw 1/1 Running 0 50m
Verifying EFK
Verification approach

Start a workload on k8s-slave1 and slave2 that prints test logs to stdout, then check in kibana whether the logs are collected.

Create a test container
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  nodeSelector:
    fluentd: "true"
  containers:
  - name: count
    image: alpine:3.6
    args: [/bin/sh, -c,
           'i=0; while true; do echo "$i: $(date)"; i=$((i+1)); sleep 1; done']
$ kubectl get po
NAME READY STATUS RESTARTS AGE
counter 1/1 Running 0 6s
Configuring kibana

Log in to the kibana UI and follow the steps shown in the screenshots:

You can also filter the log data by other metadata: click any log entry to see additional metadata such as the container name, Kubernetes node, namespace and so on, and filter with, for example, kubernetes.pod_name : counter
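
If no entries show up in kibana, it can help to check directly on the es side whether fluentd has created any logstash-* indices (a sketch using the Service ClusterIP from earlier, which will differ in your environment):

$ curl "10.109.174.58:9200/_cat/indices?v" | grep logstash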

That's it: we have successfully deployed EFK on the Kubernetes cluster. To learn how to analyze log data with Kibana, see the Kibana user guide: https://www.elastic.co/guide/en/kibana/current/index.html
