OpenKruise Self-assessment

This assessment was created by community members as part of the Security Pals process, and is currently pending changes from the maintainer team.

Authors and collaborators

Self-assessment outline

Metadata

Assessment Stage | Incomplete
Software | https://github.com/openkruise/kruise
Security Provider | No. OpenKruise is an extended component suite for Kubernetes, which mainly focuses on application automations, such as deployment, upgrade, ops and availability protection.
Languages | Go, Makefile, Dockerfile, Shell
SBOM | OpenKruise does not currently generate an SBOM on release

Doc | url
Security file | SECURITY.md
Documentation | https://openkruise.io/docs/

Overview

OpenKruise is an open-source project that focuses on extending the capabilities of Kubernetes. It provides a set of custom controllers and tools to enhance and simplify application lifecycle management on Kubernetes clusters.

OpenKruise aims to address various aspects of application management, including rolling updates, canary releases, blue-green deployments, and more. It is designed to help users automate and manage the deployment and scaling of applications on Kubernetes, with additional features beyond the standard Kubernetes functionality.

Background

Kubernetes is an open-source platform for declaratively configuring and automating containerized applications, which are referred to as workloads and services. While Kubernetes provides some features for deploying and managing applications, many operators of large-scale production clusters find them insufficient.

OpenKruise is an extended component suite for Kubernetes, which mainly focuses on automated management of large-scale applications, such as deployment, upgrade, ops and availability protection.

A Kubernetes cluster is split into two main groups of components: Control Plane Components, which make global decisions about the cluster, and Node Components, which manage Pods and provide the runtime environment. OpenKruise is a set of additional components that extend the Kubernetes API to support advanced use cases. In-place Update, which updates container images in existing Pods, is one of its key features. Other key features are Advanced Workloads, Bypass Application Management, High-availability protection, and High-level operation features.

Most features developed by OpenKruise are based on CRD extensions. This is the recommended way of extending functionality and doesn’t require additional dependencies. With Custom Resource Definitions (CRD) you can create new resource types without adding a new API Server.
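
To make the CRD-based approach more concrete, the following Python sketch builds a manifest for a CloneSet, one of the resource types OpenKruise registers under the apps.kruise.io/v1alpha1 API group. The helper name, resource name, labels, replica count, and image are hypothetical values chosen for illustration; the manifest is only printed, not applied to a cluster.

```python
import json


def cloneset_manifest(name, image, replicas):
    """Build a CloneSet manifest as a plain dict.

    The name, image, and replica count are illustrative assumptions.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps.kruise.io/v1alpha1",  # API group served via a CRD
        "kind": "CloneSet",
        "metadata": {"name": name},
        "spec": {
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {"containers": [{"name": "main", "image": image}]},
            },
            # Prefer updating containers in place over recreating Pods.
            "updateStrategy": {"type": "InPlaceIfPossible"},
        },
    }


manifest = cloneset_manifest("demo", "nginx:1.25", replicas=3)
print(json.dumps(manifest, indent=2))
```

Because the resource type is defined by a CRD, the kube-apiserver accepts a manifest like this through the same API machinery it uses for built-in resources.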

Actors

  • kruise-manager
    • kruise-controller
    • kruise-webhook
  • kruise-daemon
  • kube-apiserver

kruise-manager

This is a control plane component that runs controllers and webhooks. Logically, each controller is a separate process, but to reduce complexity they are compiled into a single binary and run together. It consists of a kruise-controller and a kruise-webhook.

  • kruise-controller - Responsible for checking that resources are in the desired state defined by the user’s configuration. It checks the resources across all of the nodes and ensures they are up to date.

  • kruise-webhook - Used for admission control. It intercepts, validates, and potentially mutates requests coming from the user. The kruise-webhook is critical because the kube-apiserver will fail a request if the call to the kruise-webhook fails.
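
The fail-closed behaviour described for the kruise-webhook can be illustrated with a small model. This is a sketch, not OpenKruise or Kubernetes code: `admit`, `WebhookUnavailable`, and the two sample webhook functions are hypothetical names, and both the rule being validated (a non-negative replica count) and the default being filled in are invented for the example.

```python
class WebhookUnavailable(Exception):
    """Raised when the (hypothetical) webhook cannot be reached."""


def admit(request, webhook):
    """Model of an API server calling an admission webhook, failing closed.

    Returns a tuple (allowed, admitted_object_or_reason).
    """
    try:
        allowed, result = webhook(request)
    except WebhookUnavailable as exc:
        # Fail closed: if the webhook call fails, the request is rejected.
        return False, f"webhook call failed: {exc}"
    return allowed, result


def sample_webhook(request):
    """Validate the request, then mutate a copy by filling in a default."""
    if request.get("replicas", 0) < 0:
        return False, "replicas must not be negative"
    mutated = dict(request)
    mutated.setdefault("minReadySeconds", 0)  # invented default for the sketch
    return True, mutated


def broken_webhook(request):
    """Stand-in for a webhook that is down."""
    raise WebhookUnavailable("connection refused")


ok = admit({"replicas": 3}, sample_webhook)
rejected = admit({"replicas": -1}, sample_webhook)
failed = admit({"replicas": 3}, broken_webhook)
print(ok, rejected, failed, sep="\n")
```

The third call shows why availability of the webhook matters: a valid request is still rejected when the webhook cannot be reached.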

kruise-daemon

This runs on every node and manages tasks such as image pre-download and container restarts. It interacts with the kruise-manager indirectly, through the API server.

kube-apiserver

The Kubernetes API server validates and configures data for API objects, including pods, services, replication controllers, and others. It provides the frontend to the cluster’s shared state.

Actions

Manual command

When a user executes a kubectl-kruise command, such as scale or rollout, the CLI calls the kruise-manager, which is part of the control plane. Simultaneously, the kruise-daemon calls the kruise-manager through the kube-apiserver. Finally, the kruise-daemon executes an operation on the node, such as restarting a container.

InPlace Update

When the kruise-manager starts to update a Pod, it updates the changed fields in the Pod, which prompts the kubelet to stop the old container. The kubelet then pulls the image, creates the new container, and starts it. Finally, the kubelet updates the Pod status, and the kruise-manager updates the Pod’s in-place update condition to mark it as ready.
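
The in-place update sequence can be modelled with a toy simulation. It is a simplified sketch rather than OpenKruise code: the `Pod` class, its fields, and the function names are hypothetical, the step that marks the Pod as not ready before the update is an assumption of this model, and the real components communicate through the kube-apiserver rather than by direct function calls.

```python
from dataclasses import dataclass, field


@dataclass
class Pod:
    """Toy Pod: desired images (spec) versus running images (status)."""
    uid: str
    spec_images: dict
    running_images: dict = field(default_factory=dict)
    restart_counts: dict = field(default_factory=dict)
    in_place_update_ready: bool = True


def manager_start_update(pod, container, new_image):
    """kruise-manager step: change the image in the Pod spec."""
    pod.in_place_update_ready = False  # assumption: not ready while updating
    pod.spec_images[container] = new_image


def kubelet_sync(pod):
    """kubelet step: restart containers whose running image is outdated."""
    for name, image in pod.spec_images.items():
        if pod.running_images.get(name) != image:
            # Stop the old container, pull the image, start the new container.
            pod.running_images[name] = image
            pod.restart_counts[name] = pod.restart_counts.get(name, 0) + 1


def manager_finish_update(pod):
    """kruise-manager step: mark ready once the status matches the spec."""
    if pod.running_images == pod.spec_images:
        pod.in_place_update_ready = True


pod = Pod(uid="pod-1", spec_images={"app": "app:v1"},
          running_images={"app": "app:v1"})
manager_start_update(pod, "app", "app:v2")
kubelet_sync(pod)
manager_finish_update(pod)
print(pod)
```

In this model the Pod object keeps the same identity throughout; only the container is restarted with the new image.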

Goals

General goals
  • OpenKruise plays a complementary role to Kubernetes
  • Most features work on Kubernetes clusters without any other dependencies
  • Provide Advanced Workloads which extend basic, default Workloads
  • Decoupled Application Management to manage apps without modifying Workloads
  • Provide reliable In-Place update for updating existing Pod images
  • High-availability Protection provides extended ways of protecting availability
  • OpenKruise simplifies sidecar injection and enables sidecar In-Place update. OpenKruise also enhances the sidecar startup and termination control.
  • Multi-domain Management: This can help you manage applications over nodes with multiple domains, such as different node pools, availability zones, architectures (x86 & arm) or node types (kubelet and virtual kubelet)
Security goals
  • OpenKruise components should be protected and robust against tampering
  • Authenticate and authorize access to OpenKruise control plane components
  • Protect the OpenKruise control plane from being compromised

Non-goals

General non-goals
  • OpenKruise is not a PaaS and it will not provide any abilities of PaaS
  • Providing ways of managing containers without Kubernetes
  • Replacement for Kubernetes: OpenKruise is not intended as a means to replace Kubernetes; rather, it is an extension to Kubernetes that provides extra features on top.
  • Standardization of deployment strategies: Although OpenKruise introduces several advanced deployment strategies, it does not aim to standardize these strategies across Kubernetes clusters. Users are free to adapt a strategy to their needs.
Security non-goals
  • OpenKruise doesn’t provide additional security for Kubernetes
  • No guarantee of security when an attacker has compromised the system

Self-assessment use

This self-assessment was created by a group of Security Pals with help from the maintainers of OpenKruise to perform an analysis of the project’s security. It is not intended to provide a security audit of OpenKruise, or function as an independent assessment or attestation of OpenKruise’s security health.

This document serves to provide OpenKruise users with an initial understanding of OpenKruise’s security, where to find existing security documentation, OpenKruise’s plans for security, and a general overview of OpenKruise security practices, covering both the development of OpenKruise and the security of OpenKruise itself.

This document provides the CNCF TAG-Security with an initial understanding of OpenKruise to assist in a joint-assessment, necessary for projects under incubation. Taken together, this document and the joint-assessment serve as a cornerstone for if and when OpenKruise seeks graduation and is preparing for a security audit.

Security functions and features

Critical

  • Security scanning with Snyk in the CI pipeline identifies vulnerabilities in container images so only verified images are displayed.
  • Security scanning with CodeQL in the CI pipeline identifies variants of known security vulnerabilities in the codebase.
  • Supporting only recent software versions that provide patches and updates mitigates general vulnerabilities.

Security Relevant

  • Regularly scanning the code in the main (master) and nightly builds, as well as in pull requests (PRs) for the Go programming language helps identify any potential vulnerabilities or issues before release.
  • Scanning the container images that are published on the GitHub Container Registry ensures that the images, which are used to run OpenKruise in a Kubernetes environment, are secure.

Threat Model

See OpenKruise Threat Model for details

Project compliance

OpenKruise does not document meeting particular compliance standards

Secure development practices

Contributing guidelines

  • The Kruise project has clear contributing guidelines
  • Anyone is encouraged to submit an issue, code, or documentation change
  • The guidelines include additional information for building and testing your code locally
  • Proposals should be submitted before making a significant change
  • Decisions are made based on consensus between maintainers. Proposals and ideas can be submitted for agreement via either a GitHub issue or a PR.
Development pipeline

All source code is publicly available on GitHub

  • Submitting a PR is the only way to change Kruise project files
  • The process for submitting a PR is to fork the main repository, clone the project from your fork, set the remote upstream for syncing changes, and create a branch to develop on that will be used to submit features.
  • A PR description template is provided to keep descriptions focused
  • An OWNERS file specifies the approvers and reviewers enforced by GitHub in the PR process. More information about OWNERS files is available from the Kubernetes project.
  • Multiple automated checks run through GitHub Actions when a PR is created. See the workflows directory for the YAML files that specify each job listed below. All automated checks need to pass before a PR can be merged.
    • CodeQL (Static Code Analysis)
    • DCO (Enforces a Signed-off-by line on every commit)
    • E2E-1.20-EphemeralJob
    • E2E-1.24 (Some automated tests)
    • E2E-1.16 (Some automated tests)
    • CI (Miscellaneous continuous integration)
    • Spell check
    • golangci-lint
    • markdownlint-misspell-shellcheck
    • unit-tests
    • License (Unapproved license check)
    • Code scanning (Automated Trivy scanning)
  • A code coverage report is generated automatically with codecov.io for each submitted PR
  • At least 1 approving review is required to merge a pull request
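
As an illustration of what the DCO check in the list above looks for, the sketch below tests whether a commit message carries a `Signed-off-by:` trailer, which is what `git commit -s` adds. It is a simplified stand-in, not the check the project runs: the function name and regular expression are assumptions, and a real DCO check typically also compares the sign-off with the commit author.

```python
import re

# A sign-off line has the form "Signed-off-by: Name <email>".
SIGNOFF = re.compile(r"^Signed-off-by: .+ <[^<>\s]+@[^<>\s]+>$", re.MULTILINE)


def has_signoff(commit_message):
    """Return True if the commit message contains a sign-off line."""
    return SIGNOFF.search(commit_message) is not None


signed = "Fix typo in docs\n\nSigned-off-by: Jane Doe <jane@example.com>\n"
unsigned = "Fix typo in docs\n"
print(has_signoff(signed), has_signoff(unsigned))
```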
Development security policy
  • Container images are scanned in every PR with Snyk to detect new vulnerabilities
  • Additional measures of security are in the process of being implemented
    • Scan Go code in master and nightly builds and in PRs.
    • Scan published container images on GitHub Container Registry.
Release process

The entire release process is covered in detail in the repository

  • The changelog is updated manually each time a release is created. The individual in charge of the release is expected to update the changelog with relevant user-facing information.
  • Documentation is manually published to update what’s on the website
  • Creating a new release in the releases page triggers a GitHub Workflow. This includes the automated creation of a new image with the latest code tagged with the right version.
  • The Helm Chart needs to be prepared for shipping the update. There is a separate repository that contains all of the charts and where new charts are added. A new chart version is created, and the CRDs and Kubernetes resources are updated based on the release artifact. (Check what exactly it means to update these resources)
  • A PR is submitted to merge the new release, and merging it publishes the chart automatically
Communication Channels
Internal

Team members communicate with each other through a range of mediums. There is an invite-only Slack channel, a DingTalk group, and a WeChat group. There are also bi-weekly Community Meetings held in both Chinese and English.

Inbound

Users communicate with the team through docs, issues, and discussions

Outbound

Team members communicate with users through the website and changelog

Ecosystem

OpenKruise is used by Kubernetes users to extend the functionality of Kubernetes into something that better fits their needs and use cases when running production apps. It is installed directly by users and administrators of Kubernetes. OpenKruise is a CNCF (Cloud Native Computing Foundation) project.

Security issue resolution

  • Responsible Disclosures Process: OpenKruise has a responsible disclosure process for reporting security vulnerabilities. This process is designed to ensure that vulnerabilities are handled in a timely and effective manner. The process can be found here: https://github.com/openkruise/kruise/security/policy
  • Security researchers can report vulnerabilities confidentially by emailing cncf-openkruise-maintainers@lists.cncf.io.
  • GitHub: Security-related issues can be reported through GitHub issues at https://github.com/openkruise/kruise/issues
  • Reporters can expect a response from the maintainers within 2 business days.
  • The maintainers are responsible for triaging the severity of reported issues and determining the appropriate remediation plan
  • Disclosures: OpenKruise encourages the community to assist in identifying security breaches. In the event of a confirmed breach, reporters will receive full credit and have the option to be kept informed.
  • If you know of a publicly disclosed security vulnerability, you should immediately email the OpenKruise maintainers at cncf-openkruise-maintainers@lists.cncf.io.
  • Remediation: Kruise commits to supporting the current major release back to its n-2 minor version, as well as the last minor version of the previous major release
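
One reading of the remediation policy above is that fixes are provided for the three most recent minor versions of the current major release (n, n-1, and n-2) plus the last minor version of the previous major release. The sketch below computes that set under this assumed interpretation; the function name and the version numbers are hypothetical and do not correspond to actual OpenKruise releases.

```python
def supported_versions(released):
    """Return the supported (major, minor) pairs from a list of releases.

    Assumed policy: the latest three minors of the current major release,
    plus the last minor of the previous major release.
    """
    released = sorted(set(released))
    if not released:
        return []
    current_major = released[-1][0]
    current = [v for v in released if v[0] == current_major][-3:]
    # The highest version below the current major is the last minor
    # of the previous major release.
    previous = [v for v in released if v[0] < current_major][-1:]
    return previous + current


# Hypothetical release history, given as (major, minor) pairs.
releases = [(0, 9), (0, 10), (1, 0), (1, 1), (1, 2), (1, 3)]
supported = supported_versions(releases)
print(supported)
```

Comparing versions as integer tuples rather than strings keeps the ordering correct when a minor version reaches two digits.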

Communication

GitHub Security Advisory will be used to communicate during the identification, fixing, and shipping of vulnerability mitigations.

The advisory becomes public only when the patched version is released to inform the community about the breach and its potential security impact.

Appendix

  • Known Issues Over Time
    OpenKruise has no known security vulnerabilities reported to date, including in the tools and frameworks that it uses (e.g. Golang vulnerabilities).
  • CII Best Practices
    OpenKruise has not yet attained a badge from the Open Source Security Foundation (OpenSSF); its progress toward the passing-level criteria stands at 30%.
    OpenSSF Best Practices
  • Case Studies
    Many organisations have adopted OpenKruise, including:
    • Alibaba Group: also known as Alibaba, a Chinese multinational technology company specializing in e-commerce, retail, Internet, and technology. Alibaba has made its core systems fully cloud-native, manages more than 100,000 OpenKruise workloads, and has gained an 80% improvement in deployment efficiency. Alibaba uses many OpenKruise workloads, including CloneSet, Advanced StatefulSet, SidecarSet, and Advanced DaemonSet. Their story has been presented in many blog posts.
    • Ctrip: a Chinese multinational online travel company, is using OpenKruise advanced workloads to build their cloud-native PaaS platform. They rely on the in-place update feature of OpenKruise and manage more than 28,000 CloneSets and 200+ Advanced StatefulSets. Their story has been presented in a KubeMeet session.
    • Oppo: a Chinese consumer electronics manufacturer, is using OpenKruise to manage large-scale stateful applications. Oppo relies on the in-place update feature of OpenKruise, and has even customized Kubernetes so that OpenKruise can be extended to update fields other than container images in place. They share their story in a blog post.
    • Ant Group: formerly known as Ant Financial, a world-leading internet open platform, owns the world’s largest online payment platform, Alipay. Ant Group chose Kubernetes to orchestrate the tens-of-thousands-of-node clusters in its data centers. To manage node agents on these nodes, they chose the OpenKruise Advanced DaemonSet, using its enhanced rolling strategies such as rolling selector and partition rolling. They have consented to share necessary details privately with the TOC, if required.
    • LinkedIn: a leading business and employment-oriented online service in America, is using OpenKruise CloneSet to manage large-scale workloads for its in-place update and enhanced PVC support features. In addition, they’re evaluating the container launch priority feature to ensure their configuration update sequence in Pod creation as well as container in-place update scenarios.
  • Related Projects / Vendors
    • Istio - Istio is a service mesh that provides a uniform way to secure, connect, and monitor microservices. It manages the communication between services in a Kubernetes cluster.
      Istio primarily focuses on service mesh features such as traffic management, security, and observability. OpenKruise is geared towards enhancing application deployment strategies, offering features beyond service communication.
    • KubeVela - KubeVela is a modern application delivery framework for Kubernetes, providing higher-level abstractions for defining, deploying, and managing applications.
      Both KubeVela and OpenKruise provide higher-level abstractions, but they may differ in their approach to application delivery and management. OpenKruise offers advanced deployment strategies through controllers, whereas KubeVela may have a different emphasis in its framework.
      There are also plans for OpenKruise to integrate with other open-source products from related fields, like KubeVela, to build a more complete cloud-native application system.
    • ArgoCD - ArgoCD is a declarative GitOps continuous delivery tool for Kubernetes. It automates the deployment of applications based on configurations stored in Git repositories, ensuring the desired state is maintained.
      While ArgoCD excels in GitOps and continuous delivery, OpenKruise focuses on extending Kubernetes controllers to offer advanced deployment strategies. OpenKruise provides features like rolling updates, canary releases, and blue-green deployments, offering a broader range of options for application lifecycle management.
    • FluxCD - FluxCD is a GitOps tool for Kubernetes, ensuring the cluster’s state aligns with the Git repository configuration. It automates the deployment of applications by continuously monitoring and applying changes from the repository.
      FluxCD is heavily focused on GitOps practices, while OpenKruise emphasizes advanced deployment strategies. OpenKruise’s controllers allow users to define more sophisticated deployment workflows beyond GitOps.
    • Knative - Knative is a set of components for building modern, serverless applications on Kubernetes. It abstracts away infrastructure complexities for serverless workloads.
      Knative is more oriented toward serverless computing, while OpenKruise concentrates on traditional application deployment and management strategies. OpenKruise’s controllers provide features like rolling updates and canary releases for more controlled application updates.