Run Deployment
This page shows how to use Kueue’s scheduling and resource management capabilities when running Deployments. Although Kueue does not yet support managing a Deployment as a single Workload, it can still schedule and manage resources for the individual Pods of the Deployment.
Scheduling Deployments in Kueue builds on the Plain Pod integration, where every Pod from a Deployment is represented as a single independent Plain Pod. Because the resources of each Pod are managed independently, the Deployment can scale out and scale in.
This guide is for serving users that have a basic understanding of Kueue. For more information, see Kueue’s overview.
Before you begin
- Learn how to install Kueue with a custom manager configuration.
- Ensure that you have the deployment integration enabled, for example:
  apiVersion: config.kueue.x-k8s.io/v1beta1
  kind: Configuration
  integrations:
    frameworks:
    - "deployment"
  Pod integration requirements
  Since Kueue v0.15, you don’t need to explicitly enable the "pod" integration to use the "deployment" integration. For Kueue v0.14 and earlier, the "pod" integration must be explicitly enabled. See Run Plain Pods for configuration details.
- Check Administer cluster quotas for details on the initial Kueue setup.
Running a Deployment admitted by Kueue
When running a Deployment on Kueue, consider the following aspects:
a. Queue selection
The target local queue should be specified in the metadata.labels section of the Deployment configuration.
metadata:
  labels:
    kueue.x-k8s.io/queue-name: user-queue
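The label value must match the name of a LocalQueue in the Deployment's namespace. As a minimal sketch, assuming the Deployment runs in the default namespace and that a ClusterQueue named cluster-queue already exists (both names are illustrative and not taken from this page), such a LocalQueue could look like this:

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: LocalQueue
metadata:
  namespace: default        # assumption: the namespace of the Deployment
  name: user-queue          # matches the kueue.x-k8s.io/queue-name label
spec:
  clusterQueue: cluster-queue   # assumption: an existing ClusterQueue
```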
b. Configure the resource needs
The resource needs of the workload can be configured in spec.template.spec.containers.
    - resources:
        requests:
          cpu: 3
c. Scaling
You can scale a Deployment up or down.
On scale-in, the excess Pods are deleted and their quota is freed.
On scale-out, new Pods are created and remain suspended until their corresponding workloads are admitted.
If there is not enough quota in your cluster, the Deployment might run only a subset of its Pods.
If your workloads are business-critical, consider reserving quota for the serving workloads by setting lendingLimit on the ClusterQueue.
The lendingLimit caps how much of the ClusterQueue's nominal quota other ClusterQueues in the same cohort can borrow, so the remaining quota stays available and the critical serving workload can scale out quickly.
For more details on lendingLimit, see the ClusterQueue page.
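To illustrate the quota reservation described above, here is a hedged sketch of a ClusterQueue that sets lendingLimit. The queue name, cohort name, flavor name, and quota numbers are all hypothetical, and the sketch assumes a ResourceFlavor named default-flavor already exists and that the kueue.x-k8s.io/v1beta1 API is served by your Kueue version:

```yaml
apiVersion: kueue.x-k8s.io/v1beta1
kind: ClusterQueue
metadata:
  name: serving-cluster-queue    # hypothetical name
spec:
  cohort: shared-cohort          # hypothetical; lending only happens within a cohort
  namespaceSelector: {}          # match all namespaces
  resourceGroups:
  - coveredResources: ["cpu"]
    flavors:
    - name: default-flavor       # assumption: this ResourceFlavor exists
      resources:
      - name: "cpu"
        nominalQuota: 12         # illustrative value
        lendingLimit: 4          # at most 4 CPUs can be lent to the cohort
```

With these illustrative numbers, 8 of the 12 CPUs can never be borrowed by other ClusterQueues, so Pods created on scale-out can be admitted against that reserved share without waiting for lent quota to be returned.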
d. Limitations
- The set of namespaces in which Kueue manages Deployments is determined by the pod integration’s namespace selector. There is no independent control for Deployments.
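As a sketch of where that scope is set, the fragment below assumes a Kueue release whose Configuration exposes the top-level managedJobsNamespaceSelector field; older releases configured the equivalent selector under integrations.podOptions.namespaceSelector, so check the configuration reference for your version. The excluded namespaces are illustrative:

```yaml
apiVersion: config.kueue.x-k8s.io/v1beta1
kind: Configuration
# Assumption: this field is available in your Kueue version.
managedJobsNamespaceSelector:
  matchExpressions:
  - key: kubernetes.io/metadata.name
    operator: NotIn
    values: [kube-system, kueue-system]   # illustrative exclusions
integrations:
  frameworks:
  - "deployment"
```

Any selector chosen this way applies to the Pods of Deployments as well as to Plain Pods, which is the limitation described above.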
Example
Here is a sample Deployment:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
    kueue.x-k8s.io/queue-name: user-queue
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: registry.k8s.io/nginx-slim:0.27
          ports:
            - containerPort: 80
          resources:
            requests:
              cpu: "100m"
You can create the Deployment using the following command:
kubectl create -f sample-deployment.yaml