Runbooks
Introduction
A Runbook is a predefined action that Devtron runs to apply a change, such as resizing resources or hibernating a namespace. When you approve an AI recommendation, its linked runbook carries out the change with the required approvals.
Using AI-generated Runbooks
Whenever AI detects an optimization opportunity, it automatically generates a corresponding runbook to carry out the recommended change once approved. These runbooks are auto-linked from Notifications of AI Recommendations.

When AI recommends a cost optimization such as reducing memory allocation, the linked runbook carries out that change, for example by scaling down a pod’s memory limit from 6 Gi to 3 Gi across selected clusters.
Using Your Runbook
Devtron also lets you create your own runbook. Follow this section only if you need a runbook that differs from the one AI generates automatically.
- From the left navigation, go to AI Recommendations → Runbooks.
- Click Create Runbook.
- Enter the following details:
  - Name - Example: update-resource-limits
  - Description - Example: Updates CPU and memory limits for workloads.
- Click Create Runbook to save.
Add Runbook Spec
You can edit your runbooks here. Each runbook follows a YAML structure that defines its metadata, tags, and executable steps.

Use the YAML editor in Devtron to paste and modify this structure.

    apiVersion: devtron.ai/v1
    kind: Runbook
    metadata:
      name: <name of the runbook>
      description: <description of the runbook, specifying its purpose and usage>
      tags:
        - <tag1 specifying category or type>
        - <tag2 specifying purpose>
    spec:
      steps:
        - name: <name of the step>
          action: <predefined action to be executed>
          type: <type of action, devtron-action, kubernetes-action, custom-action>
          parameters:
            param1: <value for parameter 1>
            param2: <value for parameter 2>
          onFailure:
            - nextStep: <name of the next step to execute on failure>

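As a rough illustration, the skeleton above might be filled in as follows. This is a sketch under assumptions, not a confirmed Devtron configuration: it reuses the name and description from the Create Runbook example and the `update-k8s-workload-resource-spec` and `webhook` actions described later in this section. The tags, step names, webhook URL, header map, and body are hypothetical. The sketch also assumes that `onFailure` sits at the step level and that `headers` accepts a key/value map.

```yaml
# Illustrative runbook only; tags, step names, URL, and body are hypothetical.
apiVersion: devtron.ai/v1
kind: Runbook
metadata:
  name: update-resource-limits
  description: Updates CPU and memory limits for workloads.
  tags:
    - cost-optimization        # hypothetical tag
    - rightsizing              # hypothetical tag
spec:
  steps:
    - name: patch-resources    # hypothetical step name
      action: update-k8s-workload-resource-spec
      type: kubectl-patch
      parameters:
        clusterId: "{{.clusterId}}"
        group: "{{.group}}"
        version: "{{.version}}"
        kind: "Pod"
        namespace: "{{.namespace}}"
        resourceName: "{{.resourceName}}"
        patch:
          spec:
            container:
              name: "{{.containerName}}"
              resources:
                requests:
                  cpu: "{{.newCpuRequestValue}}"
                  memory: "{{.newMemoryRequestValue}}"
                limits:
                  cpu: "{{.newCpuLimitValue}}"
                  memory: "{{.newMemoryLimitValue}}"
      onFailure:
        - nextStep: notify-failure   # refers to the step named below
    - name: notify-failure     # hypothetical step name
      action: webhook
      type: devtron-action
      parameters:
        url: "https://example.com/hooks/runbook-failed"   # hypothetical endpoint
        httpMethod: "POST"
        headers:
          Content-Type: "application/json"   # assumed key/value form
        body: '{"text": "Runbook update-resource-limits failed"}'
```

Whether a step that is referenced only from `onFailure` also runs during a successful execution is not stated here, so verify the behavior in a non-production cluster before relying on this layout.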
Each step in the runbook spec represents one operation that can interact with Kubernetes resources, Devtron apps, or external systems. The examples below cover the most commonly used predefined actions supported by Devtron runbooks.
Example 1: Get Deployment Manifest
Retrieves the manifest of a specified deployment in a Kubernetes cluster.

    spec:
      steps:
        - name: <name of the step>
          action: get-k8s-workload-controller-manifest
          type: kubectl-get
          parameters:
            clusterId: "{{.clusterId}}"
            group: "{{.group}}"
            version: "{{.version}}"
            kind: "Deployment"
            namespace: "{{.namespace}}"
            resourceName: "{{.resourceName}}"

Use this action to inspect the configuration of an existing deployment before applying any changes.
Example 2: Update Resource Spec in Deployment Manifest
Updates the CPU and memory requests or limits for a container inside a Kubernetes workload.

    spec:
      steps:
        - name: <name of the step>
          action: update-k8s-workload-resource-spec
          type: kubectl-patch
          parameters:
            clusterId: "{{.clusterId}}"
            group: "{{.group}}"
            version: "{{.version}}"
            kind: "Pod"
            namespace: "{{.namespace}}"
            resourceName: "{{.resourceName}}"
            patch:
              spec:
                container:
                  name: "{{.containerName}}"
                  resources:
                    requests:
                      cpu: "{{.newCpuRequestValue}}"
                      memory: "{{.newMemoryRequestValue}}"
                    limits:
                      cpu: "{{.newCpuLimitValue}}"
                      memory: "{{.newMemoryLimitValue}}"

Use this action to rightsize workload resource consumption and optimize costs.
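To make the template variables concrete, here is a sketch of the same step with literal resource values, matching the earlier example of reducing a memory limit from 6 Gi to 3 Gi. It assumes that literal strings are accepted wherever a `{{...}}` variable appears, which is not confirmed here. The step name and all four resource values are hypothetical. The values use standard Kubernetes quantity notation, where `m` denotes millicores and `Gi` denotes gibibytes.

```yaml
spec:
  steps:
    - name: halve-memory-limit   # hypothetical step name
      action: update-k8s-workload-resource-spec
      type: kubectl-patch
      parameters:
        clusterId: "{{.clusterId}}"
        group: "{{.group}}"
        version: "{{.version}}"
        kind: "Pod"
        namespace: "{{.namespace}}"
        resourceName: "{{.resourceName}}"
        patch:
          spec:
            container:
              name: "{{.containerName}}"
              resources:
                requests:
                  cpu: "500m"     # half a CPU core (hypothetical value)
                  memory: "2Gi"   # hypothetical value
                limits:
                  cpu: "1"        # one full CPU core (hypothetical value)
                  memory: "3Gi"   # reduced from 6Gi
```

Keeping each request at or below its limit matters, because Kubernetes rejects a container spec whose request exceeds its limit.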
Example 3: Update Resource Spec in Devtron Apps Config
Applies resource specification updates within Devtron-managed application configurations.

    spec:
      steps:
        - name: <name of the step>
          action: update-resource-spec-devtron-apps-config
          type: devtron-app-patch
          parameters:
            clusterId: "{{.clusterId}}"
            group: "{{.group}}"
            version: "{{.version}}"
            kind: "Pod"
            namespace: "{{.namespace}}"
            resourceName: "{{.resourceName}}"
            patch:
              spec:
                container:
                  name: "{{.containerName}}"
                  resources:
                    requests:
                      cpu: "{{.newCpuRequestValue}}"
                      memory: "{{.newMemoryRequestValue}}"
                    limits:
                      cpu: "{{.newCpuLimitValue}}"
                      memory: "{{.newMemoryLimitValue}}"

Use this action to modify resource values for Devtron-managed apps directly in their configuration.
Example 4: Update Resource Spec in Helm Chart Values
Modifies resource settings defined within Helm chart values YAML files.

    spec:
      steps:
        - name: <name of the step>
          action: update-resource-spec-helm-chart-values-yaml
          type: helm-chart-patch
          parameters:
            clusterId: "{{.clusterId}}"
            group: "{{.group}}"
            version: "{{.version}}"
            kind: "Pod"
            namespace: "{{.namespace}}"
            resourceName: "{{.resourceName}}"
            patch:
              spec:
                container:
                  name: "{{.containerName}}"
                  resources:
                    requests:
                      cpu: "{{.newCpuRequestValue}}"
                      memory: "{{.newMemoryRequestValue}}"
                    limits:
                      cpu: "{{.newCpuLimitValue}}"
                      memory: "{{.newMemoryLimitValue}}"

Use this action to keep Helm chart values in sync with runtime resource adjustments.
Example 5: Webhook to Any Service
Sends a webhook to an external service for integrations such as Slack notifications, monitoring tools, or CI/CD triggers.

    spec:
      steps:
        - name: <name of the step>
          action: webhook
          type: devtron-action
          parameters:
            url: <url to which the webhook needs to be sent>
            headers: <headers to be included in the webhook>
            httpMethod: <HTTP method to be used (GET, POST, etc.)>
            body: <body of the webhook>

Use this action to notify other systems or trigger automated workflows when a Devtron runbook completes.
Approval Types
Before execution, every AI-generated runbook requires an approval decision. You can approve or reject its execution for specific clusters and for a chosen duration.

When you take an action, Devtron applies the following logic:
- If you approve or reject a runbook, the decision auto-applies to all the recommendations linked to that runbook across the selected clusters.
- If you approve or reject an individual recommendation, the decision applies only to the specific cluster where that recommendation originated.
Approve Options
| Option | Behavior | Example Use Case |
|---|---|---|
| Forever | All future runs of this runbook are auto-approved indefinitely. | For dev or sandbox clusters where downtime or failed runs are acceptable and you want continuous savings. |
| Till date & time | Auto-approves until a specific expiry date and time. | During a maintenance window or before a critical demo, so changes are applied automatically until that period ends. |
| For duration | Auto-approves temporarily for a set number of hours. | For short tests or limited-time fixes, such as approving remediation for the next few hours. |
Reject Options
| Option | Behavior | Example Use Case |
|---|---|---|
| Forever | Blocks all future runs of this runbook permanently. | For production clusters where any automated remediation is risky or unwanted. |
| Till date & time | Rejects runs until a specific expiry date and time. | When you want the cluster to stay stable (e.g., during a product demo or release). |
| For duration | Rejects runs temporarily for a few hours. | To pause remediation during high-traffic periods or while verifying manual changes. |
When any approval or rejection period ends, the runbook status resets to Action Pending, and you must approve or reject it again.
Audit Logs
Every Runbook logs:
- Created / Updated / Approved / Rejected actions
- User, timestamp, and resource
- Full JSON payload for traceability

You can access these logs under AI Recommendations → Runbooks → Audit Logs.