Version: 2.0 🆕

Infrastructure Overview

Introduction​

The Infrastructure Overview page provides a centralized view of your Kubernetes infrastructure across all connected clusters. It displays cluster health, resource capacity, node distribution, and version details in a single dashboard.

It contains the following sections:

  1. At a Glance
  2. Cluster & Nodes
  3. Cost Visibility
  4. Actions & Insights

At a Glance​

The At a Glance section summarizes the overall state of your connected clusters, showing your current infrastructure capacity and connectivity status in one place.


| Card | Description |
| --- | --- |
| Reachable Clusters | Shows the number of clusters that are currently reachable out of the total clusters connected to Devtron |
| Total CPU Capacity | Displays the total CPU capacity across all reachable clusters, giving you an overview of your available compute power |
| Total Memory Capacity | Displays the total memory capacity across all reachable clusters, helping you keep track of your memory resources |

Unable to view clusters?

If some clusters don’t appear, they might be temporarily unreachable. You can verify their status under Cluster Configuration.


Cluster & Nodes​

The Cluster & Nodes section helps you monitor the health and stability of your connected clusters and nodes. It gives you visibility into cluster connectivity, node errors, and scheduling readiness, so you can spot and resolve infrastructure-level issues before they affect workloads.


| Card | Description |
| --- | --- |
| Cluster Health Status | Displays the number of healthy clusters. Healthy indicates active connectivity, while Connection Failed marks clusters that are currently unreachable |
| Node Errors | Shows whether any node-level issues exist. If you see No node errors, all nodes are operating normally |
| Node Scheduling | Displays the percentage of nodes currently available for scheduling workloads. Schedulable nodes are ready to accept pods |

info

If you notice connection failures or scheduling issues, verify your cluster connectivity and node configurations under Cluster Management.
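
As a rough illustration of what a schedulable-node percentage means, the sketch below counts nodes that are not cordoned. It assumes node objects shaped like the Kubernetes Node API (`spec.unschedulable`). The function name `schedulable_percentage` and the sample data are hypothetical, and this is not necessarily how Devtron computes the Node Scheduling card.

```python
from typing import Iterable, Mapping


def schedulable_percentage(nodes: Iterable[Mapping]) -> float:
    """Return the share of nodes (0-100) that are not cordoned.

    Assumption: a node counts as schedulable when `spec.unschedulable`
    is absent or false, as in the Kubernetes Node API.
    """
    nodes = list(nodes)
    if not nodes:
        return 0.0
    schedulable = sum(
        1 for node in nodes
        if not node.get("spec", {}).get("unschedulable", False)
    )
    return round(100.0 * schedulable / len(nodes), 1)


# Illustrative data only: three nodes, one of them cordoned.
sample_nodes = [
    {"metadata": {"name": "node-a"}, "spec": {}},
    {"metadata": {"name": "node-b"}, "spec": {"unschedulable": True}},
    {"metadata": {"name": "node-c"}, "spec": {"unschedulable": False}},
]
print(schedulable_percentage(sample_nodes))
```

A value below 100 in this sketch points to cordoned nodes, which is a reasonable first thing to check when the card reports reduced scheduling availability.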

Cluster Counts​

The Cluster Counts section gives you visibility into how your clusters are distributed across different cloud providers and Kubernetes versions. It helps you identify where most of your clusters are hosted and which versions are actively running in your infrastructure.


There are two views available:

| Tab | Description |
| --- | --- |
| By Providers | Displays the total number of clusters grouped by cloud providers such as GCP, AWS, Azure, or Unknown. This helps you understand your cloud distribution and dependency. |
| By Cluster Versions | Displays the number of clusters based on their Kubernetes version. This helps you track version diversity and identify clusters that may require upgrades for consistency and security. |

info

You can sort the data High to Low or Low to High to quickly identify the most used cloud provider or the most common Kubernetes version in your setup.
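
The grouping and sorting described above can be pictured with a small sketch. It assumes a hypothetical cluster inventory (a list of dictionaries with `provider` and `version` keys); the helper `count_by` is illustrative and not part of Devtron.

```python
from collections import Counter


def count_by(clusters, key, high_to_low=True):
    """Group clusters by `key` and return (value, count) pairs sorted by count."""
    # Clusters with a missing value are grouped under "Unknown".
    counts = Counter(cluster.get(key) or "Unknown" for cluster in clusters)
    # Sort by count, then by name so ties come out in a stable order.
    return sorted(
        counts.items(),
        key=lambda item: (-item[1] if high_to_low else item[1], item[0]),
    )


# Illustrative inventory; names, providers, and versions are made up.
clusters = [
    {"name": "prod-1", "provider": "AWS", "version": "1.29"},
    {"name": "prod-2", "provider": "AWS", "version": "1.30"},
    {"name": "dev-1", "provider": "GCP", "version": "1.29"},
    {"name": "edge-1", "provider": None, "version": "1.27"},
]

print(count_by(clusters, "provider"))                    # High to Low
print(count_by(clusters, "version", high_to_low=False))  # Low to High
```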

Cluster Capacity & Resource Allocation​

The Cluster Capacity & Resource Allocation section provides a detailed view of how CPU and memory resources are distributed and utilized across all connected clusters. It helps you assess infrastructure efficiency, monitor resource limits, and identify clusters that may be underutilized or overcommitted.


| Field | Description |
| --- | --- |
| Cluster Name | Lists all clusters connected to Devtron. You can click a cluster name to view its detailed resource usage |
| CPU | Displays the total CPU capacity, along with utilization, requests, and limits for each cluster. This helps you track compute usage and detect over-provisioning |
| Memory | Displays the total memory capacity, along with utilization, requests, and limits for each cluster. This helps you keep workloads balanced and resource allocation efficient |

You can sort the data by:

  • Cluster Name (A to Z or Z to A)
  • Utilization (High to Low or Low to High)
Note

If a cluster shows zero utilization or capacity, it may be inactive or disconnected. Verify its status under Cluster Configuration.
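
To make the relationship between capacity, utilization, requests, and limits concrete, here is a minimal sketch that expresses each figure as a percentage of capacity. The function `allocation_summary`, the 20% threshold, and the two flags are assumptions for illustration, not Devtron's definitions of underutilized or overcommitted.

```python
def allocation_summary(capacity, utilization, requests, limits,
                       low_utilization_pct=20.0):
    """Express utilization, requests, and limits as a percentage of capacity.

    All four figures must use the same unit (for example CPU cores or GiB).
    The two flags are illustrative assumptions, not Devtron's definitions.
    """
    if capacity <= 0:
        # Zero capacity may mean the cluster is inactive or disconnected.
        return None

    def pct(value):
        return round(100.0 * value / capacity, 1)

    return {
        "utilization_pct": pct(utilization),
        "requests_pct": pct(requests),
        "limits_pct": pct(limits),
        # Limits above capacity: workloads could collectively exceed the cluster.
        "overcommitted": limits > capacity,
        # Low actual usage relative to capacity.
        "underutilized": pct(utilization) < low_utilization_pct,
    }


# Illustrative figures: a cluster with 16 CPU cores of capacity.
print(allocation_summary(capacity=16, utilization=4, requests=12, limits=24))
```

In this example, limits add up to 150% of capacity, so the cluster is flagged as overcommitted even though actual utilization is only 25%.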

Node Counts​

The Node Counts section helps you visualize how nodes are distributed across clusters and autoscaling modes.
It provides a quick overview of your cluster node density and helps identify environments with higher or lower capacity.


| Tab | Description |
| --- | --- |
| By Cluster | This graph displays the total number of nodes within each cluster. Each bar shows the number of nodes within a specific cluster. This view helps you assess how evenly nodes are distributed and whether specific clusters may be over- or under-provisioned |
| By Autoscaler | Groups nodes based on their autoscaling configuration (for example, GKE Automode or Not Detected). Each bar shows the number of nodes within the autoscaling configuration |

You can sort the chart data High to Low or Low to High using the dropdown in the top-right corner, to focus on clusters with the most or fewest nodes.

Troubleshooting Autoscaler Detection​

Devtron currently supports autoscaler detection for the following autoscalers: EKS Auto Mode, Karpenter, CAST AI, and GKE Autopilot. If your cluster uses any other autoscaler, it will be categorized as Not Detected under the By Autoscaler view.

Devtron identifies supported autoscalers using the following Kubernetes node labels:

```
# EKS Auto Mode label
LabelEKSComputeType = "eks.amazonaws.com/compute-type"
LabelEKSComputeAuto = "auto"

# Karpenter label
LabelKarpenterInitialized = "karpenter.sh/initialized"
LabelKarpenterTrue = "true"

# Cast AI label
LabelCastAIManagedBy = "provisioner.cast.ai/managed-by"
LabelCastAIValue = "cast.ai"

# GKE label
LabelGKEProvisioning = "cloud.google.com/gke-provisioning"
LabelGKEAutoPilot = "spot"
```
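
The label checks listed above can be sketched as a simple lookup. This is a minimal illustration that assumes an exact key/value match and checks the autoscalers in the order shown; Devtron's actual matching logic may differ. The function name `detect_autoscaler` is hypothetical.

```python
# Label keys and values copied from the list above.
AUTOSCALER_LABELS = {
    "EKS Auto Mode": ("eks.amazonaws.com/compute-type", "auto"),
    "Karpenter": ("karpenter.sh/initialized", "true"),
    "CAST AI": ("provisioner.cast.ai/managed-by", "cast.ai"),
    "GKE Autopilot": ("cloud.google.com/gke-provisioning", "spot"),
}


def detect_autoscaler(node_labels):
    """Return the first autoscaler whose label matches, else "Not Detected".

    Assumption: an exact match on both label key and value is required.
    """
    for name, (key, value) in AUTOSCALER_LABELS.items():
        if node_labels.get(key) == value:
            return name
    return "Not Detected"


# Illustrative node labels.
print(detect_autoscaler({"karpenter.sh/initialized": "true"}))
print(detect_autoscaler({"kubernetes.io/os": "linux"}))
```

If a node shows up as Not Detected unexpectedly, comparing its labels (for example with `kubectl get nodes --show-labels`) against the list above is a reasonable first check.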

Need support for another autoscaler?

Submit a feature request on GitHub. Our team regularly reviews community requests, and your feedback helps us prioritize new integrations in upcoming releases.


Cost Visibility ​

The Cost Breakdown chart helps you see how costs are distributed across different infrastructure components for the selected time period.

Each bar represents one Application, Cluster, Environment, or Project, and the colored segments in the bar show the share of different resource types. This makes it easy to compare categories and see which resources are contributing most to their total cost.


| Resource Type | Color Used in Chart |
| --- | --- |
| CPU Cost | LimeGreen |
| Memory Cost | SkyBlue |
| Storage (PV) Cost | AquaTeal |
| GPU Cost | Magenta |
| Network Cost | GoldenYellow |

Filters​

| Filter | What It Shows |
| --- | --- |
| Application | Each bar represents an application, segmented by CPU, Memory, Storage (PV), GPU, and Network costs |
| Cluster | Each bar represents a cluster, segmented by CPU, Memory, Storage (PV), GPU, and Network costs |
| Environment | Each bar represents an environment, segmented by CPU, Memory, Storage (PV), GPU, and Network costs |
| Project | Each bar represents a project, segmented by CPU, Memory, Storage (PV), GPU, and Network costs |

Sorting Criteria​

| Sorting Option | Description |
| --- | --- |
| Cost: High to Low | Shows the highest cost items first |
| Cost: Low to High | Shows the lowest cost items first |
| A to Z | Sorts items alphabetically |
| Z to A | Sorts items in reverse alphabetical order |

Actions & Insights

The Actions & Insights section highlights where you can achieve the highest cost savings. It shows the categories with the largest cost-saving opportunities, based on the difference between allocated resources and your actual usage.


It also shows which Kubernetes version each cluster is running and lets you check whether a cluster is compatible with an upgrade to the latest Kubernetes version. Click Show All to expand the list and view clusters that are not immediately visible.

Each item in the Top saving opportunities list shows:

| Field | Description |
| --- | --- |
| Name | The name of the category (for example, a cluster, application, or environment) with the largest savings opportunities |
| Potential Savings (%) | The percentage of your current spend that could be saved, for the selected time range |
| Estimated Savings | The estimated cost you could save in that category, based on the difference between provisioned and used resources, for the selected time range |

Clicking any item in this list takes you to its detailed Cost Breakdown page. Refer to Cost Breakdown to learn more.
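
To show how Estimated Savings and Potential Savings (%) relate, here is a minimal sketch. It assumes that cost scales linearly with provisioned resources, which is a simplification and not necessarily the model Devtron uses. The function `estimate_savings` and the sample figures are hypothetical.

```python
def estimate_savings(current_spend, provisioned, used):
    """Return (estimated_savings, potential_savings_pct) for one category.

    Assumption: spend is proportional to provisioned resources, so the
    unused fraction of what is provisioned is the fraction of spend
    that could be saved.
    """
    if current_spend <= 0 or provisioned <= 0:
        return 0.0, 0.0
    unused_fraction = max(provisioned - used, 0) / provisioned
    estimated = current_spend * unused_fraction
    potential_pct = 100.0 * estimated / current_spend
    return round(estimated, 2), round(potential_pct, 1)


# Illustrative categories for one time range; all figures are made up.
items = [
    {"name": "search", "spend": 800.0, "provisioned": 20, "used": 18},
    {"name": "payments", "spend": 1200.0, "provisioned": 40, "used": 25},
]

# Rank categories by estimated savings, largest first.
ranked = sorted(
    (
        (item["name"],
         *estimate_savings(item["spend"], item["provisioned"], item["used"]))
        for item in items
    ),
    key=lambda row: row[1],
    reverse=True,
)
for name, estimated, pct in ranked:
    print(name, estimated, pct)
```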

Checking Upgrade Compatibility​

  1. To check upgrade compatibility, go to Infrastructure Management → Overview.

  2. Under Check cluster update compatibility, hover over the cluster you want to check and click the search button.

  3. In the pop-up modal that appears, select the target version and click scan cluster.

  4. A page opens with a summary of the API endpoints that are compatible with the upgrade. You can also review Deprecated Fields (Against current API version), Resources with no PDB, and Resources with 0 Disruption PDB.
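
The two PDB checks mentioned in the last step can be pictured with a simplified sketch. It assumes PodDisruptionBudget objects shaped like the Kubernetes `policy/v1` API (`spec.selector.matchLabels`, `status.disruptionsAllowed`) and a hypothetical workload list. Only `matchLabels` is evaluated, so this illustrates the idea rather than the scan Devtron performs.

```python
def _selects(match_labels, pod_labels):
    """True when every key/value in match_labels is present in pod_labels."""
    if not match_labels:
        # Simplification: PDBs that rely on matchExpressions or an empty
        # selector are not evaluated in this sketch.
        return False
    return all(pod_labels.get(k) == v for k, v in match_labels.items())


def pdb_findings(workloads, pdbs):
    """Return (workloads with no PDB, workloads whose PDB allows 0 disruptions)."""
    no_pdb, zero_disruption = [], []
    for workload in workloads:
        covering = [
            pdb for pdb in pdbs
            if pdb["metadata"]["namespace"] == workload["namespace"]
            and _selects(
                pdb["spec"].get("selector", {}).get("matchLabels", {}),
                workload["pod_labels"],
            )
        ]
        if not covering:
            no_pdb.append(workload["name"])
        elif any(p.get("status", {}).get("disruptionsAllowed", 0) == 0
                 for p in covering):
            zero_disruption.append(workload["name"])
    return no_pdb, zero_disruption


# Illustrative data; names and namespaces are made up.
workloads = [
    {"name": "api", "namespace": "shop", "pod_labels": {"app": "api"}},
    {"name": "worker", "namespace": "shop", "pod_labels": {"app": "worker"}},
    {"name": "cache", "namespace": "shop", "pod_labels": {"app": "cache"}},
]
pdbs = [
    {"metadata": {"name": "api-pdb", "namespace": "shop"},
     "spec": {"selector": {"matchLabels": {"app": "api"}}},
     "status": {"disruptionsAllowed": 1}},
    {"metadata": {"name": "cache-pdb", "namespace": "shop"},
     "spec": {"selector": {"matchLabels": {"app": "cache"}}},
     "status": {"disruptionsAllowed": 0}},
]
print(pdb_findings(workloads, pdbs))
```

Workloads without a PDB can lose all their pods at once during a node drain, while a PDB that allows zero disruptions can block the drain, which is why both are worth reviewing before an upgrade.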


FAQs​

1. Why does Cost Visibility show data for some clusters but not others?

Cost data appears only for clusters where Cost Visibility is enabled.
If a cluster doesn’t show cost insights, verify that the Cost Visibility module is active for that cluster.

Refer to Configurations to learn more.

2. What does Connection Failed mean in Cluster Health Status?

Connection Failed means Devtron could not reach the cluster’s API server or retrieve data from it.
This can happen due to:

  • Network or firewall restrictions
  • Expired or invalid Kubernetes credentials
  • Misconfigured cluster agent

Try revalidating credentials or redeploying the Devtron agent to restore connectivity.

3. Why does a cluster show Not Detected under Autoscaler in Node Counts?

This means Devtron couldn’t identify any of the supported autoscaling configurations on the cluster’s nodes. The cluster may be using a custom or unsupported autoscaler.

4. How often is the infrastructure data updated?

Infrastructure data (including metrics, cost, and health status) is refreshed automatically every hour.