In the last article in this series, we learned why on-prem Kubernetes is difficult: HA control planes (LBs, etcd, quorum), ongoing day-2 ops (patching, cert rotation, upgrades) and cluster sprawl.
In this part, we’ll explore two important choices every on-prem Kubernetes endeavour starts with:
- Where to run it: on a virtualization platform or directly on bare metal
- How to get it: build it yourself or buy a platform
Running on Virtual Machines
Since most organizations already have virtualization in place, that's where they start: Kubernetes clusters run as VMs on vSphere, OpenStack or Proxmox. The virtualization layer is often managed by a dedicated infrastructure team and abstracts away hardware, networking, storage, capacity management and monitoring.

Pros & Cons
Pros:
- Abstracts networking, storage, capacity, monitoring
- Clear team boundaries (infra vs platform)
- Mature operational model
- Easy to split large hosts into smaller nodes
- Often achieves higher hardware utilization by consolidating many VMs on shared hosts
- Built-in scaling and self-service via VM APIs

Cons:
- The hypervisor/virtualization layer is another control plane/orchestrator to operate
- More complexity and failure surface
- Licensing and operational overhead
- Constrained for GPU/SmartNIC-heavy workloads
- Slight performance overhead
Tackling the hard parts
The last article identified challenges that this approach helps solve:
- Load balancers & HA: Often available via LBaaS; easy to put kube-apiserver behind a VIP.
- Day-2 operations: VM templates/snapshots simplify rollouts and blue-green control plane upgrades.
- Bootstrap: Straightforward: spin up VMs from a template.
- Cluster sprawl: Easier to spin up more clusters (but still your platform team’s burden).
When it fits
You already have IaaS and licenses, and/or you plan to buy a K8s platform that integrates with IaaS APIs. This is usually the fastest and lowest-risk path to start.
Running on Bare Metal
Running Kubernetes directly on servers removes the virtualization layer and makes Kubernetes itself the baseline infrastructure platform (meaning you could, for example, also run VMs on it).
It simplifies the stack, but usually also shifts responsibility for hardware and lifecycle onto the Kubernetes team.

Be aware that Kubernetes on bare metal comes with its own interesting challenges, which we wrote about in a dedicated blog post: Kubernetes on Bare Metal: The Four Hard Problems.
Importantly, you still need three dedicated control plane nodes per cluster, which wastes capacity on large servers. This is where Hosted Control Planes become almost a necessity: they let you pack many control planes onto shared management nodes instead of dedicating whole servers to each one.
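To make the capacity argument concrete, here's a rough back-of-the-envelope sketch in Python. The per-control-plane resource figures and the cluster count are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope: dedicated control plane nodes vs. Hosted Control
# Planes packed onto a shared management cluster.
# All figures below are illustrative assumptions, not measurements.
import math

CLUSTERS = 10          # tenant clusters you plan to run (assumption)
SERVER_RAM_GB = 256    # RAM of one physical server (assumption)

# Option A: three dedicated physical control plane nodes per cluster
dedicated_servers = CLUSTERS * 3

# Option B: control planes run as pods on shared management nodes.
# Assume ~8 GB RAM per control plane replica, 3 replicas for HA.
ram_per_hosted_cp_gb = 8 * 3
mgmt_servers = max(
    math.ceil(CLUSTERS * ram_per_hosted_cp_gb / SERVER_RAM_GB),
    3,  # still want at least 3 management nodes for HA
)

print(f"Dedicated control plane servers: {dedicated_servers}")              # 30
print(f"Shared management servers (hosted control planes): {mgmt_servers}")  # 3
```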
So you might end up with a setup like this:

Choose Your Worker Server Profile Wisely
With virtualization, selecting the right server size is easy: the infra team can buy a few beefy hosts and let teams slice them up into VMs of various sizes. On bare metal, on the other hand, your Kubernetes nodes are the physical servers themselves, so you have to make sure the servers you order match your workload and cluster profiles.
When picking bare metal servers for your Kubernetes workers, keep these guidelines in mind:
- Avoid large servers: As outlined in Problem 1 of our blog post about bare metal problems, you don’t want to run more than 500 pods per node for stability reasons. For example, if each pod averages 1GB of used RAM (a good rule of thumb for enterprise environments in our experience), you won’t ever use more than ~512GB of RAM per server. Buying more is wasted money and electricity.
- Avoid very small servers: Each physical node carries a fixed cost in CPU, cabling, switch ports and management overhead, so small servers don’t make efficient use of rack space, power and networking ports. To use your space and power budget effectively, we found that servers below 128GB of RAM rarely make sense.
- Respect HA & maintenance requirements: Most customers require the cluster to be spread over at least two racks or sites for high availability, so each cluster needs at least two servers (one per rack/site). If you have site- or rack-local persistent storage (a common architecture), you’ll need at least two servers per site; otherwise you can’t take one server down for maintenance while keeping PDBs happy. In practice, most customers end up with at least four physical servers per cluster.
- Multiply per cluster: Once you’ve established the baseline for each cluster (e.g. four servers of 256GB RAM each), multiply that by the number of clusters you plan to run. This quickly adds up to a significant hardware footprint.
Given the constraints above, it’s better to scale with many smaller nodes rather than a few large ones. We found that a fleet of 128GB-256GB RAM servers is a good sweet spot for many use cases; the sketch below walks through the math.
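A minimal sizing sketch: the pod density limit and per-pod RAM average come from the rules of thumb above, while the cluster count and chosen server profile are assumptions you should replace with your own numbers:

```python
# Worker sizing sketch based on the rules of thumb above.
MAX_PODS_PER_NODE = 500   # stability rule of thumb (Problem 1)
AVG_POD_RAM_GB = 1        # rough average in enterprise environments

# Upper bound: RAM beyond this per server is unlikely to ever be used.
max_useful_ram_gb = MAX_PODS_PER_NODE * AVG_POD_RAM_GB   # ~500 GB -> buy <= 512 GB

# Per-cluster baseline: two racks/sites, two servers each, so one server
# can be drained for maintenance without violating PodDisruptionBudgets.
SITES = 2
SERVERS_PER_SITE = 2
SERVER_RAM_GB = 256       # chosen worker profile (assumption)
CLUSTERS = 5              # planned number of clusters (assumption)

fleet = SITES * SERVERS_PER_SITE * CLUSTERS
print(f"Max useful RAM per worker: ~{max_useful_ram_gb} GB")
print(f"Baseline fleet: {fleet} servers, {fleet * SERVER_RAM_GB} GB RAM total")
```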
Cluster-as-a-Service Becomes Hard
Since each new cluster now requires at least a few physical servers, the Cluster-as-a-Service model (where developers self-serve new clusters on demand, or platform teams create dedicated clusters for certain projects/teams) becomes much more difficult. You can mitigate that to some degree by keeping spares (especially when using many smaller servers), but in general, if you expect many dedicated, on-demand clusters, bare metal workers may not be the best fit.
A Note on Costs
Comparing the cost of virtualization vs bare metal for our customers, we found some interesting cost dynamics:
- Since the base cost per server is high (CPUs in particular are expensive), a bare metal build with many small servers (e.g. 256GB RAM) quickly becomes more expensive than the fewer, larger servers (e.g. 4TB RAM) you’d purchase when going virtualized.
- If you expect to end up with a big fleet of smaller servers, also factor in the per-server cost of rack space, power & cooling, and networking. We found that these “hidden” costs (especially expensive switch ports) can double the total in many cases.
- Virtualization not only lets you consolidate workloads on fewer, larger servers (= cheaper) but also enables better hardware utilization by packing many VMs onto shared hosts. As a result, you’ll need fewer physical servers overall - and this saving on hardware can outweigh the licensing of the virtualization layer (even the crazy Broadcom quotes).
We strongly recommend running the numbers for your own use case, hardware vendor and virtualization licensing costs; the sketch below shows one way to structure such a comparison.
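Here's a hedged sketch of how such a comparison could be structured. Every number is a placeholder: plug in your own vendor quotes, rack/power/switch-port costs, consolidation ratio and licensing terms:

```python
# Sketch of a bare metal vs. virtualized cost comparison.
# All prices and capacity figures are placeholders, not quotes.
import math

def fleet_cost(servers: int, price_per_server: int, hidden_per_server: int) -> int:
    """Hardware cost plus per-server 'hidden' costs (rack space,
    power & cooling, switch ports, cabling)."""
    return servers * (price_per_server + hidden_per_server)

TOTAL_WORKER_RAM_GB = 10_000   # aggregate worker RAM you need (assumption)

# Option A: bare metal, many 256 GB servers, no hypervisor licensing
bm_servers = math.ceil(TOTAL_WORKER_RAM_GB / 256)
bm_total = fleet_cost(bm_servers, price_per_server=12_000, hidden_per_server=4_000)

# Option B: virtualized, few 4 TB hosts; assume 1.3x consolidation from
# packing many VMs onto shared hosts, plus per-host licensing
virt_servers = math.ceil(TOTAL_WORKER_RAM_GB / 1.3 / 4_096)
virt_total = fleet_cost(virt_servers, price_per_server=60_000, hidden_per_server=4_000)
virt_total += virt_servers * 15_000   # virtualization licensing (placeholder)

print(f"Bare metal:  {bm_servers} servers, ~{bm_total:,}")
print(f"Virtualized: {virt_servers} hosts, ~{virt_total:,}")
```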
Pros & Cons
Pros:
- Simpler overall stack with fewer layers to manage
- Lower licensing costs
- Enables use cases that need direct hardware access (GPUs, DPUs, SR-IOV)
- Establishes Kubernetes as your infrastructure platform
- Supports KubeVirt to run VMs inside Kubernetes

Cons:
- Requires knowledge in areas usually abstracted away by virtualization: hardware failures, firmware, RAID, NICs, PXE provisioning
- Most likely requires Hosted Control Planes and their management cluster
- Need to choose server sizes that fit your workload profile
- Many small servers are usually more expensive than few large ones
- Higher operational effort to keep machines consistent and healthy
- Cluster-as-a-Service is difficult due to physical server requirements
Tackling the hard parts
Here’s how this approach relates to the challenges identified in the last article:
- Control plane HA/LB/etcd: Hosted Control Planes bring many advantages (orchestration, lifecycle management, LBs, …), but they also introduce new complexity such as a separate management cluster.
- Day-2 operations: Without VM snapshots or templates, rolling out patches and upgrades means touching real machines. A neat way to work around this is immutable, image-based workers. For more on this, you might want to read our article From Metal to Kubernetes Worker.
- Bootstrap: You start from bare metal. Every server needs to be provisioned, imaged and enrolled before it can join the cluster.
When it fits
You want to get rid of virtualization, or you want Kubernetes to become your infrastructure rather than sit on top of it.
Virtual Workers on Bare Metal: Best of Both Worlds?
Some benefits of virtualization are extremely compelling: cost-wise, its consolidation advantages; feature-wise, its ability to dynamically carve small nodes out of larger servers, which is essential for “Cluster-as-a-Service” setups.
Thus, we see the industry moving towards virtualizing Kubernetes workers inside bare metal Kubernetes: using KubeVirt or similar projects to run worker VMs inside an outer bare metal cluster.
Which leads to an architecture like this:

Examples include:
- OpenShift HCP with the KubeVirt provider: uses Hosted Control Planes plus KubeVirt to run worker VMs on bare metal
- vCluster and its vNode run virtual control planes and isolated worker capacity inside a shared bare metal cluster
- meltcloud Elastic Pools (our approach): dynamically provision virtual worker nodes out of a large bare metal cluster using KubeVirt and Karpenter
In short, the industry is reintegrating virtualization as a native building block of the Kubernetes platform (rather than as a separate layer underneath it).
Build or Buy the Platform
Once you’ve decided where your nodes run, the question remains how you’ll get Kubernetes.
- Build it yourself: You assemble the platform. You’ll pick OS, bootstrap tooling (kubeadm/Talos), wire CNI/CSI/Ingress, stand up GitOps/monitoring/backups, automate upgrades, and integrate with load balancers, PKI, IAM, and image pipelines.
- Buy a platform: You adopt a pre-integrated distribution. You run the installer, connect to your IaaS or metal, and get opinionated defaults for networking, storage, upgrades, and consoles. You focus on landing zones, guardrails, and enablement; the vendor provides lifecycles and support.
Build it Yourself
You bootstrap clusters with kubeadm or Talos, wire all components yourself, and automate upgrades, observability, backups, and multi-cluster tooling.
Tackling the hard parts
Possible approaches for handling the challenges identified in our last article:
- Control plane HA/LB/etcd: You design and automate it (LB VIPs, etcd backup/restore, quorum).
- Day-2 ops: You build the upgrade train (OS, kubelet, etcd, control plane); a sketch of the kind of glue this involves follows this list.
- Cluster sprawl: You script multi-cluster with CAPI & GitOps; still your responsibility.
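To give a feel for the glue you end up owning, here's a minimal sketch of one step of such an upgrade train using the official Kubernetes Python client: cordon a worker, evict its pods through the Eviction API (so PDBs are honored), hand it to your own reimaging mechanism, then uncordon it. `reimage_node()` and the node name are placeholders, not part of any real tooling:

```python
# Sketch: one node rotation in a self-built upgrade train.
from kubernetes import client, config


def reimage_node(node_name: str) -> None:
    # Placeholder: trigger your own OS/kubelet upgrade here
    # (PXE reinstall, image swap, package upgrade, ...).
    print(f"(re)imaging {node_name} ...")


def drain_and_upgrade(node_name: str) -> None:
    config.load_kube_config()
    v1 = client.CoreV1Api()

    # Cordon: mark the node unschedulable so no new pods land on it.
    v1.patch_node(node_name, {"spec": {"unschedulable": True}})

    # Evict all pods on the node via the Eviction API, so PDBs are honored.
    pods = v1.list_pod_for_all_namespaces(
        field_selector=f"spec.nodeName={node_name}"
    )
    for pod in pods.items:
        owners = pod.metadata.owner_references or []
        if any(ref.kind == "DaemonSet" for ref in owners):
            continue  # DaemonSet pods stay on the node
        eviction = client.V1Eviction(
            metadata=client.V1ObjectMeta(
                name=pod.metadata.name, namespace=pod.metadata.namespace
            )
        )
        v1.create_namespaced_pod_eviction(
            name=pod.metadata.name, namespace=pod.metadata.namespace, body=eviction
        )

    reimage_node(node_name)

    # Uncordon once the node is healthy again.
    v1.patch_node(node_name, {"spec": {"unschedulable": False}})


if __name__ == "__main__":
    drain_and_upgrade("worker-01")
```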
You can adopt advanced patterns like Hosted Control Planes, immutable workers or worker slicing with KubeVirt as needed if you are prepared to integrate and operate them yourself.
Pros & Cons
Pros:
- Maximum control over components and roadmaps
- No vendor lock-in, easy to swap parts
- Deep internal expertise and ownership
- Flexible to unusual networking or security needs
- Can adopt advanced patterns (Hosted Control Planes, immutable workers) on your own timeline

Cons:
- Several FTEs required to build the whole platform
- Higher risk and longer time to readiness
- You own HA/LB/etcd, upgrades, and incident response
- Fleet and multi-cluster tooling to build and maintain
- You must integrate patterns like Hosted CPs or immutable workers yourself
Buy a Platform
You install a supported distribution (e.g., OpenShift, Tanzu, Rancher, Canonical) and get opinionated defaults, supported lifecycles, and integrated multi-cluster tooling out of the box.
Tackling the hard parts
Typically, the platform will ship with solutions to the challenges identified in our last article:
- Control plane HA & upgrades: Standardized and supported; fewer bespoke scripts
- Day-2 ops: Vendors provide patch cadence, cert rotation, and health checks
- Cluster sprawl: Built-in fleet/multi-cluster management reduces toil (not zero, but less)
- Bootstrap: Still your job to provide infra (LB, storage, network), but installers simplify the process
Note: Many modern platforms include patterns like Hosted Control Planes, immutable worker images or worker slicing with virtualization out of the box.
Pros & Cons
Pros:
- Fastest path to production with opinionated defaults
- Supported upgrades, cert rotation, and lifecycle
- Integrated networking, storage, monitoring, and RBAC
- Vendor certifications and ecosystem integrations
- Many include Hosted Control Planes, immutable worker models, or virtual worker pools

Cons:
- Licensing and subscription costs
- Less flexibility for low-level components
- Platform boundaries may constrain advanced use cases
- Risk of vendor dependence over time
- If missing advanced patterns, you depend on the vendor roadmap
Future Options to Watch
The on-prem Kubernetes space is evolving quickly. In just the past year, several new players and patterns have emerged aiming to make it feel more like the cloud:
- Omni (by Sidero): brings Talos-based Kubernetes clusters to bare metal with a cloud-like control plane
- Spectro Cloud Palette: full-stack Kubernetes lifecycle with multi-cluster governance
- Canonical MicroCloud: a lightweight private cloud platform built on LXD, Ceph and OVN
- Kamaji and HyperShift: open-source projects pushing the “Hosted Control Planes” pattern forward
This is still a fast-moving market, with new approaches appearing and maturing quickly. It’s worth watching if you’re planning a longer-term Kubernetes strategy.
…and of course, us!
At meltcloud, we were surprised that none of the existing platforms fully solve all of these challenges. So we’re building it ourselves: a cloud-like Kubernetes platform for your own hardware.
We combine several patterns we’ve talked about:
- Hosted Control Planes to avoid control plane waste on bare metal, hosted on an appliance to solve the bootstrap/management problem
- Immutable Workers to remove day-2 pain on bare metal (how it works)
- Elastic Pools to slice up large bare metal nodes into virtual workers for multi-cluster scenarios (docs)
The goal: make Kubernetes behave like GKE, AKS or EKS, but in your own data center. If that sounds interesting, here’s our Platform Overview.
Wrapping up
These two choices (where your nodes run and how you get Kubernetes) shape how your teams work with the platform every day. There’s no right answer - it depends on your people, skills and goals.
Continue reading
This article is part of a series. Next up: Part 4: Taming the network jungle: CNI choices, L2/L3 realities, firewalls, east-west vs north-south, and other fun topics.
- Part 1: Has the Cloud Delivered on Its Promise?
- Part 2: Why on-prem Kubernetes is hard
- Part 3: Choosing your on-prem Kubernetes stack (this post)
- Part 4: Taming the network jungle (coming soon)
- Part 5: Dude, where is my storage? (coming soon)
- Part 6: Bringing Data(bases) into Kubernetes (coming soon)
- Part 7: Running VM workloads on Kubernetes (coming soon)