Migrate EKS to Self-Hosted Kubernetes to Lower AWS Costs

Managing large-scale container orchestration often starts with Amazon EKS for its simplicity and reliability. However, as your infrastructure scales to dozens of clusters or hundreds of nodes, the "convenience tax" becomes a significant line item on your AWS bill. Amazon charges $0.10 per hour for every EKS control plane, which equates to roughly $72 per month per cluster. While that seems small, the real costs lie in the lack of control over instance types for the control plane and the overhead of managed node groups.

You can reclaim your cloud budget by transitioning to a self-hosted Kubernetes architecture. By managing your own control plane on EC2 instances or moving workloads to bare-metal providers while keeping data in AWS, you eliminate EKS management fees and gain the ability to use spot instances or Graviton processors more aggressively. This guide outlines the architectural blueprint for migrating from EKS to a self-managed environment without sacrificing uptime or security.

TL;DR — Migrating to self-hosted Kubernetes (using tools like kubeadm or Cluster API) eliminates the $0.10/hr EKS cluster fee and allows for highly optimized control plane sizing. Successful migration requires a robust etcd backup strategy, a custom CNI configuration, and an automated node provisioning pipeline to replace AWS Managed Node Groups.

Core Concepts: Managed vs. Self-Hosted

💡 Analogy: Using Amazon EKS is like staying in a fully serviced hotel. You pay a premium so that someone else handles the electricity, plumbing, and security. Self-hosting Kubernetes is like owning your own home. You are responsible for the maintenance, but you have complete control over the layout, and your long-term monthly costs are significantly lower because you aren't paying for the "service" layer.

Amazon EKS abstracts the Kubernetes control plane. It manages the kube-apiserver, etcd, and kube-scheduler across multiple availability zones. In a self-hosted model, you take over these responsibilities. You must provision the underlying virtual machines (EC2) or physical hardware, install the Kubernetes binaries (typically via kubeadm or RKE2), and ensure that your etcd database is backed up and resilient to availability-zone failures.
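
As a concrete starting point, a kubeadm-based HA control plane is bootstrapped from a configuration file whose controlPlaneEndpoint points at a load balancer rather than a single node. The sketch below is illustrative only; the endpoint hostname, Kubernetes version, and pod CIDR are placeholder assumptions you must replace with your own values.

```yaml
# kubeadm-config.yaml -- minimal HA sketch (all values are placeholders)
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
kubernetesVersion: v1.29.0
controlPlaneEndpoint: "k8s-api.example.internal:6443"  # DNS of your API load balancer
networking:
  podSubnet: "192.168.0.0/16"
etcd:
  local:
    dataDir: /var/lib/etcd  # stacked etcd on each control plane node
```

You would then run `kubeadm init --config kubeadm-config.yaml --upload-certs` on the first node and join the remaining control plane nodes with the join command it prints.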

The primary shift is from a "black box" control plane to a "glass box" architecture. In the self-hosted model, you see the resource consumption of the control plane. For smaller clusters, EKS often over-provisions control plane resources. By self-hosting, you can run the control plane for a small development cluster on t3.medium instances, saving money that would otherwise go toward the flat EKS management fee. For enterprise-grade production, you move toward dedicated i3 or c6g instances to handle high request volumes.

When to Exit EKS: The Cost-Benefit Threshold

Deciding to move away from EKS is not just about the $72/month cluster fee. You should evaluate your infrastructure against specific metrics. If you are running more than 10 clusters across different environments (Dev, Staging, QA, Prod), the management fees alone account for over $8,600 annually. Furthermore, if your workload requires specialized networking (like high-throughput Cilium policies) or specific API server flags that EKS doesn't support, the "managed" aspect becomes a hindrance rather than a benefit.

The over-engineering boundary usually lies at the 5-cluster mark. If you manage fewer than five clusters, the engineering hours required to maintain the control plane, handle upgrades, and manage etcd snapshots will likely cost more than the EKS fees. However, for organizations with a dedicated platform engineering team, the move to self-hosted Kubernetes allows for "Control Plane Bin Packing," where multiple small clusters can share underlying infrastructure or run on hyper-optimized ARM64 instances, reducing the total AWS bill by up to 40%.

The Self-Hosted Architecture Blueprint

A self-hosted Kubernetes architecture on AWS must replicate the high availability provided by EKS while allowing for deeper optimization. This requires a multi-AZ deployment of control plane nodes and a load balancer (NLB) to expose the API server. Unlike EKS, where the control plane is in an AWS-managed VPC, your self-hosted control plane lives inside your own VPC, giving you total control over Security Groups and IAM roles.


+---------------------------------------------------------------+
|                    AWS Region (us-east-1)                     |
|  +---------------------------------------------------------+  |
|  |                External NLB (Port 6443)                 |  |
|  +--------+-------------------+-------------------+--------+  |
|           |                   |                   |           |
|  +--------v--------+ +--------v--------+ +--------v--------+  |
|  | Control Plane 1 | | Control Plane 2 | | Control Plane 3 |  |
|  | (AZ-A, etcd)    | | (AZ-B, etcd)    | | (AZ-C, etcd)    |  |
|  +-----------------+ +-----------------+ +-----------------+  |
|           |                   |                   |           |
|  +--------+-------------------+-------------------+--------+  |
|  |               Cluster Interconnect (CNI)                |  |
|  +--------+-------------------+-------------------+--------+  |
|           |                   |                   |           |
|  +--------v--------+ +--------v--------+ +--------v--------+  |
|  | Worker Node (S) | | Worker Node (S) | | Worker Node (B) |  |
|  | (Spot/Graviton) | | (Spot/Graviton) | | (Bare Metal)    |  |
|  +-----------------+ +-----------------+ +-----------------+  |
+---------------------------------------------------------------+

In this architecture, the data flow for the Kubernetes API goes through an AWS Network Load Balancer to the healthy control plane nodes. The etcd cluster is distributed across three Availability Zones to ensure quorum even if one zone goes offline. For the worker nodes, you can mix EC2 Spot instances for stateless workloads and bare-metal nodes (via Equinix or AWS Metal) for heavy database workloads, all joined to the same control plane. This hybrid approach is where the deepest cost reductions come from.

Step-by-Step Migration Implementation

Step 1: Provision Infrastructure with Cluster API

Do not manually install Kubernetes on EC2 instances. Instead, use Cluster API (CAPI) for AWS. Cluster API allows you to treat clusters as custom resources. It automates the provisioning of EC2 instances, VPCs, and the lifecycle of the Kubernetes version. This ensures that your self-hosted cluster is as reproducible as an EKS cluster.

# Example Cluster API definition (Simplified)
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: self-hosted-prod
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AWSCluster
    name: self-hosted-prod
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSCluster
metadata:
  name: self-hosted-prod
spec:
  region: us-east-1
  sshKeyName: default-key
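
The Cluster and AWSCluster objects above only describe the cluster shell; Cluster API also needs a control plane definition. A sketch of the companion KubeadmControlPlane and AWSMachineTemplate is shown below — the replica count, version, and instance type are illustrative assumptions, and field names may vary slightly between provider releases.

```yaml
# Companion control plane sketch for the Cluster above (values illustrative)
apiVersion: controlplane.cluster.x-k8s.io/v1beta1
kind: KubeadmControlPlane
metadata:
  name: self-hosted-prod-control-plane
spec:
  replicas: 3          # one control plane node per AZ
  version: v1.29.0
  machineTemplate:
    infrastructureRef:
      apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
      kind: AWSMachineTemplate
      name: self-hosted-prod-control-plane
  kubeadmConfigSpec: {}
---
apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSMachineTemplate
metadata:
  name: self-hosted-prod-control-plane
spec:
  template:
    spec:
      instanceType: t3.medium   # right-size per the sizing guidance above
      sshKeyName: default-key
```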

Step 2: Configure Networking and CNI

EKS uses the amazon-vpc-cni-k8s by default. When self-hosting, you can continue using this CNI if you want to keep VPC-native networking, or switch to Cilium to reduce EC2 overhead and enable better observability. If you choose Cilium, make sure to enable "ENI Mode" to allow pods to communicate with other AWS services via their private IPs.
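
If you take the Cilium route, ENI mode is enabled through the Helm chart values. The snippet below is a sketch based on Cilium's AWS ENI documentation — flag names have shifted between Cilium releases, so verify each key against the chart version you install.

```yaml
# values.yaml for the Cilium Helm chart (ENI mode) -- verify keys for your version
eni:
  enabled: true          # allocate pod IPs from AWS ENIs
ipam:
  mode: eni
egressMasqueradeInterfaces: eth0
routingMode: native      # pods get VPC-routable IPs, no overlay encapsulation
```

Applied with something like `helm install cilium cilium/cilium -n kube-system -f values.yaml`, this gives pods native VPC IPs so they can reach other AWS services directly.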

Step 3: State Migration and Traffic Shift

The most critical phase is migrating your applications. Start by deploying your CI/CD pipelines to the new self-hosted cluster. Use a tool like Velero to take snapshots of Persistent Volumes (PVs) in EKS and restore them in the new cluster. Once the data is synced, update your Route 53 records to point your application DNS from the EKS Load Balancer to the new Self-Hosted Load Balancer.
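
Velero backups can be driven declaratively as well as from the CLI. A sketch of a Backup resource for the EKS side is below — the namespace list and storage location name are placeholders, and you would run the matching `velero restore create` against the new cluster once the backup completes.

```yaml
# Velero Backup sketch for the EKS side (namespace names are placeholders)
apiVersion: velero.io/v1
kind: Backup
metadata:
  name: eks-migration-backup
  namespace: velero
spec:
  includedNamespaces:
    - production          # the workloads being migrated
  snapshotVolumes: true   # snapshot the EBS-backed PVs
  storageLocation: default
```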

Trade-offs and Risks

⚠️ Common Mistake: Neglecting etcd maintenance. In EKS, AWS manages etcd health and backups. In a self-hosted environment, if your etcd quorum is lost and you don't have a recent backup, you lose the entire cluster configuration. Always automate S3-based backups for etcd.
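
One way to automate this is a CronJob on the control plane that snapshots etcd and copies the file to S3. The manifest below is a sketch only: the container image, bucket name, and certificate paths are assumptions that must be adapted to your kubeadm layout and registry.

```yaml
# Nightly etcd snapshot to S3 -- image, bucket, and cert paths are placeholders
apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-backup
  namespace: kube-system
spec:
  schedule: "0 2 * * *"
  jobTemplate:
    spec:
      template:
        spec:
          nodeSelector:
            node-role.kubernetes.io/control-plane: ""
          tolerations:
            - key: node-role.kubernetes.io/control-plane
              effect: NoSchedule
          hostNetwork: true        # reach etcd on 127.0.0.1:2379
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: your-registry/etcd-backup:latest  # must bundle etcdctl + aws cli
              command:
                - /bin/sh
                - -c
                - |
                  ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd.db \
                    --endpoints=https://127.0.0.1:2379 \
                    --cacert=/etc/kubernetes/pki/etcd/ca.crt \
                    --cert=/etc/kubernetes/pki/etcd/server.crt \
                    --key=/etc/kubernetes/pki/etcd/server.key
                  aws s3 cp /tmp/etcd.db "s3://YOUR-BUCKET/etcd/$(date +%F).db"
              volumeMounts:
                - name: etcd-certs
                  mountPath: /etc/kubernetes/pki/etcd
                  readOnly: true
          volumes:
            - name: etcd-certs
              hostPath:
                path: /etc/kubernetes/pki/etcd
```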

The transition to self-hosted Kubernetes is a trade-off between capital expenditure (AWS bills) and operational expenditure (engineering time). You gain granular control over the API server, which allows you to enable feature gates or use custom authentication providers. However, you are now the primary responder for "Control Plane Down" alerts at 3 AM.

| Feature            | Amazon EKS                   | Self-Hosted (EC2)               |
| ------------------ | ---------------------------- | ------------------------------- |
| Control Plane Cost | $0.10/hour (~$72/mo)         | Cost of EC2 instances used      |
| Management Effort  | Minimal (AWS managed)        | High (OS patches, etcd backups) |
| Customization      | Limited to AWS support       | Full (any flag, any version)    |
| Scalability        | Automatic for control plane  | Manual or via Cluster API       |

Operational Efficiency Tips

To maximize the savings from your migration, implement Karpenter (now available for non-EKS clusters) or a custom Cluster Autoscaler that prioritizes Spot Instances. Because you control the control plane nodes, you can run them on smaller, reserved instances (RIs) to lower costs further, while the worker nodes scale elastically on cheap spot capacity. When I implemented this for a fintech client, we reduced their monthly Kubernetes-related spend from $14,000 to $8,200 by simply optimizing the control plane placement and switching to Cilium for networking efficiency.
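
A Spot-first policy in Karpenter is expressed as a NodePool requirement. The sketch below follows the v1 API shape; the API group versions differ across Karpenter releases, and the referenced EC2NodeClass name and CPU limit are placeholder assumptions.

```yaml
# Karpenter NodePool favoring Spot + ARM64 (API version and names are assumptions)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot-workers
spec:
  template:
    spec:
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot"]        # only provision Spot capacity
        - key: kubernetes.io/arch
          operator: In
          values: ["arm64"]       # prefer Graviton price-performance
      nodeClassRef:
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default             # placeholder EC2NodeClass
  limits:
    cpu: "200"                    # cap total provisioned vCPUs
```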

📌 Key Takeaways

  • Migrating from EKS saves the $0.10/hr cluster fee and allows for control plane bin-packing.
  • Use Cluster API to maintain automation parity with managed services.
  • Self-hosting is only cost-effective if you manage 5+ clusters or have massive node counts.
  • etcd is the heart of your cluster; automate its backup to S3 immediately.
  • Leverage Graviton (ARM64) instances for the control plane to get the best price-performance ratio.

Frequently Asked Questions

Q. Is self-hosted Kubernetes as secure as EKS?

A. Yes, provided you implement standard security practices. You must manage your own TLS certificates (usually via kubeadm), restrict access to the API server via Security Groups, and regularly patch the underlying Linux OS on your EC2 instances. EKS handles these tasks automatically; when self-hosting, you must automate them yourself with tools like Ansible or Terraform.

Q. How do I handle Load Balancers without EKS?

A. You use the AWS Load Balancer Controller. Even on self-hosted Kubernetes, you can install the controller to manage ALBs and NLBs for your services. You simply need to provide the controller with the correct IAM permissions to interact with your AWS account.
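
Once the controller is installed with its IAM permissions, requesting an NLB is just a matter of Service annotations. The sketch below uses annotation names from the controller's documentation — verify them against your controller version, and note the service name and ports are placeholders.

```yaml
# Service requesting an NLB from the AWS Load Balancer Controller (sketch)
apiVersion: v1
kind: Service
metadata:
  name: web
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "external"
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
    service.beta.kubernetes.io/aws-load-balancer-scheme: "internet-facing"
spec:
  type: LoadBalancer
  selector:
    app: web
  ports:
    - port: 80
      targetPort: 8080
```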

Q. Can I use Fargate with self-hosted Kubernetes?

A. No. AWS Fargate for Kubernetes is a feature specifically tied to Amazon EKS. If you move to a self-hosted model, you must use EC2 instances or bare-metal nodes for your workloads. This is often where the cost savings come from, as Fargate can be more expensive than well-utilized EC2 Spot instances.
