Migrating to Kubernetes: A Journey Through CI/CD with GitHub Actions, ArgoCD, and Helm

Introduction

For the past two and a half months, I've been deeply immersed in one of the most transformative infrastructure projects of my career: migrating our workloads from Amazon ECS to Kubernetes (EKS) and establishing a robust, modern CI/CD pipeline. This journey has been both challenging and rewarding, teaching me invaluable lessons about container orchestration, GitOps practices, and the power of declarative infrastructure.

In this article, I'll walk you through our complete CI/CD setup, covering everything from building container images with GitHub Actions to deploying applications using GitOps principles with ArgoCD and Helm Charts. Whether you're planning a similar migration or looking to modernize your deployment pipeline, I hope this serves as a practical guide.

The Motivation: Why Move from ECS to Kubernetes?

Before diving into the technical details, let me address the elephant in the room: why migrate from ECS to Kubernetes? While ECS is a solid container orchestration service, Kubernetes offers several advantages that made the migration worthwhile:

  • Portability: Kubernetes is cloud-agnostic, giving us flexibility to run workloads across different cloud providers
  • Ecosystem: A rich ecosystem of tools and operators (ArgoCD, KEDA, Prometheus, etc.)
  • Scalability: More granular control over resource allocation and scaling policies
  • GitOps: Native support for declarative deployments and GitOps workflows
  • Community: Extensive community support and a wealth of learning resources

The decision to migrate wasn't taken lightly, but the benefits have already started to materialize in terms of deployment velocity, observability, and operational flexibility.

Architecture Overview

Our CI/CD pipeline follows a modern GitOps approach with the following components:

  1. GitHub Actions: Builds and pushes container images to ECR
  2. Amazon ECR: Stores container images securely
  3. GitOps Repository: Contains ArgoCD application definitions and Helm values
  4. ArgoCD: Monitors the GitOps repository and syncs applications to Kubernetes
  5. Helm Charts: Templates for Kubernetes resources with environment-specific values
┌─────────────┐     ┌────────────────┐     ┌─────────────┐     ┌─────────────┐
│   GitHub    │────▶│ GitHub Actions │────▶│  Amazon ECR │     │   ArgoCD    │
│ Repository  │     │ (Build & Tag)  │     │ (Registry)  │     │  (GitOps)   │
└─────────────┘     └────────────────┘     └─────────────┘     └──────┬──────┘
                                                                      │
                                                               ┌──────▼──────┐
                                                               │ Kubernetes  │
                                                               │    (EKS)    │
                                                               └──────▲──────┘
                                                                      │
                                                               ┌──────┴──────┐
                                                               │ Helm Charts │
                                                               │ (Templates) │
                                                               └─────────────┘

Step 1: Building Container Images with GitHub Actions

The foundation of our CI/CD pipeline is GitHub Actions, which automates the build and push process for container images. Let me break down our workflow:

Build and Release Workflow

Our main workflow (build-push-release.yml) triggers on every push to the main branch and follows a sophisticated versioning strategy:

Key Features:

  • Semantic Versioning: Automatically determines version bumps (major, minor, patch) based on commit messages or PR labels
  • Smart Tagging: Pushes a SHA-based tag first, then adds version tags to the same image without rebuilding once the build succeeds
  • Release Automation: Creates Git tags and GitHub releases automatically

Here's a simplified version of how the workflow operates:

name: Build & Release

on:
  push:
    branches:
      - main

# OIDC federation with AWS requires the id-token permission
permissions:
  id-token: write
  contents: read

jobs:
  build-release:
    runs-on: ubuntu-latest
    steps:
      - name: Check out code
        uses: actions/checkout@v4

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v5
        with:
          role-to-assume: ${{ secrets.AWS_ROLE_ARN }}
          aws-region: ${{ vars.AWS_REGION }}

      - name: Log in to Amazon ECR
        id: ecr-login
        uses: aws-actions/amazon-ecr-login@v2

      - name: Build and push with SHA tag
        uses: docker/build-push-action@v6
        with:
          context: .
          push: true
          tags: |
            ${{ steps.ecr-login.outputs.registry }}/matters-ai/service-a:sha-${{ github.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max

Version Detection Strategy:

The workflow intelligently determines version bumps:

  • Major: Breaking changes (commit messages with BREAKING CHANGE: or feat!:)
  • Minor: New features (feat: prefix)
  • Patch: Bug fixes and other changes
  • PR Labels: Can override with release:major, release:minor, or release:patch
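
Under the hood, the detection can be expressed as a small shell step. Here's a simplified sketch (our real workflow also honors the PR-label overrides, which are omitted here):

- name: Determine version bump
  id: bump
  run: |
    # Classify the latest commit message (PR-label overrides omitted)
    MSG="$(git log -1 --pretty=%B)"
    if echo "$MSG" | grep -qE '(^feat!:|BREAKING CHANGE:)'; then
      echo "bump=major" >> "$GITHUB_OUTPUT"
    elif echo "$MSG" | grep -qE '^feat(\(.*\))?:'; then
      echo "bump=minor" >> "$GITHUB_OUTPUT"
    else
      echo "bump=patch" >> "$GITHUB_OUTPUT"
    fi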

After building with a SHA tag, the workflow adds version tags to the same image without rebuilding, which is both efficient and ensures consistency.
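
Concretely, adding a tag is a manifest-only operation against the registry. A minimal sketch using docker buildx imagetools (REGISTRY and VERSION stand in for values computed earlier in the workflow):

- name: Add version tag without rebuilding
  run: |
    # Point the new version tag at the already-pushed SHA-tagged manifest
    docker buildx imagetools create \
      --tag "$REGISTRY/matters-ai/service-a:v$VERSION" \
      "$REGISTRY/matters-ai/service-a:sha-$GITHUB_SHA"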

Deployment Workflow

For deploying to development environments, we have a separate workflow (deploy-dev.yml) that supports two deployment strategies:

  1. Branch Deployments: Builds and deploys from feature/hotfix/bugfix branches
  2. Release Deployments: Deploys a specific released version

The branch deployment workflow:

  • Validates branch naming conventions
  • Checks whether the image already exists in ECR and skips the build if so (sketched after this list)
  • Updates the GitOps repository with the new image tag
  • Triggers ArgoCD sync automatically
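
The existence check boils down to a single AWS CLI call; describe-images fails when the tag is absent, which the step turns into an output (a sketch, with the repository name as a placeholder):

- name: Check if image already exists in ECR
  id: check-image
  run: |
    # describe-images exits non-zero if the tag does not exist
    if aws ecr describe-images \
         --repository-name matters-ai/service-a \
         --image-ids imageTag="sha-$GITHUB_SHA" >/dev/null 2>&1; then
      echo "exists=true" >> "$GITHUB_OUTPUT"
    else
      echo "exists=false" >> "$GITHUB_OUTPUT"
    fi

Later steps can then guard the build with if: steps.check-image.outputs.exists == 'false'.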

Step 2: Container Registry with Amazon ECR

Amazon ECR serves as our primary container registry, providing secure, scalable image storage. We leverage ECR's features:

  • Image Scanning: Automatic vulnerability scanning
  • Lifecycle Policies: Automatic cleanup of old images (see the example after this list)
  • IAM Integration: Fine-grained access control using IAM roles
  • Cross-Region Replication: For disaster recovery and latency optimization
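
Lifecycle policies, for instance, are plain JSON documents attached to a repository. An illustrative rule that expires older SHA-tagged build images (the count is arbitrary):

{
  "rules": [
    {
      "rulePriority": 1,
      "description": "Keep only the 50 most recent sha- tagged images",
      "selection": {
        "tagStatus": "tagged",
        "tagPrefixList": ["sha-"],
        "countType": "imageCountMoreThan",
        "countNumber": 50
      },
      "action": { "type": "expire" }
    }
  ]
}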

Our GitHub Actions workflows authenticate to ECR using OIDC (OpenID Connect), eliminating the need to store long-lived credentials. This is a security best practice that I highly recommend.

Step 3: GitOps with ArgoCD

ArgoCD is the heart of our GitOps implementation. It continuously monitors our GitOps repository and ensures that the Kubernetes cluster state matches the desired state defined in Git.

ArgoCD Application Structure

Our GitOps repository (gitops-argocd) follows a hierarchical structure:

gitops-argocd/
├── argocd/
│   ├── applications/
│   │   ├── dev/
│   │   │   ├── service-a.yaml
│   │   │   ├── service-b.yaml
│   │   │   └── ...
│   │   ├── staging/
│   │   ├── prod/
│   │   └── root-*-app.yaml
│   └── projects/
│       ├── dev.yaml
│       ├── staging.yaml
│       └── prod.yaml
└── apps/
    ├── dev/
    │   └── values/
    │       ├── service-a.yaml
    │       └── ...
    └── ...

Root Applications

We use ArgoCD's Application of Applications pattern, where root applications manage other applications:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: root-apps-dev
  namespace: argocd
spec:
  project: dev
  source:
    repoURL: git@github.com:matters-ai/gitops-argocd.git
    targetRevision: main
    path: argocd/applications/dev
  destination:
    server: https://kubernetes.default.svc
    namespace: dev
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

Key Features:

  • Automated Sync: Changes in Git are automatically synced to the cluster
  • Self-Healing: ArgoCD automatically corrects manual changes to the cluster
  • Pruning: Removes resources that are no longer in Git

Application Definitions

Each application references a Helm chart and environment-specific values:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: service-a
  namespace: argocd
spec:
  project: dev
  sources:
  - repoURL: 'git@github.com:matters-ai/helm-charts.git'
    path: service-a
    targetRevision: main
    helm:
      valueFiles:
      - $values/apps/dev/values/service-a.yaml
  - repoURL: 'git@github.com:matters-ai/gitops-argocd.git'
    targetRevision: main
    ref: values
  destination:
    server: https://kubernetes.default.svc
    namespace: service-a
  syncPolicy:
    automated:
      prune: true
      selfHeal: true

This setup allows us to:

  • Separate Helm chart templates from environment-specific values
  • Use ArgoCD's multi-source feature to reference values from a different repository
  • Maintain a clear separation of concerns

Step 4: Helm Charts for Kubernetes Deployments

Helm charts provide a templated approach to defining Kubernetes resources. Our Helm charts are stored in a separate repository (helm-charts) and follow best practices:

Chart Structure

service-a/
├── Chart.yaml
├── values.yaml
└── templates/
    ├── deployment-api.yaml
    ├── deployment-workers.yaml
    ├── service.yaml
    ├── ingress.yaml
    ├── hpa.yaml
    ├── pdb.yaml
    ├── serviceaccount.yaml
    └── _helpers.tpl

Key Features of Our Helm Charts

  1. Multi-Component Support: Our service charts can support both API servers and queue workers, each with independent configurations

  2. Environment-Specific Values: Values files in the GitOps repository override defaults:

# apps/dev/values/service-a.yaml
image:
  repository: "123456789012.dkr.ecr.ap-south-1.amazonaws.com/matters-ai/service-a"
  tag: "v1.2.3"
  pullPolicy: IfNotPresent

api:
  enabled: true
  replicaCount: 1
  autoscaling:
    enabled: true
    minReplicas: 1
    maxReplicas: 3
    targetCPUUtilizationPercentage: 80

  3. Flexible Configuration: Support for ConfigMaps, Secrets, environment variables, and volume mounts

  4. Resource Management: CPU and memory limits/requests for proper resource allocation

  5. Autoscaling: Built-in support for Horizontal Pod Autoscaling (HPA) and KEDA for event-driven scaling
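
For the event-driven side, KEDA scales the worker deployment on queue depth. A sketch of a ScaledObject for an SQS-backed worker (the queue URL and TriggerAuthentication name are placeholders):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: service-a-workers
spec:
  scaleTargetRef:
    name: service-a-workers        # the worker Deployment
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.ap-south-1.amazonaws.com/123456789012/service-a-jobs
        queueLength: "5"           # target messages per replica
        awsRegion: ap-south-1
      authenticationRef:
        name: keda-aws-credentials # a TriggerAuthentication, e.g. IRSA-backed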

Deployment Template Example

Our deployment templates are comprehensive, supporting various Kubernetes features:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ include "service-a.fullname" . }}-api
spec:
  replicas: {{ .Values.api.replicaCount }}
  selector:
    matchLabels:
      {{- include "service-a.selectorLabels" . | nindent 6 }}
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        {{- include "service-a.selectorLabels" . | nindent 8 }}
    spec:
      serviceAccountName: {{ include "service-a.serviceAccountName" . }}
      containers:
        - name: {{ .Chart.Name }}-api
          image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
          imagePullPolicy: {{ .Values.image.pullPolicy }}
          resources:
            {{- toYaml .Values.api.resources | nindent 12 }}

The Complete Deployment Flow

Let me walk you through what happens when we deploy a new version:

  1. Developer pushes code to a feature branch or merges to main
  2. GitHub Actions triggers the build workflow
  3. Docker image is built and pushed to ECR with a SHA tag
  4. Version is determined based on commit messages or PR labels
  5. Version tag is added to the existing image (no rebuild)
  6. Git tag and GitHub release are created automatically
  7. Deployment workflow (manual or automated) updates the GitOps repository
  8. ArgoCD detects the change in the GitOps repository
  9. ArgoCD syncs the new image tag to the Kubernetes cluster
  10. Helm renders the Kubernetes manifests with the new image
  11. Kubernetes performs a rolling update of the deployment

This entire process is automated, repeatable, and auditable through Git history.

Challenges and Learnings

No migration of this scale is without challenges. Here are some key learnings:

1. Image Tagging Strategy

Initially, we struggled with image tagging. We learned that:

  • SHA-based tags are great for traceability
  • Version tags are essential for releases
  • Tagging without rebuilding (using manifests) saves time and ensures consistency

2. GitOps Repository Management

Managing the GitOps repository required discipline:

  • All changes must go through Git (no manual kubectl edits)
  • Clear commit messages help with debugging
  • Separate repositories for charts and values improve maintainability

3. Helm Chart Complexity

As our applications grew, Helm charts became complex:

  • Use _helpers.tpl for reusable template logic (see the sketch after this list)
  • Keep values.yaml well-documented
  • Version your charts properly
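
To make the first point concrete, a _helpers.tpl entry is just a named template. This sketch mirrors the standard Helm scaffold:

{{/* templates/_helpers.tpl */}}

{{/* Fully qualified app name, truncated to the 63-character Kubernetes limit */}}
{{- define "service-a.fullname" -}}
{{- printf "%s-%s" .Release.Name .Chart.Name | trunc 63 | trimSuffix "-" }}
{{- end }}

{{/* Selector labels shared by Deployments and Services */}}
{{- define "service-a.selectorLabels" -}}
app.kubernetes.io/name: {{ .Chart.Name }}
app.kubernetes.io/instance: {{ .Release.Name }}
{{- end }}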

4. ArgoCD Sync Policies

Finding the right balance for sync policies:

  • Automated sync is great for dev/staging
  • Manual sync might be preferred for production (see the sketch after this list)
  • Self-healing is powerful but can be surprising if not understood
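
In practice, the production difference is simply the absence of the automated block; nothing deploys until someone runs argocd app sync or clicks Sync in the UI:

# Production variant: omit `automated` so every sync is explicit
syncPolicy:
  syncOptions:
    - CreateNamespace=true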

5. Resource Management

Kubernetes resource management is more granular than ECS:

  • Proper resource requests/limits are crucial (example after this list)
  • HPA requires careful tuning
  • Node affinity and taints help with workload placement
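
As a reference point, requests and limits live in the per-environment values files; an illustrative (not prescriptive) excerpt:

# apps/dev/values/service-a.yaml (excerpt; numbers are illustrative)
api:
  resources:
    requests:
      cpu: 250m
      memory: 256Mi
    limits:
      cpu: 500m
      memory: 512Mi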

Best Practices We Follow

  1. Immutable Infrastructure: All changes go through Git, no manual edits
  2. Environment Parity: Same Helm charts, different values files
  3. Security: OIDC for authentication, least-privilege IAM roles
  4. Observability: Comprehensive logging and monitoring
  5. Documentation: Clear READMEs and inline comments
  6. Testing: Test Helm charts with different values before deploying
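
For the testing point, a lint-and-render pass catches most templating mistakes before ArgoCD ever sees a change. A sketch (the relative paths are illustrative):

- name: Lint and render chart against dev values
  run: |
    # Fails fast on chart errors; template rendering validates the output
    helm lint ./service-a -f ./gitops-argocd/apps/dev/values/service-a.yaml
    helm template service-a ./service-a \
      -f ./gitops-argocd/apps/dev/values/service-a.yaml > /dev/null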

What's Next?

Our CI/CD journey continues to evolve. Some areas we're exploring:

  • Progressive Delivery: Using Argo Rollouts for canary and blue-green deployments
  • Policy as Code: Implementing OPA (Open Policy Agent) for governance
  • Multi-Cluster Management: Managing multiple EKS clusters with ArgoCD
  • Cost Optimization: Better resource utilization and spot instance integration

Conclusion

Migrating from ECS to Kubernetes and establishing a modern CI/CD pipeline has been a transformative experience. The combination of GitHub Actions, ECR, ArgoCD, and Helm Charts has given us:

  • Faster deployments: Automated pipeline reduces manual steps
  • Better visibility: Git-based history of all changes
  • Improved reliability: Automated testing and rollback capabilities
  • Enhanced security: Immutable infrastructure and proper access controls
  • Greater flexibility: Easy to add new environments or services

While the initial setup required significant effort, the long-term benefits in terms of developer productivity, operational efficiency, and system reliability make it well worth it.

If you're considering a similar migration, I'd recommend:

  1. Start small with a non-critical service
  2. Invest time in understanding GitOps principles
  3. Document everything as you go
  4. Involve your team early in the process
  5. Be patient—migrations take time

I hope this article provides valuable insights for your own CI/CD journey. If you have recently migrated to Kubernetes or implemented GitOps, I'd love to hear about your experiences and learnings. Feel free to share your thoughts or reach out if you'd like to discuss any aspect of this setup in more detail.