Blend: Modernizing AWS Infrastructure for Scale, Security, and Reliability

Amine Malaeb, November 1, 2025

Modernizing Blend’s AWS Infrastructure for Scale, Security, and Reliability

Customer: Blend

Short Description: Digico Solutions modernized Blend’s AWS and EKS environments by enhancing autoscaling, improving observability, tightening security, and preparing a zero-downtime cutover plan for a production-ready deployment.

Overview

Blend, a SaaS platform running on Amazon EKS, experienced scaling limitations, inconsistent pod scheduling, and gaps in observability and security controls across its staging environments. To support future growth and stabilize operations, Blend engaged Digico Solutions to assess the environment and implement improvements aligned with AWS best practices.

The Challenge

Blend’s existing EKS setup had several issues that affected performance, security, and reliability:

  • Node groups frequently reached capacity.
  • Cluster Autoscaler lacked the required IAM permissions.
  • HPAs and resource limits were misconfigured.
  • Critical control plane logs were not enabled.
  • Datadog monitoring was inconsistent.
  • RDS and Redis security groups were overly permissive.
  • Jenkins was exposed in a public subnet.
  • There was no plan for a zero-downtime cutover to a production-ready environment.
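
To make "overly permissive" concrete, the following is a minimal sketch of how such security group rules might be detected. It is not taken from the engagement. It assumes the input has the shape returned by the EC2 DescribeSecurityGroups API (as exposed by boto3's `describe_security_groups`), and the port set and the function name `find_open_data_rules` are illustrative assumptions.

```python
from typing import Any

# Assumed for illustration: default MySQL and PostgreSQL ports for RDS, 6379 for Redis.
DATA_PORTS = {3306, 5432, 6379}
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def find_open_data_rules(
    security_groups: list[dict[str, Any]],
) -> list[tuple[str, int, str]]:
    """Return (group_id, port, cidr) for each data port reachable from any address."""
    findings = []
    for group in security_groups:
        for perm in group.get("IpPermissions", []):
            cidrs = [r["CidrIp"] for r in perm.get("IpRanges", [])]
            cidrs += [r["CidrIpv6"] for r in perm.get("Ipv6Ranges", [])]
            open_cidrs = [c for c in cidrs if c in OPEN_CIDRS]
            if not open_cidrs:
                continue
            # IpProtocol "-1" means all protocols and ports; no port range is given.
            if perm.get("IpProtocol") == "-1":
                exposed = sorted(DATA_PORTS)
            else:
                low = perm.get("FromPort", 0)
                high = perm.get("ToPort", 65535)
                exposed = sorted(p for p in DATA_PORTS if low <= p <= high)
            for port in exposed:
                for cidr in open_cidrs:
                    findings.append((group["GroupId"], port, cidr))
    return findings
```

A least-privilege setup would typically return no findings here, with database and cache ingress limited to specific security groups or private CIDR ranges.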

The Solution

Digico Solutions addressed these issues by:

  • Deploying Karpenter to enable dynamic autoscaling and resolve scheduling failures.
  • Creating improved NodePools and EC2NodeClasses with proper labels and taints.
  • Adjusting HPAs, resource requests, and scheduling constraints for balanced workloads.
  • Enabling full EKS control plane logging for better visibility.
  • Moving Datadog to Terraform and configuring dashboards, monitors, and alerts.
  • Restricting RDS and Redis security groups based on least-privilege access.
  • Relocating Jenkins to a private subnet behind NAT.
  • Adding EFS/EBS storage classes where needed.
  • Supporting Artillery load tests to validate scaling behavior.
  • Delivering a documented cutover plan with success criteria and dry-run validation.
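
HPA settings and resource requests need to be tuned together because a utilization target is measured relative to each pod's resource requests. The sketch below reproduces the replica calculation documented for the Kubernetes Horizontal Pod Autoscaler, including its default tolerance of 0.1. The function name and the example numbers are illustrative assumptions, not values from Blend's environment.

```python
import math


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int,
    max_replicas: int,
    tolerance: float = 0.1,
) -> int:
    """Approximate the HPA rule: ceil(current_replicas * current / target)."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    ratio = current_metric / target_metric
    # Within the tolerance band the HPA leaves the replica count unchanged.
    if abs(ratio - 1.0) <= tolerance:
        desired = current_replicas
    else:
        desired = math.ceil(current_replicas * ratio)
    # The result is always clamped to the configured min/max bounds.
    return max(min_replicas, min(max_replicas, desired))
```

For example, 4 replicas averaging 90% CPU utilization against a 60% target would scale to 6. If requests are set too low, reported utilization is inflated and the same workload scales out earlier than intended.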

The Results

  • Autoscaling became reliable through Karpenter, eliminating pod scheduling issues.
  • Full visibility was restored through EKS logging and Datadog monitoring.
  • Security posture improved through tighter security groups and CI/CD isolation.
  • Jenkins now operates securely from a private network.
  • The environment adheres to AWS best practices and supports predictable scaling.
  • Blend now has a validated, repeatable cutover workflow for production adoption.
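
One way a cutover dry run could be checked against success criteria is sketched below. The criteria names and limits are hypothetical placeholders, not the criteria from Blend's cutover plan. The idea is only that each criterion has a measurable limit and that a dry run passes when no limit is exceeded.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Criterion:
    name: str     # hypothetical metric name, e.g. "error_rate_pct"
    limit: float  # maximum acceptable value for the metric


def evaluate_dry_run(
    criteria: list[Criterion], observed: dict[str, float]
) -> list[str]:
    """Return a list of failure messages; an empty list means the dry run passed."""
    failures = []
    for criterion in criteria:
        value = observed.get(criterion.name)
        if value is None:
            # A missing measurement counts as a failure, not a silent pass.
            failures.append(f"{criterion.name}: no measurement")
        elif value > criterion.limit:
            failures.append(
                f"{criterion.name}: {value} exceeds limit {criterion.limit}"
            )
    return failures
```

Treating a missing measurement as a failure keeps the gate conservative: the cutover proceeds only when every criterion was actually observed and met.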

The Outcome

Blend’s AWS environment is now scalable, secure, and fully observable, enabling stable deployments and reducing operational risk. The improvements provide a strong foundation for production workloads and future application growth.