On behalf of our client, we are looking for an experienced DevOps Engineer to take a key role in our infrastructure team. This position requires deep expertise in cloud architecture, automation, and Kubernetes orchestration.
This is a high-impact role in a fast-paced environment, working with cutting-edge technologies to build, scale, and optimize our infrastructure. You will manage multiple environments and ensure their high availability, security, and operational efficiency.
If you thrive in complex cloud-based ecosystems and love working with AWS, Kubernetes (EKS), Terraform, Kafka, and Prometheus/Grafana, we’d love to hear from you.
Job Responsibilities
- Design, build, and maintain a scalable, reliable, and high-performance cloud infrastructure.
- Develop and manage Infrastructure as Code (IaC) using Terraform (hands-on experience is a must).
- Apply AWS best practices in production environments, including high availability, security, and cost optimization.
- Maintain and enhance our EKS-based Kubernetes environment, optimizing for performance and security.
- Manage Kafka clusters, ensuring reliability, scaling, and performance tuning.
- Implement and optimize monitoring and alerting systems using Prometheus, Grafana, OpenSearch, and Jaeger.
- Automate deployment pipelines and CI/CD workflows using GitHub Actions, AWS CodePipeline, and ArgoCD.
- Administer AWS services such as RDS, S3, Route 53, ACM (AWS Certificate Manager), SSM, Lambda, and ECS, as well as OpenVPN.
- Troubleshoot production issues, ensuring high availability and minimal downtime.
- Collaborate with developers and security teams to improve infrastructure reliability and security.
- Maintain multi-environment infrastructure and ensure smooth deployments across development, staging, and production.
Requirements
- Terraform Hands-On Experience (IaC is central to our infrastructure).
- AWS Production Experience (3+ years supporting live production environments).
- Kubernetes (EKS) and Kafka expertise.
- Monitoring & Observability: Proficiency in Grafana, Prometheus, OpenSearch, and Jaeger.
- Linux Proficiency: Deep knowledge of system administration and networking.
- CI/CD Automation: Experience with GitHub Actions, ArgoCD, and AWS CodePipeline.
- Strong troubleshooting and problem-solving skills.
- A self-driven, proactive approach with a passion for operational excellence.
Bonus Skills (Nice to Have)
- Knowledge of SQL/NoSQL databases (PostgreSQL, Redis, Aerospike, Cassandra, ScyllaDB).
- Strong scripting skills (Bash, Python, Ruby, or Groovy).
- Understanding of cloud security best practices.
Why Join?
- Work in a cutting-edge cloud-native environment with a strong DevOps culture.
- Ownership & impact: you’ll have a say in architecture and infrastructure decisions.
- Collaborate with top engineers and help shape the future of our cloud operations.
- Competitive compensation and a chance to work with the latest DevOps tooling.