[Remote] Senior Cloud Operations Engineer
Note: This is a remote position open to candidates in the USA. Loftware is a global leader in enterprise labeling and packaging, providing cloud-based solutions for businesses. They are seeking a Senior Cloud Operations Engineer to join their 24x7 Cloud Operations Team, responsible for building, maintaining, and troubleshooting customer environments across various cloud platforms.
Responsibilities
- Own and enhance monitoring, observability, and alerting frameworks across AWS, Azure, and other platforms
- Define SLIs/SLOs and drive continuous reliability improvements
- Architect and maintain scalable, reusable IaC solutions using Terraform, Terragrunt, and Ansible
- Champion automation-first approaches to eliminate manual operational tasks
- Design and enforce cloud security best practices, governance, and compliance standards
- Lead vulnerability assessments and remediation strategies across environments
- Design and implement disaster recovery (DR), backup strategies, and high availability architectures
- Lead incident response, root cause analysis (RCA), and postmortem reviews with actionable improvements
- Partner closely with Engineering, QA, and Product teams to improve system architecture and application performance
- Influence design decisions to enhance scalability, resilience, and operational efficiency
- Design and manage complex networking setups including VPNs, Direct Connect, Transit Gateways, and hybrid connectivity
- Identify bottlenecks and lead initiatives to optimize system performance and cost efficiency
- Mentor junior engineers and contribute to team skill development
- Define and promote best practices across cloud operations
- Participate in and lead on-call rotations
- Act as the escalation point for critical incidents and ensure rapid resolution
Skills
- 8+ years of relevant experience in cloud operations, SRE, or infrastructure engineering
- Strong expertise in AWS and/or Azure
- Deep experience with Linux and/or Windows server environments
- Proven experience leading operational initiatives in production environments
- Strong communication skills in English (written and verbal)
- Database: PostgreSQL, Microsoft SQL Server
- Scripting: Python, Java, Bash, .NET/C#, PowerShell
- IaC and Automation: Terraform, Terragrunt, Ansible, Rundeck, Jenkins
- Cloud networking concepts: VPN, Direct Connect, Transit Gateways
- Container Technologies: Docker, Kubernetes
- Cloud-native technologies: RDS, Microservices, Serverless computing
Benefits
- Visa sponsorship is not available for this role.
- Hybrid
- Remote (U.S.-based candidates working EST hours)
- We offer comprehensive training to all employees and place an emphasis on employee development.
- We are an equal opportunities employer.
- We invest in our employees to inspire confidence and help everyone realize their full potential.
Company Overview