About Us:
Platinumlist.net has been a leading online event guide and ticketing platform in the Gulf region since 2009. As the largest ticketing provider in the GCC, we serve a wide range of events across the United Arab Emirates, Saudi Arabia, Oman, Bahrain, Qatar, and Kuwait from our Dubai-based headquarters.
About the Role:
We’re looking for a Senior DevOps / SRE Engineer to own and evolve our AWS infrastructure, with a strong focus on reliability, scalability, performance under peak load, and safe rollout of new AWS capabilities. You’ll partner with engineering teams to keep our platform fast and resilient during traffic spikes while continuously improving automation, observability, security, and cost efficiency.
Key Responsibilities:
- Own production reliability on AWS: availability, latency, throughput, capacity, and incident response.
- Architect and operate scalable infrastructure (multi-AZ as a baseline; DR strategy and regular testing).
- Build and maintain Infrastructure as Code (Terraform / CloudFormation / CDK) and Git-based workflows.
- Improve CI/CD pipelines and deployment strategies (blue/green, canary, progressive delivery).
- Implement strong observability: metrics, logs, traces, alerting, and dashboards; define SLOs/SLIs and reduce alert noise.
- Own database operations on AWS (Aurora/RDS MySQL): backups/restores (including restore drills), read replicas, performance troubleshooting, and capacity planning.
- Improve caching and traffic handling (CDN, Redis/ElastiCache, queues) to sustain peak demand.
- Harden security posture: IAM least privilege, secrets management, patching, WAF, audit trails.
- Drive adoption of relevant AWS managed services where they increase reliability and reduce ops burden.
- Drive cloud cost efficiency (FinOps): cost visibility, tagging, budgets/alerts, rightsizing, and smart usage of AWS pricing models without compromising reliability.
- Lead post-incident reviews (RCA, corrective actions, prevention) and ensure improvements are implemented and verified.