Guardant Health · 2 months ago
We are on the hunt for a dynamic and proficient Cloud Data Engineer to join our Guardant Data Platform within the Data Team.
Duties and Responsibilities
• Quickly learn and adapt to new technologies as the Data Team's technology stack evolves, demonstrating the ability to tackle new challenges.
• Consider all aspects of usability, scalability, deployment, integration, maintenance, and automation when integrating new technology stacks.
• Demonstrate strong programming skills in at least one language (Python, Scala, Java) and the ability to learn additional languages as needed.
• Build and maintain ETL pipelines and data-driven systems utilizing technologies such as Apache Spark, AWS Glue, Athena, Redshift, and AWS Batch.
• Expertise in writing complex SQL queries is essential.
• Manage code on GitHub, with a comprehensive understanding of advanced git operations, including git-flow, rebasing, and squashing.
• Implement infrastructure as code using Terraform and utilize AWS Analytics and Data Services like Glue, S3, Lambda, AWS Batch, Athena, Redshift, DynamoDB, CloudWatch, Kinesis, SQS, SNS, and DMS.
• Use Jenkins to implement deployment pipelines and engage in requirements gathering to estimate efforts for integrating new technology stacks.
• Design and architect solutions for ML, Data Governance, Deployment/Integration Automations, and Data Analytics.
• Explore and learn additional AWS services such as ECS, ECR, and EC2, along with Data Modeling.
• A minimum of 5 years of experience in software development, with at least 2 years focused on building scalable and stable data pipelines using the AWS tech stack.
• Proven experience in constructing Data Pipelines in the AWS Cloud, gained through job experience or personal projects.
Data and AWS Tools
• Strong programming skills and proficiency in SQL.
• Familiarity with a range of AWS Analytics Ecosystem components, including but not limited to Apache Airflow, Apache Spark, S3, Glue, Kafka, AWS Athena, Lambda, Redshift, Lake Formation, AWS Batch, ECS - Fargate, Kinesis, Flink, DynamoDB, and SageMaker.
• Experience using IaC tools (for example, Terraform or CloudFormation) when deploying Data Pipelines in AWS.
• Experience with Docker, Kubernetes, ECR, EC2, VPC, SNS, SQS, CloudWatch is highly valued.
• Experience in building Jenkins deployment pipelines is highly valued.
• Proficiency in using collaboration tools like JIRA, Confluence, GitHub is beneficial.
• Exposure to NoSQL databases is an advantage.
Job Location: Hyderabad, Telangana, India. (Hybrid model - Work from Office).
Why Join Us?
At Guardant Health, we are on a mission to conquer cancer with data. You’ll be part of a team that is revolutionizing precision oncology through cutting-edge technology and innovative software solutions. If you are a technically strong, people-oriented leader who thrives in a collaborative and high-impact environment, we’d love to hear from you!
To learn more about the information collected when you apply for a position at Guardant Health, Inc. and how it is used, please review our Privacy Notice for Job Applicants.
Please visit our career page at: http://www.guardanthealth.com/jobs/