Responsibilities:
- Design, develop, and maintain automated data quality checks for complex data pipelines handling petabyte-scale datasets.
- Implement scalable data validation frameworks using PySpark, PyTest, Python, SQL, and Hive, ensuring comprehensive test coverage.
- Collaborate with Data Engineers and DevOps teams to integrate automated tests into CI/CD workflows.
- Analyze test results, identify data anomalies, and provide actionable insights to resolve data quality issues.
- Develop monitoring and alerting solutions for data quality in production environments.
- Document test processes, standards, and best practices, and mentor junior engineers on data quality automation.
- Continuously improve test frameworks and processes to optimize performance, scalability, and reliability.
This is a hybrid position. Expectation of days in office will be confirmed by your hiring manager.
Basic Qualifications:
- Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
- 1+ years of experience in test engineering or data engineering roles, with a focus on data quality and automation.
- Advanced proficiency in PySpark and Python for building scalable data processing and testing solutions.
- Strong experience with SQL and Hive for querying, validating, and profiling data in large datasets.
- Hands-on exposure to Hive and HDFS.
- Solid understanding of data pipeline architectures, ETL processes, and best practices for big data environments.
- Experience implementing CI/CD pipelines for automated data testing.
- Strong analytical, problem-solving, and communication skills.
- Ability to work independently and in a collaborative, fast-paced team environment.
Preferred Qualifications:
- Experience with data quality frameworks (e.g., Deequ, Great Expectations).
- Familiarity with workflow orchestration tools (e.g., Airflow, Step Functions).
- Exposure to data cataloging, data lineage, and metadata management.
Visa is an EEO Employer. Qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, sexual orientation, gender identity, disability or protected veteran status. Visa will also consider for employment qualified applicants with criminal histories in a manner consistent with EEOC guidelines and applicable local law.