We are looking for a Data Engineer to design, build, and maintain reliable data streaming and batch processing systems that support the company’s core data infrastructure.
This role focuses on orchestrating data workflows using Airflow, ensuring data correctness, operational stability, and scalability across business-critical systems.
The Data Engineer will work closely with Product, Finance, Operations, and Clinical teams to deliver accurate, timely, and auditable data pipelines, forming the foundation for analytics, reporting, and future AI initiatives.
Design, build, and maintain data pipelines supporting both batch processing and streaming workloads.
Develop, operate, and monitor Airflow DAGs for scheduled, event-driven, and batch workflows.
Implement retry logic, backfilling, dependency management, and failure recovery within Airflow DAGs (see the illustrative sketch after this list).
Build and maintain ETL/ELT pipelines from diverse data sources.
Support data streaming pipelines and message-based systems for near real-time data processing.
Ensure data correctness, consistency, and auditability across pipelines and datasets.
Implement data validation, reconciliation checks, and monitoring for production workflows.
Support batch-based upsert and historical data correction processes, maintaining change history and data lineage.
Maintain and optimize datasets, schemas, and performance in analytical data storage systems.
Support pipelines implementing complex business logic.
Build automation workflows for reporting, notifications, and internal operations.
Troubleshoot data pipeline failures and performance issues in production environments.
Collaborate with Analytics and AI teams by providing clean, reliable, and well-structured data.
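As a concrete illustration of the Airflow responsibilities above, the sketch below shows a minimal daily batch DAG with retries, catchup-based backfilling, an explicit task dependency, and a failure callback. It assumes a recent Airflow 2.x environment; the DAG, task, and callback names are hypothetical placeholders rather than an existing pipeline.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.python import PythonOperator


def extract_orders(**context):
    # The logical date ("ds") keys each run to its own partition, so a
    # backfill run reprocesses exactly one historical day.
    print(f"extracting orders for {context['ds']}")


def load_orders(**context):
    print(f"upserting orders partition {context['ds']} into the warehouse")


def notify_on_failure(context):
    # Placeholder alert hook; in practice this would page or post to chat.
    print(f"task {context['task_instance'].task_id} failed")


default_args = {
    "retries": 3,                          # retry transient failures
    "retry_delay": timedelta(minutes=5),
    "on_failure_callback": notify_on_failure,
}

with DAG(
    dag_id="orders_daily_batch",           # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=True,                          # enables historical backfills
    max_active_runs=1,                     # keep backfill runs ordered
    default_args=default_args,
) as dag:
    extract = PythonOperator(task_id="extract_orders", python_callable=extract_orders)
    load = PythonOperator(task_id="load_orders", python_callable=load_orders)

    extract >> load                        # dependency: load waits for extract
```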
Bachelor’s degree in Computer Science, Engineering, Information Systems, or related field; or equivalent practical experience.
2–3 years of experience in Data Engineering, Backend Engineering, or a closely related role.
Strong proficiency in Python for data processing and automation.
Strong proficiency in SQL for querying, joining, and aggregating data.
Hands-on experience with batch data processing and streaming data systems.
Hands-on experience with Airflow, including designing and maintaining DAGs.
Experience working with SQL and NoSQL data sources.
Familiarity with analytical data storage platforms (e.g. cloud data warehouses).
Strong understanding of data reliability, correctness, and production operations.
Ability to communicate clearly in Thai, with working proficiency in English.
Experience with message queues or streaming platforms.
Experience with automation or low-code workflow tools.
Familiarity with cloud infrastructure environments, especially GCP (Google Cloud) or AWS.
Experience supporting financial, payroll, or compliance-related data systems.