11 years of hands-on experience building high-performance data solutions, with a solid foundation in SQL, query optimization, and data modeling.
Proven track record designing, orchestrating, and operating batch ETL pipelines at scale, including monitoring, incident response, and cost/performance tuning.
Currently developing end-to-end data pipelines that integrate streaming and batch workloads, using Databricks (PySpark), Snowflake, Kafka, and Apache Airflow (DAG design, scheduling, SLAs).
Experienced in schema design, incremental loads, CDC, data quality/validation, and governance best practices across lakehouse and warehouse architectures.
Collaborate closely with data scientists, platform engineers, and stakeholders to deliver reliable, observable pipelines backed by unit and integration testing and CI/CD.
Comfortable with cloud-native tooling and infrastructure-as-code, with a focus on scalability, reliability, and maintainability.