Experienced Data Platform Engineer with 5+ years building scalable data infrastructure processing 50TB+ daily across cloud and hybrid environments
Specialized in real-time streaming architectures using Kafka and Spark, delivering sub-second analytics for business-critical applications
Proven track record of migrating legacy systems to modern data platforms, reducing costs by 40% while improving performance 3x
Work Experience
CloudTech Solutions
Senior Data Engineer
March 2022 - Present
Architected multi-tenant data platform on AWS processing 75TB of data daily from 200+ sources, supporting 500+ concurrent users with 99.95% uptime
Designed real-time fraud detection pipeline using Kafka Streams and Apache Flink, reducing detection time from 24 hours to 30 seconds
Optimized Snowflake data warehouse performance through intelligent clustering and materialized views, cutting query costs by 45% while maintaining sub-second response times
Led migration of 15 legacy ETL jobs to modern ELT architecture using dbt and Airflow, reducing maintenance overhead by 60%
Implemented data quality framework with Great Expectations, achieving 99.8% data accuracy across critical business metrics
FinanceFlow Inc
Data Engineer
June 2020 - February 2022
Built end-to-end analytics pipeline processing 2M financial transactions daily using Apache Spark and Delta Lake, enabling real-time regulatory reporting
Migrated on-premise Teradata warehouse to AWS Redshift, reducing infrastructure costs by $300K annually while improving query performance by 4x
Developed automated data reconciliation system using Python and Pandas, reducing manual validation time from 8 hours to 15 minutes daily
Created self-service analytics platform using Apache Superset, empowering 50+ business users to generate reports independently
Established CI/CD pipelines for data workflows using GitLab and Terraform, reducing deployment time from 4 hours to 20 minutes
RetailMetrics Corp
Junior Data Engineer
August 2019 - May 2020
Developed ETL pipelines using Python and Apache Airflow to process e-commerce data from 25+ sources, supporting $50M annual revenue analytics
Optimized PostgreSQL database queries, reducing average report generation time from 45 minutes to 5 minutes
Built automated data monitoring system using Prometheus and Grafana, reducing data incident response time by 70%
Collaborated with data science team to productionize ML models, implementing feature stores serving 100+ model predictions per second
Bachelor of Science in Computer Science | GPA: 3.8/4.0
May 2019
Relevant Coursework: Database Systems, Distributed Computing, Big Data Analytics, Machine Learning, Data Structures & Algorithms, Statistics
Capstone Project: Built real-time recommendation engine processing 1M user interactions daily using Apache Kafka and collaborative filtering algorithms
Certifications
AWS Certified Data Analytics - Specialty
Amazon Web Services
March 2023
Google Cloud Professional Data Engineer
Google Cloud
January 2023
Databricks Certified Associate Developer for Apache Spark
Databricks
November 2022
Snowflake SnowPro Core Certification
Snowflake
September 2022
Honors
1st Place, AWS re:Invent Hackathon
Amazon Web Services
December 2023
Designed serverless data lake solution processing IoT sensor data with 99.9% accuracy. Implemented cost-optimization algorithm reducing storage costs by 55% using intelligent tiering
CloudTech Innovation Award
CloudTech Solutions
September 2023
Recognized for developing automated data lineage tracking system used across 15+ data teams
Publications
Optimizing Spark Performance for Large-Scale ETL Workloads
Data Engineering Weekly
August 2023
Technical article demonstrating 60% performance improvement through advanced partitioning strategies. Based on community feedback, the featured solution has since been implemented by 500+ data engineers
Open Source Contributions
Apache Airflow
Contributor
2022-2023
Contributed 3 bug fixes and performance improvements
dbt-utils
Contributor
Developed custom macro for data quality validation, adopted by 200+ projects