Data Engineers design, build, and maintain systems that enable organizations to collect, store, and analyze large volumes of data. They ensure data pipelines are efficient, scalable, and reliable, empowering data-driven decision-making.

Nguyen Duc An

Tech Stack & Skills

💾 Data Engineering

Apache Spark, Kafka, Airflow, Flink, HDFS

🗄️ Databases

SQL Server, MongoDB, Delta Lake, Iceberg

🤖 AI & Analytics

LLM, NLP, Power BI, Python

🛠️ DevOps

Docker, Git, CI/CD, MinIO

My Projects

Building the future, one pipeline at a time 🚀

02
Real-time Streaming Pipeline

Flink, Kafka, Delta Lake, MongoDB

Real-time data pipeline for IoT using Apache Flink, Kafka, Delta Lake, and Apache Iceberg. Handles CDC streams from MongoDB for analytics and downstream consumption.
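
To illustrate the CDC idea, here is a minimal sketch in plain Python of replaying change events into a current-state table. The event shape (`op`, `key`, `doc`) and the names `apply_change` and `replay` are assumptions made for illustration; they are a simplification, not MongoDB's change stream format or the project's actual Flink job.

```python
from typing import Any

# Hypothetical, simplified change event:
# {"op": "insert" | "update" | "delete", "key": <document id>, "doc": <fields>}
ChangeEvent = dict[str, Any]


def apply_change(state: dict[str, dict], event: ChangeEvent) -> None:
    """Apply one change event to an in-memory table keyed by document id."""
    op, key = event["op"], event["key"]
    if op == "insert":
        state[key] = dict(event["doc"])
    elif op == "update":
        # Merge the changed fields into the row, creating it if it is missing.
        state.setdefault(key, {}).update(event["doc"])
    elif op == "delete":
        state.pop(key, None)
    else:
        raise ValueError(f"unknown op: {op!r}")


def replay(events: list[ChangeEvent]) -> dict[str, dict]:
    """Replay events in order and return the resulting table."""
    state: dict[str, dict] = {}
    for event in events:
        apply_change(state, event)
    return state


# Made-up IoT readings, used only to exercise the sketch.
events = [
    {"op": "insert", "key": "sensor-1", "doc": {"temp": 21.5, "status": "ok"}},
    {"op": "insert", "key": "sensor-2", "doc": {"temp": 19.0, "status": "ok"}},
    {"op": "update", "key": "sensor-1", "doc": {"temp": 22.0}},
    {"op": "delete", "key": "sensor-2"},
]
sensors = replay(events)
```

Replaying events in order is what lets a downstream table stay consistent with the source collection; a real streaming job would also need to handle ordering and delivery guarantees, which this sketch ignores.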

03
ETL Pipeline with Spark & Airflow

Spark, Airflow, HDFS, Docker

Automated ETL pipeline with Apache Spark, Airflow, and HDFS. Data is processed, transformed, and stored for large-scale analytics with orchestrated workflows.
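
As a rough illustration of what an orchestrated workflow means here, the sketch below runs three tasks in dependency order using only the Python standard library. It is a stand-in, not Airflow's API: the task functions, the `TASKS` and `DEPENDENCIES` tables, and the sample values are all hypothetical.

```python
from graphlib import TopologicalSorter


def extract(ctx: dict) -> None:
    # Stand-in for reading raw records from a source system.
    ctx["raw"] = ["3", "1", "x", "2"]


def transform(ctx: dict) -> None:
    # Keep only numeric values, convert them to int, and sort.
    ctx["clean"] = sorted(int(v) for v in ctx["raw"] if v.isdigit())


def load(ctx: dict) -> None:
    # Stand-in for writing the transformed data to storage.
    ctx["stored"] = list(ctx["clean"])


TASKS = {"extract": extract, "transform": transform, "load": load}

# Each task maps to the set of tasks that must finish before it starts.
DEPENDENCIES = {"transform": {"extract"}, "load": {"transform"}}


def run_pipeline() -> dict:
    """Run every task once, in an order that respects DEPENDENCIES."""
    ctx: dict = {}
    order = list(TopologicalSorter(DEPENDENCIES).static_order())
    for name in order:
        TASKS[name](ctx)
    ctx["order"] = order
    return ctx


result = run_pipeline()
```

Declaring dependencies as data rather than hard-coding the call sequence is the core idea an orchestrator builds on; scheduling, retries, and distributed execution are left out of this sketch.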

04
Data Warehouse Medallion Architecture

SQL Server, T-SQL, Power BI

Modern data warehouse using Medallion architecture (Bronze, Silver, Gold). Focus on ETL, data modeling, normalization, and analytics reporting with SQL Server.
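
The three layers can be sketched in a few lines of plain Python, used here as a stand-in for the T-SQL the project itself relies on. The sample rows, column names, and the functions `to_silver` and `to_gold` are assumptions for illustration only.

```python
from collections import defaultdict

# Bronze: raw records kept as ingested, including duplicates and bad values.
bronze = [
    {"order_id": "1", "customer": " alice ", "amount": "10.50"},
    {"order_id": "2", "customer": "BOB", "amount": "4.00"},
    {"order_id": "2", "customer": "BOB", "amount": "4.00"},   # duplicate row
    {"order_id": "3", "customer": "alice", "amount": "n/a"},  # unparseable amount
]


def to_silver(rows: list[dict]) -> list[dict]:
    """Silver: typed, cleaned, and de-duplicated records."""
    seen: set[int] = set()
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # drop rows whose amount cannot be parsed
        order_id = int(row["order_id"])
        if order_id in seen:
            continue  # keep only the first copy of each order
        seen.add(order_id)
        cleaned.append({
            "order_id": order_id,
            "customer": row["customer"].strip().lower(),
            "amount": amount,
        })
    return cleaned


def to_gold(rows: list[dict]) -> dict[str, float]:
    """Gold: a reporting-ready aggregate, total amount per customer."""
    totals: dict[str, float] = defaultdict(float)
    for row in rows:
        totals[row["customer"]] += row["amount"]
    return dict(totals)


silver = to_silver(bronze)
gold = to_gold(silver)
```

Keeping the raw Bronze rows untouched means the Silver and Gold layers can be rebuilt whenever the cleaning or aggregation rules change.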

Contact Me

Feel free to reach out for collaboration or any questions!
