Zoey Wang

Data Engineer | Data Scientist
Eindhoven, NL.

About

Master's candidate in Data Science and Society with experience building ETL pipelines, performing data analytics, and working with big data technologies. Skilled in Python, SQL, AWS, and GCP, with a track record of optimizing data processes, improving reporting, and supporting data-informed decisions. Seeking a data engineering or data science role in the Netherlands; holds a valid Search Year Visa.

Work

NXP

Data Engineer Intern

Summary

Engineered and optimized Python-based ETL pipelines for sustainability portfolio data, cutting manual processing effort by 50% and improving data quality for business insights.

Highlights

Developed and optimized Python-based ETL pipelines utilizing Pandas, SQLAlchemy, and Apache Airflow to efficiently process over 4,000 sustainability-focused data entries.

Reduced manual data processing effort by 50% through automated pipelines, ensuring clean and accurate ESG datasets that supported sustainability portfolio analysis.

Designed and implemented a sustainability review framework using Power BI, creating interactive dashboards to analyze sustainability scores for over 5,000 products across 2023 and 2024.

Led a 2-person intern team to compile and validate data for an annual energy use audit of 4,000+ products, automating initial validation steps with Python to reduce manual checks by 30%.

GeekPark

Media Operations Intern

Summary

Analyzed media performance metrics and content strategy for a leading tech media platform, increasing video viewership and follower growth.

Highlights

Conducted data-driven analysis of media performance metrics, including viewership and engagement rates, leading to a tenfold increase in video viewership.

Doubled follower growth rate by implementing insights derived from content analytics to optimize content strategy and improve audience engagement.

Produced 10+ articles and 200+ short videos on AI topics for the WeChat Official Account of Founder Park, reaching 110K+ cumulative article reads and 2M+ total video views.

Supported on-site event operations for AGI Playground, managing material collection, content editing, and timely release of video and text content to enhance event visibility and engagement.

Education

Tilburg University

Master

Data Science and Society

Courses

Machine Learning

Big Data

Deep Learning

Computational Statistics

Nankai University

Bachelor

Philosophy

Courses

Philosophy of Science

Analytic Philosophy

Logic

Phenomenology

Languages

English
Chinese
German
Dutch

Certificates

Microsoft Azure Data Fundamentals (DP-900)

Issued By

Microsoft

Databricks Certified Data Engineer Associate

Issued By

Databricks

Apache Airflow Fundamentals Certification

Skills

Programming Languages

Python, JavaScript, Bash Scripting.

Databases

SQL, NoSQL, MySQL, PostgreSQL, MongoDB.

Big Data Technologies

Apache Spark, Hadoop ecosystem.

Cloud Platforms

AWS, GCP.

ETL Tools

Apache Airflow, Mage.

Data Analysis & Visualization

Pandas, NumPy, Power BI, Excel, Looker.

Machine Learning

XGBoost, RankNet, Scikit-learn, TensorFlow, Grid Search, Batch Normalization.

Projects

Master Thesis: Predicting Moral Dilemma Decisions with Machine Learning

Summary

Developed a machine learning model to predict moral dilemma decisions, covering data preprocessing, model selection, and hyperparameter tuning.

Uber Data Analytics Pipeline Using GCP

Summary

Built a data analytics pipeline on Google Cloud Platform (GCP) for Uber trip data, covering ETL processes, data warehousing, and interactive visualization.