Design, develop, and maintain scalable ETL pipelines and data workflows using Python, PySpark, and cloud and data platforms such as Azure, AWS, and Cloudera CDP.
Collaborate with stakeholders, data scientists, and cross‑functional teams to ensure data quality, governance, and performance.
Create data models and dashboards using BI and data-integration tools such as Power BI, Qlik Sense, and Informatica.
Implement data integration and transformation processes, optimizing storage solutions and ensuring reliable data ingestion and delivery across the platform.
Support data analytics initiatives and integrate machine‑learning workflows where applicable.
Qualifications
Strong experience with Python, SQL, and PySpark.
Proficiency in big‑data technologies such as Hadoop, Spark, Cloudera, and Apache NiFi.
Hands‑on experience with cloud platforms (Azure, AWS, GCP) and related services.