Key Responsibilities
1. Data Pipeline Development
- Design, build, and optimize scalable data pipelines using Scala and Hadoop frameworks.
- Implement ETL processes for ingesting, transforming, and storing data from various sources.
2. Data Analysis and Query Optimization
- Write and optimize complex SQL queries for efficient data retrieval and transformation.
- Troubleshoot and resolve performance issues in queries and data workflows.
3. Big Data Ecosystem Management
- Manage Hadoop-based data infrastructure, including HDFS, Hive, and related components.
- Monitor system performance and optimize resource utilization in a distributed environment.
4. Collaboration and Problem Solving
- Work closely with data analysts, data scientists, and business stakeholders to understand requirements and translate them into technical solutions.
- Provide technical...