Job Description
Location: 100% remote work
Responsibilities:
• Building a distributed, highly parallelized Big Data pipeline that processes massive amounts of structured and unstructured data in near real time
• Using Spark & Scala to transform corporate data so that data products can be built on it
• Ensuring continuous delivery on Hadoop and other Big Data platforms
• Automating processes where possible to ensure they are repeatable and reliable
• Creating and maintaining efficient data flows
• Collaborating with team members and stakeholders to deliver expected results
Requirements:
• Minimum 5 years of experience with Spark & Scala
• Minimum 7 years of experience with Python
• Minimum 5 years of experience with Linux
• Hands-on experience with distributed data processing engines (e.g. Spark...