Our client is an R&D and innovation lab in downtown Toronto that is responsible for transmitting billions of bytes of secure electronic data at dizzying speeds. Their goal is to make commerce more accessible and convenient, and in 2017 they launched their first app in Canada/North America, which helps users organize and pay bills in one simple location. Not only does the app send you reminders so that you never miss a payment, but it also gives you 3% cash back on popular retail brand gift cards! They support their parent company, a mobile payments and financial services company that currently serves 300 million customers!
Working as a small, diverse, and tight-knit team that is committed to the end consumer, they leverage their expertise in technology to build lasting, secure, and efficient solutions. Their creative and incredibly talented engineers work to provide customized and confidential experiences for their users. They encourage their employees to take charge of their innovative ideas and execute them with passion and vigour.
Role:
The ‘Data as a Service’ (DaaS) team is chartered with capturing, storing, and processing data reliably at scale. The DaaS team makes this data available to a large set of products that are used for internal and external services. The core infrastructure that powers this platform operates at a scale of speed, performance, and complexity that few others can claim. The issues the team faces with large-scale data storage, low-latency retrievals, high-volume requests, and high availability are common yet complex. To help solve these challenges, they are looking for the best of the best engineering talent to come and join their cool and rewarding environment! Right now the team is in a hyper-growth phase, and this is a stellar opportunity to make an impact. The ‘Hadoop Infrastructure Engineer’ will be a core contributor on the Data Platform team and help deliver the world-class Data Platform that our client is building to support the growth of its data products.
Must Have Skills:
• 3+ years of experience operating services in a large-scale distributed systems environment, preferably Hadoop.
• Hands-on experience with the Hadoop (or similar) ecosystem: YARN, Hive, HDFS, Spark, Presto, Parquet, HBase.
• Experience with workflow management (Airflow, Oozie, Azkaban).
• Familiarity with systems management tools (Puppet, Chef, Capistrano, etc.).
• Knowledge of Linux operating system internals, file systems, disk/storage technologies, storage protocols, and the networking stack.
• Proven knowledge of systems programming (bash and shell tools) and/or at least one scripting language (Python, Ruby, Perl).
• Adaptability and a focus on the simplest, most efficient, and most reliable solutions.
Responsibilities:
• Get stuff done: A problem partially solved today is better than a perfect solution next year. Have an idea during the night? Discuss it with your team in the morning, code it, push it at noon, test it in the afternoon, and deploy it the next morning.
• You will build and contribute to tools that diagnose and fix complex distributed systems handling petabytes of data, and drive opportunities to automate infrastructure, deployments, and observability of data services.
• You will test, monitor, administer, optimize, and operate multiple Hadoop/Spark clusters across cloud providers (AWS, GCP, Aliyun) and on-premises data centers, working primarily in Python, Java, and Scala.
• Investigate emerging technologies in the Hadoop ecosystem that relate to the team's needs, and implement those technologies.
• Partner with Hadoop developers to build best practices for the data warehouse and analytics environment.
• Share an on-call rotation and handle service incidents.
• Work with Big Data tools to build high-performance, high-throughput distributed data pipelines and a big data platform with Hadoop, Spark, Kafka, Hive, and Presto.