We are looking for a talented Data Engineer to join our Data Science team. The group is part of a larger DS team, informing all product decisions and creating models and infrastructure to improve efficiency, growth, and security. To do this, we use data from various sources and of varying quality. Our automated ETL processes serve both the broader company (in the form of clean, simplified tables of aggregated statistics and dashboards) and the Data Science team itself (cleaning and processing data for analysis and modeling purposes, ensuring reproducibility).
We are looking for someone with experience in designing, building, and maintaining scalable and robust data infrastructure that makes data easily accessible to the Data Science team and the broader audience via different tools. As a data engineer, you will be involved in all aspects of the data infrastructure, from understanding current bottlenecks and requirements to ensuring the quality and availability of data. You will collaborate closely with data scientists and with platform and front-end engineers, defining requirements and designing new data processes for both streaming and batch processing of data, as well as maintaining and improving existing ones. We are looking for someone passionate about high-quality data who understands their impact in solving real-life problems. Being proactive in identifying issues, digging deep into their source, and developing solutions is at the heart of this role.
Junior:
What You Will Do
- Maintain the current data lake infrastructure and evolve it to meet new requirements
- Maintain and extend our core data infrastructure and existing data pipelines and ETLs
- Provide best practices and frameworks for data testing and validation and ensure reliability and accuracy of data
- Design, develop and implement data visualization and analytics tools and data products
What You Will Need
- Bachelor’s degree in Computer Science, Applied Mathematics, Engineering or any other technology-related field
- Previous experience working in a data engineering project or role
- Fluency in Python
- Previous experience with ETL pipelines and data processing
- Good knowledge of SQL and NoSQL databases
- Good knowledge of coding principles, including Object-Oriented Programming
- Experience with Git
Nice to have
- Experience with Airflow, Google Composer or Kubernetes Engine
- Experience working with Google Cloud Platform
- Experience with other programming languages, such as Java, Kotlin or Scala
- Experience with Spark or other Big Data frameworks
- Experience with distributed and real-time technologies (Kafka, etc.)
- 1-2 years of commercial experience in a related role
Mid-level:
What You Will Do
- Maintain the current data infrastructure and evolve it to meet new requirements
- Maintain and extend our core data infrastructure and existing data pipelines and ETLs
- Provide best practices and frameworks for data testing and validation and ensure reliability and accuracy of data
- Design, develop and implement data visualization and analytics tools and data products
What You Will Need
- Bachelor’s degree in Computer Science, Applied Mathematics, Engineering or any other technology-related field
- Previous experience working in a data engineering role
- Fluency in Python
- Previous experience with ETL pipelines
- Experience working with Google Cloud Platform
- In-depth knowledge of SQL and NoSQL databases
- In-depth knowledge of coding principles, including Object-Oriented Programming
- Experience with Git
Nice to have
- Experience with code optimisation and parallel processing
- Experience with Airflow, Google Composer or Kubernetes Engine
- Experience with other programming languages, such as Java, Kotlin or Scala
- Experience with Spark or other Big Data frameworks
- Experience with distributed and real-time technologies (Kafka, etc.)
- 2-5 years of commercial experience in a related role
Senior:
What You Will Need
- Bachelor’s degree in Computer Science, Applied Mathematics, Engineering or any other technology-related field
- Previous experience working in a data engineering role
- Fluency in Python
- Experience in both batch processing and streaming data pipelines
- Experience working with Google Cloud Platform
- In-depth knowledge of SQL and NoSQL databases
- In-depth knowledge of coding principles, including Object-Oriented Programming
- Experience with Git
Nice to have
- Experience with code optimisation and parallel processing
- Experience with Airflow, Google Composer or Kubernetes Engine
- Experience with other programming languages, such as Java, Kotlin or Scala
- Experience with Spark or other Big Data frameworks
- Experience with distributed and real-time technologies (Kafka, etc.)
- 5-8 years of commercial experience in a related role
Staff:
What You Will Do
- Maintain the current data infrastructure and evolve it to meet new requirements
- Maintain and extend our core data infrastructure and existing data pipelines and ETLs
- Provide best practices and frameworks for data testing and validation and ensure reliability and accuracy of data
- Design, develop and implement data visualization and analytics tools and data products
- Play a critical role in setting direction and goals for the team
- Build and ship high-quality code; provide thorough code reviews, testing and monitoring; and make proactive changes to improve stability
- Implement the hardest parts of a system or feature
What You Will Need
- Bachelor’s degree in Computer Science, Applied Mathematics, Engineering or any other technology-related field
- Previous experience working in a data engineering role
- Fluency in Python
- Experience in both batch processing and streaming data pipelines
- Experience working with Google Cloud Platform
- In-depth knowledge of SQL and NoSQL databases
- In-depth knowledge of coding principles, including Object-Oriented Programming
- Experience with Git
- Ability to solve technical problems that few others can solve
- Ability to lead and coordinate the rollout and release of major initiatives
Nice to have
- Experience with code optimisation and parallel processing
- Experience with Airflow, Google Composer or Kubernetes Engine
- Experience with other programming languages, such as Java, Kotlin or Scala
- Experience with Spark or other Big Data frameworks
- Experience with distributed and real-time technologies (Kafka, etc.)
- 8+ years of commercial experience in a related role