Blockchain.com

Data Engineer

Spark, Data Visualization, Kafka

October 23, 2022

United States

Job description

We are looking for a talented Data Engineer to join our Data Science team. The group is part of a larger DS team, informing all product decisions and creating models and infrastructure to improve efficiency, growth, and security. To do this, we use data from various sources and of varying quality. Our automated ETL processes serve both the broader company (in the form of clean, simplified tables of aggregated statistics and dashboards) and the Data Science team itself (cleaning and processing data for analysis and modeling purposes, ensuring reproducibility).

We are looking for someone with experience in designing, building, and maintaining a scalable and robust data infrastructure that makes data easily accessible to the Data Science team and the broader company through a range of tools. As a data engineer, you will be involved in all aspects of the data infrastructure, from understanding current bottlenecks and requirements to ensuring the quality and availability of data. You will collaborate closely with data scientists and with platform and front-end engineers, defining requirements and designing new data processes for both streaming and batch processing, as well as maintaining and improving existing ones. We are looking for someone passionate about high-quality data who understands their impact in solving real-life problems. Being proactive in identifying issues, digging deep into their source, and developing solutions is at the heart of this role.

Junior:

What You Will Do

Maintain the current data lake infrastructure and evolve it to meet new requirements
Maintain and extend our core data infrastructure and existing data pipelines and ETLs
Provide best practices and frameworks for data testing and validation and ensure reliability and accuracy of data
Design, develop and implement data visualization and analytics tools and data products

What You Will Need

Bachelor’s degree in Computer Science, Applied Mathematics, Engineering or any other technology-related field
Previous experience working on a data engineering project or in a data engineering role
Fluency in Python
Previous experience with ETL pipelines and data processing
Good knowledge of SQL and NoSQL databases
Good knowledge of coding principles, including Object-Oriented Programming
Experience with Git

Nice to have

Experience with Airflow, Cloud Composer or Google Kubernetes Engine
Experience working with Google Cloud Platform
Experience with other programming languages, such as Java, Kotlin or Scala
Experience with Spark or other Big Data frameworks
Experience with distributed and real-time technologies (Kafka, etc.)
1-2 years of commercial experience in a related role

Middle:

What You Will Do

Maintain the current data infrastructure and evolve it to meet new requirements
Maintain and extend our core data infrastructure and existing data pipelines and ETLs
Provide best practices and frameworks for data testing and validation and ensure reliability and accuracy of data
Design, develop and implement data visualization and analytics tools and data products

What You Will Need

Bachelor’s degree in Computer Science, Applied Mathematics, Engineering or any other technology-related field
Previous experience working in a data engineering role
Fluency in Python
Previous experience with ETL pipelines
Experience working with Google Cloud Platform
In-depth knowledge of SQL and NoSQL databases
In-depth knowledge of coding principles, including Object-Oriented Programming
Experience with Git

Nice to have

Experience with code optimisation and parallel processing
Experience with Airflow, Cloud Composer or Google Kubernetes Engine
Experience with other programming languages, such as Java, Kotlin or Scala
Experience with Spark or other Big Data frameworks
Experience with distributed and real-time technologies (Kafka, etc.)
2-5 years of commercial experience in a related role

Senior:

What You Will Need

Bachelor’s degree in Computer Science, Applied Mathematics, Engineering or any other technology-related field
Previous experience working in a data engineering role
Fluency in Python
Experience in both batch processing and streaming data pipelines
Experience working with Google Cloud Platform
In-depth knowledge of SQL and NoSQL databases
In-depth knowledge of coding principles, including Object-Oriented Programming
Experience with Git

Nice to have

Experience with code optimisation and parallel processing
Experience with Airflow, Cloud Composer or Google Kubernetes Engine
Experience with other programming languages, such as Java, Kotlin or Scala
Experience with Spark or other Big Data frameworks
Experience with distributed and real-time technologies (Kafka, etc.)
5-8 years of commercial experience in a related role

Staff:

What You Will Do

Maintain the current data infrastructure and evolve it to meet new requirements
Maintain and extend our core data infrastructure and existing data pipelines and ETLs
Provide best practices and frameworks for data testing and validation and ensure reliability and accuracy of data
Design, develop and implement data visualization and analytics tools and data products
Play a critical role in setting the direction and goals for the team
Build and ship high-quality code, provide thorough code reviews, testing, monitoring and proactive changes to improve stability
Implement the hardest parts of a system or feature

What You Will Need

Bachelor’s degree in Computer Science, Applied Mathematics, Engineering or any other technology-related field
Previous experience working in a data engineering role
Fluency in Python
Experience in both batch processing and streaming data pipelines
Experience working with Google Cloud Platform
In-depth knowledge of SQL and NoSQL databases
In-depth knowledge of coding principles, including Object-Oriented Programming
Experience with Git
Ability to solve technical problems that few others can solve
Ability to lead or coordinate the rollout and release of major initiatives

Nice to have