We are seeking a highly motivated Junior Consultant Data Engineer to join our data engineering team. The ideal candidate has experience with data engineering technologies and is eager to learn and grow in the role. The position involves working with .NET, ETL processes, Python, PySpark, and Hive, and includes opportunities to work with ChatGPT and machine learning models.
Responsibilities
Develop and maintain ETL processes to extract, transform, and load data from various sources.
Use .NET to develop data integration and processing solutions.
Work with Python and PySpark for data processing and analysis.
Use Hive for data warehousing solutions.
Assist in integrating ChatGPT and machine learning models into data pipelines.
Collaborate with data scientists and other stakeholders to understand data requirements and deliver solutions.
Optimize and troubleshoot data processes for performance and reliability.
Maintain documentation of data engineering processes and workflows.
Stay current with industry trends and technologies in data engineering and machine learning.
Requirements
Bachelor’s degree in Computer Science, Information Technology, Data Science, or a related field.
Proven experience in data engineering or software development.
Proficiency in .NET for data processing and integration.
Experience with ETL processes and tools.
Strong programming skills in Python.
Familiarity with PySpark and Hive for big data processing.
Basic understanding of ChatGPT and machine learning models.
Strong analytical and problem-solving skills.
Excellent communication and teamwork skills.
Willingness to learn and adapt to new technologies and methodologies.
Preferred Qualifications
Experience with cloud platforms (e.g., AWS, Azure, Google Cloud).
Knowledge of data modeling and database design.
Familiarity with other big data technologies and frameworks.
Understanding of DevOps practices and CI/CD pipelines.
Exposure to machine learning frameworks and libraries (e.g., TensorFlow, PyTorch).