Project Role : Data Engineer
Project Role Description : Design, develop and maintain data solutions for data generation, collection, and processing. Create data pipelines, ensure data quality, and implement ETL (extract, transform and load) processes to migrate and deploy data across systems.
Must have skills : Apache Spark
Good to have skills : Python (Programming Language), PySpark
Minimum 5 year(s) of experience is required
Educational Qualification : 15 years of full-time education
Summary: As a Data Engineer, you will design, develop, and maintain data solutions for data generation, collection, and processing. Your typical day will involve creating data pipelines, ensuring data quality, and implementing ETL processes to migrate and deploy data across systems. You will play a crucial role in managing and optimizing data infrastructure to support the organization's data needs.
Roles & Responsibilities:
- Expected to be an SME who collaborates with and manages the team to deliver.
- Responsible for team decisions.
- Engage with multiple teams and contribute to key decisions.
- Provide solutions to problems for the immediate team and across multiple teams.
- Design and develop scalable data pipelines to extract, transform, and load data from various sources.
- Implement data quality checks and ensure data integrity throughout the data processing workflow.
- Optimize and tune data processing and ETL jobs for performance and efficiency.
- Collaborate with cross-functional teams to understand data requirements and design appropriate data solutions.
Professional & Technical Skills:
- Must Have Skills: Proficiency in Apache Spark, PySpark, Python (Programming Language).
- Strong understanding of statistical analysis and machine learning algorithms.
- Experience with data visualization tools such as Tableau or Power BI.
- Hands-on experience implementing various machine learning algorithms such as linear regression, logistic regression, decision trees, and clustering algorithms.
- Solid grasp of data munging techniques, including data cleaning, transformation, and normalization to ensure data quality and integrity.
Additional Information:
- The candidate should have a minimum of 5 years of experience in Apache Spark.
- This position is based at our Hyderabad office.
- 15 years of full-time education is required.