About The Position
WHAT WE DO
Imagine paying cash upfront every time you went to the doctor, before you were treated; paying two years of rent in cash upfront every time you rented an apartment; or having your monthly paycheck frequently delayed for months at a time. Then imagine doing all of that without a loan, a credit card, or even a bank account. This is everyday life for over 3 billion people in developing countries around the world.

We are Migo, and we are on a mission to change this by reinventing the way people access and use credit. Through a simple API integration with our platform, companies can enable their customers to make purchases and pay bills on credit, or get personal loans. Leveraging proprietary datasets, Migo builds ML algorithms on customer phone records, bank records, and payment transactions to assess credit risk, enabling us to offer credit lines to individuals and small businesses. These credit lines can be used to make purchases from a merchant or withdraw cash without the need for point-of-sale hardware or plastic cards. Because of our proprietary data and innovative technical solutions, Migo is able to provide credit to underbanked customers who are not typically covered by credit bureaus, a critical area of growth for developing countries.

RESPONSIBILITIES OF THE ROLE
Are you a data engineer ready to help us redefine credit for the 21st century? We are on a mission to leverage data from partners around the globe to perform credit scoring for 3 billion people. The opportunities for you to make an impact are limitless. As an experienced data engineer, you will design a high-capacity data pipeline architecture and data governance policies while ensuring the privacy and security of sensitive data. You will develop scalable data management systems and manage the storage of large datasets. You will create performant and reliable ETL jobs and manage external data feeds across on-premise and cloud infrastructure.

WHAT WE ARE LOOKING FOR
As a data pipeline engineer, you will develop scalable data management systems and manage the storage of large datasets. You will create performant and reliable ETL jobs and manage external data feeds across on-premise and cloud infrastructure. You are also a clear and concise communicator who can convey ideas to a wide range of stakeholders, both technical and non-technical. You appreciate different points of view and wait to hear others' perspectives before offering your own. You have a pragmatic approach to building systems, see multiple ways of solving problems, and can discuss the tradeoffs of each solution. You are technology agnostic, with broad experience across many different technologies. You have traveled extensively or have lived in a developing country. You are empathetic, self-aware, and respectful of all cultures. You are fun and enlightening to work with, and you have a good work/life balance with hobbies and interests you are happy to share with others.

In the first 90 days you will:
- Rewrite our big data ETL pipeline to create datasets for our modeling efforts
- Wrangle raw data from large, diverse datasets from our distribution partners
- Automate data integrity checks
- Connect our various databases to our BI platform

OUR TECH STACK
Our technology stack consists of modern tools; we are open to different technologies and pick the right tool for the job:
- Python for machine learning (e.g., scikit-learn and PyTorch)
- Python/Scala for data pipelines
- Scala/Java/Python for microservices and APIs
- Swagger (OpenAPI) for API documentation
- Docker and Kubernetes to package and run services
- AWS for underlying infrastructure
- On-premise servers for data processing and extraction at our partners

Requirements
- Degree in a relevant technical field or equivalent experience
- 5+ years of experience building production-quality software infrastructure
- Experience developing ETL jobs
- Fluency in SQL and experience with RDBMSs
- Fluency in Python data tools (e.g., pandas, Dask, or PySpark)
- Experience designing and building big data pipelines
- Experience working on large-scale distributed systems

Desirable
- Experience working with AWS, GCP, or other cloud-based services
- Experience at a rapidly growing startup or with cutting-edge teams at a larger tech company
Benefits
Vacation / Work Flexibility
- Remote Work: Globally remote