Responsibilities
We are seeking a highly skilled and experienced Data Engineer to join the Business Intelligence team and orchestrate data extraction from multiple data sources. Our enterprise data landscape is highly diverse, with sources ranging from traditional relational database management systems to third-party cloud solution providers and no-code/low-code platforms; as a result, many of our data extraction jobs are API-driven. You will build systems that reliably execute data extraction and transformation, supporting the data analytics teams and enabling rapid decision-making across the enterprise.
Requirements
B.Sc. degree or equivalent in Statistics, Mathematics, Engineering, or a related field.
3+ years of experience building and orchestrating data pipelines.
5+ years of experience writing clean, usable, and well-documented Python code.
Experience working at a startup is an advantage.
Proficiency in ETL development using Python and SQL.
Experience working with ELT platforms such as StitchData and Fivetran.
Hands-on experience with Python data processing frameworks such as PySpark and Pandas.
Good knowledge of software development standards and best practices, such as Test-Driven Development (TDD) and KISS.
Experience using CI/CD tooling to analyse, build, test, and deploy code.
Knowledge of data warehousing concepts and data warehouse modelling.
Experience working with at least one cloud data warehouse solution (Google BigQuery preferred).
Good understanding of design choices for data storage and data processing, with a particular focus on cloud data services.
Experience with data orchestration and integration tools such as Apache Airflow or Airbyte is an advantage.
Experience with distributed computing and container orchestration (Kubernetes) is an advantage.
Experience with microservices and event-driven architecture is an advantage.