Rate: depending on experience
Work location: MI (hybrid)
We are seeking a highly skilled and certified Data Engineer with expertise in Big Data technologies, Kafka, and Google Cloud Platform (GCP). The ideal candidate will have a strong background in designing, building, and maintaining scalable data pipelines, as well as experience with real-time data processing. You will be responsible for ensuring the reliability, efficiency, and scalability of our data infrastructure.
Key Responsibilities:
- Design, develop, and maintain scalable data pipelines and ETL processes using Big Data technologies.
- Implement and manage real-time data streaming solutions using Apache Kafka.
- Architect and manage data storage solutions on Google Cloud Platform (GCP), ensuring high availability and reliability.
- Collaborate with data scientists, analysts, and other engineering teams to integrate data solutions into various applications.
- Optimize data systems and pipelines for performance, scalability, and cost-efficiency.
- Ensure data quality and integrity across all data pipelines.
- Monitor and troubleshoot data pipelines to ensure continuous and reliable data flow.
- Document data engineering processes and maintain comprehensive records of all data infrastructure components.
- Stay up to date with the latest trends and best practices in data engineering, Big Data, Kafka, and GCP.
Qualifications:
Certification: Google Cloud Certified - Professional Data Engineer or equivalent certification.
Experience: 3+ years of experience in data engineering with a focus on Big Data technologies.
Technical Skills:
- Proficiency in Apache Kafka for real-time data streaming.
- Hands-on experience with Google Cloud Platform (GCP) services such as BigQuery, Dataflow, Pub/Sub, and Cloud Storage.
- Strong knowledge of ETL processes and data pipeline orchestration tools.
- Experience with distributed data processing frameworks such as Apache Hadoop and Apache Spark.
- Proficiency in programming languages such as Python, Java, or Scala.
- Experience with SQL and NoSQL databases.