Kafka and Data Lake Engineer

Remote · USA · Full-time

You will build, manage, and operate scalable data platforms centered on Kafka and data lakes.

Responsibilities

  • Design data pipelines to ingest, process, and move data from various sources into the data lake using Kafka.
  • Deploy, configure, and maintain Kafka clusters, including Kafka Connect and Schema Registry, ensuring high availability.
  • Oversee the architecture and governance of the data lake, managing storage (e.g., S3/ADLS), security, and metadata.
  • Develop producers and consumers to interact with Kafka topics using Python, Java, or Scala.
  • Implement data quality checks, manage lineage, and enforce security controls across data flows.

Required Skills

  • 5+ years of proven experience designing and managing data platforms with Apache Kafka and big data technologies.
  • Strong proficiency in Python, Java, or Scala.
  • Expertise in big data processing frameworks like Apache Spark and Apache Flink.
  • Hands-on experience with cloud environments (AWS, Azure, or GCP) and services like S3 or Azure Data Lake Storage.
  • Solid understanding of data lake design principles, including Delta Lake or Apache Iceberg.
  • Familiarity with infrastructure-as-code tools like Terraform or Ansible and containerization with Docker and Kubernetes.
  • Experience with SQL and NoSQL database systems.
