Description:

We are seeking a skilled Data Engineer to design, develop, and maintain scalable, reliable data pipelines using AWS services, PySpark, and AWS Lake Formation. You will build ETL processes that move data from a variety of sources into data lakes and data warehouses, collaborate with cross-functional teams to define data requirements and ingestion strategies, and tune pipelines for performance, scalability, and data quality.

Responsibilities:

  • Design, develop, and maintain scalable and reliable data pipelines using AWS services, PySpark, and AWS Lake Formation.
  • Implement ETL processes to extract, transform, and load data from various sources into data lakes or data warehouses.
  • Collaborate with cross-functional teams to understand data requirements, identify data sources, and define data ingestion strategies.
  • Optimize and tune data pipelines to ensure high performance, scalability, and data quality.
  • Monitor and troubleshoot data pipelines to identify and resolve issues in a timely manner.
  • Collaborate with data scientists and analysts to provide them with clean, transformed, and reliable data for analysis and modeling.
  • Develop and maintain data documentation, including data lineage, data dictionaries, and metadata management.
  • Stay up to date with industry trends and best practices in data engineering, AWS services, PySpark, Athena, and AWS Lake Formation.
  • Work with product, QE/QA, and other cross-functional teams to deliver projects throughout the full software development lifecycle.
  • Provide technical support for data-related issues and incidents.
  • Collaborate with other teams to resolve issues and ensure data integrity.

Skills:

  • Proficiency in AWS services, including AWS Lake Formation, S3, Glue, Athena, and Lambda.
  • Strong experience with PySpark and other big data processing frameworks.
  • Solid understanding of ETL processes and data warehousing concepts.
  • Excellent problem-solving and troubleshooting skills.
  • Strong collaboration and communication skills, with the ability to work effectively in cross-functional teams.
  • Experience with data documentation and metadata management.
  • Familiarity with software development lifecycle processes.
  • Ability to stay updated with industry trends and best practices in data engineering and AWS services.
  • Bachelor’s degree in Computer Science, Engineering, or a related field preferred.
