Sr. Data Engineer (AWS - Analytics & Data Science)

Location: Irvine, CA or Dallas, TX or Remote USA
Job Type: Engineering

IQ Workforce is a leading recruiting firm for the engineering, analytics, and data science communities.

Our client is a global restaurant company that develops and operates some of the most recognizable brands in the world. They have over 30,000 employees and operate over 50,000 restaurants in 150+ nations and territories.

One of their restaurant brands is seeking to add a new Sr. Data Engineer to their Data & Analytics team. This Engineer will be the subject matter expert in building their Customer360 data pipelines to source, analyze, and validate data from internal and external customer data sources. You will work with cross-functional partners and third-party vendors to enrich their customer data assets by acquiring, organizing, and aggregating customer data from various sources to construct a full and accurate 360 view of their customer for use in marketing, targeted media campaigns, and analytics/data science. This person must be able to work at a detailed level, when needed, to identify issues, risks, and root causes; develop mitigation strategies and solutions; and identify and track actions to closure.

Responsibilities include:
Collaborate with data product managers, data scientists, data analysts, and engineers to define requirements and data specifications

Develop, deploy, and maintain data processing pipelines using cloud technologies such as AWS, Airflow, Redshift, and EMR

Develop, deploy, and maintain serverless data pipelines using EventBridge, Kinesis, AWS Lambda, S3, and Glue

Architect cloud-based data infrastructure solutions to meet stakeholder needs

Build out a robust big data ingestion framework with automation, self-healing capabilities, and the ability to handle data drift

Work with real-time data streams & APIs from multiple internal/external sources

Write ETL pipelines for batch-based data extracts

Provide scalable solutions to manage large file imports

Adopt automated and manual test strategies to ensure product quality

Learn and understand how the product works and help build end-to-end solutions

Maintain detailed documentation of your work and changes to support data quality and data governance

Ensure high operational efficiency and quality of your solutions to meet SLAs and support commitment to our customers (Data Science, Data Analytics teams)

Be an active participant and advocate of product methodology

Act as a subject matter expert and make recommendations regarding standards for code quality and timeliness

Requirements:
Bachelor’s degree in analytics, statistics, engineering, math, economics, computer science, information technology, or a related discipline

5+ years professional experience in the big data space

5+ years of experience designing and delivering large-scale, 24/7, mission-critical data pipelines and features using modern big data architectures

Strong coding skills in Python/PySpark/Spark and SQL

Experience with the AWS ecosystem, especially Redshift, Athena, DynamoDB, Airflow, AWS Lambda, and S3

Expert knowledge of complex SQL and ETL development, with experience processing extremely large datasets

Expert in applying SCD (slowly changing dimension) types on an S3 data lake using Apache Hudi

Demonstrated ability to analyze large data sets to identify gaps and inconsistencies, provide data insights, and advance effective product solutions

Experience integrating data using API integration

Experience integrating data using streaming technologies such as Kinesis Data Firehose and Kafka

Experience integrating data from multiple sources and file formats such as JSON, Parquet, and Avro

Experience supporting and working with cross-functional teams in a dynamic environment

Strong quantitative and communication skills

Nice To Have:
Proficiency with automated testing using tools like pytest

Experience with customer data platforms such as Amperity and MarTech platforms

Experience contributing to full lifecycle deployments with a focus on testing and quality

Experience with data quality processes, data quality checks, validations, data quality metrics definition and measurement

Experience with CI/CD and infrastructure-as-code tools like GitLab and Terraform

Experience with data quality tools such as Informatica DQ or Talend DQ