Data Engineers - CVS Health

Location: Greater Boston; Irving, TX; or Northbrook, IL
Job Type: Engineering

IQ Workforce is a leading recruiting firm for the global analytics & data science community.

Our client partner, CVS Health, is at the forefront of digital transformation in healthcare and retail. Personalization is a major initiative to transform the customer experience within their retail pharmacies. CVS Health’s Data Engineering team is helping lead this personalization effort and has multiple openings at the Data Engineer, Sr. Data Engineer, and Lead Data Engineer levels.

The Data Engineering team will lead the effort to transform healthcare and retail for better outcomes for patients and customers by leveraging advanced machine learning capabilities across their 10,000 retail pharmacies. You will be at the forefront of developing an unprecedented level of personalized pharmacy and retail services for millions of customers every day.

Some of the work we do may include:
Making analytics faster, more insightful, and more efficient by architecting, building, and maintaining a next-generation Big Data Machine Learning framework

Designing a highly scalable and extensible Big Data platform that industrializes the collection, storage, modeling, and analysis of massive data sets from heterogeneous channels

Bringing a DevOps mindset to enable big data and batch/real-time analytical solutions that leverage emerging technologies

Rapidly developing prototypes and proofs of concept for the selected solutions, and implementing complex big data projects

Applying an analytic mindset to collecting, parsing, managing, and automating data feedback loops in support of business innovation

Developing and releasing ML pipelines into a production environment using Spark and Databricks (primary languages: Scala/Python and SQL), enabling the integration of those pipelines, and refining processes and tools within the existing CI/CD framework for the Personalization Engine environment

Some of the things we look for in potential candidates:
Hands-on experience with “big data” platforms, including Hadoop (preferably Azure or AWS) and Spark, as well as experience with traditional RDBMS (e.g., Teradata, Oracle)

Proficiency in “big data” technologies, including Spark, Airflow, Kafka, HBase, Pig, NoSQL databases, etc.

Proficiency in the following programming languages and tools: PySpark, Python, shell scripting, SQL (preferably Teradata and PL/SQL syntax), and Hive, Pig, Java, or Scala

Ability to design and build a framework to orchestrate data pipelines

Proficiency with tools to automate CI/CD pipelines (e.g., Jenkins, Git, Control-M)