Senior Data Engineer

$87,069 - $148,017/Yr

Unclassified

posted about 2 months ago

Full-time - Senior
5,001-10,000 employees

About the position

ICF is a mission-driven company filled with people who care deeply about improving the lives of others and making the world a better place. Our core values include Embracing Difference; we seek candidates who are passionate about building a culture that encourages, embraces, and hires dimensions of difference. Our Health Engineering Systems (HES) team works side by side with customers to articulate a vision for success, and then make it happen. We know success doesn't happen by accident. It takes the right team of people, working together on the right solutions for the customer. We are looking for a seasoned Senior Data Engineer who will be a key driver in making this happen.

In this role, you will design, develop, and maintain scalable data pipelines using Spark, Hive, and Airflow. You will also develop and deploy data processing workflows on the Databricks platform and create API services to facilitate data access and integration. Your responsibilities will include creating interactive data visualizations and reports using AWS QuickSight, building the infrastructure required for optimal extraction, transformation, and loading of data from various data sources using AWS and SQL technologies, and monitoring and optimizing the performance of data infrastructure and processes.

You will develop data quality and validation jobs, assemble large, complex data sets that meet functional and non-functional business requirements, and write unit and integration tests for all data processing code. Collaboration is key in this position: you will work with DevOps engineers on CI/CD and infrastructure as code (IaC), read specifications and translate them into code and design documents, perform code reviews, and develop processes for improving code quality. You will also improve data availability and timeliness by implementing more frequent refreshes, tiered data storage, and optimizations of existing datasets, while maintaining security and privacy for data at rest and in transit. Other duties may be assigned as needed.

Responsibilities

  • Design, develop, and maintain scalable data pipelines using Spark, Hive, and Airflow
  • Develop and deploy data processing workflows on the Databricks platform
  • Develop API services to facilitate data access and integration
  • Create interactive data visualizations and reports using AWS QuickSight
  • Build required infrastructure for optimal extraction, transformation, and loading of data from various data sources using AWS and SQL technologies
  • Monitor and optimize the performance of data infrastructure and processes
  • Develop data quality and validation jobs
  • Assemble large, complex sets of data that meet non-functional and functional business requirements
  • Write unit and integration tests for all data processing code
  • Work with DevOps engineers on CI/CD and infrastructure as code (IaC)
  • Read specifications and translate them into code and design documents
  • Perform code reviews and develop processes for improving code quality
  • Improve data availability and timeliness by implementing more frequent refreshes, tiered data storage, and optimizations of existing datasets
  • Maintain security and privacy for data at rest and while in transit
  • Other duties as assigned

Requirements

  • Bachelor's degree in computer science, engineering, or a related field
  • 7+ years of hands-on software development experience
  • 4+ years of data pipeline experience using Python, Java, and cloud technologies
  • Candidate must be able to obtain and maintain a Public Trust clearance
  • Candidate must reside in the US, be authorized to work in the US, and work must be performed in the US
  • Must have lived in the US 3 full years out of the last 5 years

Nice-to-haves

  • Experienced in Spark and Hive for big data processing
  • Experience building job workflows with the Databricks platform
  • Strong understanding of AWS products including S3, Redshift, RDS, EMR, AWS Glue, AWS Glue DataBrew, Jupyter Notebooks, Athena, QuickSight, and Amazon SNS
  • Familiarity with building processes that support data transformation, workload management, data structures, dependency management, and metadata
  • Experience with data governance processes to ingest (batch and stream), curate, and share data with upstream and downstream data users
  • An experienced data pipeline builder and data wrangler who enjoys optimizing data systems and building them from the ground up
  • Demonstrated understanding of software and tools including NoSQL and relational SQL databases such as Cassandra and Postgres; workflow management and pipeline tools such as Airflow, Luigi, and Azkaban; stream-processing systems such as Spark Streaming and Storm; and object-oriented/functional scripting languages including Scala, C++, Java, and Python
  • Familiarity with DevOps methodologies, including CI/CD pipelines (GitHub Actions) and IaC (Terraform)
  • Ability to obtain and maintain a Public Trust clearance while residing in the United States
  • Experience with Agile methodology and test-driven development

Benefits

  • Reasonable Accommodations are available, including, but not limited to, for disabled veterans, individuals with disabilities, and individuals with sincerely held religious beliefs, in all phases of the application and employment process.
  • Benefit offerings included under the Transparency in Coverage Act