IT Concepts - Washington, DC

posted 6 days ago

Full-time - Senior
Remote - Washington, DC
Professional, Scientific, and Technical Services

About the position

The position involves collaborating on the architecture, design, development, and maintenance of large-scale data and analytics platforms, focusing on system integrations, data pipelines, and API integrations. The role requires migrating data from on-premises sources to AWS, ensuring data integrity, and delivering high-quality data assets to support business processes and data-driven analyses. The candidate will lead engineering teams, improve data solutions, and implement best practices to reduce costs and enhance data quality.

Responsibilities

  • Collaborate & contribute to the architecture, design, development, and maintenance of large-scale data & analytics platforms.
  • Create transformation path for data to migrate from on-prem pipelines and sources to AWS.
  • Provide input and insights to the client in conjunction with Data Architects.
  • Coordinate with data engineers to provide feedback and keep the team organized.
  • Ensure that data are optimally standardized and analysis-ready.
  • Prototype emerging business use cases to validate technology approaches and propose potential solutions.
  • Collaborate to ensure data integrity after large-scale migrations.
  • Deliver high-quality data assets that the business can use to transform business processes and enable data-driven analyses.
  • Continuously improve data solutions to increase the quality, delivery speed, and trustworthiness of the data engineering team's deliverables.
  • Reduce total cost of ownership of solutions by developing shared components and implementing best practices and coding standards.
  • Collaborate with the team to re-platform and reengineer data pipelines from on-prem to the AWS cloud, as sketched after this list.
  • Work together with team members to ensure data quality and integrity during migrations.
  • Lead by example and pitch in to enable successful and seamless client delivery.
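
For illustration, a minimal PySpark sketch of the kind of on-prem-to-AWS re-platforming and post-migration integrity check described in this list; the source path, S3 bucket, and column names are hypothetical placeholders, not details from this posting.

    # Minimal sketch: move an on-prem extract into an AWS data lake and verify integrity.
    # Paths, bucket, and column names below are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("onprem-to-s3-migration").getOrCreate()

    # Read a table exported from the on-prem source system.
    source_df = spark.read.parquet("/data/onprem/exports/orders")

    # Standardize the data so it is analysis-ready: normalize a column name and cast its type.
    curated_df = (
        source_df
        .withColumnRenamed("ORDER_DT", "order_date")
        .withColumn("order_date", F.to_date("order_date"))
    )

    # Write the curated asset to the cloud data lake.
    curated_df.write.mode("overwrite").parquet("s3://example-data-lake/curated/orders/")

    # Basic post-migration integrity check: row counts should match after the move.
    migrated_df = spark.read.parquet("s3://example-data-lake/curated/orders/")
    assert source_df.count() == migrated_df.count(), "Row count mismatch after migration"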

Requirements

  • 8+ years of experience in data engineering.
  • AWS cloud certification.
  • Minimum of 3 years of experience in leading engineering teams, including task management and personnel management.
  • Experience working in or managing data-centric teams in government or other highly regulated environments.
  • Strong understanding of data lake, data lakehouse, and data warehousing architectures in a cloud-based environment.
  • Proficiency in Python for data manipulation, scripting, and automation.
  • In-depth knowledge of AWS services relevant to data engineering (e.g., S3, EC2, DMS, DataSync, SageMaker, Glue, RDS, Lambda, Elasticsearch).
  • Understanding of data integration patterns and technologies.
  • Proficiency in designing and building flexible, scalable ETL processes and data pipelines using Python and/or PySpark and SQL.
  • Proficiency in data pipeline automation and workflow management tools such as Apache Airflow or AWS Step Functions, as sketched after this list.
  • Knowledge of data quality management and data governance principles.
  • Strong problem-solving and troubleshooting skills related to data management challenges.
  • Experience managing code in GitHub or other similar tools.
  • Minimum of 2 years of hands-on experience with Databricks, including data ingestion, transformation, analysis, and optimization.
  • Experience designing, deploying, securing, sustaining and maintaining applications and services in a cloud environment (e.g., AWS, Azure) using infrastructure as code (e.g., Terraform, CloudFormation, Boto3).
  • Experience with database administration, optimization, and data extraction.
  • Experience using containerization technology such as Kubernetes or Mesos.
  • Minimum of 1 year of hands-on experience migrating from on-premises data platforms to a modern cloud environment (e.g., AWS, Azure, GCP).
  • Linux/RHEL server and bash/shell scripting experience in on-prem or cloud environments.
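
For illustration, a minimal sketch of the kind of pipeline orchestration called for above, assuming Apache Airflow 2.4 or later; the DAG id, schedule, and task bodies are hypothetical placeholders, not details from this posting.

    # Minimal Airflow DAG sketch: a three-step extract/transform/load workflow.
    # The DAG id, schedule, and task bodies are hypothetical placeholders.
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python import PythonOperator

    def extract():
        print("extract source data")  # pull data from the source system

    def transform():
        print("standardize and validate data")  # make the data analysis-ready

    def load():
        print("publish curated data to S3")  # deliver the curated asset

    with DAG(
        dag_id="example_daily_etl",
        start_date=datetime(2024, 1, 1),
        schedule="@daily",
        catchup=False,
    ) as dag:
        extract_task = PythonOperator(task_id="extract", python_callable=extract)
        transform_task = PythonOperator(task_id="transform", python_callable=transform)
        load_task = PythonOperator(task_id="load", python_callable=load)

        # Run the steps in order.
        extract_task >> transform_task >> load_task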

Nice-to-haves

  • Bachelor's Degree in related field.
  • Previous experience with large-scale data migrations and cloud-based data platform implementations.
  • Prior experience with Databricks Unity Metastore/Catalog.
  • Familiarity with advanced SQL techniques for performance optimization and data analysis.
  • Knowledge of data streaming and real-time data processing frameworks such as Spark Structured Streaming.
  • Experience with data lakes and big data technologies (e.g., Apache Spark, Citus).
  • Familiarity with serverless computing and event-driven architectures in AWS.
  • Certifications in AWS, Databricks, or related technologies.
  • Experience working in Agile or DevSecOps environments and using related tools for collaboration and version control.
  • Extensive knowledge of software and data engineering best practices.
  • Strong communication and collaboration skills with internal and external stakeholders.
  • Experience establishing, implementing and documenting best practices, standard operating procedures, etc.