Lead Data Engineer

$150,000 - $160,000/Yr

Carlyle Investment Management - Washington, DC

posted about 2 months ago

Full-time - Mid Level
Washington, DC
1,001-5,000 employees
Securities, Commodity Contracts, and Other Financial Investments and Related Activities

About the position

As a Lead Data Engineer at Carlyle, you will be part of an innovative team focused on leveraging data to drive cutting-edge solutions. Your primary responsibility will be to design, build, and maintain the data infrastructure and pipelines that support data-driven products and insights. You will work with various tools and cloud platforms to create robust data architectures, ensuring data quality, security, and governance while collaborating with data consumers to deliver trusted data products.

Responsibilities

  • Design, implement, and support cloud data platforms such as Snowflake and Databricks.
  • Architect and administer data lakes and cloud data warehouses for secure and flexible data storage.
  • Build and maintain scalable data pipelines using AWS, Azure, SnapLogic, Apache Airflow, and Prefect.
  • Develop and optimize data processing workflows with Python, Scala, and Spark.
  • Utilize Git, GitHub, and Azure DevOps for version control and collaboration.
  • Champion the implementation of CI/CD pipelines for development and deployment processes.
  • Ensure data integrity and compliance with best practices in SQL and NoSQL systems.
  • Continuously explore new technologies to enhance data reliability and quality.
  • Collaborate with data consumers to understand their requirements and deliver trusted data products.
  • Create and maintain data documentation, metadata, and data dictionaries.
  • Perform data testing and validation to ensure accuracy and consistency.
  • Provide data engineering support and guidance to junior data engineers.
  • Stay updated with trends in data engineering and related fields.
  • Apply best practices for data governance, security, and quality across cloud platforms.
  • Evaluate and select appropriate data tools and technologies.
  • Design and implement data APIs and services for data consumption and integration.
  • Monitor and improve data pipeline performance and troubleshoot issues.
  • Conduct data analysis and provide insights for data-driven decision making.
  • Implement and integrate machine learning models into production systems.
  • Mentor and coach other data team members.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or related field.
  • 5+ years of relevant experience in data engineering, data analysis, and data pipeline development.
  • Proficient in AWS data services such as S3, Glue, Redshift, EMR, Athena, and Kinesis.
  • Expert skills in SQL, Python, Scala, and Spark for ETL processes.
  • Proficient in Snowflake and Databricks for data ingestion and processing.
  • Experience with pipeline orchestration tools like Apache Airflow, Prefect, or Luigi.
  • Knowledge of data warehouse, data lake, and data mart concepts.
  • Experience with data governance, data security, and data validation best practices.

Nice-to-haves

  • Relevant certifications in AWS, Azure, or other modern data technologies.
  • Experience with machine learning tools such as Amazon SageMaker, MLflow, and Jupyter Notebooks.
  • Familiarity with other industry-standard data tools such as Kafka, Hive, Redis, and MongoDB.
  • Experience with the Alation data catalog.

Benefits

  • Comprehensive benefits package including retirement benefits, health insurance, life insurance, and disability insurance.
  • Paid time off and paid holidays.
  • Family planning benefits and various wellness programs.
  • Eligibility for an annual discretionary incentive program based on performance.