Raas Infotek - Raleigh, NC

posted 4 days ago

Full-time - Senior
Raleigh, NC
Professional, Scientific, and Technical Services

About the position

The Senior Data Engineer will play a crucial role in designing, implementing, and optimizing scalable data pipelines and big data solutions. The role requires a strong background in big data technologies and focuses on ensuring seamless data integration and high performance to support data-driven decision-making across the organization.

Responsibilities

  • Design and develop scalable and efficient big data pipelines to process and analyze large datasets.
  • Implement ETL processes using modern data engineering tools and frameworks.
  • Build and optimize data lakes, distributed data platforms, and data warehouses for robust storage and querying capabilities.
  • Leverage big data technologies such as Hadoop, Spark, Hive, Kafka, and Flink for large-scale data processing.
  • Develop and maintain streaming data pipelines using tools like Kafka, Apache Beam, or Google Cloud Pub/Sub.
  • Optimize query performance and data retrieval processes for both batch and real-time use cases.
  • Work with cloud platforms such as AWS, Azure, or Google Cloud Platform to deploy and manage data infrastructure.
  • Collaborate with cross-functional teams to gather requirements, design solutions, and implement best practices in data engineering.
  • Ensure data quality, security, and governance across the data lifecycle.
  • Mentor junior engineers and contribute to architectural decisions for the data platform.

Requirements

  • 12+ years of experience in data engineering or related roles.
  • Strong proficiency in big data technologies such as Hadoop, Spark, Hive, Kafka, or Flink.
  • Advanced expertise in SQL, Python, Scala, or Java for data processing and analytics.
  • Hands-on experience with cloud-based data platforms (e.g., Google BigQuery, Amazon Redshift, Azure Synapse).
  • Proficiency in building and managing streaming data pipelines using tools like Kafka, Pub/Sub, or Kinesis.
  • Experience with CI/CD pipelines and version control systems like Git.
  • Deep understanding of data modeling, data architecture, and schema design principles.
  • Strong knowledge of ETL/ELT processes, distributed systems, and data orchestration tools (e.g., Apache Airflow, Apache NiFi).
  • Familiarity with containerization and orchestration tools like Docker and Kubernetes.
  • Experience with machine learning pipelines or integrating ML models into data workflows.
  • Knowledge of data visualization tools such as Tableau, Power BI, or Looker.
  • Understanding of data governance frameworks and compliance requirements.