Accord Technologies - Piscataway, NJ

posted 5 days ago

Full-time - Mid Level
Repair and Maintenance

About the position

The DevOps Engineer position focuses on managing and optimizing AWS EMR clusters, implementing CI/CD pipelines, and developing robust ETL processes. The role requires strong infrastructure experience and proficiency with core AWS services, container orchestration with Kubernetes, and SQL development. The engineer will collaborate with data scientists and analysts to ensure data quality and governance while maintaining comprehensive documentation of data architecture and workflows.

Responsibilities

  • Design and implement robust ETL processes to extract, transform, and load data from various sources into data lakes and warehouses.
  • Configure, manage, and optimize Amazon EMR clusters for big data processing using Apache Spark, Hive, or Presto.
  • Utilize Kubernetes for deploying, scaling, and managing containerized applications and services.
  • Develop and maintain CI/CD pipelines for automated deployment of data applications and services using tools like Jenkins, GitLab CI, or AWS CodePipeline.
  • Write complex SQL queries for data manipulation and retrieval, ensuring high performance and scalability.
  • Implement data quality checks, monitoring, and logging mechanisms to ensure data reliability and compliance.
  • Work closely with data scientists, analysts, and other stakeholders to understand data requirements and deliver solutions that meet business needs.
  • Maintain comprehensive documentation of data architecture, processes, and workflows.
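To illustrate the extract-transform-load duties above, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for a warehouse; the table schema, field names, and transform rule are hypothetical, not part of this role's actual stack.

```python
import sqlite3

def run_etl(source_rows):
    """Minimal ETL sketch: extract raw records, transform, load into a table."""
    conn = sqlite3.connect(":memory:")  # stands in for a warehouse connection
    conn.execute("CREATE TABLE sales (region TEXT, amount_usd REAL)")

    # Transform: normalize region names, convert cents to dollars,
    # and drop records with invalid (negative) amounts.
    cleaned = [
        (r["region"].strip().upper(), r["amount_cents"] / 100.0)
        for r in source_rows
        if r["amount_cents"] >= 0
    ]

    # Load the cleaned rows and return an aggregate as a simple quality check.
    conn.executemany("INSERT INTO sales VALUES (?, ?)", cleaned)
    conn.commit()
    return conn.execute("SELECT SUM(amount_usd) FROM sales").fetchone()[0]

rows = [
    {"region": " east ", "amount_cents": 1250},
    {"region": "WEST", "amount_cents": 2750},
    {"region": "east", "amount_cents": -5},  # invalid record, filtered out
]
print(run_etl(rows))  # 40.0
```

In production this pattern would typically run on Spark atop EMR rather than SQLite, but the extract/transform/load separation and the post-load quality check carry over directly.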

Requirements

  • Bachelor's degree in Computer Science, Data Science, Information Technology, or a related field.
  • 3+ years of experience in data engineering or a related role, with a focus on AWS technologies.
  • Proficient in AWS services such as EMR, S3, RDS, Redshift, Lambda, and CloudFormation.
  • Experience with Kubernetes for container orchestration and microservices architecture.
  • Familiarity with CI/CD tools and practices, including version control using Git.
  • Strong knowledge of SQL, with experience in relational databases (e.g., MySQL, PostgreSQL) and data warehousing solutions.
  • Proficiency in programming languages such as Python, Java, or Scala for data processing tasks.
  • Ability to troubleshoot complex data issues and optimize performance.
  • Excellent communication skills with the ability to work collaboratively in a team environment.