This job is closed

We regret to inform you that the job you were interested in has been closed. Although this specific position is no longer available, we encourage you to continue exploring other opportunities on our job board.

AWS Data Engineer
Alchemy Software Solutions - Columbus, OH

posted 2 months ago

Full-time

About the position

The AWS Data Engineer role focuses on developing and maintaining data platforms using Python and PySpark within the AWS ecosystem. The position involves designing and implementing data pipelines, migrating existing workloads to PySpark on AWS, and optimizing Spark queries for performance. The engineer will also integrate with various SQL databases and write unit tests for Spark transformations.

Responsibilities

  • Develop and maintain data platforms using Python, Spark, and PySpark.
  • Handle migration to PySpark on AWS.
  • Design and implement data pipelines.
  • Produce unit tests for Spark transformations and helper methods (a test sketch follows the Requirements list).
  • Create Scala/Spark jobs for data transformation and aggregation (see the sketch after this list).
  • Write Scaladoc-style documentation for code.
  • Optimize Spark queries for performance.
  • Integrate with SQL databases (e.g., Microsoft SQL Server, Oracle, Postgres, MySQL).
  • Understand distributed systems concepts (CAP theorem, partitioning, replication, consistency, and consensus).
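
For illustration, a minimal sketch of the kind of Scala/Spark transformation-and-aggregation job these bullets describe, documented with a Scaladoc-style comment. The orders schema, object name, and S3 paths are hypothetical examples, not details from the posting.

  import org.apache.spark.sql.{DataFrame, SparkSession}
  import org.apache.spark.sql.functions.{avg, col}

  object OrderAggregationJob {

    /** Keeps completed orders and averages `amount` per `customer_id`. */
    def averageOrderValue(orders: DataFrame): DataFrame =
      orders
        .filter(col("status") === "COMPLETED")
        .groupBy("customer_id")
        .agg(avg("amount").as("avg_order_value"))

    def main(args: Array[String]): Unit = {
      val spark = SparkSession.builder().appName("OrderAggregationJob").getOrCreate()
      // Hypothetical input and output locations, for illustration only.
      val orders = spark.read.parquet("s3://example-bucket/orders/")
      averageOrderValue(orders).write.mode("overwrite").parquet("s3://example-bucket/order_stats/")
      spark.stop()
    }
  }

Keeping the transformation as a pure DataFrame-to-DataFrame function makes it testable without touching S3, which ties into the unit-testing responsibility above.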

Requirements

  • Proficiency in Python, Scala (with a focus on functional programming), and Spark.
  • Familiarity with Spark APIs, including RDD, DataFrame, MLlib, GraphX, and Streaming.
  • Experience working with HDFS, S3, Cassandra, and/or DynamoDB.
  • Deep understanding of distributed systems.
  • Experience building or maintaining cloud-native applications.
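
Building on the sketch above, one way such a transformation could be unit-tested against a local SparkSession. ScalaTest's AnyFunSuite is an assumed choice of framework here; the posting does not name one.

  import org.apache.spark.sql.SparkSession
  import org.scalatest.funsuite.AnyFunSuite

  // Assumes OrderAggregationJob from the earlier sketch is on the classpath.
  class OrderAggregationJobSuite extends AnyFunSuite {

    private val spark = SparkSession.builder()
      .master("local[2]")
      .appName("OrderAggregationJobSuite")
      .getOrCreate()

    import spark.implicits._

    test("averageOrderValue keeps only completed orders") {
      val orders = Seq(
        ("c1", 10.0, "COMPLETED"),
        ("c1", 30.0, "COMPLETED"),
        ("c2", 99.0, "CANCELLED")
      ).toDF("customer_id", "amount", "status")

      val result = OrderAggregationJob.averageOrderValue(orders)
        .collect()
        .map(row => row.getString(0) -> row.getDouble(1))
        .toMap

      assert(result == Map("c1" -> 20.0))
    }
  }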

Nice-to-haves

  • Familiarity with serverless approaches using AWS Lambda is a plus (a minimal handler sketch follows).
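
As a hedged illustration of that serverless angle, a minimal Scala handler for the AWS Lambda Java runtime (the RequestHandler interface from aws-lambda-java-core); the event shape and class name are hypothetical.

  import com.amazonaws.services.lambda.runtime.{Context, RequestHandler}
  import java.util.{Map => JMap}

  // Hypothetical handler: logs how many keys the incoming event carried.
  class PingHandler extends RequestHandler[JMap[String, String], String] {
    override def handleRequest(event: JMap[String, String], context: Context): String = {
      context.getLogger.log(s"received event with ${event.size()} keys")
      "ok"
    }
  }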