Envisn - Houston, TX

posted 10 days ago

Full-time
Houston, TX
Professional, Scientific, and Technical Services

About the position

The Python Data Engineer role focuses on building and maintaining systems for large-scale data processing tasks. The ideal candidate will leverage their expertise in Python, particularly in API development and parallel data processing, to ensure high performance and scalability in distributed systems.

Responsibilities

  • Develop and maintain RESTful APIs using Python web frameworks such as FastAPI or Django.
  • Collaborate with front-end developers to integrate user-facing elements with server-side logic.
  • Utilize Pandas, NumPy, and other libraries to process large datasets efficiently.
  • Implement multithreading, multiprocessing, and asynchronous programming techniques.
  • Optimize data processing pipelines to handle millions of rows with minimal latency.
  • Design and implement distributed systems with a focus on scalability and reliability.
  • Understand and apply core concepts such as load balancing and task queues.
  • Use Docker to containerize applications and manage dependencies.
  • Document system designs, processes, and code effectively.
  • Collaborate with cross-functional teams to align on project goals and deliverables.

Requirements

  • Proficiency with FastAPI, Django, or similar frameworks.
  • Understanding of RESTful API principles and best practices.
  • Ability to create and manage Dockerfiles.
  • Basic knowledge of load balancing, task queues, and distributed system concepts.
  • Proficiency in multithreading and multiprocessing without relying solely on external libraries or frameworks.
  • Familiarity with asynchronous programming, particularly asyncio in Python.
  • Excellent technical communication abilities.

Nice-to-haves

  • BS or MS in Computer Science.
  • Experience with Polars, PySpark, or similar tools.
  • Hands-on experience with distributed architectures in Docker.
  • Knowledge of container orchestration using Kubernetes.
  • Demonstrated ability to solve complex problems using parallel or distributed computing.