Python + PySpark Developer

$110,000 - $120,000/Yr

Synechron - Dallas, TX

posted 4 months ago

Full-time - Mid Level
Dallas, TX
10,001+ employees
Professional, Scientific, and Technical Services

About the position

At Synechron, we are seeking an experienced Python Developer with a strong background in PySpark to join our data engineering team. The ideal candidate will have a solid understanding of big data processing, hands-on experience with Apache Spark, and a proven track record in Python programming. You will be responsible for developing scalable data processing and analytics solutions in a cloud environment.

This role involves designing, building, and maintaining efficient data processing pipelines that handle large datasets effectively. You will collaborate closely with data scientists and analysts to transform data into actionable insights, ensuring that systems meet business requirements and adhere to industry practices for security and privacy. You will be expected to write reusable, testable, and efficient Python code while optimizing data retrieval processes, and you will implement data ingestion, cleansing, deduplication, and consolidation processes. Leveraging cloud-based big data services and architectures such as AWS, Azure, or GCP will be a key part of your responsibilities.

Staying current with the latest innovations in big data technologies and PySpark enhancements is essential to ensure our solutions remain cutting-edge and effective. This role offers the opportunity to work on exciting projects at leading tier-one banks, financial institutions, and insurance firms, contributing to the modernization of their data processing capabilities.

Responsibilities

  • Design, build, and maintain scalable and efficient data processing pipelines using PySpark.
  • Develop high-performance algorithms, predictive models, and proof-of-concept prototypes.
  • Work closely with data scientists and analysts to transform data into actionable insights.
  • Write reusable, testable, and efficient Python code.
  • Optimize data retrieval and develop dashboards and reports for business stakeholders.
  • Implement data ingestion, data cleansing, deduplication, and data consolidation processes (see the sketch after this list).
  • Leverage cloud-based big data services and architectures (AWS, Azure, or GCP) for processing large datasets.
  • Collaborate with cross-functional teams to define and refine data and analytics requirements.
  • Ensure systems meet business requirements and industry practices for security and privacy.
  • Stay updated with the latest innovations in big data technologies and PySpark enhancements.
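
For a sense of what this pipeline work looks like in practice, here is a minimal PySpark sketch of an ingestion, cleansing, and deduplication flow. It is illustrative only: the paths, column names, and dedup key are hypothetical placeholders, not details taken from this role.

    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F
    from pyspark.sql.window import Window

    spark = SparkSession.builder.appName("ingest-cleanse-dedup").getOrCreate()

    # Ingest: read raw records from a placeholder cloud storage path.
    raw = spark.read.json("s3://example-bucket/raw/transactions/")

    # Cleanse: drop rows missing required fields, normalize a string column.
    cleansed = (
        raw.dropna(subset=["transaction_id", "account_id", "amount"])
           .withColumn("currency", F.upper(F.trim(F.col("currency"))))
    )

    # Deduplicate: keep only the most recent record per transaction_id
    # (assumes an updated_at timestamp column exists).
    w = Window.partitionBy("transaction_id").orderBy(F.col("updated_at").desc())
    deduped = (
        cleansed.withColumn("rn", F.row_number().over(w))
                .filter(F.col("rn") == 1)
                .drop("rn")
    )

    # Consolidate: write a curated dataset, partitioned for downstream queries
    # (assumes an event_date column exists).
    deduped.write.mode("overwrite").partitionBy("event_date").parquet(
        "s3://example-bucket/curated/transactions/"
    )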

Requirements

  • Bachelor's or Master's degree in Computer Science, Engineering, or a related field.
  • Minimum of 3 years of experience in Python development.
  • Strong experience with Apache Spark and its components (Spark SQL, Spark Streaming, MLlib, GraphX) using PySpark.
  • Demonstrated ability to write efficient, complex queries against large datasets (see the example after this list).
  • Knowledge of data warehousing principles and data modeling concepts.
  • Proficient understanding of distributed computing principles.
  • Experience with at least one cloud provider (AWS, Azure, GCP), including their big data processing services.
  • Strong problem-solving skills and ability to work under tight deadlines.
  • Excellent communication and collaboration abilities.
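
As context for the query-writing requirement above, the sketch below shows one common pattern for keeping complex queries efficient on large datasets in Spark SQL: projecting only the needed columns before a join to shrink the shuffled data. Table names, paths, and columns are invented for illustration.

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.appName("query-example").getOrCreate()

    # Register hypothetical curated tables as temporary views.
    spark.read.parquet("s3://example-bucket/curated/transactions/") \
        .createOrReplaceTempView("transactions")
    spark.read.parquet("s3://example-bucket/curated/accounts/") \
        .createOrReplaceTempView("accounts")

    # Project only the columns the join and aggregation need, then
    # aggregate monthly totals per region.
    monthly_totals = spark.sql("""
        SELECT a.region,
               date_trunc('month', t.event_date) AS month,
               SUM(t.amount) AS total_amount
        FROM (SELECT account_id, event_date, amount FROM transactions) t
        JOIN accounts a ON t.account_id = a.account_id
        GROUP BY a.region, date_trunc('month', t.event_date)
    """)
    monthly_totals.show()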

Nice-to-haves

  • Experience with additional big data tools like Hadoop, Kafka, or similar technologies.
  • Familiarity with machine learning frameworks and libraries.
  • Experience with data visualization tools and libraries.
  • Knowledge of containerization and orchestration technologies (Docker, Kubernetes).
  • Contributions to open-source projects or a strong GitHub portfolio showcasing relevant projects.

Benefits

  • A highly competitive compensation and benefits package
  • A multinational organization with 55 offices in 20 countries and the possibility to work abroad
  • Laptop and a mobile phone
  • 10 days of paid annual leave (plus sick leave and national holidays)
  • Maternity & Paternity leave plans
  • A comprehensive insurance plan including medical, dental, vision, life insurance, and long-/short-term disability coverage (plans vary by region)
  • Retirement savings plans
  • A higher education certification policy
  • Commuter benefits (varies by region)
  • Extensive training opportunities focused on skills, substantive knowledge, and personal development
  • On-demand Udemy for Business for all Synechron employees, with free access to more than 5,000 curated courses
  • Coaching opportunities with experienced colleagues from our Financial Innovation Labs (FinLabs) and Centers of Excellence (CoE) groups
  • Cutting-edge projects at the world's leading tier-one banks, financial institutions, and insurance firms
  • A flat and approachable organization
  • A truly diverse, fun-loving and global work culture