Scicom Infrastructure Services - Atlanta, GA

posted 28 days ago

Full-time - Senior
Atlanta, GA
Professional, Scientific, and Technical Services

About the position

We are seeking an experienced Databricks Architect to lead the design, architecture, and implementation of scalable data solutions on the Databricks platform. The ideal candidate will have expertise in data engineering, data architecture, and cloud technologies, along with hands-on experience with Databricks, Apache Spark, and related big data technologies. This role requires a strong technical background, excellent problem-solving skills, and the ability to work closely with stakeholders to deliver high-quality data solutions.

Responsibilities

  • Lead the architecture and design of Databricks-based data solutions that support data engineering, machine learning, and real-time analytics.
  • Design and implement ETL (Extract, Transform, Load) pipelines using Databricks, Apache Spark, and other big data tools to process and integrate large-scale data from multiple sources (a brief sketch follows this list).
  • Work with business and data teams to understand requirements, identify opportunities for automation, and design solutions that improve data workflows.
  • Create highly optimized, scalable, and cost-effective architectures for processing large data sets and managing big data workloads using Databricks, Delta Lake, and Apache Spark.
  • Define, promote, and implement best practices for the Databricks platform, including data governance, security, compliance, performance optimization, and monitoring.
  • Manage and optimize Databricks clusters for performance, cost, and reliability.
  • Automate repetitive tasks, streamline data processes, and optimize data workflows to improve efficiency and reduce operational costs.
  • Mentor junior engineers and ensure the team follows best practices when developing data pipelines and analytics solutions.
  • Stay current with emerging technologies in the big data and cloud space, and recommend new solutions or improvements to existing processes.
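
As a rough illustration of the pipeline work described above, here is a minimal PySpark sketch that reads raw JSON from cloud storage and writes it to a Delta table. The paths, table names, and columns (for example, /mnt/bronze/events and silver.events) are hypothetical placeholders, not details from this posting.

    # Minimal ETL sketch in PySpark; paths, table, and column names are
    # hypothetical placeholders, not details from this posting.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("events-etl").getOrCreate()

    # Extract: read raw JSON files landed in cloud object storage.
    raw = spark.read.json("/mnt/bronze/events")

    # Transform: deduplicate, enforce types, drop malformed rows.
    clean = (
        raw.dropDuplicates(["event_id"])
           .withColumn("event_ts", F.to_timestamp("event_ts"))
           .withColumn("event_date", F.to_date("event_ts"))
           .filter(F.col("event_ts").isNotNull())
    )

    # Load: append to a Delta table, partitioned for downstream queries.
    (clean.write.format("delta")
          .mode("append")
          .partitionBy("event_date")
          .saveAsTable("silver.events"))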

Requirements

  • Extensive experience with Databricks, Apache Spark, and cloud platforms (AWS, Azure, or GCP).
  • Proficiency in programming languages such as Python, Scala, or SQL.
  • Strong understanding of distributed computing, data modeling, and data storage technologies.
  • Hands-on experience with Delta Lake, Spark SQL, and MLlib.
  • Expertise in deploying and managing data platforms and workloads in cloud environments.
  • Familiarity with cloud-native services like S3, Redshift, Azure Blob Storage, and BigQuery.
  • Experience designing, building, and optimizing ETL data pipelines.
  • Familiarity with data warehousing concepts, OLAP, and OLTP systems.
  • Experience in integrating machine learning workflows with Databricks, building models, and automating model deployment (see the sketch after this list).
  • Strong leadership and communication skills to interact with both technical and non-technical stakeholders.
  • Experience in leading cross-functional teams and mentoring junior team members.
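
To give a concrete flavor of the machine-learning integration mentioned above, the following is a minimal MLlib sketch, assuming a Delta table of labeled features; the table name, feature columns, and model path are invented for illustration.

    # Illustrative MLlib training sketch; the table, columns, and output
    # path are invented for illustration. On Databricks, a `spark` session
    # is already provided by the runtime.
    from pyspark.sql import SparkSession
    from pyspark.ml import Pipeline
    from pyspark.ml.classification import LogisticRegression
    from pyspark.ml.feature import VectorAssembler

    spark = SparkSession.builder.getOrCreate()
    df = spark.table("silver.training_data")  # hypothetical labeled data

    # Assemble raw numeric columns into the single vector MLlib expects.
    assembler = VectorAssembler(
        inputCols=["sessions", "spend", "tenure_days"],  # hypothetical features
        outputCol="features",
    )
    lr = LogisticRegression(featuresCol="features", labelCol="label")

    # Fit the two-stage pipeline and persist it for automated deployment.
    model = Pipeline(stages=[assembler, lr]).fit(df)
    model.write().overwrite().save("/mnt/models/churn_lr")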

Nice-to-haves

  • In-depth experience with Databricks components, such as notebooks, jobs, and collaboration features.
  • Experience with DevOps practices, automation, and CI/CD pipelines in data engineering.
  • Strong knowledge of data governance principles, such as metadata management, data lineage, and data quality.
  • Databricks Certified Associate Developer for Apache Spark.
  • Cloud certifications (e.g., AWS Certified Solutions Architect, Azure Solutions Architect Expert).