Aria Consulting Services - Santa Clara, CA

posted 3 months ago

Full-time
Santa Clara, CA
Professional, Scientific, and Technical Services

About the position

The Databricks Administrator role is a critical position within the client team, focusing on the implementation and management of cloud-based infrastructure. The successful candidate will be responsible for ensuring optimal performance and availability of Python-based APIs, web servers, and application servers. This role requires staying current with the latest trends in Python, APIs, web server administration, application server administration, and Machine Learning technologies. The ideal candidate will have a strong background in Cloud Kubernetes/OpenShift clusters and application lifecycle management, along with expertise in CI/CD implementation, architecture design, containerization with Docker, and AWS.

In this position, the Databricks Administrator will manage and maintain the Databricks platform, including cluster management, user access, and security configuration. The role involves monitoring and troubleshooting issues related to Databricks clusters, job scheduling, data pipelines, and data processing workflows. Collaboration with data engineers and data scientists is essential to optimize and tune Databricks performance, ensuring efficient data processing and analytics. The administrator will also work closely with IT operations teams to integrate Databricks with other data storage systems, data lakes, and data warehouses, and will automate and streamline Databricks administration tasks using scripting and automation tools.

The candidate will manage the entire lifecycle of applications running on Cloud Kubernetes/OpenShift platforms, ensuring high availability and performance. This includes designing, deploying, and maintaining Kubernetes clusters on cloud platforms such as AWS, and deploying and managing applications using AWS services such as EC2, S3, and RDS.

The Databricks Administrator will implement Continuous Integration and Continuous Deployment (CI/CD) pipelines for application releases, and will configure and maintain CI/CD tools and frameworks such as Jenkins and GitLab CI/CD. Additionally, the role requires containerizing applications using Docker and implementing monitoring tools and frameworks to track the health and performance of applications and infrastructure.

Responsibilities

  • Administer and maintain Python-based APIs, web servers, and application servers to ensure optimal performance and availability.
  • Monitor and troubleshoot system issues, including performance bottlenecks, server crashes, and connectivity problems.
  • Collaborate with development teams to ensure the seamless integration of APIs and web services into existing systems.
  • Implement security measures and best practices to protect APIs and servers from unauthorized access and potential threats.
  • Manage and maintain Machine Learning infrastructure, including the deployment and monitoring of models, data pipelines, and other ML components.
  • Administer and maintain the Databricks platform, including cluster management, user access, and security configuration.
  • Monitor and troubleshoot issues related to Databricks clusters, job scheduling, data pipelines, and data processing workflows.
  • Collaborate with data engineers and data scientists to optimize and tune Databricks performance, ensuring efficient data processing and analytics.
  • Work closely with IT operations teams to ensure seamless integration of Databricks with other data storage systems, data lakes, and data warehouses.
  • Automate and streamline Databricks administration tasks using scripting and automation tools.
  • Manage the entire lifecycle of applications running on Cloud Kubernetes/OpenShift platforms.
  • Monitor and troubleshoot application issues, perform upgrades, and ensure high availability.
  • Design, deploy, and maintain Kubernetes clusters on cloud platforms such as AWS.
  • Deploy and manage applications on AWS using services such as EC2, S3, and RDS.
  • Ensure scalability, security, and optimal performance of the Kubernetes infrastructure.
  • Implement Continuous Integration and Continuous Deployment (CI/CD) pipelines for application releases.
  • Configure and maintain CI/CD tools and frameworks such as Jenkins, GitLab CI/CD, or similar.
  • Collaborate with cross-functional teams to design and architect scalable and resilient cloud-based solutions.
  • Containerize applications using Docker to enable portability and scalability.
  • Implement and configure monitoring tools and frameworks such as Prometheus, Grafana, or similar.
  • Set up monitoring dashboards to track the health, performance, and availability of applications and infrastructure.

Requirements

  • Experience installing and configuring current Python versions, including performing Python upgrades and maintaining Python environments.
  • Knowledge of storage management for unstructured data, with a good understanding of on-prem (NAS) and cloud storage technologies (FSx, SnapMirror, S3).
  • Strong background in Cloud Kubernetes/OpenShift and application lifecycle management.
  • Experience with Cloud Kubernetes clusters and a solid understanding of CI/CD implementation and architecture design.
  • Proficiency in containerization, Docker, and AWS services.

Nice-to-haves

  • Familiarity with monitoring tools and frameworks such as Prometheus and Grafana.
  • Experience with scripting and automation tools for streamlining administration tasks.