The Friedkin Group - Houston, TX
posted about 2 months ago
As a Lead Machine Learning Ops Engineer, you will play a pivotal role in implementing DevOps and ML Ops practices within the Corporate Data & Analytics Team to support AI/ML application enablement across The Friedkin Group of companies. Your primary responsibility will be to drive the adoption of DevOps and ML Ops best practices, accelerating the deployment of AI/ML and data-driven solutions that meet our business needs. We seek a motivated and skilled individual with a strong background in DevOps and ML Ops, a deep understanding of Infra Ops, and solid knowledge of cloud services and components for AI/ML, data, and analytics.
You will collaborate closely with data scientists, machine learning engineers, data engineers, software engineers, and platform architects, using the latest tools and technologies to deploy and maintain AI/ML and advanced analytics solutions and to integrate analytic models with existing business applications.
In this role, you will develop automated build and deployment processes to enable continuous delivery of software releases and enhance the existing CI/CD pipelines for AI/ML application development and deployment. You will collaborate with data scientists, data engineers, data analysts, software engineers, IT specialists, and stakeholders to accelerate the deployment of AI applications via CI/CD pipelines and to maintain the SLAs of those applications on the centralized platform. Additionally, you will design, develop, and maintain infrastructure using infrastructure-as-code tools such as Terraform, Ansible, and CloudFormation. You will templatize existing Databricks CLI code to manage the Databricks platform as code for AI/ML data pipelines (batch processing, batch streaming, and streaming) and model serving endpoints.
Your role will also involve enhancing existing DevOps practices to improve the overall AI/ML application development lifecycle, ensuring that applications are highly available and scalable, and establishing best practices for cloud security, compliance, and cost optimization.