JPMorgan Chase - New York, NY

posted 2 months ago

Full-time - Senior
New York, NY
Credit Intermediation and Related Activities

About the position

As a Senior Machine Learning Operations Engineer, VP within our Consumer & Community Banking division, you will play a pivotal role in building and maintaining the infrastructure necessary for model training and deployment. This position is crucial for ensuring that our machine learning models are developed, tested, and deployed in a controlled and efficient manner. You will be responsible for creating and managing pipelines that facilitate batch and real-time model serving, hyperparameter tuning at scale, and model monitoring. Your work will directly impact the performance of our applications, which are designed to provide personalized experiences across various banking channels, integrating traditional banking services with innovative offerings in travel, shopping, and dining.

In this role, you will deploy and maintain infrastructure such as SageMaker Notebooks, which will serve as an effective model development platform for our data scientists and ML engineers. You will also be tasked with building, deploying, and maintaining pipelines for feature generation, ensuring that the input features for model training and inference are calculated accurately and efficiently. Your expertise will be essential in identifying and implementing high-quality model monitoring and observability tools, as well as managing compute-intensive tasks related to hyperparameter tuning and model interpretability.

Collaboration is key in this position; you will partner with product, architecture, and other engineering teams to define scalable and performant technical solutions. Your deep technical expertise will help not only in designing extensible solutions but also in coaching and developing your team members. You will ensure that all work is executed in compliance with established standards and business requirements, while proactively maintaining high operational excellence standards for our production systems. Your ability to anticipate the needs of broader teams and manage dependencies will be critical to the success of your initiatives.

Responsibilities

  • Deploy and maintain infrastructure (e.g., SageMaker Notebooks) for an effective model development platform for data scientists and ML engineers.
  • Build, deploy, and maintain ingress/egress and feature generation pipelines to calculate input features for model training and inference.
  • Deploy and maintain infrastructure for batch and real-time model serving in high throughput, low latency applications at scale.
  • Identify, deploy, and maintain high-quality model monitoring and observability tools.
  • Deploy and maintain infrastructure for compute-intensive tasks such as hyperparameter tuning, interpretability, and explainability.
  • Partner with product, architecture, and other engineering teams to define scalable and performant technical solutions.
  • Leverage deep technical expertise to design extensible and scalable solutions and coach and grow individuals and teams.
  • Ensure team executes work according to compliance standards, SLAs, and business requirements to meet the objectives of an initiative.
  • Anticipate the needs of broader teams and potential dependencies with other teams.
  • Identify and mitigate issues to execute a book of work while escalating issues as necessary.
  • Proactively help maintain high operational excellence standards for production systems.

Requirements

  • BS degree in Computer Science or related Engineering field
  • 5+ years applied experience
  • Experience with model training, building, deployment, and execution ecosystems such as SageMaker and/or Vertex AI
  • Experience with monitoring and observability tools that track model inputs/outputs and feature statistics
  • Operational experience with data tools such as Spark, EMR, and Ray
  • Experience with and interest in ML model architectures such as linear or logistic regression, gradient boosted trees, and neural networks
  • Experience with containers (e.g., the Docker ecosystem), container orchestration systems such as Kubernetes and ECS, and workflow tools such as Airflow and Kubeflow
  • Experience with cloud technologies such as EC2, SageMaker, and IAM

Nice-to-haves

  • Bias for action and iterative development
  • Experience with recommendation and personalization systems
  • Experience with programming languages such as Python and Java
  • Familiarity with databases

Benefits

  • Comprehensive health care coverage
  • On-site health and wellness centers
  • Retirement savings plan
  • Backup childcare
  • Tuition reimbursement
  • Mental health support
  • Financial coaching