Walmart - Sunnyvale, CA
posted 3 months ago
As a member of our machine learning platform team, you will be integral to the development and deployment of a foundational AI layer along with Generative AI models for online inferencing. This role is crucial for supporting various applications, including Ads moderation and social commerce.

You will develop and maintain a robust, scalable machine learning platform that facilitates end-to-end machine learning operations, from model development and versioning to deployment. You will also create, deploy, and maintain high-quality web-based dashboard systems that draw visual insights from our ever-expanding dataset, monitor machine learning models, and evaluate model performance through A/B testing. Your responsibilities will further include developing, testing, and deploying big-data and machine learning pipelines that cover data ingestion, model production, and visualization. Efficient management of public cloud services is essential: you will use public cloud tools and resources to scale our storage and computation capabilities on the Google Cloud Platform.

We are looking for individuals with strong proficiency in at least one programming language, such as Python, Java, C++, or JavaScript, and a solid understanding of data structures and algorithms. Extensive knowledge of and experience with distributed inferencing on GPUs, LLM serving frameworks, and PyTorch/TensorFlow are also required. Familiarity with Linux, public cloud computing, Docker, and Kubernetes is essential, as is experience developing REST API services.

This position offers a unique opportunity to work in an environment where your contributions can significantly impact the lives of millions of people as we strive to innovate and redefine the retail experience.