PubMatic - Redwood City, CA

Full-time - Mid Level
Redwood City, CA
Computing Infrastructure Providers, Data Processing, Web Hosting, and Related Services

About the position

PubMatic is seeking a Senior Machine Learning Engineer with big data experience to help build the next generation of its ML platform. The ideal candidate is a self-motivated problem solver with a strong background in the big data tech stack, software design, and development. If you get excited about building a highly impactful machine learning platform that processes large datasets in a creative, fast-paced, open culture, you should consider applying for this position.

In this role, you will be responsible for designing, building, and implementing our highly scalable, fault-tolerant, and highly available big data platform, which processes terabytes of data and provides customers with in-depth analytics. You will develop big data pipelines using modern technologies such as Spark, Hadoop, Kafka, HBase, and Hive, and build analytics applications from the ground up using Java, Spring, Tomcat, Jenkins, REST APIs, JDBC, Amazon Web Services, and Hibernate.

You will work collaboratively with the Machine Learning and monetization teams to democratize data for analysis and impact, and build solutions that help the monetization team run experiments quickly and analyze data accurately to calculate impact. A good understanding of the engineering tech stack and of ML algorithms is essential for making the data processing jobs that power these algorithms more efficient and scalable. You will also develop systems that objectively monitor the impact of experimental changes on machine learning algorithms, clearly highlighting both positive and negative outcomes.

Day to day, you will manage Hadoop MapReduce and Spark jobs, resolve ongoing issues with operating the cluster, and take part in Agile/Scrum processes such as Sprint Planning, Sprint Retrospectives, and backlog grooming. You will coordinate regularly with the quality engineering team to ensure the quality of the platforms and products and the performance SLAs of Java-based microservices and Spark-based data pipelines. Supporting customer issues over email or JIRA, providing updates and patches to customers, and reviewing technical documents with the Technical Writing team round out the role.
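As a rough illustration of the pipeline work described above, here is a minimal sketch of a Spark batch job in Java, the posting's primary language. Every name in it (the ImpressionAggregator class, the S3 paths, the column names) is hypothetical and not drawn from PubMatic's actual stack:

    import org.apache.spark.sql.Dataset;
    import org.apache.spark.sql.Row;
    import org.apache.spark.sql.SparkSession;
    import static org.apache.spark.sql.functions.*;

    public class ImpressionAggregator {
        public static void main(String[] args) {
            SparkSession spark = SparkSession.builder()
                    .appName("impression-aggregator")
                    .getOrCreate();

            // Hypothetical input: ad-impression events stored as Parquet.
            Dataset<Row> events = spark.read().parquet("s3://example-bucket/impressions/");

            // Aggregate impression counts and revenue per publisher per hour.
            Dataset<Row> hourly = events
                    .withColumn("hour", date_trunc("hour", col("event_time")))
                    .groupBy(col("publisher_id"), col("hour"))
                    .agg(count("*").alias("impressions"),
                         sum("revenue").alias("revenue"));

            // Write hourly aggregates back out, partitioned for downstream analytics.
            hourly.write().mode("overwrite").partitionBy("hour")
                  .parquet("s3://example-bucket/hourly-aggregates/");

            spark.stop();
        }
    }

A job like this would typically be scheduled by an orchestrator, tuned for partitioning and data skew, and monitored against the performance SLAs the posting mentions.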

Responsibilities

  • Build, design, and implement a highly scalable, fault-tolerant, and highly available big data platform to process terabytes of data and provide customers with in-depth analytics.
  • Develop big data pipelines using a modern technology stack such as Spark, Hadoop, Kafka, HBase, and Hive.
  • Develop analytics applications from the ground up using a modern technology stack such as Java, Spring, Tomcat, Jenkins, REST APIs, JDBC, Amazon Web Services, and Hibernate.
  • Build data pipelines to automate high-volume data collection and processing to provide real-time data analytics (see the streaming sketch after this list).
  • Work collaboratively with Machine Learning and monetization teams to democratize data for analysis and impact.
  • Build solutions to help the monetization team run experiments at a fast pace and analyze data accurately to calculate impact.
  • Develop systems to objectively monitor the impact of various experimental changes on machine learning algorithms, clearly highlighting both positive and negative outcomes (a sketch of this kind of comparison also follows the list).
  • Manage Hadoop MapReduce and Spark jobs and resolve ongoing issues with operating the cluster.
  • Implement professional software engineering best practices across the full software development life cycle, including coding standards, code reviews, committing to GitHub, documentation in Confluence, continuous delivery with Jenkins, automated testing, and operations.
  • Participate in Agile/Scrum processes such as Sprint Planning, Sprint Retrospectives, backlog grooming, user story management, and work item prioritization.
  • Coordinate regularly with the quality engineering team to ensure the quality of the platforms/products and the performance SLAs of Java-based microservices and Spark-based data pipelines.
  • Support customer issues over email or JIRA, providing updates and patches to customers to fix issues.
  • Collaborate with the Technical Writing team on the technical documents published on the documentation portal.
  • Perform code and design reviews of code implemented by peers, as per the code review process.
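For the real-time collection responsibility above, a minimal Java sketch using the Kafka consumer API might look like the following; the broker address, group id, and topic name are hypothetical placeholders, not PubMatic's actual configuration:

    import java.time.Duration;
    import java.util.List;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerConfig;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class EventCollector {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
            props.put(ConsumerConfig.GROUP_ID_CONFIG, "analytics-collector");
            props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
            props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(List.of("ad-events")); // hypothetical topic
                while (true) {
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        // A real pipeline would validate, enrich, and forward each
                        // event to downstream storage (e.g., HBase or S3).
                        System.out.printf("partition=%d offset=%d value=%s%n",
                                record.partition(), record.offset(), record.value());
                    }
                }
            }
        }
    }

In production, such a consumer would run replicated within a consumer group, committing offsets only after downstream writes succeed.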
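For the experiment-monitoring responsibility, one common objective measure is a two-proportion z-test comparing a control arm against a treatment arm. This is a self-contained sketch with made-up numbers, not PubMatic's actual methodology:

    public class ExperimentImpact {
        /** Two-proportion z-score for conversion-style metrics. */
        static double zScore(long succA, long totalA, long succB, long totalB) {
            double pA = (double) succA / totalA;
            double pB = (double) succB / totalB;
            double pooled = (double) (succA + succB) / (totalA + totalB);
            double se = Math.sqrt(pooled * (1 - pooled) * (1.0 / totalA + 1.0 / totalB));
            return (pB - pA) / se;
        }

        public static void main(String[] args) {
            // Hypothetical daily counts pulled from the analytics platform.
            long controlWins = 10_250, controlTotal = 500_000;
            long treatmentWins = 10_900, treatmentTotal = 500_000;

            double z = zScore(controlWins, controlTotal, treatmentWins, treatmentTotal);
            double liftPct = 100.0 * (((double) treatmentWins / treatmentTotal)
                    / ((double) controlWins / controlTotal) - 1);

            // |z| > 1.96 corresponds to p < 0.05 for a two-sided test.
            System.out.printf("lift = %.2f%%, z = %.2f, significant = %b%n",
                    liftPct, z, Math.abs(z) > 1.96);
        }
    }

Reporting both the lift and its significance keeps positive and negative outcomes equally visible, as the responsibility describes.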

Requirements

  • 3-5 years of coding experience in Java.
  • Solid computer science fundamentals, including data structures, algorithm design, and the creation of architectural specifications.
  • Expertise in implementing professional software engineering best practices for the full software development life cycle, including coding standards, code reviews, source control management, documentation, build processes, automated testing, and operations.
  • A passion for developing and maintaining a high-quality code base and test suite, and for enabling contributions from engineers across the team.
  • Expertise in big data technologies such as Hadoop, Spark, Kafka, and HBase is an added advantage.
  • Experience developing and delivering large-scale big data pipelines, real-time systems, and data warehouses is preferred.
  • Demonstrated ability to achieve stretch goals in an innovative, fast-paced environment.
  • Demonstrated ability to learn new technologies quickly and independently.
  • Excellent verbal and written communication skills, especially in technical communications.
  • Strong interpersonal skills and a desire to work collaboratively.

Nice-to-haves

  • Experience with cloud services, particularly Amazon Web Services (AWS).
  • Familiarity with Agile/Scrum methodologies.

Benefits

  • Base Salary Range: $160,000 - $180,000
  • Bonus opportunities
  • Competitive benefits package