Ford - Trenton, NJ

posted 3 months ago

Full-time - Entry Level
Onsite - Trenton, NJ
Transportation Equipment Manufacturing

About the position

We are the movers of the world and the makers of the future. We get up every day, roll up our sleeves and build a better world -- together. At Ford, we're all a part of something bigger than ourselves. Are you ready to change the way the world moves?

Enterprise Technology plays a critical part in shaping the future of mobility. If you're looking for the chance to leverage advanced technology to redefine the transportation landscape, enhance the customer experience and improve people's lives, this is the opportunity for you. Join us and challenge your IT expertise and analytical skills to help create vehicles that are as smart as you are.

The Monitoring as a Service (MaaS) Team is building and evolving its services with customers in mind. MaaS will enable teams to modernize and disrupt by providing robust monitoring tools powered by AI and easy-to-use dashboards. Monitoring increases transparency of applications' performance end-to-end, regardless of hosting location (on-prem or in the cloud), which gives us a better view into how we can proactively manage our apps and improve performance.

In this position, we are seeking an experienced DevOps and Site Reliability Engineer (SRE) to join our team and lead the development, enhancement, and extension of our global monitoring and observability platform. As a DevOps/SRE, your role will combine software engineering and systems engineering disciplines to ensure that software systems are available, scalable, and maintainable. This individual will play a pivotal role in addressing the evolving needs of our customers, including code and pipeline development, best practices with associated templates, and automation to remove toil and facilitate adoption.

Please note: this job is posted as remote. If the selected candidate lives within 50 miles of Dearborn, MI, it may require a hybrid onsite schedule, up to 60% of the time.

Responsibilities

  • Construct API libraries and automation scripts based on existing project workflows, mainly developing in Python
  • Consult with Product Teams to onboard new applications to Splunk, Dynatrace, VictorOps, and other monitoring applications
  • Work with First Responders and Product Teams to improve and support tooling for existing applications; this may include participating in an on-call rotation for incident management
  • Integrate and consolidate application workflows efficiently
  • Deploy applications to containers using Cloud Run and Tekton pipelines
  • Deliver a positive web user interface and experience to our internal Ford customers
  • Leverage experience to safely perform destructive testing to seek and discover vulnerabilities
  • Architect, design and develop automation to improve resilience, recoverability, availability, and scalability of supported applications
  • Recognize, validate, and evangelize emerging technologies and architectures that align with business objectives
  • Develop tooling to improve reliability, quality, and time-to-market for software solutions
  • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
  • Identify and reduce or eliminate toil via automation to maximize the time spent on engineering and innovation
  • Collaborate with development teams to design, build, and operate scalable and resilient software systems using Cloud native principles
  • Proactively identify stability risks and work with engineering leadership to establish appropriate mitigation plans
  • Regularly review key technical metrics such as transaction errors, logging, response times, caching strategies, conversion/bounce rates, capacity, and resource utilization
  • Assist in establishing an SRE mindset to ensure maximum availability/uptime

Requirements

  • Strong background in software development and systems administration
  • Excellent problem-solving, troubleshooting, and communication skills
  • Experience in constructing API libraries and automation scripts, particularly in Python
  • Familiarity with monitoring applications such as Splunk, Dynatrace, and VictorOps
  • Experience in deploying applications to containers using Cloud Run and Tekton pipelines
  • Ability to conduct performance analysis and optimization of systems
  • Experience in developing automation to improve application resilience and scalability
  • Knowledge of Cloud native principles and practices

Nice-to-haves

  • Experience with AI-powered monitoring tools
  • Familiarity with incident management processes
  • Understanding of software development best practices
  • Experience in working with cross-functional teams

Benefits

  • Health insurance
  • 401k retirement plan
  • Paid time off
  • Flexible work hours
  • Professional development opportunities