Yahoo Holdings - Omaha, NE

posted 2 months ago

Full-time - Mid Level
Hybrid - Omaha, NE

About the position

The Senior Software Development Engineer at Yahoo will play a crucial role in the analysis, design, programming, debugging, and modification of software enhancements and new products. This position requires leading the development of data warehouse designs in collaboration with a team of Big Data engineers. The engineer will work in an agile, Scrum-driven environment focused on delivering innovative products that meet the needs of users and stakeholders. The role involves designing applications, writing code, developing and testing software, debugging issues, and documenting work and results. Staying current with relevant technology is essential to maintaining and improving the applications developed.

In this position, the engineer will perform all phases of software engineering, including requirements analysis, application design, and code development and testing. A significant part of the role (50%) will involve designing and implementing reusable frameworks, libraries, and Java components, as well as product features, in collaboration with business and IT stakeholders. The engineer will also ingest data from various structured and unstructured data sources into Hadoop and other distributed Big Data systems (15%), and will support the sustainment and delivery of an automated ETL pipeline, which includes validating data extracted from sources like HDFS, databases, and other repositories using scripts and automated capabilities (10%). The engineer will enrich and transform extracted data as required and monitor the data flow through the ETL process (5%), and will perform data extractions, data purges, or data fixes in accordance with internal procedures and policies (5%). Remaining responsibilities include tracking development and operational support via user stories and technical tasks in issue-tracking software such as JIRA, alongside tools like Git and Maven (5%); troubleshooting production support issues post-deployment (5%); and mentoring junior engineers within the team (5%).

Responsibilities

  • Perform all phases of software engineering, including requirements analysis, application design, and code development and testing
  • Design and implement reusable frameworks, libraries, Java components, and product features in collaboration with business and IT stakeholders
  • Ingest data from various structured and unstructured data sources into Hadoop and other distributed Big Data systems
  • Support the sustainment and delivery of an automated ETL pipeline
  • Validate data extracted from sources such as HDFS, databases, and other repositories using scripts, automated capabilities, logs, and queries
  • Enrich and transform extracted data, as required
  • Monitor and report the data flow through the ETL process
  • Perform data extractions, data purges, or data fixes in accordance with current internal procedures and policies
  • Track development and operational support via user stories and decomposed technical tasks in issue-tracking software such as JIRA, alongside tools like Git and Maven
  • Troubleshoot production support issues post-deployment and develop solutions as required
  • Mentor junior engineers within the team on development and delivery

Requirements

  • B.S. or M.S. in Computer Science (or equivalent experience)
  • Five years of related industry experience
  • Experience with cloud providers such as AWS or GCP in the Big Data domain (required)
  • Experience in back-end programming languages such as Java and Python, as well as OOAD and ETL tools
  • Experience with at least one database technology (e.g., MySQL)
  • Experience working with large-scale databases (BigQuery, Vertica, Redshift)
  • Knowledge of and experience with Unix/Linux platforms and shell scripting
  • Experience writing Pig Latin scripts, MapReduce jobs, HiveQL, Spark, etc.
  • Good knowledge of database structures, theories, principles, and practices
  • Familiarity with data-loading tools such as Flume and Sqoop
  • Knowledge of workflow schedulers such as Oozie and Airflow
  • Analytical and problem-solving skills applied to the Big Data domain
  • Proven understanding of Hadoop (Dataproc), HBase, Hive, Pig, SQ
  • Ability to write high-performance, reliable, and maintainable code
  • Expertise in version control tools such as Git
  • Good grasp of multi-threading and concurrency concepts
  • Effective analytical, troubleshooting, and problem-solving skills
  • Strong customer focus, ownership, urgency, and drive

Benefits

  • Healthcare
  • 401(k) savings plan
  • Company holidays
  • Vacation
  • Sick time
  • Parental leave
  • Employee assistance program