Yahoo Holdings - Champaign, IL

posted 2 months ago

Full-time - Mid Level
Hybrid - Champaign, IL

About the position

The Senior Software Development Engineer at Yahoo will analyze, design, program, debug, and modify software enhancements and new products, leading data warehouse design in collaboration with a team of Big Data engineers. The engineer will work in an agile, Scrum-driven environment focused on delivering innovative products that meet the needs of the business. Responsibilities include designing applications, writing code, developing and testing software, debugging, and documenting work and results; staying current with relevant technology is essential to maintaining and improving the applications developed.

In this role, the engineer will design and implement reusable frameworks, libraries, and Java components, as well as product features, in collaboration with business and IT stakeholders. The position requires ingesting data from structured and unstructured sources into Hadoop and other distributed Big Data systems. The engineer will support the sustainment and delivery of an automated ETL pipeline; validate data extracted from sources such as HDFS, databases, and other repositories; and enrich and transform extracted data as required. Monitoring and reporting on data flow through the ETL process, performing data extractions, purges, or fixes in accordance with internal procedures and policies, and tracking development and operational support via user stories and technical tasks in issue-tracking software are also key responsibilities.

Additionally, the engineer will troubleshoot production support issues post-deployment and mentor junior engineers on the team. The role requires a strong background in back-end programming, experience with large-scale databases, and familiarity with a range of database technologies and tools. The ideal candidate has excellent analytical and problem-solving skills, particularly in the Big Data domain, and a proven understanding of Hadoop and cloud providers such as AWS, GCP, and Azure.

Responsibilities

  • Design and implement reusable frameworks, libraries, Java components, and product features in collaboration with business and IT stakeholders.
  • Ingest data from various structured and unstructured data sources into Hadoop and other distributed Big Data systems.
  • Support the sustainment and delivery of an automated ETL pipeline.
  • Validate data extracted from sources such as HDFS, databases, and other repositories using scripts, logs, queries, and other automated checks (see the sketch after this list).
  • Enrich and transform extracted data, as required.
  • Monitor and report the data flow through the ETL process.
  • Perform data extractions, data purges, or data fixes in accordance with current internal procedures and policies.
  • Track development and operational support via user stories and decomposed technical tasks in the provided tooling, including Git, Maven, and JIRA.
  • Troubleshoot production support issues post-deployment and provide solutions as required.
  • Mentor junior engineers within the team on their development.
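
By way of illustration, a validation step of the kind described above might look like the following minimal PySpark sketch; the table name, column name, and checks are hypothetical, not taken from the posting:

    # Minimal sketch: post-load validation of a hypothetical Hive table.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = (SparkSession.builder
             .appName("etl-validation")
             .enableHiveSupport()
             .getOrCreate())

    # Read the table the ETL pipeline just loaded (name is a placeholder).
    df = spark.table("warehouse.daily_events")

    row_count = df.count()
    null_ids = df.filter(F.col("event_id").isNull()).count()

    # Fail loudly so the scheduler can flag the run for follow-up.
    if row_count == 0:
        raise ValueError("ETL load produced no rows")
    if null_ids > 0:
        raise ValueError(f"{null_ids} rows are missing event_id")
    print(f"Validation passed: {row_count} rows, no null event_id values")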

Requirements

  • B.S. or M.S. in Computer Science (or equivalent experience).
  • Five years of related industry experience.
  • Experience in back-end programming (e.g., Java, JavaScript, Python, Node.js), along with OOAD and ETL tools.
  • Experience with at least one database technology (e.g., Vertica, Oracle, Netezza, MySQL, BigQuery).
  • Experience working with large-scale databases.
  • Knowledge of and experience with Unix/Linux platforms and shell scripting.
  • Experience writing Pig Latin scripts, MapReduce jobs, HiveQL queries, etc.
  • Good knowledge of database structures, theories, principles, and practices.
  • Familiarity with data-loading tools such as Flume and Sqoop.
  • Knowledge of workflow schedulers such as Oozie and Airflow (see the sketch after this list).
  • Analytical and problem-solving skills applied to the Big Data domain.
  • Proven understanding of Hadoop (including Dataproc), HBase, Hive, and Pig.
  • Knowledge of cloud providers such as AWS, GCP, and Azure.
  • Ability to write high-performance, reliable, and maintainable code.
  • Expertise in version control tools such as Git.
  • Good aptitude for multi-threading and concurrency concepts.
  • Effective analytical, troubleshooting, and problem-solving skills.
  • Strong customer focus, ownership, urgency and drive.
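
As a rough illustration of the workflow-scheduler experience mentioned above, a minimal Airflow DAG that wires extract, transform, and load steps together might look like this; the DAG id, schedule, and commands are placeholders:

    # Minimal Airflow sketch: a daily ETL pipeline (ids/commands are placeholders).
    from datetime import datetime
    from airflow import DAG
    from airflow.operators.bash import BashOperator

    with DAG(
        dag_id="daily_etl_pipeline",
        start_date=datetime(2024, 1, 1),
        schedule_interval="@daily",
        catchup=False,
    ) as dag:
        extract = BashOperator(task_id="extract", bash_command="echo extract")
        transform = BashOperator(task_id="transform", bash_command="echo transform")
        load = BashOperator(task_id="load", bash_command="echo load")

        # Run the stages in order; a failure blocks downstream tasks.
        extract >> transform >> load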

Benefits

  • Healthcare
  • 401K savings plan
  • Company holidays
  • Vacation
  • Sick time
  • Parental leave
  • Employee assistance program