Infosys - Bridgewater Township, NJ

posted about 2 months ago

Full-time
Bridgewater Township, NJ
Professional, Scientific, and Technical Services

About the position

Infosys is seeking a Data Engineer to play a pivotal role in developing enterprise-level software applications. As a polyglot developer, you will focus on core programming across multiple languages, ensuring that applications are performant, scalable, and extensible. You will drive innovation within your chosen domain and collaborate with teams to implement a predictable agile DevSecOps model. This position requires a full-stack developer who can tackle challenges in both front-end and back-end architecture, ultimately delivering exceptional experiences for users around the globe.

In this role, you will be part of a small, dynamic team that leverages new technologies to solve complex problems. Your responsibilities will include end-to-end implementation of projects using technologies such as Cloudera Hadoop, Spark, Hive, and HBase. You will also categorize, catalog, cleanse, and normalize datasets, and provide user access to them through REST and Python APIs. The position demands a strong foundation in the extraction, transformation, and loading (ETL) of data from diverse sources using Python, SQL, and AWS technologies.

The ideal candidate will have a minimum of four years of core development experience, particularly in Scala or Python for Spark application development, along with a solid grounding in SQL and Unix shell scripting. You will work closely with analytics teams to build end-to-end data integration and warehousing solutions, ensuring that data is optimized for performance and usability. Excellent planning and coordination skills are essential, as is the ability to thrive in a global delivery environment with diverse stakeholders.

Responsibilities

  • Develop enterprise-level software applications using multiple programming languages.
  • Ensure performance, quality, scalability, and extensibility of applications.
  • Drive innovation in the chosen domain.
  • Collaborate with teams to implement a predictable agile DevSecOps model.
  • Work on end-to-end implementation of projects using Cloudera Hadoop, Spark, Hive, HBase, and other technologies.
  • Categorize, catalog, cleanse, and normalize datasets.
  • Provide user access to datasets using REST and Python APIs.
  • Perform extraction, transformation, and loading of data from various sources using Python, SQL, and AWS technologies.
  • Build end-to-end data integration and data warehousing solutions for analytics teams.

Requirements

  • Bachelor's degree or foreign equivalent from an accredited institution.
  • Minimum of 4 years of core development experience in relevant technology stacks.
  • Deep expertise in Scala or Python for Spark application development.
  • Strong knowledge and hands-on experience in SQL and Unix shell scripting.
  • Experience in end-to-end implementation of projects using Cloudera Hadoop, Spark, Hive, HBase, Sqoop, Kafka, Elasticsearch, Grafana, and ELK stack.
  • Experience in categorizing, cataloging, cleansing, and normalizing datasets.
  • Experience in providing user access to datasets using REST and Python APIs.
  • Experience in extraction, transformation, and loading of data from various data sources.

Nice-to-haves

  • Experience in data warehousing technologies and ETL/ELT implementations.
  • Sound knowledge of software engineering design patterns and practices.
  • Strong understanding of functional programming.
  • Experience with Ranger, Atlas, Tez, Hive LLAP, Neo4J, NiFi, Airflow, or any DAG-based tools.
  • Knowledge and experience with cloud and containerization technologies such as Azure, Kubernetes, OpenShift, and Docker.
  • Experience with data visualization tools like Tableau and Kibana.
  • Experience with design and implementation of ETL/ELT frameworks for complex warehouses/marts.
  • Knowledge of large data sets and experience with performance tuning and troubleshooting.