EPAM Systems - New York, NY

posted 4 days ago

Full-time - Mid Level
New York, NY
Professional, Scientific, and Technical Services

About the position

The Senior Data Integration Engineer will play a crucial role in supporting a digital transformation project for one of EPAM's top clients. This position involves developing, implementing, and maintaining big data solutions on the client's Cloud Platform, which supports multiple deployed routers and collects data from them. The role offers opportunities for skill advancement and growth within a global organization.

Responsibilities

  • Creating and integrating tables from different data sources in the cloud for data acquisition, integration, and analysis
  • Collecting router logs, storing them in AWS S3, and analyzing them with Athena to generate statistics
  • Performing transformations and actions on Spark RDDs to serve as a data staging layer for ETL
  • Developing automation frameworks that connect to multiple clusters and databases in the cloud, such as HBase, MongoDB, Oracle, Teradata, and SQL Server, to achieve simultaneous data flow
  • Configuring Glue crawlers, tables, and Athena external tables from the S3 data source to run SQL queries
  • Querying large datasets with Python, Boto3, and Pandas by connecting to AWS S3, HBase, and Athena, and scheduling multiple jobs on EMR
  • Writing reusable code modules in Python, pandas, Boto3, and Spark to handle large numbers of datasets
  • Developing APIs with Lambda and API Gateway to provide an abstraction layer for integration with on-premises systems
  • Building end-to-end test cases to validate API data flow: testing on the router for APIs sent from the cloud, verifying end-to-end functionality, validating API parameters end to end from the routers to the cloud API, and testing TR-069 parameters end to end from the routers
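As a rough illustration of the S3/Athena workflow described above, the sketch below submits an aggregation query over router logs with Boto3. The table, column, database, and bucket names are hypothetical placeholders; real schemas and locations will differ per deployment.

```python
def build_log_stats_query(table: str, day: str) -> str:
    """Build an Athena SQL query that aggregates router log events
    per status code for one day. The table and column names
    (status_code, dt) are illustrative, not a real schema."""
    return (
        f"SELECT status_code, COUNT(*) AS events "
        f"FROM {table} "
        f"WHERE dt = '{day}' "
        f"GROUP BY status_code "
        f"ORDER BY events DESC"
    )


def run_athena_query(query: str, database: str, output_s3: str) -> str:
    """Submit the query to Athena and return the execution id.
    Requires configured AWS credentials; boto3 is imported lazily
    so the query builder above works without it installed."""
    import boto3  # AWS SDK for Python

    client = boto3.client("athena")
    resp = client.start_query_execution(
        QueryString=query,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_s3},
    )
    return resp["QueryExecutionId"]
```

In practice the execution id would be polled with `get_query_execution` and the results read back from the S3 output location, for example into a pandas DataFrame for further analysis.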

Requirements

  • Self-driven with the ability to work independently and develop solutions without close supervision
  • Hands-on experience with AWS services
  • Strong understanding of big data pipelines and common databases, both relational (MySQL, PostgreSQL) and NoSQL (MongoDB, HBase)
  • Extensive experience with SQL, including modifying and writing complex queries, particularly using window functions
  • Experience performing batch/real-time processing using Spark
  • Experience designing and developing appropriate test automation frameworks and data validation techniques to ensure optimized product performance
  • Understanding of the networking stack and at least one wireless protocol, preferably 802.11
  • Experience working with routers and embedded consumer products; knowledge of the TR-069 CPE WAN Management Protocol preferred
  • Experience with Cloud testing tools
  • Very strong scripting experience using Spark and Python is a must
  • Experience in the Telecommunications Industry is preferred

Nice-to-haves

  • Experience with additional ETL tools
  • Familiarity with data governance and data quality frameworks

Benefits

  • Medical, Dental and Vision Insurance (Subsidized)
  • Health Savings Account
  • Flexible Spending Accounts (Healthcare, Dependent Care, Commuter)
  • Short-Term and Long-Term Disability (Company Provided)
  • Life and AD&D Insurance (Company Provided)
  • Employee Assistance Program
  • Unlimited access to LinkedIn learning solutions
  • Matched 401(k) Retirement Savings Plan
  • Paid Time Off - the employee will be eligible to accrue 15-25 paid days, depending on specific level and tenure with EPAM
  • Paid Holidays - nine (9) total per year
  • Legal Plan and Identity Theft Protection
  • Accident Insurance
  • Employee Discounts
  • Pet Insurance
  • Employee Stock Purchase Program
  • Participation in the discretionary annual bonus program
  • Participation in the discretionary Long-Term Incentive (LTI) Program