Amazon - Seattle, WA

posted about 2 months ago

Full-time - Mid Level
Seattle, WA
10,001+ employees
Sporting Goods, Hobby, Musical Instrument, Book, and Miscellaneous Retailers

About the position

Amazon Web Services (AWS) is seeking an experienced Data Engineer to join the Global Services Support Operations (GSSO) team in Seattle, Washington. This role is pivotal in designing, developing, and maintaining efficient and scalable data pipelines, data models, and data warehousing solutions. The Data Engineer will ensure data integrity, quality, and availability across the organization, enabling data-driven decision-making and supporting business intelligence initiatives.

The successful candidate will conduct data discovery and profiling for various data sources, including Page0, entitlement models, and propensity models. They will design and implement robust ETL pipelines to extract data from multiple sources, transform it according to defined business rules, and load it into centralized data storage systems. In addition to developing and maintaining efficient data ingestion processes, the Data Engineer will design and develop effective data models and schemas for these sources, and will implement and maintain a centralized data warehouse architecture with an emphasis on scalability, performance, and data integrity. Collaborating with stakeholders to understand data requirements and translate them into optimized data structures and models is essential.

The role also involves establishing and enforcing data governance policies and procedures to maintain data quality, consistency, and integrity across the organization. The Data Engineer will implement data quality controls, monitoring, and reporting mechanisms to ensure data accuracy and reliability, continuously monitoring data quality, identifying issues, and implementing remediation measures as needed.

Finally, the Data Engineer will design and implement centralized data integration pipelines that consolidate data from various sources into a unified data platform, ensuring seamless integration and interoperability between systems, applications, and processes, and will integrate propensity models with downstream systems to facilitate data-driven decision-making. Implementing robust data security measures, including access controls, authentication, and authorization mechanisms, is crucial for protecting sensitive data. The Data Engineer will ensure data lineage and traceability across the organization to enable auditing and compliance, continuously monitor data flows to identify bottlenecks and optimize performance, develop data retention and archiving policies to manage data growth and storage requirements, and establish disaster recovery and business continuity plans to ensure data availability and resilience.
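
The work described above follows a conventional extract-transform-load pattern feeding a centralized warehouse. As a purely illustrative sketch (the bucket paths, column names, and business rule below are invented for this example and are not taken from the posting or any AWS internal tooling), a minimal PySpark job of that shape might look like:

    # Minimal ETL sketch (illustrative only): pull a raw extract from S3,
    # apply a simple business rule, and load the result into a curated,
    # partitioned location backing the warehouse layer.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("etl-sketch").getOrCreate()

    # Extract: read a hypothetical propensity-model export landed as Parquet.
    propensity = spark.read.parquet("s3://example-raw-bucket/propensity/")

    # Transform: keep current records and band scores per a made-up rule.
    scored = (
        propensity
        .filter(F.col("is_current"))
        .withColumn(
            "propensity_band",
            F.when(F.col("score") >= 0.7, "high")
             .when(F.col("score") >= 0.4, "medium")
             .otherwise("low"),
        )
    )

    # Load: write partitioned Parquet that downstream consumers can query.
    (
        scored.write
        .mode("overwrite")
        .partitionBy("snapshot_date")
        .parquet("s3://example-curated-bucket/propensity_scored/")
    )

The same shape applies whether such a job runs on EMR, AWS Glue, or another Spark runtime; only the scheduling and configuration differ.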

Responsibilities

  • Conduct data discovery and profiling for various data sources, including Page0, entitlement models, and propensity models.
  • Design and implement robust ETL pipelines to extract data from multiple sources, transform it according to defined business rules, and load it into centralized data storage systems.
  • Develop and maintain efficient data ingestion processes, ensuring timely and accurate data availability.
  • Design and develop effective data models and schemas for Page0 data, entitlement model data, propensity model data, and other relevant data sources.
  • Implement and maintain a centralized data warehouse architecture, ensuring scalability, performance, and data integrity.
  • Collaborate with stakeholders to understand data requirements and translate them into optimized data structures and models.
  • Establish and enforce data governance policies and procedures to maintain data quality, consistency, and integrity across the organization.
  • Implement data quality controls, monitoring, and reporting mechanisms to ensure data accuracy and reliability (see the sketch after this list).
  • Continuously monitor data quality, identify issues, and implement remediation measures as needed.
  • Design and implement centralized data integration pipelines to consolidate data from various sources into a unified data platform.
  • Ensure seamless data integration and interoperability between different systems, applications, and processes.
  • Integrate propensity models with downstream systems and processes, facilitating data-driven decision-making.
  • Implement robust data security measures, including access controls, authentication, and authorization mechanisms, to protect sensitive data.
  • Ensure data lineage and traceability across the organization, enabling auditing and compliance.
  • Continuously monitor data flows and optimize performance by identifying bottlenecks and implementing efficient solutions.
  • Develop and implement data retention and archiving policies to manage data growth and storage requirements.
  • Establish data disaster recovery and business continuity plans to ensure data availability and resilience.
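
To make the data quality controls mentioned above concrete, one lightweight pattern is to run basic completeness, range, and uniqueness checks after each load. This is only a sketch under assumed names: the table location and the customer_id, score, and snapshot_date columns are hypothetical and not part of the posting.

    # Lightweight post-load data quality checks (illustrative only).
    # Table location and column names are hypothetical placeholders.
    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.appName("dq-sketch").getOrCreate()
    scored = spark.read.parquet("s3://example-curated-bucket/propensity_scored/")

    checks = {
        # No row should be missing its customer identifier.
        "null_customer_id": scored.filter(F.col("customer_id").isNull()).count(),
        # Scores are expected to stay inside the [0, 1] range.
        "score_out_of_range": scored.filter(
            (F.col("score") < 0) | (F.col("score") > 1)
        ).count(),
        # Each customer should appear at most once per snapshot.
        "duplicate_keys": (
            scored.groupBy("customer_id", "snapshot_date").count()
            .filter(F.col("count") > 1).count()
        ),
    }

    failures = {name: n for name, n in checks.items() if n > 0}
    if failures:
        # A real pipeline would surface this through monitoring and alerting
        # rather than a bare exception.
        raise ValueError(f"Data quality checks failed: {failures}")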

Requirements

  • 2+ years of data engineering experience
  • Experience with data modeling, warehousing and building ETL pipelines
  • Experience with one or more query languages (e.g., SQL, PL/SQL, DDL, MDX, HiveQL, SparkSQL, Scala)

Nice-to-haves

  • Experience with big data technologies such as Hadoop, Hive, Spark, and EMR
  • Experience with AWS technologies like Redshift, S3, AWS Glue, EMR, Kinesis, Firehose, Lambda, and IAM roles and permissions (see the sketch after this list)
  • Experience with at least one modern language such as Java, Python, C++, or C#, including object-oriented design
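
As context for the AWS technologies listed above, the sketch below shows one common way a few of them fit together: listing newly landed objects in S3 and starting an AWS Glue job with boto3. The bucket, prefix, job name, and job arguments are placeholders invented for illustration.

    # Orchestration sketch with boto3 (bucket, prefix, and job names are
    # placeholders): list newly landed S3 objects, then start a Glue job.
    import boto3

    s3 = boto3.client("s3")
    glue = boto3.client("glue")

    # See what has landed under a hypothetical raw prefix for this snapshot.
    resp = s3.list_objects_v2(
        Bucket="example-raw-bucket",
        Prefix="propensity/2024-01-01/",
    )
    objects = resp.get("Contents", [])
    print(f"{len(objects)} raw objects found")

    if objects:
        # Start a (hypothetical) Glue job; "--key" arguments are passed
        # through to the job script.
        run = glue.start_job_run(
            JobName="example-propensity-etl",
            Arguments={"--snapshot_date": "2024-01-01"},
        )
        state = glue.get_job_run(
            JobName="example-propensity-etl",
            RunId=run["JobRunId"],
        )
        print("Glue run state:", state["JobRun"]["JobRunState"])

In practice, IAM roles scoped to the specific buckets and Glue jobs involved would govern which of these calls a pipeline is permitted to make.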

Benefits

  • Comprehensive medical, financial, and other benefits
  • Equity and sign-on payments as part of total compensation package
  • Flexible working culture to support work-life balance
  • Mentorship and career advancement resources