Executive Recruiting by HR Pros, LLC - Silver Spring, MD

Full-time - Executive
Remote - Silver Spring, MD
Administrative and Support Services

About the position

We are seeking an experienced and highly motivated Cloud Data Lake Architect with extensive experience in AWS and data lake technologies for a fully remote position. The successful candidate will drive our efforts to support cloud initiatives for the National Oceanic and Atmospheric Administration (NOAA) and the National Environmental Satellite, Data, and Information Service (NESDIS), with a focus on architecting and provisioning enterprise-level cloud services, including Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS).

The ideal candidate will have a robust background in big data analytics, data engineering, and cloud computing, with a particular focus on building and optimizing data lakes on AWS. The role centers on designing, implementing, and managing advanced data solutions on AWS and Data Lakehouse platforms, and requires expertise in data lake and data lakehouse architectures. Day to day, that means designing scalable, efficient data pipelines with AWS services such as S3, Kinesis, Redshift, EMR, Glue, and Lambda in combination with Apache Spark; collaborating with cross-functional teams to translate business needs into technical solutions; keeping data storage and processing performant and cost-efficient; and ensuring the data infrastructure remains secure, well governed, highly available, and well documented. Detailed responsibilities are listed below.
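
As a concrete illustration of the pipeline work this role involves, the following is a minimal sketch of a raw-to-curated PySpark job, assuming it runs on AWS Glue or EMR. Every bucket name, path, and column here is a hypothetical placeholder for illustration, not a real NOAA system.

    # Minimal raw-to-curated pipeline sketch, assuming PySpark on AWS Glue or EMR.
    # All bucket names, paths, and column names are hypothetical placeholders.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("raw-to-curated").getOrCreate()

    # Read raw JSON records landed in the lake's raw zone (hypothetical path).
    raw = spark.read.json("s3://example-raw-zone/observations/")

    # Basic cleaning: drop rows missing key fields, derive a partition column.
    curated = (
        raw.dropna(subset=["station_id", "observed_at"])
           .withColumn("obs_date", F.to_date("observed_at"))
    )

    # Write Parquet to the curated zone, partitioned by date for efficient scans.
    (curated.write
            .mode("overwrite")
            .partitionBy("obs_date")
            .parquet("s3://example-curated-zone/observations/"))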

Responsibilities

  • Design and implement scalable and efficient data pipelines using AWS and Data Lakehouse Platform services.
  • Leverage AWS cloud services like S3, Kinesis, Redshift, EMR, Glue, Lambda, and others, in combination with Data Lakehouse platform/Apache Spark Integration for advanced data processing and analytics.
  • Collaborate with cross-functional teams to understand business needs and translate them into technical solutions.
  • Utilize Databricks for big data processing and streaming analytics (see the streaming sketch after this list).
  • Develop and maintain data lakes and data warehouses on AWS and Data Lakehouse Platform, ensuring data integrity and accessibility.
  • Optimize data storage and processing for performance and cost efficiency.
  • Automate data workflows and ensure high data quality and reliability.
  • Monitor, troubleshoot, and resolve data pipeline issues.
  • Organize and manage data within the environment, ensuring it is stored efficiently, securely, and supports easy access and analysis.
  • Monitor the performance of data processes and queries and optimize for efficiency and speed.
  • Ensure high standards of data quality and implement data governance practices.
  • Stay current with emerging trends and technologies in cloud computing, big data, and data engineering.
  • Provide ongoing support for the platform, troubleshoot any issues, and ensure high availability and reliability of data infrastructure.
  • Create documentation for the platform infrastructure and processes, and train other team members or users on the platform effectively.
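
To make the streaming responsibilities above concrete, here is a minimal sketch of Kinesis ingestion into a Delta Lake bronze table, assuming Databricks' built-in Kinesis connector and the ambient spark session a Databricks notebook provides. The stream name, region, and S3 paths are hypothetical placeholders, not actual NOAA/NESDIS resources.

    # Streaming ingestion sketch, assuming the Databricks Kinesis connector,
    # where `spark` is the ambient SparkSession. Names and paths are hypothetical.
    events = (spark.readStream
                   .format("kinesis")
                   .option("streamName", "example-sensor-events")
                   .option("region", "us-east-1")
                   .option("initialPosition", "latest")
                   .load())

    # Kinesis payloads arrive as binary; decode them for downstream parsing.
    decoded = events.selectExpr("CAST(data AS STRING) AS json_payload",
                                "approximateArrivalTimestamp")

    # Append to a bronze Delta table, with a checkpoint for exactly-once recovery.
    (decoded.writeStream
            .format("delta")
            .option("checkpointLocation", "s3://example-lake/checkpoints/sensor-events/")
            .outputMode("append")
            .start("s3://example-lake/bronze/sensor_events/"))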

Requirements

  • 5 years of experience in big data analytics, data engineering, or related roles.
  • Experience in designing, building, and maintaining data warehouses, with an understanding of data modeling, data warehousing, and data lake concepts.
  • Proficiency in programming languages such as Python, Java, Scala, and scripting languages like Bash or PowerShell.
  • Experience with big data technologies such as Apache Hadoop, Spark, Kafka, and others.
  • Hands-on experience with AWS services such as S3, Glue, Redshift, EMR, and Athena (see the query sketch after this list).
  • Proficiency in SQL and experience with relational databases.
  • Experience in building and optimizing big data pipelines, architectures, and data sets.
  • Familiarity with ETL tools, processes, and data integration techniques.
  • Excellent communication and team collaboration skills.
  • Must be able to obtain a security clearance.
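
As an illustration of the SQL and Athena experience listed above, the sketch below runs an ad-hoc query against the lake through the AWS SDK for Python (boto3). The API calls shown exist in boto3, but the database, table, and results bucket are hypothetical placeholders.

    # Ad-hoc Athena query sketch using boto3; database, table, and results
    # bucket are hypothetical placeholders.
    import boto3

    athena = boto3.client("athena", region_name="us-east-1")

    response = athena.start_query_execution(
        QueryString="""
            SELECT station_id, COUNT(*) AS n_obs
            FROM observations
            WHERE obs_date = DATE '2024-01-01'
            GROUP BY station_id
            ORDER BY n_obs DESC
            LIMIT 10
        """,
        QueryExecutionContext={"Database": "example_lake_db"},
        ResultConfiguration={"OutputLocation": "s3://example-athena-results/"},
    )

    # Athena runs asynchronously; the execution id is used to poll for results.
    print("Query execution id:", response["QueryExecutionId"])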