Teknismart Solutions - Rosemont, IL
posted 2 months ago
The Site Reliability Engineer (Kafka) position is designed for a seasoned professional with extensive experience in managing and administering Confluent Kafka clusters. The ideal candidate will have over 10 years of total IT experience, with a strong engineering background that includes a minimum of 6-8 years specifically focused on Confluent Kafka. This role requires expertise in both on-premises and cloud environments, ensuring that the Kafka infrastructure is robust, scalable, and efficient. The engineer will be responsible for setting up and managing the Confluent Kafka cluster, monitoring its performance, and ensuring optimal distribution of workloads across the system.

In this role, the engineer will design, configure, and manage Role-Based Access Control (RBAC) and multi-tenancy features to ensure secure and efficient access to the Kafka cluster. The position also involves managing all Kafka configurations through automation tools like Ansible, which streamlines the deployment and management processes. A critical aspect of the job is to establish and maintain disaster recovery (DR) strategies for Confluent Kafka instances, ensuring business continuity in the event of failures.

Collaboration is key in this role, as the engineer will coordinate with various development teams to manage their connectivity and usage within the Kafka cluster. This includes documenting best engineering practices and strategies to align near-term changes with long-term release objectives. The engineer will also work closely with infrastructure and backend teams on software upgrades, disaster recovery plans, and other critical initiatives to enhance the overall reliability and performance of the Kafka environment.