GovCIO - Austin, TX
posted about 2 months ago
GovCIO is currently seeking a Technical Data Scientist/ETL Engineer to join our ETL Team, which ingests and visualizes data from various cloud sources and alerts on deviations from the norm. This position is fully remote, allowing the flexibility to work from anywhere. The successful candidate will develop, inspect, mine, transform, and analyze data to create descriptive and predictive models that measurably improve productivity and decision-making, ultimately delivering strategic mission impact.

In this role, the engineer will apply data-wrangling tools, including ETL and ELT processes, along with programming languages to collect and blend data from operational systems and relevant external sources. The position requires a strong focus on data analysis, using data mining, machine learning, and statistical analysis to build predictive and descriptive models, and applying and integrating those models to develop segmentation, clustering, forecasting, classification, and other analytical solutions.

Data visualization is a key component of this role: the engineer will use data discovery and visualization tools to interpret and present findings in a compelling, usable manner. The engineer will also maintain and integrate analytical systems with operational systems, ensuring the accuracy of data and analytics, and will work closely with business and data subject matter experts (SMEs) to prioritize business and information needs.

The role further involves generating new business insights through the extraction, storage, transformation, analysis, and visualization of diverse data sets, including structured, unstructured, relational, and NoSQL data, and using distributed methods such as MapReduce to handle multi-terabyte data collections effectively.
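The extract-transform-load and alerting duties described above can be sketched in a few lines of Python. This is a minimal illustration, not GovCIO's actual pipeline: the inline CSV, the field names, and the alert threshold are all hypothetical stand-ins for a cloud data export.

```python
import csv
import io
import sqlite3

# Hypothetical inline CSV standing in for a cloud-source export.
RAW = """sensor,reading
a,10
b,12
a,31
"""

def extract(text):
    """Parse raw CSV text into row dicts (the 'E' in ETL)."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows, threshold=30):
    """Cast types and flag readings that deviate from the expected range."""
    out = []
    for row in rows:
        value = int(row["reading"])
        out.append({"sensor": row["sensor"],
                    "reading": value,
                    "alert": value > threshold})
    return out

def load(rows, conn):
    """Write transformed rows into an analytical store (the 'L')."""
    conn.execute("CREATE TABLE readings (sensor TEXT, reading INT, alert INT)")
    conn.executemany(
        "INSERT INTO readings VALUES (?, ?, ?)",
        [(r["sensor"], r["reading"], r["alert"]) for r in rows])
    conn.commit()

conn = sqlite3.connect(":memory:")
load(transform(extract(RAW)), conn)
alerts = conn.execute("SELECT sensor FROM readings WHERE alert = 1").fetchall()
```

In a real deployment the in-memory SQLite store would be replaced by a warehouse or cloud target, and the deviation check would drive the team's alerting.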
Additionally, the engineer will analyze data using various data mining, machine learning, and statistical algorithms available in commercial off-the-shelf (COTS) tools, such as SAS, SPSS, and Oracle, while building analytical solutions using programming languages like R and Python, along with relevant programming libraries. The position requires interpreting and evaluating the accuracy of results through iterative, agile methods, and developing actionable data stories using visualization tools like Tableau and Trifacta.