GovCIO - Sacramento, CA
posted about 2 months ago
GovCIO is currently seeking a Technical Data Scientist/ETL Engineer to join our ETL Team, which ingests and visualizes data from various cloud sources and alerts on deviations from normalization. This position is fully remote. The successful candidate will develop, inspect, mine, transform, and analyze data to create descriptive and predictive models that improve productivity and decision-making and provide strategic mission impact.
In this role, you will apply data wrangling tools, including ETL and ELT processes, along with programming languages to collect and blend data from both operational and relevant external systems. You will analyze data using data mining, machine learning, and statistical analysis techniques to create predictive and descriptive models, and integrate these models into analytical approaches such as segmentation, clustering, forecasting, and classification. You will also use data visualization tools to interpret and present findings so that insights are actionable and useful for stakeholders.
The position requires close collaboration with both business and data subject matter experts (SMEs) to prioritize business and information needs. You will generate new business insights through the extraction, storage, transformation, analysis, and visualization of diverse data sets. This includes collecting and transforming structured, unstructured, relational, and NoSQL data using ETL and ELT tools, as well as developing custom code. You will also need to understand and use distributed methods that scale to multi-terabyte data collections.
Your analytical work will use the data mining, machine learning, and statistical algorithms available in commercial off-the-shelf (COTS) tools such as SAS, SPSS, and Oracle, and you will build analytical solutions in programming languages such as R and Python. You will interpret and evaluate the accuracy of results through iterative, agile methods so that the data-driven insights you provide are reliable and impactful.