Ngap - Bonsall, CA

posted about 1 month ago

Full-time - Mid Level
Bonsall, CA
Professional, Scientific, and Technical Services

About the position

We are seeking a skilled Python Developer with experience in extracting unstructured data from PDFs, converting it into structured formats, and using this data for analysis and reporting. The ideal candidate will be proficient in creating data pipelines, transforming unstructured data into usable datasets, and generating insightful reports through interactive dashboards. The role requires strong problem-solving skills, attention to detail, and the ability to design, develop, and deploy scalable services independently while ensuring seamless integration and timely project delivery.

Responsibilities

  • Develop and maintain Python scripts to extract unstructured data from various PDF files.
  • Convert extracted data into structured formats (e.g., CSV, JSON) suitable for analysis.
  • Implement data transformation processes to clean and organize extracted data for reporting purposes.
  • Perform data analysis using Python libraries to derive insights from structured data.
  • Design, develop, and maintain interactive dashboards with a JavaScript frontend for visualizing data trends and reports.
  • Collaborate with business teams to understand data needs and tailor reports to meet business requirements.
  • Troubleshoot and optimize data extraction, transformation, and loading (ETL) processes.
  • Ensure data accuracy and integrity throughout the extraction and reporting pipeline.
  • Document code, processes, and procedures for data extraction and report generation.

Requirements

  • Bachelor's degree in Computer Science, Data Science, or a related field (or equivalent experience).
  • Proven experience in Python development, particularly with data extraction and manipulation.
  • Strong knowledge of Python libraries for PDF data extraction.
  • Experience with data analysis tools such as Pandas and NumPy, and with data visualization libraries such as Matplotlib, Plotly, or Seaborn.
  • Familiarity with SQL and databases for storing and retrieving structured data.
  • Experience in creating interactive dashboards and visualizations.
  • Strong problem-solving skills with the ability to work independently and deliver high-quality results.
  • Experience with version control tools like Git is a plus.

Nice-to-haves

  • Familiarity with OCR technologies (e.g., Tesseract) for extracting text from image-based PDFs.
  • Knowledge of machine learning techniques for improving data extraction and analysis.
  • Experience in handling large datasets and optimizing data workflows.

Benefits

  • Health insurance
  • Dental insurance
  • Paid time off
  • Vision insurance
  • Life insurance