Automated Data Scientist

Overview

Automated Data Scientist is an adaptive data analysis tool that uses language models to plan, execute, and refine data science workflows. It automates data preparation, analysis planning, code generation, and result interpretation. It is designed for rapid, data-driven decision-making, with configurable control over analysis depth and scope.

Key Features

How It Works

The tool initializes its data from a data dictionary and a sample of production data. It then requests an analysis plan from an external language model API, generates Python code for each step of the plan, and executes that code. The output of each step is reviewed and interpreted, and the remaining plan is adapted based on the findings.
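The loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names (run_step, analyze, fake_llm), the prompt prefixes (PLAN, CODE, INTERPRET, REPLAN), and the max_steps parameter are all assumptions, and the external language model API is replaced by a hard-coded stub so the example runs offline.

```python
import contextlib
import io
from typing import Callable, Dict, List

# Hypothetical stand-in for the external language model API: prompt in, text out.
LLM = Callable[[str], str]


def run_step(code: str, context: Dict[str, object]) -> str:
    """Execute generated Python code and return whatever it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(code, context)  # generated code is untrusted; sandbox it in real use
    return buffer.getvalue().strip()


def analyze(llm: LLM, data: List[dict], max_steps: int = 3) -> List[str]:
    """Plan, generate code, execute, interpret, and replan, up to max_steps steps."""
    plan = [s for s in llm("PLAN").splitlines() if s.strip()]
    context: Dict[str, object] = {"data": data}
    findings: List[str] = []
    while plan and len(findings) < max_steps:
        step = plan.pop(0)
        code = llm(f"CODE: {step}")
        output = run_step(code, context)
        findings.append(llm(f"INTERPRET: {step} -> {output}"))
        # Adaptive planning: a non-empty reply replaces the remaining steps.
        revised = [s for s in llm(f"REPLAN: {findings[-1]}").splitlines() if s.strip()]
        if revised:
            plan = revised
    return findings


def fake_llm(prompt: str) -> str:
    """Hard-coded stub (assumption) used in place of a real model."""
    if prompt.startswith("PLAN"):
        return "count rows\nsum revenue"
    if prompt.startswith("CODE: count rows"):
        return "print(len(data))"
    if prompt.startswith("CODE: sum revenue"):
        return "print(sum(row['revenue'] for row in data))"
    if prompt.startswith("INTERPRET"):
        return prompt.split("-> ", 1)[1]
    return ""  # REPLAN: keep the current plan


sample = [{"revenue": 10}, {"revenue": 32}]
print(analyze(fake_llm, sample))
```

The max_steps cap is one possible way to express the "configurable control over analysis depth" mentioned in the overview; the real tool may expose this differently.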

Get Started

  1. Clone the Repository: Download the project from GitHub.
  2. Set Up Environment Variables: Configure your environment variables in a .env file.
  3. Install Dependencies: Install the required Python libraries from requirements.txt.
  4. Run the Application: Execute main.py to start the automated data science process.
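For step 2, the sketch below shows one way a .env file could be read into the process environment using only the standard library. It is an illustration under assumptions: the helper names (parse_env, load_env) and the variable name LLM_API_KEY are hypothetical, and the project itself may load its settings differently.

```python
import os
from typing import Dict


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Drop surrounding whitespace and optional quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_env(path: str = ".env") -> Dict[str, str]:
    """Read a .env file and export its values without overriding existing ones."""
    with open(path, encoding="utf-8") as handle:
        values = parse_env(handle.read())
    for key, value in values.items():
        os.environ.setdefault(key, value)
    return values


# Hypothetical file contents; the real variable names depend on the project.
print(parse_env('# API settings\nLLM_API_KEY="abc"\n'))
```

Using setdefault means values already set in the shell take precedence over the file, which is a common convention for local configuration.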

For more details, visit the GitHub Repository.