Agentic Company Researcher 🔍

May 6, 2026 · View on GitHub

Agentic Company Researcher 🔍

web ui

A multi-agent tool that generates comprehensive company research reports. The platform uses a pipeline of AI agents to gather, curate, and synthesize information about any company.

✨Check it out online! https://companyresearcher.tavily.com ✨

https://github.com/user-attachments/assets/0e373146-26a7-4391-b973-224ded3182a9

Features

Multi-Source Research: Gathers data from various sources, including company websites, news articles, financial reports, and industry analyses
AI-Powered Content Filtering: Uses Tavily's relevance scoring for content curation
Asynchronous Processing: Efficient polling-based architecture for tracking research progress
Dual Model Architecture:
- Gemini 2.5 Flash for high-context research synthesis
- GPT-5.1 for precise report formatting and editing
Modern React Frontend: Responsive UI with progress tracking and download options
Modular Architecture: Built using a pipeline of specialized research and processing nodes

Agent Framework

Research Pipeline

The platform follows an agentic framework with specialized nodes that process data sequentially:

Research Nodes:
- CompanyAnalyzer: Researches core business information
- IndustryAnalyzer: Analyzes market position and trends
- FinancialAnalyst: Gathers financial metrics and performance data
- NewsScanner: Collects recent news and developments
Processing Nodes:
- Collector: Aggregates research data from all analyzers
- Curator: Implements content filtering and relevance scoring
- Briefing: Generates category-specific summaries using Gemini 2.5 Flash
- Editor: Compiles and formats the briefings into a final report using GPT-5.1

Content Generation Architecture

The platform leverages separate models for optimal performance:

Gemini 2.5 Flash (briefing.py):
- Handles high-context research synthesis tasks
- Excels at processing and summarizing large volumes of data
- Used for generating initial category briefings
- Efficient at maintaining context across multiple documents
GPT-5.1 (editor.py):
- Specializes in precise formatting and editing tasks
- Handles markdown structure and consistency
- Superior at following exact formatting instructions
- Used for:
  - Final report compilation
  - Content deduplication
  - Markdown formatting
  - Real-time report streaming

This approach combines Gemini's strength in handling large context windows with GPT-5.1's precision in following specific formatting instructions.

Content Curation System

The platform uses a content filtering system in curator.py:

Relevance Scoring:
- Documents are scored by Tavily's AI-powered search
- A minimum threshold (default 0.4) is required to proceed
- Scores reflect relevance to the specific research query
- Higher scores indicate better matches to the research intent
Document Processing:
- Content is normalized and cleaned
- URLs are deduplicated and standardized
- Documents are sorted by relevance scores
- Research runs asynchronously in the background

Backend Architecture

The platform implements a simple polling-based communication system:

web ui

Backend Implementation:
- Uses FastAPI with async support
- Research tasks run in background
- Results are stored and accessed via REST endpoints
- Simple job status tracking
Frontend Integration:
- React frontend submits research requests
- Receives job_id for tracking
- Polls /research/{job_id}/report endpoint
- Displays final report when complete
API Endpoints:
- POST /research: Submit new research request
- GET /research/{job_id}/report: Poll for completed report
- POST /generate-pdf: Generate PDF from report content

Setup

Quick Setup (Recommended)

The easiest way to get started is using the setup script, which automatically detects and uses uv for faster Python package installation when available:

Clone the repository:

git clone https://github.com/guy-hartstein/tavily-company-research.git
cd tavily-company-research

Make the setup script executable and run it:

chmod +x setup.sh
./setup.sh

The setup script will:

Detect and use uv for faster Python package installation (if available)
Check for required Python and Node.js versions
Optionally create a Python virtual environment (recommended)
Install all dependencies (Python and Node.js)
Guide you through setting up your environment variables
Optionally start both backend and frontend servers

💡 Pro Tip: Install uv for significantly faster Python package installation:
curl -LsSf https://astral.sh/uv/install.sh | sh

You'll need the following API keys ready:

Tavily API Key
Google Gemini API Key
OpenAI API Key
Google Maps API Key
MongoDB URI (optional)

Manual Setup

If you prefer to set up manually, follow these steps:

Clone the repository:

git clone https://github.com/guy-hartstein/tavily-company-research.git
cd tavily-company-research

Install backend dependencies:

# Optional: Create and activate virtual environment
# With uv (faster - recommended if available):
uv venv .venv
source .venv/bin/activate

# Or with standard Python:
# python -m venv .venv
# source .venv/bin/activate

# Install Python dependencies
# With uv (faster):
uv pip install -r requirements.txt

# Or with pip:
# pip install -r requirements.txt

Install frontend dependencies:

cd ui
npm install

Set up Environment Variables:

This project requires two separate .env files for the backend and frontend.

For the Backend:

Create a .env file in the project's root directory and add your backend API keys:

TAVILY_API_KEY=your_tavily_key
GEMINI_API_KEY=your_gemini_key
OPENAI_API_KEY=your_openai_key

# Optional: Enable MongoDB persistence
# MONGODB_URI=your_mongodb_connection_string

For the Frontend:

Create a .env file inside the ui directory. You can copy the example file first:

cp ui/.env.development.example ui/.env

Then, open ui/.env and add your frontend environment variables:

VITE_API_URL=http://localhost:8000
VITE_GOOGLE_MAPS_API_KEY=your_google_maps_api_key_here

Docker Setup

The application can be run using Docker and Docker Compose:

Clone the repository:

git clone https://github.com/guy-hartstein/tavily-company-research.git
cd tavily-company-research

Set up Environment Variables:

The Docker setup uses two separate .env files.

For the Backend:

Create a .env file in the project's root directory with your backend API keys:

TAVILY_API_KEY=your_tavily_key
GEMINI_API_KEY=your_gemini_key
OPENAI_API_KEY=your_openai_key

# Optional: Enable MongoDB persistence
# MONGODB_URI=your_mongodb_connection_string

For the Frontend:

Create a .env file inside the ui directory. You can copy the example file first:

cp ui/.env.development.example ui/.env

Then, open ui/.env and add your frontend environment variables:

VITE_API_URL=http://localhost:8000
VITE_GOOGLE_MAPS_API_KEY=your_google_maps_api_key_here

Build and start the containers:

docker compose up --build

This will start both the backend and frontend services:

Backend API will be available at http://localhost:8000
Frontend will be available at http://localhost:5174

To stop the services:

docker compose down

Note: When updating environment variables in .env, you'll need to restart the containers:

docker compose down && docker compose up

Running the Application

Start the backend server (choose one):

# Option 1: Direct Python Module
python -m application.py

# Option 2: FastAPI with Uvicorn
uvicorn application:app --reload --port 8000

In a new terminal, start the frontend:

cd ui
npm run dev

Access the application at http://localhost:5173

Usage

Local Development

Start the backend server (choose one option):

Option 1: Direct Python Module

python -m application.py

Option 2: FastAPI with Uvicorn

# Install uvicorn if not already installed
# With uv (faster):
uv pip install uvicorn
# Or with pip:
# pip install uvicorn

# Run the FastAPI application with hot reload
uvicorn application:app --reload --port 8000

The backend will be available at:

API Endpoint: http://localhost:8000

Start the frontend development server:
```
cd ui
npm run dev
```
Access the application at http://localhost:5173

⚡ Performance Note: If you used uv during setup, you'll benefit from significantly faster package installation and dependency resolution. uv is a modern Python package manager written in Rust that can be 10-100x faster than pip.

Deployment Options

The application can be deployed to various cloud platforms. Here are some common options:

AWS Elastic Beanstalk

Install the EB CLI:
```
pip install awsebcli
```
Initialize EB application:
```
eb init -p python-3.11 tavily-research
```
Create and deploy:
```
eb create tavily-research-prod
```

Other Deployment Options

Docker: The application includes a Dockerfile for containerized deployment
Heroku: Deploy directly from GitHub with the Python buildpack
Google Cloud Run: Suitable for containerized deployment with automatic scaling

Choose the platform that best suits your needs. The application is platform-agnostic and can be hosted anywhere that supports Python web applications.

Contributing

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

Tavily for the research API
All other open-source libraries and their contributors