HOPOMOP (Hundreds Of Points Over Millions Of Pixels)

February 26, 2025

MIT License

This repository contains the code for our paper, Few-Shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks. Our approach combines foundation models and graph neural networks to perform few-shot segmentation of machinery parts, even in complex and low-data scenarios.



πŸ“Link to PaperπŸ“

To help you get started, we provide a small set of synthetic truck images in data/test_data, which you can use for quick testing. For a detailed explanation of the methodology, implementation, and results, please refer to our paper.

Installation

Option 1: Docker Devcontainer

For a streamlined and reproducible setup, we provide a Docker Devcontainer. This ensures a properly configured environment with all necessary dependencies.

Prerequisites:

  • Docker (including NVIDIA Docker for GPU acceleration)
  • VS Code with the Dev Containers extension

To get started, simply open the repository in VS Code and launch the Devcontainer. The setup process will handle all required installations automatically, including downloading the necessary weights.

Option 2: Local Installation

If you prefer running the code directly on your machine, follow these steps:

  1. Install the required dependencies:

     pip install -r requirements.txt
    
  2. Install PyTorch with CUDA 12.1 support (recommended for GPU acceleration):

     pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu121
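After the two installation steps above, a quick check can confirm that PyTorch is importable and whether it sees a GPU. This is a minimal sketch, not part of the repository; the function name `describe_torch_install` is hypothetical.

```python
import importlib.util


def describe_torch_install() -> str:
    """Return a one-line summary of the local PyTorch/CUDA setup."""
    # find_spec avoids an ImportError when PyTorch is not installed.
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed"

    import torch  # imported lazily so the check above can fail gracefully

    if torch.cuda.is_available():
        # torch.version.cuda is the CUDA version the wheel was built against.
        return f"PyTorch {torch.__version__}, CUDA {torch.version.cuda}, GPU available"
    return f"PyTorch {torch.__version__}, no GPU visible (CPU only)"


if __name__ == "__main__":
    print(describe_torch_install())
```

If the summary reports "no GPU visible" on a machine with an NVIDIA GPU, the installed wheel or the driver is the first thing to check.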
    

SuperPoint Weights (Local Installation Only)

For local setups, you need to manually download the three required SuperPoint weight files from the SuperGlue repository and place them in:

foundation_graph_segmentation/interest_point_detectors/superpoint/weights/

For further details, please refer to our paper.
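To confirm that the weights landed in the directory named above, a small check along these lines may help. It is a sketch under assumptions: the helper `find_weight_files` is hypothetical, and the `.pth` suffix is assumed rather than stated in this README. Run it from the repository root.

```python
from pathlib import Path

# Directory named in the instructions above, relative to the repository root.
WEIGHTS_DIR = Path(
    "foundation_graph_segmentation/interest_point_detectors/superpoint/weights"
)
EXPECTED_COUNT = 3  # the README asks for three weight files


def find_weight_files(weights_dir: Path, suffix: str = ".pth") -> list[str]:
    """Return the sorted names of weight files found in weights_dir.

    The ".pth" suffix is an assumption; adjust it if the files differ.
    """
    if not weights_dir.is_dir():
        return []
    return sorted(
        p.name for p in weights_dir.iterdir() if p.is_file() and p.suffix == suffix
    )


if __name__ == "__main__":
    found = find_weight_files(WEIGHTS_DIR)
    status = "OK" if len(found) >= EXPECTED_COUNT else "incomplete"
    print(f"{status}: found {len(found)} of {EXPECTED_COUNT} weight files: {found}")
```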

Usage

The repository includes three trained checkpoints stored in the checkpoints folder. Each checkpoint corresponds to a different segmentation granularity:

  • TRUCK
  • TRUCK CRANE
  • LOW

These checkpoints were trained on the synthetic truck dataset.

Each granularity has a corresponding configuration file in the config folder. The configuration files define the parameters for training and testing. Feel free to experiment with them!

Testing

To run inference on the test images, run one of the following commands, choosing the config file that matches the granularity you want to test:

python3.10 test.py --config_file config/parameters_test_TRUCK.yaml
python3.10 test.py --config_file config/parameters_test_TRUCK_CRANE.yaml
python3.10 test.py --config_file config/parameters_test_LOW.yaml

The results will be saved in the results folder.

Training

To train a model from scratch, run one of the following commands, choosing the config file for the desired granularity:

python3.10 train.py --config_file config/parameters_train_TRUCK.yaml
python3.10 train.py --config_file config/parameters_train_TRUCK_CRANE.yaml
python3.10 train.py --config_file config/parameters_train_LOW.yaml

The trained model will be saved in the checkpoints folder.
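The test and train commands above follow one naming pattern, `config/parameters_<mode>_<GRANULARITY>.yaml`, so they can be generated rather than typed. The following is a minimal sketch; `build_command` and `run_all` are hypothetical helpers, not part of the repository, and they assume the script is started from the repository root.

```python
import subprocess
import sys

GRANULARITIES = ("TRUCK", "TRUCK_CRANE", "LOW")


def build_command(mode: str, granularity: str) -> list[str]:
    """Build the command line for test.py or train.py with the matching config."""
    if mode not in ("test", "train"):
        raise ValueError(f"unknown mode: {mode!r}")
    if granularity not in GRANULARITIES:
        raise ValueError(f"unknown granularity: {granularity!r}")
    config = f"config/parameters_{mode}_{granularity}.yaml"
    # sys.executable reuses the current interpreter instead of hard-coding python3.10.
    return [sys.executable, f"{mode}.py", "--config_file", config]


def run_all(mode: str) -> None:
    """Run the chosen script once per granularity, stopping at the first failure."""
    for granularity in GRANULARITIES:
        subprocess.run(build_command(mode, granularity), check=True)


if __name__ == "__main__":
    # Only print the commands here; call run_all("test") to actually execute them.
    for granularity in GRANULARITIES:
        print(" ".join(build_command("test", granularity)))
```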

Architecture

The architecture combines SuperPoint, CLIPSeg, Segment Anything, and graph neural networks.
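This README does not describe how the graph over interest points is built; the paper is the authoritative source. Purely as an illustration of the general idea of turning detected points into a graph, the sketch below connects each 2D point to its k nearest neighbours. The function `knn_edges` and the choice of a k-nearest-neighbour graph are assumptions for this example, not the authors' confirmed method.

```python
import math


def knn_edges(points: list[tuple[float, float]], k: int) -> list[tuple[int, int]]:
    """Return directed edges (i, j) linking each point i to its k nearest neighbours."""
    edges = []
    for i, p in enumerate(points):
        # Distances from point i to every other point, nearest first.
        others = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        edges.extend((i, j) for _, j in others[:k])
    return edges


if __name__ == "__main__":
    # Hypothetical keypoint coordinates in pixels.
    keypoints = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
    print(knn_edges(keypoints, k=2))
```

An edge list of this form is the kind of input a graph neural network library typically expects, alongside one feature vector per point.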

Data

Domain Randomization

We use Blender to create synthetic images by randomizing the environment, the perspective, and the crane articulation.
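The randomization described above can be pictured as drawing one configuration per rendered image. The sketch below samples such configurations in plain Python, without Blender. Every name and range in it (`SceneSample`, `sample_scene`, the angle limits, the environment labels) is a placeholder chosen for illustration and does not come from the repository.

```python
import random
from dataclasses import dataclass


@dataclass
class SceneSample:
    """One randomized scene configuration (all fields are illustrative)."""
    environment: str
    camera_azimuth_deg: float
    camera_elevation_deg: float
    crane_joint_angles_deg: list[float]


def sample_scene(
    rng: random.Random, environments: list[str], n_joints: int = 3
) -> SceneSample:
    """Draw one scene configuration; every range below is a placeholder."""
    return SceneSample(
        environment=rng.choice(environments),
        camera_azimuth_deg=rng.uniform(0.0, 360.0),
        camera_elevation_deg=rng.uniform(5.0, 60.0),
        crane_joint_angles_deg=[rng.uniform(-45.0, 45.0) for _ in range(n_joints)],
    )


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the same scenes can be regenerated
    for _ in range(3):
        print(sample_scene(rng, ["forest", "yard", "street"]))
```

Seeding the generator keeps a dataset reproducible: the same seed yields the same sequence of scene configurations.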

Sample

Rendered video of the synthetic truck with changing perspective, background, lighting, and articulation. The right side shows the rendering; the left side shows the segmentation overlay.

Results

Few-shot evaluation

Qualitative results on the synthetic truck dataset for different granularities and sample sizes.

Simulation to Real Transfer

Trained on 10 synthetic images. Although the synthetic truck-mounted loading crane differs from the real one, the model transfers its knowledge to the real world.

Semi-Supervised Video Segmentation

Using the DAVIS 2017 dataset. Trained on the first, middle, and last frame of each sequence.

Segmentation Classes:

  • One Class
  • Two Classes
  • Multi Classes
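Picking the first, middle, and last frame of a video as training samples, as described above, can be sketched as follows. The helper `select_training_frames` is hypothetical, and taking the middle frame as index `num_frames // 2` is an assumption; the repository may round differently.

```python
def select_training_frames(num_frames: int) -> list[int]:
    """Return the indices of the first, middle and last frame of a sequence."""
    if num_frames < 1:
        raise ValueError("a sequence needs at least one frame")
    candidates = [0, num_frames // 2, num_frames - 1]
    # Short sequences can yield duplicates; keep each index once, in order.
    return sorted(set(candidates))


if __name__ == "__main__":
    print(select_training_frames(5))   # [0, 2, 4]
    print(select_training_frames(80))  # [0, 40, 79]
```

All remaining frames of the sequence are then left for evaluation.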

Meet the Authors 👩‍🔬

This work was conducted at the AIT Austrian Institute of Technology 🇦🇹 in the Center for Vision, Automation & Control 🏗️.

Name & Email | AIT Research Profile | Google Scholar
👨‍🔬 Michael Schwingshackl, 📧 Michael.Schwingshackl@ait.ac.at | 🔗 Profile | 🔗 Scholar
👨‍🔬 Fabio Francisco Oberweger, 📧 Fabio.Oberweger@ait.ac.at | 🔗 Profile | 🔗 Scholar
👨‍🔬 Markus Murschitz, 📧 Markus.Murschitz@ait.ac.at | 🔗 Profile | 🔗 Scholar

Full Dataset Access

The provided test images allow you to evaluate the approach on a small sample of our synthetic truck dataset. If you require access to the full dataset for research or further experiments, or have any other inquiries, please contact us.

Citing HOPOMOP

If you use HOPOMOP in your research, please cite it with the following BibTeX entry:

@InProceedings{Schwingshackl_2025_WACV,
    author    = {Schwingshackl, Michael and Oberweger, Fabio F. and Murschitz, Markus},
    title     = {Few-Shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks},
    booktitle = {Proceedings of the Winter Conference on Applications of Computer Vision (WACV)},
    month     = {February},
    year      = {2025},
    pages     = {1989-1998}
}