Temporal Logic Video (TLV) Dataset
July 15, 2024
Synthetic and real video dataset with temporal logic annotation
Explore the docs » · NSVS-TL Project Webpage · NSVS-TL Source Code
Overview
The Temporal Logic Video (TLV) Dataset addresses the scarcity of state-of-the-art video datasets for long-horizon, temporally extended activity and object detection. It comprises two main components:
- Synthetic datasets: Generated by concatenating static images from established computer vision datasets (COCO and ImageNet), allowing for the introduction of a wide range of Temporal Logic (TL) specifications.
- Real-world datasets: Based on open-source autonomous vehicle (AV) driving datasets, specifically NuScenes and Waymo.
Table of Contents
- Dataset Composition
- Dataset (Release)
- Installation
- Usage
- Data Generation
- Contribution Guidelines
- License
- Acknowledgments
Dataset Composition
Synthetic Datasets
- Source: COCO and ImageNet
- Purpose: Introduce artificial Temporal Logic specifications
- Generation Method: Image stitching from static datasets
Real-world Datasets
- Sources: NuScenes and Waymo
- Purpose: Provide real-world autonomous vehicle scenarios
- Annotation: Temporal Logic specifications added to existing data
Dataset
Although we provide source code for generating datasets from different types of data sources, we also release dataset v1 as a proof of concept.
Dataset Structure
The v1 dataset is offered as serialized objects (.pkl files), each containing a set of frames with annotations. You can download it from our dataset repository on Hugging Face.
File Naming Convention
\<tlv_data_type\>:source:\<datasource\>-number_of_frames:\<number_of_frames\>-\<uuid\>.pkl
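The naming convention above can be split back into its fields with a regular expression. The following is a minimal sketch, not part of the released code: the helper name `parse_tlv_filename`, the `TLVFileName` tuple, and the example file name are hypothetical, and the pattern assumes the separators appear exactly as documented.

```python
import re
from typing import NamedTuple


class TLVFileName(NamedTuple):
    """Fields encoded in a TLV file name (hypothetical helper type)."""

    tlv_data_type: str
    datasource: str
    number_of_frames: int
    uuid: str


# Mirrors the documented convention:
# <tlv_data_type>:source:<datasource>-number_of_frames:<number_of_frames>-<uuid>.pkl
_PATTERN = re.compile(
    r"^(?P<tlv_data_type>.+?):source:(?P<datasource>.+?)"
    r"-number_of_frames:(?P<number_of_frames>\d+)-(?P<uuid>.+)\.pkl$"
)


def parse_tlv_filename(name: str) -> TLVFileName:
    """Split a file name into its fields, or raise ValueError if it does not match."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a TLV file name: {name!r}")
    return TLVFileName(
        tlv_data_type=match["tlv_data_type"],
        datasource=match["datasource"],
        number_of_frames=int(match["number_of_frames"]),
        uuid=match["uuid"],
    )


# Made-up file name, used only to illustrate the convention.
example = "example_type:source:coco-number_of_frames:25-123e4567-e89b-12d3-a456-426614174000.pkl"
parsed = parse_tlv_filename(example)
```

The UUID group is matched last and greedily because UUIDs contain hyphens themselves, so splitting the name on `-` would cut the UUID apart.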
Object Attributes
Each serialized object contains the following attributes:
- ground_truth: Boolean indicating whether the dataset contains ground truth labels
- ltl_formula: Temporal logic formula applied to the dataset
- proposition: Set of propositions used in ltl_formula
- number_of_frame: Total number of frames in the dataset
- frames_of_interest: Frames of interest that satisfy the ltl_formula
- labels_of_frames: Labels for each frame
- images_of_frames: Image data for each frame
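A serialized object can be inspected with Python's standard `pickle` module. The sketch below is an assumption-laden illustration: the helper names (`load_tlv_object`, `get_field`, `summarize`) are hypothetical, and because this README does not say whether each object is a plain dictionary or a class instance, `get_field` handles both. If the objects are instances of a class defined in this repository, the package must be installed for unpickling to succeed.

```python
import pickle
from pathlib import Path
from typing import Any, Dict, Union

# Attribute names as listed in this README.
TLV_FIELDS = (
    "ground_truth",
    "ltl_formula",
    "proposition",
    "number_of_frame",
    "frames_of_interest",
    "labels_of_frames",
    "images_of_frames",
)


def load_tlv_object(path: Union[str, Path]) -> Any:
    """Unpickle one file. Only unpickle files from a source you trust."""
    with open(path, "rb") as f:
        return pickle.load(f)


def get_field(obj: Any, name: str) -> Any:
    """Read a field whether the object is a dict or exposes attributes."""
    if isinstance(obj, dict):
        return obj[name]
    return getattr(obj, name)


def summarize(obj: Any) -> Dict[str, Any]:
    """Collect every listed field except the (large) image data."""
    return {
        name: get_field(obj, name)
        for name in TLV_FIELDS
        if name != "images_of_frames"
    }
```

Calling `summarize(load_tlv_object(path))` on a downloaded file gives a quick view of the formula, propositions, and frame labels without printing the image arrays.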
After downloading, the dataset has the following structure:
tlv-dataset-v1/
├── tlv_real_dataset/
│   ├── prop1Uprop2/
│   └── (prop1&prop2)Uprop3/
└── tlv_synthetic_dataset/
    ├── Fprop1/
    ├── Gprop1/
    ├── prop1&prop2/
    ├── prop1Uprop2/
    └── (prop1&prop2)Uprop3/
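Given this layout, a short script can tally how many serialized files each specification folder holds. This is a sketch under assumptions: `count_files_per_spec` is a hypothetical helper, and it assumes the `.pkl` files sit somewhere below each specification folder (it searches recursively because the README does not show the levels beneath them).

```python
from collections import Counter
from pathlib import Path
from typing import Union


def count_files_per_spec(dataset_root: Union[str, Path]) -> Counter:
    """Count .pkl files per (split folder, specification folder) pair."""
    counts: Counter = Counter()
    root = Path(dataset_root)
    # First level: tlv_real_dataset/ and tlv_synthetic_dataset/
    for split_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Second level: one folder per TL specification, e.g. Fprop1/
        for spec_dir in sorted(p for p in split_dir.iterdir() if p.is_dir()):
            key = (split_dir.name, spec_dir.name)
            counts[key] = sum(1 for _ in spec_dir.rglob("*.pkl"))
    return counts
```

For example, `count_files_per_spec("tlv-dataset-v1")` returns a counter keyed by pairs such as `("tlv_synthetic_dataset", "Fprop1")`.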
Dataset Statistics
- Total Number of Frames
| Ground Truth TL Specifications | Synthetic TLV Dataset: COCO | Synthetic TLV Dataset: ImageNet | Real TLV Dataset: Waymo | Real TLV Dataset: NuScenes |
|---|---|---|---|---|
| Eventually Event A | - | 15,750 | - | - |
| Always Event A | - | 15,750 | - | - |
| Event A And Event B | 31,500 | - | - | - |
| Event A Until Event B | 15,750 | 15,750 | 8,736 | 19,808 |
| (Event A And Event B) Until Event C | 5,789 | - | 7,459 | 7,459 |
- Total Number of Datasets
| Ground Truth TL Specifications | Synthetic TLV Dataset: COCO | Synthetic TLV Dataset: ImageNet | Real TLV Dataset: Waymo | Real TLV Dataset: NuScenes |
|---|---|---|---|---|
| Eventually Event A | - | 60 | - | - |
| Always Event A | - | 60 | - | - |
| Event A And Event B | 120 | - | - | - |
| Event A Until Event B | 60 | 60 | 45 | 494 |
| (Event A And Event B) Until Event C | 97 | - | 30 | 186 |
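To make the specification names in the tables concrete, the sketch below evaluates "Eventually" (F), "Always" (G), and "Until" (U) over a sequence of per-frame label sets. It uses the standard finite-trace reading of these operators and is an assumption for illustration only; it is not the procedure used to compute the dataset's ground truth, and the label names are invented.

```python
from typing import AbstractSet, Callable, Sequence

# One frame is modeled as the set of labels present in it (assumed representation).
Frame = AbstractSet[str]
Predicate = Callable[[Frame], bool]


def has(label: str) -> Predicate:
    """Proposition that holds on a frame containing `label`."""
    return lambda frame: label in frame


def both(p: Predicate, q: Predicate) -> Predicate:
    """Conjunction of two propositions on the same frame (the '&' operator)."""
    return lambda frame: p(frame) and q(frame)


def eventually(trace: Sequence[Frame], p: Predicate) -> bool:
    """F p: p holds on at least one frame."""
    return any(p(frame) for frame in trace)


def always(trace: Sequence[Frame], p: Predicate) -> bool:
    """G p: p holds on every frame."""
    return all(p(frame) for frame in trace)


def until(trace: Sequence[Frame], p: Predicate, q: Predicate) -> bool:
    """p U q: q holds on some frame, and p holds on every frame before it."""
    for frame in trace:
        if q(frame):
            return True
        if not p(frame):
            return False
    return False  # q never occurred


# Toy four-frame video with invented labels.
trace = [{"car"}, {"car"}, {"car", "person"}, {"person"}]
car_until_person = until(trace, has("car"), has("person"))
```

A compound specification such as "(Event A And Event B) Until Event C" is expressed by passing `both(has("a"), has("b"))` as the first argument of `until`.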
Installation
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip build
python -m pip install --editable ."[dev, test]"
Prerequisites
- ImageNet (ILSVRC 2017):
  ILSVRC/
  ├── Annotations/
  ├── Data/
  ├── ImageSets/
  └── LOC_synset_mapping.txt
- COCO (2017):
  COCO/
  └── 2017/
      ├── annotations/
      ├── train2017/
      └── val2017/
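Before running the generators, it can help to confirm that the source datasets are laid out as listed. The following is a small sketch, with `missing_entries` as a hypothetical helper; the expected entries are copied from the layouts in this section and nothing deeper in the trees is checked.

```python
from pathlib import Path
from typing import Iterable, List, Union

# Entries relative to each dataset root, taken from the layouts in this section.
IMAGENET_LAYOUT = ("Annotations", "Data", "ImageSets", "LOC_synset_mapping.txt")
COCO_LAYOUT = ("2017/annotations", "2017/train2017", "2017/val2017")


def missing_entries(root: Union[str, Path], expected: Iterable[str]) -> List[str]:
    """Return the expected files or folders that do not exist under `root`."""
    root_path = Path(root)
    return [entry for entry in expected if not (root_path / entry).exists()]
```

An empty list from `missing_entries("../ILSVRC", IMAGENET_LAYOUT)` or `missing_entries("../COCO", COCO_LAYOUT)` means the top-level layout matches.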
Usage
The data loader and the synthetic data generator accept the configuration options listed below.
Data Loader Configuration
- data_root_dir: Root directory of the dataset
- mapping_to: Label mapping scheme (default: "coco")
- save_dir: Output directory for processed data
Synthetic Data Generator Configuration
- initial_number_of_frame: Starting frame count per video
- max_number_frame: Maximum frame count per video
- number_video_per_set_of_frame: Videos to generate per frame set
- increase_rate: Frame count increment rate
- ltl_logic: Temporal Logic specification (e.g., "F prop1", "G prop1")
- save_images: Boolean flag for saving individual frames
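The frame-related options together determine how many videos and frames one generator run produces. The sketch below shows the arithmetic under two assumptions that this README does not confirm: that `increase_rate` is an additive step between frame counts, and that `max_number_frame` is included. The function names and the numeric values are illustrative, not the project's documented defaults.

```python
from typing import List, Tuple


def frame_counts(
    initial_number_of_frame: int, max_number_frame: int, increase_rate: int
) -> List[int]:
    """Frame counts visited from the initial value up to and including the maximum."""
    return list(range(initial_number_of_frame, max_number_frame + 1, increase_rate))


def generation_totals(
    initial_number_of_frame: int,
    max_number_frame: int,
    increase_rate: int,
    number_video_per_set_of_frame: int,
) -> Tuple[int, int]:
    """Return (total videos, total frames) for one generator run."""
    counts = frame_counts(initial_number_of_frame, max_number_frame, increase_rate)
    total_videos = len(counts) * number_video_per_set_of_frame
    total_frames = sum(counts) * number_video_per_set_of_frame
    return total_videos, total_frames


# Illustrative values only: 25, 50, ..., 500 frames with 3 videos per frame count.
videos, frames = generation_totals(25, 500, 25, 3)
```

With these illustrative values the totals are 60 videos and 15,750 frames, the same figures the statistics tables report for several synthetic rows. That agreement is suggestive only and does not establish the settings actually used for v1.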
Data Generation
COCO Synthetic Data Generation
python3 run_scripts/run_synthetic_tlv_coco.py --data_root_dir "../COCO/2017" --save_dir "<output_dir>"
ImageNet Synthetic Data Generation
python3 run_synthetic_tlv_imagenet.py --data_root_dir "../ILSVRC" --save_dir "<output_dir>"
Note: The ImageNet generator does not support LTL formulae that use the '&' operator.
License
This project is licensed under the MIT License. See the LICENSE file for details.
Citation
If you find this repo useful, please cite our paper:
@inproceedings{Choi_2024_ECCV,
author={Choi, Minkyu and Goel, Harsh and Omama, Mohammad and Yang, Yunhao and Shah, Sahil and Chinchali, Sandeep},
title={Towards Neuro-Symbolic Video Understanding},
booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
month={September},
year={2024}
}