Temporal Logic Video (TLV) Dataset

July 15, 2024

Synthetic and real video dataset with temporal logic annotation

NSVS-TL Project Webpage · NSVS-TL Source Code

Overview

The Temporal Logic Video (TLV) Dataset addresses the scarcity of state-of-the-art video datasets for long-horizon, temporally extended activity and object detection. It comprises two main components:

Synthetic datasets: Generated by concatenating static images from established computer vision datasets (COCO and ImageNet), allowing for the introduction of a wide range of Temporal Logic (TL) specifications.
Real-world datasets: Based on open-source autonomous vehicle (AV) driving datasets, specifically NuScenes and Waymo.

Dataset Composition
Dataset (Release)
Installation
Usage
Data Generation
Contribution Guidelines
License
Acknowledgments

Dataset Composition

Synthetic Datasets

Source: COCO and ImageNet
Purpose: Introduce artificial Temporal Logic specifications
Generation Method: Image stitching from static datasets

Real-world Datasets

Sources: NuScenes and Waymo
Purpose: Provide real-world autonomous vehicle scenarios
Annotation: Temporal Logic specifications added to existing data

Dataset

Although we provide source code to generate datasets from different types of data sources, we release a v1 dataset as a proof of concept. The data is offered as serialized objects, each containing a set of frames with annotations. You can download the dataset from our dataset repository on Hugging Face.

File Naming Convention

<tlv_data_type>:source:<datasource>-number_of_frames:<number_of_frames>-<uuid>.pkl
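
As an illustration of the convention above, the following sketch splits a file name into its fields with a regular expression. The helper `parse_tlv_filename`, the `TLVFileName` tuple, and the example file name are hypothetical and are not part of the released code; the sketch also assumes the uuid is everything between the frame count and the `.pkl` suffix.

```python
import re
from typing import NamedTuple


class TLVFileName(NamedTuple):
    tlv_data_type: str
    datasource: str
    number_of_frames: int
    uuid: str


# Pattern mirrors the naming convention; treating the remainder before ".pkl"
# as the uuid is an assumption.
_PATTERN = re.compile(
    r"^(?P<tlv_data_type>.+?):source:(?P<datasource>.+?)"
    r"-number_of_frames:(?P<number_of_frames>\d+)"
    r"-(?P<uuid>.+)\.pkl$"
)


def parse_tlv_filename(name: str) -> TLVFileName:
    """Split a TLV file name into its fields, or raise ValueError."""
    match = _PATTERN.match(name)
    if match is None:
        raise ValueError(f"not a TLV file name: {name!r}")
    return TLVFileName(
        tlv_data_type=match["tlv_data_type"],
        datasource=match["datasource"],
        number_of_frames=int(match["number_of_frames"]),
        uuid=match["uuid"],
    )


# Hypothetical file name, made up for illustration only.
example = "synthetic:source:coco-number_of_frames:25-1234abcd.pkl"
parsed = parse_tlv_filename(example)
```

Parsing the name this way lets you filter files by source or frame count without unpickling them.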

Object Attributes

Each serialized object contains the following attributes:

ground_truth: Boolean indicating whether the dataset contains ground truth labels
ltl_formula: Temporal logic formula applied to the dataset
proposition: Set of propositions used in ltl_formula
number_of_frame: Total number of frames in the dataset
frames_of_interest: Frames that satisfy ltl_formula
labels_of_frames: Labels for each frame
images_of_frames: Image data for each frame
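
A minimal sketch of reading one of these serialized objects with Python's standard `pickle` module follows. Whether the released files store a plain dictionary or an instance of a project class is not stated here, so the hypothetical helper `get_field` handles both; if the files hold class instances, the package defining that class must be importable before unpickling. The stand-in record and its values are made up so that the sketch runs without the dataset.

```python
import pickle
import tempfile
from pathlib import Path


def get_field(record, name):
    """Read a field whether the record is a dict or an object with attributes."""
    if isinstance(record, dict):
        return record[name]
    return getattr(record, name)


def summarize(path):
    """Load one serialized object and return a few of its attributes."""
    # Only unpickle files from a source you trust: pickle can execute code.
    with open(path, "rb") as handle:
        record = pickle.load(handle)
    return {
        "ltl_formula": get_field(record, "ltl_formula"),
        "proposition": get_field(record, "proposition"),
        "number_of_frame": get_field(record, "number_of_frame"),
        "num_frames_of_interest": len(get_field(record, "frames_of_interest")),
    }


# Stand-in record with made-up values (assumption: dict-like layout).
fake = {
    "ground_truth": True,
    "ltl_formula": "prop1 U prop2",
    "proposition": ["car", "truck"],
    "number_of_frame": 3,
    "frames_of_interest": [[0, 1, 2]],
    "labels_of_frames": [["car"], ["car"], ["truck"]],
    "images_of_frames": [None, None, None],
}
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "sample.pkl"
    sample_path.write_bytes(pickle.dumps(fake))
    summary = summarize(sample_path)
```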

The dataset is organized as follows:

tlv-dataset-v1/
├── tlv_real_dataset/
│   ├── prop1Uprop2/
│   └── (prop1&prop2)Uprop3/
└── tlv_synthetic_dataset/
    ├── Fprop1/
    ├── Gprop1/
    ├── prop1&prop2/
    ├── prop1Uprop2/
    └── (prop1&prop2)Uprop3/
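
The folder names encode the formula type: F (eventually), G (always), & (and), and U (until). The sketch below shows how these operators can be checked over a finite sequence of per-frame label sets. It uses standard finite-trace LTL semantics evaluated from the first frame, which is an assumption and may differ from the conventions used to annotate the dataset; the function names and labels are illustrative only.

```python
from typing import Callable, Sequence, Set

Frame = Set[str]
Pred = Callable[[Frame], bool]


def prop(name: str) -> Pred:
    """Proposition that holds when the label appears in the frame."""
    return lambda frame: name in frame


def both(a: Pred, b: Pred) -> Pred:
    """a & b: both propositions hold in the same frame."""
    return lambda frame: a(frame) and b(frame)


def eventually(p: Pred, frames: Sequence[Frame]) -> bool:
    """F p: p holds in at least one frame."""
    return any(p(f) for f in frames)


def always(p: Pred, frames: Sequence[Frame]) -> bool:
    """G p: p holds in every frame."""
    return all(p(f) for f in frames)


def until(p: Pred, q: Pred, frames: Sequence[Frame]) -> bool:
    """p U q: q holds at some frame, and p holds at every frame before it."""
    for frame in frames:
        if q(frame):
            return True
        if not p(frame):
            return False
    return False


# Made-up labels for a three-frame video.
frames = [{"car"}, {"car", "person"}, {"truck"}]
f_truck = eventually(prop("truck"), frames)
g_car = always(prop("car"), frames)
car_until_truck = until(prop("car"), prop("truck"), frames)
pair_until_truck = until(both(prop("car"), prop("person")), prop("truck"), frames)
```

Here `car_until_truck` holds because "car" is present in every frame before "truck" first appears, while `pair_until_truck` fails at the first frame, where "person" is missing.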

Dataset Statistics

Total Number of Frames

Ground Truth TL Specification	Synthetic: COCO	Synthetic: ImageNet	Real: Waymo	Real: NuScenes
Eventually Event A	-	15,750	-	-
Always Event A	-	15,750	-	-
Event A And Event B	31,500	-	-	-
Event A Until Event B	15,750	15,750	8,736	19,808
(Event A And Event B) Until Event C	5,789	-	7,459	7,459

Total Number of Datasets

Ground Truth TL Specification	Synthetic: COCO	Synthetic: ImageNet	Real: Waymo	Real: NuScenes
Eventually Event A	-	60	-	-
Always Event A	-	60	-	-
Event A And Event B	120	-	-	-
Event A Until Event B	60	60	45	494
(Event A And Event B) Until Event C	97	-	30	186

Installation

python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip build
python -m pip install --editable ."[dev, test]"

Prerequisites

ImageNet (ILSVRC 2017):

ILSVRC/
├── Annotations/
├── Data/
├── ImageSets/
└── LOC_synset_mapping.txt

COCO (2017):

COCO/
└── 2017/
    ├── annotations/
    ├── train2017/
    └── val2017/
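
Before running a generator, it can help to confirm that the source data matches the layouts above. The following sketch lists any expected entries that are missing under a dataset root. The helper `missing_entries` and the two layout constants are hypothetical names introduced for this example, and the demo uses a temporary directory rather than real data.

```python
import tempfile
from pathlib import Path

# Expected entries relative to each dataset root, taken from the trees above.
IMAGENET_LAYOUT = ["Annotations", "Data", "ImageSets", "LOC_synset_mapping.txt"]
COCO_LAYOUT = ["annotations", "train2017", "val2017"]


def missing_entries(root, expected):
    """Return the expected files or directories that are absent under root."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]


# Demo on a temporary directory that lacks val2017.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "annotations").mkdir()
    (Path(tmp) / "train2017").mkdir()
    demo_missing = missing_entries(tmp, COCO_LAYOUT)
```

For real data you would call, for example, `missing_entries("../COCO/2017", COCO_LAYOUT)` or `missing_entries("../ILSVRC", IMAGENET_LAYOUT)` and expect an empty list.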

Usage

The data loaders and the synthetic data generator are configured through the options below.

Data Loader Configuration

data_root_dir: Root directory of the dataset
mapping_to: Label mapping scheme (default: "coco")
save_dir: Output directory for processed data

Synthetic Data Generator Configuration

initial_number_of_frame: Starting frame count per video
max_number_frame: Maximum frame count per video
number_video_per_set_of_frame: Videos to generate per frame set
increase_rate: Frame count increment rate
ltl_logic: Temporal Logic specification (e.g., "F prop1", "G prop1")
save_images: Boolean flag for saving individual frames
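
To make the interaction of the frame-count options concrete, the sketch below enumerates the frame count of every generated video. It assumes that increase_rate is an additive step applied from initial_number_of_frame up to and including max_number_frame, which this README does not confirm; the function `frame_schedule` and the values passed to it are illustrative only.

```python
def frame_schedule(initial_number_of_frame, max_number_frame, increase_rate,
                   number_video_per_set_of_frame):
    """List the frame count of every video, assuming an additive step."""
    if increase_rate <= 0:
        raise ValueError("increase_rate must be positive")
    counts = []
    frames = initial_number_of_frame
    while frames <= max_number_frame:
        # Each frame count is repeated once per video in the set.
        counts.extend([frames] * number_video_per_set_of_frame)
        frames += increase_rate
    return counts


# Illustrative values only; the generator's actual defaults are not stated here.
schedule = frame_schedule(25, 500, 25, 3)
print(len(schedule), sum(schedule))  # prints "60 15750"
```

With these values the schedule produces 60 videos and 15,750 frames, which is consistent with the counts reported in the statistics tables for several synthetic sets, although that does not confirm these are the settings actually used.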

Data Generation

COCO Synthetic Data Generation

python3 run_scripts/run_synthetic_tlv_coco.py --data_root_dir "../COCO/2017" --save_dir "<output_dir>"

ImageNet Synthetic Data Generation

python3 run_synthetic_tlv_imagenet.py --data_root_dir "../ILSVRC" --save_dir "<output_dir>"

Note: The ImageNet generator does not support LTL formulae that contain '&'.

@inproceedings{Choi_2024_ECCV,
  author={Choi, Minkyu and Goel, Harsh and Omama, Mohammad and Yang, Yunhao and Shah, Sahil and Chinchali, Sandeep},
  title={Towards Neuro-Symbolic Video Understanding},
  booktitle={Proceedings of the European Conference on Computer Vision (ECCV)},
  month={September},
  year={2024}
}
