Affect-aware Cross-Domain Recommendation for Art Therapy via Music Preference Elicitation
August 1, 2025
Overview
The official PyTorch implementation of the paper "Affect-aware Cross-Domain Recommendation for Art Therapy via Music Preference Elicitation", published in the Proceedings of the 19th ACM Conference on Recommender Systems (RecSys '25). Read on ACM, arXiv, ORBilu.
Our three affect-aware cross-domain VA RecSys engines driven by music preferences, Mozart, Haydn, and Salieri, are named after three iconic composers who shaped the history of Western music: Wolfgang Amadeus Mozart, Joseph Haydn, and Antonio Salieri.
Proposed architectures. From left to right: Affect-aware contrastive alignment (Mozart), Affective Space Search (Haydn), and Multi-Modal alignment with LLM and VLM (Salieri)
Joint embeddings
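As a rough illustration of the affective space search idea behind Haydn, the sketch below ranks paintings by their closeness to a song in valence-arousal (V-A) space. The similarity formula, the function names, and the toy annotations are all assumptions made for illustration; the metric actually used in haydn_engine.py may differ.

```python
import math

def va_similarity(song_va, painting_va):
    """Similarity in V-A space, here 1 / (1 + Euclidean distance) (an assumed metric)."""
    return 1.0 / (1.0 + math.dist(song_va, painting_va))

def nearest_paintings(song_va, paintings, k=3):
    """Return the IDs of the k paintings most similar to the song in V-A space."""
    ranked = sorted(
        paintings.items(),
        key=lambda item: va_similarity(song_va, item[1]),
        reverse=True,
    )
    return [painting_id for painting_id, _ in ranked[:k]]

# Hypothetical (valence, arousal) annotations, not taken from the dataset.
paintings = {"p1": (0.8, 0.6), "p2": (0.2, 0.3), "p3": (0.6, 0.5)}
print(nearest_paintings((0.75, 0.55), paintings, k=2))
```

Because this kind of search needs only the V-A annotations stored in the metadata CSVs, it requires no training.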
Project Structure
mozart-crossmodal/
├── data/  # Datasets and precomputed embeddings
│   ├── music/  # Music-related data
│   │   ├── audio/  # MP3 audio files
│   │   ├── heal_audio/  # MP3 audio files selected for preference elicitation
│   │   ├── features/  # Precomputed acoustic features (CSVs)
│   │   ├── filtered_songs.csv  # Metadata (song_id, valence, arousal)
│   │   ├── music_embeddings_258D_normalized.csv  # 258D embeddings (normalized)
│   │   └── music_features_with_embeddings.csv  # Combined features & embeddings
│   ├── paintings/  # Painting-related data
│   │   ├── images/  # Painting image files (JPG)
│   │   ├── heal_paintings/  # Painting image files (JPG) selected for art therapy
│   │   ├── features/  # Precomputed painting features
│   │   ├── painting_data.csv  # Metadata (ID, valence, arousal)
│   │   ├── painting_embeddings_258D_normalized.csv  # 258D embeddings (normalized)
│   │   └── resnet_similarity_matrix.csv  # 63 × 63 painting similarity matrix filtered by expert
│   ├── similarity_matrix.csv  # 909 × 4105 similarity matrix from contrastive alignment
│   ├── heal_similarity_matrix_haydn.csv  # 239 × 63 similarity matrix filtered by expert
│   ├── haydn_similarity_matrix.csv  # 909 × 4105 similarity matrix from V-A vectors
│   ├── heal_similarity_matrix_mozart.csv  # 239 × 63 similarity matrix filtered by expert
│   ├── salieri_similarity_matrix.csv  # 909 × 4105 similarity matrix from Salieri alignment
│   ├── heal_salieri_similarity_matrix.csv  # 239 × 63 similarity matrix filtered by expert
│   └── joint_embeddings.csv  # 128D joint embeddings (music-art alignment)
│
├── feature_extraction/  # Feature extraction scripts
│   ├── music/
│   │   ├── feature_extraction_music.py  # Extracts MERT & acoustic features
│   │   └── reduce_normalize_music.py  # Dimensionality reduction & normalization
│   ├── painting/
│   │   ├── feature_extraction_painting.py  # ResNet-based feature extraction
│   │   └── reduce_normalize_painting.py  # Dimensionality reduction & normalization
│   └── salieri/  # Salieri feature extraction for cross-modal alignment
│       ├── music_salieri_features.py  # Extracts Salieri (GPT-4o) features for music
│       ├── painting_vlm_features.py  # Extracts VLM (GPT-4V) features for paintings
│       ├── multi_modal_music.py  # Combines MERT + VLM features, reduces to 256D
│       ├── multi_modal_painting.py  # Combines ResNet + Salieri features, reduces to 256D
│       └── similarity_computation.py  # Computes cross-modal similarity matrix (S_LV)
├── flask/  # Backend API using Flask
│   ├── mozart_engine.py  # Core recommendation engine
│   ├── mozart.py  # API server for recommendations
│   ├── engine.py  # Generic engine class
│   ├── haydn_engine.py  # Haydn baseline
│   ├── haydn.py  # Haydn API
│   ├── salieri_engine.py  # Large Language Model (Salieri) based RecSys engine
│   ├── salieri.py  # API endpoint for Salieri-based tasks
│   ├── start.sh  # Starts the Flask server
│   ├── stop.sh  # Stops the Flask server
│   ├── restart.sh  # Restarts the Flask server
│   ├── status.sh  # Checks server status
│   └── README.md  # Flask setup and usage
├── contrastive_alignment.py  # Contrastive learning for joint embeddings
├── run_feature_pipeline.py  # Orchestrates feature extraction
├── README.md  # Project overview, setup & usage
├── requirements.txt  # Dependencies
├── LICENSE  # Open-source license (CC BY-NC)
├── .gitignore  # Excludes temp files & datasets
├── .env  # Store private keys here (OpenAI key)
├── figs/  # Figures
└── app/  # Web application for user study
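The precomputed similarity matrices listed above can be turned into recommendations with a simple ranking step. The following is a minimal sketch, assuming rows index songs and columns index paintings (consistent with the 239 × 63 and 63 × 63 shapes, but not confirmed by the repository). The function name and the mean-aggregation rule are hypothetical, and the actual engines in flask/ may score paintings differently.

```python
def recommend_paintings(similarity, liked_songs, k=3):
    """Rank painting columns by their mean similarity to the liked song rows."""
    if not liked_songs:
        raise ValueError("liked_songs must not be empty")
    n_paintings = len(similarity[0])
    # Average each painting's similarity over the songs the user liked.
    scores = [
        sum(similarity[s][p] for s in liked_songs) / len(liked_songs)
        for p in range(n_paintings)
    ]
    ranked = sorted(range(n_paintings), key=lambda p: scores[p], reverse=True)
    return ranked[:k]

# Toy 3 songs x 4 paintings matrix standing in for a real 909 x 4105 one.
toy = [
    [0.9, 0.1, 0.4, 0.2],
    [0.8, 0.2, 0.6, 0.1],
    [0.1, 0.9, 0.3, 0.7],
]
print(recommend_paintings(toy, liked_songs=[0, 1], k=2))
```

To use a real matrix, load the CSV into a list of rows first. Whether the files carry a header row or an index column is not stated here, so check the layout before parsing.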
Setup Instructions
Install Requirements
To install the required dependencies, run:
pip install -r requirements.txt
Download Data
Download data.zip, which contains all pre-trained models and features, with gdown:
gdown "https://drive.google.com/uc?export=download&id=14aehmvmf-MAwRUZqOOBS5xsHHNBR0uPC"
Extract data.zip in the project root (mozart-crossmodal/):
unzip -q data.zip
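After extraction, a quick check can confirm that the archive landed where the code expects it. This is a small sketch, assuming data.zip unpacks into a data/ folder laid out as in the project structure. The helper name and the short list of paths are illustrative choices, not part of the repository.

```python
from pathlib import Path

# A few paths taken from the project structure; extend the list as needed.
EXPECTED = [
    "data/music/filtered_songs.csv",
    "data/paintings/painting_data.csv",
    "data/similarity_matrix.csv",
    "data/joint_embeddings.csv",
]

def missing_data_files(root="."):
    """Return the expected data files that are absent under root."""
    root = Path(root)
    return [rel for rel in EXPECTED if not (root / rel).is_file()]

if __name__ == "__main__":
    missing = missing_data_files()
    print("All expected files found." if not missing else f"Missing: {missing}")
```

Run it from the project root. An empty result means the listed files are in place.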
Usage
All trained models and features are provided via the download. If you want to extract features from scratch, run:
python3 run_feature_pipeline.py
To train the contrastive alignment model, run:
python3 contrastive_alignment.py
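For readers unfamiliar with contrastive alignment, the sketch below shows a generic InfoNCE-style loss in plain Python. It is not the code in contrastive_alignment.py, which is written in PyTorch. The index-based pairing of music[i] with paintings[i], the one-directional (music to painting) formulation, and the temperature value are assumptions; how the repository selects affect-aware positive pairs is not shown here.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))

def info_nce(music, paintings, temperature=0.1):
    """Mean InfoNCE loss where music[i] and paintings[i] form the positive pair."""
    total = 0.0
    for i, m in enumerate(music):
        logits = [cosine(m, p) / temperature for p in paintings]
        # Log-sum-exp with max subtraction for numerical stability.
        peak = max(logits)
        log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_norm - logits[i]
    return total / len(music)

# Toy 2D embeddings: the loss is low when pairs match and high when they are swapped.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)
```

Minimizing such a loss pulls matched music and painting embeddings together in the joint space and pushes mismatched ones apart.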
See the Flask instructions (flask/README.md) for setting up the services. For further usage terms, please see the LICENSE file.