Affect-aware Cross-Domain Recommendation for Art Therapy via Music Preference Elicitation

August 1, 2025

CC BY-NC license Python Pytorch Black

Overview

The official PyTorch implementation of the paper "Affect-aware Cross-Domain Recommendation for Art Therapy via Music Preference Elicitation", published in the Proceedings of the 19th ACM Conference on Recommender Systems (RecSys ’25). Available on ACM, arXiv, and ORBilu.

Our three affect-aware cross-domain VA RecSys engines driven by music preferences, Mozart, Haydn, and Salieri, are named after three iconic composers whose contributions shaped the history of Western music: Wolfgang Amadeus Mozart, Joseph Haydn, and Antonio Salieri.

Proposed architectures. From left to right: Affect-aware contrastive alignment (Mozart), Affective Space Search (Haydn), and Multi-Modal alignment with LLM and VLM (Salieri)

Joint embeddings

📂 Project Structure

mozart-crossmodal/
├── 📁 data/                         # Datasets and precomputed embeddings
│   ├── 📁 music/                    # Music-related data
│   │   ├── 🎵 audio/                 # MP3 audio files
│   │   ├── 🎵 heal_audio/            # MP3 audio files selected for preference elicitation
│   │   ├── 📊 features/              # Precomputed acoustic features (CSVs)
│   │   ├── 📜 filtered_songs.csv     # Metadata (song_id, valence, arousal)
│   │   ├── 📜 music_embeddings_258D_normalized.csv  # 258D embeddings (normalized)
│   │   └── 📜 music_features_with_embeddings.csv    # Combined features & embeddings
│   ├── 📁 paintings/                 # Painting-related data
│   │   ├── 🖼️ images/                # Painting image files (JPG)
│   │   ├── 🖼️ heal_paintings/        # Painting image files (JPG) selected for art therapy
│   │   ├── 📊 features/              # Precomputed painting features
│   │   ├── 📜 painting_data.csv      # Metadata (ID, valence, arousal)
│   │   ├── 📜 painting_embeddings_258D_normalized.csv  # 258D embeddings (normalized)
│   │   └── 📜 resnet_similarity_matrix.csv  # 63 × 63 painting similarity matrix filtered by expert
│   ├── 📜 similarity_matrix.csv                    # 909 × 4105 similarity matrix from contrastive alignment
│   ├── 📜 heal_similarity_matrix_haydn.csv         # 239 × 63 similarity matrix filtered by expert
│   ├── 📜 haydn_similarity_matrix.csv              # 909 × 4105 similarity matrix from V-A vectors
│   ├── 📜 heal_similarity_matrix_mozart.csv        # 239 × 63 similarity matrix filtered by expert
│   ├── 📜 salieri_similarity_matrix.csv            # 909 × 4105 similarity matrix from Salieri alignment
│   ├── 📜 heal_salieri_similarity_matrix.csv       # 239 × 63 similarity matrix filtered by expert
│   └── 📜 joint_embeddings.csv                     # 128D joint embeddings (music-art alignment)
│
├── 📁 feature_extraction/             # Feature extraction scripts
│   ├── 📁 music/
│   │   ├── 🎼 feature_extraction_music.py   # Extracts MERT & acoustic features
│   │   └── 🎼 reduce_normalize_music.py     # Dimensionality reduction & normalization
│   ├── 📁 painting/
│   │   ├── 🎨 feature_extraction_painting.py  # ResNet-based feature extraction
│   │   └── 🎨 reduce_normalize_painting.py    # Dimensionality reduction & normalization
│   └── 📁 salieri/                         # Salieri feature extraction for cross-modal alignment
│       ├── 🤖 music_salieri_features.py    # Extracts Salieri (GPT-4o) features for music
│       ├── 🤖 painting_vlm_features.py     # Extracts VLM (GPT-4V) features for paintings
│       ├── 🤖 multi_modal_music.py         # Combines MERT + VLM features, reduces to 256D
│       ├── 🤖 multi_modal_painting.py      # Combines ResNet + Salieri features, reduces to 256D
│       └── 🤖 similarity_computation.py    # Computes cross-modal similarity matrix (S_LV)
├── 📁 flask/                          # Backend API using Flask
│   ├── 🧠 mozart_engine.py            # Core recommendation engine
│   ├── 🌐 mozart.py                   # API server for recommendations
│   ├── ⚙️ engine.py                   # Generic engine class
│   ├── 🔍 haydn_engine.py             # Haydn baseline
│   ├── 🔍 haydn.py                    # Haydn API
│   ├── 🤖 salieri_engine.py           # Large Language Model (Salieri) based RecSys engine
│   ├── 🤖 salieri.py                  # API endpoint for Salieri-based tasks
│   ├── 🚀 start.sh                    # Starts the Flask server
│   ├── 🛑 stop.sh                     # Stops the Flask server
│   ├── 🔄 restart.sh                  # Restarts the Flask server
│   ├── 📟 status.sh                   # Checks server status
│   └── 📖 README.md                   # Flask setup and usage
├── 🔥 contrastive_alignment.py  # Contrastive learning for joint embeddings
├── 🔄 run_feature_pipeline.py   # Orchestrates feature extraction
├── 📖 README.md                 # Project overview, setup & usage
├── 📦 requirements.txt          # Dependencies
├── 📜 LICENSE                   # Open-source license (CC BY-NC)
├── 🔒 .gitignore                # Excludes temp files & datasets
├── ⚙️ .env                      # Store private keys here (OpenAI key)
├── 📁 figs/                     # Figures
└── 📁 app/                      # Web application for user study

Setup Instructions

Install Requirements

To install the required dependencies, run:

pip install -r requirements.txt 

Download Data

Download the data.zip file, which contains all pre-trained models and features, with gdown:

gdown "https://drive.google.com/uc?export=download&id=14aehmvmf-MAwRUZqOOBS5xsHHNBR0uPC"

Extract data.zip in the project root (mozart-crossmodal/):

unzip -q data.zip
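
After extracting, it can help to confirm that the archive landed where the scripts expect it. The following is a minimal sketch, not part of the repository: `missing_data_files` is a hypothetical helper, and the file list is a small sample of paths copied from the project tree in this README, so adjust it if the archive contents differ.

```python
from pathlib import Path

# Sample of paths taken from the project tree in this README (assumed to
# match the contents of data.zip; edit the list if the archive differs).
EXPECTED_FILES = [
    "data/music/filtered_songs.csv",
    "data/paintings/painting_data.csv",
    "data/similarity_matrix.csv",
    "data/joint_embeddings.csv",
]


def missing_data_files(project_root="."):
    """Return the expected files that are absent under project_root."""
    root = Path(project_root)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All expected data files found.")
```

Run it from the project root; an empty result means the sampled files are in place.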

Usage

All trained models and features are provided via the download. If you want to extract features from scratch, run:

python3 run_feature_pipeline.py

To train the contrastive alignment model, run:

python3 contrastive_alignment.py
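
For readers new to contrastive alignment, the idea is to pull matched music and painting embeddings together and push mismatched ones apart. The sketch below shows a generic symmetric InfoNCE loss in dependency-free Python. It is an assumption for illustration, not the loss used in contrastive_alignment.py: it omits any affect-aware weighting, and the function names and the `temperature` value are hypothetical.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (a zero vector is left unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def info_nce(music_emb, painting_emb, temperature=0.1):
    """Symmetric InfoNCE where music_emb[i] is paired with painting_emb[i]."""
    music = [l2_normalize(v) for v in music_emb]
    paintings = [l2_normalize(v) for v in painting_emb]
    # Cosine similarities between every music/painting pair, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(m, p)) / temperature for p in paintings]
        for m in music
    ]

    def cross_entropy_diag(rows):
        # Mean cross-entropy with the matching pair (the diagonal) as target.
        total = 0.0
        for i, row in enumerate(rows):
            peak = max(row)  # subtract the max for numerical stability
            log_sum = peak + math.log(sum(math.exp(x - peak) for x in row))
            total += log_sum - row[i]
        return total / len(rows)

    columns = [list(col) for col in zip(*logits)]
    # Average the music-to-painting and painting-to-music directions.
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(columns))


aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned < shuffled)  # True
```

The loss is close to zero when each song is most similar to its own painting and grows when the pairs are mismatched, which is the signal a training loop would minimize.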

See the Flask instructions (flask/README.md) for setting up the services. For further usage terms, please see LICENSE.