How to apply the incremental changes

May 25, 2026

From here on out, I'm planning to provide incremental updates, because downloading and extracting a full dataset costs everyone time and effort.

Not only might you need to dump tables (or something similar) before loading the new data, but you also have to extract the archive and then wait, possibly a few hours, while the data is inserted.

So the Feb 22 2026 dataset is probably the last full dataset (for now).

You can download the incremental changes here: https://mega.nz/folder/QqQWkJKI#LVtUdSYU8hxEuD_XsylYRA

How to apply the incremental changes

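# Apply every incremental dump in file-name order (the glob expands in sorted order).
# Adjust -d, -h and -U to match your own PostgreSQL setup.
for f in changes_*.sql.gz; do
    echo "Restoring $f"
    gunzip -c "$f" | psql -d minimedia -h 192.168.1.2 -U postgres -q
done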
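
If you apply the incremental files more than once over time, it may help to track which ones were already restored and to skip archives that did not download cleanly. The sketch below is one possible approach, not part of the dataset: the `pending_dumps` helper and the `applied.log` file are hypothetical names, and whether re-running an already applied dump is harmless depends on how the dumps are written, which is not stated here.

```shell
# pending_dumps DIR LOGFILE
# Print every changes_*.sql.gz in DIR that is not yet listed in LOGFILE
# and that passes the gzip integrity test. Hypothetical helper for illustration.
pending_dumps() {
    pd_dir=$1
    pd_log=$2
    for pd_file in "$pd_dir"/changes_*.sql.gz; do
        # The glob stays unexpanded when nothing matches; skip that case.
        [ -e "$pd_file" ] || continue
        # Skip dumps whose file name is already recorded in the log.
        if [ -f "$pd_log" ] && grep -qxF -- "$(basename "$pd_file")" "$pd_log"; then
            continue
        fi
        # Skip archives that are truncated or not valid gzip.
        if ! gzip -t "$pd_file" 2>/dev/null; then
            echo "skipping unreadable archive: $pd_file" >&2
            continue
        fi
        echo "$pd_file"
    done
}
```

One way to use it is to loop over the output of `pending_dumps . applied.log`, restore each file with the same `gunzip -c "$f" | psql ...` pipeline as above, and append the file's base name to `applied.log` only after `psql` exits successfully.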

Dataset incremental 2026-05-25

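| Service | Type | Amount |
| --- | --- | --- |
| Deezer | Artist | 9.4mil |
| Deezer | Albums | 41.6mil |
| Deezer | Tracks | 206.1mil |
| Tidal | Artist | 8.1mil |
| Tidal | Albums | 29.9mil |
| Tidal | Tracks | 117.9mil |
| Spotify | Artist | 1.2mil |
| Spotify | Albums | 3.7mil |
| Spotify | Tracks | 18mil |
| MusicBrainz | Artist | 2.7mil |
| MusicBrainz | Albums | 5.5mil |
| MusicBrainz | Tracks | 54.3mil |
| SoundCloud | Artist | 3.08mil |
| SoundCloud | Albums | 2.8mil |
| SoundCloud | Tracks | 10.3mil |
| Last.Fm | Artist | 5k |
| Last.Fm | Albums | 9k |
| Last.Fm | Tracks | 19k |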

Note: Spotify pulling is discontinued due to API limitations and payment requirements.

SoundCloud pulling started on 2026-04-04; please be patient until the pull is complete.

Last.Fm pulling started on 2026-05-25; please be patient until the pull is complete.

Dataset of Feb 22 2026

Datasets of MusicBrainz, Tidal, Spotify, Deezer

You can use these tables for MiniMedia's database. This will save you a huge amount of time compared with calling the APIs yourself.

The Tidal, Spotify and Deezer datasets were obtained through their APIs, which took months of calling them 24/7.

The MusicBrainz data came mostly from their own published dataset, plus their API.

Note for the Deezer dataset: the Preview Url (to listen to the first x seconds of a song) and TrackToken (for playback) fields are empty, because storing them took too much space.

Packed: CSV-Format 10.2GB, SQL 21.7GB

Unpacked: CSV-Format 178GB, SQL 149GB
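
Since the unpacked files are many times larger than the download, it can be worth checking free disk space before extracting. The following is a rough sketch under assumptions: `has_free_gb` is a hypothetical helper, it treats the sizes as GiB, and it only looks at the filesystem holding the given directory. The PostgreSQL tables need additional space on top of the unpacked files.

```shell
# has_free_gb DIR NEEDED_GB
# Succeed (exit 0) if the filesystem containing DIR has at least NEEDED_GB GiB available.
# Hypothetical helper for illustration; NEEDED_GB must be a whole number.
has_free_gb() {
    # df -P gives the portable one-line-per-filesystem format, -k reports 1024-byte blocks.
    # Column 4 of the second line is the available space.
    hf_avail_kb=$(df -Pk "$1" | awk 'NR == 2 { print $4 }')
    hf_needed_kb=$(( $2 * 1024 * 1024 ))
    [ "$hf_avail_kb" -ge "$hf_needed_kb" ]
}
```

For example, `has_free_gb . 171 || echo "not enough space for the SQL dump"` would cover the packed plus unpacked SQL sizes listed above (21.7GB + 149GB, rounded up).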

Loving the work I do? Buy me a coffee: https://buymeacoffee.com/musicmovearr

MusicBrainz

You can officially download it here: https://metabrainz.org/datasets/postgres-dumps#musicbrainz

But the official dataset is huge, a lot larger than what I'm sharing, because I saved it more efficiently.

Contains:

Total Size: ~25GB in postgres, 270GB provided by MusicBrainz in json-format

Artists: 2.6mil

Albums: 5mil

Tracks: 51.1mil

Spotify

Contains:

Total Size: ~20GB in postgres, 11.7GB in CSV-Format

Artists: 1.2mil

Albums: 3.7mil

Tracks: 18mil

Tidal

Contains:

Total Size: ~110GB in postgres, 55.4GB in CSV-Format

Artists: 8mil

Albums: 29.2mil

Tracks: 117.9mil

Deezer

Contains:

Total Size: ~170GB in postgres, 112.8GB in CSV-Format

Artists: 9.1mil

Albums: 35.5mil

Tracks: 180.4mil

Dataset of Sept 22 2025

Datasets of MusicBrainz, Tidal, Spotify, Deezer

You can use these tables for MiniMedia's database. This will save you a huge amount of time compared with calling the APIs yourself.

The Tidal, Spotify and Deezer datasets were obtained through their APIs, which took months of calling them 24/7.

The MusicBrainz data came mostly from their own published dataset, plus their API.

Note for the Deezer dataset: the Preview Url (to listen to the first x seconds of a song) and TrackToken (for playback) fields are empty, because storing them took too much space.

Packed: CSV-Format 8.7GB, SQL 16.9GB

Unpacked: CSV-Format 155.4GB, SQL 117.1GB

Loving the work I do? Buy me a coffee: https://buymeacoffee.com/musicmovearr

MusicBrainz

You can officially download it here: https://metabrainz.org/datasets/postgres-dumps#musicbrainz

But the official dataset is huge, a lot larger than what I'm sharing, because I saved it more efficiently.

Contains:

Total Size: ~25GB in postgres, 270GB provided by MusicBrainz in json-format

Artists: 2.6mil

Albums: 5mil

Tracks: 51.1mil

Spotify

Contains:

Total Size: ~10GB in postgres, 6.8GB in CSV-Format

Artists: 954k

Albums: 2.2mil

Tracks: 10.8mil

Tidal

Contains:

Total Size: ~65GB in postgres, 39.4GB in CSV-Format

Artists: 3.3mil

Albums: 19mil

Tracks: 82mil

Deezer

Contains:

Total Size: ~235GB in postgres, 109.3GB in CSV-Format

Artists: 9.1mil

Albums: 34.4mil

Tracks: 177mil

Dataset of July 06 2025

Datasets of MusicBrainz, Tidal, Spotify, Deezer

You can use these tables for MiniMedia's database. This will save you a huge amount of time compared with calling the APIs yourself.

The Tidal, Spotify and Deezer datasets were obtained through their APIs, which took months of calling them 24/7.

The MusicBrainz data came mostly from their own published dataset, plus their API.

Note for the Deezer dataset: the Preview Url (to listen to the first x seconds of a song) and TrackToken (for playback) fields are empty, because storing them took too much space.

Packed: CSV-Format 4.6GB, SQL 16.2GB

Unpacked: CSV-Format 82.2GB, SQL 114.3GB

Loving the work I do? Buy me a coffee: https://buymeacoffee.com/musicmovearr

MusicBrainz

You can officially download it here: https://metabrainz.org/datasets/postgres-dumps#musicbrainz

But the official dataset is huge, a lot larger than what I'm sharing, because I saved it more efficiently.

Contains:

Total Size: ~20GB in postgres, 270GB provided by MusicBrainz in json-format

Artists: 2.5mil

Albums: 4.8mil

Tracks: 49mil

Spotify

Contains:

Total Size: ~3GB in postgres, 1.2GB in CSV-Format

Artists: 214k

Albums: 408k

Tracks: 2.1mil

Tidal

Contains:

Total Size: ~15GB in postgres, 3GB in CSV-Format

Artists: 456k

Albums: 2.3mil

Tracks: 14.6mil

Deezer

Contains:

Total Size: ~120GB in postgres, 73.8GB in CSV-Format

Artists: 4.1mil

Albums: 21.7mil

Tracks: 118.7mil

Dataset of June 07 2025

Datasets of MusicBrainz, Tidal, Spotify

These datasets contain zero modifications from me; they're straight from the source.

You can use these tables for MiniMedia's database. This will save you a huge amount of time compared with calling the APIs yourself.

The Tidal and Spotify datasets were obtained through their APIs, which took months of calling them 24/7.

The MusicBrainz data came mostly from their own published dataset, plus their API.

Packed: 4.5GB

Unpacked: 44.6GB

Loving the work I do? Buy me a coffee: https://buymeacoffee.com/musicmovearr

MusicBrainz

You can officially download it here: https://metabrainz.org/datasets/postgres-dumps#musicbrainz

But the official dataset is huge, a lot larger than what I'm sharing, because I saved it more efficiently.

Contains:

Total Size: ~20GB in postgres, 270GB provided by MusicBrainz in json-format

Artists: 2.5mil

Albums: 4.8mil

Tracks: 49mil

Spotify

Contains:

Total Size: ~1GB in postgres

Artists: 64k

Albums: 196k

Tracks: 1.1mil

Tidal

Contains:

Total Size: ~3GB in postgres

Artists: 118k

Albums: 403k

Tracks: 2.5mil

FAQ:

Why is Spotify's dataset so "small" compared to MusicBrainz?

Spotify has a painfully strict rate limit. I can only call their API once every 10 seconds so that I don't trigger their rate limiter too quickly; once triggered, it blocks the API key for ~15 hours.

Also keep in mind that the Spotify dataset is not complete: I can only fetch ~500 artists per day.

Why is Tidal's dataset so "small" compared to MusicBrainz?

Tidal's API rate limiter is alright, but I can only make ~200 API calls per 15 minutes (approximately). It's not super fast, but unlike Spotify it doesn't block me for ~15 hours.

Also keep in mind that the Tidal dataset is not complete.
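
To put the two rate limits above into perspective, here is a small back-of-the-envelope calculation. It assumes the limits are held around the clock with no blocks or downtime, so the results are upper bounds rather than measured throughput; `calls_per_day` is a hypothetical helper name.

```shell
# calls_per_day CALLS WINDOW_SECONDS
# Upper bound on API calls in 24 hours when CALLS are allowed per WINDOW_SECONDS.
calls_per_day() {
    echo $(( $1 * 86400 / $2 ))
}

spotify=$(calls_per_day 1 10)      # one call every 10 seconds
tidal=$(calls_per_day 200 900)     # ~200 calls per 15 minutes (900 seconds)

echo "Spotify: at most $spotify calls per day"
echo "Tidal:   at most $tidal calls per day"
```

Under these assumptions Spotify allows at most 8640 calls per day and Tidal at most 19200. At ~500 Spotify artists per day that would be roughly 17 calls per artist if the whole budget were used, which gives a feel for why the pull takes months.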

Is the Deezer dataset complete?

I can say with about 99% confidence that the Deezer dataset is complete, although there are surely a few artists I missed.

CSV-Format of Deezer

| Type | Name |
| --- | --- |
| long | ArtistId |
| string | ArtistName |
| int | ArtistNbAlbum |
| int | ArtistNbFan |
| bool | ArtistRadio |
| string | ArtistType |
| string | ArtistHref |
| string | ArtistImageHref |
| long | AlbumId |
| long | AlbumArtistId |
| string | AlbumName |
| string | AlbumMd5Image |
| int | AlbumGenreId |
| long | AlbumFans |
| string | AlbumReleaseDate |
| string | AlbumRecordType |
| bool | AlbumExplicitLyrics |
| int | AlbumExplicitContentLyrics |
| int | AlbumExplicitContentCover |
| string | AlbumType |
| string | AlbumUPC |
| string | Label |
| long | AlbumNbTracks |
| TimeSpan | AlbumDuration |
| bool | AlbumAvailable |
| string | AlbumHref |
| string | AlbumGenreName |
| string | AlbumGenrePicture |
| long | TrackId |
| bool | TrackReadable |
| string | TrackTitle |
| string | TrackTitleShort |
| string | TrackTitleVersion |
| string | TrackISRC |
| TimeSpan | TrackDuration |
| int | TrackPosition |
| int | TrackDiscNumber |
| long | TrackRank |
| string | TrackReleaseDate |
| bool | TrackExplicitLyrics |
| int | TrackExplicitContentLyrics |
| int | TrackExplicitContentCover |
| double | TrackBPM |
| double | TrackGain |
| string | TrackMd5Image |
| long | TrackArtistId |
| long | TrackAlbumId |
| string | TrackType |
| string | TrackHref |

CSV-Format of Spotify

| Type | Name |
| --- | --- |
| string | ArtistId |
| string | ArtistName |
| int | ArtistPopularity |
| string | ArtistType |
| string | ArtistUri |
| int | ArtistTotalFollowers |
| string | ArtistHref |
| string | ArtistGenres |
| string | ArtistCoverUrl |
| string | AlbumId |
| string | AlbumAlbumGroup |
| string | AlbumAlbumType |
| string | AlbumName |
| string | AlbumReleaseDate |
| string | AlbumReleaseDatePrecision |
| int | AlbumTotalTracks |
| string | AlbumType |
| string | AlbumUri |
| string | AlbumLabel |
| int | AlbumPopularity |
| string | AlbumArtistId |
| string | AlbumCoverUrl |
| string | AlbumUPC |
| string | TrackId |
| string | TrackAlbumId |
| int | TrackDiscNumber |
| TimeSpan | TrackDuration |
| bool | TrackExplicit |
| string | TrackHref |
| bool | TrackIsPlayable |
| string | TrackName |
| string | TrackPreviewUrl |
| int | TrackNumber |
| string | TrackType |
| string | TrackUri |
| string | TrackISRC |

CSV-Format of Tidal

| Type | Name |
| --- | --- |
| int | ArtistId |
| string | ArtistName |
| float | ArtistPopularity |
| string | ArtistImageHref |
| int | AlbumId |
| string | AlbumTitle |
| string | AlbumBarcodeId |
| int | AlbumNumberOfVolumes |
| int | AlbumNumberOfItems |
| string | AlbumDuration |
| string | AlbumExplicit |
| string | AlbumReleaseDate |
| string | AlbumCopyright |
| float | AlbumPopularity |
| string | AlbumAvailability |
| string | AlbumMediatags |
| string | AlbumImageHref |
| int | TrackId |
| int | TrackAlbumId |
| string | TrackTitle |
| string | TrackISRC |
| string | TrackDuration |
| string | TrackCopyright |
| bool | TrackExplicit |
| float | TrackPopularity |
| string | TrackAvailability |
| string | TrackMediatags |
| int | TrackVolumeNumber |
| int | TrackTrackNumber |
| string | TrackVersion |
| string | ArtistHref |
| string | AlbumHref |
| string | TrackHref |
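
To compare an unpacked CSV file against the column lists above, a quick look at its header can help. This is only a sketch under assumptions that are not confirmed here: that the file is comma-delimited, that its first line is a header row, and that header names contain no commas. `list_columns` is a hypothetical helper name.

```shell
# list_columns FILE
# Print the header fields of a CSV file, one per line, prefixed with their position.
# Assumes a comma-delimited file whose first line is a header (unverified assumption).
list_columns() {
    head -n 1 "$1" \
        | tr -d '\r' \
        | tr ',' '\n' \
        | awk '{ printf "%d\t%s\n", NR, $0 }'
}
```

Calling `list_columns` on one of the unpacked CSV files prints a numbered column list that can be checked line by line against the matching table. Only the first line of the file is read, so this stays fast even on files of many gigabytes.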