# pvx Algorithm Limitations and Applicability
May 25, 2026

Generated from commit 35e9761 (commit date: 2026-05-25T08:14:42-04:00).
This document summarizes assumptions, likely failure modes, and practical exclusion cases for each algorithm group and algorithm module.

## Group-Level Summary

| Group | Assumptions | Failure Modes | When Not To Use |
|---|---|---|---|
analysis_qa_and_automation | Feature extraction settings align with domain (speech vs music etc.). | False positives/negatives under domain shift. | Avoid treating single metrics as absolute quality verdicts. |
creative_spectral_effects | Spectral manipulations are desired even with timbral coloration. | Can introduce intentional but strong coloration or temporal artifacts. | Avoid for transparent restoration/mastering paths. |
denoise_and_restoration | Noise/artifacts are distinguishable from desired signal statistics. | Over-reduction can remove detail and create modulation artifacts. | Avoid high reduction settings on sparse acoustic sources without auditioning. |
dereverb_and_room_correction | Late reverberation is separable from direct content under chosen model. | Speech/music clarity can drop if early reflections are over-suppressed. | Avoid strong dereverb when room character is part of artistic intent. |
dynamics_and_loudness | Program dynamics fit compressor/limiter time constants and thresholds. | Pumping, breathing, or overs if thresholds and release are mis-set. | Avoid applying multiple strong dynamics stages without gain staging checks. |
granular_and_modulation | Grain and modulation rates are musically matched to source texture. | Incoherent grain scheduling can produce choppiness or blur. | Avoid dense granular settings on speech intelligibility-critical content. |
pitch_detection_and_tracking | F0 evidence is strong in the selected analysis band and frame size. | Octave errors and voicing flips under heavy noise/polyphony. | Avoid as the sole control signal for dense polyphonic mixtures. |
retune_and_intonation | Detected notes map cleanly to intended tonal center/scale. | Over-correction can flatten expressive vibrato or slides. | Avoid aggressive correction when preserving natural micro-intonation is required. |
separation_and_decomposition | Sources have partially separable spectral or statistical structure. | Component bleeding and musical noise under overlap or model mismatch. | Avoid expecting perfect stems from strongly correlated or co-modulated sources. |
spatial_and_multichannel | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spectral_time_frequency_transforms | Transform parameterization matches target time-frequency structure. | Incorrect parameterization can smear events or over-fragment spectra. | Avoid default settings for highly nonstationary signals without tuning. |
time_scale_and_pitch_core | Frames are locally quasi-stationary and harmonic evolution is reasonably smooth. | High-ratio stretch can introduce phasiness and blurred transients. | Avoid for extreme percussive-only material when attack realism is critical. |

## analysis_qa_and_automation

| Algorithm ID | Assumptions | Failure Modes | When Not To Use |
|---|---|---|---|
analysis_qa_and_automation.auto_parameter_tuning_bayesian_optimization | Feature extraction settings align with domain (speech vs music etc.). | False positives/negatives under domain shift. | Avoid treating single metrics as absolute quality verdicts. |
analysis_qa_and_automation.batch_preset_recommendation_based_on_source_features | Feature extraction settings align with domain (speech vs music etc.). | False positives/negatives under domain shift. | Avoid treating single metrics as absolute quality verdicts. |
analysis_qa_and_automation.clip_hum_buzz_artifact_detection | Feature extraction settings align with domain (speech vs music etc.). | False positives/negatives under domain shift. | Avoid treating single metrics as absolute quality verdicts. |
analysis_qa_and_automation.key_chord_detection | Feature extraction settings align with domain (speech vs music etc.). | False positives/negatives under domain shift. | Avoid treating single metrics as absolute quality verdicts. |
analysis_qa_and_automation.onset_beat_downbeat_tracking | Feature extraction settings align with domain (speech vs music etc.). | False positives/negatives under domain shift. | Avoid treating single metrics as absolute quality verdicts. |
analysis_qa_and_automation.pesq_stoi_visqol_quality_metrics | Feature extraction settings align with domain (speech vs music etc.). | False positives/negatives under domain shift. | Avoid treating single metrics as absolute quality verdicts. |
analysis_qa_and_automation.silence_speech_music_classifiers | Feature extraction settings align with domain (speech vs music etc.). | False positives/negatives under domain shift. | Avoid treating single metrics as absolute quality verdicts. |
analysis_qa_and_automation.structure_segmentation_verse_chorus_sections | Feature extraction settings align with domain (speech vs music etc.). | False positives/negatives under domain shift. | Avoid treating single metrics as absolute quality verdicts. |

## creative_spectral_effects

| Algorithm ID | Assumptions | Failure Modes | When Not To Use |
|---|---|---|---|
creative_spectral_effects.cross_synthesis_vocoder | Spectral manipulations are desired even with timbral coloration. | Can introduce intentional but strong coloration or temporal artifacts. | Avoid for transparent restoration/mastering paths. |
creative_spectral_effects.formant_painting_warping | Spectral manipulations are desired even with timbral coloration. | Can introduce intentional but strong coloration or temporal artifacts. | Avoid for transparent restoration/mastering paths. |
creative_spectral_effects.phase_randomization_textures | Spectral manipulations are desired even with timbral coloration. | Can introduce intentional but strong coloration or temporal artifacts. | Avoid for transparent restoration/mastering paths. |
creative_spectral_effects.resonator_filterbank_morphing | Spectral manipulations are desired even with timbral coloration. | Can introduce intentional but strong coloration or temporal artifacts. | Avoid for transparent restoration/mastering paths. |
creative_spectral_effects.spectral_blur_smear | Spectral manipulations are desired even with timbral coloration. | Can introduce intentional but strong coloration or temporal artifacts. | Avoid for transparent restoration/mastering paths. |
creative_spectral_effects.spectral_contrast_exaggeration | Spectral manipulations are desired even with timbral coloration. | Can introduce intentional but strong coloration or temporal artifacts. | Avoid for transparent restoration/mastering paths. |
creative_spectral_effects.spectral_convolution_effects | Spectral manipulations are desired even with timbral coloration. | Can introduce intentional but strong coloration or temporal artifacts. | Avoid for transparent restoration/mastering paths. |
creative_spectral_effects.spectral_freeze_banks | Spectral manipulations are desired even with timbral coloration. | Can introduce intentional but strong coloration or temporal artifacts. | Avoid for transparent restoration/mastering paths. |

## denoise_and_restoration

| Algorithm ID | Assumptions | Failure Modes | When Not To Use |
|---|---|---|---|
denoise_and_restoration.declick_decrackle_median_wavelet_interpolation | Noise/artifacts are distinguishable from desired signal statistics. Noise model should be representative of observed noise floor. | Over-reduction can remove detail and create modulation artifacts. Mismatched noise estimate leaves residue or damages detail. | Avoid high reduction settings on sparse acoustic sources without auditioning. Avoid static settings on rapidly varying nonstationary noise. |
denoise_and_restoration.declip_via_sparse_reconstruction | Noise/artifacts are distinguishable from desired signal statistics. Noise model should be representative of observed noise floor. | Over-reduction can remove detail and create modulation artifacts. Mismatched noise estimate leaves residue or damages detail. | Avoid high reduction settings on sparse acoustic sources without auditioning. Avoid static settings on rapidly varying nonstationary noise. |
denoise_and_restoration.diffusion_based_speech_audio_denoise | Noise/artifacts are distinguishable from desired signal statistics. Noise model should be representative of observed noise floor. | Over-reduction can remove detail and create modulation artifacts. Mismatched noise estimate leaves residue or damages detail. | Avoid high reduction settings on sparse acoustic sources without auditioning. Avoid static settings on rapidly varying nonstationary noise. |
denoise_and_restoration.log_mmse | Noise/artifacts are distinguishable from desired signal statistics. Noise model should be representative of observed noise floor. | Over-reduction can remove detail and create modulation artifacts. Mismatched noise estimate leaves residue or damages detail. | Avoid high reduction settings on sparse acoustic sources without auditioning. Avoid static settings on rapidly varying nonstationary noise. |
denoise_and_restoration.minimum_statistics_noise_tracking | Noise/artifacts are distinguishable from desired signal statistics. Noise model should be representative of observed noise floor. | Over-reduction can remove detail and create modulation artifacts. Mismatched noise estimate leaves residue or damages detail. | Avoid high reduction settings on sparse acoustic sources without auditioning. Avoid static settings on rapidly varying nonstationary noise. |
denoise_and_restoration.mmse_stsa | Noise/artifacts are distinguishable from desired signal statistics. Noise model should be representative of observed noise floor. | Over-reduction can remove detail and create modulation artifacts. Mismatched noise estimate leaves residue or damages detail. | Avoid high reduction settings on sparse acoustic sources without auditioning. Avoid static settings on rapidly varying nonstationary noise. |
denoise_and_restoration.rnnoise_style_denoiser | Noise/artifacts are distinguishable from desired signal statistics. Noise model should be representative of observed noise floor. | Over-reduction can remove detail and create modulation artifacts. Mismatched noise estimate leaves residue or damages detail. | Avoid high reduction settings on sparse acoustic sources without auditioning. Avoid static settings on rapidly varying nonstationary noise. |
denoise_and_restoration.wiener_denoising | Noise/artifacts are distinguishable from desired signal statistics. Noise model should be representative of observed noise floor. | Over-reduction can remove detail and create modulation artifacts. Mismatched noise estimate leaves residue or damages detail. | Avoid high reduction settings on sparse acoustic sources without auditioning. Avoid static settings on rapidly varying nonstationary noise. |
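
The noise-model caveat in the rows above can be made concrete with a single-bin gain calculation. The sketch below is a minimal illustration, not pvx's implementation: `wiener_gain` and its `floor` parameter are hypothetical names, the bin powers are made-up numbers, and the gain rule is the textbook power-subtraction form of the Wiener gain.

```python
def wiener_gain(observed_power, noise_power_estimate, floor=0.0):
    """Textbook Wiener-style gain for one frequency bin: max(1 - N_est / P, floor)."""
    if observed_power <= 0.0:
        return floor
    return max(1.0 - noise_power_estimate / observed_power, floor)


# Hypothetical bin: true signal power 4.0 plus true noise power 1.0.
observed = 4.0 + 1.0

matched = wiener_gain(observed, 1.0)    # estimate equals the true noise power
too_low = wiener_gain(observed, 0.25)   # underestimate: the bin is barely attenuated
too_high = wiener_gain(observed, 4.0)   # overestimate: wanted signal is cut as well
```

With a matched estimate the gain equals the ideal S / (S + N) ratio. An underestimate pushes the gain toward 1 and leaves residue, while an overestimate pushes it toward the floor and removes detail, which is the trade-off the "Failure Modes" column describes.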

## dereverb_and_room_correction

| Algorithm ID | Assumptions | Failure Modes | When Not To Use |
|---|---|---|---|
dereverb_and_room_correction.blind_deconvolution_dereverb | Late reverberation is separable from direct content under chosen model. Reverberation tail is assumed more diffuse than direct content. | Speech/music clarity can drop if early reflections are over-suppressed. Over-suppression can thin tonal body and ambience. | Avoid strong dereverb when room character is part of artistic intent. Avoid for intentionally wet effects unless mix preservation is planned. |
dereverb_and_room_correction.drr_guided_dereverb | Late reverberation is separable from direct content under chosen model. Reverberation tail is assumed more diffuse than direct content. | Speech/music clarity can drop if early reflections are over-suppressed. Over-suppression can thin tonal body and ambience. | Avoid strong dereverb when room character is part of artistic intent. Avoid for intentionally wet effects unless mix preservation is planned. |
dereverb_and_room_correction.late_reverb_suppression_via_coherence | Late reverberation is separable from direct content under chosen model. Reverberation tail is assumed more diffuse than direct content. | Speech/music clarity can drop if early reflections are over-suppressed. Over-suppression can thin tonal body and ambience. | Avoid strong dereverb when room character is part of artistic intent. Avoid for intentionally wet effects unless mix preservation is planned. |
dereverb_and_room_correction.multi_band_adaptive_deverb | Late reverberation is separable from direct content under chosen model. Reverberation tail is assumed more diffuse than direct content. | Speech/music clarity can drop if early reflections are over-suppressed. Over-suppression can thin tonal body and ambience. | Avoid strong dereverb when room character is part of artistic intent. Avoid for intentionally wet effects unless mix preservation is planned. |
dereverb_and_room_correction.neural_dereverb_module | Late reverberation is separable from direct content under chosen model. Model priors assume training-like signal statistics. Reverberation tail is assumed more diffuse than direct content. | Speech/music clarity can drop if early reflections are over-suppressed. Generalization gaps can produce unstable artifacts. Over-suppression can thin tonal body and ambience. | Avoid strong dereverb when room character is part of artistic intent. Avoid fully unattended use on out-of-domain material. Avoid for intentionally wet effects unless mix preservation is planned. |
dereverb_and_room_correction.room_impulse_inverse_filtering | Late reverberation is separable from direct content under chosen model. Reverberation tail is assumed more diffuse than direct content. | Speech/music clarity can drop if early reflections are over-suppressed. Over-suppression can thin tonal body and ambience. | Avoid strong dereverb when room character is part of artistic intent. Avoid for intentionally wet effects unless mix preservation is planned. |
dereverb_and_room_correction.spectral_decay_subtraction | Late reverberation is separable from direct content under chosen model. Reverberation tail is assumed more diffuse than direct content. | Speech/music clarity can drop if early reflections are over-suppressed. Over-suppression can thin tonal body and ambience. | Avoid strong dereverb when room character is part of artistic intent. Avoid for intentionally wet effects unless mix preservation is planned. |
dereverb_and_room_correction.wpe_dereverberation | Late reverberation is separable from direct content under chosen model. Reverberation tail is assumed more diffuse than direct content. | Speech/music clarity can drop if early reflections are over-suppressed. Over-suppression can thin tonal body and ambience. | Avoid strong dereverb when room character is part of artistic intent. Avoid for intentionally wet effects unless mix preservation is planned. |

## dynamics_and_loudness

| Algorithm ID | Assumptions | Failure Modes | When Not To Use |
|---|---|---|---|
dynamics_and_loudness.ebu_r128_normalization | Program dynamics fit compressor/limiter time constants and thresholds. | Pumping, breathing, or overs if thresholds and release are mis-set. | Avoid applying multiple strong dynamics stages without gain staging checks. |
dynamics_and_loudness.itu_bs_1770_loudness_measurement_gating | Program dynamics fit compressor/limiter time constants and thresholds. | Pumping, breathing, or overs if thresholds and release are mis-set. | Avoid applying multiple strong dynamics stages without gain staging checks. |
dynamics_and_loudness.lufs_target_mastering_chain | Program dynamics fit compressor/limiter time constants and thresholds. | Pumping, breathing, or overs if thresholds and release are mis-set. | Avoid applying multiple strong dynamics stages without gain staging checks. |
dynamics_and_loudness.multi_band_compression | Program dynamics fit compressor/limiter time constants and thresholds. | Pumping, breathing, or overs if thresholds and release are mis-set. | Avoid applying multiple strong dynamics stages without gain staging checks. |
dynamics_and_loudness.spectral_dynamics_bin_wise_compressor_expander | Program dynamics fit compressor/limiter time constants and thresholds. | Pumping, breathing, or overs if thresholds and release are mis-set. | Avoid applying multiple strong dynamics stages without gain staging checks. |
dynamics_and_loudness.transient_shaping | Program dynamics fit compressor/limiter time constants and thresholds. | Pumping, breathing, or overs if thresholds and release are mis-set. | Avoid applying multiple strong dynamics stages without gain staging checks. |
dynamics_and_loudness.true_peak_limiting | Program dynamics fit compressor/limiter time constants and thresholds. | Pumping, breathing, or overs if thresholds and release are mis-set. | Avoid applying multiple strong dynamics stages without gain staging checks. |
dynamics_and_loudness.upward_compression | Program dynamics fit compressor/limiter time constants and thresholds. | Pumping, breathing, or overs if thresholds and release are mis-set. | Avoid applying multiple strong dynamics stages without gain staging checks. |
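
The gain staging warning above is easier to see with numbers. The following sketch assumes a static hard-knee compressor curve and ignores attack and release entirely. The function names and the stage settings are hypothetical and are not taken from pvx.

```python
def compressor_output_db(level_db, threshold_db, ratio):
    """Static hard-knee curve: the part of the level above threshold is divided by ratio."""
    if level_db <= threshold_db:
        return level_db
    return threshold_db + (level_db - threshold_db) / ratio


def chain_gain_reduction_db(level_db, stages):
    """Pass a level through (threshold_db, ratio) stages; return per-stage and total reduction."""
    reductions = []
    current = level_db
    for threshold_db, ratio in stages:
        out = compressor_output_db(current, threshold_db, ratio)
        reductions.append(current - out)
        current = out
    return reductions, level_db - current


# Hypothetical chain: a -6 dBFS peak through two 4:1 stages.
per_stage, total = chain_gain_reduction_db(-6.0, [(-18.0, 4.0), (-20.0, 4.0)])
```

Here the first stage removes 9 dB and the second another 3.75 dB, so the peak is reduced by 12.75 dB in total. Checking the accumulated reduction per stage, rather than each stage in isolation, is one way to perform the gain staging check the table recommends.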

## granular_and_modulation

| Algorithm ID | Assumptions | Failure Modes | When Not To Use |
|---|---|---|---|
granular_and_modulation.am_fm_ring_modulation_blocks | Grain and modulation rates are musically matched to source texture. | Incoherent grain scheduling can produce choppiness or blur. | Avoid dense granular settings on speech intelligibility-critical content. |
granular_and_modulation.envelope_followed_modulation_routing | Grain and modulation rates are musically matched to source texture. | Incoherent grain scheduling can produce choppiness or blur. | Avoid dense granular settings on speech intelligibility-critical content. |
granular_and_modulation.formant_lfo_modulation | Grain and modulation rates are musically matched to source texture. | Incoherent grain scheduling can produce choppiness or blur. | Avoid dense granular settings on speech intelligibility-critical content. |
granular_and_modulation.freeze_grain_morphing | Grain and modulation rates are musically matched to source texture. | Incoherent grain scheduling can produce choppiness or blur. | Avoid dense granular settings on speech intelligibility-critical content. |
granular_and_modulation.grain_cloud_pitch_textures | Grain and modulation rates are musically matched to source texture. | Incoherent grain scheduling can produce choppiness or blur. | Avoid dense granular settings on speech intelligibility-critical content. |
granular_and_modulation.granular_time_stretch_engine | Grain and modulation rates are musically matched to source texture. | Incoherent grain scheduling can produce choppiness or blur. | Avoid dense granular settings on speech intelligibility-critical content. |
granular_and_modulation.rhythmic_gate_stutter_quantizer | Grain and modulation rates are musically matched to source texture. | Incoherent grain scheduling can produce choppiness or blur. | Avoid dense granular settings on speech intelligibility-critical content. |
granular_and_modulation.spectral_tremolo | Grain and modulation rates are musically matched to source texture. | Incoherent grain scheduling can produce choppiness or blur. | Avoid dense granular settings on speech intelligibility-critical content. |
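
One source of the choppiness mentioned above is a grain schedule whose windows do not sum to a constant. The sketch below overlap-adds periodic Hann windows at a fixed hop and measures the ripple of the summed envelope. It is an illustration under simple assumptions (identical grains, regular spacing), and the function names are hypothetical rather than pvx APIs.

```python
import math


def grain_envelope(grain_len, hop, num_grains):
    """Overlap-add periodic Hann grain windows placed every `hop` samples."""
    total = [0.0] * ((num_grains - 1) * hop + grain_len)
    window = [0.5 - 0.5 * math.cos(2.0 * math.pi * n / grain_len) for n in range(grain_len)]
    for k in range(num_grains):
        for n, w in enumerate(window):
            total[k * hop + n] += w
    return total


def steady_state_ripple(grain_len, hop, num_grains=8):
    """Max minus min of the summed envelope, ignoring the fade-in and fade-out edges."""
    env = grain_envelope(grain_len, hop, num_grains)
    core = env[grain_len:(num_grains - 1) * hop]
    return max(core) - min(core)


smooth = steady_state_ripple(64, 32)   # 50% overlap: the windows sum to a constant
choppy = steady_state_ripple(64, 64)   # no overlap: the envelope dips to zero between grains
```

A ripple near zero means the schedule itself adds no amplitude modulation. A ripple near one means every grain boundary is audible as a gap, which is likely to hurt intelligibility on speech.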

## pitch_detection_and_tracking

| Algorithm ID | Assumptions | Failure Modes | When Not To Use |
|---|---|---|---|
pitch_detection_and_tracking.crepe_style_neural_f0 | F0 evidence is strong in the selected analysis band and frame size. Model priors assume training-like signal statistics. | Octave errors and voicing flips under heavy noise/polyphony. Generalization gaps can produce unstable artifacts. | Avoid as the sole control signal for dense polyphonic mixtures. Avoid fully unattended use on out-of-domain material. |
pitch_detection_and_tracking.harmonic_product_spectrum_hps | F0 evidence is strong in the selected analysis band and frame size. | Octave errors and voicing flips under heavy noise/polyphony. | Avoid as the sole control signal for dense polyphonic mixtures. |
pitch_detection_and_tracking.pyin | F0 evidence is strong in the selected analysis band and frame size. | Octave errors and voicing flips under heavy noise/polyphony. | Avoid as the sole control signal for dense polyphonic mixtures. |
pitch_detection_and_tracking.rapt | F0 evidence is strong in the selected analysis band and frame size. | Octave errors and voicing flips under heavy noise/polyphony. | Avoid as the sole control signal for dense polyphonic mixtures. |
pitch_detection_and_tracking.subharmonic_summation | F0 evidence is strong in the selected analysis band and frame size. | Octave errors and voicing flips under heavy noise/polyphony. | Avoid as the sole control signal for dense polyphonic mixtures. |
pitch_detection_and_tracking.swipe | F0 evidence is strong in the selected analysis band and frame size. | Octave errors and voicing flips under heavy noise/polyphony. | Avoid as the sole control signal for dense polyphonic mixtures. |
pitch_detection_and_tracking.viterbi_smoothed_pitch_contour_tracking | F0 evidence is strong in the selected analysis band and frame size. | Octave errors and voicing flips under heavy noise/polyphony. | Avoid as the sole control signal for dense polyphonic mixtures. |
pitch_detection_and_tracking.yin | F0 evidence is strong in the selected analysis band and frame size. | Octave errors and voicing flips under heavy noise/polyphony. | Avoid as the sole control signal for dense polyphonic mixtures. |
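
Because octave errors are a shared failure mode across these trackers, a simple sanity check on the F0 track can help before it drives another process. The sketch below flags frames that sit close to a whole number of octaves away from the track's median. The function name, the 50-cent tolerance, and the convention that unvoiced frames are stored as 0 are all assumptions made for illustration.

```python
import math
import statistics


def flag_octave_jumps(f0_hz, tolerance_cents=50.0):
    """Flag voiced frames lying within tolerance of a nonzero whole-octave offset from the median."""
    voiced = [f for f in f0_hz if f > 0.0]
    if not voiced:
        return [False] * len(f0_hz)
    reference = statistics.median(voiced)
    flags = []
    for f in f0_hz:
        if f <= 0.0:
            flags.append(False)  # unvoiced frame, nothing to check
            continue
        cents = 1200.0 * math.log2(f / reference)
        octaves = round(cents / 1200.0)
        flags.append(octaves != 0 and abs(cents - 1200.0 * octaves) <= tolerance_cents)
    return flags


# Hypothetical track around 220 Hz with one octave-up and one octave-down frame.
track = [220.0, 221.0, 440.0, 219.0, 0.0, 110.5, 220.0]
flags = flag_octave_jumps(track)
```

A median reference only makes sense for short, roughly single-note segments. For melodic material the same idea would need a local reference, and a flagged frame is a prompt for review rather than proof of an error.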

## retune_and_intonation

| Algorithm ID | Assumptions | Failure Modes | When Not To Use |
|---|---|---|---|
retune_and_intonation.adaptive_intonation_context_sensitive_intervals | Detected notes map cleanly to intended tonal center/scale. Pitch trajectory estimates should be continuous enough for retuning. | Over-correction can flatten expressive vibrato or slides. Fast F0 jumps can cause audible stepping. | Avoid aggressive correction when preserving natural micro-intonation is required. Avoid high-strength retune on breath/noise segments. |
retune_and_intonation.chord_aware_retuning | Detected notes map cleanly to intended tonal center/scale. Pitch trajectory estimates should be continuous enough for retuning. | Over-correction can flatten expressive vibrato or slides. Fast F0 jumps can cause audible stepping. | Avoid aggressive correction when preserving natural micro-intonation is required. Avoid high-strength retune on breath/noise segments. |
retune_and_intonation.just_intonation_mapping_per_key_center | Detected notes map cleanly to intended tonal center/scale. Pitch trajectory estimates should be continuous enough for retuning. | Over-correction can flatten expressive vibrato or slides. Fast F0 jumps can cause audible stepping. | Avoid aggressive correction when preserving natural micro-intonation is required. Avoid high-strength retune on breath/noise segments. |
retune_and_intonation.key_aware_retuning_with_confidence_weighting | Detected notes map cleanly to intended tonal center/scale. Pitch trajectory estimates should be continuous enough for retuning. | Over-correction can flatten expressive vibrato or slides. Fast F0 jumps can cause audible stepping. | Avoid aggressive correction when preserving natural micro-intonation is required. Avoid high-strength retune on breath/noise segments. |
retune_and_intonation.portamento_aware_retune_curves | Detected notes map cleanly to intended tonal center/scale. Pitch trajectory estimates should be continuous enough for retuning. | Over-correction can flatten expressive vibrato or slides. Fast F0 jumps can cause audible stepping. | Avoid aggressive correction when preserving natural micro-intonation is required. Avoid high-strength retune on breath/noise segments. |
retune_and_intonation.scala_mts_scale_import_and_quantization | Detected notes map cleanly to intended tonal center/scale. Pitch trajectory estimates should be continuous enough for retuning. | Over-correction can flatten expressive vibrato or slides. Fast F0 jumps can cause audible stepping. | Avoid aggressive correction when preserving natural micro-intonation is required. Avoid high-strength retune on breath/noise segments. |
retune_and_intonation.time_varying_cents_maps | Detected notes map cleanly to intended tonal center/scale. Pitch trajectory estimates should be continuous enough for retuning. | Over-correction can flatten expressive vibrato or slides. Fast F0 jumps can cause audible stepping. | Avoid aggressive correction when preserving natural micro-intonation is required. Avoid high-strength retune on breath/noise segments. |
retune_and_intonation.vibrato_preserving_correction | Detected notes map cleanly to intended tonal center/scale. Pitch trajectory estimates should be continuous enough for retuning. | Over-correction can flatten expressive vibrato or slides. Fast F0 jumps can cause audible stepping. | Avoid aggressive correction when preserving natural micro-intonation is required. Avoid high-strength retune on breath/noise segments. |
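
The over-correction risk above comes down to how far each frame is pulled toward its target note. The sketch below is a minimal illustration that assumes a 12-tone equal-tempered target grid and a single `strength` parameter. Both the grid and the function names are assumptions, not a description of how the pvx modules map notes.

```python
import math


def retune_hz(f0_hz, strength, a4_hz=440.0):
    """Pull f0 toward the nearest 12-TET note; strength 0.0 leaves it alone, 1.0 snaps fully."""
    midi = 69.0 + 12.0 * math.log2(f0_hz / a4_hz)
    target = round(midi)
    corrected = midi + strength * (target - midi)
    return a4_hz * 2.0 ** ((corrected - 69.0) / 12.0)


def cents_from(f_hz, ref_hz):
    """Signed distance from ref_hz to f_hz in cents."""
    return 1200.0 * math.log2(f_hz / ref_hz)


sharp_a4 = 440.0 * 2.0 ** (30.0 / 1200.0)   # A4 sung 30 cents sharp
full = retune_hz(sharp_a4, 1.0)             # snapped onto 440 Hz
half = retune_hz(sharp_a4, 0.5)             # left 15 cents sharp
```

At full strength every frame of a vibrato that stays within half a semitone of the note lands on the same frequency, so the vibrato disappears. A partial strength keeps a scaled-down version of the original deviation.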

## separation_and_decomposition

| Algorithm ID | Assumptions | Failure Modes | When Not To Use |
|---|---|---|---|
separation_and_decomposition.demucs_style_stem_separation_backend | Sources have partially separable spectral or statistical structure. | Component bleeding and musical noise under overlap or model mismatch. | Avoid expecting perfect stems from strongly correlated or co-modulated sources. |
separation_and_decomposition.ica_bss_for_multichannel_stems | Sources have partially separable spectral or statistical structure. | Component bleeding and musical noise under overlap or model mismatch. | Avoid expecting perfect stems from strongly correlated or co-modulated sources. |
separation_and_decomposition.nmf_decomposition | Sources have partially separable spectral or statistical structure. | Component bleeding and musical noise under overlap or model mismatch. | Avoid expecting perfect stems from strongly correlated or co-modulated sources. |
separation_and_decomposition.probabilistic_latent_component_separation | Sources have partially separable spectral or statistical structure. | Component bleeding and musical noise under overlap or model mismatch. | Avoid expecting perfect stems from strongly correlated or co-modulated sources. |
separation_and_decomposition.rpca_hpss | Sources have partially separable spectral or statistical structure. | Component bleeding and musical noise under overlap or model mismatch. | Avoid expecting perfect stems from strongly correlated or co-modulated sources. |
separation_and_decomposition.sinusoidal_residual_transient_decomposition | Sources have partially separable spectral or statistical structure. | Component bleeding and musical noise under overlap or model mismatch. | Avoid expecting perfect stems from strongly correlated or co-modulated sources. |
separation_and_decomposition.tensor_decomposition_cp_tucker | Sources have partially separable spectral or statistical structure. | Component bleeding and musical noise under overlap or model mismatch. | Avoid expecting perfect stems from strongly correlated or co-modulated sources. |
separation_and_decomposition.u_net_vocal_accompaniment_split | Sources have partially separable spectral or statistical structure. | Component bleeding and musical noise under overlap or model mismatch. | Avoid expecting perfect stems from strongly correlated or co-modulated sources. |

## spatial_and_multichannel

| Algorithm ID | Assumptions | Failure Modes | When Not To Use |
|---|---|---|---|
spatial_and_multichannel.binaural_itd_ild_synthesis | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.binaural_motion_trajectory_designer | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.coherence_based_dereverb_multichannel | Channel geometry/order and timing metadata are correct. Reverberation tail is assumed more diffuse than direct content. | Spatial collapse, combing, or localization bias from misalignment. Over-suppression can thin tonal body and ambience. | Avoid blind spatial processing when channel order/calibration is unknown. Avoid for intentionally wet effects unless mix preservation is planned. |
spatial_and_multichannel.cross_channel_click_pop_repair | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.dbap_distance_based_amplitude_panning | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.decorrelated_reverb_upmix | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.microphone_array_calibration_tones | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.multichannel_noise_psd_tracking | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.multichannel_wiener_postfilter | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.phase_aligned_mid_side_field_rotation | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.phase_consistent_multichannel_denoise | Channel geometry/order and timing metadata are correct. Noise model should be representative of observed noise floor. | Spatial collapse, combing, or localization bias from misalignment. Mismatched noise estimate leaves residue or damages detail. | Avoid blind spatial processing when channel order/calibration is unknown. Avoid static settings on rapidly varying nonstationary noise. |
spatial_and_multichannel.pvx_directional_spectral_warp | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.pvx_interaural_coherence_shaping | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.pvx_interchannel_phase_locking | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.pvx_multichannel_time_alignment | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.pvx_spatial_freeze_and_trajectory | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.pvx_spatial_transient_preservation | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.rotating_speaker_doppler_field | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.spatial_freeze_resynthesis | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.spectral_spatial_granulator | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.stereo_width_frequency_dependent_control | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.stochastic_spatial_diffusion_cloud | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.transaural_crosstalk_cancellation | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
spatial_and_multichannel.vbap_adaptive_panning | Channel geometry/order and timing metadata are correct. | Spatial collapse, combing, or localization bias from misalignment. | Avoid blind spatial processing when channel order/calibration is unknown. |
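
Every row in this group assumes correct channel timing, so a quick lag check before processing can catch the misalignment that leads to combing. The following is a minimal sketch, assuming two channels held as plain Python lists; `estimate_lag` is a hypothetical helper written for illustration and is not part of pvx. It uses brute-force cross-correlation, which is slow but easy to audit.

```python
import random

def estimate_lag(ref, other, max_lag):
    """Return the lag in samples (positive = `other` is late) that maximizes cross-correlation."""
    n = min(len(ref), len(other))
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += ref[i] * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Synthetic check: a noise burst and a copy delayed by 5 samples.
rng = random.Random(0)
left = [rng.uniform(-1.0, 1.0) for _ in range(400)]
delay = 5
right = [0.0] * delay + left[:-delay]

lag = estimate_lag(left, right, max_lag=20)
print("estimated lag (samples):", lag)
```

A nonzero lag on material that is expected to be time-aligned suggests verifying channel order and calibration before running any module in this group.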
spectral_time_frequency_transforms
| Algorithm ID | Assumptions | Failure Modes | When Not To Use |
|---|---|---|---|
spectral_time_frequency_transforms.chirplet_transform_analysis | Transform parameterization matches target time-frequency structure. | Incorrect parameterization can smear events or over-fragment spectra. | Avoid default settings for highly nonstationary signals without tuning. |
spectral_time_frequency_transforms.constant_q_transform_cqt_processing | Transform parameterization matches target time-frequency structure. | Incorrect parameterization can smear events or over-fragment spectra. | Avoid default settings for highly nonstationary signals without tuning. |
spectral_time_frequency_transforms.multi_window_stft_fusion | Transform parameterization matches target time-frequency structure. | Incorrect parameterization can smear events or over-fragment spectra. | Avoid default settings for highly nonstationary signals without tuning. |
spectral_time_frequency_transforms.nsgt_based_processing | Transform parameterization matches target time-frequency structure. | Incorrect parameterization can smear events or over-fragment spectra. | Avoid default settings for highly nonstationary signals without tuning. |
spectral_time_frequency_transforms.reassigned_spectrogram_methods | Transform parameterization matches target time-frequency structure. | Incorrect parameterization can smear events or over-fragment spectra. | Avoid default settings for highly nonstationary signals without tuning. |
spectral_time_frequency_transforms.synchrosqueezed_stft | Transform parameterization matches target time-frequency structure. | Incorrect parameterization can smear events or over-fragment spectra. | Avoid default settings for highly nonstationary signals without tuning. |
spectral_time_frequency_transforms.variable_q_transform_vqt | Transform parameterization matches target time-frequency structure. | Incorrect parameterization can smear events or over-fragment spectra. | Avoid default settings for highly nonstationary signals without tuning. |
spectral_time_frequency_transforms.wavelet_packet_processing | Transform parameterization matches target time-frequency structure. | Incorrect parameterization can smear events or over-fragment spectra. | Avoid default settings for highly nonstationary signals without tuning. |
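
To illustrate why parameterization matters for these transforms, the sketch below uses the textbook constant-Q relations, Q = 1 / (2^(1/b) - 1) and N_k = ceil(Q * fs / f_k), to show how window length grows at low frequencies. The function names are hypothetical, and the actual pvx modules may parameterize their transforms differently.

```python
import math

def cqt_q_factor(bins_per_octave):
    """Ratio of center frequency to bandwidth for geometrically spaced bins."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)

def cqt_window_length(freq_hz, sample_rate, bins_per_octave):
    """Window length in samples for the bin centered at freq_hz."""
    return math.ceil(cqt_q_factor(bins_per_octave) * sample_rate / freq_hz)

sample_rate = 44100
for freq_hz in (55.0, 440.0, 3520.0):
    n = cqt_window_length(freq_hz, sample_rate, bins_per_octave=12)
    duration_ms = 1000.0 * n / sample_rate
    print(f"{freq_hz:7.1f} Hz: {n:6d} samples ({duration_ms:6.1f} ms)")
```

With 12 bins per octave, the window for a low bass bin spans hundreds of milliseconds, so fast events in that register are smeared in time. Raising the bin count sharpens frequency resolution but lengthens every window further, which is the trade-off the table warns about for nonstationary signals.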
time_scale_and_pitch_core
| Algorithm ID | Assumptions | Failure Modes | When Not To Use |
|---|---|---|---|
time_scale_and_pitch_core.beat_synchronous_time_warping | Frames are locally quasi-stationary and harmonic evolution is reasonably smooth. | High-ratio stretch can introduce phasiness and blurred transients. | Avoid for extreme percussive-only material when attack realism is critical. |
time_scale_and_pitch_core.harmonic_percussive_split_tsm | Frames are locally quasi-stationary and harmonic evolution is reasonably smooth. | High-ratio stretch can introduce phasiness and blurred transients. | Avoid for extreme percussive-only material when attack realism is critical. |
time_scale_and_pitch_core.lp_psola | Frames are locally quasi-stationary and harmonic evolution is reasonably smooth. | High-ratio stretch can introduce phasiness and blurred transients. | Avoid for extreme percussive-only material when attack realism is critical. |
time_scale_and_pitch_core.multi_resolution_phase_vocoder | Frames are locally quasi-stationary and harmonic evolution is reasonably smooth. Phase continuity assumptions hold best for moderate stretch ratios. | High-ratio stretch can introduce phasiness and blurred transients, and the risk grows at extreme settings. | Avoid for extreme percussive-only material when attack realism is critical. Avoid very large combined stretch and pitch shifts without transient controls. |
time_scale_and_pitch_core.nonlinear_time_maps | Frames are locally quasi-stationary and harmonic evolution is reasonably smooth. | High-ratio stretch can introduce phasiness and blurred transients. | Avoid for extreme percussive-only material when attack realism is critical. |
time_scale_and_pitch_core.td_psola | Frames are locally quasi-stationary and harmonic evolution is reasonably smooth. | High-ratio stretch can introduce phasiness and blurred transients. | Avoid for extreme percussive-only material when attack realism is critical. |
time_scale_and_pitch_core.wsola_waveform_similarity_overlap_add | Frames are locally quasi-stationary and harmonic evolution is reasonably smooth. | High-ratio stretch can introduce phasiness and blurred transients. | Avoid for extreme percussive-only material when attack realism is critical. |
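
One rough way to judge whether material is attack-heavy before stretching it is to count sudden frame-energy jumps. This is a minimal sketch under stated assumptions: `attack_fraction`, the frame length, and the `rise_ratio` threshold are all hypothetical choices for illustration, not pvx functions or recommended settings.

```python
import math

def frame_energies(samples, frame_len=512):
    """Mean-square energy of consecutive non-overlapping frames."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def attack_fraction(samples, frame_len=512, rise_ratio=4.0, floor=1e-8):
    """Fraction of frame transitions where energy rises by at least rise_ratio."""
    energies = frame_energies(samples, frame_len)
    if len(energies) < 2:
        return 0.0
    jumps = sum(
        1
        for prev, cur in zip(energies, energies[1:])
        if cur > rise_ratio * max(prev, floor)  # floor avoids comparing against pure silence
    )
    return jumps / (len(energies) - 1)

sample_rate = 16000
# Steady tone: quasi-stationary, matching the group's core assumption.
sine = [0.5 * math.sin(2.0 * math.pi * 220.0 * n / sample_rate) for n in range(sample_rate)]
# Sparse decaying bursts on silence: percussive-only material.
clicks = [0.0] * sample_rate
for start in range(0, sample_rate, 2000):
    for n in range(64):
        clicks[start + n] = 0.9 * (1.0 - n / 64.0)

print("sine:  ", attack_fraction(sine))
print("clicks:", attack_fraction(clicks))
```

A clearly higher value on one source than another only indicates relative attack density. It is a cue to audition the result or enable transient controls, not a quality verdict.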
Attribution
See ATTRIBUTION.md.