Audio Samples

Results on


Sample #1
Ground-truth RIR | Reverberant speech | Dry speech
AudioLM-like (RIR) | RQ-Transformer-like (RIR) | VALL-E-like (RIR) | FVN (RIR)
AudioLM-like (convolved) | RQ-Transformer-like (convolved) | VALL-E-like (convolved) | FVN (convolved)


Sample #2
Ground-truth RIR | Reverberant speech | Dry speech
AudioLM-like (RIR) | RQ-Transformer-like (RIR) | VALL-E-like (RIR) | FVN (RIR)
AudioLM-like (convolved) | RQ-Transformer-like (convolved) | VALL-E-like (convolved) | FVN (convolved)


Sample #3
Ground-truth RIR | Reverberant speech | Dry speech
AudioLM-like (RIR) | RQ-Transformer-like (RIR) | VALL-E-like (RIR) | FVN (RIR)
AudioLM-like (convolved) | RQ-Transformer-like (convolved) | VALL-E-like (convolved) | FVN (convolved)


Sample #4
Ground-truth RIR | Reverberant speech | Dry speech
AudioLM-like (RIR) | RQ-Transformer-like (RIR) | VALL-E-like (RIR) | FVN (RIR)
AudioLM-like (convolved) | RQ-Transformer-like (convolved) | VALL-E-like (convolved) | FVN (convolved)


Sample #5
Ground-truth RIR | Reverberant speech | Dry speech
AudioLM-like (RIR) | RQ-Transformer-like (RIR) | VALL-E-like (RIR) | FVN (RIR)
AudioLM-like (convolved) | RQ-Transformer-like (convolved) | VALL-E-like (convolved) | FVN (convolved)


Sample #6
Ground-truth RIR | Reverberant speech | Dry speech
AudioLM-like (RIR) | RQ-Transformer-like (RIR) | VALL-E-like (RIR) | FVN (RIR)
AudioLM-like (convolved) | RQ-Transformer-like (convolved) | VALL-E-like (convolved) | FVN (convolved)


Sample #7
Ground-truth RIR | Reverberant speech | Dry speech
AudioLM-like (RIR) | RQ-Transformer-like (RIR) | VALL-E-like (RIR) | FVN (RIR)
AudioLM-like (convolved) | RQ-Transformer-like (convolved) | VALL-E-like (convolved) | FVN (convolved)


Sample #8
Ground-truth RIR | Reverberant speech | Dry speech
AudioLM-like (RIR) | RQ-Transformer-like (RIR) | VALL-E-like (RIR) | FVN (RIR)
AudioLM-like (convolved) | RQ-Transformer-like (convolved) | VALL-E-like (convolved) | FVN (convolved)


Sample #9
Ground-truth RIR | Reverberant speech | Dry speech
AudioLM-like (RIR) | RQ-Transformer-like (RIR) | VALL-E-like (RIR) | FVN (RIR)
AudioLM-like (convolved) | RQ-Transformer-like (convolved) | VALL-E-like (convolved) | FVN (convolved)


Sample #10
Ground-truth RIR | Reverberant speech | Dry speech
AudioLM-like (RIR) | RQ-Transformer-like (RIR) | VALL-E-like (RIR) | FVN (RIR)
AudioLM-like (convolved) | RQ-Transformer-like (convolved) | VALL-E-like (convolved) | FVN (convolved)
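

Note on the "(convolved)" samples

The page does not say how the "(convolved)" samples were produced. A plausible reading, stated here as an assumption, is that each model's estimated RIR is convolved with the dry speech so that it can be compared by ear against the reverberant speech. The sketch below illustrates that operation with NumPy only; the function name apply_rir and the toy signals are hypothetical and are not taken from the authors' code.

```python
import numpy as np


def apply_rir(dry, rir):
    """Convolve a mono dry signal with a room impulse response (RIR).

    Both inputs are 1-D sequences at the same sample rate (assumed).
    The output has length len(dry) + len(rir) - 1.
    """
    dry = np.asarray(dry, dtype=np.float64)
    rir = np.asarray(rir, dtype=np.float64)
    if dry.ndim != 1 or rir.ndim != 1:
        raise ValueError("apply_rir expects 1-D (mono) signals")
    # Full linear convolution: every dry sample excites the whole RIR tail.
    return np.convolve(dry, rir, mode="full")


if __name__ == "__main__":
    # Toy stand-ins: a short noise burst as "dry speech" and an
    # exponentially decaying noise tail as the "RIR".
    rng = np.random.default_rng(0)
    dry = rng.standard_normal(1600)
    rir = rng.standard_normal(800) * np.exp(-np.arange(800) / 100.0)
    wet = apply_rir(dry, rir)
    print(wet.shape)
```

For RIRs lasting several seconds, an FFT-based convolution (for example scipy.signal.fftconvolve) gives the same result much faster; the direct form is used here only to keep the sketch dependency-free.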