Audio Samples: Drum Mixing Estimation

Results on seen kits

① Autoencoding: graph decoder with another graph encoder.
② Unconditioned: graph decoder with dummy zero latents.
③ Estimation (proposed: token, 2-stage): token-by-token decoding + 2-stage (categorical/continuous) decoding.
④ Node, 2-stage: node-by-node decoding + 2-stage decoding.
⑤ Token, 1-stage: token-by-token decoding + single-stage autoregressive decoding.
⑥ Oracle source: the proposed method ③ with a dry-source-conditioned reference encoder.
Dry source: the dry source, or the sum of the dry sources, without any processing.
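As a minimal sketch of the dry-source baseline described above, the following sums several unprocessed tracks sample by sample. It assumes mono tracks stored as equal-length lists of floats; the function name `dry_mix` and the example values are hypothetical and are not taken from the authors' code.

```python
from typing import List, Sequence


def dry_mix(sources: Sequence[Sequence[float]]) -> List[float]:
    """Sum equal-length dry tracks sample by sample, with no processing."""
    if not sources:
        raise ValueError("need at least one dry source")
    length = len(sources[0])
    if any(len(track) != length for track in sources):
        raise ValueError("all dry sources must have the same length")
    # zip(*sources) yields one tuple per time step, holding one sample per track.
    return [sum(frame) for frame in zip(*sources)]


# Hypothetical four-sample tracks, for illustration only.
kick = [0.5, 0.0, -0.25, 0.0]
snare = [0.0, 0.25, 0.0, -0.5]
mix = dry_mix([kick, snare])
```

With a single track, the function returns that track unchanged, which matches the "(sum of) dry source(s)" wording in the legend.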


Sample #1
Ground-truth
① Autoencoding
② Unconditioned
Dry source (bypass graph)
③ Estimation (proposed: token, 2-stage)
④ Node, 2-stage
⑤ Token, 1-stage
⑥ Oracle source


Sample #2
Ground-truth
① Autoencoding
② Unconditioned
Dry source (bypass graph)
③ Estimation (proposed: token, 2-stage)
④ Node, 2-stage
⑤ Token, 1-stage
⑥ Oracle source


Sample #3
Ground-truth
① Autoencoding
② Unconditioned
Dry source (bypass graph)
③ Estimation (proposed: token, 2-stage)
④ Node, 2-stage
⑤ Token, 1-stage
⑥ Oracle source


Sample #4
Ground-truth
① Autoencoding
② Unconditioned
Dry source (bypass graph)
③ Estimation (proposed: token, 2-stage)
④ Node, 2-stage
⑤ Token, 1-stage
⑥ Oracle source


Sample #5
Ground-truth
① Autoencoding
② Unconditioned
Dry source (bypass graph)
③ Estimation (proposed: token, 2-stage)
④ Node, 2-stage
⑤ Token, 1-stage
⑥ Oracle source


Sample #6
Ground-truth
① Autoencoding
② Unconditioned
Dry source (bypass graph)
③ Estimation (proposed: token, 2-stage)
④ Node, 2-stage
⑤ Token, 1-stage
⑥ Oracle source


Sample #7
Ground-truth
① Autoencoding
② Unconditioned
Dry source (bypass graph)
③ Estimation (proposed: token, 2-stage)
④ Node, 2-stage
⑤ Token, 1-stage
⑥ Oracle source


Sample #8
Ground-truth
① Autoencoding
② Unconditioned
Dry source (bypass graph)
③ Estimation (proposed: token, 2-stage)
④ Node, 2-stage
⑤ Token, 1-stage
⑥ Oracle source


Sample #9
Ground-truth
① Autoencoding
② Unconditioned
Dry source (bypass graph)
③ Estimation (proposed: token, 2-stage)
④ Node, 2-stage
⑤ Token, 1-stage
⑥ Oracle source


Sample #10
Ground-truth
① Autoencoding
② Unconditioned
Dry source (bypass graph)
③ Estimation (proposed: token, 2-stage)
④ Node, 2-stage
⑤ Token, 1-stage
⑥ Oracle source