grafx.processors.dynamics
- class Compressor(energy_smoother='iir', gain_smoother=None, gain_smooth_in_log=False, knee='quadratic', iir_len=16384, flashfftconv=True, max_input_len=131072)
Bases: Module
A feed-forward dynamic range compressor [GMR12].
We first calculate the mean input energy \(e[n]\) across all channels. Then, we optionally smooth it into an envelope \(g_u[n]\), whose logarithm is the log-energy envelope \(G_u[n] = \log g_u[n]\).
\[ g_u[n] = \alpha[n] g_u[n-1]+(1-\alpha[n]) e[n]. \]
There are two options for this smoothing. If we use the "ballistics" mode, the coefficient \(\alpha[n]\) is set to one constant during an “attack” (where \(g_u[n]\) increases) and to another during a “release” (where \(g_u[n]\) decreases). For this case, we use an optimized backend, diffcomp [YMC+24]. To achieve further speedup, we can use the "iir" mode, which restricts both coefficients to the same value \(\alpha\) [SBR22]. This simplifies the above equation to a one-pole IIR filter, so we can compute its impulse response up to a certain length \(N\) and convolve it with the energy to approximate the envelope.
\[ g_u[n] \approx e[n] * (1-\alpha)\alpha^n. \]
Next, we compute the output (compressed) envelope \(G_y[n]\). We provide three options for the knee shape: "quadratic", "hard", and "exponential". First, the quadratic knee gives the following output envelope,
\[\begin{split} G_y[n] = \begin{cases} G_y^\mathrm{above}[n] & G_u[n] \geq T+W, \\ G_y^\mathrm{mid}[n] & T-W \leq G_u[n] < T+W, \\ G_y^\mathrm{below}[n] & G_u[n] < T-W \end{cases} \end{split}\]
where \(T\) and \(W\) are the threshold and the knee width (both in the log domain), respectively. The output envelopes are computed as
\[\begin{split} G_y^\mathrm{above}[n] &= T+\frac{G_u[n]-T}{R}, \\ G_y^\mathrm{mid}[n] &= G_u[n] + \Big(\frac{1}{R}-1\Big)\frac{(G_u[n]-T+W)^2}{4W}, \\ G_y^\mathrm{below}[n] &= G_u[n]. \end{split}\]
From the quadratic knee, we can obtain the hard knee by letting \(W \to 0\). If we use the exponential knee, there is no conditional branch and the output envelope is given as
\[ G_y[n] = G_u[n] + (1 - R) \frac{\log (1 + \exp(W \cdot (T - G_u[n])))}{W}. \]
Finally, we compute the gain reduction curve
\[ g[n] = \exp(G_y[n] - G_u[n]). \]
Before multiplying it with all channels, we can optionally smooth it (like the energy smoothing) with a one-pole IIR or ballistics filter.
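The two smoothing modes can be illustrated with a minimal pure-Python sketch. It is not the library's implementation (which relies on diffcomp and FFT-based convolution); the function names `ballistics` and `truncated_iir` and the toy signal are assumptions made for illustration only.

```python
def ballistics(x, alpha_attack, alpha_release):
    """One-pole smoother g[n] = a[n] * g[n-1] + (1 - a[n]) * x[n].

    a[n] is alpha_attack while the envelope rises (x[n] > g[n-1])
    and alpha_release otherwise.
    """
    out, prev = [], 0.0
    for sample in x:
        alpha = alpha_attack if sample > prev else alpha_release
        prev = alpha * prev + (1.0 - alpha) * sample
        out.append(prev)
    return out


def truncated_iir(x, alpha, length):
    """Approximate the single-coefficient smoother by a causal convolution
    with the first `length` taps of its impulse response (1 - alpha) * alpha^n."""
    taps = [(1.0 - alpha) * alpha**k for k in range(length)]
    return [
        sum(taps[k] * x[n - k] for k in range(min(n + 1, length)))
        for n in range(len(x))
    ]


# Toy energy signal (hypothetical): silence, a loud burst, then silence again.
energy = [0.0] * 20 + [1.0] * 60 + [0.0] * 120

# With equal coefficients, the ballistics recursion is a plain one-pole IIR filter.
exact = ballistics(energy, 0.9, 0.9)
# The same filter approximated by a 64-tap truncated impulse response.
approx = truncated_iir(energy, 0.9, 64)
max_error = max(abs(a - b) for a, b in zip(exact, approx))

# Different coefficients: the envelope rises quickly and decays slowly.
fast_attack = ballistics(energy, 0.5, 0.99)
```

The truncation error is bounded by the discarded tail of the impulse response, which shrinks as \(\alpha^N\); this is why a longer `iir_len` is needed when \(\alpha\) is close to 1.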
This compressor’s learnable parameter is \(p = \{ z_{\alpha}^{\mathrm{pre}}, z_{\alpha}^{\mathrm{post}}, T, \bar{R}, W_{\mathrm{log}} \}\). The smoothing filter coefficients are recovered with a logistic sigmoid \(\alpha = \sigma (z_{\alpha})\). The ratio is recovered with \(R = 1 + \exp (\bar{R})\). Finally, the knee width is obtained with \(W = \exp (W_{\mathrm{log}})\).
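As a small illustration of these mappings, the sketch below recovers the constrained parameters from unconstrained scalars. The helper `recover_parameters` and its dictionary keys are hypothetical names, not part of the library; the threshold \(T\) is used as-is and therefore does not appear.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def recover_parameters(z_alpha_pre, z_alpha_post, log_ratio, log_knee):
    """Map unconstrained values to valid compressor parameters."""
    return {
        "alpha_pre": sigmoid(z_alpha_pre),    # smoothing coefficient in (0, 1)
        "alpha_post": sigmoid(z_alpha_post),  # smoothing coefficient in (0, 1)
        "ratio": 1.0 + math.exp(log_ratio),   # R = 1 + exp(R_bar), so R > 1
        "knee": math.exp(log_knee),           # W = exp(W_log), so W > 0
    }


params = recover_parameters(
    z_alpha_pre=2.0, z_alpha_post=-1.0, log_ratio=0.0, log_knee=-3.0
)
```

Because every mapping is smooth and its output range is always valid, the raw values can be optimized with gradient descent without any clipping.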
- Parameters:
energy_smoother (str or None, optional) – The type of energy smoother to use. It can be either "iir" or "ballistics"; if set to None, the energy envelope is computed without any smoothing (default: "iir").
gain_smoother (str or None, optional) – The type of gain smoother to use. It can be either "iir" or "ballistics"; if set to None, the gain reduction is computed without any smoothing (default: None).
gain_smooth_in_log (bool, optional) – An option to smooth the gain reduction in the log domain (default: False).
knee (str, optional) – The type of knee shape to use. It can be "hard", "quadratic", or "exponential" (default: "quadratic").
iir_len (int, optional) – The length of the smoothing FIR, i.e., the truncated impulse response of the one-pole filter (default: 16384).
flashfftconv (bool, optional) – An option to use FlashFFTConv [FKNRe23] as a backend to perform the causal convolution in the gain smoothing stage efficiently (default: True).
max_input_len (int, optional) – When flashfftconv is set to True, the maximum input length must also be given (default: 2**17).
- forward(input_signals, log_threshold, log_ratio, log_knee=None, z_alpha_pre=None, z_alpha_post=None)
Processes input audio with the processor and given parameters.
- Parameters:
input_signals (FloatTensor, \(N \times C \times L\)) – A batch of input audio signals.
z_alpha_pre, z_alpha_post (FloatTensor, \(N \times 1\), optional) – IIR coefficients of the energy smoother and the gain smoother, respectively, before applying the sigmoid (default: None).
log_threshold (FloatTensor, \(N \times 1\)) – Compression threshold in log scale.
log_ratio (FloatTensor, \(N \times 1\)) – Unconstrained ratio values, which will be transformed into the range of \((1, \infty)\).
log_knee (FloatTensor, \(N \times 1\), optional) – Log of knee values (default: None).
- Returns:
A batch of output signals of shape \(N \times C \times L\).
- Return type:
FloatTensor
- parameter_size()
- Returns:
A dictionary that contains each parameter tensor’s shape.
- Return type:
Dict[str, Tuple[int, ...]]
- static gain_hard_knee(log_energy, log_threshold, log_ratio, _)
Compute log-compression gain with the hard knee.
- static gain_quad_knee(log_energy, log_threshold, log_ratio, log_knee)
Compute log-compression gain with the quadratic knee.
- static gain_exp_knee(log_energy, log_threshold, log_ratio, log_knee)
Compute log-compression gain with the exponential knee.
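To make the hard and quadratic curves concrete, here is a scalar sketch of the log-domain gain \(G_y[n] - G_u[n]\) derived from the equations above. It is an illustration rather than the library code: the function names are hypothetical, and the functions take the already recovered \(R\) and \(W\) instead of the `log_ratio` and `log_knee` arguments of the static methods.

```python
import math


def hard_knee_log_gain(G_u, T, R):
    """Log-gain G_y - G_u for the hard knee (the W -> 0 limit)."""
    return (1.0 / R - 1.0) * max(G_u - T, 0.0)


def quad_knee_log_gain(G_u, T, R, W):
    """Log-gain G_y - G_u for the quadratic knee over [T - W, T + W)."""
    if G_u >= T + W:   # above the knee: full compression ratio
        return (1.0 / R - 1.0) * (G_u - T)
    if G_u >= T - W:   # inside the knee: quadratic transition
        return (1.0 / R - 1.0) * (G_u - T + W) ** 2 / (4.0 * W)
    return 0.0         # below the knee: signal is left untouched


# Hypothetical settings: log-threshold -3, ratio 4, knee width 0.5.
T, R, W = -3.0, 4.0, 0.5
levels = [-5.0, -3.5, -3.0, -2.5, -1.0]
log_gains = [quad_knee_log_gain(G, T, R, W) for G in levels]
linear_gains = [math.exp(g) for g in log_gains]  # g[n] = exp(G_y[n] - G_u[n])
```

The two branches of the quadratic knee meet at both ends of the knee region, so the resulting gain curve is continuous.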
- class NoiseGate(energy_smoother='iir', gain_smoother=None, gain_smooth_in_log=False, knee='quadratic', iir_len=16384, flashfftconv=True, max_input_len=131072)
Bases: Module
A feed-forward noise gate [GMR12].
This processor is identical to the Compressor except for the output gain computation. Instead of reducing the level above the threshold, it reduces the level below the threshold. For the quadratic knee, the output envelopes are computed as
\[\begin{split} G_y^\mathrm{above}[n] &= G_u[n], \\ G_y^\mathrm{mid}[n] &= G_u[n] + (1-R)\frac{(G_u[n]-T-W)^2}{4W}, \\ G_y^\mathrm{below}[n] &= T+R(G_u[n]-T). \end{split}\]
Or, if we use the exponential knee, the output envelope is given as
\[ G_y[n] = G_u[n] + \Big(1 - R\Big) \frac{\log (1 + \exp(W \cdot (G_u[n] - T)))}{W}. \]
Again, this processor’s learnable parameter is \(p = \{ z_{\alpha}^{\mathrm{pre}}, z_{\alpha}^{\mathrm{post}}, T, \bar{R}, W_{\mathrm{log}} \}\).
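The quadratic-knee equations of the noise gate can be sketched as a scalar function of the log-domain gain \(G_y[n] - G_u[n]\). This is an illustrative sketch, not the library code; the name `gate_quad_knee_log_gain` and the example settings are assumptions, and the function takes the recovered \(R\) and \(W\) directly.

```python
import math


def gate_quad_knee_log_gain(G_u, T, R, W):
    """Log-gain G_y - G_u of the noise gate with a quadratic knee."""
    if G_u >= T + W:   # above the knee: signal passes unchanged
        return 0.0
    if G_u >= T - W:   # inside the knee: quadratic transition
        return (1.0 - R) * (G_u - T - W) ** 2 / (4.0 * W)
    # Below the knee: G_y = T + R (G_u - T), so G_y - G_u = (R - 1)(G_u - T).
    return (R - 1.0) * (G_u - T)


# Hypothetical settings: log-threshold -6, ratio 3, knee width 1.
T, R, W = -6.0, 3.0, 1.0
levels = [-10.0, -8.0, -6.0, -4.0]
log_gains = [gate_quad_knee_log_gain(G, T, R, W) for G in levels]
linear_gains = [math.exp(g) for g in log_gains]  # g[n] = exp(G_y[n] - G_u[n])
```

The quieter the input falls below the threshold, the more negative the log-gain becomes, which is the gating behavior described above.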
- Parameters:
energy_smoother (str or None, optional) – The type of energy smoother to use. It can be either "iir" or "ballistics"; if set to None, the energy envelope is computed without any smoothing (default: "iir").
gain_smoother (str or None, optional) – The type of gain smoother to use. It can be either "iir" or "ballistics"; if set to None, the gain reduction is computed without any smoothing (default: None).
gain_smooth_in_log (bool, optional) – An option to smooth the gain reduction in the log domain (default: False).
knee (str, optional) – The type of knee shape to use. It can be "hard", "quadratic", or "exponential" (default: "quadratic").
iir_len (int, optional) – The length of the smoothing FIR, i.e., the truncated impulse response of the one-pole filter (default: 16384).
flashfftconv (bool, optional) – An option to use FlashFFTConv [FKNRe23] as a backend to perform the causal convolution in the gain smoothing stage efficiently (default: True).
max_input_len (int, optional) – When flashfftconv is set to True, the maximum input length must also be given (default: 2**17).
- forward(input_signals, log_threshold, log_ratio, log_knee=None, z_alpha_pre=None, z_alpha_post=None)
Processes input audio with the processor and given parameters.
- Parameters:
input_signals (FloatTensor, \(N \times C \times L\)) – A batch of input audio signals.
z_alpha_pre, z_alpha_post (FloatTensor, \(N \times 1\), optional) – IIR coefficients of the energy smoother and the gain smoother, respectively, before applying the sigmoid (default: None).
log_threshold (FloatTensor, \(N \times 1\)) – Gate threshold in log scale.
log_ratio (FloatTensor, \(N \times 1\)) – Unconstrained ratio values, which will be transformed into the range of \((1, \infty)\).
log_knee (FloatTensor, \(N \times 1\), optional) – Log of knee values (default: None).
- Returns:
A batch of output signals of shape \(N \times C \times L\).
- Return type:
FloatTensor
- parameter_size()
- Returns:
A dictionary that contains each parameter tensor’s shape.
- Return type:
Dict[str, Tuple[int, ...]]
- static gain_hard_knee(log_energy, log_threshold, log_ratio, _)
Compute log-compression gain with the hard knee.
- static gain_quad_knee(log_energy, log_threshold, log_ratio, log_knee)
Compute log-compression gain with the quadratic knee.
- static gain_exp_knee(log_energy, log_threshold, log_ratio, log_knee)
Compute log-compression gain with the exponential knee.