grafx.processors.container
- class DryWet(processor, external_param=True)
Bases:
Module
A utility module that mixes the input (dry) with the wrapped processor’s output (wet).
For each pair of input $u[n]$ and output signal $s[n] = f(u[n]; p)$, where $f$ and $p$ denote the wrapped processor and the parameters, respectively, we mix the input and output with a dry/wet weight $0 < w < 1$ as follows,
$$ y[n] = (1 - w) \cdot u[n] + w \cdot s[n]. $$
Here, the dry/wet weight is further parameterized as $w = \sigma(z)$, where $z \in \mathbb{R}$ is an unbounded logit and $\sigma$ is the logistic sigmoid. Hence, this processor’s learnable parameter is $p \cup \{z\}$.
- Parameters:
processor (Module) – Any SISO processor with forward and parameter_size methods implemented properly.
external_param (bool, optional) – If set to True, we do not add our dry/wet weight shape to the parameter_size method. This is useful when every processor uses DryWet and it is more convenient to have a single dry/wet tensor for all nodes instead of keeping a tensor for each type (default: True).
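The mixing rule above can be sketched in plain Python (a minimal, framework-free illustration; the actual module operates on batched tensors, and the helper name `drywet_mix` is hypothetical):

```python
import math

def drywet_mix(dry, wet, logit):
    """Blend a dry (input) and wet (processed) signal with w = sigmoid(logit)."""
    w = 1.0 / (1.0 + math.exp(-logit))  # logistic sigmoid maps the logit to (0, 1)
    return [(1.0 - w) * d + w * s for d, s in zip(dry, wet)]
```

A logit of zero yields an equal 50/50 blend; large positive logits approach the fully wet signal.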
- forward(input_signals, drywet_weight, **processor_kwargs)
Processes input audio with the processor and given parameters.
- Parameters:
input_signals (FloatTensor) – A batch of input audio signals that will be passed to the processor.
drywet_weight (FloatTensor) – A batch of unbounded logits $z$ that determine the dry/wet mixing weights.
**processor_kwargs (optional) – Keyword arguments (i.e., mostly parameters) that will be passed to the processor.
- Returns:
A batch of output signals with the same shape as the input.
- Return type:
FloatTensor
- parameter_size()
- Returns:
The wrapped processor’s parameter_size(), optionally augmented with the dry/wet weight shape when external_param is set to False.
- Return type:
Dict[str, Tuple[int, ...]]
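The effect of external_param on what parameter_size() reports can be sketched as follows (plain-Python stand-in; the dictionary key name and the scalar shape are hypothetical):

```python
def drywet_parameter_size(processor_size, external_param=True):
    """Return the wrapped processor's sizes, plus the dry/wet logit if kept internal."""
    size = dict(processor_size)       # copy the wrapped processor's parameter sizes
    if not external_param:
        size["drywet_weight"] = (1,)  # hypothetical key: one logit per node
    return size
```

With external_param=True (the default), the dry/wet logit is expected to be supplied externally, so it never appears in the reported sizes.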
- class SerialChain(processors)
Bases:
Module
A utility module that serially connects the provided processors.
For processors $f_1, \cdots, f_K$ with their respective parameters $p_1, \cdots, p_K$, the serial chain computes
$$ y[n] = (f_K \circ \cdots \circ f_1)(u[n]; p_1, \cdots, p_K), $$
i.e., each processor is applied in order and the output of the previous processor is fed to the next one. The set of all learnable parameters is given as $\{p_1, \cdots, p_K\}$.
Note that, from the audio processing perspective, exactly the same result can be achieved by connecting the processors $f_1, \cdots, f_K$ as individual nodes in a graph. Yet, this module can be useful when we use the same chain of processors repeatedly, so that encapsulating them in a single node is more convenient.
- Parameters:
processors (Dict[str, Module]) – A dictionary of processors with their names as keys. The order of the processors will be the same as the dictionary order. We assume that each processor has forward() and parameter_size() methods implemented properly.
- forward(input_signals, **processors_kwargs)
Processes input audio with the processor and given parameters.
- Parameters:
input_signals (FloatTensor) – A batch of input audio signals.
**processors_kwargs (optional) – Keyword arguments (i.e., mostly parameters) that will be passed to each processor.
- Returns:
A batch of output signals with the same shape as the input, along with a dictionary of intermediate results collected from the processors.
- Return type:
Tuple[FloatTensor, Dict[str, Any]]
- parameter_size()
- Returns:
A nested dictionary of depth at least 2 that contains each processor name as a key and its parameter_size() as the value.
- Return type:
Dict[str, Dict[str, Union[dict, Tuple[int, ...]]]]
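The serial application can be sketched as follows (a plain-Python stand-in for the tensor version; the helper name is illustrative):

```python
def serial_chain(processors, signal, params):
    """Apply each named processor in dictionary order, feeding each output onward."""
    for name, fn in processors.items():
        signal = fn(signal, **params.get(name, {}))
    return signal
```

For example, a doubling gain followed by a constant offset applied to [1.0, 2.0] yields [3.0, 5.0]; swapping the dictionary order would give a different result, which is why the order of the processors matters.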
- class ParallelMix(processors, activation='softmax')
Bases:
Module
A container that mixes the outputs of multiple processors.
We create a single processor from $N$ processors $f_1, \cdots, f_N$, mixing their outputs with weights $w_1, \cdots, w_N$,
$$ y[n] = \sum_{i=1}^{N} w_i f_i(u[n]; p_i). $$
By default, we take the pre-activation weights $\tilde{w}_1, \cdots, \tilde{w}_N$ as input. Then, for each $\tilde{w}_i$, we apply $w_i = \exp(\tilde{w}_i)/N$, making it non-negative and equal to $1/N$ when the pre-activation input is near zero. Also, we can force the weights to have a sum of $1$ by applying a softmax, $w_i = \exp(\tilde{w}_i) / \sum_{j=1}^{N} \exp(\tilde{w}_j)$. This resembles differentiable architecture search (DARTS) [LSY19] if our aim is to select the best one among the $N$ processors. The set of all learnable parameters is given as $\{\tilde{w}_1, \cdots, \tilde{w}_N\} \cup \{p_1, \cdots, p_N\}$.
- forward(input_signals, parallel_weights, **processors_kwargs)
Processes input audio with the processor and given parameters.
- Parameters:
input_signals (FloatTensor) – A batch of input audio signals.
parallel_weights (FloatTensor) – A batch of pre-activation mixing weights $\tilde{w}_1, \cdots, \tilde{w}_N$.
**processors_kwargs (optional) – Keyword arguments (i.e., mostly parameters) that will be passed to each processor.
- Returns:
A batch of output signals with the same shape as the input.
- Return type:
FloatTensor
- parameter_size()
- Returns:
A nested dictionary of depth at least 2 that contains each processor name as a key and its parameter_size() as the value.
- Return type:
Dict[str, Dict[str, Union[dict, Tuple[int, ...]]]]
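The weighted mix can be sketched in plain Python (the non-softmax branch follows the $\exp(\tilde{w}_i)/N$ reading above and should be checked against the implementation; function and argument names are illustrative):

```python
import math

def parallel_mix(outputs, pre_weights, activation="softmax"):
    """Mix N equal-length processor outputs with weights derived from pre_weights."""
    n = len(outputs)
    if activation == "softmax":
        exps = [math.exp(z) for z in pre_weights]
        total = sum(exps)
        weights = [e / total for e in exps]               # weights sum to 1
    else:
        weights = [math.exp(z) / n for z in pre_weights]  # ~1/N near zero
    length = len(outputs[0])
    return [sum(w * out[i] for w, out in zip(weights, outputs))
            for i in range(length)]
```

With all pre-activation weights at zero, both activations reduce to a uniform $1/N$ average of the processor outputs.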
- class GainStagingRegularization(processor, key='gain_reg')
Bases:
Module
A regularization module that wraps an audio processor and calculates the energy differences between the input and output audio. It can be used to guide the processors to mimic gain staging, a practice that aims to keep the signal energy roughly the same throughout the processing chain.
For each pair of input $u[n]$ and output signal $s[n] = f(u[n]; p)$, where $f$ and $p$ denote the wrapped processor and the parameters, respectively, we calculate their loudness difference with an energy function $g$ as follows,
$$ d = \left| g(s) - g(u) \right|. $$
The energy function $g$ computes the log of the mean energy across the time and channel axes. If the signals are stereo, then it is equivalent to calculating the log of the mid-channel energy.
- Parameters:
processor (Module) – Any SISO processor with forward and parameter_size methods implemented properly.
key (str, optional) – A dictionary key that will be used to store the energy difference in the intermediate results (default: "gain_reg").
- forward(input_signals, **processor_kwargs)
Processes input audio with the processor and given parameters.
- Parameters:
input_signals (FloatTensor) – A batch of input audio signals that will be passed to the processor.
**processor_kwargs (optional) – Keyword arguments (i.e., mostly parameters) that will be passed to the processor.
- Returns:
A batch of output signals with the same shape as the input, along with a dictionary of intermediate/auxiliary results that includes the regularization loss.
- Return type:
Tuple[FloatTensor, dict]
- parameter_size()
- Returns:
The wrapped processor’s parameter_size().
- Return type:
Dict[str, Tuple[int, ...]]
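A rough sketch of the energy-difference computation in plain Python (here the channels are averaged first, matching the stereo mid-channel remark above, but the exact reduction order should be checked against the implementation):

```python
import math

def log_energy(signal):
    """signal: list of channels (each a list of samples). Average the channels
    (the mid signal for stereo), then take the log of the mean squared value."""
    num_ch, length = len(signal), len(signal[0])
    mid = [sum(ch[i] for ch in signal) / num_ch for i in range(length)]
    return math.log(sum(x * x for x in mid) / length)

def gain_reg(input_signal, output_signal):
    """Absolute log-energy difference used as the gain-staging penalty."""
    return abs(log_energy(output_signal) - log_energy(input_signal))
```

Doubling the amplitude quadruples the energy, so the penalty for a 2x gain is $\log 4$; a processor that preserves signal energy incurs zero penalty.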