grafx.processors.container

class DryWet(processor, external_param=True)

Bases: Module

A utility module that mixes the input (dry) signal with the wrapped processor’s output (wet).

For each pair of an input u[n] and output signal y[n] = f(u[n]; p), where f and p denote the wrapped processor and its parameters, respectively, we mix the input and output with a dry/wet weight 0 < w < 1 as follows,

y[n] = (1 − w) u[n] + w f(u[n]; p).

Here, the dry/wet weight is further parameterized as w = σ(z_w), where z_w is an unbounded logit and σ is the logistic sigmoid. Hence, this processor’s set of learnable parameters is p ∪ {z_w}.
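As a minimal sketch (a hypothetical helper, not the library’s own implementation), the mixing rule above can be written as:

```python
import torch

def drywet_mix(dry: torch.Tensor, wet: torch.Tensor, logit: torch.Tensor) -> torch.Tensor:
    # w = sigmoid(z_w), then y[n] = (1 - w) u[n] + w f(u[n]; p)
    w = torch.sigmoid(logit)
    return (1 - w) * dry + w * wet

dry = torch.randn(2, 2, 16)                     # B x C x L input
wet = 0.5 * dry                                 # stand-in for a processor's output
y = drywet_mix(dry, wet, torch.zeros(2, 1, 1))  # z_w = 0 gives w = 0.5
```

With a zero logit the sigmoid yields w = 0.5, so the result is an equal blend of dry and wet signals.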

Parameters:
  • processor (Module) – Any SISO processor with its forward and parameter_size methods implemented properly.

  • external_param (bool, optional) – If set to True, we do not add our dry/wet weight shape to the parameter_size method. This is useful when every processor uses DryWet and it is more convenient to keep a single dry/wet tensor for all nodes instead of a separate tensor for each type (default: True).

forward(input_signals, drywet_weight, **processor_kwargs)

Processes input audio with the processor and given parameters.

Parameters:
  • input_signals (FloatTensor, B×C×L) – A batch of input audio signals that will be passed to the processor.

  • drywet_weight (FloatTensor) – A batch of dry/wet logits z_w that are mapped to weights via the logistic sigmoid.

  • **processor_kwargs (optional) – Keyword arguments (i.e., mostly parameters) that will be passed to the processor.

Returns:

A batch of output signals of shape B×C×L.

Return type:

FloatTensor

parameter_size()
Returns:

The wrapped processor’s parameter_size(), optionally added with the dry/wet weight when external_param is set to False.

Return type:

Dict[str, Tuple[int, ...]]

class SerialChain(processors)

Bases: Module

A utility module that serially connects the provided processors.

For processors f_1, …, f_K with their respective parameters p_1, …, p_K, the serial chain f = f_K ∘ ⋯ ∘ f_1 applies each processor in order, where the output of the previous processor is fed to the next one,

y[n] = (f_K ∘ ⋯ ∘ f_1)(s[n]; p_1, …, p_K).

The set of all learnable parameters is given as p = {p_1, …, p_K}.

Note that, from the audio processing perspective, exactly the same result can be achieved by connecting the processors f1,,fK as individual nodes in a graph. Yet, this module can be useful when we use the same chain of processors repeatedly so that encapsulating them in a single node is more convenient.
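The composition above can be sketched as follows. This is a hypothetical minimal re-implementation, assuming each processor’s keyword arguments are routed to it by its dictionary name; the library’s actual argument handling may differ.

```python
import torch
import torch.nn as nn

class TinySerialChain(nn.Module):
    # Hypothetical sketch: applies processors in dictionary order,
    # feeding each processor's output to the next one.
    def __init__(self, processors: dict):
        super().__init__()
        self.processors = nn.ModuleDict(processors)

    def forward(self, input_signals, **processors_kwargs):
        output = input_signals
        for name, processor in self.processors.items():
            output = processor(output, **processors_kwargs.get(name, {}))
        return output

class Gain(nn.Module):
    # Toy SISO processor: scales the signal by a scalar gain.
    def forward(self, x, gain):
        return gain * x

chain = TinySerialChain({"g1": Gain(), "g2": Gain()})
x = torch.ones(1, 1, 4)
y = chain(x, g1={"gain": 2.0}, g2={"gain": 3.0})  # applies x * 2, then * 3
```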

Parameters:

processors (Dict[str, Module]) – A dictionary of processors with their names as keys. The order of the processors will be the same as the dictionary order. We assume that each processor has its forward() and parameter_size() methods implemented properly.

forward(input_signals, **processors_kwargs)

Processes input audio with the chained processors and given parameters.

Parameters:
  • input_signals (FloatTensor, B×C×L) – A batch of input audio signals.

  • **processors_kwargs (optional) – Keyword arguments (i.e., mostly parameters) that will be passed to the processors.

Returns:

A batch of output signals of shape B×C×L.

Return type:

Tuple[FloatTensor, Dict[str, Any]]

parameter_size()
Returns:

A nested dictionary of depth at least 2 that contains each processor name as key and its parameter_size() as value.

Return type:

Dict[str, Dict[str, Union[dict, Tuple[int, ...]]]]

class ParallelMix(processors, activation='softmax')

Bases: Module

A container that mixes multiple processors’ outputs.

We create a single processor from K processors f_1, …, f_K, mixing their outputs with weights w_1, …, w_K,

y[n] = ∑_{k=1}^{K} w_k f_k(s[n]; p_k).

By default, we take the pre-activation weights w̃_1, …, w̃_K as input. Then, for each w̃_k, we apply w_k = log(1 + exp(w̃_k)) / (K log 2), making it non-negative and equal to 1/K when the pre-activation input is near zero. Alternatively, with activation='softmax', we can force the weights to sum to 1 by applying a softmax, w_k = exp(w̃_k) / ∑_{i=1}^{K} exp(w̃_i). The latter resembles differentiable architecture search (DARTS) [LSY19] if our aim is to select the best one among the K processors. The set of all learnable parameters is given as p = {w̃, p_1, …, p_K}.
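The two weight activations can be sketched as follows (a hypothetical helper illustrating the formulas, not the library’s API):

```python
import math
import torch
import torch.nn.functional as F

def parallel_weights(pre: torch.Tensor, activation: str = "softplus") -> torch.Tensor:
    # pre: pre-activation weights w~ of shape (..., K)
    K = pre.shape[-1]
    if activation == "softmax":
        # weights are positive and sum to 1
        return torch.softmax(pre, dim=-1)
    # softplus variant: w_k = log(1 + exp(w~_k)) / (K log 2),
    # so a near-zero input gives w_k close to 1/K (since softplus(0) = log 2)
    return F.softplus(pre) / (K * math.log(2))

pre = torch.zeros(3)
w_softplus = parallel_weights(pre)             # each weight equals 1/3
w_softmax = parallel_weights(pre, "softmax")   # also 1/3 each; sums to 1
```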

forward(input_signals, parallel_weights, **processors_kwargs)

Processes input audio with the processor and given parameters.

Parameters:
  • input_signals (FloatTensor, B×C×L) – A batch of input audio signals.

  • parallel_weights (FloatTensor, B×K) – A batch of pre-activation mixing weights w̃_1, …, w̃_K.

Returns:

A batch of output signals of shape B×C×L.

Return type:

FloatTensor

parameter_size()
Returns:

A nested dictionary of depth at least 2 that contains each processor name as key and its parameter_size() as value.

Return type:

Dict[str, Dict[str, Union[dict, Tuple[int, ...]]]]

class GainStagingRegularization(processor, key='gain_reg')

Bases: Module

A regularization module that wraps an audio processor and calculates the energy difference between the input and output audio. It can be used to guide the processors to mimic gain staging, a practice that aims to keep the signal energy roughly the same throughout the processing chain.

For each pair of an input u[n] and output signal y[n] = f(u[n]; p), where f and p denote the wrapped processor and the parameters, respectively, we calculate their loudness difference with an energy function g as follows,

d = |g(y[n]) − g(u[n])|.

The energy function g computes the log of the mean energy across the time and channel axes. If the signals are stereo, this is equivalent to calculating the log of the mid-channel energy.

Parameters:
  • processor (Module) – Any SISO processor with forward and parameter_size method implemented properly.

  • key (str, optional) – A dictionary key that will be used to store energy difference in the intermediate results. (default: "gain_reg")

forward(input_signals, **processor_kwargs)

Processes input audio with the processor and given parameters.

Parameters:
  • input_signals (FloatTensor, B×C×L) – A batch of input audio signals that will be passed to the processor.

  • **processor_kwargs (optional) – Keyword arguments (i.e., mostly parameters) that will be passed to the processor.

Returns:

A batch of output signals of shape B×C×L and dictionary of intermediate/auxiliary results added with the regularization loss.

Return type:

Tuple[FloatTensor, dict]

parameter_size()
Returns:

The wrapped processor’s parameter_size().

Return type:

Dict[str, Tuple[int, ...]]