Audio (Signal, TimeData, FrequencyData)#

The spherical harmonic (SH) audio classes store audio data in the SH domain. Please refer to the Spherical Harmonic Definitions page for more general information.

The spherical harmonic audio classes are built upon the pyfar audio classes, and we recommend becoming familiar with those classes before continuing.

In addition to all functionality provided by the pyfar audio classes, the spherical harmonic audio classes store the parameters that define the spherical harmonics: basis_type, normalization, channel_convention, and the condon_shortley phase convention. The last dimension of the channel shape must always match a valid number of spherical harmonics, i.e. \((N+1)^2\), where \(N\) is the spherical harmonic order for which the audio data is created. The spherical harmonic order of the data contained in the signal can be accessed through the property n_max.
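
The relation between the spherical harmonic order and the number of SH channels can be sketched in plain Python. The helper names `n_sh_channels` and `order_from_channels` are hypothetical and not part of spharpy; they only illustrate the \((N+1)^2\) rule stated above.

```python
from math import isqrt


def n_sh_channels(n_max):
    """Number of SH channels for order n_max, i.e. (n_max + 1)**2."""
    return (n_max + 1) ** 2


def order_from_channels(n_sh):
    """Hypothetical helper: infer the SH order from a channel count."""
    root = isqrt(n_sh)
    if n_sh < 1 or root * root != n_sh:
        raise ValueError(f"{n_sh} is not a valid number of SH channels")
    return root - 1


print([n_sh_channels(n) for n in range(4)])  # [1, 4, 9, 16]
print(order_from_channels(4))  # 1
```

A channel count such as 5 is rejected because it is not a perfect square.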

An SH signal can be created either directly

>>> import spharpy
>>> data = [[0, 0],  # data of first SH channel
...         [1, 1],  # data of second SH channel
...         [2, 2],  # data of third SH channel
...         [3, 4]]  # data of fourth SH channel
>>> sh_signal = spharpy.SphericalHarmonicSignal(
...     data, 44100, basis_type='real', normalization='N3D',
...     channel_convention='ACN', condon_shortley=False)

or from an SH definition

>>> # create an SH definition with default parameters
>>> definition = spharpy.SphericalHarmonicDefinition()
>>> sh_signal = spharpy.SphericalHarmonicSignal.from_definition(
...     definition, data, 44100)

Both examples create a first order SH signal with four SH channels and two time samples at a sampling rate of 44.1 kHz.

Classes:

SphericalHarmonicFrequencyData(data, ...[, ...])

Create spherical harmonic audio object with frequency domain spherical harmonic coefficients and frequencies.

SphericalHarmonicTimeData(data, times, ...)

Create spherical harmonic audio object with time domain spherical harmonic coefficients and times.

SphericalHarmonicSignal(data, sampling_rate, ...)

Create audio object with spherical harmonic coefficients in time or frequency domain.

class spharpy.SphericalHarmonicSignal(data, sampling_rate, basis_type, normalization, channel_convention, condon_shortley, n_samples=None, domain='time', fft_norm='none', comment='', is_complex=False)[source]#

Create audio object with spherical harmonic coefficients in time or frequency domain.

Objects of this class contain spherical harmonic coefficients which are directly convertible between time and frequency domain (equally spaced samples and frequency bins). They are likewise convertible between the channel conventions ACN and FuMa and between the normalizations N3D, SN3D, and maxN, see [1]. The definition of the spherical harmonic basis functions is based on the scipy convention, which includes the Condon-Shortley phase, [2], [3].
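
As an illustration of the ACN channel convention, the following sketch maps between the SH order and degree (n, m) and the channel index. It assumes the commonly used definition ACN = n² + n + m; the helper names are hypothetical and the code is not taken from spharpy.

```python
from math import isqrt


def acn_index(n, m):
    """Channel index for order n and degree m, assuming ACN = n**2 + n + m."""
    if n < 0 or abs(m) > n:
        raise ValueError("require n >= 0 and -n <= m <= n")
    return n * n + n + m


def acn_to_nm(acn):
    """Inverse mapping from an ACN channel index to (n, m)."""
    n = isqrt(acn)
    return n, acn - n * n - n


print([acn_to_nm(acn) for acn in range(4)])
# [(0, 0), (1, -1), (1, 0), (1, 1)]
print(acn_index(2, -2))  # 4
```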

Parameters:
  • data (ndarray, double) – Raw data of the spherical harmonic signal in the time or frequency domain. The data should have at least 2 dimensions, with the last dimension representing the time domain samples/frequency domain bins, the second to last the spherical harmonic coefficients, and any leading dimensions representing optional channels. Accordingly, the data should follow the ‘C’ memory layout, e.g. data of shape = (1, 4, 1024) has 1 channel with 4 spherical harmonic coefficients with 1024 samples or frequency bins each. Time data is converted to float. Frequency data is converted to complex and must be provided as single sided spectra, i.e., for all frequencies between 0 Hz and half the sampling rate.

  • sampling_rate (double) – Sampling rate in Hz

  • basis_type (str) – Type of spherical harmonic basis, either 'complex' or 'real'.

  • normalization (str) – Normalization convention, either 'N3D', 'NM', 'SN3D', 'SNM', or 'maxN'. 'maxN' is only supported up to 3rd order.

  • channel_convention (str) – Channel ordering convention, either 'ACN' or 'FuMa'. 'FuMa' is only supported up to 3rd order.

  • condon_shortley (bool or str) – Whether to include the Condon-Shortley phase term. If True, Condon-Shortley is included, if False it is not included. 'auto' corresponds to True for complex basis_type and False for real basis_type.

  • n_samples (int, optional) – Number of time domain samples. Only relevant if domain is 'freq'. The default is None, which assumes an even number of samples if the data is provided in the frequency domain.

  • domain ('time', 'freq', optional) – Domain of data. The default is 'time'

  • fft_norm (str, optional) – The normalization of the Discrete Fourier Transform (DFT). Can be 'none', 'unitary', 'amplitude', 'rms', 'power', or 'psd'. See normalization for more information. The default is 'none', which is typically used for energy signals, such as impulse responses.

  • comment (str) – A comment related to data. The default is ''.

  • is_complex (bool, optional) – Specifies if the underlying time domain data are complex or real-valued. If True and domain is 'time', the input data will be cast to complex. The default is False.
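
Why n_samples matters for frequency domain input can be seen from the single sided spectrum itself. The following NumPy-only sketch (independent of spharpy) shows that an even and an odd number of samples yield the same number of bins, so the length cannot be recovered from the spectrum alone.

```python
import numpy as np

# Two time signals that differ in length by one sample.
n_bins = {}
for n_samples in (1024, 1025):
    impulse = np.zeros(n_samples)
    impulse[0] = 1.0
    # rfft returns the single sided spectrum (0 Hz up to half the sampling rate)
    n_bins[n_samples] = np.fft.rfft(impulse).shape[-1]

print(n_bins)  # {1024: 513, 1025: 513}

# Without further information, only the even length can be assumed.
assumed_n_samples = 2 * (n_bins[1024] - 1)
print(assumed_n_samples)  # 1024
```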

References

Attributes:

T

Shorthand for Signal.transpose().

basis_type

Get or set the spherical harmonic basis type.

cdim

Return channel dimension.

channel_convention

Get or set the channel ordering convention.

comment

Get comment.

complex

Return or set the flag indicating if the time data is complex.

condon_shortley

Get or set the Condon-Shortley phase term.

cshape

Return channel shape.

domain

The domain the data is stored in.

fft_norm

The normalization for the Discrete Fourier Transform (DFT).

freq

Return or set the data in the frequency domain.

freq_raw

Return or set the frequency domain data without normalization.

frequencies

Frequencies of the discrete signal spectrum.

n_bins

Number of frequency bins.

n_max

Get or set the spherical harmonic order.

n_samples

The number of samples.

normalization

Get or set the normalization convention.

sampling_rate

The sampling rate of the signal.

signal_length

The length of the data in seconds.

signal_type

The signal type is 'energy' if fft_norm is 'none' and 'power' otherwise.

time

Return or set the time data.

times

Time instances the signal is sampled at.

Methods:

copy()

Return a copy of the audio object.

find_nearest_frequency(value)

Return the index that is closest to the query frequency.

find_nearest_time(value)

Return the index that is closest to the query time.

flatten()

Return flattened copy of the audio object.

from_definition(sh_definition, data, ...[, ...])

Create a SphericalHarmonicSignal class object from SphericalHarmonicDefinition object, data, and sampling rate.

reshape(newshape)

Return reshaped copy of the audio object.

transpose(*axes)

Transpose time/frequency data and return copy of the audio object.

property T#

Shorthand for Signal.transpose().

property basis_type#

Get or set the spherical harmonic basis type.

property cdim#

Return channel dimension.

The channel dimension (cdim) gives the number of dimensions of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects. Therefore it is equivalent to the length of the channel shape (cshape) (e.g. self.cshape = (2, 3); self.cdim = 2).
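
The relation between the raw data shape, cshape, and cdim can be mimicked with a plain NumPy array. This is only a sketch of the shape bookkeeping described above, not of the class itself.

```python
import numpy as np

# 2 channels, 4 SH coefficients (first order), 1024 samples
data = np.zeros((2, 4, 1024))

cshape = data.shape[:-1]  # everything except the sample/bin axis
cdim = len(cshape)

print(cshape)  # (2, 4)
print(cdim)  # 2
```

For the spherical harmonic classes, the last entry of cshape is the number of SH channels, here 4.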

property channel_convention#

Get or set the channel ordering convention.

property comment#

Get comment.

property complex#

Return or set the flag indicating if the time data is complex.

property condon_shortley#

Get or set the Condon-Shortley phase term.

copy()#

Return a copy of the audio object.

property cshape#

Return channel shape.

The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.

property domain#

The domain the data is stored in.

property fft_norm#

The normalization for the Discrete Fourier Transform (DFT).

See normalization and arithmetic operations for more information.

find_nearest_frequency(value)#

Return the index that is closest to the query frequency.

Parameters:

value (float, array-like) – The frequencies for which the indices are to be returned

Returns:

indices – The index for the given frequency. If the input was array-like, a numpy array of indices is returned.

Return type:

int, array-like
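
A nearest-index lookup of this kind can be sketched with NumPy as follows. The helper `find_nearest` is hypothetical and only mirrors the documented behavior (int for scalar input, array for array-like input); the actual implementation may differ, for example in how ties are resolved.

```python
import numpy as np


def find_nearest(grid, value):
    """Hypothetical helper: index of the grid point closest to each value."""
    grid = np.asarray(grid, dtype=float)
    value = np.asarray(value, dtype=float)
    # Compare every query value against every grid point.
    distance = np.abs(grid - value[..., np.newaxis])
    indices = np.argmin(distance, axis=-1)
    return int(indices) if indices.ndim == 0 else indices


frequencies = np.fft.rfftfreq(8, d=1 / 8000)  # 0, 1000, ..., 4000 Hz
print(find_nearest(frequencies, 1400))  # 1
print(find_nearest(frequencies, [0, 2600, 5000]))  # [0 3 4]
```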

find_nearest_time(value)#

Return the index that is closest to the query time.

Parameters:

value (float, array-like) – The times for which the indices are to be returned

Returns:

indices – The index for the given time instance. If the input was array-like, a numpy array of indices is returned.

Return type:

int, array-like

flatten()#

Return flattened copy of the audio object.

Returns:

flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)

Return type:

Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.
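
The effect of flattening on the underlying array can be sketched with NumPy. This assumes the usual C ordering of the channel axes and does not use the audio classes themselves.

```python
import numpy as np

# cshape = (4, 3), n_samples = 512
data = np.arange(4 * 3 * 512).reshape(4, 3, 512)

# Merge all channel axes, keep the sample axis untouched.
flat = data.reshape(-1, data.shape[-1])
print(flat.shape)  # (12, 512)

# In C order, channel (i, j) ends up at flat index i * 3 + j.
i, j = 2, 1
print(np.array_equal(flat[i * 3 + j], data[i, j]))  # True
```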

property freq#

Return or set the data in the frequency domain.

property freq_raw#

Return or set the frequency domain data without normalization.

property frequencies#

Frequencies of the discrete signal spectrum.
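
For equally spaced samples, the frequency axis of a single sided spectrum can be computed with NumPy. The sketch below assumes standard DFT bin spacing for a real-valued time signal; it is an illustration and not the class implementation.

```python
import numpy as np

sampling_rate = 44100
n_samples = 1024

# Bin frequencies from 0 Hz up to half the sampling rate
frequencies = np.fft.rfftfreq(n_samples, d=1 / sampling_rate)

print(frequencies.size)  # 513
print(frequencies[0])  # 0.0
print(np.isclose(frequencies[-1], sampling_rate / 2))  # True
```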

classmethod from_definition(sh_definition, data, sampling_rate, domain='time', fft_norm='none', comment='', is_complex=False)[source]#

Create a SphericalHarmonicSignal class object from SphericalHarmonicDefinition object, data, and sampling rate.

Parameters:
  • sh_definition (SphericalHarmonicDefinition) – The spherical harmonic definition.

  • data (ndarray, double) – Raw data of the spherical harmonic signal in the time or frequency domain. The data should have at least 2 dimensions, with the last dimension representing the time domain samples/frequency domain bins, the second to last the spherical harmonic coefficients, and any leading dimensions representing optional channels. Accordingly, the data should follow the ‘C’ memory layout, e.g. data of shape = (1, 4, 1024) has 1 channel with 4 spherical harmonic coefficients with 1024 samples or frequency bins each. Time data is converted to float. Frequency data is converted to complex and must be provided as single sided spectra, i.e., for all frequencies between 0 Hz and half the sampling rate.

  • sampling_rate (double) – Sampling rate in Hz

  • domain ('time', 'freq', optional) – Domain of data. The default is 'time'

  • fft_norm (str, optional) – The normalization of the Discrete Fourier Transform (DFT). Can be 'none', 'unitary', 'amplitude', 'rms', 'power', or 'psd'. See normalization for more information. The default is 'none', which is typically used for energy signals, such as impulse responses.

  • comment (str) – A comment related to data. The default is ''.

  • is_complex (bool, optional) – Specifies if the underlying time domain data are complex or real-valued. If True and domain is 'time', the input data will be cast to complex. The default is False.

property n_bins#

Number of frequency bins.

property n_max#

Get or set the spherical harmonic order.

property n_samples#

The number of samples.

property normalization#

Get or set the normalization convention.

reshape(newshape)#

Return reshaped copy of the audio object.

Parameters:

newshape (int, tuple) – new cshape of the audio object. One entry of newshape dimension can be -1. In this case, the value is inferred from the remaining dimensions.

Returns:

reshaped – reshaped copy of the audio object.

Return type:

Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same.
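
How a new cshape with a -1 entry is resolved can be sketched with NumPy by appending the unchanged sample axis to the requested shape. This is an illustration on a plain array, not the method's implementation.

```python
import numpy as np

# cshape = (4, 3), n_samples = 512
data = np.zeros((4, 3, 512))

newshape = (2, -1)  # the -1 entry is inferred from the remaining dimensions
reshaped = data.reshape(newshape + (data.shape[-1],))

print(reshaped.shape[:-1])  # (2, 6)
print(reshaped.shape[-1])  # 512
```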

property sampling_rate#

The sampling rate of the signal.

property signal_length#

The length of the data in seconds.

property signal_type#

The signal type is 'energy' if fft_norm is 'none' and 'power' otherwise.

property time#

Return or set the time data.

property times#

Time instances the signal is sampled at.
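
For a signal with a fixed sampling rate, the time instances follow directly from the sample index. A minimal NumPy sketch, assuming the first sample is located at 0 s:

```python
import numpy as np

sampling_rate = 44100
n_samples = 4

# Sample k is taken at k / sampling_rate seconds
times = np.arange(n_samples) / sampling_rate

print(times.size)  # 4
print(np.isclose(times[1], 1 / sampling_rate))  # True
```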

transpose(*axes)#

Transpose time/frequency data and return copy of the audio object.

Parameters:

axes (empty, None, iterable of ints, or n ints) –

Define how the caxes are ordered in the transposed audio object. Note that the last dimension of the data in the audio object always contains the time samples or frequency bins and cannot be transposed.

empty (default) or None

reverses the order of self.caxes.

iterable of ints

i in the j-th place of the iterable means that the i-th caxis becomes the transposed object’s j-th caxis.

n ints

same as ‘iterable of ints’.
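
The three ways of passing axes can be sketched on a plain NumPy array. The helper `transpose_caxes` is hypothetical; it permutes only the channel axes and always keeps the sample/bin axis last, as described above. Negative axis indices are not handled in this sketch.

```python
import numpy as np


def transpose_caxes(data, *axes):
    """Hypothetical helper: permute channel axes, keep the last axis fixed."""
    cdim = data.ndim - 1
    if len(axes) == 0 or axes == (None,):
        # Default: reverse the order of the channel axes.
        axes = tuple(range(cdim))[::-1]
    elif len(axes) == 1 and not isinstance(axes[0], int):
        # A single iterable of ints was passed.
        axes = tuple(axes[0])
    return np.transpose(data, tuple(axes) + (cdim,))


data = np.zeros((2, 3, 4, 512))  # cshape = (2, 3, 4)
print(transpose_caxes(data).shape)  # (4, 3, 2, 512)
print(transpose_caxes(data, 1, 0, 2).shape)  # (3, 2, 4, 512)
print(transpose_caxes(data, (2, 0, 1)).shape)  # (4, 2, 3, 512)
```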

class spharpy.SphericalHarmonicTimeData(data, times, basis_type, normalization, channel_convention, condon_shortley, comment='', is_complex=False)[source]#

Create spherical harmonic audio object with time domain spherical harmonic coefficients and times.

Objects of this class contain time data which is not directly convertible to the frequency domain, i.e., non-equidistant temporal sampling.

Parameters:
  • data (array, double) – Raw data in the time domain. The data should have at least 2 dimensions, with the last dimension representing the time domain samples, the second to last the spherical harmonic coefficients, and any leading dimensions representing optional channels. Accordingly, the data should follow the ‘C’ memory layout, e.g. data of shape = (1, 4, 1024) has 1 channel with 4 spherical harmonic coefficients with 1024 samples each. The data can be int, float or complex. Data of type int is converted to float.

  • times (array, double) – Times in seconds at which the data is sampled. The number of times must match the size of the last dimension of data, i.e., data.shape[-1].

  • basis_type (str) – Type of spherical harmonic basis, either 'complex' or 'real'.

  • normalization (str) – Normalization convention, either 'N3D', 'NM', 'SN3D', 'SNM', or 'maxN'. 'maxN' is only supported up to 3rd order.

  • channel_convention (str) – Channel ordering convention, either 'ACN' or 'FuMa'. 'FuMa' is only supported up to 3rd order.

  • condon_shortley (bool or str) – Whether to include the Condon-Shortley phase term. If True, Condon-Shortley is included, if False it is not included. 'auto' corresponds to True for complex basis_type and False for real basis_type.

  • comment (str) – A comment related to data. The default is "".

  • is_complex (bool, optional) – A flag which indicates if the time data are real or complex-valued. The default is False.
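
The kind of input this class is meant for can be sketched with NumPy: time data whose sampling instants are not equally spaced, so no single sampling rate describes them. The values below are made up for illustration.

```python
import numpy as np

# Four SH coefficients (first order) sampled at four irregular time instants
times = np.array([0.0, 0.001, 0.003, 0.007])
data = np.zeros((4, times.size))

# The number of times must match the last dimension of the data.
print(data.shape[-1] == times.size)  # True

# The spacing between samples is not constant.
steps = np.diff(times)
is_equidistant = bool(np.allclose(steps, steps[0]))
print(is_equidistant)  # False
```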

Attributes:

T

Shorthand for Signal.transpose().

basis_type

Get or set the spherical harmonic basis type.

cdim

Return channel dimension.

channel_convention

Get or set the channel ordering convention.

comment

Get comment.

complex

Return or set the flag indicating if the time data is complex.

condon_shortley

Get or set the Condon-Shortley phase term.

cshape

Return channel shape.

domain

The domain the data is stored in.

n_max

Get or set the spherical harmonic order.

n_samples

The number of samples.

normalization

Get or set the normalization convention.

signal_length

The length of the data in seconds.

time

Return or set the time data.

times

Time in seconds at which the signal is sampled.

Methods:

copy()

Return a copy of the audio object.

find_nearest_time(value)

Return the index that is closest to the query time.

flatten()

Return flattened copy of the audio object.

from_definition(sh_definition, data, times)

Create a SphericalHarmonicTimeData class object from SphericalHarmonicDefinition object, data, and times.

reshape(newshape)

Return reshaped copy of the audio object.

transpose(*axes)

Transpose time/frequency data and return copy of the audio object.

property T#

Shorthand for Signal.transpose().

property basis_type#

Get or set the spherical harmonic basis type.

property cdim#

Return channel dimension.

The channel dimension (cdim) gives the number of dimensions of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects. Therefore it is equivalent to the length of the channel shape (cshape) (e.g. self.cshape = (2, 3); self.cdim = 2).

property channel_convention#

Get or set the channel ordering convention.

property comment#

Get comment.

property complex#

Return or set the flag indicating if the time data is complex.

property condon_shortley#

Get or set the Condon-Shortley phase term.

copy()#

Return a copy of the audio object.

property cshape#

Return channel shape.

The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.

property domain#

The domain the data is stored in.

find_nearest_time(value)#

Return the index that is closest to the query time.

Parameters:

value (float, array-like) – The times for which the indices are to be returned

Returns:

indices – The index for the given time instance. If the input was array-like, a numpy array of indices is returned.

Return type:

int, array-like

flatten()#

Return flattened copy of the audio object.

Returns:

flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)

Return type:

Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.

classmethod from_definition(sh_definition, data, times, comment='', is_complex=False)[source]#

Create a SphericalHarmonicTimeData class object from SphericalHarmonicDefinition object, data, and times.

Parameters:
  • sh_definition (SphericalHarmonicDefinition) – The spherical harmonic definition.

  • data (array, double) – Raw data in the time domain. The data should have at least 2 dimensions, with the last dimension representing the time domain samples, the second to last the spherical harmonic coefficients, and any leading dimensions representing optional channels. Accordingly, the data should follow the ‘C’ memory layout, e.g. data of shape = (1, 4, 1024) has 1 channel with 4 spherical harmonic coefficients with 1024 samples each. The data can be int, float or complex. Data of type int is converted to float.

  • times (array, double) – Times in seconds at which the data is sampled. The number of times must match the size of the last dimension of data, i.e., data.shape[-1].

  • comment (str) – A comment related to data. The default is ''.

  • is_complex (bool, optional) – A flag which indicates if the time data are real or complex-valued. The default is False.

property n_max#

Get or set the spherical harmonic order.

property n_samples#

The number of samples.

property normalization#

Get or set the normalization convention.

reshape(newshape)#

Return reshaped copy of the audio object.

Parameters:

newshape (int, tuple) – new cshape of the audio object. One entry of newshape dimension can be -1. In this case, the value is inferred from the remaining dimensions.

Returns:

reshaped – reshaped copy of the audio object.

Return type:

Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same.

property signal_length#

The length of the data in seconds.

property time#

Return or set the time data.

property times#

Time in seconds at which the signal is sampled.

transpose(*axes)#

Transpose time/frequency data and return copy of the audio object.

Parameters:

axes (empty, None, iterable of ints, or n ints) –

Define how the caxes are ordered in the transposed audio object. Note that the last dimension of the data in the audio object always contains the time samples or frequency bins and cannot be transposed.

empty (default) or None

reverses the order of self.caxes.

iterable of ints

i in the j-th place of the iterable means that the i-th caxis becomes the transposed object’s j-th caxis.

n ints

same as ‘iterable of ints’.

class spharpy.SphericalHarmonicFrequencyData(data, frequencies, basis_type, normalization, channel_convention, condon_shortley, comment='')[source]#

Create spherical harmonic audio object with frequency domain spherical harmonic coefficients and frequencies.

Objects of this class contain frequency data which is not directly convertible to the time domain, i.e., non-equidistantly spaced bins or incomplete spectra.

Parameters:
  • data (array, double) – Raw data in the frequency domain. The data should have at least 2 dimensions, with the last dimension representing the frequency domain bins, the second to last the spherical harmonic coefficients, and any leading dimensions representing optional channels. Accordingly, the data should follow the ‘C’ memory layout, e.g. data of shape = (1, 4, 1024) has 1 channel with 4 spherical harmonic coefficients with 1024 frequency bins each. The data can be int, float or complex. Data of type int is converted to float.

  • frequencies (array, double) – Frequencies of the data in Hz. The number of frequencies must match the size of the last dimension of data, i.e., data.shape[-1].

  • basis_type (str) – Type of spherical harmonic basis, either 'complex' or 'real'.

  • normalization (str) – Normalization convention, either 'N3D', 'NM', 'SN3D', 'SNM', or 'maxN'. 'maxN' is only supported up to 3rd order.

  • channel_convention (str) – Channel ordering convention, either 'ACN' or 'FuMa'. 'FuMa' is only supported up to 3rd order.

  • condon_shortley (bool or str) – Whether to include the Condon-Shortley phase term. If True, Condon-Shortley is included, if False it is not included. 'auto' corresponds to True for complex basis_type and False for real basis_type.

  • comment (str) – A comment related to data. The default is "".
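
To illustrate how two of the normalization conventions relate, the sketch below computes per-channel scaling factors from SN3D to N3D. It assumes the common ambisonics relation N3D = SN3D · sqrt(2n + 1) and ACN channel ordering; the helper name is hypothetical and the code is not spharpy's conversion routine.

```python
import numpy as np


def sn3d_to_n3d_factors(n_max):
    """Hypothetical helper: per-channel factors, assuming ACN ordering."""
    acn = np.arange((n_max + 1) ** 2)
    n = np.floor(np.sqrt(acn)).astype(int)  # SH order of each channel
    return np.sqrt(2 * n + 1)


factors = sn3d_to_n3d_factors(1)
print(np.round(factors, 3))  # [1.    1.732 1.732 1.732]

# Apply to frequency data of shape (n_sh, n_bins)
data_sn3d = np.ones((4, 3), dtype=complex)
data_n3d = data_sn3d * factors[:, np.newaxis]
print(data_n3d.shape)  # (4, 3)
```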

Attributes:

T

Shorthand for Signal.transpose().

basis_type

Get or set the spherical harmonic basis type.

cdim

Return channel dimension.

channel_convention

Get or set the channel ordering convention.

comment

Get comment.

condon_shortley

Get or set the Condon-Shortley phase term.

cshape

Return channel shape.

domain

The domain the data is stored in.

freq

Return or set the data in the frequency domain.

frequencies

Frequencies of the discrete signal spectrum.

n_bins

Number of frequency bins.

n_max

Get or set the spherical harmonic order.

normalization

Get or set the normalization convention.

Methods:

copy()

Return a copy of the audio object.

find_nearest_frequency(value)

Return the index that is closest to the query frequency.

flatten()

Return flattened copy of the audio object.

from_definition(sh_definition, data, frequencies)

Create a SphericalHarmonicFrequencyData class object from SphericalHarmonicDefinition object, data, and frequencies.

reshape(newshape)

Return reshaped copy of the audio object.

transpose(*axes)

Transpose time/frequency data and return copy of the audio object.

property T#

Shorthand for Signal.transpose().

property basis_type#

Get or set the spherical harmonic basis type.

property cdim#

Return channel dimension.

The channel dimension (cdim) gives the number of dimensions of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects. Therefore it is equivalent to the length of the channel shape (cshape) (e.g. self.cshape = (2, 3); self.cdim = 2).

property channel_convention#

Get or set the channel ordering convention.

property comment#

Get comment.

property condon_shortley#

Get or set the Condon-Shortley phase term.

copy()#

Return a copy of the audio object.

property cshape#

Return channel shape.

The channel shape gives the shape of the audio data excluding the last dimension, which is n_samples for time domain objects and n_bins for frequency domain objects.

property domain#

The domain the data is stored in.

find_nearest_frequency(value)#

Return the index that is closest to the query frequency.

Parameters:

value (float, array-like) – The frequencies for which the indices are to be returned

Returns:

indices – The index for the given frequency. If the input was array-like, a numpy array of indices is returned.

Return type:

int, array-like

flatten()#

Return flattened copy of the audio object.

Returns:

flat – Flattened copy of audio object with flat.cshape = np.prod(audio.cshape)

Return type:

Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same, e.g., an audio object of cshape=(4,3) and n_samples=512 will have cshape=(12, ) and n_samples=512 after flattening.

property freq#

Return or set the data in the frequency domain.

property frequencies#

Frequencies of the discrete signal spectrum.

classmethod from_definition(sh_definition, data, frequencies, comment='')[source]#

Create a SphericalHarmonicFrequencyData class object from SphericalHarmonicDefinition object, data, and frequencies.

Parameters:
  • sh_definition (SphericalHarmonicDefinition) – The spherical harmonic definition.

  • data (ndarray, double) – Raw data in the frequency domain. The data should have at least 2 dimensions, with the last dimension representing the frequency domain bins, the second to last the spherical harmonic coefficients, and any leading dimensions representing optional channels. Accordingly, the data should follow the ‘C’ memory layout, e.g. data of shape = (1, 4, 1024) has 1 channel with 4 spherical harmonic coefficients with 1024 frequency bins each. The data can be int, float or complex. Data of type int is converted to float.

  • frequencies (array, double) – Frequencies of the data in Hz. The number of frequencies must match the size of the last dimension of data, i.e., data.shape[-1].

  • comment (str) – A comment related to data. The default is ''.

property n_bins#

Number of frequency bins.

property n_max#

Get or set the spherical harmonic order.

property normalization#

Get or set the normalization convention.

reshape(newshape)#

Return reshaped copy of the audio object.

Parameters:

newshape (int, tuple) – new cshape of the audio object. One entry of newshape dimension can be -1. In this case, the value is inferred from the remaining dimensions.

Returns:

reshaped – reshaped copy of the audio object.

Return type:

Signal, FrequencyData, TimeData

Notes

The number of samples and frequency bins always remains the same.

transpose(*axes)#

Transpose time/frequency data and return copy of the audio object.

Parameters:

axes (empty, None, iterable of ints, or n ints) –

Define how the caxes are ordered in the transposed audio object. Note that the last dimension of the data in the audio object always contains the time samples or frequency bins and cannot be transposed.

empty (default) or None

reverses the order of self.caxes.

iterable of ints

i in the j-th place of the iterable means that the i-th caxis becomes the transposed object’s j-th caxis.

n ints

same as ‘iterable of ints’.