mstc.processing package

Submodules

mstc.processing.aggregator module

Components for data aggregation.

class Aggregator(attributes={})

Bases: mstc.processing.core.Component

An abstract implementation of an aggregator class.

__init__(attributes={})

Initialize the aggregator.

Parameters

attributes (dict) – attributes to add to the resulting xr.DataArray.

__call__(data_arrays)

Aggregate an iterable of xr.DataArrays into a single xr.DataArray.

Parameters

data_arrays (iterable) – an iterable containing xr.DataArrays.

Returns

an xr.DataArray.

class Stacker(dim='new', axis=0, **kwargs)

Bases: mstc.processing.aggregator.Aggregator

An implementation for a stack aggregator.

__init__(dim='new', axis=0, **kwargs)

Initialize the stack aggregator.

Parameters
  • dim (str) – name of the new dimension, defaults to ‘new’.

  • axis (int) – axis where to insert the dimension, defaults to 0.

  • kwargs (dict) – arguments to pass to Aggregator as attributes.

__call__(data_arrays)

Aggregate an iterable of xr.DataArrays into a single xr.DataArray by stacking on a new dimension.

Parameters

data_arrays (iterable) – an iterable containing xr.DataArrays.

Returns

an xr.DataArray.
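Stacking on a new dimension can be illustrated with plain xarray calls. This is a minimal sketch, not the Stacker source: the helper name stack and its attributes argument are hypothetical, and it assumes the new dimension is inserted with expand_dims before concatenating.

```python
import numpy as np
import xarray as xr


def stack(data_arrays, dim='new', axis=0, attributes=None):
    """Stack data arrays along a new dimension (hypothetical helper)."""
    # Give every array a length-1 dimension at the requested position.
    expanded = [
        data_array.expand_dims(dim, axis=axis) for data_array in data_arrays
    ]
    # Concatenate along that dimension, so it indexes the input arrays.
    stacked = xr.concat(expanded, dim=dim)
    # Attach the attributes to the resulting array.
    return stacked.assign_attrs(attributes or {})


arrays = [
    xr.DataArray(np.full((2, 3), value), dims=('x', 'y'))
    for value in (0.0, 1.0, 2.0, 3.0)
]
stacked = stack(arrays, dim='new', axis=0, attributes={'source': 'example'})
```

With four inputs of shape (2, 3) and axis=0, the result has dimensions (new, x, y) and shape (4, 2, 3).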

mstc.processing.brancher module

Components for data branching.

class Brancher(attributes={})

Bases: mstc.processing.core.Component

An abstract implementation of a brancher class.

__init__(attributes={})

Initialize the brancher.

Parameters

attributes (dict) – attributes to add to each xr.DataArray contained in the resulting iterable of xr.DataArrays.

__call__(an_object)

Branch an object into an iterable of xr.DataArrays.

Parameters

an_object (object) – an object.

Returns

an iterable of xr.DataArrays.

mstc.processing.core module

Abstract implementation of components and operations working on them.

class Component(attributes={})

Bases: object

An abstract component implementation.

__init__(attributes={})

Initialize the component.

Parameters

attributes (dict) – attributes to add to the output of the component.

__call__(an_object)

An abstract component implementation.

Parameters

an_object (object) – input for the component.

Returns

an object processed by the component.
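The callable-component pattern described above can be sketched in a few lines. The classes below are hypothetical stand-ins written for illustration, not the mstc implementation; the sketch also uses None instead of a mutable {} default so that instances do not share one attributes dictionary.

```python
class Component:
    """Hypothetical stand-in for an abstract component."""

    def __init__(self, attributes=None):
        # Copy the attributes so that instances never share a dictionary.
        self.attributes = dict(attributes or {})

    def __call__(self, an_object):
        # Concrete components are expected to override this method.
        raise NotImplementedError


class Doubler(Component):
    """Toy concrete component that doubles its input."""

    def __call__(self, an_object):
        return an_object * 2


doubler = Doubler(attributes={'unit': 'a.u.'})
result = doubler(21)
```

Because every component is a callable taking one object, components can be chained or mapped without knowing their concrete type.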

class Operation(attributes={})

Bases: mstc.processing.core.Component

An abstract implementation of a higher order class to define pipelines.

__init__(attributes={})

Initialize an operation.

Parameters

attributes (dict) – attributes of the operation.

__call__(an_object)

An abstract operation implementation.

Parameters

an_object (object) – input for the operation.

Returns

an object processed by the operation.

class SingleOperation(component, **kwargs)

Bases: mstc.processing.core.Operation

An abstract implementation of an operation with a single component.

__init__(component, **kwargs)

Initialize an operation where a single component is applied.

Parameters
  • component (Component) – a component.

  • kwargs (dict) – arguments to pass to Operation as attributes.

class MultipleOperation(components, **kwargs)

Bases: mstc.processing.core.Operation

An abstract implementation of an operation with multiple components.

__init__(components, **kwargs)

Initialize an operation where multiple components are applied.

Parameters
  • components (iterable) – an iterable containing components.

  • kwargs (dict) – arguments to pass to Operation as attributes.

mstc.processing.encoder module

Components for encoding.

class Encoder(attributes={})

Bases: mstc.processing.core.Component

__init__(attributes={})

Initialize the encoder.

Parameters

attributes (dict) – attributes to add to the resulting xr.DataArray.

__call__(an_object)

Encode samples from an object and return the results in an xr.DataArray.

Parameters

an_object (object) – an object containing the data to be encoded.

Returns

an object.

class HubEncoder(hub_module, batch_size=32, **kwargs)

Bases: mstc.processing.encoder.Encoder

__init__(hub_module, batch_size=32, **kwargs)

Initialize the encoder.

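Parameters
  • hub_module – the tensorflow hub module used to encode the images.

  • batch_size (int) – batch size, defaults to 32.

  • kwargs (dict) – arguments to pass to Encoder as attributes.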

__call__(data_array)

Encode images with a TensorFlow Hub module. The images are resized to fit the module.

Parameters

data_array (xarray.DataArray) – expected dims are sample, height, width, channel. Length of channel must be 1 or 3.

Returns

an xr.DataArray.

class Flatten(dim='features', dim_to_keep='', **kwargs)

Bases: mstc.processing.encoder.Encoder

Flatten an xr.DataArray over all dimensions but one.

__init__(dim='features', dim_to_keep='', **kwargs)

Initialize the flattening encoder.

Parameters
  • dim (str) – name of the dimension generated by flattening, defaults to ‘features’.

  • dim_to_keep (str) – name of the dimension to keep, defaults to ‘’, which flattens all dimensions but the first.

  • kwargs (dict) – arguments to pass to Encoder as attributes.

__call__(data_array)

Encode an xr.DataArray by flattening all dimensions but one. The kept dimension becomes the first of the generated xr.DataArray.

Parameters

data_array (xr.DataArray) – a data array that has to be flattened.

Returns

an xr.DataArray.
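The flattening behaviour can be approximated with xarray's stack and transpose. This is a sketch under assumptions, not the Flatten source: the helper name flatten is hypothetical, and it treats an empty dim_to_keep as "keep the first dimension", as the parameter description states.

```python
import numpy as np
import xarray as xr


def flatten(data_array, dim='features', dim_to_keep=''):
    """Flatten every dimension except one (hypothetical helper)."""
    # An empty name means the first dimension is the one to keep.
    kept = dim_to_keep or data_array.dims[0]
    others = [name for name in data_array.dims if name != kept]
    # stack() merges the listed dimensions into `dim`;
    # transpose() makes sure the kept dimension comes first.
    return data_array.stack({dim: others}).transpose(kept, dim)


images = xr.DataArray(
    np.zeros((5, 4, 4, 3)),
    dims=('sample', 'height', 'width', 'channel'),
)
flat = flatten(images)
by_channel = flatten(images, dim_to_keep='channel')
```

Here flat has dimensions (sample, features) with 4 * 4 * 3 = 48 features per sample, while by_channel has dimensions (channel, features) with 5 * 4 * 4 = 80 features per channel.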

mstc.processing.io module

Components for data I/O.

class Reader(attributes={})

Bases: mstc.processing.core.Component

An abstract implementation of a reader class.

__init__(attributes={})

Initialize the reader.

Parameters

attributes (dict) – attributes to add to the resulting xr.DataArray.

__call__(globbing_pattern)

Parse samples from a globbing pattern and generate an xr.DataArray.

Parameters

globbing_pattern (str) – a globbing pattern.

Returns

an xr.DataArray.

class PNGReader(**kwargs)

Bases: mstc.processing.io.Reader

A .png reader.

__init__(**kwargs)

Initialize the .png reader.

Parameters

kwargs (dict) – arguments to pass to Reader as attributes.

__call__(globbing_pattern)

Parse samples from a globbing pattern for png files and generate an xr.DataArray using lazy loading.

Parameters

globbing_pattern (str) – a globbing pattern.

Returns

an xr.DataArray (sample, height, width, channels) assembled by stacking the images and adding the filepath as a coordinate on the sample dimension.
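Only the file-discovery step is sketched here, using the standard library; decoding the images and lazy loading are left out. The helper name list_samples is hypothetical, and sorting the matches is an assumption made to give the sample dimension a reproducible order.

```python
import glob
import os
import tempfile


def list_samples(globbing_pattern):
    """Return the sorted file paths matching the pattern (hypothetical helper)."""
    return sorted(glob.glob(globbing_pattern))


with tempfile.TemporaryDirectory() as directory:
    # Create empty placeholder files; only their names matter for this sketch.
    for name in ('b.png', 'a.png', 'notes.txt'):
        open(os.path.join(directory, name), 'wb').close()
    filepaths = list_samples(os.path.join(directory, '*.png'))
    filenames = [os.path.basename(path) for path in filepaths]
```

The resulting paths are the kind of values that could serve as the filepath coordinate on the sample dimension.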

mstc.processing.operation module

Higher order operations initialized with a component or components.

class Compose(components, **kwargs)

Bases: mstc.processing.core.MultipleOperation

Implement a pipeline that executes a sequence of components, propagating attributes.

__init__(components, **kwargs)

Initialize a pipeline.

Parameters
  • components (iterable) – an iterable containing components.

  • kwargs (dict) – arguments to pass to MultipleOperation as attributes.

__call__(an_object)

Execute a composition of components.

Parameters

an_object (object) – an input for the composition.

Returns

an xr.DataArray or an iterable of xr.DataArrays generated by the composition.
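Sequential composition can be sketched with functools.reduce over plain callables. The function compose below is a hypothetical illustration of the pattern and leaves out attribute propagation; it is not the Compose source.

```python
from functools import reduce


def compose(components):
    """Return a callable feeding the output of each component to the next."""
    def pipeline(an_object):
        return reduce(
            lambda value, component: component(value),
            components,
            an_object,
        )
    return pipeline


# Plain callables stand in for components.
pipeline = compose([str.strip, str.upper, lambda text: text.split(',')])
result = pipeline('  a,b,c ')
```

The components are applied left to right, so result is the list ['A', 'B', 'C'].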

class Broadcast(components, **kwargs)

Bases: mstc.processing.core.MultipleOperation

Broadcast an input using multiple components.

__init__(components, **kwargs)

Initialize the operation.

Parameters
  • components (iterable) – an iterable containing components.

  • kwargs (dict) – arguments to pass to MultipleOperation as attributes.

__call__(an_object)

Broadcast an object into an iterable of xr.DataArrays using multiple components.

Parameters

an_object (object) – an object.

Returns

an iterable of xr.DataArrays.
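Broadcasting one input to several components amounts to calling each component on the same object. The function broadcast is a hypothetical sketch of that idea with built-in callables standing in for components.

```python
def broadcast(components, an_object):
    """Apply every component to the same object and collect the results."""
    return [component(an_object) for component in components]


# Each callable receives the same list.
results = broadcast([min, max, sum], [3, 1, 2])
```

One input yields one output per component, in the order of the components: here [1, 3, 6].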

class BroadcastMap(components, **kwargs)

Bases: mstc.processing.core.MultipleOperation

Broadcast input to multiple Map components initialized on the fly.

__init__(components, **kwargs)

Initialize the operation.

Parameters
  • components (iterable) – an iterable containing components.

  • kwargs (dict) – arguments to pass to MultipleOperation as attributes.

__call__(an_iterable)

Apply each object to all components.

Parameters

an_iterable (iterable) – an iterable of objects.

Returns

an iterable of xr.DataArrays.
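One plausible reading of this operation is a broadcast over per-component maps: every component is mapped over the whole iterable. The sketch below follows that reading; the function broadcast_map and the shape of its output are assumptions, not taken from the source.

```python
def broadcast_map(components, an_iterable):
    """Map every component over all objects of the iterable (assumed reading)."""
    # Materialize the input once, so a generator can feed several maps.
    objects = list(an_iterable)
    return [map(component, objects) for component in components]


branches = broadcast_map([abs, str], iter([-1, 2, -3]))
materialized = [list(branch) for branch in branches]
```

Each branch is a lazy map object; materialized holds [1, 2, 3] for abs and ['-1', '2', '-3'] for str.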

class ZipMap(components, **kwargs)

Bases: mstc.processing.core.MultipleOperation

Map each component of an iterable to the respective xr.DataArray of another iterable.

__init__(components, **kwargs)

Initialize the zip.

Parameters
  • components (iterable) – an iterable containing components.

  • kwargs (dict) – arguments to pass to components as attributes.

__call__(an_iterable)

Encode an iterable into an iterable of xr.DataArrays; the attributes are added to each xr.DataArray.

Parameters

an_iterable (iterable) – an iterable.

Returns

an iterable of xr.DataArrays.
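Pairing components with inputs position by position can be sketched with the built-in zip. The function zip_map is a hypothetical illustration and does not add attributes to the outputs.

```python
def zip_map(components, an_iterable):
    """Apply the i-th component to the i-th object of the iterable."""
    return [
        component(an_object)
        for component, an_object in zip(components, an_iterable)
    ]


results = zip_map([len, sorted, sum], ['abc', [2, 1], (1, 2, 3)])
```

Note that zip stops at the shorter of its two inputs, so in this sketch a length mismatch silently drops the unmatched items.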

class Map(component, **kwargs)

Bases: mstc.processing.core.SingleOperation

Apply a component to all objects in an iterable.

__init__(component, **kwargs)

Initialize the map.

Parameters
  • component (Component) – a component to apply to each object of the iterable.

  • kwargs (dict) – arguments to pass to component as attributes.

__call__(an_iterable)

Map the component over an iterable using the standard map.

Parameters

an_iterable (iterable) – an iterable of objects.

Returns

a map object (iterable).
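Since the return value is a map object, it helps to recall that Python's built-in map is lazy and single-use. The snippet below shows this with a hypothetical function record_and_square standing in for a component.

```python
calls = []


def record_and_square(value):
    """Toy component that records each call before squaring the value."""
    calls.append(value)
    return value * value


mapped = map(record_and_square, [1, 2, 3])
calls_before = list(calls)  # still empty: nothing has been computed yet
squares = list(mapped)      # consuming the map object triggers the calls
leftover = list(mapped)     # a map object can only be consumed once
```

Callers that need the results more than once would have to materialize the map object, for example with list.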

class Reduce(component, **kwargs)

Bases: mstc.processing.core.SingleOperation

Apply a component that takes an iterable as input and returns a single object.

__init__(component, **kwargs)

Initialize the reduction.

Parameters
  • component (Component) – a component accepting an iterable.

  • kwargs (dict) – arguments to pass to component as attributes.

__call__(an_iterable)

Reduce an iterable to a single object using the standard reduce.

Parameters

an_iterable (iterable) – an iterable of objects.

Returns

an object.
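Assuming "the standard reduce" refers to functools.reduce, its behaviour is shown below with ordinary two-argument functions. Whether the wrapped component is called pairwise like this or receives the whole iterable at once is not stated in this reference, so the snippet only illustrates the standard library function.

```python
from functools import reduce
import operator

# Fold the sequence from the left: ((1 + 2) + 3) + 4.
total = reduce(operator.add, [1, 2, 3, 4])

# Keep the longest word seen so far while walking the list.
longest = reduce(
    lambda best, word: word if len(word) > len(best) else best,
    ['a', 'abc', 'ab'],
)
```

In both cases an iterable of several objects collapses into a single object: total is 10 and longest is 'abc'.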

Module contents

Importing components.