Workspace Class

class flowkit.Workspace(wsp_file_path, fcs_samples=None, ignore_missing_files=False, find_fcs_files_from_wsp=False)

A Workspace represents an imported FlowJo workspace (.wsp file).

Parameters:
  • wsp_file_path – FlowJo WSP file as a file name/path, file object, or file-like object

  • fcs_samples – str or list. If given a string, it can be a directory path or a file path. If a directory, any .fcs files in the directory will be found. If a list, then it must be a list of file paths or a list of Sample instances. Lists of mixed types are not supported. Note that only FCS files matching the ones referenced in the .wsp file will be retained in the Workspace.

  • ignore_missing_files – Controls behavior for missing FCS files. If True, gate data for missing FCS files (i.e. not in fcs_samples arg) will still be loaded. If False, warnings are issued for FCS files found in the WSP file that were not loaded in the Workspace and gate data for these missing files will not be retained. Default is False.

  • find_fcs_files_from_wsp – Controls whether to search for FCS files based on URI params within the FlowJo workspace file. Default is False.
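
The rules for fcs_samples above (a single string, or a list that is uniformly paths or uniformly Sample instances) can be checked before constructing a Workspace. The sketch below is a hypothetical pre-check, not part of FlowKit; the function name and the returned labels are assumptions made for illustration, and it treats any non-string list item as a Sample instance without verifying its type.

```python
import os


def describe_fcs_samples(fcs_samples):
    """Classify an fcs_samples value using the rules in the parameter list.

    Hypothetical helper (not a FlowKit function). Raises TypeError for
    values the documentation describes as unsupported.
    """
    if fcs_samples is None:
        return "none"
    if isinstance(fcs_samples, str):
        # A single string may name a directory or one FCS file.
        return "directory" if os.path.isdir(fcs_samples) else "file"
    if isinstance(fcs_samples, list):
        is_path = [isinstance(item, str) for item in fcs_samples]
        if all(is_path):
            return "path list"
        if not any(is_path):
            # Assumption: non-string items are Sample instances.
            return "sample list"
        raise TypeError("fcs_samples lists must not mix paths and Sample instances")
    raise TypeError("fcs_samples must be None, a string, or a list")
```

A value that passes this check could then be supplied as the fcs_samples argument of flowkit.Workspace.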

Public Methods:

__init__(wsp_file_path[, fcs_samples, ...])

__repr__()

Return repr(self).

summary()

Retrieve a summary of Workspace information, including a list of sample groups defined, along with the sample and gate counts for those sample groups.

get_sample_ids([group_name, loaded_only])

Retrieve the list of Sample IDs in the Workspace, optionally filtered by sample group and/or loaded status.

get_sample(sample_id)

Retrieve a Sample instance from the Workspace.

get_samples([group_name])

Retrieve list of Sample instances, optionally filtered by sample group.

get_sample_groups()

Retrieve the list of sample group names defined in the Workspace.

get_gate_ids(sample_id)

Retrieve the list of gate IDs defined for the specified sample.

find_matching_gate_paths(sample_id, gate_name)

Find all gate paths in the gating strategy for the given Sample matching the given gate name.

get_child_gate_ids(sample_id, gate_name[, ...])

Retrieve list of child gate IDs for a sample given the parent gate name (and path if ambiguous) in the gating strategy.

get_gate_hierarchy(sample_id[, output])

Retrieve the hierarchy of gates in the sample's gating strategy.

get_gating_strategy(sample_id)

Retrieve a copy of the GatingStrategy for a specific sample.

get_comp_matrix(sample_id)

Retrieve the compensation matrix for a specific sample.

get_transform(sample_id, transform_id)

Retrieve a single transform for a sample using the transform ID.

get_transforms(sample_id)

Retrieve the list of transformations for a specific sample.

get_gate(sample_id, gate_name[, gate_path])

Retrieve a gate instance for a sample by its gate ID.

analyze_samples([group_name, sample_id, ...])

Process gates for samples.

get_gating_results(sample_id)

Retrieve the analyzed gating results for a sample.

get_analysis_report([group_name])

Retrieve the report for the analyzed samples as a pandas DataFrame.

get_gate_membership(sample_id, gate_name[, ...])

Retrieve a boolean array indicating gate membership for the events in the specified sample.

get_gate_events(sample_id[, gate_name, ...])

Retrieve gated events for a specific gate & sample as a pandas DataFrame.

plot_gate(sample_id, gate_name[, gate_path, ...])

Returns an interactive plot for the specified gate.

plot_scatter(sample_id, x_label, y_label[, ...])

Returns an interactive scatter plot for the specified channel data.


summary()

Retrieve a summary of Workspace information, including a list of sample groups defined, along with the sample and gate counts for those sample groups.

Returns:

Pandas DataFrame containing Workspace summary information

get_sample_ids(group_name=None, loaded_only=True)

Retrieve the list of Sample IDs in the Workspace, optionally filtered by sample group and/or loaded status. By default, the IDs of all loaded samples are returned.

Parameters:
  • group_name – Filter returned sample IDs by a sample group. If None, all sample IDs are returned

  • loaded_only – Filter returned sample IDs for only loaded samples. If False, all the samples will be returned, including any missing sample IDs referenced in the workspace. Default is True for returning only loaded sample IDs.

Returns:

list of Sample ID strings
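
Comparing the two settings of loaded_only gives the samples that the workspace references but that were not loaded. The following is a minimal sketch that relies only on the get_sample_ids signature documented above; the helper name is hypothetical, and `workspace` stands for any object exposing that method.

```python
def find_missing_sample_ids(workspace, group_name=None):
    """Return sample IDs referenced in the workspace but not loaded.

    Hypothetical helper built on the documented get_sample_ids method.
    """
    loaded = set(workspace.get_sample_ids(group_name=group_name, loaded_only=True))
    referenced = workspace.get_sample_ids(group_name=group_name, loaded_only=False)
    # Keep the order in which the workspace reports the IDs.
    return [sample_id for sample_id in referenced if sample_id not in loaded]
```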

get_sample(sample_id)

Retrieve a Sample instance from the Workspace.

Parameters:

sample_id – a text string representing the sample

Returns:

a Sample instance

get_samples(group_name=None)

Retrieve list of Sample instances, optionally filtered by sample group.

Parameters:

group_name – Filter returned samples by a sample group. If None, all samples are returned

Returns:

list of Sample instances

get_sample_groups()

Retrieve the list of sample group names defined in the Workspace.

Returns:

list of sample group name strings

get_gate_ids(sample_id)

Retrieve the list of gate IDs defined for the specified sample. The gate ID is a 2-item tuple where the first item is a string representing the gate name and the second item is a tuple of the gate path.

Parameters:

sample_id – a text string representing a Sample instance

Returns:

list of gate ID tuples
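
Because each gate ID is a (gate name, gate path) tuple, it can be unpacked directly. The sketch below joins a gate ID into a single readable label. The helper name, the separator, and the example gate names are hypothetical, and the use of 'root' as the first path element is an assumption about how paths are reported.

```python
def format_gate_id(gate_id, separator="/"):
    """Join a (gate_name, gate_path) tuple into one display string."""
    gate_name, gate_path = gate_id
    return separator.join((*gate_path, gate_name))


# Hypothetical gate IDs shaped like the tuples described above.
example_gate_ids = [
    ("Singlets", ("root",)),
    ("CD3+", ("root", "Singlets")),
]
labels = [format_gate_id(gate_id) for gate_id in example_gate_ids]
# labels == ["root/Singlets", "root/Singlets/CD3+"]
```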

find_matching_gate_paths(sample_id, gate_name)

Find all gate paths in the gating strategy for the given Sample matching the given gate name.

Parameters:
  • sample_id – a text string representing a Sample instance

  • gate_name – text string of a gate name

Returns:

list of gate paths (list of tuples)
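
One use for this method is resolving a gate name to a single path before calling methods that require gate_path for ambiguous names. The sketch below is a hypothetical helper, not a FlowKit function. It assumes an unknown gate name yields an empty list; if the library raises instead, that exception would simply propagate.

```python
def resolve_gate_path(workspace, sample_id, gate_name):
    """Return the only gate path matching gate_name, or raise LookupError.

    Hypothetical helper built on the documented find_matching_gate_paths.
    """
    paths = workspace.find_matching_gate_paths(sample_id, gate_name)
    if not paths:
        raise LookupError(f"no gate named {gate_name!r} for sample {sample_id!r}")
    if len(paths) > 1:
        # The caller has to choose one of these paths explicitly.
        raise LookupError(f"gate name {gate_name!r} is ambiguous: {paths}")
    return paths[0]
```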

get_child_gate_ids(sample_id, gate_name, gate_path=None)

Retrieve list of child gate IDs for a sample given the parent gate name (and path if ambiguous) in the gating strategy.

Parameters:
  • sample_id – a text string representing a Sample instance

  • gate_name – text string of a gate name

  • gate_path – complete tuple of gate IDs for unique set of gate ancestors. Required if gate_name is ambiguous

Returns:

list of Gate IDs (tuple of gate name plus gate path). Returns an empty list if no child gates exist.

Raises:

GateReferenceError – if gate ID is not found in gating strategy or if gate name is ambiguous
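
Since each returned child is itself a (gate name, gate path) tuple, the method can be applied recursively to collect every descendant of a gate. This is a minimal sketch under that assumption; the helper name is hypothetical and `workspace` is any object exposing get_child_gate_ids as documented above.

```python
def collect_descendant_gate_ids(workspace, sample_id, gate_name, gate_path=None):
    """Return all descendant gate IDs of a gate in depth-first order.

    Hypothetical helper; relies on the empty-list return for leaf gates.
    """
    descendants = []
    children = workspace.get_child_gate_ids(sample_id, gate_name, gate_path=gate_path)
    for child_name, child_path in children:
        descendants.append((child_name, child_path))
        # Pass the child's own path so the lookup stays unambiguous.
        descendants.extend(
            collect_descendant_gate_ids(
                workspace, sample_id, child_name, gate_path=child_path
            )
        )
    return descendants
```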

get_gate_hierarchy(sample_id, output='ascii', **kwargs)

Retrieve the hierarchy of gates in the sample's gating strategy. Output is available in several formats: text, dictionary, or JSON. If output == 'json', extra keyword arguments are passed to json.dumps.

Parameters:
  • sample_id – a text string representing a Sample instance

  • output – Determines format of hierarchy returned, either 'ascii', 'dict', or 'json' (default is 'ascii')

Returns:

gate hierarchy as a text string or a dictionary
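
The keyword pass-through for JSON output can be sketched as follows. The helper name is hypothetical, and indent is used only because it is a standard json.dumps argument. The layout of the resulting dictionary is not specified here, so the sketch makes no assumption about its keys.

```python
import json


def gate_hierarchy_from_json(workspace, sample_id):
    """Request the hierarchy as JSON text and parse it back into Python objects.

    Hypothetical helper; indent=2 is forwarded to json.dumps by
    get_gate_hierarchy as described above.
    """
    text = workspace.get_gate_hierarchy(sample_id, output="json", indent=2)
    return json.loads(text)
```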

get_gating_strategy(sample_id)

Retrieve a copy of the GatingStrategy for a specific sample. The sample_id is required because each sample may have customized gates.

Parameters:

sample_id – a text string representing a Sample instance

Returns:

a copy of the GatingStrategy instance

get_comp_matrix(sample_id)

Retrieve the compensation matrix for a specific sample.

Parameters:

sample_id – a text string representing a Sample instance

Returns:

a copy of a Matrix instance

get_transform(sample_id, transform_id)

Retrieve a single transform for a sample using the transform ID. Transform IDs in the Workspace class correspond to a channel label in the sample.

Parameters:
  • sample_id – a text string representing a Sample instance

  • transform_id – a text string representing a Transform instance

Returns:

a Transform instance

get_transforms(sample_id)

Retrieve the list of transformations for a specific sample.

Parameters:

sample_id – a text string representing a Sample instance

Returns:

a list of Transform instances

get_gate(sample_id, gate_name, gate_path=None)

Retrieve a gate instance for a sample by its gate ID.

Parameters:
  • sample_id – a text string representing a Sample instance.

  • gate_name – text string of a gate name

  • gate_path – tuple of gate IDs for unique set of gate ancestors. Required if gate_name is ambiguous

Returns:

Subclass of a Gate object

analyze_samples(group_name=None, sample_id=None, cache_events=False, use_mp=True, verbose=False)

Process gates for samples. Samples to analyze can be filtered by group name or sample ID. After running, results can be retrieved using the get_gating_results, get_analysis_report, and get_gate_membership methods.

Parameters:
  • group_name – optional group name, if specified only samples in this group will be processed

  • sample_id – optional sample ID, if specified only this sample will be processed (overrides group filter)

  • cache_events – Whether to cache pre-processed events (compensated and transformed). This can be useful to speed up processing of gates that share the same pre-processing instructions for the same channel data, but can consume significantly more memory space. See the related clear_cache method for additional information. Default is False.

  • use_mp – Controls whether multiprocessing is used to gate samples (default is True). Multiprocessing can fail for large workloads (lots of samples & gates) due to running out of memory. If encountering memory errors, set use_mp to False (processing will take longer, but will use significantly less memory).

  • verbose – if True, print a line for every gate processed (default is False)

Returns:

None
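
The advice on use_mp above can be sketched as a retry: try multiprocessing first, then fall back to single-process gating. The helper name is hypothetical. It also assumes that memory exhaustion surfaces as Python's built-in MemoryError, which may not hold for every multiprocessing failure, so treat this as an illustration rather than a complete recovery strategy.

```python
def analyze_with_fallback(workspace, group_name=None, verbose=False):
    """Run analyze_samples, retrying without multiprocessing on MemoryError.

    Hypothetical helper. Returns True if the multiprocessing run succeeded,
    False if the single-process fallback was used.
    """
    try:
        workspace.analyze_samples(group_name=group_name, use_mp=True, verbose=verbose)
        return True
    except MemoryError:
        # Assumption: out-of-memory conditions are raised as MemoryError.
        workspace.analyze_samples(group_name=group_name, use_mp=False, verbose=verbose)
        return False
```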

get_gating_results(sample_id)

Retrieve the analyzed gating results for a sample.

Parameters:

sample_id – a text string representing a Sample instance

Returns:

GatingResults instance

get_analysis_report(group_name=None)

Retrieve the report for the analyzed samples as a pandas DataFrame.

Parameters:

group_name – optional group name, if specified only results from samples in this group will be processed, otherwise results from all analyzed samples will be returned

Returns:

pandas DataFrame

get_gate_membership(sample_id, gate_name, gate_path=None)

Retrieve a boolean array indicating gate membership for the events in the specified sample. Note that the same gate name may be found in multiple gate paths, i.e. the gate name can be ambiguous. In this case, specify the full gate path to retrieve the gate membership.

Parameters:
  • sample_id – a text string representing a Sample instance

  • gate_name – text string of a gate name

  • gate_path – complete tuple of gate IDs for unique set of gate ancestors. Required if gate_name is ambiguous

Returns:

NumPy boolean array (length of sample event count)
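
A boolean membership array has one entry per event, so the event count and percentage inside the gate follow directly from it. The sketch below is a hypothetical helper that accepts any sequence of booleans, including the array returned by get_gate_membership. With NumPy available, membership.sum() would be the more idiomatic way to count.

```python
def summarize_membership(membership):
    """Return (events inside gate, total events, percent inside gate)."""
    total = len(membership)
    inside = sum(1 for flag in membership if flag)
    # Guard against an empty sample to avoid dividing by zero.
    percent = 100.0 * inside / total if total else 0.0
    return inside, total, percent


# Hypothetical membership values for a four-event sample.
print(summarize_membership([True, False, True, True]))  # (3, 4, 75.0)
```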

get_gate_events(sample_id, gate_name=None, gate_path=None)

Retrieve gated events for a specific gate & sample as a pandas DataFrame. Gated events are processed according to the sample’s compensation & channel transforms.

Parameters:
  • sample_id – a text string representing a Sample instance

  • gate_name – text string of a gate name. If None, all Sample events will be returned (i.e. un-gated)

  • gate_path – complete tuple of gate IDs for unique set of gate ancestors. Required if gate_name is ambiguous

Returns:

a pandas DataFrame with the gated events, compensated & transformed according to the sample’s compensation matrix and transforms

plot_gate(sample_id, gate_name, gate_path=None, subsample_count=10000, random_seed=1, x_min=None, x_max=None, y_min=None, y_max=None, color_density=True, bin_width=4)

Returns an interactive plot for the specified gate. The type of plot is determined by the number of dimensions used to define the gate: 1-D gates are returned as histograms, and 2-D gates as scatter plots.

Parameters:
  • sample_id – The sample ID for the FCS sample to plot

  • gate_name – Gate name to filter events (only events within the given gate will be plotted)

  • gate_path – tuple of gate names for full set of gate ancestors. Required if gate_name is ambiguous

  • subsample_count – Number of events to use as a sub-sample. If the number of events in the Sample is less than the requested sub-sample count, then the maximum number of available events is used for the sub-sample.

  • random_seed – Random seed used for sub-sampling events

  • x_min – Lower bound of x-axis. If None, channel’s min value will be used with some padding to keep events off the edge of the plot.

  • x_max – Upper bound of x-axis. If None, channel’s max value will be used with some padding to keep events off the edge of the plot.

  • y_min – Lower bound of y-axis. If None, channel’s min value will be used with some padding to keep events off the edge of the plot.

  • y_max – Upper bound of y-axis. If None, channel’s max value will be used with some padding to keep events off the edge of the plot.

  • color_density – Whether to color the events by density, similar to a heat map. Default is True.

  • bin_width – Bin size to use for the color density, in units of event point size. Larger values produce smoother gradients. Default is 4 for a 4x4 grid size.

Returns:

A Bokeh Figure object containing the interactive plot.

plot_scatter(sample_id, x_label, y_label, gate_name=None, gate_path=None, subsample_count=10000, random_seed=1, color_density=True, bin_width=4, x_min=None, x_max=None, y_min=None, y_max=None)

Returns an interactive scatter plot for the specified channel data.

Parameters:
  • sample_id – The sample ID for the FCS sample to plot

  • x_label – channel label (PnN) to use for the x-axis data

  • y_label – channel label (PnN) to use for the y-axis data

  • gate_name – Gate name to filter events (only events within the given gate will be plotted)

  • gate_path – tuple of gate names for full set of gate ancestors. Required if gate_name is ambiguous

  • subsample_count – Number of events to use as a sub-sample. If the number of events in the Sample is less than the requested sub-sample count, then the maximum number of available events is used for the sub-sample.

  • random_seed – Random seed used for sub-sampling events

  • color_density – Whether to color the events by density, similar to a heat map. Default is True.

  • bin_width – Bin size to use for the color density, in units of event point size. Larger values produce smoother gradients. Default is 4 for a 4x4 grid size.

  • x_min – Lower bound of x-axis. If None, channel’s min value will be used with some padding to keep events off the edge of the plot.

  • x_max – Upper bound of x-axis. If None, channel’s max value will be used with some padding to keep events off the edge of the plot.

  • y_min – Lower bound of y-axis. If None, channel’s min value will be used with some padding to keep events off the edge of the plot.

  • y_max – Upper bound of y-axis. If None, channel’s max value will be used with some padding to keep events off the edge of the plot.

Returns:

A Bokeh Figure object containing the interactive scatter plot.
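
The interaction between subsample_count and random_seed described in the parameter lists can be sketched with the standard library. This is not FlowKit's implementation; the function name is hypothetical, and it only illustrates the stated behavior: the sub-sample is capped at the number of available events and is reproducible for a given seed.

```python
import random


def subsample_indices(event_count, subsample_count=10000, random_seed=1):
    """Pick event indices for plotting, capped at the available event count.

    Illustrative sketch only, not the library's sub-sampling code.
    """
    effective_count = min(subsample_count, event_count)
    rng = random.Random(random_seed)
    # Sampling without replacement; sorting keeps the original event order.
    return sorted(rng.sample(range(event_count), effective_count))
```

With fewer events than requested, every event index is returned, e.g. subsample_indices(50) yields all 50 indices.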