Workspace Class
- class flowkit.Workspace(wsp_file_path, fcs_samples=None, ignore_missing_files=False, find_fcs_files_from_wsp=False)
A Workspace represents an imported FlowJo workspace (.wsp file).
- Parameters:
wsp_file_path – FlowJo WSP file as a file name/path, file object, or file-like object
fcs_samples – str or list. If given a string, it can be a directory path or a file path. If a directory, any .fcs files in the directory will be found. If a list, then it must be a list of file paths or a list of Sample instances. Lists of mixed types are not supported. Note that only FCS files matching the ones referenced in the .wsp file will be retained in the Workspace.
ignore_missing_files – Controls behavior for missing FCS files. If True, gate data for missing FCS files (i.e. not in fcs_samples arg) will still be loaded. If False, warnings are issued for FCS files found in the WSP file that were not loaded in the Workspace and gate data for these missing files will not be retained. Default is False.
find_fcs_files_from_wsp – Controls whether to search for FCS files based on URI params within the FlowJo workspace file. Default is False.
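The following is a minimal sketch of constructing a Workspace from a .wsp file and an explicit list of FCS file paths. The helper names collect_fcs_paths and open_workspace are hypothetical and not part of FlowKit; only the flowkit.Workspace arguments are taken from the signature above. Passing a directory string as fcs_samples would leave the file discovery to FlowKit instead.

```python
from pathlib import Path


def collect_fcs_paths(directory):
    """Return a sorted list of .fcs file paths (as strings) found directly in `directory`."""
    return sorted(
        str(p) for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == ".fcs"
    )


def open_workspace(wsp_path, fcs_dir):
    """Build a Workspace from a .wsp file and the FCS files in `fcs_dir`."""
    # Imported here so the path helper above can be used without FlowKit installed.
    import flowkit as fk

    return fk.Workspace(
        wsp_path,
        fcs_samples=collect_fcs_paths(fcs_dir),  # a list of file paths
        ignore_missing_files=False,  # warn about FCS files that were not loaded
    )
```

Only FCS files that match the ones referenced in the .wsp file are retained, so passing extra files in the list should be harmless.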
Public Methods:
__init__(wsp_file_path[, fcs_samples, ...])
__repr__() – Return repr(self).
summary() – Retrieve a summary of Workspace information, including a list of sample groups defined, along with the sample and gate counts for those sample groups.
get_sample_ids([group_name, loaded_only]) – Retrieve the list of Sample IDs that are in the Workspace, optionally filtered by sample group and/or loaded status.
get_sample(sample_id) – Retrieve a Sample instance from the Workspace.
get_samples([group_name]) – Retrieve list of Sample instances, optionally filtered by sample group.
get_sample_groups() – Retrieve the list of sample group names defined in the Workspace.
get_gate_ids(sample_id) – Retrieve the list of gate IDs defined for the specified sample.
find_matching_gate_paths(sample_id, gate_name) – Find all gate paths in the gating strategy for the given Sample matching the given gate name.
get_child_gate_ids(sample_id, gate_name[, ...]) – Retrieve list of child gate IDs for a sample given the parent gate name (and path if ambiguous) in the gating strategy.
get_gate_hierarchy(sample_id[, output]) – Retrieve the hierarchy of gates in the sample's gating strategy.
get_gating_strategy(sample_id) – Retrieve a copy of the GatingStrategy for a specific sample.
get_comp_matrix(sample_id) – Retrieve the compensation matrix for a specific sample.
get_transform(sample_id, transform_id) – Retrieve a single transform for a sample using the transform ID.
get_transforms(sample_id) – Retrieve the list of transformations for a specific sample.
get_gate(sample_id, gate_name[, gate_path]) – Retrieve a gate instance for a sample by its gate ID.
analyze_samples([group_name, sample_id, ...]) – Process gates for samples.
get_gating_results(sample_id) – Retrieve analyzed gating results for a sample.
get_analysis_report([group_name]) – Retrieve the report for the analyzed samples as a pandas DataFrame.
get_gate_membership(sample_id, gate_name[, ...]) – Retrieve a boolean array indicating gate membership for the events in the specified sample.
get_gate_events(sample_id[, gate_name, ...]) – Retrieve gated events for a specific gate & sample as a pandas DataFrame.
plot_gate(sample_id, gate_name[, gate_path, ...]) – Returns an interactive plot for the specified gate.
plot_scatter(sample_id, x_label, y_label[, ...]) – Returns an interactive scatter plot for the specified channel data.
- summary()
Retrieve a summary of Workspace information, including a list of sample groups defined, along with the sample and gate counts for those sample groups.
- Returns:
Pandas DataFrame containing Workspace summary information
- get_sample_ids(group_name=None, loaded_only=True)
Retrieve the list of Sample IDs that are in the Workspace, optionally filtered by sample group and/or loaded status. Default is all loaded samples.
- Parameters:
group_name – Filter returned sample IDs by a sample group. If None, all sample IDs are returned
loaded_only – Filter returned sample IDs for only loaded samples. If False, all the samples will be returned, including any missing sample IDs referenced in the workspace. Default is True for returning only loaded sample IDs.
- Returns:
list of Sample ID strings
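As an illustration of the loaded_only flag, the sketch below lists the sample IDs that the workspace references but for which no FCS data was loaded. It assumes workspace is an already constructed Workspace; the function name missing_sample_ids is hypothetical.

```python
def missing_sample_ids(workspace):
    """Return sample IDs referenced in the workspace whose FCS data was not loaded."""
    loaded = set(workspace.get_sample_ids(loaded_only=True))
    referenced = workspace.get_sample_ids(loaded_only=False)
    # Preserve the order in which the workspace reports the IDs.
    return [sample_id for sample_id in referenced if sample_id not in loaded]
```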
- get_sample(sample_id)
Retrieve a Sample instance from the Workspace.
- Parameters:
sample_id – a text string representing the sample
- Returns:
a Sample instance
- get_samples(group_name=None)
Retrieve list of Sample instances, optionally filtered by sample group.
- Parameters:
group_name – Filter returned samples by a sample group. If None, all samples are returned
- Returns:
list of Sample instances
- get_sample_groups()
Retrieve the list of sample group names defined in the Workspace.
- Returns:
list of sample group name strings
- get_gate_ids(sample_id)
Retrieve the list of gate IDs defined for the specified sample. The gate ID is a 2-item tuple where the first item is a string representing the gate name and the second item is a tuple of the gate path.
- Parameters:
sample_id – a text string representing a Sample instance
- Returns:
list of gate ID tuples
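To make the gate ID structure concrete, the sketch below unpacks (gate name, gate path) tuples and joins them into readable labels. The gate names and paths are invented for illustration, and format_gate_id is a hypothetical helper, not a FlowKit function.

```python
def format_gate_id(gate_id, separator="/"):
    """Render a (gate_name, gate_path) tuple as a single path-like string."""
    gate_name, gate_path = gate_id
    return separator.join((*gate_path, gate_name))


# Hypothetical gate IDs, shaped like the tuples described above.
example_gate_ids = [
    ("Singlets", ("root",)),
    ("CD3+", ("root", "Singlets")),
]
labels = [format_gate_id(gate_id) for gate_id in example_gate_ids]
# labels == ["root/Singlets", "root/Singlets/CD3+"]
```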
- find_matching_gate_paths(sample_id, gate_name)
Find all gate paths in the gating strategy for the given Sample matching the given gate name.
- Parameters:
sample_id – a text string representing a Sample instance
gate_name – text string of a gate name
- Returns:
list of gate paths (list of tuples)
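One possible use of find_matching_gate_paths is to check whether a gate name is ambiguous before calling methods that accept gate_path. The sketch below assumes the method returns an empty list when nothing matches; resolve_gate_path is a hypothetical helper.

```python
def resolve_gate_path(workspace, sample_id, gate_name):
    """Return the single gate path for `gate_name`, or raise if it is missing or ambiguous."""
    paths = workspace.find_matching_gate_paths(sample_id, gate_name)
    if not paths:
        raise LookupError(f"no gate named {gate_name!r} for sample {sample_id!r}")
    if len(paths) > 1:
        raise ValueError(f"gate name {gate_name!r} is ambiguous; candidate paths: {paths}")
    return paths[0]
```

The returned path could then be passed as gate_path to methods such as get_gate or get_gate_membership.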
- get_child_gate_ids(sample_id, gate_name, gate_path=None)
Retrieve list of child gate IDs for a sample given the parent gate name (and path if ambiguous) in the gating strategy.
- Parameters:
sample_id – a text string representing a Sample instance
gate_name – text string of a gate name
gate_path – complete tuple of gate IDs for unique set of gate ancestors. Required if gate_name is ambiguous
- Returns:
list of Gate IDs (tuple of gate name plus gate path). Returns an empty list if no child gates exist.
- Raises:
GateReferenceError – if gate ID is not found in gating strategy or if gate name is ambiguous
- get_gate_hierarchy(sample_id, output='ascii', **kwargs)
Retrieve the hierarchy of gates in the sample’s gating strategy. Output is available in several formats, including text, dictionary, or JSON. If output == ‘json’, extra keyword arguments are passed to json.dumps.
- Parameters:
sample_id – a text string representing a Sample instance
output – Determines format of hierarchy returned, either ‘ascii’, ‘dict’, or ‘json’ (default is ‘ascii’)
- Returns:
gate hierarchy as a text string or a dictionary
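The keyword forwarding described above can be sketched as follows. Here indent is not a parameter of get_gate_hierarchy itself; it is an extra keyword argument that is expected to reach json.dumps. The function name hierarchy_json is hypothetical.

```python
def hierarchy_json(workspace, sample_id, indent=2):
    """Return the sample's gate hierarchy as pretty-printed JSON text."""
    # `indent` is forwarded to json.dumps when output is 'json'.
    return workspace.get_gate_hierarchy(sample_id, output="json", indent=indent)
```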
- get_gating_strategy(sample_id)
Retrieve a copy of the GatingStrategy for a specific sample. sample_id is required as each sample may have customized gates
- Parameters:
sample_id – a text string representing a Sample instance
- Returns:
a copy of the GatingStrategy instance
- get_comp_matrix(sample_id)
Retrieve the compensation matrix for a specific sample.
- Parameters:
sample_id – a text string representing a Sample instance
- Returns:
a copy of a Matrix instance
- get_transform(sample_id, transform_id)
Retrieve a single transform for a sample using the transform ID. Transform IDs in the Workspace class correspond to a channel label in the sample.
- Parameters:
sample_id – a text string representing a Sample instance
transform_id – a text string representing a Transform instance
- Returns:
a Transform instance
- get_transforms(sample_id)
Retrieve the list of transformations for a specific sample.
- Parameters:
sample_id – a text string representing a Sample instance
- Returns:
a list of Transform instances
- get_gate(sample_id, gate_name, gate_path=None)
Retrieve a gate instance for a sample by its gate ID.
- Parameters:
sample_id – a text string representing a Sample instance.
gate_name – text string of a gate name
gate_path – tuple of gate IDs for unique set of gate ancestors. Required if gate_name is ambiguous
- Returns:
Subclass of a Gate object
- analyze_samples(group_name=None, sample_id=None, cache_events=False, use_mp=True, verbose=False)
Process gates for samples. Samples to analyze can be filtered by group name or sample ID. After running, results can be retrieved using the get_gating_results, get_analysis_report, and get_gate_membership methods.
- Parameters:
group_name – optional group name, if specified only samples in this group will be processed
sample_id – optional sample ID, if specified only this sample will be processed (overrides group filter)
cache_events – Whether to cache pre-processed events (compensated and transformed). This can be useful to speed up processing of gates that share the same pre-processing instructions for the same channel data, but can consume significantly more memory space. See the related clear_cache method for additional information. Default is False.
use_mp – Controls whether multiprocessing is used to gate samples (default is True). Multiprocessing can fail for large workloads (lots of samples & gates) due to running out of memory. If encountering memory errors, set use_mp to False (processing will take longer, but will use significantly less memory).
verbose – if True, print a line for every gate processed (default is False)
- Returns:
None
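A minimal sketch of the analyze-then-retrieve workflow, assuming workspace is an already constructed Workspace. The function name analyze_and_report and its low_memory flag are hypothetical; the flag simply maps onto the documented use_mp parameter.

```python
def analyze_and_report(workspace, group_name=None, low_memory=False):
    """Run gating for the selected samples, then return the analysis report."""
    workspace.analyze_samples(
        group_name=group_name,
        use_mp=not low_memory,  # single-process mode is slower but uses less memory
    )
    return workspace.get_analysis_report(group_name=group_name)
```

Calling it with low_memory=True corresponds to the use_mp=False setting suggested above for workloads that run out of memory.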
- get_gating_results(sample_id)
Retrieve analyzed gating results for a sample.
- Parameters:
sample_id – a text string representing a Sample instance
- Returns:
GatingResults instance
- get_analysis_report(group_name=None)
Retrieve the report for the analyzed samples as a pandas DataFrame.
- Parameters:
group_name – optional group name, if specified only results from samples in this group will be processed, otherwise results from all analyzed samples will be returned
- Returns:
pandas DataFrame
- get_gate_membership(sample_id, gate_name, gate_path=None)
Retrieve a boolean array indicating gate membership for the events in the specified sample. Note, the same gate name may be found in multiple gate paths, i.e. the gate name can be ambiguous. In this case, specify the full gate path to retrieve gate membership.
- Parameters:
sample_id – a text string representing a Sample instance
gate_name – text string of a gate name
gate_path – complete tuple of gate IDs for unique set of gate ancestors. Required if gate_name is ambiguous
- Returns:
NumPy boolean array (length of sample event count)
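Because the result has one boolean per event, counting the True values gives the number of events inside the gate. The sketch below assumes analyze_samples has already been run for the sample; gate_event_fraction is a hypothetical helper.

```python
def gate_event_fraction(workspace, sample_id, gate_name, gate_path=None):
    """Return (events_in_gate, total_events, fraction) for one gate of one sample."""
    membership = workspace.get_gate_membership(sample_id, gate_name, gate_path=gate_path)
    total = len(membership)
    in_gate = sum(1 for is_member in membership if is_member)
    fraction = in_gate / total if total else 0.0
    return in_gate, total, fraction
```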
- get_gate_events(sample_id, gate_name=None, gate_path=None)
Retrieve gated events for a specific gate & sample as a pandas DataFrame. Gated events are processed according to the sample’s compensation & channel transforms.
- Parameters:
sample_id – a text string representing a Sample instance
gate_name – text string of a gate name. If None, all Sample events will be returned (i.e. un-gated)
gate_path – complete tuple of gate IDs for unique set of gate ancestors. Required if gate_name is ambiguous
- Returns:
a pandas DataFrame with the gated events, compensated & transformed according to the sample’s compensation matrix and transforms
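Since the gated events come back as a pandas DataFrame, they can be written out with the usual DataFrame methods. The sketch below saves them as CSV; export_gate_events and csv_path are hypothetical names, and analyze_samples is assumed to have been run first.

```python
def export_gate_events(workspace, sample_id, gate_name, csv_path, gate_path=None):
    """Write the events inside a gate to `csv_path` and return the number of rows written."""
    events = workspace.get_gate_events(sample_id, gate_name=gate_name, gate_path=gate_path)
    events.to_csv(csv_path, index=False)  # pandas DataFrame.to_csv, without the row index
    return len(events)
```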
- plot_gate(sample_id, gate_name, gate_path=None, subsample_count=10000, random_seed=1, x_min=None, x_max=None, y_min=None, y_max=None, color_density=True, bin_width=4)
Returns an interactive plot for the specified gate. The type of plot is determined by the number of dimensions used to define the gate: single dimension gates will be histograms, 2-D gates will be returned as a scatter plot.
- Parameters:
sample_id – The sample ID for the FCS sample to plot
gate_name – Gate name to filter events (only events within the given gate will be plotted)
gate_path – tuple of gate names for full set of gate ancestors. Required if gate_name is ambiguous
subsample_count – Number of events to use as a sub-sample. If the number of events in the Sample is less than the requested sub-sample count, then the maximum number of available events is used for the sub-sample.
random_seed – Random seed used for sub-sampling events
x_min – Lower bound of x-axis. If None, channel’s min value will be used with some padding to keep events off the edge of the plot.
x_max – Upper bound of x-axis. If None, channel’s max value will be used with some padding to keep events off the edge of the plot.
y_min – Lower bound of y-axis. If None, channel’s min value will be used with some padding to keep events off the edge of the plot.
y_max – Upper bound of y-axis. If None, channel’s max value will be used with some padding to keep events off the edge of the plot.
color_density – Whether to color the events by density, similar to a heat map. Default is True.
bin_width – Bin size to use for the color density, in units of event point size. Larger values produce smoother gradients. Default is 4 for a 4x4 grid size.
- Returns:
A Bokeh Figure object containing the interactive plot (a histogram for single dimension gates, a scatter plot for 2-D gates).
- plot_scatter(sample_id, x_label, y_label, gate_name=None, gate_path=None, subsample_count=10000, random_seed=1, color_density=True, bin_width=4, x_min=None, x_max=None, y_min=None, y_max=None)
Returns an interactive scatter plot for the specified channel data.
- Parameters:
sample_id – The sample ID for the FCS sample to plot
x_label – channel label (PnN) to use for the x-axis data
y_label – channel label (PnN) to use for the y-axis data
gate_name – Gate name to filter events (only events within the given gate will be plotted)
gate_path – tuple of gate names for full set of gate ancestors. Required if gate_name is ambiguous
subsample_count – Number of events to use as a sub-sample. If the number of events in the Sample is less than the requested sub-sample count, then the maximum number of available events is used for the sub-sample.
random_seed – Random seed used for sub-sampling events
color_density – Whether to color the events by density, similar to a heat map. Default is True.
bin_width – Bin size to use for the color density, in units of event point size. Larger values produce smoother gradients. Default is 4 for a 4x4 grid size.
x_min – Lower bound of x-axis. If None, channel’s min value will be used with some padding to keep events off the edge of the plot.
x_max – Upper bound of x-axis. If None, channel’s max value will be used with some padding to keep events off the edge of the plot.
y_min – Lower bound of y-axis. If None, channel’s min value will be used with some padding to keep events off the edge of the plot.
y_max – Upper bound of y-axis. If None, channel’s max value will be used with some padding to keep events off the edge of the plot.
- Returns:
A Bokeh Figure object containing the interactive scatter plot.
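A minimal sketch of calling plot_scatter with a reduced sub-sample and a fixed seed, assuming workspace is an already constructed Workspace. The function name scatter_for_gate and the value 5000 are illustrative choices, and the channel labels passed in must match the sample's PnN labels.

```python
def scatter_for_gate(workspace, sample_id, x_label, y_label, gate_name=None, gate_path=None):
    """Return a scatter plot of two channels, optionally restricted to one gate."""
    return workspace.plot_scatter(
        sample_id,
        x_label,
        y_label,
        gate_name=gate_name,
        gate_path=gate_path,
        subsample_count=5000,  # plot at most 5000 events
        random_seed=1,  # fixed seed so repeated calls draw the same sub-sample
    )
```

The returned figure can then be displayed with Bokeh, for example by passing it to bokeh.plotting.show.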