FlowKit Tutorial - Part 2 - The transforms Module & Matrix Class

https://flowkit.readthedocs.io/en/latest/?badge=latest

Part 1 of this tutorial notebook series introduced the Sample class. In part 2, we demonstrate how to transform Sample event data for better visualization and analysis, including how to compensate fluorescent channel data for spillover.

Explanations of why compensation and transformation are necessary for analysis are beyond the scope of this tutorial. There are many resources available online covering these topics, including the Wikipedia page on flow cytometry bioinformatics.

If you have any questions about FlowKit, find any bugs, or feel something is missing from these tutorials, please submit an issue to the GitHub repository.


[1]:
import bokeh
from bokeh.plotting import show
import numpy as np

import flowkit as fk

bokeh.io.output_notebook()
Loading BokehJS ...
[2]:
# check version so users can verify they have the same version/API
fk.__version__
[2]:
'1.1.0'

transforms Module

The transforms module contains classes for the various transformations of flow cytometry data. The classes include:

  • AsinhTransform

  • HyperlogTransform

  • LinearTransform

  • LogicleTransform

  • LogTransform

  • RatioTransform

  • WSPBiexTransform

  • WSPLogTransform

The transforms most commonly applied to FCM data are the “logicle”, “biex”, and “asinh” transforms. The logicle transform (available as the LogicleTransform class) was created specifically for FCM data. In the flow community, the term “biex” typically refers to the biexponential transform used in the FlowJo application (it is the default transform in FlowJo 10); the WSPBiexTransform class replicates the FlowJo implementation of this transform. All transform classes beginning with “WSP” are FlowJo-compatible transforms; the other is WSPLogTransform, a FlowJo variant of the standard LogTransform. In recent years, the inverse hyperbolic sine (asinh) has gained popularity and is available in FlowKit as the AsinhTransform class.

We will compare the output of these transforms and cover their input parameters in this notebook.

Create our Sample

We’ll use an 8-color FCS file for the examples in this notebook. First, let’s create our Sample instance and review the channels. Remember from part 1 that the Sample is sub-sampled by default (for better plot performance).

[3]:
fcs_path = '../../data/8_color_data_set/fcs_files/101_DEN084Y5_15_E01_008_clean.fcs'
[4]:
sample = fk.Sample(fcs_path)
[5]:
sample.channels
[5]:
channel_number pnn pns png pne pnr
0 1 FSC-A 1.0 (0.0, 0.0) 262144.0
1 2 FSC-H 1.0 (0.0, 0.0) 262144.0
2 3 FSC-W 1.0 (0.0, 0.0) 262144.0
3 4 SSC-A 1.0 (0.0, 0.0) 262144.0
4 5 SSC-H 1.0 (0.0, 0.0) 262144.0
5 6 SSC-W 1.0 (0.0, 0.0) 262144.0
6 7 TNFa FITC FLR-A 1.0 (0.0, 0.0) 262144.0
7 8 CD8 PerCP-Cy55 FLR-A 1.0 (0.0, 0.0) 262144.0
8 9 IL2 BV421 FLR-A 1.0 (0.0, 0.0) 262144.0
9 10 Aqua Amine FLR-A 1.0 (0.0, 0.0) 262144.0
10 11 IFNg APC FLR-A 1.0 (0.0, 0.0) 262144.0
11 12 CD3 APC-H7 FLR-A 1.0 (0.0, 0.0) 262144.0
12 13 CD107a PE FLR-A 1.0 (0.0, 0.0) 262144.0
13 14 CD4 PE-Cy7 FLR-A 1.0 (0.0, 0.0) 262144.0
14 15 Time 1.0 (0.0, 0.0) 262144.0

Plot Untransformed

Let’s plot the CD3 channel versus our live/dead marker without transforming the data.

Note: We can use the channel numbers (not indices) for easy reference to channels

[6]:
p = sample.plot_scatter(12, 10, source='raw', subsample=True)
show(p)

Plot LogicleTransform

Without a transformation, it is difficult to visualize the different cell sub-populations. Let’s try the logicle transform and re-plot the data.

[7]:
logicle_xform = fk.transforms.LogicleTransform(
    'logicle',
    param_t=262144,
    param_w=0.5,
    param_m=4.5,
    param_a=0
)
sample.apply_transform(logicle_xform)

p = sample.plot_scatter(12, 10, source='xform', subsample=True)
show(p)

That’s much better, but what are all those parameters? Check the help docs for more info…

The first argument is the ID for the transform, used to easily reference which transform we want to use for a particular gate. Below we’ll look at the documentation for the LogicleTransform to get a better understanding of the other input parameters.

Note: Reading the documentation is helpful, but experimenting with the input parameters gives a more intuitive understanding of what they do.

[8]:
help(fk.transforms.LogicleTransform)
Help on class LogicleTransform in module flowkit._models.transforms._transforms:

class LogicleTransform(flowkit._models.transforms._base_transform.Transform)
 |  LogicleTransform(transform_id, param_t, param_w, param_m, param_a)
 |
 |  Logicle transformation, implemented as defined in the
 |  GatingML 2.0 specification:
 |
 |  logicle(x, T, W, M, A) = root(B(y, T, W, M, A) − x)
 |
 |  where B is a modified bi-exponential function defined as:
 |
 |  B(y, T, W, M, A) = ae^(by) − ce^(−dy) − f
 |
 |  The Logicle transformation was originally defined in the publication:
 |
 |  Moore WA and Parks DR. Update for the logicle data scale including operational
 |  code implementations. Cytometry A., 2012:81A(4):273–277.
 |
 |  :param transform_id: A string identifying the transform
 |  :param param_t: parameter for the top of the linear scale (e.g. 262144)
 |  :param param_w: parameter for the approximate number of decades in the linear region
 |  :param param_m: parameter for the number of decades the true logarithmic scale
 |      approaches at the high end of the scale
 |  :param param_a: parameter for the additional number of negative decades
 |
 |  Method resolution order:
 |      LogicleTransform
 |      flowkit._models.transforms._base_transform.Transform
 |      abc.ABC
 |      builtins.object
 |
 |  Methods defined here:
 |
 |  __init__(self, transform_id, param_t, param_w, param_m, param_a)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |
 |  __repr__(self)
 |      Return repr(self).
 |
 |  apply(self, events)
 |      Apply transform to given events.
 |
 |      :param events: NumPy array of FCS event data
 |      :return: NumPy array of transformed events
 |
 |  inverse(self, events)
 |      Apply the inverse transform to given events.
 |
 |      :param events: NumPy array of FCS event data
 |      :return: NumPy array of inversely transformed events
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |
 |  __abstractmethods__ = frozenset()
 |
 |  ----------------------------------------------------------------------
 |  Methods inherited from flowkit._models.transforms._base_transform.Transform:
 |
 |  __eq__(self, other)
 |      Tests whether 2 transforms share the same attributes, ignoring the 'id' attribute.
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from flowkit._models.transforms._base_transform.Transform:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes inherited from flowkit._models.transforms._base_transform.Transform:
 |
 |  __hash__ = None
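
Picking up on the note above about experimenting with the parameters, here is a small illustrative sketch (the parameter values are made up for demonstration): two logicle transforms that differ only in the “w” (linear width) parameter are applied to the same handful of values near zero, so you can see how the linear region affects the scaled output.

[ ]:
# Two logicle transforms differing only in the "w" (linear width) parameter
narrow_w = fk.transforms.LogicleTransform('narrow', param_t=262144, param_w=0.25, param_m=4.5, param_a=0)
wide_w = fk.transforms.LogicleTransform('wide', param_t=262144, param_w=1.0, param_m=4.5, param_a=0)

# Apply both to the same values near zero and compare the scaled output
test_values = np.array([[-1000., 0., 100., 1000., 10000.]]).T
print(narrow_w.apply(test_values).ravel().round(3))
print(wide_w.apply(test_values).ravel().round(3))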

Plot AsinhTransform

[9]:
asinh_xform = fk.transforms.AsinhTransform(
    'asinh',
    param_t=262144,
    param_m=4.0,
    param_a=0.0
)
sample.apply_transform(asinh_xform)

p = sample.plot_scatter(12, 10, source='xform', subsample=True)
show(p)

The main difference between the logicle and asinh transforms is the extra “w” parameter in the logicle transform, which controls the number of decades in the linear region; the asinh transform has no such parameter. Notice we reduced the “m” parameter from 4.5 (logicle) to 4.0 (asinh) in an attempt to make their outputs more comparable: since asinh has no linear-width parameter, the total number of decades was reduced by roughly the width of the linear region used in the logicle.
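
To make the comparison concrete, the sketch below applies both transforms (as defined above) to the same small set of values; the differences between the two scales are most visible for values near and below zero.

[ ]:
# Compare the logicle and asinh transforms (defined above) on the same values
compare_values = np.array([[-100., 0., 100., 1000., 100000.]]).T
print(logicle_xform.apply(compare_values).ravel().round(3))
print(asinh_xform.apply(compare_values).ravel().round(3))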

Plot WSPBiexTransform

The FlowJo biex transform has similar input parameters, with some subtle but important differences. Details about these parameters from the FlowJo docs are included below. Note that the FlowJo interface does not allow changing the max value parameter; it appears to be set automatically based on the FCS data range.

Adjusting width: The value for w determines the number of channels to be compressed into the linear space around zero. The linear space itself does not change; rather, w changes the number of channels or bins compressed into that linear space.

Width should be set high enough that all of the data in the histogram is visible on screen, but not so high that extra white space is seen to the left hand side of your dimmest distribution. For most practical uses, once all events have been shifted off the axis and there is no more axis ‘pile-up’, then the optimal width basis value has been reached.

Negative: Another component in the biexponential transform calculation is the negative decades or negative space. This is the only other value you will probably ever need to adjust. In cases where a high width basis may start compressing dim events into the negative cluster, you may want to lower the width basis (less compression around zero) and instead increase the negative space by 0.5 – 1.0. Doing this will expand the space around zero so the dim events are still visible, but also expand the negative space to remove the cells from the axis and allow you to see the full distribution.

Positive: The presence of the positive decade adjustment is due to the algorithm used for logicle transformation, but it is not useful in 99.9% of the cases that require adjusting the biexponential transform. It may be appropriate to adjust this value only if your data spans a range greater than 5 decades.

[10]:
# Note: these are the default values in FlowJo (and in FlowKit, so you don't have to remember these specific values)
biex_xform = fk.transforms.WSPBiexTransform(
    'biex',
    max_value=262144.000029,
    positive=4.418540,
    width=-10,
    negative=0
)
sample.apply_transform(biex_xform)

p = sample.plot_scatter(12, 10, source='xform', subsample=True)
show(p)
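
To get a feel for the width parameter described in the FlowJo docs above, one option is to re-apply the transform with a more negative width and re-plot. This is an optional, purely illustrative sketch; the width value below is not a recommendation.

[ ]:
# Illustrative only: a more negative width compresses more channels into the
# linear region around zero
biex_wide = fk.transforms.WSPBiexTransform(
    'biex_wide',
    max_value=262144.000029,
    positive=4.418540,
    width=-100,
    negative=0
)
sample.apply_transform(biex_wide)

p = sample.plot_scatter(12, 10, source='xform', subsample=True)
show(p)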

Other Transform Classes

  • LinearTransform

    The linear transform linearly scales the data. When plotted, the axes values will be different but the plotted events will look the same as untransformed data.

  • LogTransform

    The logarithmic transform works well for visualizing high-intensity event values, but becomes problematic as values approach zero. Low-intensity values are affected by the coarse binning of the log scale (especially values between 0 and 100), and the log function is only defined for x > 0, so zero and negative event values become undefined. It is for these reasons that alternative transforms were created to visualize and analyze FCM data. (A construction sketch for LinearTransform and LogTransform follows this list.)

  • HyperlogTransform

    The hyperlog is another transform created specifically for FCM data, though it is not used as often in recent years. More information about the hyperlog can be found in the original manuscript:

    1. Bruce Bagwell. Hyperlog – A flexible log-like transform for negative, zero, and positive valued data. Cytometry part A, Volume 64A, Issue 1, pages 34-42, March 2005.

  • RatioTransform

    The ratio transform is defined in the GatingML 2.0 specification and is essentially a parameterized ratio of two channels. It is included in FlowKit for full compatibility with GatingML 2.0.

  • WSPLogTransform

    FlowJo implements a custom version of the logarithmic function, and it is included in FlowKit for better compatibility with FlowJo workspaces. If you plan on using a log transform and exporting to a FlowJo workspace from FlowKit, you must use this transform.
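
As referenced in the list above, here is a minimal construction sketch for the LinearTransform and LogTransform classes. The parameter values are illustrative, not recommendations.

[ ]:
# Minimal construction sketch for the simpler GatingML-style transforms
lin_xform = fk.transforms.LinearTransform('lin', param_t=262144.0, param_a=0.0)
log_xform = fk.transforms.LogTransform('log', param_t=262144.0, param_m=4.5)

# Both provide the same apply/inverse interface as the other Transform classes
values = np.array([[1000., 100000.]])
print(lin_xform.apply(values))
print(log_xform.apply(values))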

Exercise for insight on transforms

The “hybrid” linear-log transforms commonly used to display FCM data can be difficult to grasp. FlowKit includes a synthetic FCS file with 2 channels where the events have been arranged in a diamond pattern. This file is used internally for test purposes, but may also be helpful for some users to gain a better intuition for how these hybrid transforms work. Let’s load the diamond sample and plot the events before and after a logicle transform.

[11]:
diamond_sample = fk.Sample("../../data/simple_diamond_example/test_data_diamond_01.fcs")
[12]:
diamond_sample.channels
[12]:
channel_number pnn pns png pne pnr
0 1 channel_A 1.0 (0.0, 0.0) 262144.0
1 2 channel_B 1.0 (0.0, 0.0) 262144.0
[13]:
f = diamond_sample.plot_scatter(1, 2, source='raw')
show(f)
[14]:
diamond_sample.apply_transform(logicle_xform)
f = diamond_sample.plot_scatter(1, 2)
show(f)

We can see in the transformed plot how events in the lower range get distributed more widely, allowing better visualization of any cell populations that may be concentrated there.
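
As an optional extension of this exercise, you can re-use the asinh transform created earlier on the same diamond data and compare how the two hybrid scales redistribute the events.

[ ]:
# Optional: apply the asinh transform (created earlier) to the diamond sample
diamond_sample.apply_transform(asinh_xform)
f = diamond_sample.plot_scatter(1, 2)
show(f)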

Using Transforms without a Sample

For cases where event data has been extracted using FlowKit, or was derived from a source other than an FCS file, and needs to be transformed, the Transform sub-classes can be used independently of a Sample instance. All the Transform sub-classes (except the RatioTransform class) provide apply and inverse methods that take NumPy arrays as input and return transformed data as NumPy arrays.

Below is an example of transforming a NumPy array using the LogicleTransform, but all the Transform sub-classes work the same way.

[15]:
data = np.arange(0, 100000, 100000/20).reshape(2, 10).T
[16]:
data
[16]:
array([[    0., 50000.],
       [ 5000., 55000.],
       [10000., 60000.],
       [15000., 65000.],
       [20000., 70000.],
       [25000., 75000.],
       [30000., 80000.],
       [35000., 85000.],
       [40000., 90000.],
       [45000., 95000.]])
[17]:
xform = fk.transforms.LogicleTransform('logicle', param_t=95000., param_w=0.5, param_m=4.5, param_a=0)
[18]:
xform_data = xform.apply(data)
[19]:
xform_data
[19]:
array([[0.11111111, 0.93801936],
       [0.71515638, 0.9472245 ],
       [0.78240245, 0.95562761],
       [0.82165336, 0.9633573 ],
       [0.84947795, 0.97051356],
       [0.87105003, 0.97717562],
       [0.88867034, 0.98340735],
       [0.90356495, 0.98926098],
       [0.91646525, 0.99477978],
       [0.92784278, 1.        ]])
[20]:
inv_xform_data = xform.inverse(xform_data)
[21]:
inv_xform_data
[21]:
array([[    0., 50000.],
       [ 5000., 55000.],
       [10000., 60000.],
       [15000., 65000.],
       [20000., 70000.],
       [25000., 75000.],
       [30000., 80000.],
       [35000., 85000.],
       [40000., 90000.],
       [45000., 95000.]])
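
As a quick sanity check, the round trip through apply and inverse should recover the original values (up to floating point precision):

[ ]:
# The inverse transform recovers the original values (within floating point precision)
np.allclose(data, inv_xform_data)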

Matrix Class

So far, we have only transformed our sample for better visualization. However, it is often necessary for flow cytometry data to be compensated to correct for fluorescence “spillover” between fluorescent channels. FlowKit supports compensation via the Matrix class.

The Matrix class represents a compensation matrix used for correcting spillover in fluorescence channels. A compensation matrix (also called a spillover matrix) is applied prior to transforming events; however, fluorescent events generally need to be transformed in order to verify whether the compensation matrix is adequate or needs to be modified. This may seem confusing, but it makes sense if you remember that untransformed fluorescent events are very difficult to interpret visually, as we saw above. The good news is that FlowKit will handle the order of compensation and transformation operations for you.
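
For intuition, here is a minimal NumPy sketch of the underlying idea, independent of FlowKit. Row i of the spillover matrix gives the fraction of fluorochrome i's signal recorded in each detector, so the observed events are the true signal multiplied by the spillover matrix, and compensation multiplies by its inverse. The numbers below are made up for illustration.

[ ]:
# Toy spillover matrix: 10% of channel 1 spills into channel 2,
# 5% of channel 2 spills into channel 1
spill = np.array([
    [1.00, 0.10],
    [0.05, 1.00]
])
true_signal = np.array([[1000., 200.]])
observed = true_signal @ spill               # what the cytometer records
recovered = observed @ np.linalg.inv(spill)  # compensation recovers the true signal
print(observed, recovered)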

A Matrix instance can easily be created from a variety of data sources, including:

  • a file path, file handle, or Path object to a CSV/TSV file

  • a NumPy array of spill data

  • a Pandas DataFrame (channel labels as headers)

Below we show the help docs for the Matrix class. First, let’s look at the constructor:

Matrix(
    matrix_id,
    spill_data_or_file,
    detectors,
    fluorochromes=None,
    null_channels=None
)

The matrix_id, like the transform_id for the Transform sub-classes, is used to easily reference which compensation matrix we want to use for a particular gate. However, there are two reserved values that cannot be used when creating a Matrix instance: ‘uncompensated’ is reserved for events in a gate that should not be compensated, and ‘fcs’ is reserved for events that should be compensated using the spillover matrix found in the FCS sample.
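
As a quick illustration of the constructor, the cell below builds a tiny Matrix from a NumPy array using made-up detector names (the real example later in this notebook uses a CSV file).

[ ]:
# Illustrative only: a 2x2 spillover matrix with hypothetical detector names
toy_spill = np.array([
    [1.00, 0.02],
    [0.05, 1.00]
])
toy_matrix = fk.Matrix('toy_spill', toy_spill, detectors=['FL1-A', 'FL2-A'])
toy_matrix.as_dataframe()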

[22]:
help(fk.Matrix)
Help on class Matrix in module flowkit._models.transforms._matrix:

class Matrix(builtins.object)
 |  Matrix(matrix_id, spill_data_or_file, detectors, fluorochromes=None, null_channels=None)
 |
 |  Represents a single compensation matrix from a CSV/TSV file, NumPy array or pandas
 |  DataFrame.
 |
 |  :param matrix_id: Text string used to identify the matrix (cannot be 'uncompensated' or 'fcs')
 |  :param spill_data_or_file: matrix data array, can be either:
 |          - text string from FCS $SPILL or $SPILLOVER metadata value
 |          - a file path or file handle to a CSV/TSV file
 |          - a pathlib Path object to a CSV/TSV file
 |          - a NumPy array of spill data
 |  :param detectors: A list of strings to use for the detector labels.
 |  :param fluorochromes: A list of strings to use for the fluorochrome labels.
 |  :param null_channels: List of PnN labels for channels that were collected
 |      but do not contain useful data. Note, this should only be used if there were
 |      truly no fluorochromes used targeting those detectors and the channels
 |      do not contribute to compensation.
 |
 |  Methods defined here:
 |
 |  __init__(self, matrix_id, spill_data_or_file, detectors, fluorochromes=None, null_channels=None)
 |      Initialize self.  See help(type(self)) for accurate signature.
 |
 |  __repr__(self)
 |      Return repr(self).
 |
 |  apply(self, sample)
 |      Apply compensation matrix to given Sample instance.
 |
 |      :param sample: Sample instance with matching set of detectors
 |      :return: NumPy array of compensated events
 |
 |  as_dataframe(self, fluoro_labels=False)
 |      Returns the compensation matrix as a pandas DataFrame.
 |
 |      :param fluoro_labels: If True, the fluorochrome names are used as the column headers & row indices, else
 |          the detector names are used (default).
 |      :return: pandas DataFrame
 |
 |  inverse(self, sample)
 |      Apply the inverse of the compensation matrix to given Sample instance.
 |
 |      :param sample: Sample instance with matching set of detectors
 |      :return: NumPy array of inversely compensated events
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors defined here:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)

Create Matrix from CSV

Let’s create a Matrix instance from a CSV file made for the Sample we’ve been using

Note: we need to provide the detector labels (i.e. the PnN values for the channels) in the order they appear as columns in the matrix data source.

Our comp matrix file happens to have columns in the order the fluoro channels appear in the FCS file, so we can just get the fluoro PnN labels as follows:

[23]:
detectors = [sample.pnn_labels[i] for i in sample.fluoro_indices]
[24]:
detectors
[24]:
['TNFa FITC FLR-A',
 'CD8 PerCP-Cy55 FLR-A',
 'IL2 BV421 FLR-A',
 'Aqua Amine FLR-A',
 'IFNg APC FLR-A',
 'CD3 APC-H7 FLR-A',
 'CD107a PE FLR-A',
 'CD4 PE-Cy7 FLR-A']
[25]:
comp_file_path = '../../data/8_color_data_set/den_comp.csv'
[26]:
comp_mat = fk.Matrix(
    'my_spill',
    comp_file_path,
    detectors
)

Retrieve Matrix as pandas DataFrame

[27]:
comp_mat.as_dataframe()
[27]:
TNFa FITC FLR-A CD8 PerCP-Cy55 FLR-A IL2 BV421 FLR-A Aqua Amine FLR-A IFNg APC FLR-A CD3 APC-H7 FLR-A CD107a PE FLR-A CD4 PE-Cy7 FLR-A
TNFa FITC FLR-A 1.000000 0.014139 0.0 0.0 0.000458 0.000000 0.015546 0.000000
CD8 PerCP-Cy55 FLR-A 0.000000 1.000000 0.0 0.0 0.020596 0.087982 0.000000 0.127012
IL2 BV421 FLR-A 0.004253 0.000140 1.0 0.0 0.000068 0.000000 0.000000 0.000000
Aqua Amine FLR-A 0.000000 0.000000 0.0 1.0 0.000000 0.000000 0.000000 0.000000
IFNg APC FLR-A 0.000000 0.007508 0.0 0.0 1.000000 0.178821 0.000000 0.020226
CD3 APC-H7 FLR-A 0.000745 0.000000 0.0 0.0 0.010806 1.000000 0.000000 0.127640
CD107a PE FLR-A 0.000342 0.034257 0.0 0.0 0.000455 0.000174 1.000000 0.006291
CD4 PE-Cy7 FLR-A 0.000000 0.017456 0.0 0.0 0.000134 0.060009 0.036800 1.000000

Now, apply the matrix to our sample (our previous transform will be re-applied automatically)

[28]:
sample.apply_compensation(comp_mat)
[29]:
p = sample.plot_scatter(12, 10, source='xform', subsample=True)
show(p)

Using a Matrix Directly

Similar to the Transform subclasses, a Matrix instance can also be used to apply compensation directly via its apply method.

[30]:
comp_events = comp_mat.apply(sample)
[31]:
comp_events
[31]:
array([[ 1.75425031e+05,  1.44243000e+05,  7.97033828e+04, ...,
         3.99687703e+02,  4.04086770e+03,  1.25099998e+00],
       [ 1.23368773e+05,  1.06204000e+05,  7.61279844e+04, ...,
         2.77847769e+02,  9.44784092e+01,  1.25400002e+00],
       [ 1.62093781e+05,  1.36004000e+05,  7.81078359e+04, ...,
         2.94570928e+02,  2.21874879e+02,  1.25699997e+00],
       ...,
       [ 4.55830000e+04,  3.81100000e+04,  7.83869766e+04, ...,
         2.05111296e+03, -8.72674527e+01,  6.89679980e+01],
       [ 1.96877484e+05,  1.41196000e+05,  9.13805156e+04, ...,
         7.31415111e+03,  5.20573371e+01,  6.89689990e+01],
       [ 1.17830875e+05,  9.38370000e+04,  8.22933828e+04, ...,
         3.01327310e+02,  5.72060799e+03,  6.89689990e+01]])
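
Because compensation is applied before transformation, the compensated array can be passed directly to any Transform's apply method; for example, re-using the logicle transform created earlier:

[ ]:
# Transform the compensated events directly (compensate, then transform)
comp_xform_events = logicle_xform.apply(comp_events)
comp_xform_events[:3]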

Modify a Matrix

If the compensation is not adequate and needs adjusting, the underlying NumPy array of the Matrix class can be modified directly, or a new Matrix instance can be created from a DataFrame. Let’s load a new Sample and use the compensation matrix from the $SPILL keyword. We will see that this cytometer-created matrix needs adjusting, and demonstrate how to modify the Matrix.

[32]:
fcs_path = '../../data/100715.fcs'
sample = fk.Sample(fcs_path, subsample=20000)  # subsample to 20k events
[33]:
sample.channels
[33]:
channel_number pnn pns png pne pnr
0 1 FSC-A 1.0 (0.0, 0.0) 262207.0
1 2 FSC-H 1.0 (0.0, 0.0) 262207.0
2 3 SSC-A 1.0 (0.0, 0.0) 261588.0
3 4 B515-A KI67 1.0 (0.0, 0.0) 261588.0
4 5 R780-A CD3 1.0 (0.0, 0.0) 261588.0
5 6 R710-A CD28 1.0 (0.0, 0.0) 261588.0
6 7 R660-A CD45RO 1.0 (0.0, 0.0) 261588.0
7 8 V800-A CD8 1.0 (0.0, 0.0) 261588.0
8 9 V655-A CD4 1.0 (0.0, 0.0) 261588.0
9 10 V585-A CD57 1.0 (0.0, 0.0) 261588.0
10 11 V450-A VIVID / CD14 1.0 (0.0, 0.0) 261588.0
11 12 G780-A CCR5 1.0 (0.0, 0.0) 261588.0
12 13 G710-A CD19 1.0 (0.0, 0.0) 261588.0
13 14 G660-A CD27 1.0 (0.0, 0.0) 261588.0
14 15 G610-A CCR7 1.0 (0.0, 0.0) 261588.0
15 16 G560-A CD127 1.0 (0.0, 0.0) 261588.0
[34]:
# Apply logicle transform and plot the CD3 vs CD8 channels (uncompensated)
xform = fk.transforms.LogicleTransform('logicle', param_t=262144, param_w=0.5, param_m=4.5, param_a=0)
sample.apply_transform(xform)

fig = sample.plot_scatter(5, 8, source='xform', subsample=True)
show(fig)
[35]:
# Apply the matrix from the 'spill' keyword and re-plot
sample.apply_compensation(sample.metadata['spill'])

fig = sample.plot_scatter(5, 8, source='xform', subsample=True)
show(fig)

The default comp matrix doesn’t look too good, so let’s edit it.

We can easily get the loaded compensation, which is a Matrix instance. The Matrix method as_dataframe also takes an optional argument fluoro_labels to display the fluoro labels rather than the detectors.

[36]:
comp_mat = sample.compensation
[37]:
comp_mat.as_dataframe()
[37]:
B515-A R780-A R710-A R660-A V800-A V655-A V585-A V450-A G780-A G710-A G660-A G610-A G560-A
B515-A 1.000000 0.000000 0.000000 0.000088 0.000249 0.000645 0.007198 0.0 0.000000 0.000131 0.000067 0.000582 0.002520
R780-A 0.000000 1.000000 0.071188 0.148448 0.338903 0.009717 0.000000 0.0 0.301380 0.007478 0.012354 0.000000 0.000000
R710-A 0.000000 0.331405 1.000000 0.061965 0.120979 0.004053 0.000000 0.0 0.109117 0.100314 0.005832 0.000000 0.000000
R660-A 0.000000 0.088621 0.389424 1.000000 0.029759 0.065553 0.000000 0.0 0.031294 0.039306 0.091375 0.000396 0.000057
V800-A 0.000000 0.136618 0.010757 0.000000 1.000000 0.000156 0.000000 0.0 0.483235 0.014858 0.000000 0.000000 0.000000
V655-A 0.000000 0.000124 0.019463 0.218206 0.004953 1.000000 0.003583 0.0 0.001311 0.029646 0.408902 0.006506 0.000119
V585-A 0.000000 0.000000 0.000000 0.000000 0.001056 0.002287 1.000000 0.0 0.000389 0.000194 0.000000 0.062551 0.132484
V450-A 0.000000 0.000000 0.000000 0.000000 0.000000 0.008118 0.170066 1.0 0.000000 0.000000 0.000000 0.000000 0.000000
G780-A 0.003122 0.008526 0.001024 0.001163 0.125401 0.018142 0.193646 0.0 1.000000 0.066898 0.161456 0.286823 1.238037
G710-A 0.002015 0.069645 0.194715 0.001008 0.151611 0.001270 0.007133 0.0 1.150032 1.000000 0.016077 0.014674 0.055352
G660-A 0.001685 0.054340 0.277852 0.343008 0.061753 0.077523 0.004263 0.0 0.497488 0.743923 1.000000 0.010329 0.037635
G610-A 0.000000 0.008713 0.048213 0.073190 0.150563 0.386293 0.101896 0.0 0.370277 0.613490 1.218024 1.000000 0.065211
G560-A 0.001684 0.000000 0.000000 0.000095 0.003463 0.015712 0.174122 0.0 0.023802 0.049474 0.132511 0.239216 1.000000

By default, the columns/rows have detector labels. Let’s switch to use the more human-readable fluorescent labels.

[38]:
comp_df = comp_mat.as_dataframe(fluoro_labels=True)
[39]:
comp_df
[39]:
KI67 CD3 CD28 CD45RO CD8 CD4 CD57 VIVID / CD14 CCR5 CD19 CD27 CCR7 CD127
KI67 1.000000 0.000000 0.000000 0.000088 0.000249 0.000645 0.007198 0.0 0.000000 0.000131 0.000067 0.000582 0.002520
CD3 0.000000 1.000000 0.071188 0.148448 0.338903 0.009717 0.000000 0.0 0.301380 0.007478 0.012354 0.000000 0.000000
CD28 0.000000 0.331405 1.000000 0.061965 0.120979 0.004053 0.000000 0.0 0.109117 0.100314 0.005832 0.000000 0.000000
CD45RO 0.000000 0.088621 0.389424 1.000000 0.029759 0.065553 0.000000 0.0 0.031294 0.039306 0.091375 0.000396 0.000057
CD8 0.000000 0.136618 0.010757 0.000000 1.000000 0.000156 0.000000 0.0 0.483235 0.014858 0.000000 0.000000 0.000000
CD4 0.000000 0.000124 0.019463 0.218206 0.004953 1.000000 0.003583 0.0 0.001311 0.029646 0.408902 0.006506 0.000119
CD57 0.000000 0.000000 0.000000 0.000000 0.001056 0.002287 1.000000 0.0 0.000389 0.000194 0.000000 0.062551 0.132484
VIVID / CD14 0.000000 0.000000 0.000000 0.000000 0.000000 0.008118 0.170066 1.0 0.000000 0.000000 0.000000 0.000000 0.000000
CCR5 0.003122 0.008526 0.001024 0.001163 0.125401 0.018142 0.193646 0.0 1.000000 0.066898 0.161456 0.286823 1.238037
CD19 0.002015 0.069645 0.194715 0.001008 0.151611 0.001270 0.007133 0.0 1.150032 1.000000 0.016077 0.014674 0.055352
CD27 0.001685 0.054340 0.277852 0.343008 0.061753 0.077523 0.004263 0.0 0.497488 0.743923 1.000000 0.010329 0.037635
CCR7 0.000000 0.008713 0.048213 0.073190 0.150563 0.386293 0.101896 0.0 0.370277 0.613490 1.218024 1.000000 0.065211
CD127 0.001684 0.000000 0.000000 0.000095 0.003463 0.015712 0.174122 0.0 0.023802 0.049474 0.132511 0.239216 1.000000
[40]:
# Review the CD3/CD8 values
print(comp_df.loc['CD3']['CD8'], comp_df.loc['CD8']['CD3'])
0.3389031912802132 0.13661791418865094
[41]:
# Let's change these
comp_df.loc['CD3', 'CD8'] = 0.15
comp_df.loc['CD8', 'CD3'] = 0.01

Create a new Matrix with our customized comp values, then re-apply to our Sample

[42]:
comp_mat_modified = fk.Matrix(
    'custom_spill',
    comp_df.values,  # use new values
    comp_mat.detectors,  # use old detectors
    fluorochromes=comp_mat.fluorochomes  # use old fluoros
)

sample.apply_compensation(comp_mat_modified)

fig = sample.plot_scatter(5, 8, source='xform', subsample=True)
show(fig)

That’s better. The matrix probably needs some adjustment in other channels, but the process would be the same.
