Identifying and fixing errors in flow data

Feb 7, 2024

Errors in flow data are all too common. The key to reducing errors lies in being able to recognize the symptoms and distinguish between the various causes. With that knowledge, we can work towards preventing the errors from happening again, and, in some cases, fix them in existing data.

So, with that, let's have a look at some bad data! I have plenty of bad data to share, but I also want to thank Leqi Tang for agreeing to share some of his files from panel testing and optimization, which nicely illustrate some of these issues.

Today we're going to start with one of the most basic errors, which is incorrect spillover identification, AKA compensation error. These errors can occur just as easily with spectral flow cytometry, so I'm going to refer to them as spillover errors.

Here's an example with conventional flow:

And one with spectral flow:

What are the diagnostic characteristics here?

Skewed signals
Correlation (or anti-correlation) between channels
Usually happens between fluorophores with overlapping emission

In the first plot, we see that the CD4 is skewed (or leaning) upwards into CCR2. As the fluorophore names suggest, BV785 and BB790 have similar numbers, telling us they both emit at around the same wavelength. BB790 is also excited quite well by the violet laser, so without any spillover adjustment, we're going to get signal from BB790 into BV785.

In the second example, there are actually a couple of things going on. First, we have the traditional skewed spillover error for Foxp3 - eFluor450 with respect to CXCR3 - BV480. The Tregs become hyper-negative as the Foxp3 expression increases. There's no biological way for us to have negative expression, so this is clearly an artefact. Secondly, there's also some fuzz below zero on the CXCR3 - BV480 axis that isn't really clearly Foxp3+ or -. This happens in spectral data due to incorrect unmixing of the autofluorescence signature(s). The autofluorescence is over-corrected, resulting in hyper-negative events, frequently at around 500 nm emission.
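The skew described above can also be checked numerically rather than by eye. Here's a minimal sketch, assuming the compensated (or unmixed) intensities for two channels are already loaded as NumPy arrays. The function name `skew_slope` and the simulated events are hypothetical and only there for illustration. It fits a straight line to one channel as a function of the other: a slope near zero is what we'd expect for unrelated markers that are correctly adjusted, while a clearly positive or negative slope corresponds to the leaning seen in the plots.

```python
import numpy as np

def skew_slope(primary, secondary):
    """Least-squares slope of `secondary` versus `primary` on a linear scale."""
    slope, _intercept = np.polyfit(primary, secondary, deg=1)
    return float(slope)

# Simulated events (made-up numbers, not real cytometry data).
rng = np.random.default_rng(0)
n_events = 5000
signal_a = rng.uniform(1_000, 50_000, size=n_events)  # bright marker on fluorophore A
noise_b = rng.normal(0, 200, size=n_events)           # channel B holds no real signal

well_corrected = noise_b                     # B is independent of A
under_corrected = noise_b + 0.05 * signal_a  # 5% of A still left in B: leans up
over_corrected = noise_b - 0.05 * signal_a   # 5% too much removed: hyper-negative

for label, channel_b in [("well", well_corrected),
                         ("under", under_corrected),
                         ("over", over_corrected)]:
    print(label, round(skew_slope(signal_a, channel_b), 3))
```

A non-zero slope on its own doesn't prove a spillover error, since genuine co-expression produces the same number. It's a flag for which channel pairs to inspect, not a diagnosis.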

Why is this happening to me?

Here are some reasons I see this occurring:

A difference between the samples and the single color controls. For example, you fixed your samples but not your controls. Your controls were stained quickly, but you took longer pipetting your hundreds of samples on the bench in the light. You made up your master mix hours/days ahead of time. Your master mix contains buffers/reagents not present in the controls. Etc.
Improper controls (beads rather than cells, FITC instead of GFP).
Incorrect identification of spectrum within controls. If you use an automated gating process to identify your cells and separate them from debris, or rely on automated gates to identify positive and negative events, this frequently causes issues. You need to feed the computer good information.
Poor signal. You didn't acquire enough events (hopefully at least a few hundred positive cells) or the positive isn't much brighter than the tail of the negative. Or, you forgot to add the antibody to the control. This happens to all of us at some point.
Autofluorescence intrusion. Some channels (roughly 450-500 nm off the UV and violet lasers, and roughly 650-750 nm off the violet and blue lasers) are particularly prone to interference from autofluorescence. The autofluorescence signal will mix into your positive fluorophore signal, creating a hybrid that is not what you're trying to adjust for.
Mix-up of controls. The single color controls are actually two colors, or the wrong color. This can also happen if you're using cells expressing fluorescent proteins.
Instrument issue (blockage, deviation of stream, inadequate cleaning, laser failure, APD/PMT failure). This is less common.
Software/user issue: too many events per second causing variability. Less common.
Software issue: poor algorithm. Fairly common issue for compensation. Not generally an issue for spectral in my experience.

How do we fix this for the future?

Better controls. More controls. Run FMOs for any critical markers or channels that give you trouble. Run multiple types of single color controls if you have to.
Acquire more data for your controls. More data means you can be pickier about which events you use to define positives and negatives.
Better sample prep
Care for the machine

What can we do with the existing data?

For conventional flow:

Check the single color controls. Are they correctly compensated using the same matrix that has been applied to the samples? If not, you can adjust based on the controls.
Check against an FMO control if you have one. Is there signal spilling into that channel?
Try AutoSpill. Here's an example where the samples were initially compensated using beads in DIVA. In the lower row, the compensation has been applied after calculating it in AutoSpill using single-stained cell controls. There are still issues with spread between similar fluorophores (BB790 vs BV785 and BV750 vs BUV737), but these are symmetrical, not skewed. That's a panel design choice/issue.

Otherwise, if you have other sets of single color controls that were run on the same instrument with the same settings close to when you ran your experiment, you could try using those. Some instruments (e.g., ZE5) are quite stable even with large panels. Others, not so much.
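Checking a single color control against a matrix comes down to one comparison: after compensation, the positive and negative events of that control should sit at the same median in every other channel. Here's a minimal sketch of that check, assuming the control's raw intensities and the spillover matrix are available as NumPy arrays. The function names, the two-channel layout, the positive/negative split and all the numbers are illustrative assumptions, not output from DIVA, FlowJo or AutoSpill.

```python
import numpy as np

def compensate(raw, spillover):
    """raw = true @ spillover (rows: fluorophores, columns: detectors),
    so the compensated values are raw @ inverse(spillover)."""
    return raw @ np.linalg.inv(spillover)

def residual_spill(raw_control, spillover, stained, other, positive):
    """Shift of the `other` channel between positive and negative events,
    as a fraction of the shift in the stained channel. Ideal value: 0."""
    comp = compensate(raw_control, spillover)
    pos, neg = comp[positive], comp[~positive]
    shift_other = np.median(pos[:, other]) - np.median(neg[:, other])
    shift_stained = np.median(pos[:, stained]) - np.median(neg[:, stained])
    return float(shift_other / shift_stained)

# Simulated single-stained cell control: 10% of channel 0 spills into channel 1.
rng = np.random.default_rng(1)
true_spillover = np.array([[1.0, 0.10],
                           [0.0, 1.0]])
positive = np.arange(2000) >= 1000           # second half of the events is stained
true_signal = np.zeros((2000, 2))
true_signal[positive, 0] = 10_000
raw_control = true_signal @ true_spillover + rng.normal(0, 50, size=(2000, 2))

# Hypothetical matrix calculated from other controls (e.g. beads): 6%, not 10%.
matrix_in_use = np.array([[1.0, 0.06],
                          [0.0, 1.0]])

print(round(residual_spill(raw_control, true_spillover, 0, 1, positive), 3))
print(round(residual_spill(raw_control, matrix_in_use, 0, 1, positive), 3))
```

In real data the `positive` mask would come from your gating of the control, which is exactly where a poor automated gate can feed in bad information. A residual well away from zero means the matrix doesn't describe that control.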

For spectral flow:

Check out the unmixing tips
Run a new, better control and see if that helps
Use the reference controls if you've programmed them into your machine
If you're using automated autofluorescence calculation, check that AF channel against all others to look for spillover errors. Consider using a directed or targeted autofluorescence identification.
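To see why a mismatched autofluorescence signature produces hyper-negative events, it helps to write unmixing as ordinary least squares. The sketch below makes that assumption (instrument software may use weighted or otherwise modified algorithms), and the four-detector signatures are invented for illustration. The helper name `unmix` is also hypothetical. The cells carry no fluorophore at all, only autofluorescence, and the reference AF signature overestimates emission in the fluorophore's peak detector.

```python
import numpy as np

def unmix(raw, spectra):
    """Ordinary least-squares unmixing.
    raw: events x detectors, spectra: components x detectors.
    Returns events x components abundances."""
    abundances, *_ = np.linalg.lstsq(spectra.T, raw.T, rcond=None)
    return abundances.T

# Invented, normalised four-detector signatures (not from any real instrument).
fluor = np.array([0.1, 1.0, 0.4, 0.05])
af_true = np.array([1.0, 0.5, 0.2, 0.1])   # what the sample cells actually emit
af_ref = np.array([1.0, 0.8, 0.3, 0.1])    # mismatched autofluorescence reference

# Three unstained cells with increasing autofluorescence and no fluorophore.
af_levels = np.array([500.0, 1000.0, 2000.0])
raw = af_levels[:, None] * af_true[None, :]

good = unmix(raw, np.vstack([fluor, af_true]))
bad = unmix(raw, np.vstack([fluor, af_ref]))

print(np.round(good[:, 0], 1))  # fluorophore abundance with the matching AF signature
print(np.round(bad[:, 0], 1))   # fluorophore abundance with the mismatched signature
```

With the matching signature the fluorophore abundance is zero, as it should be. With the mismatched one it goes negative, and more so the more autofluorescent the cell is, which is the anti-correlation against the AF channel to look for.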

What about editing the matrix?

I don't generally recommend applying manually edited compensation matrices to samples. It's very easy to introduce new errors if you're just basing your compensation off of what you think it should look like rather than any controls. Furthermore, most of the time spillover errors propagate through multiple dimensions of the data, so the spillover coefficient you adjust may not be the right one and you may introduce unseen errors elsewhere.
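The propagation is easy to see numerically. Compensation applies the inverse of the spillover matrix, so one wrong spillover coefficient can show up in more than one compensated channel whenever the fluorophores involved overlap with others. The sketch below uses an invented three-fluorophore matrix: A, B and C are placeholders and every coefficient is made up.

```python
import numpy as np

# Invented spillover matrix for three placeholder fluorophores A, B, C.
# Rows: fluorophores, columns: detectors (raw = true @ spillover).
true_spillover = np.array([[1.00, 0.20, 0.10],
                           [0.05, 1.00, 0.30],
                           [0.02, 0.25, 1.00]])

# The matrix in use has ONE wrong coefficient: A into C is 0.02 instead of 0.10.
matrix_in_use = true_spillover.copy()
matrix_in_use[0, 2] = 0.02

# What compensation with the wrong matrix does to the true signals:
# compensated = true @ true_spillover @ inverse(matrix_in_use)
error = true_spillover @ np.linalg.inv(matrix_in_use)

# A perfect matrix would give the identity; the deviation is the leftover error.
leftover = error - np.eye(3)
print(np.round(leftover, 3))
```

In this made-up case, a single wrong coefficient (A into C) leaves A-positive cells under-compensated in C and, because C overlaps with B, slightly over-compensated in B as well. Someone looking only at the A vs B plot could easily adjust the A-into-B coefficient, which was never wrong.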

For instance, something is pretty clearly wrong in the plot below:

We have negative correlation between the autofluorescence channel and the Spark Blue 550. This is an example of the error that can arise from using the automated autofluorescence extraction. It might be tempting to add a compensation matrix with negative coefficients for both of these, but that's not the correct answer. We get a hint of why it isn't by looking at the SB550+ cells: there are cells that are correctly positioned as well as ones that are hypernegative.

If we look instead at the GFP (close to SB550 in the spectrum), we can see this is a more likely source of error for the cells that are SB550+ and hypernegative for AF. There is a stronger correlation in this pair of channels than in the SB550 vs AF. Why is this also showing up in SB550 vs AF? Because the GFP and the CD3-SB550 are both on T cells.

To show you what happens when you incorrectly edit a compensation matrix, let's use this example of some brain cells run on an Aria. This is 9 colors, which is near the max for this particular Aria, and the controls were brain cells, so messy.

The files produced by DIVA have some clear problems with respect to APC, but which channel(s) are the problem?

I can reduce these problems by editing the matrix.

What I've actually done here, however, is hide the O4-APC population in an unused channel. So, we wouldn't be able to see or sort those cells. The plots look a bit cleaner, though.

In this last version, I've re-generated the compensation matrix using AutoSpill in FlowJo. This is a lot better. There is still a spillover error because the microglial autofluorescence hasn't been taken into account, and that's causing the hypernegative events for BV510 and the fuzz in PE-Fire 700. Also, there's too much Ly-6G-BV510 antibody being used here, so everything is shifted up a bit.

Here are the matrices that come from DIVA:

and AutoSpill using the same controls:

Things that resemble spillover error:

In the next example we have strong correlation between the channels. Is this a spillover error? The two fluorophores (APC and Super Bright 436) are very dissimilar and in fact have no emission overlap. No, this is actually an example of co-expression. The Thy1.1 here is a surface reporter driven by the Foxp3 promoter, so the same cells express Foxp3 protein in the nucleus and do so at a similar level.

Similarly, here we have perfect co-expression, not spillover error (well, mostly). These cells are transduced with tags in combinations, so some cells express S, others VSVg, and others both. The tags are on the same construct, so the levels match perfectly. Here we can actually see the distortion introduced by the biexponential scaling where the S and VSVg co-expressing cells jag slightly at the lower end. Also, there's spreading error because AF532 and CF488A have similar profiles. This spreading manifests as the trumpet-shaped widening in the brighter cells. There may also be a spillover error in the VSVg-CF488A, which is leaning away from the S-AF532 in the single positive cells.

Here's a different type of spillover-ish error resulting from tandem breakdown:

The PD-1-PE-Cy7 is hypernegative versus CX3CR1-PE-Fire 810. It's reasonable to have spillover error between these two because they're similar fluorophores hitting adjacent detectors. Why is this happening, though?

If we look at PD-1-PE-Cy7 vs T-bet-PE, we can see that the PE-Cy7 appears to have undergone tandem breakdown. So, the fluorescence has blue-shifted towards PE and away from PE-Fire 810. Presumably the single color control didn't capture this tandem breakdown, most likely in this case because the sample was lung (containing metabolically active macrophages) and the control was spleen.

That's all for today.

Colibri Cytometry
