CytoPipeline: building and visualizing automated pre-processing and quality control pipelines for flow cytometry data
Philippe Hauchamps, Dan Lin, Laurent Gatto
Computational Biology and Bioinformatics (CBIO) Unit, de Duve Institute, UCLouvain, Belgium
With the increase in the dimensionality of conventional flow cytometry data over the past years, there is a growing need to replace or complement traditional manual analysis (i.e. iterative 2D gating) with automated data analysis pipelines. Examples of such pipelines have been documented in the recent literature (e.g. Quintelier et al. 2021; Ashhurst et al. 2021). A crucial part of these pipelines consists of pre-processing the raw data and applying quality control filtering to it, so that only high-quality events are used in the downstream statistical analysis. This part can in turn be split into a number of elementary steps: margin event removal, signal compensation, scale transformation, debris and dead cell removal, batch effect correction, etc. However, for a bioinformatician who designs and builds automated flow cytometry data analysis pipelines, assembling and assessing the pre-processing part can be challenging for a number of reasons. First, each of the elementary steps can be implemented using various methods and R packages. Second, the order of the steps can have an impact on the downstream analysis results. Finally, each method typically comes with its own specific, unstandardized diagnostics and visualizations, making objective comparison difficult for the end user. In this work, we present an R package to build, compare and assess pre-processing pipelines for flow cytometry data. We focus on three main aspects: explicit and centralized description of the pipeline steps, extensibility and maintainability through the use of S4 classes, and genericity and flexibility of interactive visualization utilities. To demonstrate our tool, we present the different steps involved in designing a pre-processing pipeline on publicly available data, and show the accompanying visualization utilities.

References:
Quintelier, Katrien, Artuur Couckuyt, Annelies Emmaneel, Joachim Aerts, Yvan Saeys, and Sofie Van Gassen. 2021. “Analyzing High-Dimensional Cytometry Data Using FlowSOM.” Nature Protocols 16 (8): 3775–3801.
Ashhurst, Thomas Myles, Felix Marsh-Wakefield, Givanna Haryono Putri, Alanna Gabrielle Spiteri, Diana Shinko, Mark Norman Read, Adrian Lloyd Smith, and Nicholas Jonathan Cole King. 2021. “Integration, Exploration, and Analysis of High-Dimensional Single-Cell Cytometry Data Using Spectre.” Cytometry. Part A: The Journal of the International Society for Analytical Cytology, no. cyto.a.24350 (April). https://doi.org/10.1002/cyto.a.24350.