pyInfinityFlow
pyInfinityFlow is a Python package that imputes hundreds of features from flow cytometry data using XGBoost regression. It is an adaptation of the original implementation in R [1], with the goal of optimizing the workflow for large datasets by increasing the speed and memory efficiency of the analysis pipeline.
The package includes tools to read and write FCS files that follow the FCS 3.1 file standard, loading them into AnnData objects for easy downstream analysis of single-cell data with Scanpy [2] and UMAP [3].
Read more about the pyInfinityFlow package on its Read the Docs page!
Graphical Summary

Recommended Installation
It is recommended to set up a virtual environment to install the package.
Creating a new conda environment and installing pyInfinityFlow:
conda create -n pyInfinityFlow python=3.8
conda activate pyInfinityFlow
pip install pyInfinityFlow
After these commands, pyInfinityFlow is installed in a conda environment named ‘pyInfinityFlow’.
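One way to confirm the installation from Python is to query the metadata of the installed distribution. The following is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the distribution name `pyInfinityFlow` is assumed to match the name used in the pip command above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str = "pyInfinityFlow") -> Optional[str]:
    """Return the installed version of dist_name, or None if it is not installed.

    Hypothetical helper: it only inspects package metadata in the active
    environment and does not import the package itself.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version()
    if found is None:
        print("pyInfinityFlow is not installed in this environment")
    else:
        print(f"pyInfinityFlow {found} is installed")
```

Run this with the conda environment activated; if it reports that the package is missing, the most likely cause is that a different environment is active.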
Quickstart
To run the pyInfinityFlow pipeline, use a command like the following, replacing the example paths with the corresponding paths on your machine:
pyInfinityFlow --data_dir /home/kyle/Documents/GitHub/pyInfinityFlow/example_data/mouse_lung_dataset_subset/ \
--out_dir /media/kyle_ssd1/example_outputs/ \
--backbone_annotation /home/kyle/Documents/GitHub/pyInfinityFlow/example_data/mouse_lung_dataset_subset_backbone_anno.csv \
--infinity_marker_annotation /home/kyle/Documents/GitHub/pyInfinityFlow/example_data/mouse_lung_dataset_subset_infinity_marker_anno.csv
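The paths in the command above are specific to one machine. As a rough sketch, the same command can be assembled from your own paths in Python. The helper `build_pyinfinityflow_command` and the placeholder paths are hypothetical; only the four flags shown in the command above are used, and no other options are assumed.

```python
import shlex
from typing import List


def build_pyinfinityflow_command(
    data_dir: str,
    out_dir: str,
    backbone_annotation: str,
    infinity_marker_annotation: str,
) -> List[str]:
    """Assemble the Quickstart command as an argument list.

    Hypothetical helper: it only arranges the four flags shown in the
    Quickstart and does not check that the paths exist.
    """
    return [
        "pyInfinityFlow",
        "--data_dir", str(data_dir),
        "--out_dir", str(out_dir),
        "--backbone_annotation", str(backbone_annotation),
        "--infinity_marker_annotation", str(infinity_marker_annotation),
    ]


if __name__ == "__main__":
    # Placeholder paths; substitute the locations on your machine.
    cmd = build_pyinfinityflow_command(
        data_dir="example_data/mouse_lung_dataset_subset/",
        out_dir="example_outputs/",
        backbone_annotation="example_data/mouse_lung_dataset_subset_backbone_anno.csv",
        infinity_marker_annotation="example_data/mouse_lung_dataset_subset_infinity_marker_anno.csv",
    )
    # Print a shell-quoted version that can be pasted into a terminal.
    print(shlex.join(cmd))
    # To launch it from Python instead: subprocess.run(cmd, check=True)
```

Passing the command as a list rather than a single string avoids quoting problems when a path contains spaces.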
Selected References
Contents:
- Installation
- Tutorial - Command Line Tools
- API Tutorial: Full pyInfinityFlow Pipeline
  - Provide paths for your machine
  - Step 1: Preparing the Inputs
  - Step 2: Checking the Inputs and Building an InfinityFlowFileHandler
  - Step 3: Specify Output Directories
  - Step 4: Fitting the XGBoost Regression Models
  - Step 5: Validating Regression Models
  - Step 6: Predict InfinityMarker Values for Final InfinityFlow Object
  - Step 7: Isotype Background Correction
  - Step 8: Silencing Features
  - Step 9: Dimensionality Reduction
  - Step 10: Making Feature Plots
  - Step 11: Clustering the Data
  - Step 12: Find Markers for Clusters
  - Step 13: Moving Features out of Silent
  - Step 14: Saving Regression Outputs
  - Finish
- Command Line Tools
- API