The SCMMIB_pipeline folder is a sub-project of the SCMMIB project. The main project's GitHub repository is at https://github.com/bm2-lab/SCMMI_Benchmark.
SCMMIB_pipeline contains:
- preprocessing_scripts: scripts to generate the gene activity matrix and the downsampling and scalability simulation datasets from all benchmark datasets, plus an introduction to these scripts.
- envs: conda environment YAML files for the 40 benchmark algorithms, plus an example of how to use the env files.
- wdl_workflow: the WDL workflow file and input JSON file for all benchmark methods, plus an example input JSON configuration.
- benchmark_methods: task-specific module scripts for 6 single-cell multimodal integration types. These scripts can be executed with the uniform pipeline in wdl_workflow and the algorithm-specific envs.
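As a rough illustration of the input JSON configuration mentioned for wdl_workflow, the sketch below builds an inputs file in the `<workflow>.<input>` key form that WDL input JSON files use. The workflow name, parameter names, and paths are hypothetical placeholders, not the actual fields of the SCMMIB workflow; consult the example in wdl_workflow for the real ones.

```python
import json


def build_inputs(workflow, **params):
    """Prefix each parameter with the workflow name, as WDL input JSON expects."""
    return {f"{workflow}.{name}": value for name, value in params.items()}


# All names and paths below are hypothetical placeholders.
inputs = build_inputs(
    "example_workflow",
    method="example_method",
    task="example_task",
    input_path="/path/to/input.h5ad",
    output_dir="/path/to/output",
)

# Serialize with stable key order so the file is easy to diff.
inputs_json = json.dumps(inputs, indent=2, sort_keys=True)
print(inputs_json)
```

Keeping the parameters in a plain dictionary makes it straightforward to generate one inputs file per method and dataset combination in a loop.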
The output of all benchmark methods can be evaluated with the scmmib Python package in our main project repository: https://github.com/bm2-lab/SCMMI_Benchmark.
The pre-processed project data (h5ad, rds and rSeurat formats) are available in the figshare folder (https://figshare.com/articles/dataset/SCMMIB_Register_Report_Stage_2_processed_datasets/27161451/2). Simulation datasets in the SCMMIB study can be generated with the R scripts in the data_simulation folder.
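For readers unfamiliar with scalability simulations, one common approach is to draw reproducible random subsets of cells at several fixed sizes. The sketch below shows that idea in plain Python; it is an assumption about the general technique, not a port of the project's R scripts, and the cell identifiers, subset sizes, and seed are hypothetical.

```python
import random


def downsample_cells(cell_ids, n_cells, seed=0):
    """Return a reproducible random subset of n_cells identifiers, in original order."""
    if n_cells > len(cell_ids):
        raise ValueError("n_cells exceeds the number of available cells")
    rng = random.Random(seed)  # local generator, so global random state is untouched
    chosen = set(rng.sample(range(len(cell_ids)), n_cells))
    return [cell for i, cell in enumerate(cell_ids) if i in chosen]


# Hypothetical cell barcodes and subset sizes.
cells = [f"cell_{i}" for i in range(1000)]
subsets = {n: downsample_cells(cells, n, seed=42) for n in (100, 250, 500)}

for n, subset in subsets.items():
    print(n, len(subset))
```

Fixing the seed means the same subset is produced on every run, which keeps benchmark results comparable across methods.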
- Tutorial for reproducing all methods and all tasks in the SCMMIB project: tutorial 1.
- Tutorial for applying the scmmib pipeline to new integration methods: tutorial 2.
- Tutorial for applying the scmmib pipeline to new benchmark datasets: tutorial 3.
Fu, Shaliu; Wang, Shuguang; Si, Duanmiao; Li, Gaoyang; Gao, Yawei; Liu, Qi (2024). Benchmarking single-cell multi-modal data integrations. figshare. Journal contribution. https://doi.org/10.6084/m9.figshare.26789572.v1
SCMMIB project processed datasets. figshare. Dataset. https://doi.org/10.6084/m9.figshare.27161451.v2