Liger单细胞多组学分析:整合单细胞RNA-seq数据集

介绍完liger的数学原理,我们便可以来跑一跑liger的流程,并对liger的数学原理和参数选择有更深刻的理解。我们可以在GitHub上找到liger的码源以及相关的教程,本文也主要翻译自GitHub上的教程,并加入了一些自己的函数和理解。

载入包和数据

本文的教程希望整合控制组以及干扰素刺激组的PBMC单细胞RNA-seq数据,数据来源于Kang et al, 2017,liger的作者则提供了以RData格式存储的经过down sample的表达矩阵,我们可以从这里下载并直接使用。当然,如果你使用10x的数据的话,则也可以使用read10X函数将数据读入。

############################################
## Project: Liger-learning
## Script Purpose: run pipeline of liger integration and further analysis
## Data: 2020.10.24
## Author: Yiming Sun
############################################

# general setting
setwd('/data/User/sunym/project/liger_learning/')
# library
library(liger)
library(Seurat)
library(dplyr)
library(tidyverse)
library(viridis)
library(parallel)
#my function
MySuggestLambda <- function(object,lambda,k,knn_k = 20,thresh = 1e-04,max.iters = 100,k2 = 500,ref_dataset = NULL,resolution = 1,parl_num){
  res <- matrix(ncol = 2,nrow = length(lambda))
  colnames(res) <- c('lambda','alignment')
  res[,'lambda'] <- lambda
  cl <- makeCluster(parl_num)
  clusterExport(cl,c('object','k','knn_k','thresh','max.iters','k2','ref_dataset','resolution'),envir = environment())
  clusterExport(cl,c('optimizeALS','quantileAlignSNF','calcAlignment'),envir = .GlobalEnv)
  alignment_score <- parLapply(cl = cl,lambda,function(x){
    object <- optimizeALS(object,k = k,lambda = x,thresh = thresh,max.iters = max.iters)
    object <- quantileAlignSNF(object,knn_k = knn_k,k2 = k2,ref_dataset = ref_dataset,resolution = resolution)
    return(calcAlignment(object))
  })
  stopCluster(cl)
  res[,'alignment'] <- (alignment_score %>% unlist())
  res <- as.data.frame(res)
  res[,1] <- as.numeric(as.character(res[,1]))
  res[,2] <- as.numeric(as.character(res[,2]))
  p <- ggplot(res,aes(x = lambda,y = alignment))+geom_point()+geom_line()
  return(list(res,p))
}
# 1.Integrating Multiple Single-Cell RNA-seq Datasets
#load data
ctrl_dge <- readRDS("./data/PBMC_control.RDS")
stim_dge <- readRDS("./data/PBMC_interferon-stimulated.RDS")

这里我写了一个函数MySuggestLambda用来优化λ的选择,其实liger包自带有suggestK和suggestLambda函数帮助我们优化K和λ的选择,但是suggestLambda函数似乎有什么bug,在一台服务器上跑完优化后会报一个莫名其妙的错误,而在另一台服务器上则完全正常,我还没有发现bug在哪里所以就自己写了个类似的函数帮助我优化λ的选择,如果你使用liger默认的函数也会报错的话不妨试试我的函数,虽然他内存运用效率特别低,建议配置好50G左右的swap区,因为当R跑多线程时,如果内存不够rsession会被直接挤掉,配置好swap能拯救一下它(血泪教训)。

标准化

在降维和聚类(整合)之前,我们通常都要对表达矩阵进行标准化、挑选variable gene最后再归一化。需要注意的是,liger使用的是非负矩阵分解的方法,因此我们在做归一化的时候注意不要做中心化,因为这会导致负数的出现。

#initialize a liger object
ifnb_liger <- createLiger(list(ctrl = ctrl_dge, stim = stim_dge))
#explore liger object
dim([email protected]$ctrl)
head(colnames([email protected]$ctrl))
head(rownames([email protected]$ctrl))
dim([email protected]$stim)
#normalize data
ifnb_liger <- normalize(ifnb_liger)
#select variable gene
ifnb_liger <- selectGenes(ifnb_liger)
#scale data but not center
ifnb_liger <- scaleNotCenter(ifnb_liger)

可以简单关注一下liger的数据结构,原始的表达矩阵存储在raw.data中,聚类的信息存储在clusters中,降维的信息存储在tsne.coords中。这里就不得不提一下seurat,seurat有一个meta.data的表格来存储各个细胞的各种信息,因此在我们做个性化的比较分析的时候特别有用。liger虽然也有cell.data,但我们似乎不太能够利用liger自带的函数指定cell.data中的变量来进行比较,我想到的解决方法是把liger导出为seurat,在seurat中做一些个性化的比较,而liger则单纯用来做整合、降维以及聚类。

《Liger单细胞多组学分析:整合单细胞RNA-seq数据集》

非负矩阵分解

Liger利用非负矩阵分解的方法进行数据集之间的整合,我们需要提供的参数为K和λ。通常,K反映了数据集中真实cluster的个数,而λ则反映了数据集之间的异质性,如果你对你的数据完全没有把握,那么默认参数K=20,λ=5往往也能有不错的表现。当然我们也可以根据KL散度和Alignment score来优化K和λ的选择。

#find the optimized k
suggestK(ifnb_liger,k.test = seq(10,30,1),lambda = 5,plot.log2 = FALSE,num.cores = 10)

设置λ=5,调用10个进程,测试K从10到30时所计算得到的KL散度,最终选择20作为最优的K。

《Liger单细胞多组学分析:整合单细胞RNA-seq数据集》

使用liger自带的函数优化λ的选择,取K=20,测试lambda的范围从1到10,调用10个进程,最终选择5作为最优的λ。

#find the optimized lambda
#use liger function
suggestLambda(object = ifnb_liger,k = 20,lambda.test = seq(1,10,1),num.cores = 10)

《Liger单细胞多组学分析:整合单细胞RNA-seq数据集》

使用我自己写的函数来优化λ的选择,

#use my function
suggest_lambda <- MySuggestLambda(ifnb_liger,lambda = seq(1,10,1),k = 20,parl_num = 10)
suggest_lambda[[2]]

《Liger单细胞多组学分析:整合单细胞RNA-seq数据集》

官方的函数应该是在Alignment score上做了平均而使得曲线更加平滑了,总之建议使用官方的函数,如果出现报错的话,可以试试我的函数。最后我们选择K=20,λ=5进行非负矩阵分解。

#use lambda 5 k 20
#integrate NMF
ifnb_liger <- optimizeALS(ifnb_liger,k = 20,lambda = 5,max.iters = 30,thresh = 1e-06)

聚类

做完非负矩阵分解后就可以进行聚类了,liger聚类的方法也比较有趣,可以参考liger的数学原理。quantile_norm函数会构建SFN图(shared factor neighborhood graph)对细胞进行聚类,在聚类完成后将同一cluster中不同数据集来源的细胞在坐标轴(W+V矩阵)上进行分位数归一化。

#Quantile Normalization and Joint Clustering
ifnb_liger <- quantile_norm(ifnb_liger,knn_k = 20,quantiles = 50,min_cells = 20,do.center = FALSE,
                            max_sample = 1000,refine.knn = TRUE,eps = 0.9)

这里需要讲两个参数,knn_k是在构建SFN图中选择邻居细胞的个数,do.center是指是否做中心化,一般默认为FALSE,但当我们处理一些比较dense的数据时,如DNA甲基化数据,metagene的loading值通常会很高,此时可以做一下中心化来进行矫正。当然,我们也可以用一些图聚类算法来进一步优化我们的聚类,这一步可做可不做。

# you can use louvain cluster to detect and assign cluster
ifnb_liger <- louvainCluster(ifnb_liger, resolution = 0.25)

可视化与下游分析

  • UMAP降维图
#Visualization and Downstream Analysis
ifnb_liger <- runUMAP(ifnb_liger, distance = 'cosine', n_neighbors = 30, min_dist = 0.3)
all.plots <- plotByDatasetAndCluster(ifnb_liger, axis.labels = c('UMAP 1', 'UMAP 2'), return.plots = T)
all.plots[[1]] + all.plots[[2]]

《Liger单细胞多组学分析:整合单细胞RNA-seq数据集》

  • Gene loading plot

我们可以找到在各个cluster中高表达的metagene,以及这些metagene在不同的数据集中(W和V矩阵)对其贡献最高的基因。

#find the maximun loading factor on each cluster and the maximun loading gene on the factor
gene_loadings <- plotGeneLoadings(ifnb_liger, do.spec.plot = FALSE, return.plots = TRUE)
gene_loadings[[4]]

《Liger单细胞多组学分析:整合单细胞RNA-seq数据集》

metagene_4在cluster_3中特异性高表达,那么metagene_4在W矩阵中对其贡献最大的基因(Shared),则很有可能是cluster_3的marker基因。类似的,metagene_4在V矩阵中对其贡献最大的基因,则反映了同一细胞类型数据集之间(或处理之间)的差异。我们也可以很容易得到图片背后的数据。

#the metadata of the factor/gene loading plot 
#UMAP
head(gene_loadings[[4]][[1]]$data)
#gene loading value on the specific factor
head(gene_loadings[[4]][[2]][[1]]$data[order(gene_loadings[[4]][[2]][[1]]$data$loadings,decreasing = TRUE),])
head(gene_loadings[[4]][[2]][[2]]$data[order(gene_loadings[[4]][[2]][[2]]$data$loadings,decreasing = TRUE),])
head(gene_loadings[[4]][[2]][[3]]$data[order(gene_loadings[[4]][[2]][[3]]$data$loadings,decreasing = TRUE),])
# conclusion: gene_loadings object is a list of all the factor, specific the factor you want
# conclusion: the factor object listed by gene_loadings has two lists, the first is the UMAP with the factor loading value, the second is the dataset-sepcific matrix W
# the dataset specific matrix W is a list contains the all the dataset and shared gene loading value
# what a mess!

总之就是list套list,很麻烦,可以对照着图和输出理解一下。

> head(gene_loadings[[4]][[1]]$data)
                          Factor4       tSNE1      tSNE2
ctrlTCAGCGCTGGTCAT-1 0.0018384202  -9.5633178 -1.5025842
ctrlTTATGGCTTCATTC-1           NA  -7.4026990 -0.5618219
ctrlACCCACTGCTTAGG-1 0.0007611497   3.7179575  4.9839707
ctrlATGGGTACCCCGTT-1 0.0001820607 -11.0730367 -4.6024990
ctrlTGACTGGACAGTCA-1           NA   0.5273923 -8.4171533
ctrlGTGTAGTGGTTGTG-1 0.0013735353   8.8018236  5.2826267
> #gene loading value on the specific factor
> head(gene_loadings[[4]][[2]][[1]]$data[order(gene_loadings[[4]][[2]][[1]]$data$loadings,decreasing = TRUE),])
       loadings      xpos top_k
FAM43A 2.522310 1.0000000  TRUE
FGFBP2 2.411532 0.9995883  TRUE
SPON2  2.264957 0.9991766 FALSE
ZNF638 1.925689 0.9987649 FALSE
KARS   1.916834 0.9983532 FALSE
PRDX5  1.889175 0.9979415  TRUE
> head(gene_loadings[[4]][[2]][[2]]$data[order(gene_loadings[[4]][[2]][[2]]$data$loadings,decreasing = TRUE),])
       loadings      xpos top_k
GNLY  136.98236 1.0000000  TRUE
NKG7  119.07990 0.9995883  TRUE
CLIC3 117.71097 0.9991766  TRUE
GZMB  108.58467 0.9987649  TRUE
PRF1  108.14754 0.9983532  TRUE
GZMA   93.22887 0.9979415  TRUE
> head(gene_loadings[[4]][[2]][[3]]$data[order(gene_loadings[[4]][[2]][[3]]$data$loadings,decreasing = TRUE),])
        loadings      xpos top_k
PRF1    3.273654 1.0000000  TRUE
SDF2L1  2.913147 0.9995883 FALSE
CD38    2.204181 0.9991766  TRUE
IFNG    2.111837 0.9987649 FALSE
HSPA5   2.081298 0.9983532 FALSE
CIAPIN1 2.002635 0.9979415 FALSE
  • 差异表达分析

转录组分析怎么离得开差异表达分析嘛,我们可以用runWilcoxon比较不同的cluster之间的差异表达基因,也可以比较同一个cluster的不同数据集来源之间的差异表达基因。

#differential gene express
# for different cluster
cluster.results <- runWilcoxon(ifnb_liger, compare.method = "clusters")
head(cluster.results)
datasets.results <- runWilcoxon(ifnb_liger, compare.method = "datasets")
head(datasets.results)

只需要指定compare.method就可以进行不同的比较,不过这个函数其实也只能做这两种比较,如果我想做之前提到的更加个性化的比较,就只能导入到seurat中再进行比较。

> cluster.results <- runWilcoxon(ifnb_liger, compare.method = "clusters")
[1] "Performing Wilcoxon test on ALL datasets: ctrl, stim"
> head(cluster.results)
        feature group   avgExpr        logFC statistic
1 RP11-206L10.2     0 -23.01898 -0.005557170   4259604
2 RP11-206L10.9     0 -23.01914 -0.005979592   4259602
3     LINC00115     0 -23.00609 -0.031260202   4252898
4         NOC2L     0 -21.83167  0.292228474   4340248
5        KLHL17     0 -23.00556  0.003658342   4262141
6       PLEKHN1     0 -23.01194 -0.039663835   4249915
        auc        pval       padj pct_in pct_out
1 0.4998057 0.570575909 0.69419204    100     100
2 0.4998055 0.570110484 0.69419204    100     100
3 0.4990189 0.138680541 0.24813295    100     100
4 0.5092682 0.004585054 0.01435755    100     100
5 0.5001034 0.819517383 0.88363326    100     100
6 0.4986689 0.044537037 0.10099905    100     100
> datasets.results <- runWilcoxon(ifnb_liger, compare.method = "datasets")
[1] "Performing Wilcoxon test on ALL datasets: ctrl, stim"
> head(datasets.results)
        feature  group   avgExpr       logFC statistic
1 RP11-206L10.2 0-ctrl -23.02585 -0.01408078  665437.5
2 RP11-206L10.9 0-ctrl -23.02585 -0.01376717  665437.5
3     LINC00115 0-ctrl -22.99983  0.01283157  666564.5
4         NOC2L 0-ctrl -21.87312 -0.08498766  662383.5
5        KLHL17 0-ctrl -22.98625  0.03960509  667718.0
6       PLEKHN1 0-ctrl -23.02585 -0.02853147  664846.0
        auc       pval      padj pct_in pct_out
1 0.4995560 0.30577296 0.6990712    100     100
2 0.4995560 0.30577296 0.6990712    100     100
3 0.5004020 0.59231247 0.8964793    100     100
4 0.4972633 0.61893894 0.9110340    100     100
5 0.5012680 0.09102028 0.4580475    100     100
6 0.4991119 0.14726281 0.5773302    100     100

对差异表达进行筛选。

> # filter the data to find the meaningful DEG for your own experiment
> cluster.results <- cluster.results[cluster.results$padj < 0.05,]
> cluster.results <- cluster.results[cluster.results$logFC > 3,]
> wilcoxon.cluster_3 <- cluster.results[cluster.results$group == 3, ]
> wilcoxon.cluster_3 <- wilcoxon.cluster_3[order(wilcoxon.cluster_3$padj), ]
> markers <- wilcoxon.cluster_3[1:20, ]
> head(markers)
      feature group    avgExpr     logFC statistic       auc
41861    GNLY     3  -5.282466 16.147323   2379942 0.9647642
46904   CLIC3     3 -12.962070  9.485599   1946458 0.7890417
47130    PRF1     3 -12.260573  9.830082   1970124 0.7986352
49239    GZMB     3  -7.840488 13.697178   2231966 0.9047789
52832    NKG7     3  -6.594620 14.433185   2324832 0.9424243
48310   KLRC1     3 -18.000059  4.916239   1606731 0.6513252
               pval          padj pct_in pct_out
41861  0.000000e+00  0.000000e+00    100     100
46904  0.000000e+00  0.000000e+00    100     100
47130  0.000000e+00  0.000000e+00    100     100
49239  0.000000e+00  0.000000e+00    100     100
52832  0.000000e+00  0.000000e+00    100     100
48310 1.839405e-289 4.099114e-286    100     100

可以看一下PRF1在不同的cluster之间的表达情况。

# feature plot on UMAP
PRF1 <- plotGene(ifnb_liger, "PRF1", axis.labels = c('UMAP 1', 'UMAP 2'), return.plots = T)
PRF1[[1]] + PRF1[[2]]

《Liger单细胞多组学分析:整合单细胞RNA-seq数据集》

可以看到PRF1在两个数据集的cluster_3中均显著高表达。也可以用小提琴图来展示。

# vlnplot
PRF1 <- plotGeneViolin(ifnb_liger,gene = 'PRF1',return.plots = TRUE)
PRF1[[1]] + PRF1[[2]]

《Liger单细胞多组学分析:整合单细胞RNA-seq数据集》

完整代码

############################################
## Project: Liger-learning
## Script Purpose: run pipeline of liger integration and further analysis
## Data: 2020.10.24
## Author: Yiming Sun
############################################

# general setting
setwd('/data/User/sunym/project/liger_learning/')
# library
library(liger)
library(Seurat)
library(dplyr)
library(tidyverse)
library(viridis)
library(parallel)
#my function
MySuggestLambda <- function(object,lambda,k,knn_k = 20,thresh = 1e-04,max.iters = 100,k2 = 500,ref_dataset = NULL,resolution = 1,parl_num){
  res <- matrix(ncol = 2,nrow = length(lambda))
  colnames(res) <- c('lambda','alignment')
  res[,'lambda'] <- lambda
  cl <- makeCluster(parl_num)
  clusterExport(cl,c('object','k','knn_k','thresh','max.iters','k2','ref_dataset','resolution'),envir = environment())
  clusterExport(cl,c('optimizeALS','quantileAlignSNF','calcAlignment'),envir = .GlobalEnv)
  alignment_score <- parLapply(cl = cl,lambda,function(x){
    object <- optimizeALS(object,k = k,lambda = x,thresh = thresh,max.iters = max.iters)
    object <- quantileAlignSNF(object,knn_k = knn_k,k2 = k2,ref_dataset = ref_dataset,resolution = resolution)
    return(calcAlignment(object))
  })
  stopCluster(cl)
  res[,'alignment'] <- (alignment_score %>% unlist())
  res <- as.data.frame(res)
  res[,1] <- as.numeric(as.character(res[,1]))
  res[,2] <- as.numeric(as.character(res[,2]))
  p <- ggplot(res,aes(x = lambda,y = alignment))+geom_point()+geom_line()
  return(list(res,p))
}
# 1.Integrating Multiple Single-Cell RNA-seq Datasets
#load data
ctrl_dge <- readRDS("./data/PBMC_control.RDS")
stim_dge <- readRDS("./data/PBMC_interferon-stimulated.RDS")
#initialize a liger object
ifnb_liger <- createLiger(list(ctrl = ctrl_dge, stim = stim_dge))
#explore liger object
dim([email protected]$ctrl)
head(colnames([email protected]$ctrl))
head(rownames([email protected]$ctrl))
dim([email protected]$stim)
#normalize data
ifnb_liger <- normalize(ifnb_liger)
#select variable gene
ifnb_liger <- selectGenes(ifnb_liger)
#scale data but not center
ifnb_liger <- scaleNotCenter(ifnb_liger)
#find the optimized k
suggestK(ifnb_liger,k.test = seq(10,30,1),lambda = 5,plot.log2 = FALSE,num.cores = 10)
#find the optimized lambda
#use liger function
suggestLambda(object = ifnb_liger,k = 20,lambda.test = seq(1,10,1),num.cores = 10)
#use my function
suggest_lambda <- MySuggestLambda(ifnb_liger,lambda = seq(1,10,1),k = 20,parl_num = 10)
suggest_lambda[[2]]
#use lambda 5 k 20
#integrate NMF
ifnb_liger <- optimizeALS(ifnb_liger,k = 20,lambda = 5,max.iters = 30,thresh = 1e-06)
#Quantile Normalization and Joint Clustering
ifnb_liger <- quantile_norm(ifnb_liger,knn_k = 20,quantiles = 50,min_cells = 20,do.center = FALSE,
                            max_sample = 1000,refine.knn = TRUE,eps = 0.9)
# you can use louvain cluster to detect and assign cluster
ifnb_liger <- louvainCluster(ifnb_liger, resolution = 0.25)
head([email protected])
#Visualization and Downstream Analysis
ifnb_liger <- runUMAP(ifnb_liger, distance = 'cosine', n_neighbors = 30, min_dist = 0.3)
all.plots <- plotByDatasetAndCluster(ifnb_liger, axis.labels = c('UMAP 1', 'UMAP 2'), return.plots = T)
#the place of each cell and the cluster it belongs
head(all.plots[[1]]$data)
pdf(file = './res/fig_201024/plot_by_dataset_and_cluster.pdf',width = 8,height = 4)
all.plots[[1]] + all.plots[[2]]
dev.off()
#find the maximun loading factor on each cluster and the maximun loading gene on the factor
gene_loadings <- plotGeneLoadings(ifnb_liger, do.spec.plot = FALSE, return.plots = TRUE)
pdf(file = './res/fig_201024/gene_loading_plot.pdf',width = 6.5,height = 6)
gene_loadings[[4]]
dev.off()
#the metadata of the factor/gene loading plot 
#UMAP
head(gene_loadings[[4]][[1]]$data)
#gene loading value on the specific factor
head(gene_loadings[[4]][[2]][[1]]$data[order(gene_loadings[[4]][[2]][[1]]$data$loadings,decreasing = TRUE),])
head(gene_loadings[[4]][[2]][[2]]$data[order(gene_loadings[[4]][[2]][[2]]$data$loadings,decreasing = TRUE),])
head(gene_loadings[[4]][[2]][[3]]$data[order(gene_loadings[[4]][[2]][[3]]$data$loadings,decreasing = TRUE),])
# conclusion: gene_loadings object is a list of all the factor, specific the factor you want
# conclusion: the factor object listed by gene_loadings has two lists, the first is the UMAP with the factor loading value, the second is the dataset-sepcific matrix W
# the dataset specific matrix W is a list contains the all the dataset and shared gene loading value
# what a mess!
#differential gene express
# for different cluster
cluster.results <- runWilcoxon(ifnb_liger, compare.method = "clusters")
head(cluster.results)
datasets.results <- runWilcoxon(ifnb_liger, compare.method = "datasets")
head(datasets.results)
# filter the data to find the meaningful DEG for your own experiment
cluster.results <- cluster.results[cluster.results$padj < 0.05,]
cluster.results <- cluster.results[cluster.results$logFC > 3,]
wilcoxon.cluster_3 <- cluster.results[cluster.results$group == 3, ]
wilcoxon.cluster_3 <- wilcoxon.cluster_3[order(wilcoxon.cluster_3$padj), ]
markers <- wilcoxon.cluster_3[1:20, ]
head(markers)
# feature plot on UMAP
PRF1 <- plotGene(ifnb_liger, "PRF1", axis.labels = c('UMAP 1', 'UMAP 2'), return.plots = T)
pdf(file = './res/fig_201024/feature_plot_PRF1.pdf',width = 8,height = 4)
PRF1[[1]] + PRF1[[2]]
dev.off()
# vlnplot
PRF1 <- plotGeneViolin(ifnb_liger,gene = 'PRF1',return.plots = TRUE)
pdf(file = './res/fig_201024/vlnplot_PRF1.pdf',width = 6,height = 3)
PRF1[[1]] + PRF1[[2]]
dev.off()
# compare cluster marker express winthin and across dataset
IFIT3 <- plotGene(ifnb_liger, "IFIT3", axis.labels = c('UMAP 1', 'UMAP 2'), return.plots = TRUE)
IFITM3 <- plotGene(ifnb_liger, "IFITM3", axis.labels = c('UMAP 1', 'UMAP 2'), return.plots = TRUE)
pdf(file = './res/fig_201024/cluster_marker_within_between_dataset.pdf',width = 8,height = 8)
plot_grid(IFIT3[[1]],IFIT3[[2]],IFITM3[[1]],IFITM3[[2]], ncol=2)
dev.off()

参考链接

Joint definition of cell types from multiple scRNA-seq datasets

点赞
  1. ZhuYueXin说道:

    我想问问Liger这个方法和Seurat中的SCT方法有什么区别呢?我好像看到Seurat的SCT也是使用负二项分析的。。。。。十分抱歉,可能问题比较简单,谢谢

  2. mrdoge说道:

    首先呢,SCT是一种批次矫正的方法,它的目标是对表达矩阵进行矫正从而使得不同批次的数据能够可比。Seurat的单细胞数据整合方法则是基于CCA的,而在做CCA的整合之前呢我们需要先对表达矩阵进行标准化,标准化的方法有两种,一种是取对数,另一种就是SCT。所以呢,SCT只是一种标准化的方法的区别,而Seurat和Liger在数据整合上的差异则体现在CCA和NMF的差异上。我不知道我这么回答有没有解释清楚这里面的关系。如果你比较感兴趣CCA和NMF的差异呢,可以查阅一下相关的数学原理,NMF的数学原理在我的博客里有。有任何问题都欢迎找我讨论。

发表评论

%d 博主赞过: