Identifying the true dimensionality of a dataset can be challenging and uncertain for the user. We therefore suggest three approaches to consider. The first is more supervised: exploring PCs to determine relevant sources of heterogeneity, which could be used in conjunction with GSEA, for example. The second implements a statistical test based on a random null model, but is time-consuming for large datasets and may not return a clear PC cutoff. The third is a commonly used heuristic that can be calculated instantly. In this example, all three approaches yielded similar results, and we would have been justified in choosing anything between PC 7 and 12 as a cutoff.

We chose 10 here, but encourage users to consider the following:

- Dendritic cell and NK aficionados may recognize that genes strongly associated with PCs 12 and 13 define rare immune subsets (e.g. MZB1 is a marker for plasmacytoid DCs). However, these groups are so rare that they are difficult to distinguish from background noise in a dataset of this size without prior knowledge.
- We encourage users to repeat downstream analyses with a different number of PCs (10, 15, or even 50!). As you will observe, the results often do not differ dramatically.
- We advise users to err on the higher side when choosing this parameter. For example, performing downstream analyses with only 5 PCs does significantly and adversely affect results.
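To make the second and third approaches concrete, here is a minimal, language-agnostic sketch in Python. It is a toy illustration on synthetic data, not Seurat's actual JackStraw or elbow-plot implementation: the matrix, the 90% variance cutoff, and the number of permutations are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic expression-like matrix: 500 cells x 200 genes, built from a
# few real axes of variation plus noise (a stand-in for a scaled matrix).
n_cells, n_genes, n_signal = 500, 200, 8
signal = rng.normal(size=(n_cells, n_signal)) @ rng.normal(size=(n_signal, n_genes)) * 2
X = signal + rng.normal(size=(n_cells, n_genes))

def pc_variances(mat):
    """Variance explained by each PC, via SVD of the centered matrix."""
    centered = mat - mat.mean(axis=0)
    s = np.linalg.svd(centered, compute_uv=False)
    return s**2 / (mat.shape[0] - 1)

var = pc_variances(X)

# Elbow-style heuristic: keep PCs until cumulative variance explained
# passes a threshold. The 90% cutoff is an illustrative choice.
cum = np.cumsum(var) / var.sum()
elbow = int(np.argmax(cum > 0.90)) + 1

# Permutation null model, in the spirit of the statistical-test approach:
# shuffle each gene independently to break cell-cell structure, and keep
# PCs whose variance exceeds the typical top-PC variance under the null.
null_max = []
for _ in range(20):  # 20 permutations, kept small for speed
    Xp = np.apply_along_axis(rng.permutation, 0, X)
    null_max.append(pc_variances(Xp).max())
threshold = np.mean(null_max)
n_signif = int(np.sum(var > threshold))
```

On this synthetic matrix, both heuristics recover a cutoff close to the planted number of signal dimensions; on real data they can disagree, which is why the tutorial recommends comparing all three approaches before settling on a value.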