## ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.3     ✓ purrr   0.3.4
## ✓ tibble  3.0.4     ✓ dplyr   1.0.6
## ✓ tidyr   1.0.2     ✓ stringr 1.4.0
## ✓ readr   1.4.0     ✓ forcats 0.5.1

Purpose

This document runs through the analysis requested by Jess for the proximal tubule (PT) paper.

The resulting D13 and D13+14 pooled replicate libraries resolved 19,956 and 15,852 individual cell transcriptomes per timepoint, respectively. tSNE plots showed the resolution of distinct clusters for both D13 monolayers and resulting PT-enhanced (D13+14) organoids (Figure 3B – Jess still to add).

To confirm whether the enhanced protocol improves the specification, patterning, and maturation of kidney cell types, D13 and D13+14 samples were directly compared to monolayers and organoids arising from our standard organoid protocol (D7 and D7+14[?]), as well as normal human (fetal?) kidney, using the R package DevKidCC (Wilson et al. 2021) (Figure 3C-D – Sean to add proportion plots).

Load extended diff datasets

Run DevKidCC on required data

Same plot but wider.

We want to compare the data to relevant samples within the literature. We can filter using GetSampleIDs and include these in the DotPlotCompare function call.

Here the most relevant samples are those younger than D16 for the NPC comparison and older than D20 for the PT comparison.
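The age cut-offs above can be sketched as a simple filter on a sample metadata table. This is a minimal illustration only: the `samples` data frame, its column names, and the sample IDs are hypothetical placeholders, and it does not use or reproduce the GetSampleIDs interface.

```r
# Hypothetical sample metadata; IDs and ages are placeholders, not real samples.
samples <- data.frame(
  sample_id = c("sample_A", "sample_B", "sample_C", "sample_D"),
  age_days  = c(7, 13, 21, 27),
  stringsAsFactors = FALSE
)

# Samples with an age below 16 days, for the NPC comparison.
npc_ids <- samples$sample_id[samples$age_days < 16]

# Samples with an age above 20 days, for the PT comparison.
pt_ids <- samples$sample_id[samples$age_days > 20]

print(npc_ids)
print(pt_ids)
```

The two ID vectors could then be passed to whichever argument of the plotting call accepts a sample selection.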

With the original groupings

Merging the NPC groups; bars below the plot will signify category (+/-, or colours between the graph and the gene labels).

Note the lack of HOXD11 in Low et al., seemingly at the same time point (although not a monolayer).

For the PT genes, there is high expression of CUBN and SLC3A1 and lower (but still clearly present) expression of LRP2, SLC47A1 and HNF4A, while ACE2 is clearly expressed, albeit at a low level and in fewer cells.

This may be where imputation with MAGIC could come in handy. The original MAGIC paper reported utility in filling in gene expression that is likely present but was not captured, particularly for lowly expressed genes, allowing for better head-to-head plotting (i.e. we can plot ACE2 against PT markers and show the correlation between them).

Running MAGIC has strengthened the correlation between markers such as LRP2 and HNF4A, as values that were originally 0 have been imputed to have expression. ACE2 expression is highly correlated with HNF4A expression. ACE2 does not appear to have been over-imputed, as we see no ACE2 co-expressed with the distal marker GATA3.
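The two checks described here (marker correlation and absence of co-expression with a distal marker) can be sketched in base R. This is an illustration under stated assumptions: `expr` is a small synthetic genes-by-cells matrix standing in for imputed values, and `marker_cor` and `coexpr_fraction` are hypothetical helper names, not part of MAGIC or DevKidCC.

```r
# Synthetic genes x cells matrix standing in for imputed expression values.
# The numbers are made up for illustration and are not results from this analysis.
expr <- rbind(
  HNF4A = c(0.0, 1.0, 2.0, 3.0, 0.0, 0.0),
  ACE2  = c(0.0, 0.5, 1.0, 1.5, 0.0, 0.0),
  GATA3 = c(0.0, 0.0, 0.0, 0.0, 2.0, 3.0)
)

# Pearson correlation between two genes across all cells.
marker_cor <- function(mat, gene_a, gene_b) {
  cor(mat[gene_a, ], mat[gene_b, ], method = "pearson")
}

# Fraction of cells in which both genes exceed a threshold.
coexpr_fraction <- function(mat, gene_a, gene_b, threshold = 0) {
  mean(mat[gene_a, ] > threshold & mat[gene_b, ] > threshold)
}

print(marker_cor(expr, "ACE2", "HNF4A"))
print(coexpr_fraction(expr, "ACE2", "GATA3"))
```

On real data, a co-expression fraction near zero for ACE2 and GATA3 alongside a high ACE2 and HNF4A correlation would be consistent with the interpretation above; the threshold is a free parameter and would need to be chosen with the imputed value range in mind.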