The R source program dfHclust.R is in the GitHub repository for lab 7. Source it into your R session, and then verify that it works with the call
dfHclust(mtcars, labels=rownames(mtcars))
If it fails, add any missing libraries, and do what it takes to get it to work. Interrupt the shiny session to proceed.
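If the call fails because a package is not installed, a small helper like the one below can report what is missing before you retry. This is a minimal sketch: the helper name missing_pkgs is hypothetical, and the package names passed to it are assumptions, so check the library() calls in dfHclust.R for the actual list.

```r
# Report which of the named packages cannot be loaded in this R installation.
# `missing_pkgs` is a hypothetical helper, not part of dfHclust.R.
missing_pkgs <- function(pkgs) {
  ok <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!ok]
}

# Assumed dependencies; replace with the packages that dfHclust.R actually loads.
missing_pkgs(c("shiny", "cluster"))
```

Anything the helper returns can then be installed with install.packages() (or the Bioconductor installer for Bioconductor packages) before sourcing the file again.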
Set up inputs to dfHclust using the tissuesGeneExpression data:
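library(tissuesGeneExpression)
data(tissuesGeneExpression)          # loads e (expression matrix) and tab (sample table)
df = data.frame(t(e))                # samples in rows, features in columns
no = which(tab$SubType == "normal")  # indices of the normal samples
df = df[no,]
tisslabel = tab$Tissue[no]           # tissue labels aligned with the rows of df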
Use dfHclust(df[,1:50], tisslabel) as a check. Interrupt the shiny session to proceed.
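A non-interactive sanity check of the same kind can be sketched in base R. The example below uses simulated stand-ins for e and tab (the names e_toy and tab_toy and the tissue labels are made up), and it assumes Euclidean distance with complete linkage, which may differ from the settings chosen in the app.

```r
set.seed(1)

# Toy stand-ins for `e` (features x samples) and `tab` (one row per sample).
e_toy <- cbind(matrix(rnorm(20 * 5, mean = 0), nrow = 20),
               matrix(rnorm(20 * 5, mean = 6), nrow = 20))
tab_toy <- data.frame(Tissue  = rep(c("liver", "kidney"), each = 5),
                      SubType = "normal")

df_toy <- data.frame(t(e_toy))            # samples become rows
stopifnot(nrow(df_toy) == nrow(tab_toy))  # rows must align with the labels

# Hierarchical clustering, cut into two groups, cross-tabulated with the labels.
hc <- hclust(dist(df_toy), method = "complete")
cl <- cutree(hc, k = 2)
xt <- table(cluster = cl, tissue = tab_toy$Tissue)
print(xt)
```

If the clustering recovers the labels, each row of the table has counts in a single tissue column.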
Map the column names of df to gene symbols. Use hgu133a.db. Remove columns that cannot be mapped to a symbol, and rename the remaining columns with their symbols.
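The filter-and-rename step can be sketched on toy data. The probe IDs, symbols, and the lookup vector probe2sym below are all made up for illustration; with hgu133a.db a comparable lookup could be obtained from AnnotationDbi::mapIds() using keytype "PROBEID" and column "SYMBOL". Dropping repeated symbols is an extra choice made here to keep column names unique, not something the exercise requires.

```r
# Toy expression matrix: probes in rows, samples in columns (like `e`).
e_toy <- matrix(1:12, nrow = 4,
                dimnames = list(c("1001_at", "1002_s_at", "1003_at", "1004_x_at"),
                                c("s1", "s2", "s3")))

# Hypothetical probe-to-symbol lookup; NA marks an unmappable probe.
probe2sym <- c("1001_at" = "GENEA", "1002_s_at" = "GENEB",
               "1003_at" = NA,      "1004_x_at" = "GENEA")

# check.names = FALSE keeps the probe IDs intact; by default data.frame()
# rewrites names that start with a digit (e.g. "1001_at" becomes "X1001_at").
df_toy <- data.frame(t(e_toy), check.names = FALSE)

sym  <- unname(probe2sym[colnames(df_toy)])
keep <- !is.na(sym) & !duplicated(sym)   # drop unmapped and repeated symbols
df_sym <- df_toy[, keep, drop = FALSE]
colnames(df_sym) <- sym[keep]
```

Because df in this lab was built with the default data.frame() behaviour, it is worth checking whether its column names still match the original probe IDs before looking them up.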
nrow=3, ncol=4, template=template[,-1])
We’ll reduce the data matrix (for convenience) to 701 unique genes
uex = expressionPatterns[,uniqueGenes]
We’ll begin with a factorization using a basis of rank 10.
library(NMF)  # provides nmf(), basis(), and coef()
m10 = nmf(uex, rank=10)
## <Object of class: NMFfit>
## # Model:
## <Object of class:NMFstd>
## features: 405
## basis/rank: 10
## samples: 701
## # Details:
## algorithm: brunet
## seed: random
## RNG: 403L, 624L, ..., 2099891502L [e38d032700af470a3a1013304e0fcab6]
## distance metric: 'KL'
## residuals: 4901.298
## Iterations: 2000
## Timing:
## user system elapsed
## 125.648 0.024 125.697
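To make the factorization concrete, here is a minimal base R sketch of non-negative matrix factorization by multiplicative updates on a toy matrix. It is not the brunet algorithm reported in the output above: it minimizes squared (Frobenius) error rather than KL divergence, and the function name nmf_mu is hypothetical. The returned W plays the role of basis(m10) (features by rank) and H the role of coef(m10) (rank by samples).

```r
set.seed(42)

# Toy non-negative matrix: 6 features x 8 samples built from 2 hidden patterns.
V <- matrix(runif(6 * 2), nrow = 6) %*% matrix(runif(2 * 8), nrow = 2)

# Multiplicative updates for the squared-error objective ||V - W H||^2.
nmf_mu <- function(V, rank, iters = 200, eps = 1e-9) {
  W <- matrix(runif(nrow(V) * rank), nrow = nrow(V))  # features x rank
  H <- matrix(runif(rank * ncol(V)), nrow = rank)     # rank x samples
  errs <- numeric(iters + 1)
  errs[1] <- sqrt(sum((V - W %*% H)^2))
  for (i in seq_len(iters)) {
    H <- H * (t(W) %*% V) / (t(W) %*% W %*% H + eps)
    W <- W * (V %*% t(H)) / (W %*% H %*% t(H) + eps)
    errs[i + 1] <- sqrt(sum((V - W %*% H)^2))
  }
  list(W = W, H = H, errs = errs)
}

fit <- nmf_mu(V, rank = 2)
rel_err <- tail(fit$errs, 1) / sqrt(sum(V^2))  # relative reconstruction error
```

Both factors stay non-negative because every update multiplies a non-negative matrix by a non-negative ratio.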
The authors of the Wu et al. 2016 PNAS paper justify a rank 21 basis.
m21 = nmf(uex, rank=21)
To visualize the clustering of the expression patterns with the rank 10 basis, use
imageBatchDisplay(basis(m10), nrow=4,ncol=3,template=template[,-1])
The ‘predicted’ matrix with the rank 10 basis is
PM10 = basis(m10)%*%coef(m10)
Compare the faithfulness of the rank 10 and rank 21 approximations.
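One possible way to quantify faithfulness is the relative reconstruction error between the data and the predicted matrix. The sketch below defines a hypothetical helper rel_error and demonstrates it on toy data, using truncated SVD as a stand-in for the two low-rank fits so that the example runs without the NMF package. Other measures, such as the KL residual reported by the fit, are equally reasonable.

```r
# Relative Frobenius error of an approximation V_hat to V (hypothetical helper).
rel_error <- function(V, V_hat) sqrt(sum((V - V_hat)^2)) / sqrt(sum(V^2))

set.seed(7)
V <- matrix(runif(30 * 20), nrow = 30)

# Truncated SVD of rank k, used here only as a stand-in for a low-rank fit.
low_rank <- function(V, k) {
  s <- svd(V)
  s$u[, 1:k, drop = FALSE] %*% diag(s$d[1:k], nrow = k) %*%
    t(s$v[, 1:k, drop = FALSE])
}

err_small <- rel_error(V, low_rank(V, 2))  # lower-rank approximation
err_large <- rel_error(V, low_rank(V, 5))  # higher-rank approximation
c(rank2 = err_small, rank5 = err_large)
```

With the lab objects, and assuming uex is a numeric matrix, the analogous calls would be rel_error(uex, PM10) and rel_error(uex, basis(m21) %*% coef(m21)).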
Produce the display of the m21 basis with imageBatchDisplay and check that the constituents are similar to those shown as principal patterns below (from the Wu et al. paper).
Can any of the patterns found with the rank 10 basis be mapped to key anatomical components of the blastocyst fate schematic?