Reproducing results of a publication. Systems biology of vaccination for seasonal influenza in humans. Nakaya et al (2011).
The paper uses a systems biology approach to study the immune response to influenza vaccination over three seasons.
Two cohorts were vaccinated with TIV (2007 and 2008) and one with LAIV (2008).
The data was made available on GEO after publication. The SuperSeries is referenced in the paper.
library(GEOquery)
gse <- getGEO("GSE29619")
The returned object is a list of 3 ExpressionSets, one per SubSeries/cohort.
gse
## $`GSE29619-GPL13158_series_matrix.txt.gz`
## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 54715 features, 163 samples
## element names: exprs
## protocolData: none
## phenoData
## sampleNames: GSM733843 GSM733844 ... GSM734021 (163 total)
## varLabels: title geo_accession ... data_row_count (43 total)
## varMetadata: labelDescription
## featureData
## featureNames: 1007_PM_s_at 1053_PM_at ... AFFX-TrpnX-M_at (54715
## total)
## fvarLabels: ID GB_ACC ... Gene Ontology Molecular Function (16
## total)
## fvarMetadata: Column Description labelDescription
## experimentData: use 'experimentData(object)'
## Annotation: GPL13158
##
## $`GSE29619-GPL3921_series_matrix.txt.gz`
## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 22277 features, 84 samples
## element names: exprs
## protocolData: none
## phenoData
## sampleNames: GSM734022 GSM734023 ... GSM734105 (84 total)
## varLabels: title geo_accession ... data_row_count (38 total)
## varMetadata: labelDescription
## featureData
## featureNames: 1007_s_at 1053_at ... AFFX-TrpnX-M_at (22277
## total)
## fvarLabels: ID GB_ACC ... Gene Ontology Molecular Function (16
## total)
## fvarMetadata: Column Description labelDescription
## experimentData: use 'experimentData(object)'
## Annotation: GPL3921
##
## $`GSE29619-GPL570_series_matrix.txt.gz`
## ExpressionSet (storageMode: lockedEnvironment)
## assayData: 54675 features, 27 samples
## element names: exprs
## protocolData: none
## phenoData
## sampleNames: GSM733816 GSM733817 ... GSM733842 (27 total)
## varLabels: title geo_accession ... data_row_count (43 total)
## varMetadata: labelDescription
## featureData
## featureNames: 1007_s_at 1053_at ... AFFX-TrpnX-M_at (54675
## total)
## fvarLabels: ID GB_ACC ... Gene Ontology Molecular Function (16
## total)
## fvarMetadata: Column Description labelDescription
## experimentData: use 'experimentData(object)'
## Annotation: GPL570
The seasons cannot be identified without looking in the objects.
names(gse)
## [1] "GSE29619-GPL13158_series_matrix.txt.gz"
## [2] "GSE29619-GPL3921_series_matrix.txt.gz"
## [3] "GSE29619-GPL570_series_matrix.txt.gz"
Let’s look at the metadata available on GEO.
library(Biobase)
es_LAIV <- gse[[1]]
head(pData(es_LAIV), 3)
## title geo_accession
## GSM733843 2008 LAIV subject ID 1 at D0 post-vaccination GSM733843
## GSM733844 2008 LAIV subject ID 1 at D3 post-vaccination GSM733844
## GSM733845 2008 LAIV subject ID 1 at D7 post-vaccination GSM733845
## status submission_date last_update_date type
## GSM733843 Public on Jul 10 2011 May 29 2011 Jul 10 2011 RNA
## GSM733844 Public on Jul 10 2011 May 29 2011 Jul 10 2011 RNA
## GSM733845 Public on Jul 10 2011 May 29 2011 Jul 10 2011 RNA
## channel_count
## GSM733843 1
## GSM733844 1
## GSM733845 1
## source_name_ch1
## GSM733843 Peripheral blood mononuclear cells of subjects vaccinated with Influenza vaccine at day 0 (before vaccination), and days 3 and 7 post-vaccination.
## GSM733844 Peripheral blood mononuclear cells of subjects vaccinated with Influenza vaccine at day 0 (before vaccination), and days 3 and 7 post-vaccination.
## GSM733845 Peripheral blood mononuclear cells of subjects vaccinated with Influenza vaccine at day 0 (before vaccination), and days 3 and 7 post-vaccination.
## organism_ch1 characteristics_ch1 characteristics_ch1.1
## GSM733843 Homo sapiens cell type: PBMC subject id: 1
## GSM733844 Homo sapiens cell type: PBMC subject id: 1
## GSM733845 Homo sapiens cell type: PBMC subject id: 1
## characteristics_ch1.2 characteristics_ch1.3
## GSM733843 time point: D0 vaccine: LAIV (FluMist, MedImmune)
## GSM733844 time point: D3 vaccine: LAIV (FluMist, MedImmune)
## GSM733845 time point: D7 vaccine: LAIV (FluMist, MedImmune)
## characteristics_ch1.4
## GSM733843 hai titer (day 0) - a/south dakota/6/2007 (h1n1): 40
## GSM733844 hai titer (day 0) - a/south dakota/6/2007 (h1n1): 40
## GSM733845 hai titer (day 0) - a/south dakota/6/2007 (h1n1): 40
## characteristics_ch1.5
## GSM733843 hai titer (day 28) - a/south dakota/6/2007 (h1n1): 40
## GSM733844 hai titer (day 28) - a/south dakota/6/2007 (h1n1): 40
## GSM733845 hai titer (day 28) - a/south dakota/6/2007 (h1n1): 40
## characteristics_ch1.6
## GSM733843 hai titer (day 0) - a/uruguay/716/2007 nymc x-175c (h3n2): 40
## GSM733844 hai titer (day 0) - a/uruguay/716/2007 nymc x-175c (h3n2): 40
## GSM733845 hai titer (day 0) - a/uruguay/716/2007 nymc x-175c (h3n2): 40
## characteristics_ch1.7
## GSM733843 hai titer (day 28) - a/uruguay/716/2007 nymc x-175c (h3n2): 40
## GSM733844 hai titer (day 28) - a/uruguay/716/2007 nymc x-175c (h3n2): 40
## GSM733845 hai titer (day 28) - a/uruguay/716/2007 nymc x-175c (h3n2): 40
## characteristics_ch1.8
## GSM733843 hai titer (day 0) - b/florida 4/2006: 20
## GSM733844 hai titer (day 0) - b/florida 4/2006: 20
## GSM733845 hai titer (day 0) - b/florida 4/2006: 20
## characteristics_ch1.9 molecule_ch1
## GSM733843 hai titer (day 28) - b/florida 4/2006: 40 total RNA
## GSM733844 hai titer (day 28) - b/florida 4/2006: 40 total RNA
## GSM733845 hai titer (day 28) - b/florida 4/2006: 40 total RNA
## extract_protocol_ch1
## GSM733843 Following PMBC isolation from CPT, 2 x 10^6 cells were lysed in 1 ml of TRIzol and stored at -80C (Cat# 15596-026; Invitrogen Life Technologies). After all time points were collected for a subject, the samples were thawed, and the RNA isolation proceeded according to the manufacturer’s protocol.
## GSM733844 Following PMBC isolation from CPT, 2 x 10^6 cells were lysed in 1 ml of TRIzol and stored at -80C (Cat# 15596-026; Invitrogen Life Technologies). After all time points were collected for a subject, the samples were thawed, and the RNA isolation proceeded according to the manufacturer’s protocol.
## GSM733845 Following PMBC isolation from CPT, 2 x 10^6 cells were lysed in 1 ml of TRIzol and stored at -80C (Cat# 15596-026; Invitrogen Life Technologies). After all time points were collected for a subject, the samples were thawed, and the RNA isolation proceeded according to the manufacturer’s protocol.
## label_ch1
## GSM733843 biotin
## GSM733844 biotin
## GSM733845 biotin
## label_protocol_ch1
## GSM733843 Total RNA sample quality was evaluated by spectrophotometer to determine quantity, protein contamination and organic solvent contamination, and an Agilent 2100 Bioanalyzer was used to check for RNA degradation. Two-round in vitro transcription amplification and labeling was performed starting with 50 ng intact, total RNA per sample, following the Affymetrix protocol.
## GSM733844 Total RNA sample quality was evaluated by spectrophotometer to determine quantity, protein contamination and organic solvent contamination, and an Agilent 2100 Bioanalyzer was used to check for RNA degradation. Two-round in vitro transcription amplification and labeling was performed starting with 50 ng intact, total RNA per sample, following the Affymetrix protocol.
## GSM733845 Total RNA sample quality was evaluated by spectrophotometer to determine quantity, protein contamination and organic solvent contamination, and an Agilent 2100 Bioanalyzer was used to check for RNA degradation. Two-round in vitro transcription amplification and labeling was performed starting with 50 ng intact, total RNA per sample, following the Affymetrix protocol.
## taxid_ch1
## GSM733843 9606
## GSM733844 9606
## GSM733845 9606
## hyb_protocol
## GSM733843 Hybridization was performed on Human U133 Plus 2.0 Arrays (using GeneTitan platform, Affymetrix, or individual cartridges) for 16 h at 45 oC, and 60 r.p.m. in a Hybridization Oven 640 (Affymetrix), slides were washed and stained with a Fluidics Station 450 (Affymetrix).
## GSM733844 Hybridization was performed on Human U133 Plus 2.0 Arrays (using GeneTitan platform, Affymetrix, or individual cartridges) for 16 h at 45 oC, and 60 r.p.m. in a Hybridization Oven 640 (Affymetrix), slides were washed and stained with a Fluidics Station 450 (Affymetrix).
## GSM733845 Hybridization was performed on Human U133 Plus 2.0 Arrays (using GeneTitan platform, Affymetrix, or individual cartridges) for 16 h at 45 oC, and 60 r.p.m. in a Hybridization Oven 640 (Affymetrix), slides were washed and stained with a Fluidics Station 450 (Affymetrix).
## scan_protocol
## GSM733843 Scanning was performed on a 7th generation GeneChip Scanner 3000, and the Affymetrix GCOS software was used to perform image analysis and generate raw intensity data.
## GSM733844 Scanning was performed on a 7th generation GeneChip Scanner 3000, and the Affymetrix GCOS software was used to perform image analysis and generate raw intensity data.
## GSM733845 Scanning was performed on a 7th generation GeneChip Scanner 3000, and the Affymetrix GCOS software was used to perform image analysis and generate raw intensity data.
## description
## GSM733843 2008-LAIV-1-D0
## GSM733844 2008-LAIV-1-D3
## GSM733845 2008-LAIV-1-D7
## data_processing
## GSM733843 RMA normalization was performed using Expression Console software (Affymetrix Inc, version 1.1).
## GSM733844 RMA normalization was performed using Expression Console software (Affymetrix Inc, version 1.1).
## GSM733845 RMA normalization was performed using Expression Console software (Affymetrix Inc, version 1.1).
## platform_id contact_name contact_email contact_phone
## GSM733843 GPL13158 Helder,I,Nakaya hnakaya@emory.edu 1-404-712-2594
## GSM733844 GPL13158 Helder,I,Nakaya hnakaya@emory.edu 1-404-712-2594
## GSM733845 GPL13158 Helder,I,Nakaya hnakaya@emory.edu 1-404-712-2594
## contact_laboratory contact_department contact_institute
## GSM733843 Bali Pulendran Emory Vaccine Center Emory University
## GSM733844 Bali Pulendran Emory Vaccine Center Emory University
## GSM733845 Bali Pulendran Emory Vaccine Center Emory University
## contact_address contact_city contact_state
## GSM733843 954 Gatewood Road, room 2040 Atlanta GA
## GSM733844 954 Gatewood Road, room 2040 Atlanta GA
## GSM733845 954 Gatewood Road, room 2040 Atlanta GA
## contact_zip/postal_code contact_country
## GSM733843 30329 USA
## GSM733844 30329 USA
## GSM733845 30329 USA
## supplementary_file
## GSM733843 ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/samples/GSM733nnn/GSM733843/GSM733843.CEL.gz
## GSM733844 ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/samples/GSM733nnn/GSM733844/GSM733844.CEL.gz
## GSM733845 ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/samples/GSM733nnn/GSM733845/GSM733845.CEL.gz
## supplementary_file.1
## GSM733843 ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/samples/GSM733nnn/GSM733843/GSM733843.chp.gz
## GSM733844 ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/samples/GSM733nnn/GSM733844/GSM733844.chp.gz
## GSM733845 ftp://ftp.ncbi.nlm.nih.gov/pub/geo/DATA/supplementary/samples/GSM733nnn/GSM733845/GSM733845.chp.gz
## data_row_count
## GSM733843 54715
## GSM733844 54715
## GSM733845 54715
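The cohort, subject, and time point are only available as free text in columns such as `title`. As a minimal sketch, the titles printed above could be split into separate fields with a regular expression. The helper `parse_title` and its pattern are hypothetical, and the pattern is inferred only from the three LAIV titles shown here; it is an assumption that the other cohorts follow the same layout.

```r
# Sample titles copied from the pData() output above.
titles <- c("2008 LAIV subject ID 1 at D0 post-vaccination",
            "2008 LAIV subject ID 1 at D3 post-vaccination",
            "2008 LAIV subject ID 1 at D7 post-vaccination")

# Hypothetical helper: split a title into season, vaccine, subject and time point.
# Titles that do not match the assumed layout yield NA in every field.
parse_title <- function(x) {
  pattern <- "^([0-9]{4}) ([A-Za-z]+) subject ID ([0-9]+) at (D[0-9]+).*$"
  ok <- grepl(pattern, x)
  # Extract the i-th capture group, or NA when the title does not match.
  field <- function(i) ifelse(ok, sub(pattern, paste0("\\", i), x), NA_character_)
  data.frame(
    season  = as.integer(field(1)),
    vaccine = field(2),
    subject = as.integer(field(3)),
    time    = field(4),
    stringsAsFactors = FALSE
  )
}

meta <- parse_title(titles)
meta
```

With the objects loaded earlier, the corresponding call would be `parse_title(as.character(pData(es_LAIV)$title))`. Returning NA for unmatched titles makes it easy to spot samples whose title deviates from the assumed layout.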
Without any additional information the various seasons cannot be combined.
combine(gse[[1]], gse[[2]])
## Error in combine(gse[[1]], gse[[2]]): objects have different annotations: GPL13158, GPL3921
The platforms are different across seasons.
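Beyond the annotation mismatch, the feature names printed in the object summaries suggest that the platforms do not share a probe namespace either. The sketch below compares only the three probe IDs that are visible per platform in that output, so it illustrates the check rather than measuring the real overlap. With the full objects, `lapply(gse, featureNames)` would supply the complete ID vectors.

```r
# Probe IDs copied from the printed summaries (first two and last per platform).
probes <- list(
  GPL13158 = c("1007_PM_s_at", "1053_PM_at", "AFFX-TrpnX-M_at"),
  GPL3921  = c("1007_s_at", "1053_at", "AFFX-TrpnX-M_at"),
  GPL570   = c("1007_s_at", "1053_at", "AFFX-TrpnX-M_at")
)

# IDs present on every platform.
common <- Reduce(intersect, probes)
common

# Pairwise counts of shared IDs (rows and columns are platforms).
overlap <- sapply(probes, function(a) {
  sapply(probes, function(b) length(intersect(a, b)))
})
overlap
```

In this small sample, only the control probe is shared by all three platforms, which is why matching on probe IDs alone is not a reliable way to align the cohorts.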
To do something as simple as a differential gene expression analysis, we would need to:
- Parse each ExpressionSet for visit and patient ID
- annotate to map probes to genes
- combine into one expression matrix
Combining this information with another dataset would require additional data cleaning (assuming the IDs are consistent across experiments).
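The probe-to-gene mapping and the merge into a single matrix can be sketched in base R on toy data. Everything here is illustrative: the expression values are made up, the probe-to-gene maps are hypothetical stand-ins for each platform's annotation, and averaging probes that map to the same gene is just one of several possible summarization choices, not necessarily the one used in the paper.

```r
# Toy expression matrices (made-up values) with probe and sample IDs in the
# style of two of the platforms above.
x1 <- matrix(c(7.1, 5.0, 7.3, 5.2), nrow = 2,
             dimnames = list(c("1007_PM_s_at", "1053_PM_at"),
                             c("GSM733843", "GSM733844")))
x2 <- matrix(c(6.9, 4.8, 7.0, 5.1), nrow = 2,
             dimnames = list(c("1007_s_at", "1053_at"),
                             c("GSM734022", "GSM734023")))

# Hypothetical probe-to-gene maps; real ones would come from the platform annotation.
map1 <- c("1007_PM_s_at" = "DDR1", "1053_PM_at" = "RFC2")
map2 <- c("1007_s_at" = "DDR1", "1053_at" = "RFC2")

# Collapse probes to genes by averaging all probes that map to the same gene.
collapse_to_genes <- function(x, map) {
  genes <- map[rownames(x)]
  keep <- !is.na(genes)                       # drop probes without a gene
  sums <- rowsum(x[keep, , drop = FALSE], group = genes[keep])
  counts <- as.vector(table(genes[keep])[rownames(sums)])
  sums / counts                               # per-gene mean across probes
}

g1 <- collapse_to_genes(x1, map1)
g2 <- collapse_to_genes(x2, map2)

# Keep only genes measured on both platforms, then bind samples column-wise.
shared <- intersect(rownames(g1), rownames(g2))
combined <- cbind(g1[shared, , drop = FALSE], g2[shared, , drop = FALSE])
combined
```

Note that binding the columns only aligns the gene identifiers. Values from different platforms are not directly comparable without further normalization or batch handling.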
This study was funded by HIPC. As a result, all of its data has been curated and standardized by ImmPort and is now publicly available on ImmuneSpace. We will use ImmuneSpaceR to retrieve the gene-expression data and associated metadata.