Supplementary Data
- Lung Adenocarcinoma Probe Dataset (LAPD) Files
- AffyMAPSDetector input and output files
- Supporting data for Figures and Tables
- Data analysis confirming behavior of SNP-containing probes (intensity distribution figures/supporting-data)
- Data analysis confirming behavior of probes without SNPs (intensity distribution figures/supporting-data)
- Examples: SNPs affecting binding efficiencies of PM and MM probes
- Additional Data (from human, mouse, and rat gene chips)
- Additional Data Files (also uploaded at BMC Bioinformatics)
- LAPD - Raw PM/MM data for probes without SNPs (>2GB).
- LAPD - Raw PM/MM data for probes with SNP at the 13th position (~3MB).
- LAPD - Raw PM/MM data for probes with SNP NOT at the 13th position (~80MB).
AffyMAPSDetector requires two ASCII text files as input: the "NetAffx
Annotation File" and the "Sequence File". Both are available for download
from the Affymetrix support page under "NetAffx Annotation File" and
"Sequence Files", respectively; note that Affymetrix requires registration
before the annotation files can be downloaded. Here we refer to the "NetAffx
Annotation File" as the gene information file (GIF) and to the "Sequence File"
as the probe-set information file (PIF). The GIF and PIF from the HG-U95Av2
GeneChip™ that were used to characterize SNPs are provided below for easy
access; however, the Affymetrix NetAffx™ Analysis Center is recommended for
obtaining an up-to-date version of these files.
- Tab-delimited gene information file in ASCII text.
- Tab-delimited probe information file in ASCII text. (Download PIF in FASTA format.)
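Because both input files are plain tab-delimited ASCII text, they can be inspected with any standard delimited-text reader. The following is a minimal sketch, not part of AffyMAPSDetector; the column names ("Probe Set ID", "LocusLink") and the two sample rows are hypothetical placeholders, and the real header names should be taken from the downloaded file.

```python
import csv
import io


def read_tab_delimited(handle):
    """Return one dict per data row, keyed by the names in the header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    return list(reader)


# Hypothetical two-row excerpt standing in for a downloaded GIF file.
# Column names and values are illustrative assumptions only.
sample = io.StringIO(
    "Probe Set ID\tLocusLink\n"
    "probeset_A\t1234\n"
    "probeset_B\t---\n"
)

rows = read_tab_delimited(sample)
for row in rows:
    print(row["Probe Set ID"], row["LocusLink"])
```

To read a real file, the `io.StringIO` object would be replaced by `open(path, newline="")`, with `path` pointing at the downloaded file.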
The output files generated by AffyMAPSDetector (based on dbSNP build 123) are:
- HG-U95Av2_Probes_With_SNPs.xls:
The HG-U95Av2 probes that were found to have documented SNPs.
- HG-U95Av2_Genes_Without_Locus_Link.xls:
List of those genes for which either LocusLink information was not provided
in the gene-information file or AffyMAPSDetector could not parse LocusLink
as a positive integer.
- HG-U95Av2_Probes_Without_Snps.xls:
List of genes and probe-sets for which no documented SNPs were found.
- HG-U95Av2_Genes_Info_From_Web.xls:
This file contains the gene description and the mRNA sequences of genes that were collected by
AffyMAPSDetector from the NCBI nucleotide database.
- HG-U95Av2_Snps_Info_From_Web.xls:
This file contains additional SNP information including: "Nucleotide Accession Number of Gene", "SNP
position with respect to mRNA sequence", "Genomic Axis Orientation", "dbSNP Reference Cluster ID - rs#",
"Protein Accession Number", "Function", "SNP Class", "Heterozygosity", and "Allele".
- Log File: This file contains the output log messages from the AffyMAPSDetector run.
Note that all the above files, except the log file, are tab-delimited
ASCII text files. As a convenience, you can also download all the output
files zipped together.
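The Genes_Without_Locus_Link file collects genes whose LocusLink entry is missing or cannot be parsed as a positive integer. The check below is a minimal sketch of what such a test could look like; the function name `parse_locus_link` is hypothetical, and this is not the code used by AffyMAPSDetector.

```python
def parse_locus_link(raw):
    """Return the LocusLink ID as a positive int, or None if it is missing or invalid.

    Assumption: a valid ID consists only of ASCII digits and is greater than zero.
    """
    text = (raw or "").strip()
    if text.isascii() and text.isdigit():
        value = int(text)
        if value > 0:
            return value
    return None


# Illustrative inputs only; "---" stands for a hypothetical missing-value marker.
for candidate in ["1234", "---", "", "0", "-3"]:
    print(repr(candidate), parse_locus_link(candidate))
```

Under this sketch, any gene for which the function returns `None` would be listed in the Genes_Without_Locus_Link file.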
The output files generated by AffyMAPSDetector (based on dbSNP build 126) are:
- HG-U95Av2_Probes_With_SNPs.xls:
The HG-U95Av2 probes that were found to have documented SNPs.
- HG-U95Av2_Genes_Without_Locus_Link.xls:
List of those genes for which either LocusLink information was not provided
in the gene-information file or AffyMAPSDetector could not parse LocusLink
as a positive integer.
- HG-U95Av2_Probes_Without_Snps.xls:
List of genes and probe-sets for which no documented SNPs were found.
- HG-U95Av2_Genes_Info_From_Web.xls:
This file contains the gene description and the mRNA sequences of genes that were collected by
AffyMAPSDetector from the NCBI nucleotide database.
- HG-U95Av2_Snps_Info_From_Web.xls:
This file contains additional SNP information including: "Nucleotide Accession Number of Gene", "SNP
position with respect to mRNA sequence", "Genomic Axis Orientation", "dbSNP Reference Cluster ID - rs#",
"Protein Accession Number", "Function", "SNP Class", "Heterozygosity", and "Allele".
- Log File: This file contains the output log messages from the AffyMAPSDetector run.
Note that all the above files, except the log file, are tab-delimited
ASCII text files. As a convenience, you can also download all the output
files zipped together.
Comparison of the AffyMAPSDetector output for HG-U95Av2 between dbSNP builds 123 and 126:
The GeneChip™ HG-U95Av2 contains 199,084 probes belonging to 12,625 probe-sets (or 11,302 unique genes).

| | dbSNP Build 123 | dbSNP Build 126 |
|---|---|---|
| Number of probes that contain documented SNPs | 7,286 probes from 2,582 probe-sets (or 2,479 unique genes) | 8,758 probes from 3,002 probe-sets (or 2,858 unique genes) |
| Number of SNP-containing probes involving the 13th position | 325 probes | 409 probes |
| Number of SNP-containing probes NOT involving the 13th position | 6,961 probes | 8,349 probes |
| Number of SNP-containing probes involving the 13th position only | 251 probes | 332 probes |
| Number of SNP-containing probes involving the 13th position and at least one more position | 74 probes | 77 probes |
| Number of probe-sets (or unique genes) without documented SNPs | 8,474 probe-sets (or 7,662 unique genes) | 9,450 probe-sets (or 8,533 unique genes) |
| Number of probes NOT mapped to their respective reference mRNA sequence | 15,269 probes belonging to 2,304 probe-sets (or 2,249 unique genes) | 16,753 probes belonging to 2,168 probe-sets (or 2,017 unique genes) |
Note: At the time the draft of this publication was submitted, only dbSNP build 123
was available; the analysis of the data presented below is therefore based on dbSNP build 123.
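The probe counts in the comparison above partition cleanly: probes with a SNP at the 13th position plus probes without one give the total number of SNP-containing probes, and the "13th position only" and "13th position and at least one more position" counts add up to the 13th-position count. The short check below is only a sketch for verifying this arithmetic; the dictionary keys are hypothetical labels, and the numbers are those reported above.

```python
# Probe counts as reported in the comparison for each dbSNP build.
# The key names are illustrative labels, not AffyMAPSDetector field names.
counts = {
    "build_123": {"total": 7286, "at_13": 325, "not_13": 6961,
                  "only_13": 251, "13_plus_other": 74},
    "build_126": {"total": 8758, "at_13": 409, "not_13": 8349,
                  "only_13": 332, "13_plus_other": 77},
}


def is_consistent(c):
    """Check that the sub-counts add up to their parent counts."""
    return (c["at_13"] + c["not_13"] == c["total"]
            and c["only_13"] + c["13_plus_other"] == c["at_13"])


for build, c in counts.items():
    print(build, is_consistent(c))
```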
- Download HG-U95Av2 probes having one or more SNPs (post-processed and compiled; Microsoft Excel format).
- Download probes with a SNP only at the 13th position.
- Download probes with a SNP not at the 13th position.
- Two sets of 325 probes each are used for intensity distribution
profiling of probes with and without SNPs (to compare and contrast the behavior
of SNP-containing probes with the normal behavior of probes without
SNPs). Both datasets below are subsets of the Lung Adenocarcinoma Dataset.
To compute the (PM - MM) intensity differences, the PM and MM intensity values
for each of the 325 probes were extracted from 190 lung adenocarcinoma CEL files
and used for intensity distribution profiling. Here is the raw (PM - MM) probe
intensity data of the 325 probes:
  - with SNPs only at the 13th position.
  - without SNPs (randomly selected).
- The raw (PM - MM) intensity data of the 325 probes, transformed into a (325 x 190) matrix, for probes:
  - with SNPs only at the 13th position.
  - without SNPs.
- Download the complete Table 1 dataset, containing all 251 probes having a SNP only at the 13th position.
- Download the complete Table 2 dataset, containing all 75 probes having a SNP at the 13th position and some other position.
Click here to see a number of examples in which the presence of one or more
SNPs in a probe affects the binding efficiencies of the related Perfect Match
(PM) and Mismatch (MM) probes.
In addition to HG-U95Av2, below are the AffyMAPSDetector results (using
dbSNP build 123) for other expression chips from the human, mouse, and rat genomes.
Human Expression Array GeneChip™ HG-U133
Mouse Expression Array GeneChip™ MG-430A2
Rat Expression Array GeneChip™ Rat-230
- file 1 - Complete SNP output file.
- file 2 - Probes having SNP at mismatch location.
- file 3 - Probe-sets without SNPs.
- file 4 - Genes with undefined LocusLink.
- file 5 - HG-U95Av2 genes mRNA sequence.
- file 6 - Additional SNP information for Probes having SNPs.
- file 7 - AffyMAPSDetector execution log.
- file 8 - Behavior of SNP-containing probes with respect to PM and MM binding efficiencies.
- file 9 - Behavior of SNP-containing probes with respect to PM and MM binding efficiencies.
- file 10 - Example of probes affecting probe-set detection calls.
- file 11 - SNP-containing probes’ PM/MM ratio data file for expression genotype.
- file 12 - AffyMAPSDetector v1 distribution package (compiled code).
- file 13 - AffyMAPSDetector v1 source code.
- file 14 - Intensity distribution profiles' confirmation.