-----------------------------------------------------------------------
BIOINFORMATICS COLLOQUIUM
School of Computational Sciences
George Mason University
-----------------------------------------------------------------------

Data-Driven and Peak-Based Feature Selection in Serum Protein Mass Spectrometry

Walter S. Liggett
National Institute of Standards and Technology

Tuesday, September 7, 2004
4:30 pm
Verizon Auditorium, Prince William Campus

Consider functional canonical correlation analysis (CCA) applied to disjoint sections of lengthy protein mass spectra for the purpose of finding long-distance correlation structure. The relations between the CCA weight functions, which are derived from the data, and spectral peaks, which can be traced to individual proteins, provide a basis for interpreting the structure. The data analyzed consist of repeated measurements of a human serum standard by surface-enhanced laser desorption/ionization (SELDI) time-of-flight (TOF) mass spectrometry. There are 88 spectra obtained from 11 protein chips each with 8 spots. The data-analysis goal is insight into the sample preparation step in such spectrometry, a step that involves the protein chip. We see that variation in this step has an outsized effect on a few proteins. We obtain this insight through interpretation of the long-distance correlation structure and through comparison of spectral variation from chip to chip with variation from spot to spot on single chips.

----------------------------------------------------------------------
Refreshments are served at 4:00 pm.
Find the schedule and directions at http://www.binf.gmu.edu/colloq.html