ADAP Aligner
This alignment algorithm was developed as part of ADAP-GC v1.0, the Automatic Data Analysis Pipeline for processing
GC-MS metabolomics data.
For details, see Jiang, W.; Qiu, Y.; Ni, Y.; Su, M.; Jia, W.; Du, X. An automated data analysis pipeline for
GC-TOF-MS metabonomics studies. Journal of Proteome Research 2010, 9 (11), 5974-81.
Requirements
ADAP Aligner requires mass spectra to be constructed prior to the alignment (e.g. using Spectral
Deconvolution or CAMERA). A typical workflow that uses this alignment is as follows:
- Raw data methods / Raw data import imports raw data files
- Raw data methods / Peak detection / Mass detection detects masses in the raw data
- Raw data methods / Peak detection / ADAP Chromatogram builder builds extracted-ion chromatograms
- Peak list methods / Peak detection / Chromatogram deconvolution detects peaks (features) in each
chromatogram
- Peak list methods / Spectral deconvolution / Multivariate Curve Resolution combines the detected
peaks (features) into analytes and builds pure fragmentation mass spectra for each analyte
- Peak list methods / Alignment / ADAP Aligner (GC) aligns the analytes produced by the previous step
- Peak list methods / Export/Import / Export to MSP file exports fragmentation mass spectra into
MSP format
Description
ADAP Aligner aligns features based on their mass spectra and retention time similarity.
Because it relies on mass spectra, this approach differs significantly from Join Aligner, which aligns peaks
across all samples using their m/z and retention time similarity. Instead, ADAP Aligner
uses mass spectra and retention time to detect similar features in each sample and align them together.
In fact, this algorithm is similar to Hierarchical Aligner (GC), but it uses a different
clustering method.
Similarity between two features f1 and f2 is calculated by the following score:
S(f1, f2) = w Stime(f1, f2) + (1 - w) Sspec(f1, f2)
where Stime(f1, f2) is the similarity based on the relative retention time difference between the two features,
Sspec(f1, f2) is the spectrum similarity between the two features, and w is the Score weight parameter.
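As an illustration, the score can be sketched in Python. The text does not give the exact formulas for Stime and
Sspec, so the forms below are assumptions: Stime falls linearly from 1 to 0 over the retention time range, and Sspec
is a cosine similarity over peaks matched within the m/z tolerance. All function names and the feature layout
(a dict with "rt" and "spectrum" keys) are hypothetical and are not part of the software.

```python
import math


def time_similarity(rt1, rt2, rt_range):
    """Assumed form: 1 for identical retention times, 0 at or beyond rt_range."""
    return max(0.0, 1.0 - abs(rt1 - rt2) / rt_range)


def spectrum_similarity(spec1, spec2, mz_tolerance):
    """Assumed form: cosine similarity over peaks matched within mz_tolerance.

    Each spectrum is a list of (mz, intensity) pairs.
    """
    dot = 0.0
    used = set()  # indices of spec2 peaks that are already matched
    for mz1, int1 in spec1:
        best = None  # (m/z distance, index in spec2, intensity)
        for j, (mz2, int2) in enumerate(spec2):
            if j in used:
                continue
            dist = abs(mz1 - mz2)
            if dist <= mz_tolerance and (best is None or dist < best[0]):
                best = (dist, j, int2)
        if best is not None:
            used.add(best[1])
            dot += int1 * best[2]
    norm1 = math.sqrt(sum(i * i for _, i in spec1))
    norm2 = math.sqrt(sum(i * i for _, i in spec2))
    if norm1 == 0.0 or norm2 == 0.0:
        return 0.0
    return dot / (norm1 * norm2)


def similarity_score(f1, f2, w, rt_range, mz_tolerance):
    """S(f1, f2) = w * Stime(f1, f2) + (1 - w) * Sspec(f1, f2)."""
    s_time = time_similarity(f1["rt"], f2["rt"], rt_range)
    s_spec = spectrum_similarity(f1["spectrum"], f2["spectrum"], mz_tolerance)
    return w * s_time + (1.0 - w) * s_spec
```

With a small weight w, the spectrum term dominates the score, so two features with near-identical spectra still score
highly even when their retention times differ by a sizable part of the allowed range.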
Parameters
- Min confidence (number between 0 and 1) is the minimum fraction of all samples in which an aligned feature
must be detected. The default value is 0.7, so an aligned feature must be observed in at least 70% of
all samples.
- Retention time range (minutes) is the maximum allowed retention time difference between aligned features
from different samples.
- M/z tolerance is the maximum m/z difference at which two peaks from different mass spectra are still
considered equal.
- Score threshold (number between 0 and 1) is the minimum value of the similarity function
S(f1, f2) required for features to be aligned together. The default value is 0.75.
- Score weight (number between 0 and 1) is the weight w that is used in the similarity function
S(f1, f2). The default value is 0.1.
- Retention time similarity selects the method used for calculating the retention time similarity.
The retention time difference (fast) is the preferred method.
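The way these parameters interact can be sketched in Python. This is a minimal illustration, not the module's code:
the class and function names are hypothetical, the three defaults are the ones stated above, and rounding the
sample count up is an assumption about how "at least 70% of all samples" is applied. No defaults are given in the
text for the retention time range and the m/z tolerance, so they are left as required fields.

```python
import math
from dataclasses import dataclass


@dataclass
class AlignerParameters:
    """Hypothetical container; field names are illustrative only."""
    rt_range_min: float            # Retention time range, in minutes
    mz_tolerance: float            # M/z tolerance
    min_confidence: float = 0.7    # Min confidence (default from the text)
    score_threshold: float = 0.75  # Score threshold (default from the text)
    score_weight: float = 0.1      # Score weight w (default from the text)


def min_samples_required(params, n_samples):
    """Smallest number of samples in which an aligned feature must be detected.

    Assumption: the fraction is rounded up to a whole sample count. The small
    epsilon guards against floating-point noise in the product.
    """
    return math.ceil(params.min_confidence * n_samples - 1e-9)


def can_align(params, rt1, rt2, score):
    """Two features are candidates for alignment only if their retention times
    are within the allowed range and their score S(f1, f2) reaches the threshold."""
    within_range = abs(rt1 - rt2) <= params.rt_range_min
    return within_range and score >= params.score_threshold
```

For example, `min_samples_required(AlignerParameters(rt_range_min=0.1, mz_tolerance=0.01), 10)` evaluates the
default Min confidence against ten samples; the two required values here are placeholders chosen for the example.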