Mass spectrometry data produced from the analysis of biological samples are highly complex. It is not practical to perform a systems-level analysis of the results by manual evaluation. The Patti laboratory develops software solutions to automate data processing and enhance interpretation.
Software
DecoID improves identification rates in metabolomics through database-assisted MS/MS deconvolution
Installation
Install DecoID from the command line
pip install DecoID
Download
Source Code and Documentation
Dose-Response Metabolomics to Understand Biochemical Mechanisms and Off-Target Drug Effects with the TOXcms Software
Installation
Install TOXcms from the command line
R CMD INSTALL toxcms_1.0.4.tar.gz
Installation (R Environment)
install.packages("toxcms_1.0.4.tar.gz", repos = NULL, type = "source")
Download
Example
An example from the inst/development directory is shown here.
library(toxcms)
library(data.table)
# load the example dataset shipped with toxcms
feature <- data.table(read.csv(system.file("extdata", "dataset.csv", package = "toxcms")))
# Step 1 statistical analysis
etom_dosestat <- calcdosestat(Feature = feature, Dose_Levels = c("_0uM", "_10uM", "_50uM", "_200uM"), multicomp = "none",
                              p.adjust.method = "none", projectName = "dataset")
# Step 2.1 monotonic trend filtering
etom_drreport_mono <- trendfilter(etom_dosestat, pval_cutoff = 0.05, pval_thres = 1, anova_cutoff = 0.05, trend = "mono",
                                  relChange_cutoff = 0.05, export = FALSE)
# Step 3.1 ED50 modeling and estimation
etom_drreport_fit <- fitdrc(DoseResponse_report = etom_drreport_mono, Dose_values = c(0, 10, 50, 200), ED = 0.5, export = TRUE,
                            mz_tag = "mzmed", rt_tag = "rtmed", plot = TRUE)
# Step 4.1 clustering
etom_drreport_clust <- clusttrend(etom_drreport_mono, reference_index = NULL, sort.method = c("clust", "layer"), sort.thres = 20, dist.method = "euclidean", hclust.method = "average",
                                  mztag = "mzmed", rttag = "rtmed", heatmap.on = TRUE, plot.all = TRUE, filename = "testdataset_hclust_20clusters.pdf")
# Step 5.1 PCA of all features with color-coded ED50 values
plotpca(DoseResponse_report = etom_drreport_fit, DoseStat = etom_dosestat, EDrange = c(0, 200))
# Step 2.2 inflection trend filtering
etom_drreport_reverse <- trendfilter(etom_dosestat, pval_cutoff = 0.05, pval_thres = 1, anova_cutoff = 0.05, trend = "reverse",
                                     relChange_cutoff = 0.05, export = TRUE)
# Step 3.2 inflection trend plotting
plottrend(etom_drreport_reverse, Dose_conditions = c("0uM", "10uM", "50uM", "200uM"), y_transform = TRUE,
          mz_tag = "mzmed", rt_tag = "rtmed")
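The difference between the monotonic (trend = "mono") and inflection (trend = "reverse") filters can be pictured with a small base-R sketch. It is not the TOXcms implementation, which also applies the p-value, ANOVA, and relative-change cutoffs passed to trendfilter. The helper name classify_trend and the intensity values are hypothetical, and the sketch assumes that an inflection trend means the direction of change reverses exactly once as dose increases.

```r
# Hypothetical helper (not part of toxcms): label a feature by the shape of
# its mean intensity across increasing dose levels.
# Assumes one mean intensity per dose level, ordered from lowest to highest dose.
classify_trend <- function(mean_intensity) {
  steps <- sign(diff(mean_intensity))
  steps <- steps[steps != 0]                          # ignore flat steps
  if (length(steps) == 0) return("flat")
  if (all(steps == steps[1])) return("mono")          # always up or always down
  if (sum(diff(steps) != 0) == 1) return("reverse")   # one change of direction
  "other"
}

# Made-up mean intensities at 0, 10, 50 and 200 uM.
decreasing <- classify_trend(c(100, 80, 55, 20))   # steadily decreasing
inflected  <- classify_trend(c(100, 140, 90, 40))  # rises, then falls
print(c(decreasing = decreasing, inflected = inflected))
```

A real filter would work on replicate-level data and test each step statistically rather than comparing raw means.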
Publication
Yao C-H, Wang L, Stancliffe E, Sindelar M, Cho K, Yin W, Wang Y, Patti GJ
Dose-Response Metabolomics to Understand Biochemical Mechanisms and Off-Target Drug Effects with the TOXcms Software
Analytical Chemistry, in press, 2019
doi:10.1021/acs.analchem.9b03811
mz.unity
The mz.unity algorithm: defining and detecting complex peak relationships in mass spectral data
R Package
mz.unity is available as an R package on GitHub: nathaniel-mahieu/mz.unity
Installation
#install.packages("devtools")
devtools::install_github("nathaniel-mahieu/mz.unity")
Usage
library(mz.unity)
relationships = mz.unity.search(A, B, M, ppm, BM.limits)
Examples
Toy data and more examples can be found in the GitHub repository: nathaniel-mahieu/mz.unity
Publication
Mahieu NG, Spalding JL, Gelman S, Patti GJ
Defining and Detecting Complex Peak Relationships in Mass Spectral Data: The mz.unity Algorithm
Anal. Chem., in press, 2016
doi:10.1021/acs.analchem.6b01702
Warpgroup is an R package for processing chromatography-mass spectrometry data (or general time series data). Warpgroup implements:
• Chromatogram subregion detection
• Consensus integration bound determination
• Accurate missing value integration
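To make the last two items concrete, the base-R sketch below derives a consensus integration window from the samples in which a peak was detected, then integrates the raw signal of a sample with no detected peak over that window. It is a simplified stand-in and not Warpgroup's algorithm. The median-based consensus, the helper names, and the toy chromatogram are all assumptions made for illustration.

```r
# Hypothetical helper: consensus bounds taken as the median start and end
# time (in seconds) of the peaks that were detected.
consensus_bounds <- function(bounds) {
  c(start = median(bounds$start), end = median(bounds$end))
}

# Hypothetical helper: trapezoidal area of an extracted ion chromatogram
# (EIC) between two time points.
integrate_eic <- function(rt, intensity, start, end) {
  inside <- rt >= start & rt <= end
  x <- rt[inside]
  y <- intensity[inside]
  if (length(x) < 2) return(0)
  sum(diff(x) * (head(y, -1) + tail(y, -1)) / 2)
}

# Bounds detected in three samples; a fourth sample had no detected peak.
detected <- data.frame(start = c(58, 60, 61), end = c(70, 71, 69))
window <- consensus_bounds(detected)

# Toy EIC for the fourth sample: a triangular peak centred at 65 s.
rt <- seq(50, 80, by = 1)
intensity <- pmax(0, 100 - 20 * abs(rt - 65))
area <- integrate_eic(rt, intensity, window[["start"]], window[["end"]])
print(window)
print(area)
```

Integrating the raw signal over shared bounds gives the undetected sample a measured value rather than a zero or an imputed one.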
R Package
Warpgroup is available as an R package on GitHub: nathaniel-mahieu/warpgroup.
Installation
#install.packages("devtools")
devtools::install_github("nathaniel-mahieu/warpgroup")
Usage
library(warpgroup)
warpgroup.bounds = warpgroup(peak.bounds, eic.matrix, sc.aligned.lim = 8)
Examples
Toy data and more examples can be found in the GitHub repository: nathaniel-mahieu/warpgroup
Publication
Mahieu NG, Spalding JL, Patti GJ
Warpgroup: Increased Precision of Metabolomic Data Processing by Consensus Integration Bound Analysis
Bioinformatics, 32(2), 268-275, 2016
doi:10.1093/bioinformatics/btv564
Supporting Data
Data supporting the publication are available for download. They comprise 22 datasets: 11 HILIC and 11 reversed-phase.
Download
Publication
Nikolskiy I, Siuzdak G, Patti GJ
Discriminating Precursors of Common Fragments for Large-Scale Metabolite Profiling by Triple Quadrupole Mass Spectrometry
Bioinformatics, 31(12), 2017-2023, 2015
doi:10.1093/bioinformatics/btv085
IsoMETLIN Metabolite Searching
The isoMETLIN Metabolite Database is a database designed for isotope-based metabolomics. Specifically, isoMETLIN facilitates the identification of metabolites incorporating isotopic labels. isoMETLIN enables users to search all computed isotopologues (>1 million) derived from METLIN on the basis of mass-to-charge values and specified isotopes of interest, such as 13C or 15N.
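As a rough illustration of what an isotopologue search by mass-to-charge value involves, the R sketch below enumerates the 13C isotopologues of a single ion and matches an observed m/z against them within a ppm tolerance. It is not isoMETLIN's implementation. The helper names, the 5 ppm default tolerance, and the use of protonated glucose (m/z of about 181.0707) as input are assumptions chosen for illustration.

```r
# Mass difference between 13C and 12C, in Da.
C13_C12_DELTA <- 1.0033548

# Hypothetical helper: m/z of every 13C isotopologue (M+0 ... M+n) of an ion.
isotopologue_mz <- function(mono_mz, n_carbons, charge = 1) {
  n_13C <- 0:n_carbons
  data.frame(n_13C = n_13C,
             mz = mono_mz + n_13C * C13_C12_DELTA / abs(charge))
}

# Hypothetical helper: candidates within ppm_tol of an observed m/z.
match_isotopologues <- function(observed_mz, candidates, ppm_tol = 5) {
  ppm_error <- (observed_mz - candidates$mz) / candidates$mz * 1e6
  keep <- abs(ppm_error) <= ppm_tol
  cbind(candidates[keep, , drop = FALSE], ppm_error = ppm_error[keep])
}

# Illustrative input: protonated glucose, C6H12O6 [M+H]+, six carbons.
glucose <- isotopologue_mz(mono_mz = 181.0707, n_carbons = 6)
hits <- match_isotopologues(observed_mz = 184.0807, candidates = glucose)
print(hits)
```

The same enumeration extends to 15N or other labels by swapping in the corresponding mass difference and atom count.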
Tandem Mass Spectrometry
IsoMETLIN contains experimental MS/MS data on isotopomers. These data assist in localizing the position of isotopic labels within a metabolite. From these experimental MS/MS isotopomer spectra, precursor atoms are mapped to fragments. The MS/MS spectra of additional isotopomers have then been computationally generated and included within isoMETLIN.
Metabolites
Created in 2003, METLIN now includes over a million molecules, including lipids, steroids, plant and bacterial metabolites, small peptides, carbohydrates, exogenous drugs/metabolites, central carbon metabolites, and toxicants. The metabolites and other chemical entities have been individually analyzed to provide experimental MS/MS data.
Technology
METLIN fragmentation. METLIN not only provides MS/MS data at multiple collision energies in both positive and negative ionization modes but also uses the known structure of the metabolite, its elemental composition, and the accurate mass measurements of the fragments to predict fragment structures.
METLIN Links. METLIN provides links and information for every one of its 960,000 compounds. These include name, systematic name, structure, elemental formula, mass, CAS number, KEGG ID and link, HMDB ID and link, PubChem ID and link, commercial availability and direct search options on the molecule itself.
R Package
Feature credentialing can be downloaded as an R package from our GitHub repository: pattilab/credential.
Credentialing depends on several R packages. XCMS and CAMERA are available on Bioconductor.
Installation
install.packages("devtools")
library(devtools)
install_github("pattilab/credential")
Example
An example from the inst/ directory is shown here.
library(credential)
credentialed_features = credential(
  xs_a = xcms_set_a,
  r_12t13_a = 3/4,
  an_a = xs_annotate_a,
  xs_b = xcms_set_b,
  r_12t13_b = 4/3,
  an_b = xs_annotate_b,
  isotope_rt_delta_s = 5,
  ppm_for_isotopes = 5,
  mixed_ratio_factor = 4,
  mixed_ratio_ratio_factor = 1.8
)
The output will include:
• credential_summary.txt: a count of features during each step of the credentialing process.
• credentialed_features.csv: the final credentialed features.
Comments
We are working to improve the process. If you have suggestions or would like us to implement a feature, please contact us.
Publication
Mahieu NG, Huang X, Chen Y-J, Patti GJ
Credentialed Features: A Platform to Benchmark and Optimize Untargeted Metabolomic Methods
Anal. Chem., 86(19), 9583-9589, 2014
doi:10.1021/ac503092d
Installation
Install X13CMS from the command line
R CMD INSTALL X13CMS_1.4.tar.gz
Download
Sample Data & Scripts
Updates
Changes to v1.4:
– getIsoLabelReport()
1) Default intChoice set to “intb”; was previously “maxo”.
2) Default behavior for enforcing expected monotonic decrease in isotopologue intensity from M0 to Mn in unlabeled samples is to NOT enforce.
– plotLabelReport()
1) Error bars (std dev) are displayed when plotting relative intensity distributions for enriched isotopologue groups.
2) The p-value from a t-test for the difference between total pool size of an isotopologue group in unlabeled vs. labeled samples is displayed.
3) Additional arguments required compared to previous version.
– plotIsoDiffReport()
1) Formerly called plotDiffReport(); name changed to avoid confusion with XCMS diffReport()
2) Error bars (std dev) are displayed when plotting relative intensity distributions for enriched isotopologue groups. Groups not enriched in one particular sample class but enriched in the other are plotted only for the enriched class.
3) Additional arguments required compared to previous version.
– plotTotalIsoPools()
1) The p-value from a t-test for the difference between total pool size of an isotopologue group in labeled samples of one sample class vs. the other is displayed.
2) If an isotopologue group was not significantly enriched in the labeled samples of one sample class, the peak intensities of those isotopologues are plotted anyway, but are indicated as being not significantly enriched.
3) Additional arguments required compared to previous version.
– miniDiffReport()
1) Produces an XCMS diffReport-like table comparing individual features in the unlabeled samples of both sample classes, without the need to re-run xcmsSet() with the sample classes designated as the two biological conditions rather than the labeling type as required for X13CMS
– filterIsoLabelReport()
1) No longer supported
– filterIsoDiffReport()
1) Removed is13C filter
Changes to v1.3:
– getIsoLabelReport():
1) Minor changes to retention time grouping for identifying potential isotopologues
2) Option created for user to enforce expected ion intensity pattern in unlabeled samples (i.e. monotonic decrease from M0 to higher isotopologues)
3) Improved isotopologue grouping for handling redundant peaks returned by XCMS
4) Output report now includes more statistics on both isotopologue peak intensities (standard deviations) and total pool size for each isotopologue group (coefficients of variation for total ion intensity)
– getIsoDiffReport(): Minor changes to output report formatting
– New functions: plotLabelReport(), plotDiffReport(), and plotTotalIsoPools() generate PDF files containing plots of all isotopologue groups identified in either a label report or an isoDiff report.
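The statistics named in these release notes can be illustrated with base R. The sketch below checks for a monotonic decrease from M0 to Mn, computes relative isotopologue intensities with their standard deviations, and runs a t-test on total pool size. It does not call X13CMS. The intensity matrices and helper names are invented for illustration, and the package may compute these quantities differently.

```r
# Made-up intensities: rows are isotopologues M0..M2, columns are replicates.
unlabeled <- matrix(c(900, 80, 20,
                      950, 90, 25,
                      920, 85, 22),
                    nrow = 3, dimnames = list(c("M0", "M1", "M2"), NULL))
labeled <- matrix(c(500, 300, 400,
                    520, 310, 420,
                    480, 290, 390),
                  nrow = 3, dimnames = list(c("M0", "M1", "M2"), NULL))

# Hypothetical helper: TRUE if mean intensity strictly decreases from M0 to Mn.
is_monotonic_decrease <- function(x) all(diff(rowMeans(x)) < 0)

# Hypothetical helper: per-isotopologue mean and std dev of relative intensity.
relative_distribution <- function(x) {
  rel <- sweep(x, 2, colSums(x), "/")   # each replicate now sums to 1
  data.frame(mean = rowMeans(rel), sd = apply(rel, 1, sd))
}

print(is_monotonic_decrease(unlabeled))
print(relative_distribution(labeled))

# Welch t-test on total pool size (sum over isotopologues) per replicate.
pool_test <- t.test(colSums(unlabeled), colSums(labeled))
print(pool_test$p.value)
```

The standard deviations returned by relative_distribution are the kind of values that would be drawn as error bars on a relative intensity plot.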
Publication
Huang X, Chen Y, Cho K, Nikolskiy I, Crawford PA, Patti GJ
X13CMS: Global Tracking of Isotopic Labels in Untargeted Metabolomics
Anal. Chem., 86(3), 1632-1639, 2014
doi:10.1021/ac403384n
Installation
Install decoMS2 from the command line
R CMD INSTALL decoMS2_0.1.tar.gz
Download
Publication
Nikolskiy I, Mahieu NG, Chen Y, Tautenhahn R, Patti GJ
An Untargeted Metabolomic Workflow to Improve Structural Characterization of Metabolites
Anal. Chem., 85(16), 7713-7719, 2013
doi:10.1021/ac400751j
Address
Washington University
St. Louis, MO 63130