Altered metabolism of mothers of young children with Autism Spectrum Disorder: a case control study

Background Previous research studies have demonstrated abnormalities in the metabolism of mothers of young children with autism. Methods Metabolic analysis was performed on blood samples from 30 mothers of young children with Autism Spectrum Disorder (ASD-M) and from 29 mothers of young typically-developing children (TD-M). Targeted metabolic analysis focusing on the folate one-carbon metabolism (FOCM) and the transsulfuration pathway (TS) as well as broad metabolic analysis were performed. Statistical analysis of the data involved both univariate and multivariate statistical methods. Results Univariate analysis revealed significant differences in 5 metabolites from the folate one-carbon metabolism and the transsulfuration pathway and differences in an additional 48 metabolites identified by broad metabolic analysis, including lower levels of many carnitine-conjugated molecules. Multivariate analysis with leave-one-out cross-validation allowed classification of samples as belonging to one of the two groups of mothers with 93% sensitivity and 97% specificity with five metabolites. Furthermore, each of these five metabolites correlated with 8–15 other metabolites indicating that there are five clusters of correlated metabolites. In fact, all but 5 of the 50 metabolites with the highest area under the receiver operating characteristic curve were associated with the five identified groups. Many of the abnormalities appear linked to low levels of folate, vitamin B12, and carnitine-conjugated molecules. Conclusions Mothers of children with ASD have many significantly different metabolite levels compared to mothers of typically developing children at 2–5 years after birth.


Background
Autism spectrum disorder (ASD) involves a combination of abnormal social communication, stereotyped behaviors, and restricted interests [1]. ASD is assumed to be caused by complex interactions between genetic and environmental factors, both of which can affect metabolism. Previous studies have revealed significant abnormalities in the folate one-carbon metabolism and the transsulfuration pathways of children with ASD [2][3][4][5] and their mothers [6][7][8], resulting in decreased methylation capability, decreased glutathione levels, and increased oxidative stress. Furthermore, the presence of mutations in the MTHFR gene (A1298C and C677T) was found to be associated with increased risk of ASD [9]. The MTHFR gene encodes an enzyme, methylenetetrahydrofolate reductase, which converts 5,10-methylenetetrahydrofolate, a form of folate, to 5-methyltetrahydrofolate, a different form of folate. This latter form of folate is crucial in the conversion of homocysteine to methionine [10]. Additionally, levels of prenatal vitamins taken during pregnancy that include B12 and folate are associated with a decreased ASD risk [11], suggesting an association of metabolite levels of the folate one-carbon metabolism (FOCM) and the transsulfuration (TS) pathways with ASD. Studies have also found that too much folic acid supplementation can potentially lead to an increased risk of ASD [11,12]. One of these studies suggests that folic acid and B12 supplements are associated with ASD risk by a U-shaped curve, showing that too little and too much folic acid or B12 supplementation can both lead to an increased risk of ASD [11]. Other studies found that maternal gene variants in the one-carbon metabolism pathway were associated with increased ASD risk when there was no or only low periconceptional prenatal vitamin intake [13,14].
Additional metabolic differences may also be present in mothers of children with ASD, but there has been relatively little investigation of their metabolic state. A more comprehensive understanding of metabolites and metabolic pathways of mothers of children with ASD may lead to a better understanding of the etiology of ASD and provide some insights for evaluating preconception risk and/or risk during pregnancy. For example, the general risk of having a child with ASD in the US is currently approximately 1.9% [15]; however, the recurrence risk increases to approximately 19% if the mother already has a child diagnosed with ASD [16].
This paper focuses on analyzing the metabolic profile of mothers of young children with ASD and mothers of typically developing children, 2-5 years after birth. Measurements were conducted with whole blood to provide information on both intra-cellular and extra-cellular metabolism. This study was limited to women who were not taking folate, B12, or multi-vitamin/mineral supplements during the 2 months prior to sample collection, in order to minimize the effect of supplements on metabolism. The study includes assessments of many different aspects of metabolism, including analysis of amino acids, peptides, carbohydrates, lipids, nucleotides, Krebs cycle, vitamins/co-factors, and xenobiotics. This work is part of a larger study, the ASU-Mayo Pilot Study of Young Children with ASD and their Mothers (AMPSYCAM). Although it would be ideal to have biological samples obtained during conception, pregnancy, lactation, and infancy, this would represent a significant hurdle for study design. Instead, this pilot study focuses on 2-5 years after birth to provide preliminary insight into metabolic differences that currently exist. Results from this study provide the motivation for larger future studies to validate the findings and potentially to expand the time horizon to include the time during conception/pregnancy/lactation.

Study design and sample collection and analysis
IRB approval and consent
This study was approved by the IRB of Mayo Clinic-Arizona and the IRB of Arizona State University. All parents signed informed consent forms after the study was explained to them.

Advertising
The study advertisement was emailed to several thousand ASD families on the email lists of the ASU Autism/Asperger's Research Program and the Zoowalk for Autism Research. Other local autism groups such as the Autism Society of Greater Phoenix also helped advertise the study. Finally, participants were invited to share the study advertisement with their network of friends.

Participants
The inclusion criteria were:
1) Mother of a child 2-5 years of age
2) Child has ASD or is typically developing (TD), including both neurological and physical development
3) Child ASD diagnosis verified by the Autism Diagnostic Interview-Revised (ADI-R) [17]
The exclusion criteria were:
1) Mother currently taking a vitamin/mineral supplement containing folic acid and/or vitamin B12
2) Mother currently taking or had taken any vitamin supplements within the past 2 months
3) Mother pregnant or planning to become pregnant in the next 6 months
The recruitment period ran from August 2016 until July 2017. Thirty mothers who have a child with ASD (ASD-M) and twenty-nine mothers who have TD children (TD-M) were recruited for this study. Originally, there were three additional ASD-M participants. However, two of the mothers were disqualified because their child did not meet the ADI-R criteria and one was disqualified because the child did not have an official ASD diagnosis by a psychiatrist, licensed psychologist, or developmental pediatrician. The mothers were age-matched by group. Enrollment was done on a rolling basis, and the control group was recruited so that the two groups had a similar average age; all participants came from the greater Phoenix, Arizona area. All mothers in the ASD-M group had a child previously diagnosed with ASD and the diagnoses were confirmed using the ADI-R. The ADI-R is a 2-h structured parent interview and is one of the primary tools used for clinical and research diagnosis of ASD [17]. All the ADI-R interviews were conducted by Elena L. Pollard, who is a certified rater on the ADI-R and has conducted over 300 ADI-R evaluations.

Diet and medical history
An estimate of dietary intake during the previous week was obtained using the Block Brief 2000 Food Frequency Questionnaire (Adult version), from Nutrition Quest (www.nutritionquest.com). Medical histories and current medical symptoms were collected from the mothers using a self-survey. The symptoms were collected because there is little research on the health of the mothers of children with ASD. In these surveys, pesticide exposure was defined as any pesticides used in their home during pregnancy. Furthermore, prenatal supplement usage was recorded as whether any prenatal supplements were used; however, the specific type of prenatal supplement was not recorded. These variables are included in order to address potential cofactors in the metabolic analysis.

Biological sample collection
Urine collections and blood draws were conducted over a 12-month period from September 2016 to August 2017 for both groups. In both groups, most participants had their blood drawn in fall/winter, with a few in spring or summer. Fasting whole blood samples were collected in the morning at the Mayo Clinic and all urine collections were first-morning. Samples were stored in -80°C freezers at Mayo and ASU until all samples were collected, and then all samples were sent together to Metabolon for testing. The amount of time the samples were frozen ranged from 1 to 12 months with an average of 8 months.

Laboratory tests
Laboratory measurements were conducted by Mayo Clinic, the Metabolic and Oxidative Stress Laboratory at the University of Arkansas for Medical Sciences, and Metabolon Inc. as described below.
Mayo Clinic
Mayo Clinic laboratories measured levels of vitamin B12, folate, methylmalonic acid, homocysteine, isoprostane, vitamin D, vitamin E, hCG, and MTHFR variants as described below.
Vitamin B12 (cyanocobalamin) was measured quantitatively with a Beckman Coulter Access competitive binding immunoenzymatic assay. Briefly, serum is treated with alkaline potassium cyanide and dithiothreitol to denature binding proteins and convert all forms of vitamin B12 to cyanocobalamin. Cyanocobalamin from the serum competes against particle-bound anti-intrinsic factor antibody for binding to intrinsic factor-alkaline phosphatase conjugate. After washing, alkaline phosphatase activity on a chemiluminescent substrate is measured and compared against a multi-point calibration curve of known cyanocobalamin concentrations.
Folate (vitamin B9) was measured quantitatively with a Beckman Coulter Access competitive binding receptor assay. Briefly, serum folate competes against a folic acid alkaline phosphatase conjugate for binding to solid phase-bound folate binding protein. After washing, alkaline phosphatase activity on a chemiluminescent substrate is measured and compared against a multi-point calibration curve of known folate concentrations. The Folate assay is designed to have equal affinities for Pteroylglutamic acid (Folic acid) and 5-Methyltetrahydrofolic acid (Methyl-THF), so the result is a measure of both.
Methylmalonic acid (MMA) was measured quantitatively by liquid chromatography tandem mass spectrometry (LC-MS/MS). Briefly, serum is mixed with d3-methylmalonic acid as an internal standard, isolated by solid phase extraction, separated on a C18 column, and analyzed in negative ion mode. Chromatographic conditions and mass transitions were chosen to carefully distinguish methylmalonic acid from succinic acid.
Homocysteine was measured quantitatively by LC-MS/MS. Serum is spiked with d8-homocystine as an internal standard, reduced to break disulfide bonds, and deproteinized with formic acid and trifluoroacetic acid in acetonitrile. Measurement of total homocysteine and d4-homocysteine (reduced from d8-homocystine) is performed in positive ion mode with electrospray ionization.
Urine F2-Isoprostane (8-isoprostane) was measured quantitatively by LC-MS/MS after separation from prostaglandin F2 alpha. Urine is spiked with deuterated F2-isoprostane and deuterated prostaglandin F2 alpha, then positive pressure filtered. A mixed mode anion exchange turbulent flow column is used to clean up samples which are then separated on a C8 column and analyzed in negative ion mode.
Vitamin D (25-hydroxyvitamin D2 and D3) was measured quantitatively by LC-MS/MS. D6-25-hydroxyvitamin D3 is added to serum as an internal standard before protein precipitation with acetonitrile. Online turbulent flow chromatography is used to further clean up the samples prior to separation on a C18 column and analysis in positive ion mode. The D2 and D3 forms are measured separately; results are reported as D2, D3, and the sum.
Vitamin E was measured quantitatively by LC-MS/MS. D6-alpha-tocopherol internal standard is added to serum, and proteins are precipitated with acetonitrile. The supernatant is subjected to online turbulent flow for sample cleanup, separated on a C18 column, and analyzed in positive ion mode.
Serum ferritin was measured quantitatively with a Beckman Coulter Access two-site immunoenzymatic (sandwich) assay. Serum ferritin binds mouse anti-ferritin that is immobilized on paramagnetic particles; ferritin is also bound by a goat anti-ferritin-alkaline phosphatase conjugate. After washing, alkaline phosphatase activity on a chemiluminescent substrate is measured and compared against a multi-point calibration curve of known ferritin concentrations.
MTHFR mutation analysis was performed for the A1298C and C677T variants using Hologic Invader assays. DNA was isolated from whole blood and amplified in the presence of probes for both wildtype and variant sequences. Hybridization of sequence-specific probes to genomic DNA leads to enzymatic cleavage of the probe, releasing an oligonucleotide that binds to a fluorescently labeled cassette. This second hybridization results in generation of a fluorescent signal that is specific to the wildtype or variant allele. The MTHFR gene mutations are measured as categorical variables that indicate whether a sample has the mutation.
The Metabolic and Oxidative Stress Laboratory (MOSL) located at Arkansas Children's Research Institute performed the measurements described below.
Sample preparation for measurement of plasma methylation and oxidative stress metabolites
For concentration determination of total thiols (homocysteine, cysteine, cysteinyl-glycine, glutamyl-cysteine, and glutathione), the disulfide bonds were reduced and protein-bound thiols were released by adding 50 μl of freshly prepared 1.43 M sodium borohydride solution containing 1.5 μM EDTA, 66 mM NaOH and 10 μl n-amyl alcohol to 200 μl of plasma. After gentle mixing, the solution was incubated at +4°C for 30 min with gentle shaking. To precipitate proteins, 250 μl ice cold 10% meta-phosphoric acid was added and the sample was incubated for 20 min on ice. After centrifugation at 18,000 g for 15 min at +4°C, the supernatant was filtered through a 0.2 μm nylon filter and a 20 μl aliquot was injected into the high-performance liquid chromatography (HPLC) system.
For determination of free thiols and methylation metabolites, proteins were precipitated by the addition of 250 μl ice cold 10% meta-phosphoric acid and the sample was incubated for 10 min on ice. Following centrifugation at 18,000 g for 15 min at +4°C, the supernatant was filtered through a 0.2 μm nylon filter and a 20 μl aliquot was injected into the HPLC system.

HPLC with Coulometric electrochemical detection
The methodological details for metabolite elution and electrochemical detection have been described previously [18,19]. The analyses were accomplished using HPLC with a Shimadzu solvent delivery system (ESA model 580) and a reverse phase C18 column (5 μm; 4.6 × 150 mm, MCM, Inc., Tokyo, Japan) obtained from ESA, Inc. (Chelmsford, MA). A 20 μl aliquot of plasma extract was directly injected onto the column using a Beckman autosampler (model 507E). All plasma metabolites were quantified using a model 5200A Coulochem II electrochemical detector (ESA, Inc., Chelmsford, MA) equipped with a dual analytical cell (model 5010) and a guard cell (model 5020). The concentrations of plasma metabolites were calculated from peak areas and standard calibration curves using HPLC software.
Metabolon Inc
Metabolon Inc. conducted measurements of 595 metabolites in whole blood samples in a manner similar to a previous study [20]. Briefly, individual samples were subjected to methanol extraction then split into aliquots for analysis by ultrahigh performance liquid chromatography/mass spectrometry (UHPLC/MS). The global biochemical profiling analysis comprised four unique arms consisting of reverse phase chromatography positive ionization methods optimized for hydrophilic compounds (LC/MS Pos Polar) and hydrophobic compounds (LC/MS Pos Lipid), reverse phase chromatography with negative ionization conditions (LC/MS Neg), as well as a hydrophilic interaction liquid chromatography (HILIC) method coupled to negative ionization (LC/MS Polar) [21]. All of the methods alternated between full scan MS and data dependent MSn scans. The scan range varied slightly between methods but generally covered 70-1000 m/z. Metabolites were identified by automated comparison of the ion features in the experimental samples to a reference library of chemical standard entries that included retention time, molecular weight (m/z), preferred adducts, and in-source fragments as well as associated MS spectra, and were curated by visual inspection for quality control using software developed at Metabolon. Identification of known chemical entities was based on comparison to metabolomic library entries of purified standards [22]. Metabolites that were not officially confirmed with a standard are marked throughout the paper with a *. Measurements that were below the detection limit were replaced with the next lowest measurement divided by the square root of two.
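The replacement rule for measurements below the detection limit can be sketched as follows. This is a minimal illustration, not the pipeline actually used: it assumes that "the next lowest measurement" means the lowest detected value of the same metabolite, and that below-limit entries are encoded as `None`. The function name `impute_below_detection` is hypothetical.

```python
import math


def impute_below_detection(values):
    """Replace below-detection entries (encoded as None, an assumption)
    with the lowest detected value divided by sqrt(2)."""
    detected = [v for v in values if v is not None]
    if not detected:
        raise ValueError("no detected measurements to impute from")
    fill = min(detected) / math.sqrt(2)
    return [fill if v is None else v for v in values]


# Hypothetical measurements for one metabolite; None marks a below-limit value.
imputed = impute_below_detection([4.0, None, 2.0])
```

The imputed entry is always smaller than every detected value, so the ordering of the samples is preserved for rank-based tests.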

Statistical analysis
Univariate analysis
To conduct a univariate analysis, a test was performed for whether the population means or medians between the ASD-M group and the TD-M group are equal against the alternative hypothesis that they are not. To determine which testing method to use, the Anderson-Darling test [23] was applied to each sample. If the recorded samples of a particular metabolite or ratio were drawn from two normal distributions an F-test was subsequently performed to determine whether the population variances of both distributions were identical. If at least one of the two samples of a particular metabolite or ratio was not drawn from a normal distribution, the two-sample Kolmogorov-Smirnov test [24] was applied to examine whether the two samples were drawn from unknown distributions that had the same shape. This pre-analysis yielded four distinct scenarios for a particular metabolite or ratio: (i) both samples were drawn from normal distributions that had identical population variances, (ii) both samples were drawn from a normal distribution with unequal population variances, (iii) both samples were drawn from two unknown distributions that had the same shape and (iv) both samples were drawn from distinctively different distributions. For scenarios (i), (ii), (iii) and (iv) the standard Student t-test (t=), the Welch test (t ≠) [25], the Mann-Whitney U test (MW) [26] and the Welch t-test (t ≠ †) were applied, respectively, for a significance of α = 0.05. If a p-value is less than α, the null hypothesis is rejected. Conversely, for a p-value above or equal to α, the null hypothesis cannot be rejected.
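The four scenarios map onto a small decision rule. The sketch below only encodes that rule; it assumes the outcomes of the Anderson-Darling, F, and Kolmogorov-Smirnov pre-tests have already been computed elsewhere and are passed in as booleans. The function name and the returned labels are illustrative, not taken from the study's code.

```python
def choose_test(normal_1, normal_2, equal_variance=None, same_shape=None):
    """Pick the two-sample test from the pre-analysis outcomes.

    normal_1, normal_2: normality pre-test passed for each sample.
    equal_variance: F-test outcome, needed only if both samples are normal.
    same_shape: two-sample KS outcome, needed only otherwise.
    """
    if normal_1 and normal_2:
        if equal_variance is None:
            raise ValueError("F-test outcome required when both samples are normal")
        # Scenario (i): Student t-test; scenario (ii): Welch test.
        return "student_t" if equal_variance else "welch_t"
    if same_shape is None:
        raise ValueError("KS outcome required when a sample is not normal")
    # Scenario (iii): Mann-Whitney U; scenario (iv): Welch t-test
    # applied without the normality criterion being met.
    return "mann_whitney_u" if same_shape else "welch_t_nonnormal"
```

Keeping the rule separate from the pre-tests makes it easy to check that every metabolite or ratio falls into exactly one of the four scenarios.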
Some of the data analyzed below are categorical. In order to analyze these data, the Chi-square test (χ²) for independence was used. This tests whether categorical variables are independent [27]; here, it was used to determine whether each recorded categorical variable depends on whether the mother has previously had a child with ASD.
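For a binary variable compared across the two groups, the test reduces to a 2 × 2 contingency table. The following is a minimal sketch under stated assumptions: it uses the Pearson statistic without continuity correction (whether a correction was applied in the study is not stated), it assumes all row and column totals are nonzero, and the counts in the example are hypothetical.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value). With 1 degree of freedom the upper
    tail probability equals erfc(sqrt(statistic / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts: rows are ASD-M / TD-M, columns are variable present / absent.
stat, p = chi_square_2x2([[10, 20], [20, 10]])
```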
In order to determine the robustness of the hypothesis tests, the false discovery rates (FDR) for each metabolite were also calculated [28]. This was done by calculating the p-values for various combinations of mothers and calculating the fraction of p-values that were considered significant (≤ 0.05) over the total number of p-values. These combinations included leaving one mother out at a time, every combination leaving two mothers out at a time, and every combination leaving three mothers out at a time. This produced 1770 p-values for each metabolite from which the FDR was computed.
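The resampling scheme can be sketched as below. The sketch computes the fraction exactly as worded above (significant p-values over all p-values); because small FDR values are treated as favorable elsewhere in the text, the reported quantity may instead be the complementary fraction, which is not specified here. The names `leave_out_subsets` and `significant_fraction` and the `p_value_fn` callback are hypothetical. As a check on the arithmetic, with 59 samples the leave-one-out and leave-two-out subsets alone number 59 + 1711 = 1770, so the maximum number left out is a parameter.

```python
from itertools import combinations


def leave_out_subsets(n_samples, max_left_out):
    """Yield index lists with 1..max_left_out samples removed."""
    indices = range(n_samples)
    for k in range(1, max_left_out + 1):
        for left_out in combinations(indices, k):
            dropped = set(left_out)
            yield [i for i in indices if i not in dropped]


def significant_fraction(samples, labels, p_value_fn, max_left_out=2, alpha=0.05):
    """Fraction of leave-out subsets whose p-value is <= alpha.

    p_value_fn(samples, labels) is any two-group test supplied by the caller.
    """
    n_significant = 0
    n_total = 0
    for keep in leave_out_subsets(len(samples), max_left_out):
        p = p_value_fn([samples[i] for i in keep], [labels[i] for i in keep])
        n_total += 1
        if p <= alpha:
            n_significant += 1
    return n_significant / n_total
```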
The area under the curve (AUC) of the receiver operating characteristic (ROC) curve was also calculated for each metabolite. The ROC curve is a plot of false positive rate (FPR) vs. the true positive rate (TPR). The higher the area under the curve is, the better the measurements are at classifying between the two groups of mothers [29].
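The AUC of a single measurement can be computed without drawing the curve, since it equals the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one, with ties counted as one half. The sketch below uses that equivalence; the function name is illustrative, and treating higher values as indicating the positive group is an assumption (for a metabolite that is lower in the positive group the result falls below 0.5).

```python
def roc_auc(positive_scores, negative_scores):
    """Area under the ROC curve via pairwise comparison of the two groups."""
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical values: three of the four positive/negative pairs are ordered correctly.
auc = roc_auc([2.0, 4.0], [1.0, 3.0])  # 0.75
```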
A test was considered significant if the p-value was less than or equal to 0.05 and the FDR value was less than or equal to 0.1.

Multivariate analysis
While the univariate analyses focused on testing for equal population means or medians of individual metabolites/ratios, this does not answer the question of how important the differences in mean or median are to separate the two groups of mothers. In order to examine the extent of the differences within the recorded observations of the two groups of mothers, Fisher Discriminant Analysis (FDA) was applied [30]. This technique defines a projection direction in the data space such that the squared difference between the centers of the projected observations of both groups over the variances of the projected observations is a maximum. The objective function, J, to compute the projection direction is as follows:
$$J(\mathbf{p}) = \frac{(\bar{t}_1 - \bar{t}_2)^2}{s_1^2 + s_2^2} \qquad (1)$$
Here, $\bar{t}_1 = \frac{1}{n_1}\sum_{i=1}^{n_1} t_{1,i}$ and $\bar{t}_2 = \frac{1}{n_2}\sum_{i=1}^{n_2} t_{2,i}$ are the orthogonally projected means of both groups onto the direction vector and the sample variances of the projected data points are $s_1^2 = \frac{1}{n_1 - 1}\sum_{i=1}^{n_1} (t_{1,i} - \bar{t}_1)^2$ and $s_2^2 = \frac{1}{n_2 - 1}\sum_{i=1}^{n_2} (t_{2,i} - \bar{t}_2)^2$. The orthogonal projection of the i-th observation from the second sample is $t_{2,i} = \mathbf{x}_{2,i}^T \mathbf{p}$, where $\mathbf{p}$ is the unit-length direction vector. Note that the projection coordinate, $t_{2,i}$, is often referred to as a score. Essentially, FDA produces a projection direction which represents a tradeoff between optimally separating the two groups of mothers and minimizing the spread of the projected data within each group. FDA is used to develop a multivariate model that can be used to classify between the two groups of mothers.
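The objective described above can be evaluated directly for any candidate direction. The sketch below is illustrative only: it assumes the denominator is the sum of the two sample variances of the projected scores, normalizes the direction to unit length, and only scores a given direction rather than searching for the maximizing one. The function names and the data are hypothetical.

```python
from statistics import mean, variance


def project(observations, direction):
    """Scores: inner product of each observation with the direction vector."""
    return [sum(x_j * p_j for x_j, p_j in zip(x, direction)) for x in observations]


def fisher_objective(group_1, group_2, direction):
    """Squared distance between projected means over summed projected variances."""
    norm = sum(c * c for c in direction) ** 0.5
    unit = [c / norm for c in direction]  # enforce a unit-length direction
    t1 = project(group_1, unit)
    t2 = project(group_2, unit)
    return (mean(t1) - mean(t2)) ** 2 / (variance(t1) + variance(t2))


# Hypothetical two-variable data for two small groups.
group_a = [[0.0, 0.0], [2.0, 0.0]]
group_b = [[4.0, 1.0], [6.0, 1.0]]
j_value = fisher_objective(group_a, group_b, [1.0, 0.0])  # 16 / (2 + 2) = 4.0
```

Because the direction is normalized first, rescaling it leaves the objective unchanged, so only its orientation matters.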
FDA works well with data consisting of real numbers. However, some of the data were discrete in nature such as the information about MTHFR gene mutation. For classification tasks including both continuous and discrete data, logistic regression was used. Logistic regression is similar to linear regression, but the output is a variable that can assume two or more discrete values, i.e. a binomial or multinomial variable. The prediction of a logistic regression model is the probability that a sample belongs to either the ASD-M group or the TD-M group. The group that produces the highest probability is considered the group that the model classified the sample as belonging to [31,32].
The multivariate analysis made use of both FDA and logistic regression. The data were split into multiple subsets for analysis. These subsets include: (i) the 20 measurements from the FOCM/TS pathways, (ii) the same 20 measurements plus additional nutritional information, (iii) the 20 FOCM/TS metabolites with the additional nutritional information and the MTHFR gene information, and (iv) the 20 FOCM/TS metabolites, the additional nutritional information, the MTHFR gene information and a select number of significant metabolites from the broad metabolomics analysis. The additional nutritional markers included B12, Folate, Ferritin, Methylmalonic acid (MMA), and Vitamin E. The 50 metabolites from the Metabolon dataset included in the analysis were selected based upon the 50 highest AUC values from the corresponding ROC curves. These steps reduced the total number of metabolites from 621 to 76 for case iv. All combinations of two through ten variables were analyzed in each subset. FDA was used for subsets i and ii and logistic regression was used for subsets iii and iv. FDA was used to ensure consistency in the methodology with prior work [8], while logistic regression was needed for subsets iii and iv because they contained the MTHFR gene information, which consists of binary variables. The reason for analyzing a reduced set of variables for each of the four cases, instead of just investigating the full variable set, is to alleviate some of the concerns related to overfitting of the classification models.
Furthermore, a leave-one-out cross-validation procedure (LOOCV) [32] was used to independently assess classification accuracy. LOOCV removes the first observation, or participant's data, determines a model using (Eq. 1) based on the remaining n − 1 observations, and then applies this model to the first observation, which was left out. This application determines whether this observation is correctly or incorrectly classified as belonging to the ASD-M or TD-M group. Then, the second observation is left out, whilst the first observation is included for determining a second model using (Eq. 1). The second model is then used to decide whether the second observation is correctly classified or misclassified. This procedure is repeated until each observation has been left out once, allowing the calculation of the overall rate of correctly classified and misclassified observations. To determine whether an observation is correctly or incorrectly classified, the samples describing the ASD-M group were defined as positives and the corresponding samples of the TD-M cohort as negatives. The decision boundary to assign the label "ASD-M" or "TD-M" to a data point was based on a kernel density estimation of the scores (projection coordinates) computed by the FDA model from the positives (ASD-M group). More precisely, the decision boundary is determined for a chosen confidence level (one-sided) such that a score that is less than or equal to this boundary is labeled an ASD-M subject and a score that is larger than this threshold is labeled as a TD-M subject. The confidence level is chosen to reduce the difference between the type I and type II errors.
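The leave-one-out loop itself is independent of the classifier, which the sketch below makes explicit by taking `fit` and `predict` as arguments. This is an illustration under stated assumptions, not the study's implementation: the stand-in classifier uses the midpoint between the two class means of a one-dimensional score as its threshold, whereas the boundary described above comes from a kernel density estimate at a chosen confidence level. Labels are 1 for positives (ASD-M) and 0 for negatives (TD-M); all function names and data are hypothetical.

```python
def loocv_error_rates(scores, labels, fit, predict):
    """Return (type I rate, type II rate) from leave-one-out cross-validation.

    Type I: negatives classified as positive; type II: positives missed.
    """
    false_pos = 0
    false_neg = 0
    for i in range(len(scores)):
        train_x = scores[:i] + scores[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)          # model never sees sample i
        predicted = predict(model, scores[i])  # classify the held-out sample
        if labels[i] == 0 and predicted == 1:
            false_pos += 1
        elif labels[i] == 1 and predicted == 0:
            false_neg += 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return false_pos / n_neg, false_neg / n_pos


def fit_midpoint(xs, ys):
    """Stand-in model (assumption): threshold halfway between class means."""
    pos = [x for x, y in zip(xs, ys) if y == 1]
    neg = [x for x, y in zip(xs, ys) if y == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def predict_below(threshold, x):
    """Scores at or below the threshold are labeled positive (ASD-M)."""
    return 1 if x <= threshold else 0


# Hypothetical one-dimensional scores with one outlier in each group.
type_1, type_2 = loocv_error_rates(
    [1.0, 2.0, 9.0, 7.0, 8.0, 3.0], [1, 1, 1, 0, 0, 0], fit_midpoint, predict_below
)
```

Sensitivity and specificity follow as one minus the type II and type I rates, respectively.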

Univariate analysis
Participants
The medical histories and characteristics of the participants are shown in Tables 1, 2 and 3. The hypothesis testing shown in the tables was done using either the Chi-squared test or the Student's t-test. Each table lists n.s. (not significant) for the p-value or FDR when the result was greater than 0.05 for the p-value or 0.1 for FDR, indicating that the measurement showed no statistically significant differences between the two groups. All ASD-M participants enrolled in this study had a child that met full criteria for ASD based on ADI-R scores. Table 1 lists basic characteristics and medical histories of the mothers. Information on the children can be found in the supplemental section (Table S-1) as well as information about the mothers' pregnancies (Table S-). The average ages of mothers in the ASD-M group and TD-M group were similar (35.4 years and 34.9 years, respectively), since they were matched for maternal age. The children in the ASD group were on average slightly older (4.71 vs. 3.87 years), as any age between 2 and 5 years was allowed, and the ASD group was skewed towards the end of that range since it takes time for children with ASD to be diagnosed and to have been contacted for this study (Table S-2).
More information on the mothers can be found in Tables 2 and 3. Table 2 lists the medications that were taken by the mothers at the time of the study and Table 3 lists mental and physical symptoms of the mothers. The p-values and FDR results in Tables 2 and 3 show that there were no significant differences in the medication use listed and the symptoms experienced between the two groups of mothers during the study period.

FOCM/TS metabolites
The univariate results for the FOCM/TS metabolites are shown in Table 4. Levels of vitamin B12 and the SAM/SAH ratio are significantly lower in the ASD-M group compared to the TD-M group (p ≤ 0.05, FDR ≤ 0.1). Also, levels of Glu-Cys, fCysteine, and fCystine are significantly higher in the ASD-M group compared to the TD-M group (p ≤ 0.05, FDR ≤ 0.1).
Global metabolic profile-Metabolon
622 metabolites were measured in whole blood. The univariate results for the 50 metabolites from broad metabolomics with the highest AUC values are shown in Table 5. They are ordered starting with those with the highest AUC. Note that these are semi-quantitative measurements (no absolute values), so only the ratio of ASD-M/TD-M is shown. In almost every case the ASD-M group had lower levels of metabolites than the TD-M group, with the levels of 4-vinylphenol sulfate, NAD+, and three glycine-containing metabolites (gamma-glutamylglycine, cinnamoylglycine, propionylglycine) being especially low (ASD-M/TD-M ratio < 0.5). Four metabolites were higher in the ASD-M group (histidylglutamate, asparaginylalanine, dimethyl sulfone, and mannose). Note that dimethyl sulfone was unusually high in the ASD-M group (ASD-M/TD-M ratio = 18.7, p = 0.01, but the FDR was not significant); however, 80% of the TD-M measurements of dimethyl sulfone and 47% of the ASD-M measurements were below the detection limit, and the distribution of its data is skewed.
Hypothesis testing was also done on the entire Metabolon dataset and revealed that 48 of these metabolites had significant differences between the two groups of mothers. Three of these metabolites were not included in the top 50 used for analysis because they had lower AUC values than the metabolites included (see Table S-3). The pathways and subpathways of these metabolites can be found in the supplemental section in Table S-4.

Carnitine
As shown in Table 5, several carnitine-conjugated metabolites are significantly different in the two groups of mothers. These metabolites have been indicated in Table 5 with #. The ratio of ASD/TD for carnitine-conjugated metabolites was consistently low, ranging from 0.63 to 0.87, with an average of 0.77. There were 33 additional carnitine metabolites in the 600 metabolites measured by untargeted metabolomics. Of these 33, eight metabolites had ratios indicating that the levels of carnitine-conjugated molecules in the ASD-M group were significantly lower than in the TD-M group, and none were significantly higher. Additional univariate hypothesis testing on carnitine and two of its precursors (lysine and trimethyllysine) from the Metabolon dataset not included in the top 50 found that the levels of these metabolites are very similar in the ASD-M and TD-M groups (within 1%; data not shown). This suggests that the low levels of carnitine-conjugated metabolites are not due to a carnitine deficiency.
Since the levels of carnitine-conjugated molecules were lower in the ASD-M group (see Table 5), and since beef is the primary dietary source of carnitine (some can also be made by the body), hypothesis testing was performed on the beef quantity and beef frequency in the mothers' diets to see if there was a difference between the two groups of mothers. These results are shown in Table 6 below.
There was no significant difference found in the mean/median of the beef consumption frequency and quantity between the two groups. Also, the beef consumption frequency and quantity measurements did not significantly correlate with carnitine levels, except for a slight negative correlation of beef frequency and lignoceroylcarnitine (C24) (r = − 0.26, p = 0.05, unadjusted; data not shown).

Multivariate analysis
The multivariate analysis was performed using multiple subsets of data. The subsets included the twenty metabolites from the FOCM/TS pathways (i), the FOCM/TS metabolites plus some additional nutritional information (ii), the FOCM/TS metabolites plus the additional nutritional information and the MTHFR gene information (iii), and subset iii plus fifty metabolites from the broad metabolomics analysis (iv). The first two subsets were analyzed using FDA because all of the variables were continuous, and to allow comparison with a previous study [8], and the last two subsets were analyzed using logistic regression because the variables included both continuous and binary data. Each multivariate analysis was combined with leave-one-out cross-validation in order to ensure a statistically independent evaluation of the classifiers obtained. The best combinations of metabolites from each of the first three subsets had misclassification errors ranging from 20 to 27%, which shows only a very modest ability to predict which group of mothers the sample came from. The highest accuracies were found when analyzing the fourth and final dataset, with errors of 3%. The best combinations of fewer metabolites are included in the supplementary section (Table S-5). Combinations that contained more than 5 variables resulted in a decrease in accuracy due to overfitting of the classification model.
Table 3 footnote: Other symptoms not listed were also reported. For the ASD-M group, multiple sclerosis, very sensitive to alcohol, and frequent boils were reported. For the TD-M group, nausea/pain from ovarian cyst, chronic pain, sensitive to loud noises, and touch aversive were reported. The symptoms were considered significantly different between the two groups if the p-value was less than or equal to 0.05 and the FDR was less than or equal to 0.1. GI in this case stands for gastrointestinal.
Table footnote: The ratio of ASD/TD values refers to the ratio of the means of the data for ASD cases and TD cases.
It is important to note that many other combinations of metabolites yielded similar results as the top combination of five metabolites, but the errors were slightly higher (Table S-6). Table 7 below details the type I/type II errors using these metabolites.
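The cross-validation procedure described above can be illustrated with a minimal sketch. It assumes a plain gradient-descent logistic regression and a small synthetic two-feature data set; the function names (`fit_logistic`, `loocv_error`), the learning rate, and the data are hypothetical and are not the software or measurements used in this study.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            err = p - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b

def loocv_error(X, y):
    """Leave-one-out cross-validation: each sample is predicted by a model
    trained on all other samples, so it never influences its own prediction."""
    mistakes = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        w, b = fit_logistic(X_train, y_train)
        p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, X[i])))
        mistakes += int((p >= 0.5) != bool(y[i]))
    return mistakes / len(X)

# Synthetic, well-separated toy data (assumption): 1 = "ASD-M", 0 = "TD-M".
X = [[-2.0, -1.5], [-1.5, -1.0], [-1.0, -2.0], [-1.2, -0.8],
     [1.0, 1.5], [1.5, 1.0], [2.0, 0.8], [1.2, 2.0]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
error = loocv_error(X, y)
print(f"LOOCV misclassification error: {error:.2f}")
```

In a search over variable subsets, the same `loocv_error` would be evaluated for each candidate combination, which is one way the loss of accuracy from overfitting with larger combinations could become visible.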
In order to visually demonstrate the separation between the two groups, probability density functions (PDFs) were created for the combinations of metabolites from subsets (i) and (ii) (Figs. 1 and 2). In order to visually demonstrate the classification accuracy between the two groups when using the logistic regression classification model, a scatter plot was created showing the probabilities of each sample being classified as one group or another. The scatter plot representing the combination of metabolites from the FOCM/TS metabolites plus additional information and the MTHFR gene information (iii) is shown in Fig. 3.
To further illustrate the classification accuracy of the 5-metabolite model from Table 7, the probabilities that the samples would be classified by the model in each of the two groups are shown in Fig. 4. The metabolites of this 5-metabolite model (Glu-Cys, histidylglutamate, cinnamoylglycine, proline, and adrenoylcarnitine (C22:4)*) are hereafter referred to as the "core metabolites", as these resulted in the lowest type I and type II errors.
The plots show that the ASD-M samples have a high probability of being classified as ASD-M and the TD-M samples have a high probability of being classified as TD-M. The results from this figure coupled with the low misclassification errors from Table 7 show that there are significant metabolic differences between the two groups of mothers and that these differences are sufficiently large to allow for accurate classification in the vast majority of cases.
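As a worked illustration of how sensitivity, specificity, and the two error types relate, the following sketch computes them from a confusion matrix. The counts are hypothetical: they were chosen only to match the group sizes (30 ASD-M, 29 TD-M) and to round to the 93% sensitivity and 97% specificity quoted in the abstract. Treating ASD-M as the "positive" class is also an assumption.

```python
def classification_rates(tp, fn, tn, fp):
    """Rates from a 2x2 confusion matrix (ASD-M assumed to be the positive class)."""
    sensitivity = tp / (tp + fn)  # fraction of ASD-M samples classified as ASD-M
    specificity = tn / (tn + fp)  # fraction of TD-M samples classified as TD-M
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "type_I_error": 1.0 - specificity,   # false positive rate
        "type_II_error": 1.0 - sensitivity,  # false negative rate
    }

# Hypothetical counts (assumption): 28 of 30 ASD-M and 28 of 29 TD-M correct.
rates = classification_rates(tp=28, fn=2, tn=28, fp=1)
for name, value in rates.items():
    print(f"{name}: {100 * value:.1f}%")
```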
In order to further investigate the differences between the two groups, the correlation coefficients between the 5 metabolites from the best classification model (Table 7) and the rest of the metabolites considered in the analysis were calculated for the combined set of ASD-M and TD-M samples. The metabolites that had the highest correlation coefficients with these metabolites are listed in Table 8. Furthermore, the correlations of the top 5 metabolites with one another were calculated and, as expected, very little correlation among these five was found (see Table 9): the classification algorithm tries to identify metabolites that provide new information, since redundant information will not increase classification accuracy. This suggests that there are five general areas of metabolic differences in mothers of children with/without ASD, involving 9 or more metabolites for each area.
The metabolites listed here are the 50 metabolites measured by Metabolon from broad metabolomics with the highest area under the receiver operating characteristic (ROC) curve (AUC). Metabolites with p-value ≤ 0.05 and FDR ≤ 0.1 are marked with Δ. The possible hypothesis tests include Welch's test without the normality criteria being met (t ≠ †), Welch's test (t ≠), Mann-Whitney U (MW), and Student's t-test (t=). The a indicates metabolites that have not been officially confirmed based on a standard, but Metabolon is confident in the metabolite's identity. The b indicates carnitine-conjugated metabolites.
The beef frequency and quantity measurements were not recorded for every ASD-M participant, which is why N = 28 in this case. The beef frequency is defined as the number of times beef was eaten per week and the beef quantity is defined as the serving size (compared to a standard serving).
Most of the metabolites listed in Tables 4 and 5 that were significantly different between the ASD-M and TD-M groups were found to be significantly correlated with the 5 core metabolites. However, there were 5 metabolites that were significantly different between the ASD-M and TD-M groups that did not significantly correlate with the 5 core metabolites. These five metabolites were B12, cis-4-decenoylcarnitine (C10:1), catechol sulfate, 7-methylxanthine, and tiglylcarnitine (C5:1-DC). A correlation analysis was conducted to determine if any of these 5 metabolites were correlated with one another, possibly forming a 6th group of correlated metabolites. However, none of the 5 metabolites were significantly correlated with one another. So, it appears that there are 5 primary sets of metabolites, and 5 additional metabolites that are not part of those 5 groups, which are significantly different between the ASD-M and TD-M groups.
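The grouping logic described above can be sketched as follows. This is a simplified illustration, not the procedure used in the study: it assumes Pearson correlation and replaces the significance test on each correlation with a fixed cutoff on |r| (the `threshold` parameter is hypothetical). The metabolite names and values are synthetic.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def assign_to_core(core, others, threshold=0.5):
    """Attach each metabolite to the core metabolite it correlates with most
    strongly; metabolites below the |r| cutoff for every core stay unassigned."""
    groups = {name: [] for name in core}
    unassigned = []
    for name, values in others.items():
        best, best_r = None, 0.0
        for core_name, core_values in core.items():
            r = pearson(values, core_values)
            if abs(r) > abs(best_r):
                best, best_r = core_name, r
        if best is not None and abs(best_r) >= threshold:
            groups[best].append(name)
        else:
            unassigned.append(name)
    return groups, unassigned

# Synthetic example (assumption): two weakly correlated "core" metabolites.
core = {"core_A": [1, 2, 3, 4, 5, 6], "core_B": [1, -1, 1, -1, 1, -1]}
others = {
    "m1": [2, 4, 6, 8, 10, 12],   # tracks core_A
    "m2": [3, -3, 3, -3, 3, -3],  # tracks core_B
    "m3": [1, 1, -2, -2, 1, 1],   # tracks neither
}
groups, unassigned = assign_to_core(core, others)
print(groups, unassigned)
```

Metabolites left in `unassigned` play the role of the five significantly different metabolites that did not correlate with any core metabolite.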
Overall, many of the metabolites measured in this study are significantly different between the two groups of mothers, ASD-M and TD-M. The subset of metabolites that worked the best for classification was a subset of five metabolites which were each correlated with many others.
The a indicates metabolites measured by Metabolon that were not officially confirmed based on a standard, but Metabolon is confident of the metabolite's identity.
Fig. 1 PDFs of the combination of metabolites from the FOCM/TS metabolites (i) that resulted in the respective errors shown in Table 7
Fig. 2 PDFs of the combination of metabolites from the FOCM/TS metabolites and additional measurements (ii) that resulted in the respective errors shown in Table 7

Univariate analysis
The hypothesis testing done on the medical histories, current medications, and symptoms indicated that there were no significant differences between the two groups of mothers other than the age of their children. The p-value was significant for the use of birth control, but the FDR value was not. The slightly higher use of birth control in the ASD group may be due to less desire to have additional children after one child is already diagnosed with ASD. Therefore, the medical histories, current medications, and symptoms were determined to not have a large effect on the differences in metabolite measurements between the two groups in this particular study. This suggests that the differences that were found in the metabolites were most likely not due to the age of the mothers, their medical histories, current medications, or current symptoms.
Hypothesis testing on the metabolites from the FOCM/TS pathways, additional nutritional information, and MTHFR gene information revealed that only five of these measurements have a significant difference in the mean/median between the two groups (Table 4). A meta-analysis of 12 studies [33] found that supplementation with folic acid during pregnancy results in a significantly reduced risk of ASD in the children, with some studies suggesting that folic acid supplementation during the first 2 months of pregnancy is most important. Levels of folate were not significantly lower in the ASD-M group in this study (17% lower, p = 0.20, n.s.), but folate levels were significantly correlated with two of the five key metabolites (Glu-Cys and proline; data not shown). Similarly, vitamin B12 levels were significantly lower in the ASD-M group and significantly correlated with six of the top 50 metabolites, and one study [11] found that abnormal maternal levels of vitamin B12 were associated with an increased risk of ASD, although one small study [34] found no association. Vitamin B12 and folate work together in recycling of homocysteine to methionine, a key step of the FOCM/TS pathway.
Fig. 3 Scatter plot of the probabilities of being classified into one group or the other using a combination of variables from the FOCM/TS pathways, the additional measurements, and the MTHFR gene information (iii) that resulted in the errors listed in Table 7
Fig. 4 Scatter plot of the probabilities of being classified into one group or the other using a combination of variables from the FOCM/TS pathways, the additional measurements, and the top 50 metabolites from the metabolon
The a indicates metabolites measured by Metabolon that have not been officially confirmed by a standard, but Metabolon is confident in the metabolite's identity.
Table 9 Correlation coefficients between the five core metabolites model from
Hypothesis testing was next performed on the 50 metabolites from Metabolon with the highest AUC. Forty-five of these 50 metabolites were found to have significant differences (p ≤ 0.05, FDR ≤ 0.1) between the two groups of mothers. Additionally, three other metabolites, not found among the 50 with the highest AUC, also showed statistically significant differences between the two groups (see Table S-3). This reveals that, in addition to the known abnormalities in the FOCM/TS pathway [2][3][4][5][6][7][8], there are also many other metabolic pathway differences between mothers of children with/without ASD. The pathways of the significantly different metabolites are listed in the supplemental section in Table S-4. The primary categories of these metabolites are amino acids, carnitines, and xenobiotics. In almost all cases these particular metabolites were significantly lower in the ASD-M group.
This does not appear to be an artifact of the study, because all samples were collected identically and processed and analyzed together, and most metabolites were not significantly different between the ASD-M and TD-M groups. So, the large number of metabolites listed in Table 5 suggests that there are in fact many metabolic differences between the ASD-M and TD-M groups.
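The two-part significance criterion used throughout (p ≤ 0.05 and FDR ≤ 0.1) can be illustrated with a short sketch. It assumes the Benjamini-Hochberg procedure for the FDR, which is a common choice but is not confirmed by the text here; the p-values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted values
    # never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def significant(p_values, alpha=0.05, fdr=0.1):
    """Apply both thresholds: raw p-value and adjusted (FDR) value."""
    q_values = benjamini_hochberg(p_values)
    return [p <= alpha and q <= fdr for p, q in zip(p_values, q_values)]

# Invented p-values (assumption). The second one passes p <= 0.05 but fails
# the FDR threshold, the same pattern reported for birth control use.
p_values = [0.001, 0.04, 0.5, 0.6, 0.7, 0.8]
q_values = benjamini_hochberg(p_values)
flags = significant(p_values)
print(q_values)
print(flags)
```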

Multivariate analysis
Multivariate analysis was performed to investigate if the metabolites measured would be able to classify a mother as either having had a child with ASD (ASD-M) or a typically-developing child (TD-M). When using just the metabolites from the FOCM/TS pathways, a combination of five metabolites (tCysteine, Glu-Cys, fCysteine, fCystine/fCysteine, and Nitrotyrosine) had the lowest misclassification errors calculated using leave-one-out cross-validation, with errors of approximately 25%. These errors show that the first subset of metabolites has only modest ability to classify the two groups of mothers. It is interesting to note that the present results for the FOCM/TS analysis revealed substantially less ability to distinguish the ASD mothers than a similar previous study, which found that metabolites from the FOCM/TS pathways could classify between pregnant mothers who have had a child with ASD and pregnant mothers who have not with an accuracy of about 90% [8].
The key difference is that the present paper analyzed FOCM/TS metabolites 2-5 years after birth, whereas the other study evaluated mothers during pregnancy; in other words, measurements during pregnancy were better predictors of ASD risk. The addition of other biomarkers (B12, folate, ferritin, MMA, vitamin E, and MTHFR) to the dataset did not significantly improve classification with either FDA or logistic regression. The fourth subset of metabolites included the FOCM/TS metabolites, the nutritional biomarkers, the MTHFR gene information, and 50 metabolites from the 600 metabolites measured by Metabolon. Using this larger set of information, the classification errors decreased significantly. The best combination of five metabolites was found to have misclassification errors as low as 3%. This combination included one metabolite from the FOCM/TS metabolites (Glu-Cys). At least for this study, the metabolites of the FOCM/TS pathway provide some information for a modest classification, but other metabolites play an even more important role. Correlation analysis (Table 8) revealed that there appear to be five primary categories of significantly different metabolites, with significant correlations within each group to its primary metabolite, but low correlations between the 5 primary metabolites. Almost all the metabolites which were significantly different between the ASD-M and TD-M groups (see Table 5) fell into one of these five groups. However, there were five metabolites that did not significantly correlate with any of the primary metabolites and did not correlate with each other.
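The selection of the 50 Metabolon metabolites by area under the ROC curve can be sketched as below. The sketch computes the AUC of a single metabolite as the fraction of (ASD-M, TD-M) sample pairs in which the ASD-M value is higher, with ties counted as one half. Ranking by max(AUC, 1 − AUC), so that metabolites lower in ASD-M also rank highly, is an assumption, and all names and values are synthetic.

```python
def auc(pos, neg):
    """AUC as the probability that a value from `pos` exceeds a value from
    `neg` (ties count one half); equivalent to the Mann-Whitney U statistic
    divided by the number of pairs."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def rank_by_auc(data_pos, data_neg, top_k):
    """Return the top_k metabolite names by direction-agnostic AUC."""
    scores = {}
    for name in data_pos:
        a = auc(data_pos[name], data_neg[name])
        scores[name] = max(a, 1.0 - a)
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Synthetic measurements (assumption): three metabolites, three samples per group.
asd_m = {"met_1": [1, 2, 3], "met_2": [5, 6, 7], "met_3": [1, 5, 9]}
td_m = {"met_1": [4, 5, 6], "met_2": [5, 6, 8], "met_3": [2, 5, 8]}
top = rank_by_auc(asd_m, td_m, top_k=2)
print(top)
```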

Carnitine-conjugated metabolites
The univariate analysis found that all but one carnitine-conjugated metabolite (Adrenoylcarnitine (C22:4)*) were significantly lower in the ASD-M group, with the ratio of carnitine levels for ASD-M/TD-M ranging from 0.66 to 0.87, with an average of 0.78. Carnitine can be produced by the body, but there is some dietary intake also, with the only common dietary sources of carnitine being beef and (to a lesser extent) pork. There were no significant differences in the beef consumption quantity and frequency between the two groups of mothers. Similarly, the levels of carnitine and two of its precursors (lysine and trimethyllysine) were essentially identical for the ASD-M and TD-M groups (data not shown). This suggests a difference in the process of carnitine conjugation and may be due to a defect or impairment of the enzymes which control carnitine conjugation, such as carnitine palmitoyltransferase 1 (CPT1). Other studies have also shown that carnitine supplementation is beneficial for children with ASD [35][36][37].

Implication on possible role of nutritional/metabolic status of mothers with children with ASD
Couples who have had a child with ASD have an 18.7% chance of future children being diagnosed with ASD [16], while the general risk for ASD is approximately 1.9% [15]. While it is not known if a mother's metabolic profile is linked to her child's ASD diagnosis, this is a hypothesis that requires further research. Our results indicate that measurements of Glu-Cys, histidylglutamate, cinnamoylglycine, proline, and adrenoylcarnitine (C22:4)* may be able to predict with approximately 97% accuracy whether a woman, while she is not pregnant, had a child with ASD in the previous 2-5 years. It should be noted that it is not possible from this study to draw conclusions about nutritional status of the mothers during pregnancy.
This research found that several metabolites of the FOCM/TS pathway are different in the ASD-M group. One previous study found that several metabolites from the FOCM/TS pathway were significantly different in children with ASD as compared to controls [2]. A second study measured these metabolites in non-pregnant mothers, similar to what was done here, and found using univariate analysis that homocysteine, adenosine, and SAH were elevated in the mothers who have a child with ASD [7], but the work did not perform the classification analysis that this paper covers. These previous studies, in combination with the research done here, serve as indicators that the FOCM/TS pathways are of importance to ASD research. That being said, this work also tried to replicate findings from a previous study involving two groups of pregnant mothers (mothers who have had a child with ASD and mothers who have not) using metabolites from the FOCM/TS pathways [8]. However, it was found that the FOCM/TS metabolites alone provided only modest classification ability. It needs to be emphasized that this study focuses on metabolic measurements 2-5 years after birth while the significantly stronger results from [8] involved measurements during pregnancy. As such the two results are not directly comparable.
Several studies have suggested that other pathways may be indicative of ASD risk. Studies have found abnormalities in glutathione metabolism and redox metabolism in people with ASD. These processes are important for cell health and can affect other processes within the FOCM/TS pathways [38]. Another study found that when a mother had diabetes and was obese pre-pregnancy, her branched-chain amino acid level was significantly associated with the risk of her child having autism [39]. Similar to that study, further research has found that combining low maternal high-density lipoprotein cholesterol along with high maternal plasma branched-chain amino acid levels and the child being male led to an increased risk of the child being diagnosed with ASD [40]. Another study involving maternal metabolite levels pre-pregnancy from blood samples stored from that time, found that mothers who had children with ASD showed differences in several metabolic pathways including bile acid pathways, glycosphingolipid synthesis, N-glycan and pyrimidine metabolism, and C21-steroid hormone biosynthesis and metabolism [41].
It is important to note that the results for this pilot study are for maternal levels post-pregnancy, so they are only suggestive of possible nutritional and metabolic differences during pregnancy.

Limitations of this study
There were several limitations of the study performed. This is a pilot study with a relatively small sample size. Further studies with larger sample sizes are needed to validate these results and to confirm that the medical histories, symptoms, and current medications of the mothers are not confounders for the metabolic analysis. These further studies would also benefit by including these potential confounders in the multivariate analysis. Further studies should also consider measuring paternal factors such as age and diet as potential confounders. Metabolite concentrations do not stay constant, as stress and diet can affect them, so the results may depend on when the samples were taken. Also, the measurements were taken 2-5 years after giving birth and therefore, the measurements are only suggestive of differences that might have been present during pregnancy/lactation. There was also a discrepancy in the ages of the children in this study. The ages of the mothers were closely matched, but the time since giving birth is different as the children in the ASD group were slightly older (4.71 ± 1.0 vs. 3.87 ± 1.3, p = 0.01). Future studies should try to match ages of the mother and time since birth. This is especially important as the diet information was collected using a block food frequency questionnaire for when the mothers were pregnant, which may result in greater recall bias the more time has passed since birth. Another potential limitation of using this questionnaire is a response bias where some of the dietary intake levels may be underestimated due to social expectations.
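To give a sense of the size of the age discrepancy noted above, the sketch below computes a Welch t statistic and its approximate degrees of freedom from the reported summary values. Several assumptions are made: the ± values are taken to be standard deviations, the group sizes are taken to be 30 and 29, and Welch's test is used; the text does not state which test produced the reported p-value. The p-value itself is not computed here because that requires the t distribution, which is outside the Python standard library.

```python
import math

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom
    computed from group means, standard deviations, and sample sizes."""
    v1 = sd1 ** 2 / n1
    v2 = sd2 ** 2 / n2
    t = (mean1 - mean2) / math.sqrt(v1 + v2)
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, df

# Child age in years: ASD group vs. TD group (n and SD interpretation assumed).
t_stat, df = welch_t(4.71, 1.0, 30, 3.87, 1.3, 29)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```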

Conclusions
In conclusion, this study found many statistically significant differences in metabolites of mothers of children with ASD compared to mothers of typically-developing children, at 2-5 years after birth. A subset of five metabolites was sufficient to differentiate the two groups with approximately 97% accuracy, after leave-one-out cross-validation. Almost all of the metabolites that were significantly different between the two groups were correlated with one of these five metabolites, suggesting that there are at least five areas of metabolic differences between the ASD-M and TD-M groups, represented by the core metabolites (Glu-Cys, histidylglutamate, cinnamoylglycine, proline, adrenoylcarnitine (C22:4)), which each correlated with many others. The results of this pilot study may be useful for guiding future studies of metabolic risk factors during conception/pregnancy/lactation.
analysis because they had lower AUC values than those in the top 50. These metabolites and their hypothesis testing results are shown in Table S-3.