Relationship between our microbiome and personalized nutrition

Recently, it has been asked whether there are 'metabolic types' among humans that could benefit from a form of personalized nutrition. One answer suggested that one distinguishing factor could be the human microbiome. It is known that host-microbial symbiotic states might respond differently to diet and drug intake, but whether this can be useful for personalized nutrition is less clear to me. I wonder:

does the microbiome affect food metabolism? (useful for personalized nutrition)


is the food that we eat affecting the microbiome? (less useful for personalized nutrition)

I know the answer is probably both, but are there examples that causally demonstrate one direction while excluding the other?

does the microbiome affect food metabolism?

Most definitely (and not surprisingly). The Arumugam paper [1] notes that

The drivers of [enterotype 1] seem to derive energy primarily from carbohydrates and proteins through fermentation,… because genes encoding enzymes involved in the degradation of these substrates (galactosidases, hexosaminidases, proteases) along with glycolysis and pentose phosphate pathways are enriched in this enterotype [… ]

Enterotype 2… is enriched in Prevotella… and the co-occurring Desulfovibrio, which can act in synergy to degrade mucin glycoproteins present in the mucosal layer of the gut [… ]

Enterotype 3 is [… ] enriched in membrane transporters, mostly of sugars, indicating the efficient binding of mucin and its subsequent hydrolysis as well as uptake of the resulting simple sugars by these genera. [… ]

The enriched genera indicate that enterotypes use different routes to generate energy from fermentable substrates available in the colon, reminiscent of a potential specialization in ecological niches or guilds. In addition to the conversion of complex carbohydrates into absorbable substrates, the gut microbiota is also beneficial to the human host by producing vitamins. Although all the vitamin metabolism pathways are represented in all samples, enterotypes 1 and 2 were enriched in biosynthesis of different vitamins [… ]

[All emphasis mine.]

is the food that we eat affecting the microbiome?

Yes, just as certainly. I don't have a publication handy but it should be obvious that our food influences our gut microbiome - in the extreme case, it can kill it (consider antibiotics side effects).

[1] Manimozhiyan Arumugam, Jeroen Raes et al., Enterotypes of the human gut microbiome, Nature 473, 174-180, May 2011.

Yes, the microbiome affects food metabolism and the diet affects the composition of the microbiome. +1 to Konrad for his response. This is an area of research in which I and colleagues are engaged. Frankly, it is easier to assess the changes to the microbiome based on diet than to examine the fecal material to determine (unused) metabolic energy or potential given a certain input (i.e., from a controlled diet).

Recently, we've fed humans a high number of servings of whole grains. The response is highly individualistic, with some individuals showing declines and others increases in Firmicutes, for example. Typically, Bacteroidetes numbers move opposite to Firmicutes, but not in all individuals. These observations have also been made by others in this growing field of research.

The "enterotypes" paper Konrad cites is beginning to see some resistance in the field in that the three enterotypes defined by those authors are not really so cleanly defined when one takes a closer look at more individuals. Time will tell how well one view or the other holds.

Nutrition for Precision Health, powered by the All of Us Research Program

The goal of the NIH Common Fund’s Nutrition for Precision Health, powered by the All of Us Research Program, is to develop algorithms that predict individual responses to food and dietary patterns. Nutrition plays an integral role in human development and in the prevention and treatment of disease. However, there's no such thing as a perfect, one-size-fits-all diet. The NPH program will build on recent advances in biomedical science, including artificial intelligence (AI) and microbiome research, as well as the infrastructure and large, diverse participant group of the All of Us Research Program. These advances provide unprecedented opportunities to generate new data that offer insight into personalized nutrition, also referred to as precision nutrition.
In addition, the first ever Strategic Plan for NIH Nutrition Research emphasized opportunities to improve our understanding of how individual human biology and molecular pathways influence relationships among diet and environmental, social, and behavioral factors to influence health. Designed to implement aspects of the Strategic Plan, the Nutrition for Precision Health program will conduct a study nested in the All of Us Research Program to explore how individuals respond to different diets. The NPH study is the first ancillary study to leverage the All of Us infrastructure to answer scientific questions important to participants like understanding more about the role of nutrition in health. High-quality nutrition studies such as the NPH study will help individuals and their health care providers create healthy, precise, and effective diet plans.

The objectives of the study are:
1. To examine individual differences observed in response to different diets by studying the interactions between diet, genes, proteins, microbiome, metabolism and other individual contextual factors
2. To use artificial intelligence (AI) to develop algorithms to predict individual responses to foods and dietary patterns
3. To validate algorithms for clinical application

The Nutrition for Precision Health program includes several integrated components:

1) Research Coordinating Center: Provide administrative management and coordination across all sites.
2) Clinical Centers: Recruit, consent, and enroll All of Us participants into nutrition
3) Data Generation Centers: a) Perform genetic analyses of microbiome from the human gut b) Perform metabolic analyses c) Advance dietary assessment methods.
4) Artificial Intelligence, Multimodal Data Modeling, and Bioinformatics Center: Establish mathematical and computational modeling, develop algorithms, and enhance data visualization.
5) All of Us Biobank: Receive, process, and store biosamples and metadata.

Thryve's Personalized Probiotics

Clinical and scientific research for our strain formulations

Our immune health formulation contains Lactobacillus paracasei Th1 and Th2. L. paracasei Th1 is also called LP33, and Th2 is called AAP-2. Several clinical studies have shown that they can modulate the immune system by alleviating symptoms of allergic rhinitis and atopic dermatitis (Ref 1, Ref 2, Ref 3). In addition, our weight management formulation contains Lactobacillus reuteri Tr1, which is also called L. reuteri ADR1. This patented probiotic strain can modulate metabolic health, help weight management (Ref 4, Ref 5), and is currently under a 2017-2018 clinical trial at (Ref 6).

Our digestive health formulation contains Bacillus, Lactobacillus and Bifidobacterium species, ingredients that are derived from plants, human and dairy products. They are gluten-free and encapsulated in vegetable cellulose, which helps bypass stomach acid.

Gut microbiome implicated in healthy aging and longevity

The gut microbiome is an integral component of the body, but its importance in the human aging process is unclear. ISB researchers and their collaborators have identified distinct signatures in the gut microbiome that are associated with either healthy or unhealthy aging trajectories, which in turn predict survival in a population of older individuals. The work is set to be published in the journal Nature Metabolism.

The research team analyzed gut microbiome, phenotypic and clinical data from over 9,000 people -- between the ages of 18 and 101 years old -- across three independent cohorts. The team focused, in particular, on longitudinal data from a cohort of over 900 community-dwelling older individuals (78-98 years old), allowing them to track health and survival outcomes.

The data showed that gut microbiomes became increasingly unique (i.e. increasingly divergent from others) as individuals aged, starting in mid-to-late adulthood, which corresponded with a steady decline in the abundance of core bacterial genera (e.g. Bacteroides) that tend to be shared across humans.
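The excerpt does not say how "uniqueness" was computed. As a minimal sketch, one way to make the idea of "divergent from others" concrete is to score each sample by its smallest Bray-Curtis dissimilarity to any other sample in the cohort. The choice of metric, the function names, and the toy abundance profiles below are all assumptions for illustration, not details taken from the study.

```python
from typing import Dict, List


def bray_curtis(a: List[float], b: List[float]) -> float:
    """Bray-Curtis dissimilarity between two abundance vectors (0 = identical)."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def uniqueness(profiles: Dict[str, List[float]]) -> Dict[str, float]:
    """Score each sample by its dissimilarity to its nearest neighbour.

    Assumes at least two samples; a higher score means a more unique profile.
    """
    scores = {}
    for name, vec in profiles.items():
        distances = [
            bray_curtis(vec, other)
            for other_name, other in profiles.items()
            if other_name != name
        ]
        scores[name] = min(distances)
    return scores


# Hypothetical relative abundances of four genera in three people.
profiles = {
    "A": [0.6, 0.3, 0.1, 0.0],
    "B": [0.5, 0.4, 0.1, 0.0],
    "C": [0.1, 0.1, 0.1, 0.7],
}
scores = uniqueness(profiles)
for name, score in scores.items():
    print(name, round(score, 3))
```

In this toy cohort, samples A and B resemble each other and score low, while C shares little with either and scores high. That mirrors the pattern described above, where a decline in shared core genera goes together with rising uniqueness.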

Strikingly, while microbiomes became increasingly unique to each individual in healthy aging, the metabolic functions the microbiomes were carrying out shared common traits. This gut uniqueness signature was highly correlated with several microbially-derived metabolites in blood plasma, including one -- tryptophan-derived indole -- that has previously been shown to extend lifespan in mice. Blood levels of another metabolite -- phenylacetylglutamine -- showed the strongest association with uniqueness, and prior work has shown that this metabolite is indeed highly elevated in the blood of centenarians.

"This uniqueness signature can predict patient survival in the latest decades of life," said ISB Research Scientist Dr. Tomasz Wilmanski, who led the study. Healthy individuals around 80 years of age showed continued microbial drift toward a unique compositional state, but this drift was absent in less healthy individuals.

"Interestingly, this uniqueness pattern appears to start in mid-life -- 40-50 years old -- and is associated with a clear blood metabolomic signature, suggesting that these microbiome changes may not simply be diagnostic of healthy aging, but that they may also contribute directly to health as we age," Wilmanski said. For example, indoles are known to reduce inflammation in the gut, and chronic inflammation is thought to be a major driver in the progression of aging-related morbidities.

"Prior results in microbiome-aging research appear inconsistent, with some reports showing a decline in core gut genera in centenarian populations, while others show relative stability of the microbiome up until the onset of aging-related declines in health," said microbiome specialist Dr. Sean Gibbons, co-corresponding author of the paper. "Our work, which is the first to incorporate a detailed analysis of health and survival, may resolve these inconsistencies. Specifically, we show two distinct aging trajectories: 1) a decline in core microbes and an accompanying rise in uniqueness in healthier individuals, consistent with prior results in community-dwelling centenarians, and 2) the maintenance of core microbes in less healthy individuals."

This analysis highlights the fact that the adult gut microbiome continues to develop with advanced age in healthy individuals, but not in unhealthy ones, and that microbiome compositions associated with health in early-to-mid adulthood may not be compatible with health in late adulthood.

"This is exciting work that we think will have major clinical implications for monitoring and modifying gut microbiome health throughout a person's life," said ISB Professor Dr. Nathan Price, co-corresponding author of the paper.

This research project was conducted by ISB and collaborators from Oregon Health and Science University, University of California San Diego, University of Pittsburgh, University of California Davis, Lifestyle Medicine Institute, and University of Washington. It was supported in part by a Catalyst Award in Healthy Longevity from the National Academy of Medicine, and the Longevity Consortium of the National Institute on Aging.

Impact of prematurity and nutrition on the developing gut microbiome and preterm infant growth

Background: Identification of factors that influence the neonatal gut microbiome is urgently needed to guide clinical practices that support growth of healthy preterm infants. Here, we examined the influence of nutrition and common practices on the gut microbiota and growth in a cohort of preterm infants.

Results: With weekly gut microbiota samples spanning postmenstrual age (PMA) 24 to 46 weeks, we developed two models to test associations between the microbiota, nutrition and growth: a categorical model with three successive microbiota phases (P1, P2, and P3) and a model with two periods (early and late PMA) defined by microbiota composition and PMA, respectively. The more significant associations with phase led us to use a phase-based framework for the majority of our analyses. Phase transitions were characterized by rapid shifts in the microbiota, with transition out of P1 occurring nearly simultaneously with the change from meconium to normal stool. The rate of phase progression was positively associated with gestational age at birth, and delayed transition to a P3 microbiota was associated with growth failure. We found distinct bacterial metabolic functions in P1-3 and significant associations between nutrition, microbiota phase, and infant growth.

Conclusion: The phase-dependent impact of nutrition on infant growth along with phase-specific metabolic functions suggests a pioneering potential for improving growth outcomes by tailoring nutrient intake to microbiota phase.

Keywords: Gut microbiota Infant growth Meconium Nutrition Phase transition Preterm infants.

Conflict of interest statement

Ethics approval and consent to participate

Written informed consent was obtained from a parent or guardian of all participating infants. The institutional review board at the University of Rochester School of Medicine and Strong Memorial Hospital approved the study.

Consent for publication
Competing interests

The authors declare that they have no competing interests.


Overview of the preterm infant gut microbiota phases and properties. a The decision…

Temporal distribution of gut microbiota phases, change in infant weight and meconium clearance.…

Current Diet Assessment Practices and Their Limitations in Diet-Microbiome Studies

The advent and increasing availability and affordability of sequencing technology have resulted in an explosion of diet-microbiome literature. This is easily illustrated with a PubMed search of “diet” and “microbiome.” In 2009 there were 100 papers published and in 2019 there were 2,204 papers published that were identified using these search terms. As of January 2020, there are 9,544 papers that are returned on PubMed using these terms, and of those over half were published in the last 3 years. This increase in publications has been accompanied by a growing awareness of the limitations we face while attempting to measure and analyze the highly complex interactions between microbes, dietary exposures, and host phenotypes.

Measurement of dietary intake remains particularly challenging. While the methods for collection and analysis of microbiome data have improved over the past decade, there has been little change in the analysis and collection of dietary data with many human studies relying on food frequency questionnaires (FFQs) or self-administered single day food records or 24-h dietary recalls. Each of these methods is prone to reporting errors and is associated with advantages and disadvantages, as has been reviewed extensively elsewhere (35). Importantly, while certain dietary assessment approaches may be adequate for estimating total caloric intake, dietary diversity, or the intake of certain foods and food categories, they may not capture the level of detail needed to discover relationships between diet and the gut microbiome. Although improved methods for the collection and assessment of diet in microbiome studies are desperately needed to address these issues, the development and adoption of new dietary assessment techniques will take time. In the meantime, careful consideration of key variables when planning studies and integrating dietary data with microbiome outcomes is recommended.

Accurately measuring and assessing dietary intake using self-report and nutritional biomarkers is challenging (36, 37). When choosing a dietary assessment technique for a microbiome study the method selected will ultimately impact the research questions that can be answered using the data. Like most research decisions, to assess diet it is necessary to weigh competing options in terms of time, quality, and cost. In this case, time includes both investigator time for collection and review, and participant response time and burden due to effort expended to record diet. Participant time is influenced by the timeframe of dietary record keeping. Low-burden dietary assessments include the administration of a single 24-h recall or record or a single FFQ to capture dietary history over the preceding weeks, months, or years. More time-intensive longitudinal dietary assessments, using detailed daily records over an extended period of weeks, months, or even years have a high participant burden. Quality relates to how closely the data from the dietary assessment accurately captures and reflects actual dietary intake, and how well the data capture the level of detail necessary for microbiome-related outcomes such as the inclusion of microbially-important diet-derived chemicals in nutrient databases (38). Cost is also multifactorial, including the cost of any nutritional software or physical measurement collection devices like scales and measuring cups, to the cost of trained personnel to collect or enter daily 24-h recalls, and the cost needed to pay participants to encourage participation and complete record keeping. Typically, researchers can only pick two of these three competing interests—time, quality, or cost. Therefore, researchers usually choose to minimize participant time burden and overall cost at the expense of quality. 
What this means in practice is that research teams frequently use FFQs or dietary screeners instead of more time- and cost-intensive methods like multi-day diet records or recalls administered by trained personnel.

Dietary measurement by FFQ has both advantages and disadvantages (39, 40). The primary advantage is that FFQs are convenient. They are very easy to administer and take less participant time than other methods. A study participant simply indicates how often they consume specific foods and how much they consume, and these answers are used to estimate total caloric intake, as well as the intake of the major macronutrients and micronutrients. However, because FFQs were developed to quantify broad dietary patterns or indices of healthy eating, they are limited in a number of ways and cannot capture diet as accurately as other methods (41). While far from optimal, dietary patterns estimated by FFQ have provided some insight into the way habitual diet contributes to microbiome composition: long-term FFQ-determined nutrient intake has been associated with microbial composition, and relationships have been found between FFQ-determined dietary patterns and the abundance of microbial genera (7, 42). These findings have been supported by research showing that changing dietary patterns, either experimentally or through natural experiments that take advantage of seasonal eating habits or immigration, affect microbiome composition (45). However, most FFQs are not designed to provide data that can link specific foods to specific changes in microbial species composition or functional pathways.
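The way FFQ answers turn into an intake estimate can be sketched as frequency × portion size × nutrient content per serving, summed over foods. The snippet below is only an illustration under stated assumptions: the frequency categories, their servings-per-day conversions, and the kcal values are hypothetical and are not drawn from any real questionnaire or food composition database.

```python
# Hypothetical conversion of FFQ frequency categories to servings per day.
FREQ_PER_DAY = {
    "never": 0.0,
    "1 per month": 1 / 30,
    "1 per week": 1 / 7,
    "1 per day": 1.0,
    "2+ per day": 2.5,
}

# Hypothetical energy content (kcal) of one standard serving of each food.
KCAL_PER_SERVING = {
    "whole-wheat bread": 80.0,
    "yogurt": 150.0,
    "apple": 95.0,
}


def estimate_daily_kcal(responses):
    """Estimate daily energy intake from FFQ answers.

    responses maps a food name to (frequency category, portion multiplier),
    where the multiplier scales the standard serving (0.5 = small, 1.5 = large).
    """
    total = 0.0
    for food, (frequency, portion) in responses.items():
        total += FREQ_PER_DAY[frequency] * portion * KCAL_PER_SERVING[food]
    return total


responses = {
    "whole-wheat bread": ("2+ per day", 1.0),
    "yogurt": ("1 per day", 1.0),
    "apple": ("1 per week", 1.0),
}
daily_kcal = estimate_daily_kcal(responses)
print(f"Estimated intake: {daily_kcal:.1f} kcal/day")
```

The sketch also makes the limitation visible: every food on the list is reduced to one fixed composition per serving, so any detail that the questionnaire's food list does not distinguish is lost before analysis begins.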

Despite the identification of some signals from microbiome studies that use FFQs for dietary analysis, the technique is simply not specific enough to untangle the complex relationships between foods and the microbiome. We have shown previously that microbiome composition more closely covaries with food intake, not nutrient intake (6), indicating that reliance on existing nutrient composition variables is insufficient and that foods themselves are important when exploring diet-microbiome covariation. However, even well-conducted 24-h recalls and food records fail to sufficiently capture the complexity of foods in ways that are meaningful to microbes. For example, one brand of bread may include 7 different whole grains as well as nuts and seeds, while another brand may be made with sprouted wheat. During diet entry both of these breads may be coded as whole-wheat bread and therefore downstream analysis will treat them as identical when they are different. This variation could contribute to some of the personalized diet-microbiome results that have been reported with respect to blood glucose response (16, 19).
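The bread example can be made concrete with a small sketch of the coding step. The coding table and the product descriptions here are hypothetical; the point is only that two different diets become indistinguishable once both breads map to the same database entry.

```python
from collections import Counter

# Hypothetical coding table: what the participant ate -> database food code.
CODING_TABLE = {
    "seven-grain bread with nuts and seeds": "whole-wheat bread",
    "sprouted wheat bread": "whole-wheat bread",
    "plain yogurt": "yogurt",
}


def code_records(items):
    """Map reported foods to database codes and count each code."""
    return Counter(CODING_TABLE[item] for item in items)


participant_1 = ["seven-grain bread with nuts and seeds", "plain yogurt"]
participant_2 = ["sprouted wheat bread", "plain yogurt"]

coded_1 = code_records(participant_1)
coded_2 = code_records(participant_2)

# The raw records differ, but the coded records are identical.
print(set(participant_1) == set(participant_2))
print(coded_1 == coded_2)
```

Any downstream analysis that works from the coded records cannot attribute a microbiome difference between these two participants to their bread, because that difference was removed at data entry.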

In instances where researchers recognize the need to collect more finely resolved dietary information, it is common in the microbiome literature to read about techniques that use researcher-developed food questionnaires, independently created app-based collection methods (16, 48), and consumer-facing tools for nutrient analysis rather than using validated techniques for dietary collection or nutrient analysis tools developed with a research focus (23). When collecting 24-h recalls or asking participants for 24-h food records, inclusion of properly trained staff can improve data quality. At a minimum, participants recording their intake should be trained using detailed examples showing the level of detail necessary to complete an accurate record. This should include a discussion of serving sizes, ingredient specificity, preparation methods, and the inclusion of commonly forgotten foods and additives. All records should be reviewed with the participant with a focus on identifying misreported or commonly forgotten foods. Methods for dietary intake assessment are improving and technology using computerized recalls and records can greatly reduce researcher burden when collecting dietary data. Regardless of collection method, when records or recalls are coded for analysis using dietary research software, consideration for consistency with data entry, coding, and cleaning of dietary data is important to allow for robust analysis of nutrient composition and food intake downstream.

In the current dietary record collection environment, different dietary collection tools and FFQs rely on different underlying databases, which makes comparison across studies and cohorts difficult, if not almost impossible. Even within the English speaking regions of the Americas, United Kingdom, and Australasia, food records are ultimately mapped back to different databases depending on the software tools used for analysis (49). Each database provides nutrient composition data, but the source of that data varies. The analysis methods for specific nutrients can also vary by database which leads to different nutrient level outcomes in databases from different countries. Beyond nutrient composition, the naming conventions used to identify foods are not identical across databases leading to issues when comparing data collected using different tools. Additionally, food-grouping structures vary, with no universally accepted method adopted across studies, or databases. Efforts to establish a shared food ontology that can harmonize dietary data collected in different global regions and by different tools are in progress, incorporating other food features such as preparation method (50). Food preparation and cooking methods alter the chemical properties of foods, such as the changes that occur to sweet potatoes during cooking, that then impact the effects of that food on the microbiome (51), adding another layer of complexity to measurement of diet. The food matrix (52) is likely to play an important role in the relationship between diet and the microbiome and should also be considered. Specific information such as the ripeness of fruits is not currently captured in any food databases. However, this level of detail may be highly relevant for certain food-microbe interactions. 
For example, it is known that when bananas are unripe or green the starch contained within them is resistant starch (53), which is fermentable by microbes, whereas in ripe bananas that resistant starch has broken down into simpler starch and glucose molecules, which are absorbable by the host, and no longer provide any fermentable substrate for the gut microbes. Beyond food preparation and ripeness, recent research addressing eating behaviors around ultra-processed foods shows that controlling for energy and macronutrient content alone may be insufficient for dietary interventions (54). Food sourcing, processing or cooking methods (51), additives or emulsifiers (55, 56), artificial sweeteners (57), and conventional or organic farming methods (58) likely also need to be taken into consideration.

In addition to the microbiome changes induced by the biochemical components of foods, foods themselves contain bacteria that affect the gut microbiome. From a health perspective, fermented dairy products such as yogurt and cheeses are the most commonly recognized foods that contain “beneficial” microbes (59). These foods are sources of microbes that can transiently populate the human gut (23). Fresh, non-fermented foods have long been recognized as a source of food-borne pathogens and are the target of public health interventions to prevent the spread of food-borne disease. Despite this recognition, we know surprisingly little about the microbial composition in other non-fermented foods. Recently, crops that are not usually considered to transfer bacteria, such as apples, have been shown to harbor a microbiome that depends on growth and farming practices (58). Indeed, dietary patterns that include more food types, particularly fermented foods like yogurt, contain a higher abundance of microbes relative to less diverse diets (1 × 10^9 colony forming units [CFU] vs. 1 × 10^6 CFU in a day's worth of meals) (60). Consideration for the microbial load of a dietary pattern is important because the engraftment of non-pathogenic food-borne bacteria depends, in part, on other dietary components. For example, higher abundances of parmesan-cheese-associated bacteria are present after consumption of milk products (61). Ideally, we need to consider these microbial features of diet when planning and analyzing microbiome-diet studies.

Advanced technologies for food image recognition in nutrient intake assessment

Dietary assessment is a crucial step in the real-world deployment of any personalised nutrition programme. The ability of an individual to track their food intake plays a role in self-monitoring as a critical aspect of behaviour change (88). It can also provide a professional dietitian with information on how a client is adhering to their individualised meal plan. However, assessing dietary intake with traditional methods carries considerable costs and burden to the individual. Such methods are also prone to errors as they often rely on self-reporting (89).

Advanced solutions are needed to objectively quantify food and beverage intake (90). Food image recognition is a promising strategy because most individuals own a smartphone with a camera, so the barrier to entry is low, and it can reach a large population. However, automatically recognising food items from images is a challenging computer vision problem due to a variety of issues: (1) foods are typically deformable objects, (2) foods can lose their visual information during preparation, (3) different foods can appear visually similar, (4) the same food can appear differently depending on the lighting or angle and (5) beverages offer only a limited amount of visual information (91). Only after foods are visually recognised can they be reliably linked to a food composition database (92).

The introduction of the Pittsburgh Fast-Food Image Dataset in 2009 facilitated early research in this area based on manual recognition methods, but these approaches mostly achieved only 10–40 % classification accuracy (93, 94). In 2014, deep learning was first used to recognise food images. Deep learning, or deep neural networks, allows computational models composed of multiple processing layers to learn relevant image features through training on a set of input images (95, 96). Deep convolutional neural networks are inspired by the visual system of animals, where individual neurons assess the visual input by reacting to overlapping regions in the visual field (97). Because they can classify each pixel of the image, they can recognise any number of items, along with their location and size, allowing for food volume and food weight estimation. This approach has achieved substantially better results than other methods, resulting in an increased focus on deep learning in recent research (98, 99).

A novel deep learning architecture for food image recognition, called NutriNet, has been developed by Mezgec and Koroušić Seljak (91). It is a modification of the well-known AlexNet architecture (100), with increased image size and an additional convolutional layer at the beginning of the neural network (91). NutriNet was first trained on 225 953 freely available images that were downloaded from the Internet and organised into appropriate food classes (520 unique food items). When tested against three popular deep learning architectures of the time (AlexNet (100), GoogLeNet (101) and ResNet (102)), NutriNet was found to be superior to AlexNet and GoogLeNet and faster to train than all three of the other architectures. The deep learning approach was then used to recognise any number of items in a single food image using a training set obtained from the ‘fake food buffet’ (103), which is visually similar to real food. Fully convolutional networks, introduced by Long et al. (104), were applied to perform semantic segmentation, partitioning the image into logical parts and classifying each part on a pixel level. Due to the complexity of food images, a fully convolutional network variant that can segment images at the finest grain (fully convolutional network-8s) (104) was used to train a model on the fake food buffet image data set. Output predictions of the trained model were compared with the ground-truth labels using the pixel accuracy measure (104), and the final accuracy of the trained fully convolutional network-8s model was 92·18 %.
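Pixel accuracy, as the term is generally used in semantic segmentation, is the share of pixels whose predicted class matches the ground-truth class. The following is a minimal sketch of that calculation on a tiny hand-made label map; the class ids and the 3 × 4 "image" are hypothetical, and nothing here reproduces the evaluation code of the cited work.

```python
def pixel_accuracy(predicted, ground_truth):
    """Fraction of pixels whose predicted class equals the ground-truth class.

    Both arguments are 2-D lists of integer class ids with the same shape.
    """
    if len(predicted) != len(ground_truth):
        raise ValueError("label maps must have the same number of rows")
    correct = 0
    total = 0
    for pred_row, true_row in zip(predicted, ground_truth):
        if len(pred_row) != len(true_row):
            raise ValueError("label maps must have the same row lengths")
        for pred, true in zip(pred_row, true_row):
            correct += int(pred == true)
            total += 1
    if total == 0:
        raise ValueError("label maps must not be empty")
    return correct / total


# Hypothetical class ids: 0 = background, 1 = bread, 2 = salad.
ground_truth = [
    [0, 0, 1, 1],
    [0, 2, 2, 1],
    [0, 2, 2, 1],
]
predicted = [
    [0, 0, 1, 1],
    [0, 2, 1, 1],
    [0, 0, 2, 1],
]
accuracy = pixel_accuracy(predicted, ground_truth)
print(f"{accuracy:.2%}")
```

Because every pixel counts equally, large regions such as background dominate the score, which is worth keeping in mind when comparing reported accuracies across data sets with different food-to-background ratios.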

In recent years, deep learning has been validated numerous times as a suitable solution for recognising food images (98). Availability of food image data sets has been improving (105–107), although there is a need for validation against data sets from different regions across the world. Future work will focus on real-world food images, which exhibit more variance compared with the test images used in the research environment. In the future, such technology could be used to improve dietary assessment in clinical trials. For example, it can play a role in human studies in the areas mentioned above, where accurate quantification of folate, vitamin B12 and energetic intake is critical. Consumer wellness apps can also apply this solution in the future, improving the dietary assessment of individuals and facilitating self-monitoring towards positive behaviour change.

Genetics or lifestyle: What is it that shapes our microbiome?

The question of nature vs nurture extends to our microbiome -- the personal complement of mostly-friendly bacteria we carry around with us. Study after study has found that our microbiome affects nearly every aspect of our health, and its microbial composition, which varies from individual to individual, may hold the key to everything from weight gain to moods. Some microbiome researchers had suggested that this variation begins with differences in our genes, but a large-scale study conducted at the Weizmann Institute of Science challenges this idea and provides evidence that the connection between microbiome and health may be even more important than we thought.

Indeed, the working hypothesis has been that genetics plays a major role in determining microbiome variation among people. According to this view, our genes determine the environment our microbiome occupies, and each particular environment allows certain bacterial strains to thrive. However, the Weizmann researchers were surprised to discover that the host's genetics play a very minor role in determining microbiome composition -- only accounting for about 2% of the variation between populations.

The research was led by research students Daphna Rothschild, Dr. Omer Weissbrod and Dr. Elad Barkan from the lab of Prof. Eran Segal of the Computer Science and Applied Mathematics Department, together with members of Prof. Eran Elinav's group of the Immunology Department, all at the Weizmann Institute of Science. Their findings, which were recently published in Nature, were based on a unique database of around 1,000 Israelis who had participated in a longitudinal study of personalized nutrition. Israel has a highly diverse population, which presents an ideal experimental setting for investigating the effects of genetic differences. In addition to genetic data and microbiome composition, the information collected for each study participant included dietary habits, lifestyle, medications and additional measurements. The scientists analyzing this data concluded that diet and lifestyle are by far the most dominant factors shaping our microbiome composition.

If microbiome populations are not shaped by our genetics, how do they nonetheless interact with our genes to modify our health? The scientists investigated the connections between microbiome and the measurements in the database of cholesterol, weight, blood glucose levels, and other clinical parameters. The study results were very surprising: For most of these clinical measures, the association with bacterial genomes was at least as strong as, and in some cases stronger than, the association with the host's human genome.

According to the scientists, these findings provide solid evidence that understanding the factors that shape our microbiome may be key to understanding and treating many common health problems.

Segal: "We cannot change our genes, but we now know that we can affect -- and even reshape -- the composition of the different kinds of bacteria we host in our bodies. So the findings of our research are quite hopeful; they suggest that our microbiome could be a powerful means for improving our health."

The field of microbiome research is relatively young; the database of 1,000 individuals collected at the Weizmann Institute is one of the most extensive in the world. Segal and Elinav believe that over time, with the further addition of data to their study and those of others, these recent findings may be further validated, and the connection between our microbiome, our genetics and our health will become clearer.



Clinical methods

All study procedures were approved by the University of Rochester School of Medicine Internal Review Board (IRB) (Protocol # 37933). Infants included in the study were from the multicenter Prematurity and Respiratory Outcomes Program (PROP) and the Respiratory Pathogens Research Center (RPRC) at the University of Rochester School of Medicine and were cared for in a single-center Newborn Intensive Care Unit (NICU). Clinical care in terms of type and duration of antibiotic treatment, corticosteroids, diuretics, motility agents, and H2 receptor antagonists, as well as the timing and volume of feeds, was at the discretion of treating physicians. Rectal swabs were used to collect fecal material from consented infants: for preterm infants, from 24 weeks postmenstrual age (PMA) until discharge and again at 6 months and 1 year; for full-term infants, at birth and 1 month. Each sample was collected by inserting a sterile Copan flocked nylon swab (Copan Diagnostics, Murrieta, CA) moistened with normal saline beyond the sphincters into the rectum and then twirling it. Each sample was immediately placed into sterile buffered saline and stored at 4 °C for no more than 4 h. Samples were processed daily: fecal material was extracted from the swab in a sterile environment and immediately frozen at −80 °C until DNA extraction. All sampling swabs, plasticware, buffers, and reagents used for sample collection and extraction of nucleic acids were sterile and UV-irradiated to ensure no contamination from sources outside of the infant and sample.

Derived medication and nutrition variables

For all medications considered, binary variables were derived for each sample that indicate whether or not a given medication was administered in the week (7 days) prior to sample collection. Weight Z-score was computed as a proxy for growth. First, weight percentile was computed as the percentage of weight measures of a population of the same sex and age that fall below the observed weight value. We applied Cole’s LMS method as used by CDC and WHO [54]. The standard growth chart is based on sex-matched premature infant population weight data collected by Fenton and Kim [55, 56]. Weight Z-scores were computed based on the corresponding weight percentiles. Four variables associated with each sample were derived for nutritional intake: total calories per kilogram in the week prior to sample collection, the ratio of lipids and the ratio of proteins in the week prior to sample collection, and the ratio of total calories in the week prior to sample collection that were consumed enterally (as opposed to parenterally). These values were computed based on detailed daily feeding records and the available nutrition facts for all formulas, supplements, and total parenteral nutrient preparations used in the NICU. Total calories per kilogram in the past week is the sum of total calories per kilogram per day for the 7 days prior to sampling. The ratio of lipids (or proteins) was computed as (grams of lipids or protein per kilogram) divided by (total calories per kilogram) for each day, summed over the 7 days prior to sampling. “Enteral calorie ratio past week” was computed as the total calories per kilogram consumed enterally in the week prior to sampling divided by the total calories per kilogram consumed (enterally and parenterally) in the same period.
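The derived variables above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the study's code: the function names and input layouts are hypothetical, the 7-day window is assumed to exclude the sampling day itself, and the LMS function uses the standard Cole transformation, which yields a Z-score directly from the reference values L, M and S (the text instead derives Z-scores from percentiles).

```python
import math


def medication_flag(admin_days, sample_day, window=7):
    """Return 1 if the medication was given on any of the `window` days
    before `sample_day` (sampling day excluded, an assumption), else 0."""
    return int(any(sample_day - window <= day < sample_day for day in admin_days))


def enteral_calorie_ratio(daily_records):
    """daily_records: (enteral_kcal_per_kg, parenteral_kcal_per_kg) pairs
    for the 7 days before sampling. Returns enteral / total calories."""
    enteral = sum(e for e, _ in daily_records)
    total = sum(e + p for e, p in daily_records)
    return enteral / total if total > 0 else float("nan")


def lms_z_score(weight, L, M, S):
    """Cole's LMS transformation: Z-score of `weight` given the reference
    Box-Cox power L, median M and coefficient of variation S."""
    if L == 0:
        return math.log(weight / M) / S
    return ((weight / M) ** L - 1) / (L * S)


# Hypothetical inputs: medication given on days of life 3 and 10, sample on day 12.
flag = medication_flag([3, 10], sample_day=12)
ratio = enteral_calorie_ratio([(80.0, 20.0)] * 7)
```

In practice the L, M and S values would come from the sex- and age-matched reference chart cited in the text.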

Genomic DNA extraction

Total genomic DNA was extracted with a modified method using the QIAGEN Fecal DNA kit and FastPrep mechanical lysis (MPBio, Solon, OH). 16S ribosomal RNA (rRNA) was amplified with Phusion High-Fidelity polymerase (Thermo Scientific, Waltham, MA) and dual indexed primers specific to the V3-V4 hypervariable regions (319F: 5′ ACTCCTACGGGAGGCAGCAG 3′; 806R: 5′ GGACTACHVGGGTWTCTAAT 3′) [57]. Amplicons were pooled and paired-end sequenced on an Illumina MiSeq (Illumina, San Diego, CA) in the University of Rochester Genomics Research Center. Each sequencing run included (1) positive controls consisting of a 1:5 mixture of Staphylococcus aureus, Lactococcus lactis, Porphyromonas gingivalis, Streptococcus mutans, and Escherichia coli and (2) negative controls consisting of sterile saline.

16S rRNA sequence processing

Raw data from the Illumina MiSeq was first converted into FASTQ format 2 × 300 paired-end sequence files using the bcl2fastq program, version 1.8.4, provided by Illumina. Format conversion was performed without de-multiplexing and the EAMMS algorithm was disabled. All other settings were default. Sequence processing and microbial composition analysis were performed with the Quantitative Insights into Microbial Ecology (QIIME) software package [58], version 1.9. Reads were multiplexed using a configuration described previously [57]. Briefly, for both reads in a pair, the first 12 bases were a barcode, which was followed by a primer, then a heterogeneity spacer, and then the target 16S rRNA sequence. Using a custom Python script, the barcodes from each read pair were removed, concatenated together, and stored in a separate file. Read pairs were assembled using fastq-join from the ea-utils package, requiring at least 40 bases of overlap and allowing a maximum of 10% mismatched bases. Read pairs that could not be assembled were discarded. The concatenated barcode sequences were prepended to the corresponding assembled reads, and the resulting sequences were converted from FASTQ to FASTA and QUAL files for QIIME analysis. Barcodes, forward primer, spacer, and reverse primer sequences were removed during de-multiplexing. Reads containing more than four mismatches to the known primer sequences or more than three mismatches to all barcode sequences were excluded from subsequent processing and analysis. Assembled reads were truncated at the beginning of the first 30-base window with a mean Phred quality score of less than 20 or at the first ambiguous base, whichever came first. Resulting sequences shorter than 300 bases or containing a homopolymer longer than six bases were discarded.
Operational taxonomic units (OTUs) were picked using the reference-based USEARCH (version 5.2) [59] pipeline in QIIME, using the May 2013 release of the GreenGenes 99% OTU database as a closed reference [60, 61]. An indexed word length of 128 and otherwise default parameters were used with USEARCH. Chimera detection was performed de novo with UCHIME, using default parameters [59]. OTU clusters with fewer than four sequences were removed, and representative sequences used to make taxonomic assignments for each cluster were selected on the basis of abundance. The RDP Naïve Bayesian Classifier was used for taxonomic classification with the GreenGenes reference database, using a minimum confidence threshold of 0.85 and otherwise default parameters [62]. Phylogenetic investigation of communities by reconstruction of unobserved states (PICRUSt) [63] was used with the provided pre-processed KEGG Orthologs database to infer the putative functional capacities of these communities.
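The read truncation and length/homopolymer filters described above were applied within QIIME. The Python sketch below only illustrates the stated rules and is not the pipeline's implementation; the function names are hypothetical, and an "ambiguous base" is assumed to mean any character other than A, C, G or T.

```python
def truncate_read(seq, quals, window=30, min_mean_q=20):
    """Cut the read at the start of the first `window`-base stretch whose mean
    Phred score is below `min_mean_q`, or at the first ambiguous base,
    whichever comes first."""
    cut = len(seq)
    for start in range(len(seq) - window + 1):
        if sum(quals[start:start + window]) / window < min_mean_q:
            cut = start
            break
    for pos, base in enumerate(seq):
        if base.upper() not in "ACGT":  # assumption: anything else is ambiguous
            cut = min(cut, pos)
            break
    return seq[:cut]


def longest_homopolymer(seq):
    """Length of the longest run of one repeated base."""
    longest = run = 0
    previous = None
    for base in seq:
        run = run + 1 if base == previous else 1
        previous = base
        longest = max(longest, run)
    return longest


def passes_filters(seq, min_length=300, max_homopolymer=6):
    """Keep sequences of at least 300 bases with no homopolymer longer than six."""
    return len(seq) >= min_length and longest_homopolymer(seq) <= max_homopolymer


# Hypothetical 400-base read whose last 50 bases have low quality.
read = "ACGT" * 100
scores = [35] * 350 + [10] * 50
trimmed = truncate_read(read, scores)
```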

16S rRNA microbiota data pre-processing

To ensure the quality of statistical analysis, microbiome samples with < 12,000 total reads were excluded from the subsequent data analyses. Microbiota abundance data were summarized at six different levels (level 2: phylum, through level 7: species). For characterization of the microbiota phases and within-phase abundance analyses, raw relative abundance values were used. For beta diversity calculations, normalization by rarefaction at a depth of 12,000 reads was performed. For longitudinal abundance analyses, at each taxonomic level we excluded taxa with zero reads in 98% or more of the 705 samples. In total, 140 genera and 198 species were used for these statistical analyses. The abundance data were log2 transformed (log2(x + 1)) following normalization by cumulative sum scaling [64].
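As a rough illustration of the sparsity filter and the log2(x + 1) transform, consider the sketch below. It deliberately omits cumulative sum scaling, which the study performed before the transform, and the data layout (a dictionary mapping each taxon to its per-sample values) is an assumption made for the example.

```python
import math


def drop_sparse_taxa(counts, max_zero_fraction=0.98):
    """Drop taxa whose reads are exactly zero in `max_zero_fraction` or more
    of the samples. `counts` maps taxon name -> list of per-sample counts."""
    kept = {}
    for taxon, values in counts.items():
        zero_fraction = sum(1 for v in values if v == 0) / len(values)
        if zero_fraction < max_zero_fraction:
            kept[taxon] = values
    return kept


def log2_transform(values):
    """Apply log2(x + 1) to already-normalized abundance values."""
    return [math.log2(v + 1) for v in values]


# Hypothetical table with 100 samples: taxon_a is zero in 99% of them.
table = {
    "taxon_a": [0] * 99 + [5],
    "taxon_b": [3] * 50 + [0] * 50,
}
filtered = drop_sparse_taxa(table)
```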

Description of decision tree logic to define microbiota phases

Drawing on the microbial dysbiosis index described by Gevers et al. [65], the first step in the decision tree is to compute and evaluate the log of (total abundance of the classes increased in prematurity (Bacilli + Gammaproteobacteria)) over (total abundance of the class decreased in prematurity (Clostridia)). If this value is less than or equal to two, the gut microbiota is defined as being in phase 3 (P3). If the result of the first step in the tree is greater than two, a second step (the P1|P2 branch) is taken, where we compute and evaluate the log of (total abundance of the class increased in extreme prematurity (Bacilli)) over (total abundance of the class decreased in extreme prematurity (Gammaproteobacteria)). If the resulting value is less than or equal to two, the gut microbiota is defined as being in phase 2 (P2); otherwise, it is defined as being in phase 1 (P1). If the first ratio is non-computable because Clostridia is entirely absent, the P1|P2 branch is taken. If the P1|P2 branch is taken and Gammaproteobacteria is absent, the microbiota is defined as being in P1; if the P1|P2 branch is taken and Bacilli is absent, the microbiota is defined as being in P2. If two of the three classes are absent, the microbiota is defined as being in the phase characterized by the class that is present. No samples were entirely devoid of all three classes, but such a case could not be resolved within this framework. Dirichlet multinomial mixture (DMM) modeling for comparative purposes was performed using the DirichletMultinomial R package, which is based on Holmes et al. [66]. Class-level composition was used, and per-sample normalization was performed by converting relative abundances to counts summing to 12,000 (the minimum read threshold for inclusion in analysis). The dmn function was used with default parameters and an arbitrary seed value of 11; count data were fit to one through ten Dirichlet components, and model fit was estimated using the Laplace metric.
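The decision tree can be expressed compactly in code. The following Python sketch is one reading of the rules above, not the authors' implementation: the function name is hypothetical, the base of the logarithm is not stated in the text and is assumed here to be the natural log (pass a different `log` function to change it), and a sample lacking all three classes returns None because the text says it cannot be resolved.

```python
import math


def assign_phase(bacilli, gammaproteobacteria, clostridia,
                 threshold=2.0, log=math.log):
    """Return 1, 2 or 3 for the microbiota phase, or None when all three
    classes are absent. Inputs are class-level abundances."""
    if bacilli == 0 and gammaproteobacteria == 0 and clostridia == 0:
        return None  # unresolved within this framework

    def log_ratio(numerator, denominator):
        if denominator == 0:
            return math.inf    # absent denominator: treat the ratio as above threshold
        if numerator == 0:
            return -math.inf   # absent numerator: treat the ratio as below threshold
        return log(numerator / denominator)

    # Step 1: (Bacilli + Gammaproteobacteria) over Clostridia.
    if log_ratio(bacilli + gammaproteobacteria, clostridia) <= threshold:
        return 3
    # Step 2 (the P1|P2 branch): Bacilli over Gammaproteobacteria.
    if log_ratio(bacilli, gammaproteobacteria) <= threshold:
        return 2
    return 1


# Hypothetical relative abundances for three samples.
phases = [
    assign_phase(0.90, 0.05, 0.0),    # Bacilli-dominated, no Clostridia
    assign_phase(0.10, 0.80, 0.001),  # Gammaproteobacteria-dominated
    assign_phase(0.20, 0.20, 0.60),   # Clostridia well established
]
```

Mapping an absent class to an infinite log ratio reproduces the special cases in the text: a missing Clostridia forces the P1|P2 branch, a missing Gammaproteobacteria gives P1, and a missing Bacilli gives P2.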

Functional capacity of microbiota phases

The functional capacity of the microbiota present in each sample was inferred using PICRUSt (Phylogenetic Investigation of Communities by Reconstruction of Unobserved States) [63], which reconstructs the functional composition of a microbial community sample using 16S rRNA phylogeny and a database of annotated reference genomes. For each functional pathway from the Kyoto Encyclopedia of Genes and Genomes (KEGG) that was putatively identified, comparisons were made between the phases using LEfSe, which identifies features that are statistically differentially abundant among biological classes (in this case phases) and then performs comparative tests between pairs of biological classes to identify where these features are significantly enriched or diminished.

Comparing taxonomic composition, functional capacity, and week-to-week dissimilarity between phases

Analysis of variance of taxa abundance at all taxonomic levels across the three phases of the microbiota was conducted using a Kruskal-Wallis test, and the results are summarized in Additional file 4: Table S3. Differential abundance of taxa between each pair of phases was assessed at each taxonomic level using the metagenomeSeq zero-inflated Gaussian test [64], and the results are summarized in Additional file 5: Tables S4A–C. Testing for differential functional capacity between the phases was performed using LEfSe [67] with per-sample normalization to 1 M total counts, minimum effect size of 2.0, alpha of 0.1, an all-against-all strategy, and otherwise default parameters. The results are summarized in Additional file 6: Table S5. An exploratory test of the equality of the median of the week-to-week differences of samples within individual subjects between the cases where the phase remains the same and the cases where the phase changes was performed using the Wilcoxon rank-sum test. The p value reported for this test is approximate due to the paired nature of beta-diversity and the presence of repeated measures from the same subjects.

Transition from meconium to solid stool

The point of stool transition from meconium to normal as described in the text was determined from nurses’ records subjectively characterizing diaper contents when they were changed. These records were available as free text and each entry was time stamped, with one entry for every time a diaper was changed. Stool transition was defined as the first such record without the word meconium that was followed by no more than two records containing the word meconium. To assess the associations between day of life (DOL) of stool transition and day of life of initial transition out of phase 1, a simple linear regression model was used with DOL of transition out of phase 1 and gestational age at birth as covariates, and DOL of stool transition as the outcome. A similar regression model was used to assess the association between growth and time to reach phase 3. The DOL of the first phase 3 sample observed for each subject and their gestational age at birth were used as covariates, and the total change in weight Z-score from birth to discharge was used as the outcome variable. This model included only the 81 subjects who reached phase 3 prior to discharge.
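The stool transition rule lends itself to a short sketch. The Python below assumes the diaper records are available as a time-ordered list of free-text strings, and it interprets "followed by no more than two records containing the word meconium" as a count over all later records; both the function name and that interpretation are assumptions.

```python
def stool_transition_index(records, max_later_meconium=2):
    """Return the index of the first record that does not mention meconium and
    is followed by at most `max_later_meconium` records that do, or None."""
    mentions = ["meconium" in record.lower() for record in records]
    for index, has_meconium in enumerate(mentions):
        later_mentions = sum(mentions[index + 1:])
        if not has_meconium and later_mentions <= max_later_meconium:
            return index
    return None


# Hypothetical, time-ordered diaper notes for one infant.
notes = [
    "large meconium",
    "Meconium, small",
    "green transitional stool",
    "smear of meconium",
    "yellow seedy stool",
    "yellow stool",
]
transition = stool_transition_index(notes)
```

The returned index can then be mapped to the record's time stamp to obtain the day of life of stool transition.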

Determination of early and late time periods

We applied functional principal component analysis to the microbiota abundance data [21]. The estimated temporal abundance function of taxon v and subject i, \( \hat{y}_{i,v}(t) \), was represented by a linear combination of eigen-functions as follows:

\[ \hat{y}_{i,v}(t) = \hat{\mu}_v(t) + \sum_{k=1}^{K_v} c_{ik,v}\, \xi_{k,v}(t) \]

Here, \( \hat{\mu}_v(t) \) is the estimated mean curve for the vth taxon, \( \xi_{k,v}(t) \) is the kth eigen-function for this taxon, \( K_v \) is the number of top eigen-functions needed to explain ≥ 99% of total functional variation, and \( c_{ik,v} \) are the linear coefficients. On average, it takes 2.93 functional principal components to explain ≥ 99% of total variation at the species level. We calculated the total functional variance based on the fitted microbiota abundance at the species level. More specifically, we computed the pointwise variance function for each species from the smoothed temporal curves of abundance at the species level, then took the summation over all species used in this study:

\[ \overline{\sigma^2}(t) = \sum_{v} \frac{1}{n-1} \sum_{i=1}^{n} \left( \hat{y}_{i,v}(t) - \bar{y}_v(t) \right)^2 \]

Here, \( \bar{y}_v(t) \) represents the sample mean abundance function calculated from all n subjects. \( \overline{\sigma^2}(t) \) represents the overall temporal variance at the species level. The maximum of \( \overline{\sigma^2}(t) \) occurred at PMA = 34 weeks (rounded to integers), which is illustrated in Additional file 2: Figure S5. Based on this cutoff, we define the EARLY period of PMA to be (0,34) and the LATE period to be [34,∞). The EARLY interval has 362 data points; the LATE interval has 343 data points.
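The cutoff selection can be illustrated with a small Python sketch: sum the pointwise between-subject variance over species on a shared time grid and take the time at which the sum peaks. The data layout, the function names and the use of the n − 1 sample variance are assumptions for illustration, and the toy curves stand in for the smoothed fits produced by functional principal component analysis.

```python
def total_variance_curve(curves):
    """curves maps species -> list of per-subject fitted curves, each a list of
    abundances on a shared time grid. Returns the pointwise variance across
    subjects, summed over species."""
    n_times = len(next(iter(curves.values()))[0])
    total = [0.0] * n_times
    for subject_curves in curves.values():
        n_subjects = len(subject_curves)
        for t in range(n_times):
            values = [curve[t] for curve in subject_curves]
            mean = sum(values) / n_subjects
            total[t] += sum((v - mean) ** 2 for v in values) / (n_subjects - 1)
    return total


def split_point(time_grid, total_variance):
    """Time at which the summed variance is largest."""
    peak = max(range(len(time_grid)), key=lambda j: total_variance[j])
    return time_grid[peak]


# Toy example: two species, two subjects, PMA grid in weeks (hypothetical values).
pma_grid = [30, 32, 34, 36]
fitted = {
    "species_1": [[1, 1, 1, 1], [1, 2, 5, 2]],
    "species_2": [[0, 0, 0, 0], [0, 1, 2, 1]],
}
variance = total_variance_curve(fitted)
cutoff = split_point(pma_grid, variance)
```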

Association between clinical variables and microbiota abundance in each phase

Within each phase independently, association testing between all taxa and clinical and nutritional factors of interest was performed by regressing the relative abundance of each taxon on these covariates: gestational age at birth, post menstrual age, total calories per kilogram in the past week, ratio of lipids in the past week, ratio of proteins in the past week, ratio of carbohydrates in the past week, proportion of total calories received enterally in the past week, whether antibiotics were received in the past week, whether diuretics were received in the past week, whether corticosteroids were received in the past week, whether motility agents were received in the past week, whether proton pump inhibitors were received in the past week, and whether H2 receptor antagonists were received in the past week. This was done using the MaAsLin algorithm [68] with subject as a random variable, without model selection, and with otherwise default parameters. The results are summarized in Additional file 7: Table S6.

Association between nutrition/medication and growth

We performed linear mixed-effects regression analysis similar to the above model on both early and late periods (Model A) and three phases (Model B) to test the association between the nutrition/medication factors (as covariates) and weight Z-score as a proxy for growth (as the response variable). We included gaBirth (gestational age at birth) and PMA in the model to control for their possible confounding effects. More specifically, the following two linear mixed-effects regressions were performed.
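A schematic form of the two models, reconstructed from the description in this section, is sketched below. Only \( \beta_k \), \( \alpha_i \) and \( \epsilon_{ij} \) are named in the text; the intercept \( \beta_0 \), the response symbol \( Z_i(t_j) \) and the coefficient names \( \beta_{\mathrm{ga}} \), \( \beta_{\mathrm{PMA}} \), \( \gamma \) and \( \delta \) are assumptions introduced for illustration, and the exact set of terms retained after model selection may differ.

```latex
% Model A: developmental stage coded by period (EARLY = 1, LATE = 0)
Z_i(t_j) = \beta_0 + \beta_{\mathrm{ga}}\,\mathrm{gaBirth}_i + \beta_{\mathrm{PMA}}\,\mathrm{PMA}_i(t_j)
         + \gamma\,\mathrm{EARLY}_i(t_j)
         + \sum_{k} \beta_k\,\mathrm{NutriMed}_{(i,k)}(t_j)
         + \sum_{k} \delta_k\,\mathrm{NutriMed}_{(i,k)}(t_j)\,\mathrm{EARLY}_i(t_j)
         + \alpha_i + \epsilon_{ij}

% Model B: developmental stage coded by phase (P3 = baseline; P1 and P2 as binary indicators)
Z_i(t_j) = \beta_0 + \beta_{\mathrm{ga}}\,\mathrm{gaBirth}_i + \beta_{\mathrm{PMA}}\,\mathrm{PMA}_i(t_j)
         + \gamma_1\,\mathrm{P1}_i(t_j) + \gamma_2\,\mathrm{P2}_i(t_j)
         + \sum_{k} \beta_k\,\mathrm{NutriMed}_{(i,k)}(t_j)
         + \sum_{k} \bigl[ \delta_{1k}\,\mathrm{P1}_i(t_j) + \delta_{2k}\,\mathrm{P2}_i(t_j) \bigr]\,\mathrm{NutriMed}_{(i,k)}(t_j)
         + \alpha_i + \epsilon_{ij}
```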

Here, \( \mathrm{NutriMed}_{(i,k)}(t_j) \) is the kth clinical covariate for the ith subject measured at the jth time point; \( \beta_k \) is the corresponding linear coefficient (fixed effect); \( \alpha_i \) is a random-effect term that quantifies the within-subject dependence; and \( \epsilon_{ij} \) is the i.i.d. measurement error. In summary, model A associates weight Z-score with the time periods (EARLY versus LATE), nutrition and medication variables, and their interactions. Model B is much like model A except that it uses microbiota phases to quantify the developmental stages of the microbial community instead. For model A, LATE is considered as the baseline period (coded as 0) and EARLY is coded as 1. For model B, phase 3 is considered as the baseline phase (coded as 0); phases 1 and 2 are coded as 1 in two separate binary variables. The interactions included in both models are defined as the products of the nutrition/medication variables and period/phase-related covariates. The significance of associations is determined by regression t test with Satterthwaite’s approximation. Due to the use of a large number of covariates in these models, stepwise model selection based on the Akaike information criterion (AIC) was used to reduce model complexity. The results of model B for weight Z-score are summarized in Table 2 of the main text. As an example, the linear associations of P2 and percent lipids * P2 with the weight Z-score are both significant (beta = −0.7766 for P2 and 5.658 for lipids * P2), meaning that while P2 is correlated with a smaller weight Z-score as compared with the baseline (P3), a higher percentage of lipid intake is associated with higher weight Z-scores for subjects in P2. Analyses were performed in R 3.2.0 (R Foundation for Statistical Computing, Vienna, Austria).

Predicting microbiome phases

We performed mixed-effects logistic regression analyses to study the associations between a host of nutrition- and medication-related covariates and the three microbiota phases on the early and late intervals. We considered P3 as the baseline phase and represented P1 and P2 by two separate binary outcome variables. Gestational age at birth and PMA were included to control for their potential confounding effects. A likelihood ratio test was used to determine the statistical significance of associations. The results are summarized in Tables 3A and B.