Biofilm Bacterial Diversity on Drinking Water Plastic Bags

Through 16S metagenomics, we reveal a remarkably high bacterial diversity within the surface biofilms of a commercial drinking water bag after one year of use. Additionally, we introduce an innovative method for collecting biofilm bacterial biomass that adheres strongly to plastic bags.

Key Takeaways
  • A surface polishing tool designed for hard surfaces has been effectively modified to collect biofilm biomass from plastic bags.

  • An average of >80 K raw tags of 16S V3-V4 regions were generated from each sample.

  • Inside each of these biofilms, there are over 300 bacterial species from ~30 phyla thriving, representing a vibrant microbial community with rich biodiversity.

Biofilms on Plastic Bag

Biofilms: Tiny Communities with a Big Impact

Imagine a bustling city of microscopic residents, all living together in a sticky, protective neighborhood. These are biofilms – clusters of tiny organisms like bacteria, forming communities on various surfaces. 

Why Do Biofilms Love Plastic Bags?

Plastic is a bit like a cozy hideout for these microorganisms. Its smooth surface doesn’t bother them, and places like hospitals or labs, where plastic bags are often used, provide the perfect moisture they need to thrive. On top of that, plastic can contain bits of stuff that these tiny residents find delicious – like a little snack to keep them going.

Fighting Back: Keeping Biofilms in Check

Storing bags in dry areas and cleaning them regularly can help prevent biofilms. Some scientists are working on special plastics that biofilms don’t like, and on adding germ-fighting ingredients. Even ultraviolet (UV) light, an invisible part of sunlight, might help keep these biofilms from causing trouble.

By understanding these tiny communities and finding clever ways to stop them from forming, we’re taking a step towards a cleaner, safer world. To achieve this, the very first step is to investigate who these microbes residing on the plastic bag surfaces are.

Study Design

A thin layer of biofilm was collected from the surface of the bag at three locations (front surface, back surface, and inside the dispenser). The sampled plastic bag had been used for a year to store drinking water produced by treating water from untreated sources. DNA was extracted, the 16S rRNA genes were amplified by PCR, and the PCR products were sequenced by high-throughput sequencing. Bioinformatic analysis then revealed the composition of the microbial community in the biofilms.

Collection of Biofilms

The biofilms on the surface of the plastic bags were collected by removing the top layer (ca. 1 mm thick) of the plastic surface using a sanding method with a cordless hand-held multitool (Dremel 8220) fitted with metal spinners. The metal spinner tips were heat-sterilized for 3 hours at 180 °C before use. The body of the multitool was covered with a piece of sterile plastic bag to prevent contaminants from falling onto the sample bags. 800 µL of TE buffer was placed on the sanding area of the bag. After the surface was sanded for 15-30 seconds, the TE solution containing the fine plastic powder sanded off the bag was collected for microbiome DNA extraction.

Selection of 16S rRNA Gene Region for PCR

The most commonly used primer set, which targets the V3-V4 region, was selected for 16S rRNA gene metagenomics. The length of the PCR products is well covered by the PE250 (2×250 bp paired-end) sequencing strategy.


Pan et al., 2023. Microbial Diversity Biased Estimation Caused by Intragenomic Heterogeneity and Interspecific Conservation of 16S rRNA Genes. Applied and Environmental Microbiology. 89:5.  https://doi.org/10.1128/aem.02108-22.

DNA Extraction, PCR, Sequencing and Bioinformatics

DNA extraction

Microbiome DNA was extracted using a FastPure Microbiome DNA Isolation Kit (Vazyme, Nanjing, China) according to the manufacturer’s instructions. DNA quantity and quality were assessed with a nano-spectrophotometer at mBioWorks Copenhagen, Denmark.

High-throughput 16S ribosomal RNA gene sequencing

The hypervariable region V3-V4 of the bacterial 16S rRNA gene was amplified with the primer pair 5′-GYGCASCAGKCGMGAAW-3′ and 5′-GGACTACVSGGGTATCTAAT-3′. Both the forward and reverse 16S primers were tailed with sample-specific Illumina index sequences to allow for deep sequencing. The PCR was performed in a 20 μL reaction containing 5-50 ng DNA template, 0.3 μL forward primer (10 μM), 0.3 μL reverse primer (10 μM), 5 μL KOD FX Neo Buffer, 2 μL dNTP (2 mM each), 0.2 μL KOD FX Neo polymerase (Toyobo Life Sciences, Shanghai), and ddH2O to the final volume. Thermal cycling consisted of an initial denaturation at 95 °C for 5 min, followed by 20 cycles of denaturation at 95 °C for 30 s, annealing at 50 °C for 30 s, and extension at 72 °C for 40 s, and a final extension at 72 °C for 7 min. The amplified products were purified with an Omega DNA purification kit (Omega Inc., Norcross, GA, USA) and quantified using a Qsep-400 (BiOptic, Inc., New Taipei City, Taiwan). The amplicon library was paired-end sequenced (2×250) on an Illumina NovaSeq 6000.
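Both primers contain IUPAC degenerate bases (for example Y = C or T), so each one is really a pool of closely related oligos. As a small illustration that is not part of the original workflow, the Python sketch below expands a degenerate primer into the concrete sequences it encodes. The function name expand_primer and the lookup table are hypothetical helpers written for this example.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_primer(primer):
    """Return every concrete sequence encoded by a degenerate primer."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]


forward = "GYGCASCAGKCGMGAAW"      # primer sequences as given in the text
reverse = "GGACTACVSGGGTATCTAAT"

print(len(expand_primer(forward)))  # 32 variants (five two-fold positions)
print(len(expand_primer(reverse)))  # 6 variants (one three-fold, one two-fold)
```

A higher degeneracy broadens the range of templates a primer can match, at the cost of a lower effective concentration of each individual oligo.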

Bioinformatic analysis

NGS data pre-treatment: Raw data were first filtered by Trimmomatic (version 0.33) using a sliding window of 50 bp and a minimum average Q-score of 20 within the window. Primer sequences were identified and removed by Cutadapt (version 1.9.1) with a maximum mismatch rate of 20% and a minimum coverage of 80%. Paired-end (PE) reads were assembled by USEARCH (version 10) with a minimum overlap of 10 bp, a minimum similarity of 90% within the overlapping region, and a maximum of 5 mismatched bases. Chimeras were removed using UCHIME (version 8.1): a query sequence was defined as chimeric if fragments with over 80% similarity to it were found in both parent sequences. The resulting high-quality reads were used in the following analysis.
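To make the sliding-window quality filter more concrete, here is a minimal Python sketch of the idea: scan the read from the 5′ end and cut it where the mean Phred score of a 50 bp window first drops below 20. This is a simplified illustration under our own assumptions, not Trimmomatic's actual implementation, and the function name sliding_window_trim is hypothetical.

```python
def sliding_window_trim(qualities, window=50, min_mean_q=20):
    """Return how many leading bases of a read to keep.

    qualities is a list of per-base Phred scores. The read is cut at the
    start of the first window whose mean score is below min_mean_q.
    """
    n = len(qualities)
    if n < window:
        # Simplification: treat a short read as one single window.
        return n if n and sum(qualities) / n >= min_mean_q else 0
    for start in range(n - window + 1):
        win = qualities[start:start + window]
        if sum(win) / window < min_mean_q:
            return start
    return n


# Made-up read: 100 good bases (Q30) followed by 100 poor bases (Q5).
read_quals = [30] * 100 + [5] * 100
keep = sliding_window_trim(read_quals)
print(keep)
```

Because the decision is based on a window average, a single poor base inside an otherwise good stretch does not trigger trimming.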

ASV analysis: The DADA2 method in QIIME2 (version 2020.06) was used to de-noise sequences, generating ASVs. A conservative threshold of 0.005% was applied for feature filtration. Each ASV was annotated using a combination of a BLAST-based method and a Naive Bayes classifier-based method. The BLAST-based annotation was done with the classify-consensus-blast functionality in QIIME2, which identifies the annotation with the highest consensus among the N best hits (default: 3). Parameter settings: minimum sequence similarity: 90%; minimum coverage: 90%; minimum consensus: 51%. The Naive Bayes classifier-based annotation was done with the classify-sklearn functionality in QIIME2. The classifier needs to be trained before use in order to “learn” which features can be used for classification. Parameter setting: classifier confidence of 0.7. In general, feature sequences were first BLASTed against the reference database (Silva) using classify-consensus-blast. The sequences that could not be matched in the reference database were further classified by classify-sklearn.
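The 0.005% filtration threshold can be illustrated with a few lines of Python. The sketch below assumes that the threshold is applied to each feature's share of all reads in the feature table, which is our reading of the text rather than a confirmed detail of the pipeline. The function name and the counts are made up for illustration.

```python
def filter_low_abundance(feature_counts, min_fraction=0.00005):
    """Keep features whose read count is at least min_fraction of all reads.

    feature_counts maps a feature (ASV) ID to its total read count.
    A threshold of 0.005% corresponds to min_fraction = 0.00005.
    """
    total = sum(feature_counts.values())
    if total == 0:
        return {}
    return {
        feature: count
        for feature, count in feature_counts.items()
        if count / total >= min_fraction
    }


# Hypothetical table with 100,000 reads: the cut-off is then 5 reads.
counts = {"ASV1": 60000, "ASV2": 39996, "ASV3": 4}
print(filter_low_abundance(counts))
```

Features this rare are often hard to distinguish from sequencing noise, which is why a small relative cut-off is considered conservative.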

Alpha diversity: Alpha diversity, which reflects the species richness and species diversity of individual samples, was calculated using QIIME2 (https://qiime2.org/). The Chao1 and ACE indices measure species richness, i.e. the number of species. The Shannon and Simpson indices measure species diversity, which is affected by both species richness and community evenness. For the same species richness, the more even the species abundances in a community are, the higher the community diversity is. A larger Shannon index indicates higher species diversity; for the Simpson index, a smaller value indicates higher diversity when it is reported in its dominance form (the sum of squared relative abundances).
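For readers who want to see how these indices behave, here is a minimal Python sketch of three of them. It assumes the Shannon index uses log base 2, the Simpson index is the dominance form (sum of squared relative abundances), and Chao1 is the bias-corrected estimator. These are common definitions, but they are assumptions here and may differ from the exact variants reported by the pipeline. ACE is omitted for brevity.

```python
import math


def shannon(counts):
    """Shannon index (log base 2) from a list of per-species read counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)


def simpson_dominance(counts):
    """Simpson dominance: sum of squared relative abundances."""
    total = sum(counts)
    return sum((c / total) ** 2 for c in counts)


def chao1(counts):
    """Bias-corrected Chao1 richness estimate."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


# Two made-up communities with the same richness but different evenness.
even = [25, 25, 25, 25]
uneven = [97, 1, 1, 1]
print(shannon(even), shannon(uneven))
print(simpson_dominance(even), simpson_dominance(uneven))
print(chao1(uneven))
```

Both communities contain four species, yet the even one scores a higher Shannon index and a lower Simpson dominance, which is the effect of evenness described above.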

Beta diversity: Beta diversity analysis was performed using QIIME2 (https://qiime2.org/) to compare species diversity between samples. Four distance measures are commonly used to calculate the distance between samples in beta diversity analysis: binary Jaccard, Bray-Curtis, weighted UniFrac, and unweighted UniFrac. They can be classified into weighted (Bray-Curtis and weighted UniFrac) and unweighted (binary Jaccard and unweighted UniFrac) measures. The unweighted measures consider only the presence or absence of a species, while the weighted measures take both presence and abundance into consideration.
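The difference between unweighted and weighted measures is easiest to see on a toy example. The Python sketch below implements binary Jaccard and Bray-Curtis distances for two samples given as count vectors over the same list of species. The function names and the counts are made up for illustration, and UniFrac is left out because it also needs a phylogenetic tree.

```python
def binary_jaccard(a, b):
    """Unweighted distance: 1 - shared species / species seen in either sample."""
    present_a = {i for i, x in enumerate(a) if x > 0}
    present_b = {i for i, x in enumerate(b) if x > 0}
    union = present_a | present_b
    if not union:
        return 0.0
    return 1 - len(present_a & present_b) / len(union)


def bray_curtis(a, b):
    """Weighted distance: sum of absolute count differences / total counts."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


# Same two species present in both samples, but with opposite dominance.
sample_1 = [100, 0, 1]
sample_2 = [1, 0, 100]
print(binary_jaccard(sample_1, sample_2))
print(bray_curtis(sample_1, sample_2))
```

The two samples look identical to binary Jaccard because they share the same species, while Bray-Curtis reports them as very different because the abundances are reversed.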

The Unweighted Pair Group Method with Arithmetic Mean (UPGMA) was used to construct a clustering tree based on the similarity between samples. Distance matrices were obtained with the four distance measures mentioned above. Hierarchical clustering of samples was performed by UPGMA in R to determine the similarity in species composition among samples.
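As a rough picture of what UPGMA does with a distance matrix, the Python sketch below repeatedly merges the two closest clusters, where the distance between two clusters is the mean of all pairwise distances between their members. It is a plain illustration and not the R routine used in the analysis. The sample names follow the three sampling sites, but the distance values are invented for the example.

```python
from itertools import combinations


def upgma(labels, matrix):
    """Return UPGMA merges as (cluster_a, cluster_b, distance) tuples.

    labels: sample names; matrix[i][j]: distance between samples i and j.
    Each cluster is a tuple of sample names.
    """
    index = {name: i for i, name in enumerate(labels)}
    clusters = [(name,) for name in labels]
    merges = []

    def average_distance(c1, c2):
        # Mean of all pairwise distances between members of the two clusters.
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(matrix[index[a]][index[b]] for a, b in pairs) / len(pairs)

    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        merges.append((c1, c2, average_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


labels = ["front", "back", "dispenser"]
distances = [            # invented values, for illustration only
    [0.0, 0.2, 0.6],
    [0.2, 0.0, 0.8],
    [0.6, 0.8, 0.0],
]
for merge in upgma(labels, distances):
    print(merge)
```

In this toy matrix the front and back samples merge first, and the dispenser sample joins them later at a larger distance, which is how the branch lengths of the clustering tree arise.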

Raw reads quality report

A typical read quality report generated with FastQC:

Summary of raw data processing
Length distribution of effective reads after data quality control
Number of reads confidently classified to the known taxonomic level

The original feature table may contain features with extremely low abundance (fewer than 2 reads). After these low-abundance features were removed, the number of tags confidently annotated at each taxonomic level (Kingdom, Phylum, Class, Order, Family, Genus, and Species) is summarized here. Detailed classifications can usually be found in the file named OTU_table_detail.xls.

Number of taxa confidently assigned to each level
Taxonomic distribution
Taxonomic abundance clustering

Clustering heat maps were generated at the levels of phylum, class, order, family, genus, and species. The values shown on the heat map are z-scores obtained by z-normalization of relative abundance within each row. The color gradient from blue to red represents relative abundance from low to high.
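The z-normalization of a heat map row can be sketched in a few lines of Python. The example assumes that each row holds the relative abundance of one taxon across the samples and that the sample standard deviation is used, which is a common default but an assumption here. The function name and the numbers are hypothetical.

```python
from statistics import mean, stdev


def row_z_scores(row):
    """Z-normalize one heat map row (one taxon across samples).

    Needs at least two values, since the sample standard deviation is used.
    """
    mu = mean(row)
    sigma = stdev(row)  # sample standard deviation (assumption)
    if sigma == 0:
        # A taxon with identical abundance everywhere carries no contrast.
        return [0.0 for _ in row]
    return [(x - mu) / sigma for x in row]


# Made-up relative abundances of one taxon in three samples.
print(row_z_scores([0.10, 0.20, 0.30]))
```

Because each row is scaled on its own, the colors show where a taxon is relatively enriched or depleted across samples, not which taxon is most abundant overall.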

Alpha diversity indices
Rarefaction curve analysis

The rarefaction curve is a plot of the number of species detected in randomly subsampled sequencing data. It is used to evaluate whether a set of sequencing data is adequate to detect all species in a sample, and it also indirectly reflects species diversity. If the curve keeps climbing towards the end, more species would be detected with more data, i.e. the current sequencing depth is not sufficient. If the curve gradually flattens, only a limited number of new species would be found with more data, i.e. the sequencing data is adequate to represent the majority of the species in a sample.
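The following Python sketch shows one simple way to produce the points of such a curve for a single sample: shuffle the reads once and count how many distinct taxa appear as the sequencing depth grows. A real analysis usually averages over many random subsamples, so treat this as a simplified illustration. The function name and the counts are invented for the example.

```python
import random


def rarefaction_curve(counts, step=100, seed=0):
    """Return (depth, observed_taxa) points for one sample.

    counts maps a taxon to its read count. Reads are shuffled once and the
    number of distinct taxa is recorded every `step` reads and at full depth.
    """
    reads = [taxon for taxon, n in counts.items() for _ in range(n)]
    random.Random(seed).shuffle(reads)
    points = []
    seen = set()
    for depth, taxon in enumerate(reads, start=1):
        seen.add(taxon)
        if depth % step == 0 or depth == len(reads):
            points.append((depth, len(seen)))
    return points


# Made-up sample: two dominant taxa and two rare ones.
sample_counts = {"A": 500, "B": 300, "C": 5, "D": 1}
for depth, observed in rarefaction_curve(sample_counts):
    print(depth, observed)
```

Dominant taxa are picked up almost immediately, while rare ones tend to appear only at greater depth, which is why the curve rises steeply at first and then levels off.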

UPGMA clustering with taxonomic composition at the genus level

This diagram combines the UPGMA clustering tree with a histogram of species composition. The sample clustering tree on the left was constructed from the distance measures detailed in the methods above and shows the similarity between samples. The taxonomic composition at the genus level is shown as a histogram on the right to visualize the similarity in species abundance.

Sample hierarchical clustering tree: Samples that are closer together (with shorter branch lengths) share a higher similarity in taxonomic composition.

Taxonomic composition histogram: The area of each block indicates the abundance of a genus in the corresponding sample.


We thank S. Bredgaard for preparing the plastic bags, and J. B. Jørgensen and C. F. Michelsen for sharing their insights and expertise.
