
The use of functional genomics to advance allelopathic science - investigating sorgoleone biosynthesis as an example

Scott R. Baerson, Daniel Cook, Franck E. Dayan, Agnes M. Rimando, Zhiqiang Pan and Stephen O. Duke

USDA-ARS-NPURU, P.O. Box 8048, University, MS 38677.


Expressed sequence tags (ESTs) are single-pass cDNA sequences selected at random from complex phage or plasmid libraries. The large EST datasets that have been generated for many different organisms serve as indispensable tools for both structural and functional genomics research. Datasets derived from standard, non-amplified libraries also reflect the abundance of different mRNA species in the tissues or cell types from which the libraries were generated, thus creating an electronic gene expression profile of moderately to highly abundant mRNAs. Our research efforts directed toward the discovery of genes involved in the biosynthesis of the potent allelochemical sorgoleone exploit this latter aspect of EST datasets. Ultrastructural studies indicate that root hair cells in sorghum are the primary site of sorgoleone biosynthesis, and we have generated an EST database comprising approximately 5,500 sequences from this cell type. Highly expressed candidate sequences representing all of the putative enzyme classes required for the sorgoleone biosynthetic pathway were identified within this database.

Media summary

An expressed sequence tag database from Sorghum bicolor root hair cells is being utilized in an ongoing effort to identify genes involved in sorgoleone biosynthesis.

Key Words

Sorgoleone, genomics, sorghum, expressed sequence tag


Natural products, well known as a source of novel pharmaceuticals, are also a promising source of novel herbicides (Dayan et al. 2002; Duke et al. 2002). Many plant species have been shown to produce such phytotoxic secondary metabolites, some of which may play a direct role in allelopathic interactions. Natural products such as allelochemicals offer an attractive alternative to synthetic herbicides, as they are generally thought to be more environmentally and toxicologically friendly. In addition, several studies indicate that many allelochemicals and other natural phytotoxins may inhibit molecular target sites distinct from those targeted by commercially available herbicides (Duke et al. 2002).

Sorgoleone (Figure 1) is an allelochemical produced by several Sorghum species which offers promise as a natural pesticide. Sorgoleone refers to a group of structurally related benzoquinones having a hydroxy and a methoxy substitution at positions 2 and 5, respectively, and either a 15- or 17-carbon chain with 1, 2, or 3 double bonds (Netzly et al. 1988). These quinones were first isolated from hydrophobic root exudates of Sorghum bicolor and are distinguished from one another on the basis of their molecular weights. The 2-hydroxy-5-methoxy-3-[(Z,Z)-8’,11’,14’-pentadecatriene]-p-benzoquinone (Figure 1) accounts for more than 80% of the root exudates of Sorghum, and is released into the soil, where it represses the growth of other plants present in its surroundings. The remaining exudate constituents consist of sorgoleone congeners differing in the length or degree of unsaturation of the 3-alkyl side chain, and in the substitution of the quinone head (Rimando et al. 2003; Kagan et al. 2003; Rimando et al. 1998). As the sorgoleone-containing droplets are found exclusively on the root hairs (Czarnota et al. 2003), it is likely that most, if not all, of the biosynthetic pathway of sorgoleone is compartmentalized in root hairs. Ultrastructural studies indicate that these cells are highly physiologically active, containing a great number of mitochondria and a dense endomembrane system (Czarnota et al. 2003).

Expressed sequence tag (EST) analysis, the generation of single-pass DNA sequence data sets from randomly selected cDNA library clones (Boguski et al. 1993; Note: cDNA libraries are generated from RNA samples converted into DNA molecules by the enzyme reverse transcriptase), has in recent years emerged as a highly effective approach for identifying genes involved in plant secondary metabolic pathways, particularly in cases where the pathway of interest is highly expressed and restricted to a specific cell type or developmental stage (e.g., Lange et al. 2000; Guterman et al. 2002). We have also found EST analysis useful for identifying genes potentially involved in the biosynthesis of the allelochemical sorgoleone, which, due to its high levels of biosynthesis specifically in root hair cells of sorghum (Czarnota et al. 2003), is well-suited to this approach. An annotated sorghum EST data set containing approximately 5,500 sequences generated from a root hair-specific cDNA library was analyzed, and highly expressed candidate sequences were found representing all of the enzymes expected to be involved in the final steps of sorgoleone biosynthesis.

Figure 1. Chemical structure of the major sorgoleone congener found in S. bicolor (2-hydroxy-5-methoxy-3-[(Z,Z)-8’,11’,14’-pentadecatriene]-p-benzoquinone).


cDNA library construction and EST data analysis

Sorghum bicolor (cv. BTX623) was grown for 5-7 days on a capillary mat system as described previously (Czarnota et al. 2003). Root hairs were then isolated by the method of Bucher et al. (1997), and total RNAs were isolated using the Trizol reagent (Invitrogen Corporation, Carlsbad, CA) per manufacturer's instructions. Tissue disruption was performed using a hand-held homogenizer at 25,000 rpm. RNA purity was determined spectrophotometrically, and integrity was assessed by agarose gel electrophoresis. Poly-A+ mRNAs were prepared using an Oligotex mRNA Midi Kit (Qiagen, Valencia, CA), then used for the construction of the cDNA library with a Uni-Zap XR cDNA library construction kit (Stratagene, La Jolla, CA). Mass excision of the primary library was performed to generate phagemid clones, which were then randomly selected for sequencing via the University of Georgia Laboratory for Genomics and Bioinformatics wet-lab pipeline. Raw sequence traces were then filtered for quality control and elimination of contaminating vector sequences via an automated bioinformatics pipeline developed at the University of Georgia. Database mining, which involved gene annotation key word searches, retrieval and analysis of sequence entries, and analysis of clusters of related sequences, was performed using the Magic Gene Discovery software (L. Pratt, University of Georgia, unpublished results), and by BLASTN and TBLASTN analysis (Altschul et al. 1990).

Real-Time RT-PCR

Total RNAs were prepared using Trizol as described above, then re-purified using an RNeasy Midi Kit (Qiagen, Valencia, CA), including an “on-column” DNase I treatment to remove residual DNA contamination. Real-time PCR (the quantification of messenger RNA levels using the polymerase chain reaction) was performed in two biological replicates (i.e., two RNA samples from different plants, with three PCR reactions on each RNA sample) for each tissue using an ABI PRISM™ 5700 Sequence Detector (Applied Biosystems, Foster City, CA) with gene-specific primers and primers specific to 18S rRNA (forward, 5’-GGCTCGAAGACGATCAGATACC-3’; reverse, 5’-TCGGCATCGTTTATGGTT-3’). First-strand cDNAs were synthesized from 2 μg of total RNA in a 100 μl reaction volume using a TaqMan Reverse Transcription Reagents Kit (Applied Biosystems, Foster City, CA) and random hexamers as primer. For PCR reactions using gene-specific primers, the cDNA was diluted 50-fold and 2.5 μl (~0.5 ng cDNA) was used for a 25 μl PCR reaction. For PCR reactions using 18S rRNA-specific primers, the cDNA was diluted 50,000-fold and 2.5 μl (~0.5 pg cDNA) was used for a 25 μl PCR reaction. The real-time PCR reactions were performed using the SYBR Green PCR Master Mix Kit (Applied Biosystems, Foster City, CA) with denaturation at 95°C for 10 min, followed by 40 cycles of denaturation at 95°C for 15 sec, and annealing/extension at 60°C for 1 min. For the 18S rRNA assays, the primers were at 50 nM each and for the gene-specific assays, the primers were at 450 nM each. The changes in fluorescence of SYBR Green I dye in every cycle were monitored by the ABI 5700 system software and the threshold cycle (CT) for each reaction was calculated. The relative amount of PCR product generated from each primer set was determined based on the CT value. 18S ribosomal RNA was used as the normalization control for all assays.
The CT value of 18S rRNA was subtracted from the gene-specific CT value to obtain a ΔCT value, and the ΔCT value for the lowest-expressing tissue was then subtracted from that of each tissue to obtain a ΔΔCT value. The gene expression level in a tissue relative to that in the lowest-expressing tissue was expressed as 2^(−ΔΔCT).
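The normalization described above can be illustrated with a minimal Python sketch. The function name and the CT values are hypothetical and serve only to show the arithmetic; they are not data from this study.

```python
def relative_expression(ct_gene, ct_18s):
    """Return 2^(-ddCT) for each tissue, relative to the lowest-expressing tissue.

    ct_gene and ct_18s map tissue name -> mean threshold cycle (CT).
    """
    # dCT: gene-specific CT minus the 18S rRNA CT measured for the same tissue
    delta_ct = {tissue: ct_gene[tissue] - ct_18s[tissue] for tissue in ct_gene}
    # The lowest-expressing tissue is the one with the largest dCT
    reference = max(delta_ct.values())
    # ddCT = dCT(tissue) - dCT(lowest-expressing tissue); expression = 2^(-ddCT)
    return {tissue: 2 ** -(d - reference) for tissue, d in delta_ct.items()}


# Hypothetical CT values, for illustration only
ct_gene = {"root hair": 20.0, "leaf": 28.0, "stem": 30.0}
ct_18s = {"root hair": 10.0, "leaf": 10.0, "stem": 10.0}
result = relative_expression(ct_gene, ct_18s)
```

With these illustrative inputs, the lowest-expressing tissue receives a relative expression of 1, and every other tissue is reported as a fold-increase over it.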

Heterologous expression of recombinant proteins

Full-length coding sequences were determined in some cases using full-length cDNA clones from the root hair cDNA library (described above), and by assembly with 5' clones identified in the public sorghum EST data. When necessary, 5' ends were also generated using the SMART RACE (Rapid Amplification of cDNA Ends) cDNA Amplification Kit (Clontech Laboratories Inc., Palo Alto, CA) per manufacturer's instructions. PCR products containing complete open reading frames flanked by appropriate restriction endonuclease cleavage sites were then generated and directly cloned into pET15b (EMD Biosciences, La Jolla, CA), in-frame with a poly-histidine tract and thrombin cleavage site. The resulting E. coli expression vectors were transformed into strain BL21/DE3 (EMD Biosciences, La Jolla, CA) for recombinant enzyme studies.

For recombinant protein production, E. coli cultures were grown at 37°C to an optical density of 0.6 at 600 nm, then induced with 0.5 mM IPTG and allowed to grow 5 additional hours at 25°C. Cells were then pelleted and resuspended in cold lysis buffer (50 mM Tris-HCl, pH 7.5, 1 M NaCl, 5 mM imidazole, 10% glycerol, 1 μg/ml leupeptin for O-methyltransferase preparations; 100 mM potassium phosphate, pH 7.0 was substituted for Tris-HCl for polyketide synthase preparations) and extracted using a French Press at a pressure of 1500 p.s.i. Benzonase (25 U/ml) and 1 mM PMSF were added immediately to the lysate. Polyhistidine-tagged recombinant proteins were purified from the lysate using an activated Ni-column, and desalted on a PD-10 column equilibrated with cold desalting buffer (20 mM Tris-HCl, pH 7.5, 10 mM DTT, 10% glycerol for O-methyltransferase preparations; 100 mM Tris-HCl was used for polyketide synthase preparations). Protein concentrations were determined using a Bio-Rad protein assay kit (Bio-Rad Laboratories, Hercules, CA). Enzyme preparations were stored at –80°C prior to use.

O-methyltransferase and polyketide synthase enzyme assays

O-methyltransferase assays were performed essentially as described by Wang and Pichersky (1999), except that a mixture of ethyl acetate:hexane (1:1) was substituted for 100% ethyl acetate for extraction of enzyme reactions prior to scintillation counts. Polyketide synthase enzyme assays contained 100 mM potassium phosphate buffer (pH 7.0), 40 μM malonyl-CoA, 25 μM starter molecule (i.e. palmitoyl-CoA), and 2 μg protein in a 200 μl volume, and were incubated at 30°C for 30 minutes. Reactions were quenched by addition of 10 μl of 20% HCl. Extractions were then performed by the addition of 800 μl ethyl acetate, centrifugation at ~14,000 x g for 1 minute, then transfer of the organic phase to a fresh tube. Organic phases were dried under vacuum, and subsequently analyzed by GC/mass spectrometry as trimethylsilyl derivatives. Product formation was quantified using selective ion monitoring.


Current understanding of the sorgoleone biosynthetic pathway (Dayan et al. 2003) suggests the participation of three different enzyme classes for biosynthesis starting with an acyl-CoA starter molecule and malonyl-CoA (Figure 2). In addition, a novel fatty acid desaturase would be required to generate the Δ-9,12,15 double bond configuration of the proposed C16:3-CoA precursor (Figure 2). We have therefore targeted fatty acid desaturases, polyketide synthases, O-methyltransferases, and cytochrome P450s from Sorghum bicolor for obtaining candidate sequences for subsequent biochemical studies.

The S. bicolor cultivar BTX623 is represented by the majority of the over 180,000 public S. bicolor ESTs, and was therefore chosen as the experimental system for these studies. Large-scale preparations of root hair tissue were obtained using seedlings grown on a capillary mat system developed by Czarnota and co-workers (2003), and root hairs were isolated using the method of Bucher et al. (1997). RNA extracted from this material was used for the construction of a directional cDNA library, from which an EST database was generated. A total of 6,624 library clones were sequenced at random, yielding 5,469 ESTs of sufficient quality. The average EST length is 451 bp, using a moving window with a Phred quality score of 16 (corresponding to approximately 97.5% accuracy).
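The relationship between the Phred quality score and the accuracy quoted above follows from the standard Phred definition, Q = -10 log10(P), where P is the estimated base-calling error probability. The short Python sketch below reproduces that arithmetic together with the fraction of sequenced clones that yielded usable ESTs. The function and variable names are illustrative only.

```python
def phred_to_accuracy(q):
    """Convert a Phred quality score to base-call accuracy (1 - error probability)."""
    # Standard Phred definition: Q = -10 * log10(P_error)
    return 1.0 - 10 ** (-q / 10.0)


# Phred 16 is the quality threshold stated in the text
accuracy_q16 = phred_to_accuracy(16)

# Clone and EST counts are taken from the text above
clones_sequenced = 6624
ests_passing = 5469
pass_rate = ests_passing / clones_sequenced

print(f"Accuracy at Phred 16: {accuracy_q16:.2%}")
print(f"Fraction of clones yielding usable ESTs: {pass_rate:.1%}")
```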

The EST data were mined for candidate fatty acid desaturase, polyketide synthase, O-methyltransferase, and cytochrome P450-like sequences using both the Magic Gene Discovery software and BLAST searches using functionally-characterized protein sequences as queries against the EST dataset conceptually translated in all possible reading frames. From these analyses, 47 fatty acid desaturase, 9 polyketide synthase, 94 methyltransferase, and 21 P450-like ESTs were identified. Assembly of the EST data into contigs suggested the representation of up to 15 different fatty acid desaturase, 5 polyketide synthase, 35 O-methyltransferase, and 33 P450-like sequences within the dataset.
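Conceptual translation in all possible reading frames, which TBLASTN performs internally on the nucleotide database, can be sketched in a few lines of Python. This is a minimal illustration assuming the standard genetic code. The function names and the short test sequence are hypothetical and are not part of the software used in this study.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons; codons with ambiguous bases become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Return translations for frames +1..+3 and -1..-3 of a DNA sequence."""
    seq = seq.upper()
    frames = {}
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


# Hypothetical nine-base sequence, for illustration only
frames = six_frame_translations("ATGGCCTAA")
```

Each of the six translated frames can then be compared against a protein query, so that a match is found regardless of the strand or reading frame in which the EST was sequenced.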

Figure 2. Proposed biosynthetic pathway for sorgoleone. The hydroquinone, produced in vivo, undergoes autooxidation once secreted into the rhizosphere to yield the benzoquinone, sorgoleone.

Given that the sorgoleone biosynthetic pathway may be exclusively localized to root hair cells (Czarnota et al. 2003), it is reasonable to speculate that the genes encoding the biosynthetic enzymes for this pathway are specifically or preferentially expressed in this cell type. A secondary screen using real-time PCR was therefore employed to prioritize sequences for further biochemical characterization. Gene expression assays were developed for the fatty acid desaturase, polyketide synthase, O-methyltransferase, and P450-like sequences identified, and remarkably, we were able to identify 3-4 candidate gene sequences from each enzyme family which were preferentially expressed in root hairs. To date, full-length open reading frames for most of these sequences have been generated and subcloned into E. coli expression vectors, or for cytochrome P450 and fatty acid desaturase-like sequences, vectors engineered for heterologous protein expression in Saccharomyces cerevisiae.

For three O-methyltransferase-like candidate sequences exhibiting root hair-preferential expression patterns, recombinant enzymes were tested for activity with various benzene-derived substrates, including a series of 5-substituted alkyl-resorcinols with alkyl chain lengths ranging from 1-15 carbons. Significantly, one of the three O-methyltransferase clones (clone G10_84) preferentially utilized 5-substituted alkyl-resorcinols among all of the substrates analyzed. 5-pentadecatrienyl resorcinol (Figure 2), the proposed in vivo substrate for the O-methyltransferase involved in sorgoleone biosynthesis, is closely related to these compounds; thus G10_84 could represent an O-methyltransferase participating in this pathway. Notably, among all previously characterized plant enzymes, G10_84 is most closely related to an orcinol-specific (5-methyl-resorcinol-specific) O-methyltransferase identified from Rosa hybrida (Lavid et al. 2002).

Three polyketide synthase-like sequences exhibiting root hair-preferential expression patterns were also overexpressed as recombinant enzymes in E. coli, and have recently been tested for activity with various fatty acyl-CoA starter molecules in the presence of malonyl-CoA. Two of these sequences, RHPKS1 and RHPKS2, preferentially utilize long-chain acyl-CoAs, and thus represent the first known examples for this novel class of plant-specific polyketide synthases. Interestingly, phylogenetic analyses indicate that RHPKS1 and RHPKS2 form a distinct clade of polyketide synthases with homology to several PKS-like open reading frames identified within the genome sequence of rice (Figure 3). This is not entirely unexpected, given that Oryza sativa, along with at least 38 additional plant species, produces lipid resorcinolic compounds structurally related to sorgoleone (Kozubek and Tyman 1999).

Figure 3. Phylogenetic relationships among representative plant polyketide synthase-like sequences.


We have pursued a strategy based on the analysis of expressed sequence tags to identify genes involved in the biosynthesis of the allelochemical sorgoleone. This approach, coupled with high-throughput gene expression analysis using quantitative real-time RT-PCR, has provided a highly efficient means for identifying candidate fatty acid desaturase, polyketide synthase, O-methyltransferase, and P450-like sequences preferentially expressed in S. bicolor root hair cells. It has led to significant progress in the identification of O-methyltransferase and polyketide synthase genes potentially involved in sorgoleone biosynthesis. In addition, the annotated dataset of 5,469 5' sorghum root hair EST sequences we have generated will directly complement the existing approximately 180,000 public sorghum EST sequences, and expand our understanding of the transcriptome of a highly specialized and unique cell type.


Altschul SF, Gish W, Miller W, Myers EW and Lipman DJ (1990). Basic local alignment search tool. Journal of Molecular Biology 215, 403-410.

Boguski MS, Lowe TMJ and Tolstoshev CM (1993). dbEST. Database for “expressed sequence tags”. Nature Genetics 4, 332-333.

Bucher M, Schroeer B, Willmitzer L and Riesmeier JW (1997). Two genes encoding extensin-like proteins are predominantly expressed in tomato root hair cells. Plant Molecular Biology 35, 497-508.

Czarnota MA, Paul RN, Weston LA and Duke SO (2003). Anatomy of sorgoleone secreting root hairs of Sorghum species. International Journal of Plant Sciences 164, 861-866.

Dayan FE (2002). In ‘Encyclopedia of Pest Management’. (Ed. D. Pimentel), pp. 521-525. (Marcel Dekker, Inc., New York).

Dayan FE, Kagan IA and Rimando AM (2003). Elucidation of the biosynthetic pathway of the allelochemical sorgoleone using retrobiosynthetic NMR analysis. Journal of Biological Chemistry 278, 28607-28611.

Duke SO, Rimando AM, Baerson SR, Scheffler BE, Ota E and Belz RG (2002). Strategies for the use of natural products for weed management. Journal of Pesticide Science 27, 298-306.

Guterman I, Shalit M, Menda N, Piestun D, Dafny-Yelin M, Shalev G, Bar E, Davydov O, Ovadis M, Emanuel M, Wang J, Adam Z, Pichersky E, Lewinsohn E, Zamir D, Vainstein A and Weiss D (2002). Rose scent: genomics approach to discovering novel floral fragrance-related genes. The Plant Cell 14, 2325-2338.

Kagan IA, Rimando AM and Dayan FE (2003). Chromatographic separation and in vitro activity of sorgoleone congeners from the roots of Sorghum bicolor. Journal of Agricultural and Food Chemistry 51, 7589-7595.

Kozubek A and Tyman JHP (1999). Resorcinolic lipids, the natural non-isoprenoid phenolic amphiphiles and their biological activity. Chemical Reviews 99, 1-25.

Lange BM, Wildung MR, Stauber EJ, Sanchez C, Pouchnik D and Croteau R (2000). Probing essential oil biosynthesis and secretion by functional evaluation of expressed sequence tags from mint glandular trichomes. Proceedings of the National Academy of Sciences 97, 2934-2939.

Lavid N, Wang J, Shalit M, Guterman I, Bar E, Beuerle T, Menda N, Shafir S, Zamir D, Adam Z, Vainstein A, Weiss D, Pichersky E and Lewinsohn E (2002). O-methyltransferases involved in the biosynthesis of volatile phenolic derivatives in rose petals. Plant Physiology 129, 1899-1907.

Netzly D, Riopel JL, Ejeta G and Butler LG (1988). Germination stimulants of witchweed (Striga asiatica) from hydrophobic root exudate of sorghum (Sorghum bicolor). Weed Science 36, 441-446.

Rimando AM, Dayan FE, Czarnota MA, Weston LA and Duke SO (1998). A new photosystem II electron transfer inhibitor from Sorghum bicolor. Journal of Natural Products 61, 927-930.

Rimando AM, Dayan FE and Streibig JC (2003). PSII inhibitory activity of resorcinolic lipids from Sorghum bicolor. Journal of Natural Products 66, 42-45.

Wang J and Pichersky E (1999). Identification of specific residues involved in substrate discrimination in two plant O-methyltransferases. Archives of Biochemistry and Biophysics 368, 172-180.
