Methods for Acquisition of Quantitative Data from Confocal Images of Gene Expression in situ

a Department of Computational Biology, Center for Advanced Studies, St. Petersburg State Polytechnical University, St. Petersburg, Russia
b Department of Genetics, Harvard Medical School, Boston, USA
c Department of Applied Mathematics and Statistics and Center for Developmental Genetics, Stony Brook University, Stony Brook, NY
*e-mail: surkova/at/spbcas.ru

Abstract

In this review, we summarize original methods for the extraction of quantitative information from confocal images of gene-expression patterns. These methods include image segmentation, the extraction of quantitative numerical data on gene expression, and the removal of background signal and spatial registration. Finally, it is possible to construct a spatiotemporal atlas of gene expression from individual images recorded at each developmental stage. Initially all methods were developed to extract quantitative numerical information from confocal images of segmentation gene expression in Drosophila melanogaster. The application of these methods to Drosophila images makes it possible to reveal new mechanisms in the formation of segmentation gene expression domains, as well as to construct a quantitative atlas of segmentation gene expression. Most image processing procedures can be easily adapted to process a wide range of biological images.

Keywords: methods of image processing, confocal microscopy, quantitative gene expression, image segmentation, spatial registration, background removal

Progress in the molecular genetics of eukaryotes has advanced the sequencing of gene coding regions and the recognition of their regulatory regions with promoters, enhancers, and sites for the binding of transcription factors, as well as the identification of the transcription factors that bind to these regions. These genetic elements, together with mRNA and protein products, produce the gene regulatory networks.
The networks coordinate the gene expression that is responsible for cell specificity, tissue differentiation, and, finally, for ontogenesis. To gain greater insight into the principles of the organization, functioning, and evolution of regulatory networks, it is necessary to provide a detailed quantitative description of the dynamics of each component. Thus, functional genomics techniques are widely used for the quantitative evaluation of gene expression (Banerjee et al., 2002). The method of fluorescence recovery after photobleaching (FRAP) is applied to estimate the rate of molecule diffusion or transport inside cells (Shav-Tal, 2006). It should be noted that DNA microarrays, as well as other high-resolution methods for the quantitative evaluation of gene expression (quantitative PCR, CAT assays), are of limited applicability for the study of development. These techniques are performed on cell homogenates; therefore, the information on spatial gene expression is lost. Cell determination and pattern formation during early embryogenesis take place in relatively small morphogenetic fields in which slight differences in the spatial expression of a few genes precede cell diversity (Gilbert, 2003). In Drosophila, segment determination takes place in morphogenetic fields with lengths of less than 0.3 mm along the anteroposterior embryo axis; at the syncytial blastoderm stage, this corresponds to the region of the presumptive germ band. It follows that the study of the determination mechanisms requires detailed information on spatiotemporal gene expression in situ. Various methods are currently available for the identification of gene expression at both the mRNA and protein levels in living and fixed biological objects. The general technique used to detect biological macromolecules is immunofluorescent labeling.
This method is highly specific, as it is based on the interactions of antibodies labeled with fluorescent dyes with the appropriate antigens. It is widely used to detect and localize antigens in fixed cells and tissues. Cell and tissue signals are registered by fluorescence microscopy with laser and confocal scanning microscopes. The advantage of a confocal microscope lies in the detection of signals from the focal plane only; fluorescence from nearby layers of the object examined is not recorded. As a result, images are of a higher contrast and are less blurred. In addition, confocal microscopy provides high-quality digital images ready for further computer processing. Because confocal microscopy has wide applications, both commercial and noncommercial approaches to the processing of confocal images have come to be in great demand.

Here, we present a survey of techniques we have developed to process confocal images of the expression patterns of segmentation genes in the fruit fly Drosophila (Kozlov et al., 2000, 2002; Myasnikova et al., 2001a, 2001b, 2005; Janssens et al., 2005). These methods are aimed at acquiring quantitative data on the expression of these genes and constructing a general atlas of gene expression. Finally, it will be possible to reconstruct the total spatiotemporal gene expression of a particular gene network with an accuracy of a single cell. Most procedures are universal and can be applied with slight modifications for acquiring quantitative data from image patterns of gene expression in other organisms. These methods are described in sufficient detail for the reader to consider their applicability to other objects. The reader may also find detailed information on the processing algorithms in the original publications.

Methods for the Acquisition of Confocal Images

Experimental data

An indirect immunofluorescence technique was used to acquire information on gene expression (Dequin et al., 1984; Frasch et al., 1987).
About 1600 embryos of Drosophila melanogaster Oregon-R were fixed and incubated with primary antibodies to proteins encoded by the following genes that control segmentation in Drosophila: bicoid (bcd) and caudal (cad) maternal genes; Kruppel (Kr), knirps (kni), giant (gt), hunchback (hb), and tailless (tll) gap genes; and even-skipped (eve), fushi tarazu (ftz), hairy (h), runt (run), odd-skipped (odd), paired (prd), and sloppy-paired (slp) pair-rule genes. All antibodies were obtained in our experiments (Kosman et al., 1998) except for antibodies to the Eve protein (Azpiazu and Frasch, 1993). Secondary antibodies were conjugated with FITC; Texas Red; Cy5 (Jackson Labs, USA); or Alexa Fluor 488, 555, 647, or 700 (Molecular Probes, USA) (Janssens et al., 2005). Each embryo was labeled for the expression of eve and two other segmentation genes. About half of the embryos were labeled with histone-specific antibodies (Janssens et al., 2005). The age of embryos ranged from cleavage cycle 10 to the beginning of gastrulation. Only embryos with lateral orientations were used for scanning. Confocal microscopy was performed as previously described (Kosman et al., 1997; Janssens et al., 2005; Myasnikova et al., 2005). Embryos were scanned using a Leica TCS4D confocal microscope with an immersion objective of 16×/0.50 (Kosman et al., 1997) or 20×/0.70 Plan Apo Leica TCS SP2 system (Janssens et al., 2005). For each of the four microscope channels, two optical sections 1 μm apart were scanned for each embryo. To reduce the level of noise in the recorded signals, each of the two optical sections was scanned 16 times and the resulting images were averaged. Thus, each pixel of a particular image recorded in one channel is an averaged value from 32 measurements. In total, 3 resultant images (one per channel) were obtained for each embryo. The size of the images was 1024 × 1024 pixels; the signal was recorded in 8-bit format.
The confocal microscope options “gain” and “offset” were adjusted so that pixel values were 0 outside the embryos and 255 for the brightest pixels. The brightest pixels were taken from the expression pattern at the stage of development at which that particular segmentation gene reaches its maximal intensity. Thus, an experienced researcher is able to find on a slide, for each gene, an embryo with the expression pattern of maximal intensity and to use the brightest pixels of the pattern to adjust the photomultiplier settings. This adjustment makes it possible to estimate quite precisely the quantitative expression (in relative units) of a specific gene, though not to compare the expression of different genes.

Temporal classification of embryos

To reconstruct the temporal dynamics of gene expression based on individual fixed embryos, it was necessary to rank the embryos by age. Cleavage cycles 10–13 were identified by counting nuclei in text files containing quantitative data on expression obtained by image segmentation (see next section). As the interphases in these cycles last only 6–14 min, the cycle number provides a sufficient temporal scale for this period. Cycle 14A is much longer at 50 min; therefore, the embryos within this cycle are divided into 8 temporal classes equivalent in age based on the dynamics of eve gene expression (Fig. 1
A practical criterion for the adequacy in age of the temporal class is the inability of an experienced observer to distinguish expression patterns of embryos that belong to a particular class. Because embryos were scanned without taking into consideration their age and all 8 temporal classes have approximately an equal number of embryos, it is reasonable to assume uniform age distribution in samples of the 14A cycle and to consider the duration of each temporal class to be 6.5 min of the development (Myasnikova et al., 2001, Surkova et al., 2008). Figure 1 Orientation of embryos, one-dimensional expression patterns Images of eve gene expression patterns (Fig. 1
Access to data

All images and quantitative data are stored in the FlyEx database (http://urchin.spbcas.ru/flyex/; http://flyex.ams.sunysb.edu/FlyEx/) available on the Internet (Poustelnikova et al., 2004).

Methods of Image Processing

1. Image segmentation

To study the dynamics of pattern formation, detailed information is required on the temporal and spatial expression of all regulators of the morphogenetic field where the processes take place. Available commercial software (e.g. VisiQuest, Accusoft Corp., USA and MATLAB, MathWorks, USA) and their free analogues (SCIRun, USA and TiViPe, Japan) are not always sufficient for the quantitative evaluation of spatial gene expression. Our segmentation method was developed for confocal images of gene expression patterns in Drosophila embryos (Janssens et al., 2005). The method makes it possible not only to separate the analyzed objects (nuclei) simultaneously in several images recorded in distinct microscope channels, but also to acquire quantitative information on gene expression in the form of text tables. To acquire quantitative data on gene expression in every nucleus of a Drosophila embryo, it is necessary to do the following: (1) to put experimental images into the standard orientation (see previous section); (2) to mark the zone occupied by the object and to cut off blank regions along the edges; (3) to construct a “nuclear mask,” i.e., a binary image in which only pixels belonging to the embryo nuclei are “switched on”; (4) to calculate nuclear centroid coordinates and the average fluorescence intensity in individual nuclei for each microscope channel, i.e., for every scanned protein (Janssens et al., 2005; Kosman et al., 1997). Most of the techniques described above are standard and can be applied to treat images from other objects.

Image adjustment to the standard orientation

First, one image for each of the three gene products registered in one embryo is constructed by averaging two optical sections (Fig. 3
To calculate the angle of embryo rotation into the standard orientation (see Section 1), as well as to remove nonzero pixels outside embryos, it is necessary to construct a primary (rough) mask of the whole embryo. For this purpose, the pixelwise maximum of the 4 images registered in distinct microscope channels is computed. Threshold and median filters are then applied, followed by several cycles of erosion and dilation (Gonzales and Woods, 2002, 2005) (Fig. 3

Smooth mask construction

The final phase before image segmentation is the construction of a smooth mask that corresponds precisely to the embryo shape (Fig. 4
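The rough-mask steps named above (pixelwise maximum over channels, threshold, median filter, then cycles of erosion and dilation) can be sketched as follows. This is only an illustrative sketch built on `scipy.ndimage`, not the authors' implementation; the function name `rough_embryo_mask` and the values of `threshold`, `median_size`, and `n_cycles` are assumptions chosen for the example.

```python
import numpy as np
from scipy import ndimage


def rough_embryo_mask(channels, threshold=20, median_size=5, n_cycles=3):
    """Return a boolean rough mask of the embryo.

    channels: sequence of 2-D arrays of equal shape, one per microscope channel.
    All parameter values are illustrative assumptions, not published settings.
    """
    # Pixelwise maximum over all channels.
    max_image = np.max(np.stack(channels), axis=0)
    # Threshold filter: keep pixels brighter than the chosen level.
    binary = (max_image > threshold).astype(np.uint8)
    # Median filter suppresses isolated bright pixels outside the embryo.
    binary = ndimage.median_filter(binary, size=median_size) > 0
    # Several cycles of erosion followed by the same number of dilations.
    binary = ndimage.binary_erosion(binary, iterations=n_cycles)
    binary = ndimage.binary_dilation(binary, iterations=n_cycles)
    return binary
```

The resulting boolean array can be multiplied with each channel image to zero out pixels outside the embryo before the rotation angle is estimated.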
Segmentation of images and acquisition of quantitative data

To acquire information on gene expression in every embryo nucleus, a so-called “nuclear mask” is constructed based on histone protein-expression images (if available, see above) or the pixel maximum of the gene expression of the other three (Fig. 5 Multiplying the binary image by the pixel-maximum image or the histone protein image leads to each nucleus being separated from the neighboring nuclei by a border of zero pixels. The procedure allows us to separate fused nuclei in the images. Erosion followed by a distance transformation (Vincent, 1993) and a threshold filter removes unwanted non-nuclear stains from the mask. After edge detection and filling, the resulting mask is a binary image in which the regions corresponding to nuclei have the value 1. The quality of the nuclear mask is usually checked visually by superimposing it on the image from histone scanning or on the pixel-maximum image obtained from the other microscope channels (Fig. 6a The binary mask is used for the acquisition of quantitative data on gene expression. Centroid coordinates of every nucleus are calculated with moment invariants (Hu, 1962). Superimposing the mask on the images registered in each microscope channel allows us to calculate the average concentration of the products of all scanned genes in every nucleus (Janssens et al., 2005). The ultimate result is a table with the x and y coordinates of every nucleus, expressed as percentages of the embryo’s length and width, respectively, as well as the average fluorescence intensity (relative level of expression) for each scanned gene of a particular embryo (Fig. 5

2. Removal of background signal

It is well known that, along with specific staining, the immunofluorescence technique produces nonspecific signals called background. Unspecific staining is caused by various factors, but mostly by the binding of primary and secondary antibodies (Fig. 7
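The readout step of the segmentation described above (centroid coordinates in percent of embryo length and width, plus the mean fluorescence per nucleus and channel) might look like the sketch below. It assumes a nuclear mask and a whole-embryo mask are already available and uses `scipy.ndimage` labeling with a plain centre of mass; the source computes centroids with moment invariants (Hu, 1962), so this is a simplified stand-in. The function name `read_out_nuclei` and its arguments are hypothetical.

```python
import numpy as np
from scipy import ndimage


def read_out_nuclei(nuclear_mask, embryo_mask, channels):
    """Build a table with one row per nucleus.

    Columns: x (% embryo length), y (% embryo width), then the mean
    intensity in each channel. Illustrative sketch, not the published code.
    """
    nuclear_mask = np.asarray(nuclear_mask, dtype=bool)
    # Label connected regions of the binary nuclear mask.
    labels, n_nuclei = ndimage.label(nuclear_mask)
    index = np.arange(1, n_nuclei + 1)
    # Centroid (row, col) of every labeled nucleus.
    centroids = np.array(
        ndimage.center_of_mass(nuclear_mask.astype(float), labels, index)
    ).reshape(-1, 2)
    # Embryo extent taken from the bounding box of the whole-embryo mask.
    rows, cols = np.nonzero(embryo_mask)
    x_percent = 100.0 * (centroids[:, 1] - cols.min()) / (cols.max() - cols.min())
    y_percent = 100.0 * (centroids[:, 0] - rows.min()) / (rows.max() - rows.min())
    # Mean fluorescence inside each nucleus, channel by channel.
    intensities = [np.atleast_1d(ndimage.mean(ch, labels, index)) for ch in channels]
    return np.column_stack([x_percent, y_percent, *intensities])
```

Each row of the returned array corresponds to one line of the text table described above.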
We developed a method for unspecific signal removal (Myasnikova et al., 2005) that is applicable to images of various objects acquired with fixed settings of the confocal microscope (see above). Even a slight background distorts the numerical value of gene expression. In addition, the background level varies between embryos and experiments and, therefore, impedes the comparison of data from different experiments. Unspecific staining also varies in experiments on the expression of a particular gene that use secondary antibodies labeled with distinct fluorochromes (Fig. 7 The method of background removal that we suggested is based on the observation that the fluorescence level in null mutants stained for the product of the mutant gene is well approximated by two paraboloids or, in the general case, by a convex second-order surface (Fig. 8
Determination of nonexpressing regions Nonexpressing regions are defined as areas with a particular gene being unexpressed in most nuclei. For each gene, these areas are shaped by careful visual examinations of expression patterns in all embryos. Identified nonexpressing areas, then, are precisely defined on the two-dimensional pattern of gene expression in every embryo (Fig. 9
As segmentation gene expression is basically a function of the position along the A–P embryo axis (see above), it may be adequately represented as a one-dimensional signal. Therefore, in most cases, it is sufficient to set the borders of nonexpressing areas along the x axis. For example, for the eve gene these areas are located within 0–25 and 90–100% of the embryo’s length (Fig. 9

Approximation and background removal

The background signal is approximated with a quadratic paraboloid fit to the points of support from nonexpressing areas of an embryo (Fig. 9
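As an illustration of the approximation step, a quadratic surface can be fitted by least squares to the nuclei in the nonexpressing areas and then evaluated for every nucleus. The sketch below makes two assumptions that are not taken from the source: the background is removed by simple subtraction with clipping at zero, and the nonexpressing areas are given as x ranges only. The function names are hypothetical.

```python
import numpy as np


def fit_quadratic_background(x, y, intensity, nonexpressing):
    """Least-squares fit of a second-order surface to the support points."""
    # Design matrix for c0 + c1*x + c2*y + c3*x^2 + c4*x*y + c5*y^2.
    design = np.column_stack([np.ones_like(x), x, y, x**2, x * y, y**2])
    coeffs, *_ = np.linalg.lstsq(
        design[nonexpressing], intensity[nonexpressing], rcond=None
    )
    # Evaluate the fitted surface at every nucleus.
    return design @ coeffs


def remove_background(x, y, intensity, x_ranges):
    """Subtract the fitted background (assumed removal rule) and clip at zero."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    # Nuclei whose x coordinate falls in any nonexpressing range.
    nonexpressing = np.zeros(x.shape, dtype=bool)
    for low, high in x_ranges:
        nonexpressing |= (x >= low) & (x <= high)
    background = fit_quadratic_background(x, y, intensity, nonexpressing)
    return np.clip(intensity - background, 0.0, None)
```

For eve, the ranges quoted above would be passed as `x_ranges=[(0, 25), (90, 100)]`.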
Evaluation of the accuracy of background removal

The method of background removal was carefully verified on embryos homozygous for null mutations of the bcd, eve, gt, kni, and Kr genes and stained to visualize the product of the mutant gene. After background removal, the expression patterns in these mutants are reduced to the null level of expression throughout the embryo (Fig. 8

3. Registration of gene expression patterns

A confocal microscope permits us to record expression patterns simultaneously for only a limited number of genes. However, our aim was to reconstruct the spatiotemporal dynamics of all genes in the gene network examined. Because the sizes of individual embryos are variable, it is impossible to gain information on the relative spatial expression of different genes by overlaying the expression patterns of individual embryos stained with antibodies to distinct proteins. To overcome this obstacle, the data on individual embryos should be brought into a common coordinate system by a registration procedure. In previous reports (Myasnikova et al., 2001a, 2001b; Kozlov et al., 2002), we used a registration technique based on the allocation of control points in the image (Brown, 1992) and the transformation of coordinates for the maximally complete matching of these points in different images. Typical image features are usually used as control points. In the examined dataset, all embryos were stained for the expression of the eve gene. The expression pattern of this gene is highly dynamic in time; therefore, as control points (characteristic pattern features), we used the extrema of the one-dimensional expression pattern obtained by data extraction from the central 10% embryo strip (Fig. 2

Choice of Control Points for Registration

1. Spline approximation

Stripes and interstripes in the one-dimensional expression diagram (Fig. 1
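One way to locate stripe maxima and interstripe minima in a one-dimensional pattern with a spline is sketched below. The details of the authors' spline approximation are not given here, so the choices in this example are assumptions: a quartic `scipy.interpolate.UnivariateSpline` is fitted (so that its derivative is cubic and `roots()` is available), and extrema are taken where the first derivative vanishes. The function name `spline_extrema` and the `smoothing` parameter are hypothetical.

```python
import numpy as np
from scipy.interpolate import UnivariateSpline


def spline_extrema(x, intensity, smoothing=0.0):
    """Return extremum positions and a flag marking which ones are maxima.

    x must be strictly increasing. smoothing=0 interpolates the data;
    larger values smooth noisy patterns (value to be tuned per dataset).
    """
    # Quartic spline: its first derivative is a cubic spline.
    spline = UnivariateSpline(x, intensity, k=4, s=smoothing)
    # Extrema are the zeros of the first derivative.
    positions = spline.derivative(1).roots()
    # A negative second derivative marks a maximum (stripe),
    # a positive one a minimum (interstripe).
    is_maximum = spline.derivative(2)(positions) < 0
    return positions, is_maximum
```

The returned positions can serve as candidate control points for the registration described in this section.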
2. Fast Redundant Dyadic Wavelet Transform (FRDWT)

The second method of setting control points is based on the iterative application of a wavelet transform (Kozlov et al., 2000, 2002). A wavelet transform (Unser, 1996; Malozemov et al., 1998) allows one to obtain both local high-frequency and large-scale information on an object. Its application makes it possible to examine the data simultaneously in physical (time, coordinate) and frequency spaces. The transformation type and basis must be chosen in such a way as to extract from the signal the information on its first derivative. FRDWT makes this possible (Unser et al., 1994). The common properties of FRDWT are strong noise suppression and precise determination of spatially localized features. The signal is decomposed into two sequences, i.e., approximating (low pass) and detailing (high pass) (Fig. 12
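The idea of a redundant dyadic decomposition into a low-pass (approximating) and a derivative-like high-pass (detailing) sequence can be illustrated with a generic undecimated ("à trous") scheme. This is not the FRDWT of Kozlov et al.; the filters used here (a `[1, 2, 1]/4` smoother and a central difference, both dilated by `2**j` at level `j`) and the function name `a_trous_decompose` are assumptions made for the sketch.

```python
import numpy as np


def a_trous_decompose(signal, levels=3):
    """Undecimated dyadic decomposition of a 1-D signal.

    Returns the final approximation and a list of detail sequences,
    all of the same length as the input (hence "redundant").
    """
    approx = np.asarray(signal, dtype=float)
    details = []
    for level in range(levels):
        step = 2 ** level  # filter taps are spaced 2**level samples apart
        padded = np.pad(approx, step, mode="edge")
        left = padded[: -2 * step]
        centre = padded[step:-step]
        right = padded[2 * step:]
        # High-pass: central difference of the current approximation,
        # which behaves like a first derivative at this scale.
        details.append(0.5 * (right - left))
        # Low-pass: [1, 2, 1] / 4 smoothing with holes.
        approx = 0.25 * left + 0.5 * centre + 0.25 * right
    return approx, details
```

Sign changes in a detail sequence indicate extrema of the correspondingly smoothed signal, which is the property exploited when choosing control points.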
3. Image registration

Image registration is performed by scaling the two-dimensional expression patterns along the x axis by means of an affine transformation that minimizes the total distance between the x coordinates of all control points in the different patterns and the average positions of the corresponding control points over all registered images (Kozlov et al., 2002) (Fig. 13
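A minimal sketch of such a registration along x is given below. It assumes that every embryo has the same number of control points in the same order, and it minimizes the sum of squared distances to the average control-point positions with an ordinary least-squares fit of `x' = scale * x + shift`; the exact cost function and any iteration used in the original method are not reproduced. The function names are hypothetical.

```python
import numpy as np


def fit_x_registration(control_points):
    """Fit one affine map per embryo onto the mean control-point positions.

    control_points: array of shape (n_embryos, n_points) with x coordinates.
    Returns (transforms, reference); transforms[i] = (scale, shift).
    """
    points = np.asarray(control_points, dtype=float)
    # Average position of each control point over all embryos.
    reference = points.mean(axis=0)
    transforms = []
    for embryo_points in points:
        # Least-squares line mapping this embryo's points onto the reference.
        scale, shift = np.polyfit(embryo_points, reference, 1)
        transforms.append((scale, shift))
    return np.array(transforms), reference


def apply_x_registration(x, transform):
    """Apply a fitted (scale, shift) pair to nuclear x coordinates."""
    scale, shift = transform
    return scale * np.asarray(x, dtype=float) + shift
```

The fitted pair is then applied to the x coordinates of all nuclei of the embryo, not only to its control points.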
4. Construction of integrated data

The main purpose of the spatial registration and background removal was to create a map of the mutual positions of the expression domains of all genes of the examined gene network for every period of time. Thus, on the map, the expression regions of every gene are presented as “reference” or “integrated” patterns characterized by average values of fluorescence intensity for that particular period of time. For the gene network controlling segmentation in Drosophila, our goal was to construct integrated patterns for all of the embryos belonging to each temporal class (Fig. 1 Two-dimensional integrated expression patterns were constructed from the registered data for embryos of every temporal class. For this purpose, for each nucleus of an individual embryo, the nucleus with the most similar coordinates was found in an average pattern of nuclear structure. Then, the average fluorescence intensity of all individual nuclei assigned to that average nucleus was calculated. An example of the integrated two-dimensional pattern of gt, Kr, and eve gene expression in embryos of temporal class 8 is presented in Fig. 14a
To create one-dimensional integrated patterns of segmentation gene expression, the nucleus coordinates of every one-dimensional registered pattern are grouped along the x axis into R intervals (Myasnikova et al., 2001b). Then, inside each interval, the average value of the fluorescence intensity is calculated for all embryos of a particular temporal class. R is chosen from the size of a nucleus. For example, in the 14A cycle, the diameter of a single nucleus in the central part of the embryo is 1% of its length; in this case, to correctly reproduce the nuclear structure of the pattern, R should be 100. One-dimensional integrated patterns of the expression of maternal, gap, and pair-rule genes are shown in Fig. 15
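The grouping into R intervals can be sketched as a simple binning of nuclear x coordinates (in percent of embryo length) followed by averaging. In this sketch all nuclei of all embryos in a temporal class are pooled before averaging, which is one possible reading of the description; the function name `integrate_1d` and the input layout are assumptions.

```python
import numpy as np


def integrate_1d(embryos, n_intervals=100):
    """Average intensity per x interval over all embryos of one class.

    embryos: iterable of (x_percent, intensity) array pairs, with
    x_percent in the range 0..100. Empty intervals are returned as NaN.
    """
    sums = np.zeros(n_intervals)
    counts = np.zeros(n_intervals)
    for x_percent, intensity in embryos:
        x_percent = np.asarray(x_percent, dtype=float)
        # Interval index of each nucleus; x = 100 goes into the last interval.
        bins = np.minimum(
            (x_percent / 100.0 * n_intervals).astype(int), n_intervals - 1
        )
        np.add.at(sums, bins, np.asarray(intensity, dtype=float))
        np.add.at(counts, bins, 1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(counts > 0, sums / counts, np.nan)
```

With `n_intervals=100`, each interval spans 1% of embryo length, matching the nuclear diameter quoted above for cycle 14A.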
CONCLUSIONS

The paper surveys the methods we developed for the acquisition of quantitative data on gene expression based on confocal images. The processing methods described include image segmentation, background removal, spatial registration of expression patterns, and data integration (Kozlov et al., 2000, 2002; Myasnikova et al., 2001a, 2001b, 2005; Janssens et al., 2005). The procedures are basic for the acquisition and processing of quantitative data on gene expression and can be used both in sequence and in various combinations. The general advantages of the techniques presented are the wide spectrum of treatment procedures and the ease of adaptation to other biological objects. Most methods for image processing are designed for limited purposes (e.g. image registration) and, therefore, consist of one or two procedures. The most widespread and relevant method is that of cell and tissue image segmentation (Ortiz de Solorzano et al., 1999, 2001; Umesh Adiga and Chaudhuri, 1999; Chawla et al., 2004). An essential drawback of most segmentation algorithms and of the software based on them is the requirement for the user to take part in setting the processing parameters. The segmentation method we developed is completely automatic. The only step that requires visual control is the definition of embryo orientation (Fig. 3 A recent development is the method of three-dimensional stack segmentation of confocal images of Drosophila blastoderm (Luengo et al., 2006). The method is based on the estimation of local maxima of nuclear fluorescence and the extraction of so-called “seeds” for every nucleus to identify the positions of nuclei stained with Sytox Green dye for DNA detection. Next, seeds are intensified in the image obtained from the initial one with the application of a threshold filter. This method, as well as those described above, is implemented in MATLAB.
Its advantage is that it was initially developed for three-dimensional data processing, while its drawback is a difficulty in seed estimation, as most of the nuclei have several local maximums and a complicated correction procedure is required. In addition, seed intensification is performed with masks obtained using threshold filters, which does not always reproduce nuclear shapes and sizes. Currently, a few techniques are available to remove unspecific signals from confocal images. Many researchers remove unspecific backgrounds only by subtracting a particular intensity value from the total signal. It has been shown (Gregor et al., 2007) that images of Drosophila living embryos with endogenous bcd gene substituted with expressing eGFP registered by two-photon confocal microscope have approximately uniform unspecific fluorescence within an embryo. However, these observations are exceptional and, undoubtedly, there should be algorithms that allow differences in the background signal in various parts of confocal images to be taken into consideration. A procedure has recently been published for image registration that differs from what we developed (Sorzano et al., 2006). This method, called “elastic registration,” elastically deforms images to overlap expression patterns in different embryos. After transformation, all registered images are reduced to the same size in x and y axes. The method is highly effective; however, it is unsuitable for evaluating expression domain positions, as it transforms the coordinates of initial images. The general advantage of the method described in this paper for treating confocal images of Drosophila blastoderms is the wide spectrum of procedures and progressive data treatment, as well as the possibility of adaptation to other biological objects. We have applied the technique to the processing of about 5000 images of segmentation gene expression at the protein level in Drosophila. 
The quantitative data obtained have been successfully used to elucidate the regulatory mechanisms underlying the border shift in gap domain expression in early embryogenesis (Jaeger et al., 2004b), as well as to identify regulatory mechanisms controlling gap gene expression (Jaeger et al., 2004a). The technique has also been employed to characterize the processes of dynamic filtration of the temporal variability of the expression pattern of zygotic segmentation genes (Surkova et al., 2008). The general drawback of most task-oriented software for data and image processing is the difficulty of adapting it to similar problems in other objects. The procedures of segmentation, background removal, and the creation of integrated patterns described in this paper have been successfully adapted for processing data on segmentation gene expression at the mRNA level. In Drosophila, the main difference between the expression of genes encoding transcription factors at the mRNA and protein levels is that RNA is localized in both the nucleus and the cytoplasm. For the segmentation of such images, a modified watershed procedure that generates a mask for nuclei and the surrounding cytoplasm has been applied (Fig. 5

Acknowledgments

This work was supported by the NIH grant RR07801, by CRDF GAP Award RUB1-1578, by contract 02.467.11.1005 from the FASI of the RF, and by grant 047.011.2004.013 from the NWO-RFBR. The authors are grateful to Dr. D. Kosman and Dr. C. Alonso for help in the acquisition of experimental data.

References