Comput Med Imaging Graph. Author manuscript; available in PMC Oct 1, 2012.
PMCID: PMC3151327

Automatic detection of follicular regions in H&E images using iterative shape index


Follicular Lymphoma (FL) accounts for 20-25% of non-Hodgkin lymphomas in the United States. The first step in grading FL is identifying follicles. Our paper discusses a novel technique to segment follicular regions in H&E stained images. The method is based on three successive steps: 1) region-based segmentation, 2) iterative shape index (concavity index) calculation, and 3) recursive watershed. A novel aspect of this method is the use of an iterative Concavity Index (CI) to control the follicle splitting process in the recursive watershed. The CI takes into consideration the convex hull of the object and the closest area surrounding it. The mean Zijdenbos similarity index (ZSI) final segmentation score on fifteen cases was 78.33%, with a standard deviation of 2.83.

Keywords: Follicular lymphoma, active contour, morphological filtering, recursive concavity index, recursive watershed, Fourier descriptor, Zijdenbos similarity index


1. Introduction
There has been a growing demand for Computer-Aided Diagnosis (CAD) systems in medical fields, especially in radiology, where they are currently used for the detection and diagnosis of certain diseases [1-2]. CAD research is also becoming increasingly popular in pathology after the introduction of whole-slide scanners. The foundation of any CAD system is the translation of the medical diagnostic knowledge base into computer algorithms, using mathematical models derived from computer vision and pattern recognition theories.

Follicular Lymphoma, a lymph system cancer, accounts for 20-25% of non-Hodgkin lymphomas in the United States [8]. It predominantly affects adults, particularly the middle-aged and elderly. This disease is characterized by a partial follicular or nodular pattern, which is composed of lymphoid cells of follicular center origin. These cells include small cleaved cells and larger non-cleaved cells. Before grading lymphomas, pathologists must identify concentrated cellular follicles on an H&E slide. Similarly, follicle detection is the first step of a computer-assisted follicular lymphoma grading system [3-7]. Our paper explores this initial step of follicular region detection prior to FL grading assessment. This is part of a larger FL grading system that we have been developing. In our previous work related to this system, we developed computer tools for accurate grading of Follicular Lymphoma (FL) [3-7]. However, most of the proposed research has targeted the grading of cancerous cells within pre-selected areas, with little work dedicated to follicle extraction from H&E images.

As the first step in the diagnosis of FL, a pathologist browses the slide at low magnification (2X) in order to locate follicular regions. Then, he/she zooms into higher magnifications to identify and count the number of cancerous cells relative to non-cancerous cells. To mimic the pathologist's conventional clinical routine, we focus here on extracting follicular regions. Figure 1 shows a follicular region. Follicular areas are noticeable at 2X and 4X magnification; however, they appear as noisy, diffuse regions with relatively low contrast, and suffer from blurred edges and shadow effects. These limitations make follicle detection/segmentation very challenging. To address this challenging problem, we propose a method based on the successive application of level set curve evolution, a novel iterative concavity index, and a recursive controlled watershed technique.

Figure 1
An example of a follicular region in part of a digital slide.

The index of concavity is applied in order to split follicles and is based on the ratio of the total convex area surrounding the object to the area of the object. This parameter controls follicle splitting using a threshold calculated from the concavity indices of the whole image. Concavity indices have been used to split touching cells/objects [9-12]. These methods are based on ellipse fitting, considering some boundary landmark points located around the object as well as local curve shape minima as concavity points. Moreover, the contour and the object area are considered to split cells by iterative voting using oriented kernels [13-15]. These cone-shaped kernels vote iteratively for the local center of the components of an aggregation. In our approach, the geometry of the object is taken into consideration to simplify the current definitions of the concavity index.

The proposed follicle segmentation combines knowledge-based segmentation techniques (e.g., active contours), a novel concavity index-based shape analysis, recursive watershed, color identification, and Fourier shape representation. Figure 2 gives an overview of this method. An RGB input image is pre-processed to improve the overall quality of the image (Section 2.1), then an active contour inspired region-based segmentation is performed (Section 2.2). The resultant image contains some undesired objects, for example, artifacts and small residual areas, which are removed by morphological filtering (Section 2.2). The concavity index is used to condition the splitting/identification of follicles with respect to each other (Section 2.3). Finally, Fourier shape descriptors are used to smooth the borders of each follicle (Section 2.5). Results are presented in Section 3 and Section 4 presents our concluding remarks.

Figure 2
Diagram showing the different steps of the method

2. Methods

2.1. Pre-processing
Prior to segmentation, a pre-processing step is usually needed to improve overall image quality and enhance follicle visibility. We propose a multi-step pre-processing approach consisting of the following operations: multi-channel decorrelation stretching [16-18] to increase the decorrelation between RGB channels, followed by median filtering [19] and adaptive histogram equalization [20] to reduce noise and correct the uneven illumination distribution across the image. The lower resolution (2X magnification) FL images closely resemble remote-sensing images: both are affected by high noise, shadows, and artifacts.

2.1.a- Color Enhancement

An image enhancement technique widely used in remote sensing is Decorrelation Stretch. This technique first reduces the correlation between the R, G, and B channels of an image then stretches contrast [16-18]. The decorrelation stretch algorithm consists of two successive steps: first, eigenvalues and eigenvectors of the original image (in RGB) are calculated and second, the stretching vector is computed from the previous step as:

$$ S = \mathrm{diag}\!\left( \sigma_i / \sqrt{e_i} \right), \quad i \in \{R, G, B\} \qquad (1) $$
Here, S is a diagonal matrix [16-18]; σi is the standard deviation of the ith channel of the RGB image, and ei is the ith eigenvalue along the diagonal.

Each pixel in the original image, x, is then transformed to y using the following transformation:

$$ y = E\, S\, E^{T} x \qquad (2) $$
where E is the eigenvector matrix, ET is the transpose of E, and S is the diagonal matrix defined in Equation 1. In essence, this matches the mean and variance of the original image, in the eigenspace, to those of the target image.
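The transform above can be sketched in a few lines. The following is a minimal illustration assuming the common formulation in which each eigen-axis is rescaled to a chosen target standard deviation; the paper does not specify the target, so `target_sigma` is an assumption:

```python
import numpy as np

def decorrelation_stretch(img, target_sigma=50.0):
    """Sketch of a decorrelation stretch on an RGB image (H x W x 3).

    `target_sigma` is an assumed target standard deviation for the
    stretched channels; the paper does not state this value.
    """
    h, w, c = img.shape
    pixels = img.reshape(-1, c).astype(np.float64)
    mean = pixels.mean(axis=0)
    # Eigen-decomposition of the channel covariance (Equation 1 context).
    eigvals, E = np.linalg.eigh(np.cov(pixels, rowvar=False))
    # Diagonal stretching matrix: scale each eigen-axis to target_sigma.
    S = np.diag(target_sigma / np.sqrt(np.maximum(eigvals, 1e-12)))
    # y = E S E^T (x - mean) + mean, i.e. Equation 2 applied about the mean.
    T = E @ S @ E.T
    stretched = (pixels - mean) @ T.T + mean
    return np.clip(stretched, 0, 255).reshape(h, w, c).astype(np.uint8)
```

Applying the transform about the channel means keeps the overall brightness while decorrelating and stretching the color axes.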

Figure 3 shows the results of the decorrelation stretch method on one sample image. As this image illustrates, this method enhances the color contrast, which improves the contrast between follicles and the background, but the inherent noise is also amplified. Another step of pre-processing, noise reduction, is needed to reduce noise and to preserve image details (Section 2.1.b).

Figure 3
shows in (a) the original RGB image, in (b) the decorrelation stretch, in (c) the red channel of image (b), in (d) the green channel of image (b), and in (e) the blue channel of image (b).

We next identify the color channel that has the most descriptive follicular details (see Figure 3). We compute two quality assessment metrics, signal to noise ratio (SNR) [19] and texture contrast [21], to assess the quality of the red, green, and blue channels separately and identify the channel with the highest quality metric values for further processing. The first metric (SNR) is defined as follows:

$$ \mathrm{SNR} = \frac{\mu}{\sigma} \qquad (3) $$
where μ is the mean of each channel obtained after color enhancement, and σ is the standard deviation.

The second metric is the texture contrast introduced by Haralick et al. [21]. This metric describes the contrast of an image based on gray-level spatial dependencies using the gray-level co-occurrence matrix, and is given by the following equation:

$$ \mathrm{Contrast} = \frac{1}{K} \sum_{k=1}^{K} \sum_{i=1}^{N} \sum_{j=1}^{N} (i - j)^2 \, p(i, j) \qquad (4) $$
where p(i, j) is the co-occurrence matrix, K is the number of blocks, and N×N is the size of the co-occurrence matrix.
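The two metrics can be sketched as follows. The GLCM here uses a single horizontal offset and a coarse quantization, which are illustrative assumptions; the paper does not specify its offsets or block scheme, so these values will not reproduce Tables 1-3:

```python
import numpy as np

def snr(channel):
    """SNR = mean / standard deviation of one channel (Equation 3)."""
    return channel.mean() / channel.std()

def glcm_contrast(channel, levels=8):
    """Haralick contrast from a gray-level co-occurrence matrix
    (Equation 4, single block, single horizontal offset)."""
    # Quantize the channel to a small number of gray levels.
    q = (channel.astype(np.float64) / channel.max() * (levels - 1)).astype(int)
    glcm = np.zeros((levels, levels), dtype=np.float64)
    # Count horizontally adjacent gray-level pairs.
    for a, b in zip(q[:, :-1].ravel(), q[:, 1:].ravel()):
        glcm[a, b] += 1
    glcm /= glcm.sum()  # normalize to a joint probability p(i, j)
    i, j = np.indices(glcm.shape)
    return float(((i - j) ** 2 * glcm).sum())
```

A perfectly flat region yields zero contrast, since all co-occurring pairs fall on the diagonal i = j.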

The average metric values for our images (before and after color enhancement) are presented in Tables 1 and 2, respectively.

Table 1
Values of the quality metrics for the red, green, and blue channel obtained before the color enhancement
Table 2
Values of the quality metrics for the red, green, and blue channel obtained after the color enhancement

Comparing Tables 1 and 2, one can notice that the blue channel out-performs the red and green channels. An extended comparison of the blue channel with color components from other color spaces, such as the L channel from Lab and the V channel from HSV, is also proposed in Table 3. It is generally accepted that the Lab and HSV spaces are perceptually uniform and approximate, to some degree, the human visual system. However, the quality assessment comparison presented in Table 3 is limited to the Blue, L, and V channels, since the blue channel is not a true chrominance channel like the H, S, a, and b components. More importantly, the follicles are best visible in the Blue, V, and L components (see Figure 4) rather than in the a, b, H, and S chrominance components.

Figure 4
shows in (a), Blue channel, in (b) the L channel of the Lab color space and in (c) the V channel of the HSV color space.
Table 3
Quality metric values for the Blue, L, and V channels obtained after the color enhancement.

Table 3 demonstrates that the Blue channel out-performed the L component of the Lab color space and the V component of the HSV color space, both in terms of SNR and contrast. Consequently, this channel is kept for further analysis and the others are not considered.

2.1.b Noise Reduction and luminance correction

To filter out the noise remaining after the decorrelation stretch, we applied a median filter [19] to the selected blue channel. The median filter is chosen so that the edges, which mostly correspond to the boundaries of cells and follicles, are retained while the noise is reduced. The size of the median filter is fixed at 7×7. The result of this filter is displayed in Figure 5(a). The SNR and contrast values increase after the application of the median filter, reaching 3.30 and 1.98 compared to 2.56 and 1.87, respectively (see Table 3).

Figure 5
shows in (a) result of the median filter on the blue channel image, and in (b) adaptive histogram equalization on the image in (a). As this example shows the histogram equalization emphasizes the follicles while suppressing the background structures, ...

To correct the uneven distribution of illumination and hence flatten the interior of the follicles, we applied adaptive histogram equalization to the image obtained after the noise-elimination stage. This method differs from ordinary histogram equalization in that it computes several histograms, each corresponding to a distinct block of the image, and uses them to redistribute the lightness values of the image [20], hence obtaining a uniform histogram.
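A rough sketch of this noise-reduction and luminance-correction stage, assuming uint8 input. The block-wise equalization below is a simplified stand-in for true adaptive histogram equalization; production implementations interpolate between neighboring blocks (as CLAHE does) to avoid visible block seams:

```python
import numpy as np
from scipy.ndimage import median_filter

def preprocess(channel, block=64):
    """7x7 median filtering (as in the paper) followed by a simple
    block-wise histogram equalization of a uint8 channel."""
    smoothed = median_filter(channel, size=7)
    out = np.empty_like(smoothed)
    for r in range(0, smoothed.shape[0], block):
        for c in range(0, smoothed.shape[1], block):
            tile = smoothed[r:r + block, c:c + block]
            # Equalize each tile's histogram independently.
            hist, _ = np.histogram(tile, bins=256, range=(0, 256))
            cdf = hist.cumsum() / tile.size
            out[r:r + block, c:c + block] = (cdf[tile] * 255).astype(out.dtype)
    return out
```

The per-tile cumulative distribution maps each gray level to its rank within the tile, which flattens local illumination differences across the image.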

The outcome of the adaptive histogram equalization, applied to the sample image after the median filter, is exhibited in Figure 5(b). One can observe that the contrast of the image has considerably increased, the different follicles in the image are more easily visible, and the interior of the follicles is more homogeneous than in the previous image (see Figure 5a). The homogeneity metric, which quantifies the smoothness (flatness) of the image [21], is used here:

$$ \mathrm{Homogeneity} = \sum_{i=1}^{N} \sum_{j=1}^{N} \frac{p(i, j)}{1 + |i - j|} \qquad (5) $$
where p(i, j) is the co-occurrence matrix, N*N is the size of the same co-occurrence matrix, and i and j are the horizontal and vertical indexes.

The average value of this metric over the set of images used in this study is 0.88 before and 0.94 after the adaptive histogram equalization. The increase of this metric shows that the adaptive histogram equalization made the image smoother and more homogeneous without compromising the borderlines of the follicles. This is important because the region-based segmentation, a version of curve evolution in our case, often converges to a local minimum when the interior and the exterior of the object are not sufficiently homogeneous [22].

2.2-Region-Based Segmentation

This section describes the curve evolution oriented segmentation that is used to determine the location of the follicle regions.

The first step in identifying individual follicles from the background consists of converting the gray-level image obtained from the pre-processing step into a binary one. This binary image is a rough representation of the follicle regions and is then used as a support image in the identification process. This mask can also be seen as the result of an initial (global) segmentation of the image, including several overlapping regions. The extraction of each follicle (fine segmentation) is fulfilled iteratively using the concavity index and the controlled watershed.

To determine the initial segmentation, we employed region-based segmentation based on curve evolution as proposed by Chan and Vese [23]. This method utilizes the level set technique of curve evolution, which starts with one or more initial curves [23] around the object(s) to be segmented. The curve moves toward the interior of the object(s) and stops when the desired boundary is reached. This model can detect objects whose boundaries are not necessarily defined by gradient. It is an iterative process based on the minimization of the energy between the objects and the fitted curve.

The minimization is summarized in Equation 6 considering two distinct forces F1(.) and F2(.) defining the inside and outside force constraints respectively [23]:

$$ F_1(C) + F_2(C) = \int_{\mathrm{inside}(C)} |u_0(x, y) - c_1|^2 \, dx \, dy + \int_{\mathrm{outside}(C)} |u_0(x, y) - c_2|^2 \, dx \, dy \qquad (6) $$
where C is the active contour curve, and the constants c1 and c2, depending on C, are the averages of the image u0 inside and outside C, respectively. If C0 is the object boundary, it is the minimizer of the fitting term:

$$ \inf_{C} \{ F_1(C) + F_2(C) \} \approx 0 \approx F_1(C_0) + F_2(C_0) \qquad (7) $$
Three cases are considered:

  1. If the curve C is outside the object (follicle), then F1(C) > 0 and F2(C) ≈ 0.
  2. If the curve C is inside the object, then F1(C) ≈ 0 and F2(C) > 0.
  3. If the curve C is both inside and outside the object, then F1(C) > 0 and F2(C) > 0. Finally, the fitting energy is minimized when C = C0 [23].
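As a simplified illustration of the F1/F2 energy balance only (not the full level-set evolution of Chan and Vese that the paper uses), the following piecewise-constant two-phase iteration alternates between updating the region means c1, c2 and reassigning each pixel to whichever mean gives the smaller fitting energy:

```python
import numpy as np

def chan_vese_simplified(u0, n_iter=100):
    """Piecewise-constant two-phase iteration without the curvature term:
    illustrates the inside/outside fitting energies of Equations 6-7."""
    mask = u0 > u0.mean()  # initial curve: a rough partition of the image
    for _ in range(n_iter):
        c1 = u0[mask].mean() if mask.any() else 0.0      # mean inside C
        c2 = u0[~mask].mean() if (~mask).any() else 0.0  # mean outside C
        # Reassign each pixel to the closer region mean.
        new_mask = (u0 - c1) ** 2 < (u0 - c2) ** 2
        if np.array_equal(new_mask, mask):
            break  # fitting energy no longer decreases
        mask = new_mask
    return mask
```

Omitting the length (curvature) penalty makes the result noisier than a true level-set evolution, but the convergence behaviour of the three cases above is the same.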

Figure 6 shows an example of the curve evolution zoomed around one follicle, whose center is indicated by the blue asterisk, at three different iteration counts: 50, 100, and 500. This operation is applied to the entire image considering the evolution of multiple curves instead of one, as shown in Figure 6. Furthermore, the multi-curve evolution approach [23], which implicitly depends on the distance transform of the image, is applied. Considering the clinical appearance and size of the follicles, the initial radius of the curves is set to 30 pixels at the resolution level at which this system operates.

Figure 6
Example of curve evolution around one follicular area in (a) the center of follicle in blue asterisk, in (b) curve fitting at 50 iterations, in (c) curve fitting at 100 iterations, and in (d) final curve fitting at 500 iterations (Follicular area is outlined ...

Figure 7 shows the region-based segmentation of follicles on one example image. A global description of the follicles is noticeable in Figure 7b, but the objects are not well separated. This leads us to consider another step, based on the concavity index, to separate and split the objects that will probably define the follicles. Morphological filtering is applied to the binary image obtained from the previous step (see Figure 7b) to eliminate undesired areas and fill holes (see Figure 7d). The morphological operations are a succession of closing and object removal; the structuring element is chosen to be a square of size 2×2. Object removal consists of labeling connected regions [19, 24], calculating the area of each object, and then eliminating all objects whose area is less than or equal to a pre-defined area-threshold. The area-threshold is chosen experimentally to be 200 pixels, a value optimized on the three training cases used in this study.
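The morphological filtering described above can be sketched with standard tools; the 2×2 structuring element and the 200-pixel area threshold come from the text:

```python
import numpy as np
from scipy import ndimage

def clean_mask(binary, area_threshold=200):
    """Closing with a 2x2 structuring element, hole filling, then removal
    of objects at or below the area threshold (200 px in the paper)."""
    closed = ndimage.binary_closing(binary, structure=np.ones((2, 2)))
    filled = ndimage.binary_fill_holes(closed)
    labels, n = ndimage.label(filled)
    # Area of each labeled object, in pixels.
    areas = ndimage.sum(filled, labels, index=np.arange(1, n + 1))
    keep = np.flatnonzero(areas > area_threshold) + 1  # labels to keep
    return np.isin(labels, keep)
```

Labeling once and filtering by area avoids re-scanning the image per object, which matters at whole-slide scale.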

Figure 7
exhibits in (a) sample image obtained after the pre-processing step, and after (b) the region-based segmentation, in (c) Otsu thresholding on the same image and in (d) morphological filtering on image displayed in (b).

Figure 7 also compares the region-based segmentation and morphological filtering on one sample image with Otsu-based segmentation. Analyzing Figures 7b and 7c, the proposed method better delineates the boundaries of the objects than the Otsu method, which produces a binary image in which the boundaries of the objects are significantly deteriorated and lost. This example indicates that traditional thresholding is not an optimal solution for these types of images; a more sophisticated approach, such as the one we are using, is needed to overcome the limitations of the traditional methods.

2.3 Novel Concavity Index

Some of the follicles are very close to each other; therefore the region-based segmentation generates overlapping, combined follicles, as demonstrated in Figure 7b. To separate each follicle from its close neighbors, it is critical to adaptively control the splitting or merging of these follicles. This is fulfilled at the object level and not globally. We introduce a new control factor based on the concavity index of the individual object; the geometrical convex hull [25] of the object is used to define the concavity index. A novel recursive marked-watershed operation is then applied after the concavity index step (as detailed in the next section). This produces a better segmentation map and avoids the limitations usually introduced by the traditional morphological watershed.

To demonstrate the key idea, let us consider an object Oi(x, y) representing the ith follicle. The union of all disjoint (non-touching) objects constitutes the segmented image Ω:

$$ \Omega = \bigcup_{i=1}^{N} O_i(x, y) \qquad (8) $$
where Oi is the ith object and N is the total number of disjoint objects.

Let Ci and Ai be the area of the convex hull around the object Oi and the area of the object Oi, respectively. The concavity index CIi of the ith object is defined as follows:

$$ CI_i = \frac{C_i - A_i}{A_i} \qquad (9) $$
This index is computed in each individual object. A vector of indexes V is computed from all the objects extracted from the previous step and using equation (9) as:

$$ V = [\, CI_1, CI_2, \ldots, CI_N \,] \qquad (10) $$
where CI1, CI2, …, CIN are the concavity indices of the objects O1, O2, …, ON, respectively.

Finally, the index of each object is compared to a global index whose value is empirically set to:

$$ C_T = \mu + \sigma \qquad (11) $$
where μ and σ are the mean and standard deviation of the vector V respectively.

The factor CT introduced in Equation 11 is applied in the selection of candidate objects (follicles) for splitting: an object with an index CIi greater than or equal to CT is selected for splitting. An object containing several concave points (complex curvature, see Figure 8a) yields a higher CIi than an object with a regular shape containing fewer inflection points (catchment basins, see Figure 8b). The example in Figure 8 shows two objects with distinct shapes and different inflection points; the idea behind the concavity index proposed in this paper is that the index of the object in Figure 8a will be greater than that of the object in Figure 8b. Table 4 presents the concavity index values for four different follicles:
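A sketch of the per-object concavity index and the global threshold. Note two assumptions: Equation (9)'s normalization is taken here as the hull-minus-object area relative to the object area (one reading of the paper's definition), and the hull of pixel centres can be slightly smaller than the pixel count, so the values are meaningful relative to each other rather than matching Table 4:

```python
import numpy as np
from scipy import ndimage
from scipy.spatial import ConvexHull

def concavity_indices(binary):
    """Concavity index per labeled object: the convex-hull area not
    covered by the object, relative to the object area."""
    labels, n = ndimage.label(binary)
    indices = []
    for i in range(1, n + 1):
        ys, xs = np.nonzero(labels == i)
        area = len(xs)  # object area A_i in pixels
        pts = np.column_stack([xs, ys]).astype(float)
        hull_area = ConvexHull(pts).volume  # 2-D hull "volume" is its area
        indices.append((hull_area - area) / area)
    return np.array(indices)

def splitting_threshold(ci):
    """Global threshold C_T from the mean and std of the CI vector
    (set empirically in the paper)."""
    return ci.mean() + ci.std()
```

An L-shaped object, which has a deep concavity, scores higher than a solid square, so it is selected for splitting first.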

Figure 8
A generic example of object area and convex hull for two different shapes. In (a) an object with several inflection points is exhibited and in (b) an object with a regular shape and a single inflection point.
Table 4
displays the binary mask of the follicle and associated concavity index (CI) values.

The values of CI in Table 4 imply that the concavity index, as defined in Equation (9), increases with the complexity of the topology of the follicle curvature. One obvious conclusion is that the depth of the catchment area around an inflection point directly impacts the value of the concavity index: the deeper the catchment area, the higher the CI (see Table 4). Moreover, the number of deep catchments also plays an important role in increasing the CI. An example of this observation is the comparison between the follicle with a CI of 0.18 and the follicle with a CI of 0.25: the second follicle has three catchments, whereas the first has only one, which leads to a lower index.

This new definition of the concavity index has two advantages: first, it simplifies the computation of the concavity index; second, it eliminates the local minima introduced during the computation of the second derivative of the shape curvature, as required by existing concavity index measurements [9-15]. The iterative operation converges when all potential objects are separated, i.e., when the concavity indices of all objects are smaller than CT (see Equation 11) and the area (surface) of each follicle is smaller than the maximum tolerated area 'max_area', which is determined experimentally from three training images.

2.4 Splitting of combined objects

The splitting step separates touching follicles merged during region-based segmentation. Our approach treats the binary image (obtained after the region-based segmentation) like a flow of data. This analogy is inspired by optical flow estimation [26-27], in which motion is estimated from the difference between two successive frames. Here, the image is transformed into a flow of objects based on the concavity index of each object. This operation converts a static image into a dynamic one, in which the iteration number plays the role of the time factor in optical flow. This dynamic approach can also be seen as an adaptive segmentation because only a few objects are involved at each iteration. The procedure is repeated iteratively until convergence, meaning that all individual objects have been extracted. The stopping/convergence criterion is reached when the size of the largest object obtained at the ith iteration is smaller than or equal to the tolerated threshold (max_area) and the index of each follicle is smaller than the threshold CT defined in Equation (11).

Let us consider f(.) as the splitting function. The recursive splitting equation is defined as follows:


where A0 is the binary image after the region-based segmentation, and B0 is the binary image composed of the objects whose areas satisfy the following condition:


where L is the number of residual objects, Ol is the lth object in the residual set, and p and q are pixel indices.

Then the final segmented image is defined as:

$$ F = H \cup \left( \bigcup_{t=1}^{T} B_t \right) \qquad (14) $$
where H is the binary image obtained from the splitting at the initial iteration, Bt is the split binary image at the tth iteration, and T is the total number of iterations.

2.4-1: Recursive Watershed

The recursive watershed transform is applied on the image flow obtained at each iteration (see Equation 14). A global watershed transform can produce an over-segmentation of the image and increase the misclassification of follicles.

Several approaches have been proposed to reduce the over-segmentation produced by the watershed transform [28-29]. One method to reduce over-segmentation is to apply the distance transform of the binary image as control (marker) in watershed transform based segmentation. The distance transform [24, 28] of a binary image is used to calculate the distance of each pixel from its nearest zero valued pixel. In conjunction with the distance transform, one of the techniques proposed is the use of the H-minima transform to impose minima in the distance transform of the image [24, 28]. The H-minima transform reduces over-segmentation by suppressing minima that are smaller than a specified depth. Typically, a constant depth is used with the H-minima transform. However, this approach is limited by the fact that it is not possible to set a single value that can work well for all situations. For example, using a large value for the H-minima transform may cause under-segmentation, while too small a value will lead to some over-segmentation in the image. Thus, an adaptive way of selecting the H-minima transform parameter is needed to find the optimal segmentation. Our algorithm uses an approach similar to the one proposed in [29], in which the H-minima transform is used in an adaptive manner to avoid under and over-segmentation.

Adaptive H-minima and Recursive Watershed

In our approach, we use the complement of the distance transform, as described in [29], in conjunction with the H-minima transform to split the combined objects. Our algorithm applies the watershed transform recursively until one of two conditions is met:

  • Successive applications of the watershed transform do not increase the number of objects identified
  • Application of watershed finds only a single object

At each recursion, the H-minima transform is applied by increasing the depth used for the transform before applying the watershed transform. Thus, the H-minima transform and consequently the watershed transform are adaptive to each object in the image while at the same time avoiding over-segmentation.

Following the notation in [29], let H(gI, h) be the H-minima transform with depth h of the inner distance map gI of the image I, and let Nh be the number of minima within the transformed image. The algorithm proceeds as follows:

  1. Set h = 1, hadp = 0, Nadp = 0
  2. Calculate H(gI, h)
  3. Calculate Nh
  4. While Nh ≠ Nadp do:
    • hadp = h, Nadp = Nh
    • h = h + 1
    • Calculate H(gI, h) and Nh

Thus, our algorithm adapts the H-minima transform to each object under consideration and avoids over-segmentation as well as under-segmentation.
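A compact sketch of the adaptive H-minima watershed for one combined object, using scikit-image. The stopping rule (raise h while doing so still reduces the number of minima, without collapsing everything into a single basin) paraphrases the algorithm above rather than reproducing [29] exactly:

```python
import numpy as np
from scipy import ndimage
from skimage.morphology import h_minima
from skimage.segmentation import watershed

def adaptive_watershed(obj_mask):
    """Split one combined object with a marker-controlled watershed,
    choosing the H-minima depth h adaptively."""
    # Complement of the distance transform: catchment basins at centres.
    surface = -ndimage.distance_transform_edt(obj_mask)

    def n_minima(h):
        return ndimage.label(h_minima(surface, h))[1]

    h, n_h = 1, n_minima(1)
    while n_h > 1:
        n_next = n_minima(h + 1)
        if n_next == n_h or n_next < 2:
            break  # count has settled, or deepening would under-segment
        h, n_h = h + 1, n_next
    markers, _ = ndimage.label(h_minima(surface, h))
    return watershed(surface, markers, mask=obj_mask)
```

Two overlapping discs produce two deep basins separated by a shallow saddle, so the adaptive depth keeps two markers and the watershed cuts along the saddle.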

An example of recursive watershed operation around four combined-follicles is displayed in Figure 9. It should be noted that the segmentation improves with more iterations as the borders of follicles get better delineated.

Figure 9
shows in (a) the recursive watershed on combined follicles at the first iteration, in (b) at the second iteration, and in (c) at the third iteration.

2.4-2: Color Identification

To eliminate false follicles obtained from the previous stages, an analysis of the color saturation of each object is conducted. It appears from our discussions with expert pathologists that brighter structures situated mostly at the peripheral borders of the image are not considered follicles. Therefore, it is critical to eliminate those false follicles from further analysis and grading of cancerous cells in follicular lymphoma. For each object, the average saturation level is calculated from the saturation channel of the HSV [19] color space. A follicle is eliminated if its average saturation level is smaller than the average saturation level of all objects in the image obtained after the previous step. An object is selected if its average saturation satisfies the following equation:

$$ F_1 = \{\, O_i \in F \; : \; S_o(O_i) \geq S_m \,\} \qquad (15) $$
where So and Sm are the average saturation values of the object (follicle) and of all objects (all follicles) of the image obtained after the splitting step, respectively, and F1 is the final segmented image.
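A sketch of this saturation test; saturation is computed directly as (max − min)/max per pixel, which matches the HSV S channel, and objects are assumed to be given as a labeled mask:

```python
import numpy as np
from scipy import ndimage

def filter_by_saturation(rgb, labels):
    """Keep only objects whose mean saturation reaches the mean
    saturation over all detected objects (Equation 15 context)."""
    r = rgb.astype(np.float64)
    cmax = r.max(axis=2)
    cmin = r.min(axis=2)
    # HSV saturation: (max - min) / max, with 0 where the pixel is black.
    sat = np.where(cmax > 0, (cmax - cmin) / np.where(cmax > 0, cmax, 1), 0)
    n = labels.max()
    means = ndimage.mean(sat, labels, index=np.arange(1, n + 1))  # S_o per object
    global_mean = means.mean()                                    # S_m
    keep = np.flatnonzero(means >= global_mean) + 1
    return np.isin(labels, keep)
```

A desaturated (grayish, bright) region scores near zero and is discarded, while strongly stained follicles survive.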

The following pseudo code summarizes the proposed technique for finding the objects to be segmented and splitting them, as described in Sections 2.3 and 2.4:

For Iter = 1 : nn   (nn: total number of iterations)

  (1) Seg = Mask   (the image obtained after region-based segmentation and satisfying Equation 13)

  (2) For ii = 1 : num   (num: number of objects at iteration k)
        Obj = (Label(Seg) == ii)
        If CI(Obj) >= CT
          Split(Obj)
        End
      End

  (3) Save the objects whose areas are <= max_area after step (2)

  (4) Find all objects whose areas are >= max_area in the difference between the image at iteration k and the image at iteration k-1

  (5) Split the objects that satisfy Equations (12) and (13)

Repeat steps (2)-(5) until convergence


2.5. Follicle Shape Description

Although actual follicle borders are smooth, the images obtained from the previous segmentation steps are often characterized by irregular and undulated contours, as shown in Figure 10(a). To reduce these irregularities, one can use Fourier shape descriptors to smooth the borders of the segmented regions [30-34]. Fourier descriptors may be filtered in a manner similar to the filtering of signals and images in the frequency domain, and the full set or a subset of the coefficients can be used to represent the contour of an object. In summary, given a discrete-space sequence z(n), we can derive its Fourier series as one period of its discrete Fourier transform (DFT) Z(k), defined as:

$$ Z(k) = \sum_{n=0}^{M-1} z(n) \, e^{-j 2 \pi k n / M}, \qquad z(n) = x(n) + j \, y(n) \qquad (16) $$
where M is the total number of Fourier coefficients and {x(n), y(n)} are the coordinates of point n. Limiting the number of coefficients by eliminating the high-order coefficients (high frequencies) leads to a smoother representation of the shape of an object. Figure 10(b) shows an example of Fourier descriptor-based smoothing of one follicle considering only fifteen coefficients.
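The truncation described above can be sketched as follows, keeping the 15 lowest-frequency coefficients of the complex boundary signal (low frequencies sit at both ends of the FFT output):

```python
import numpy as np

def smooth_contour(xs, ys, n_keep=15):
    """Fourier-descriptor smoothing of a closed contour: keep only the
    n_keep lowest-frequency coefficients of z(n) = x(n) + j y(n)
    (Equation 16) and invert the DFT."""
    z = np.asarray(xs, dtype=float) + 1j * np.asarray(ys, dtype=float)
    Z = np.fft.fft(z)
    keep = np.zeros_like(Z)
    half = n_keep // 2
    keep[:half + 1] = Z[:half + 1]   # DC and positive frequencies
    if half:
        keep[-half:] = Z[-half:]     # matching negative frequencies
    zs = np.fft.ifft(keep)
    return zs.real, zs.imag
```

Because a circle is a pure first harmonic, it passes through the truncation unchanged, while jagged boundary noise (carried by the high harmonics) is removed.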

Figure 10
exhibits in (a) the irregularities of the shape of one follicle obtained in Section 2, and in (b) the smoothing of the same shape with the Fourier series truncated to 15 descriptors.

3. Results
To demonstrate the performance of our algorithm, we selected fifteen malignant cases presenting very diffuse and spatially dense follicular regions. These H&E stained slides were digitized using an Aperio (Vista, CA) digital scanner. A trained grader manually outlined the follicular regions in these fifteen cases, each containing an average of 130 follicles of different sizes and shapes. These marked follicles are used as the ground truth for the validation of the proposed method.

Figure 11 displays the results of the proposed segmentation on a sample case. The iterative concavity index steps are demonstrated in Figures 11b, 11c, and 11d, respectively. One can observe that the splitting of the follicles gradually increases with each iteration. For the purpose of visualization, the newly split (appearing) follicles are highlighted in a different color at each iteration. Three iterations are sufficient for the example in Figure 11. The connectivity and the complexity of the concave regions determine the number of iterations, as explained in Section 2.3.

Figure 11
shows flow appearance of the follicles; in (a) image obtained after region-based segmentation, in (b) follicles identification at k= 0 (initial iteration), in (c) follicles identification at k=1 (follicles in red color), in (d) follicles identification ...

In order to demonstrate the feasibility of our method, the performance of the final segmentation against the manual one was quantified using the Zijdenbos similarity index (ZSI) [35]. This index measures the overlap between two shapes A and M (see Figure 12) and is a well-known metric for assessing the performance of region-based segmentation. It is defined as:

$$\mathrm{ZSI} = \frac{2\,|A \cap M|}{|A| + |M|},$$
where A and M are, in our case, the binary image generated by the proposed method and the manually segmented image (i.e. the ground truth), respectively.

Figure 12
A generic example to illustrate the Zijdenbos similarity index. In this example, A represents the automated computer segmentation and M represents the manual segmentation.
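Concretely, the ZSI above can be computed from two binary masks in a few lines (a NumPy sketch; the `zsi` helper is our illustration, not the authors' code):

```python
import numpy as np

def zsi(auto_mask, manual_mask):
    """Zijdenbos similarity index between two binary masks:
    ZSI = 2|A intersect M| / (|A| + |M|)."""
    A = np.asarray(auto_mask, dtype=bool)
    M = np.asarray(manual_mask, dtype=bool)
    inter = np.logical_and(A, M).sum()  # overlapping pixel count
    return 2.0 * inter / (A.sum() + M.sum())
```

For two identical masks the index is 1; for disjoint masks it is 0.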

An extensive analysis of the ZSI in [35] concluded that this index weights the common area between overlapping regions more heavily than, for instance, the Tanimoto coefficient [36], which has also been proposed as a measure of agreement in MRI segmentation [37].

Figures 13a and 13b exhibit the outcome of the proposed segmentation after Fourier-descriptor shape smoothing and the manual marking, respectively. Figure 13c shows the automated segmentation outlined in yellow, and Figure 13d shows the overlay of the manual (green) and automated (yellow) segmentations on the original image. Fourier shape analysis was applied to the image obtained from the previous step with the number of Fourier harmonics set to 15.

Figure 13
shows in (a) the binary image obtained from the proposed method after Fourier descriptor smoothing, in (b) the ground truth for the same image, in (c) the results of the method superimposed on the original image (in yellow), and in (d) the overlapping ...

One can observe that the objects outlined in yellow (proposed segmentation) match quite well those outlined in green (manual segmentation). However, some follicles that are separated in the manual segmentation are merged in the automatic segmentation.

The watershed transform is frequently used for the segmentation of structures in medical images [19, 24, 28]. We compared the performance of the proposed algorithm with that of watershed-transform-based segmentation as well as its variant, the H-minima controlled watershed. The height (depth) value for the H-minima was chosen to be equal to the maximum distance obtained from the complement of the distance transform of the image produced by the region-based segmentation. The same active contour segmentation (see Figure 7d) was used for all three methods under comparison. Figure 14 presents the ZSI scores of the proposed method against the ground truth, together with those of the traditional watershed and the controlled watershed. The proposed segmentation results are highlighted in red, the traditional watershed results in green, and the controlled watershed results in blue. The proposed technique matched the ground truth with an average ZSI score of 78.33% and a standard deviation of 2.83. It is generally accepted that a ZSI > 0.7 represents “excellent” agreement [35, 38]. The merit of the ZSI lies in the fact that it provides a single value for comparing pairs of similarity measurements [35]. Hence, the average agreement of the proposed technique is excellent given the nature of the FL images used in this study. The proposed technique also outperformed the traditional watershed and the controlled watershed, whose average ZSI scores and standard deviations are 24.35% ± 3.99% and 56.46% ± 7.33%, respectively.

Figure 14
shows the results of the Proposed Method in red, the Traditional Watershed in green, and the Controlled Watershed in blue.
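For reference, a marker-controlled watershed on the distance transform can be sketched with SciPy alone. Note the swap: the paper's H-minima depth criterion is approximated here by a minimum-peak-height threshold `h` on the distance map (a hypothetical knob of ours), since plain SciPy provides `watershed_ift` but no H-minima transform:

```python
import numpy as np
from scipy import ndimage as ndi

def split_by_watershed(mask, h=2.0):
    """Split touching objects in a binary mask with a marker-controlled
    watershed on the negated distance transform (SciPy-only sketch).
    Peaks of the distance map with height >= h become markers; this only
    approximates the H-minima depth rule described in the text."""
    mask = np.asarray(mask, dtype=bool)
    dist = ndi.distance_transform_edt(mask)
    # candidate markers: strict local maxima of the distance map above h
    peaks = (dist == ndi.maximum_filter(dist, size=3)) & (dist >= h)
    markers, _ = ndi.label(peaks)
    markers[~mask] = -1                              # background marker
    surface = (dist.max() - dist).astype(np.uint16)  # flood deepest basins first
    labels = ndi.watershed_ift(surface, markers)
    labels[labels == -1] = 0                         # clear background label
    return labels
```

On a mask of two overlapping disks, the two distance-map peaks seed separate basins, so the touching objects receive distinct labels.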

An example of this validation on a sample image is shown in Figure 15, in which the automated and manual segmentations are superimposed on the same follicle; the ZSI score is 96.20%. The main advantage of our proposed method lies in its ability to capture most of the follicles contained in H&E images with respect to the ground truth generated by an expert pathologist. As long as the majority of the follicular area is captured, the next step of the grading process, centroblast cell counting, can be carried out. Moreover, the current method performs better than both the traditional and the H-minima controlled watershed.

Figure 15
shows the manual segmentation (green) and the automated segmentation (yellow) superimposed on a single randomly selected follicle.

The average computational time of this method is 40 seconds per image using non-optimized Matlab code (version 9, under Windows Vista Enterprise). The stand-alone computer used a single Intel Core(TM) 920 processor at 2.67 GHz with 12 GB of RAM. While the simplicity of the concavity index computation decreases complexity and increases the efficiency of the proposed method, the recursive watershed, which adapts to the topology of each object, makes the method robust to variations in FL images. However, the biological variation of FL cases, together with the wide variety of slide preparation and staining methods, continues to make this problem a challenge. For example, the method may fail on unusually thick or poorly stained sections, which produce images of such poor quality that even an expert pathologist would find it difficult to distinguish the individual follicles. A sample case in which the method failed due to poor preparation and staining is shown in Figure 16; it is noticeable that our method (yellow) missed a significant portion of the manually outlined follicles (green).

Figure 16
shows the manual segmentation (green) and the automated segmentation (yellow) in a poorly stained case.


In this paper, we have developed a fully automated method to segment follicular regions in H&E-stained follicular lymphoma cases using a novel definition of the concavity index and a recursive watershed operation. The novel concavity index controls the diffusion and splitting of candidate follicles iteratively, while the recursive watershed operation reduces the over-segmentation that often results from a traditional, non-controlled watershed. Working at the level of individual objects rather than the whole image simplifies the recursive watershed segmentation and considerably decreases the number of over-segmented follicles. The gradual appearance of the follicles can be compared to optical flow: the phenomenon spreads gradually over time until it touches all regions of the image. In our case, the time factor is the iteration parameter, which conditions the appearance of the follicles. Since one image may contain more follicles than another, the key idea is to consider objects (i.e. follicle candidates) individually, selecting candidates for splitting and discarding those with a low concavity index. As a final step to reduce the number of false-positive follicles, a color-identification stage eliminates follicles that deviate from the average color saturation (in the HSV color space) of the image obtained after the shape concavity analysis and splitting.
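The color-identification stage can be sketched as a simple saturation-outlier filter. The `k`-sigma rejection rule and the `filter_by_saturation` helper below are our hypothetical stand-ins; the paper does not specify its exact thresholding:

```python
import numpy as np
import colorsys  # stdlib RGB-to-HSV conversion (values in [0, 1])

def filter_by_saturation(rgb_means, k=2.0):
    """Keep candidate follicles whose mean saturation lies within k
    standard deviations of the image-wide average saturation.
    rgb_means: list of (r, g, b) mean colors, one per candidate."""
    sats = np.array([colorsys.rgb_to_hsv(*c)[1] for c in rgb_means])
    mu, sigma = sats.mean(), sats.std()
    keep = np.abs(sats - mu) <= k * sigma
    return [i for i, ok in enumerate(keep) if ok]
```

Candidates whose mean saturation is far from the population average (e.g. nearly gray regions among stained follicles) are discarded as likely false positives.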

The accuracy of the method was evaluated on fifteen whole-slide images, with a mean ZSI of 78.33%. The results are encouraging given the highly noisy, textured, and low-contrast nature of the images used in this study. However, some limitations remain: the random spatial distribution of the follicles and their low local contrast and spatial connectivity introduce, in some cases, false detections or the merging of several follicles into a single one. Tissue preparation, such as staining and section thickness, also affects the accuracy of the segmentation. The performance of the algorithm can be improved in the future by testing it on a larger database of images and by taking tissue quality into account as a parameter. Another study is underway to make use of supplemental information from adjacent slides stained with immunohistochemical stains.


The project described was supported in part by Award Number R01CA134451 from the National Cancer Institute. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Cancer Institute, or the National Institutes of Health.


Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.


1. Fenton JJ, Taplin SH, Carney PA, Abraham L, Sickles EA, D'Orsi C. Influence of computer-aided detection on performance of screening mammography. N Engl J Med. 2007 Apr 5;356(14):1399–409.
2. Wu N, Gamsu G, Czum J, Held B, Thakur R, Nicola G. Detection of small pulmonary nodules using direct digital radiography and picture archiving and communication systems. J Thorac Imaging. 2006 Mar;21(1):27–31.
3. Sertel O, Kong J, Lozanski G, Catalyurek U, Saltz J, Gurcan MN. Computerized microscopic image analysis of follicular lymphoma. SPIE Medical Imaging '08; San Diego, CA; Feb 2008.
4. Gurcan MN, Boucheron L, Can A, Madabhushi A, Rajpoot N, Yener B. Histopathological image analysis: a review. IEEE Reviews in Biomedical Engineering. 2009;2:147–171.
5. Belkacem-Boussaid K, Prescott J, Lozanski G, Gurcan MN. Segmentation of follicular regions on H&E slides using matching filter and active contour models. SPIE Medical Imaging 2010; San Diego, CA; Feb 13-18, 2010.
6. Belkacem-Boussaid K, Sertel O, Lozanski G, Shana'aah A, Gurcan M. Extraction of color features in the spectral domain to recognize centroblasts in histopathology. 31st Annual International Conference of the IEEE EMBS; 2009. pp. 3685–3688.
7. Kong J, Sertel O, Gewirtz A, Shana'ah A, Racke F, Zhao J, Boyer K, Catalyurek U, Gurcan MN, Lozanski G. Development of computer based system to aid pathologists in histological grading of follicular lymphomas. ASH Annual Meeting; Atlanta, GA; Dec 2007.
8. Griffin NR, Howard MR, Quirke P, O'Brian CJ, Child JA, Bird CC. Prognostic indicators in centroblastic-centrocytic lymphoma. J Clin Pathol. 1988;41:866–870.
9. Kothari S, Chaudry, Wang MD. Automated cell counting and cluster segmentation using concavity detection and ellipse fitting techniques. Proceedings of the Sixth IEEE International Symposium on Biomedical Imaging: From Nano to Macro; 2009. pp. 795–798.
10. Bai X, Sun C, Zhou F. Splitting touching cells based on concave points and ellipse fitting. Pattern Recognition. 2009:2434–2446.
11. Wang W, Song H. Cell cluster image segmentation on form analysis. Third International Conference on Natural Computation, IEEE; 2007. pp. 833–836.
12. Wang W. Binary image segmentation of aggregates based on polygonal approximation and classification of concavities. Pattern Recognition. 1998;31(10).
13. Schmitt O, Reetz S. On the decomposition of cell clusters. Journal of Mathematical Imaging and Vision. 2009.
14. Schmitt O, et al. Radial symmetries based decomposition of cell clusters in binary and gray level images. Pattern Recognition. 2008;41:1905–1923.
15. Kong H, Gurcan M, Belkacem-Boussaid K. Splitting touching-cell clumps on histopathological images. International Symposium on Biomedical Imaging; 2011.
16. Mather PM. Computer Processing of Remotely-Sensed Images. Wiley; 2004.
17. Gillespie AR, Kahle AB, Walker RE. Color enhancement of highly correlated images. I. Decorrelation and HSI contrast stretches. Remote Sens Environ. 1986 Dec;20(3):209–235.
18. Karvelis P, Fotiadis D. A region based decorrelation stretching method: application to multispectral chromosome image classification. 15th IEEE International Conference on Image Processing (ICIP); 2008. pp. 1456–1459.
19. Gonzalez RC, Woods RE. Digital Image Processing. 2nd ed. Prentice Hall; 2002.
20. Paranjape RB, Morrow WM, Rangayyan RM. Adaptive-neighborhood histogram equalization for image enhancement. CVGIP: Graphical Models and Image Processing. 1992;54(3):259–267.
21. Haralick RM, Shanmugam K, Dinstein I. Textural features for image classification. IEEE Transactions on Systems, Man, and Cybernetics. 1973;SMC-3(6):610–622.
22. Xu N, Bansal R, Ahuja N. Object segmentation using graph based active contours. Computer Vision and Pattern Recognition Conference; 2003. pp. 46–53.
23. Chan TF, Vese LA. Active contours without edges. IEEE Trans Image Processing. 2001 Feb;10(2).
24. Serra J. Image Analysis and Mathematical Morphology. Academic Press; 1982.
25. Barber CB, Dobkin DP, Huhdanpaa HT. The quickhull algorithm for convex hulls. ACM Trans Mathematical Software. 1996;22:469–483.
26. Torr PHS, Zisserman A. Feature-based method for structure and motion estimation. Vision Algorithms: Theory and Practice. 2000:278–294.
27. Horn B, Schunck B. Determining optical flow. Artificial Intelligence. 1981;17:185–204.
28. Soille P. Morphological Image Analysis: Principles and Applications. Springer-Verlag; Secaucus, NJ: 2003.
29. Cheng J, Rajapakse J. Segmentation of clustered nuclei with shape markers and marking function. IEEE Transactions on Biomedical Engineering. 2009 Mar;56(3):741–748.
30. Van Otterlo PJ. A Contour-Oriented Approach to Shape Analysis. Prentice Hall; New York, NY: 1991.
31. Bookstein FL. The Measurement of Biological Shape and Shape Change. Springer-Verlag; New York, NY: 1978.
32. Loncaric S. A survey of shape analysis techniques. Pattern Recognition. 1998;31(8):983–1001.
33. Persoon E, Fu KS. Shape discrimination using Fourier descriptors. IEEE Transactions on Systems, Man, and Cybernetics. 1977;SMC-7(3):170–179.
34. Zahn CT, Roskies RZ. Fourier descriptors for plane closed curves. IEEE Transactions on Computers. 1972 Mar;C-21:269–281.
35. Zijdenbos AP, Dawant BM, Margolin RA, Palmer AC. Morphometric analysis of white matter lesions in MR images: method and validation. IEEE Trans Med Imaging. 1994 Dec;13:716–724.
36. Duda RO, Hart PE. Pattern Classification and Scene Analysis. Wiley; New York: 1973.
37. Brummer ME, Mersereau RM, Eisner RL, Lewine RRJ. Automatic detection of brain contours in MRI data sets. IEEE Trans Med Imaging. 1993 Jun;12:153–166.
38. Bartko JJ. Measurement and reliability: statistical thinking considerations. Schizophrenia Bulletin. 1991;17(3):483–489.