Use of tumor diameter to estimate the growth kinetics of cancer and sensitivity of screening tests.

A statistical method has been developed that is useful for studying the relationship between the growth kinetics of malignant tumors and the detection probability either through symptoms or by screening. Mathematical models that describe the distribution of pathological variables in malignant tumors, detected after various histories of screening, are derived and parameters for detection probabilities and the growth kinetics are then estimated by the maximum likelihood procedure. By this method the probabilities of detection through symptoms as well as by screening can be estimated as functions of pathological variable(s) such as tumor size. The growth rate of the tumor can also be estimated from the distribution of pathological variables. The present method was applied to gastric cancer in Japan, where an annual screening program for the disease exists. The detection probability for the indirect X-ray used as the screening test was estimated to be $0.323 \times (\text{diameter})^2 / [1 + 0.323 \times (\text{diameter})^2]$. The doubling time of gastric cancer was estimated to be 2.90 months.


Introduction
The natural history of cancer consists of two different stages: the stage of transformation and the stage of growth. In the former stage, a normal cell is transformed into a cancer cell by a series of inheritable genomic alterations; in the stage of growth, the cancer cell multiplies and grows to a mass until it becomes large enough to be clinically diagnosed due to symptoms. In this paper we focus on this latter stage and discuss a new method that is useful in describing the nature of growth in this stage.
A malignant tumor in the stage of growth is primarily characterized by the state of its pathological or morphological variables such as size of tumor, invasion of submucosa, and metastasis to lymph nodes or distant organs. These pathological variables determine whether the tumor manifests any clinical symptoms and, therefore, whether it can be detected through symptoms. These variables also determine the probability of tumor detection by a screening test and the prognosis of the individual after the tumor is detected.
The growth kinetics of a malignant tumor, which can be expressed as change in pathological variables with time, is one of the major questions to be answered to better understand the total picture of the process of carcinogenesis. It also provides important information useful for evaluating the potential impact of cancer screening programs in terms of the degree of improvement in prognosis. In this paper a mathematical method is presented by which the functional relationships between the pathological variables of a tumor and the probability of detection of the tumor, either through symptoms or by a screening test, can be estimated from data on the distribution of pathological variables among cancers with various histories of screening. The growth rate, expressed as the change in the pathological variables per unit of time, can also be estimated by this method. Though the pathological variables are numerous, we choose here the maximum diameter (diameter, hereafter) of the tumor as the quantitative measure of the pathological feature. The relationship between the distribution of diameter and the mode of detection (symptoms versus screening) as well as the history of screening before tumor detection is analyzed based on mathematical models of growth kinetics.

Detection of a Tumor through Symptoms
We assume that the only reason a patient with a malignant tumor seeks medical aid is because of the symptom(s) manifested due to the tumor. We also assume that the magnitude of the symptoms is related to the pathological state of the tumor.
We use the following conditional probability:
$$ r(x) = \lim_{dx \to 0} \frac{\Pr\{x \le L_D < x + dx \mid L_D \ge x\}}{dx} \tag{1} $$
where x is the diameter of the tumor at a certain point in time and $L_D$ is the diameter at detection through symptoms.

Growth of Tumor
We assume that a tumor grows at a rate proportional to the number of cancer cells in the tumor. This implies that the growth of the tumor is exponential:
$$ x = a \exp(\beta t) \tag{2} $$
where t is the time measured from the start of growth, a is the diameter of a single cancer cell, and $\beta$ is a parameter related to the growth rate. We assume that all tumors have the same value of $\beta$.
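Why growth proportional to cell number gives an exponential law for the diameter, and not only for the cell count, can be made explicit. The sketch below assumes, beyond what is stated above, that the tumor is roughly spherical with constant cell density, so that the cell number N is proportional to the cube of the diameter. The symbol $\gamma$ (growth rate of the cell number) is introduced here for illustration only, and $\beta$ denotes the growth parameter of the diameter.

```latex
\frac{dN}{dt} = \gamma N \;\Rightarrow\; N(t) = e^{\gamma t}, \qquad
N = (x/a)^{3} \;\Rightarrow\; x(t) = a\,e^{(\gamma/3)\,t} = a\,e^{\beta t}, \quad \beta = \gamma/3 .
% Doubling times: T_x = \ln 2 / \beta for the diameter,
%                 T_N = \ln 2 / \gamma = T_x / 3 for the cell number.
```

Under these assumptions a doubling time quoted for the diameter is three times the doubling time of the cell number.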

Incidence
We assume that the incidence rate of the cancer in the population involved is constant over time. Since the incidence rate may differ with age and other variables, this assumption implies that the distribution of these variables in the population does not change with time.
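Combined with exponential growth, constant incidence fixes how diameters are distributed among the tumors present at any instant, before any detection is considered. The following is a sketch of that step, writing $\lambda$ for the constant incidence rate (a symbol introduced here), $\beta$ for the growth parameter, and L for the diameter of a tumor.

```latex
% Tumors whose age lies in (t, t + dt) at a given instant: \lambda\,dt
% x = a e^{\beta t} \;\Rightarrow\; t = \beta^{-1}\ln(x/a), \quad dt = \frac{dx}{\beta x}
\Pr\{x \le L < x + dx\} \;\propto\; \lambda\,\frac{dx}{\beta x} \;\propto\; \frac{dx}{x}
```

Small tumors are therefore more numerous than large ones in proportion to 1/x, simply because a tumor spends the same length of time in each doubling of its diameter.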

Detection by Screening Test
We assume that a person's decision to participate in the annual screening program is not affected by the existence of a tumor. We also assume that the ability of the screening test concerned to detect a tumor is related to the pathological state of the tumor. We use a function v(x) for the probability of detection of a tumor with diameter x by the screening test.

Mathematical Formulation
In this section, the probability density functions of the diameter of tumors in various situations are derived based on the assumptions previously mentioned. The situations discussed here are: when tumors are detected through symptoms in a population with no history of screening, and when tumors are detected by the (k + 1)th screen in persons who have had k negative results in consecutive screens in the past (k = 0, 1, 2, ...).
Let $f_D(x \mid k)$ denote the probability density function of diameter x in tumors detected through symptoms in persons who have been exposed to k consecutive screens, and $f_S(x \mid k)$ the probability density function of diameter x in tumors detected by the (k + 1)th screen in persons who have been exposed to k consecutive screens.

Detection through Symptoms in a Population without Screening
Suppose that we collect cancer cases of people who were diagnosed in a population during a given period [a, b]. If this period is long enough to ensure that the probability of a cancer's being included in the study is independent of the age of the tumor measured from the start of growth, the distribution of diameter among the cases detected through symptoms depends only on r(x) [Eq. (3)]. Thus, r(x) can be estimated by analyzing the distribution of diameter in tumors detected through symptoms in persons who have never been exposed to a screen.
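The link between r(x) and the observed diameters follows from reading Eq. (1) as a hazard rate in diameter rather than in time. The following is a sketch of that step, assuming that growth starts at the single-cell diameter and that sampling is unbiased with respect to tumor age as stated above. Here a denotes that starting diameter (not the start of the collection period), and the subscript D marks detection through symptoms.

```latex
\Pr\{L_D > x\} = \exp\!\Big(-\int_a^x r(u)\,du\Big), \qquad
f_D(x \mid 0) = r(x)\,\exp\!\Big(-\int_a^x r(u)\,du\Big).
```

With a constant r(x), for instance, the diameters at symptomatic detection would be exponentially distributed; a rising r(x) thins out the large diameters faster.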

Detection by Screening
Let us consider the distribution of diameter in tumors detected by the (k + 1)th screen in a periodically screened population. The probability that a tumor is detected when the diameter falls into a short range (x, x + dx) by a screen after k negative results in consecutive screens with interval d in the past can be expressed as the product of a) the probability that the tumor is not detected through symptoms until the diameter reaches x, b) the probability that the tumor is not detected by the k screens in the past, and c) the probability that the tumor of diameter x is detected by the last screening test. Let P(x,k) denote this probability, and let $x_i$ denote the diameter of the tumor at the (k - i + 1)th screen, that is, i screening intervals before detection: $x_i = x \exp(-\beta d i)$.
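Written out, the three factors a) to c) would take the following form. This is a sketch reconstructed from the verbal description, not a quotation of the original equation: the first factor is the survival function implied by the hazard r(x) of Eq. (1), a is the single-cell diameter at which growth starts, and $\beta$ is the growth parameter.

```latex
P(x,k) \;=\; \underbrace{\exp\!\Big(-\int_a^x r(u)\,du\Big)}_{\text{a) no symptoms up to } x}
\;\underbrace{\prod_{i=1}^{k}\bigl[1 - v(x_i)\bigr]}_{\text{b) missed by } k \text{ screens}}
\;\underbrace{v(x)}_{\text{c) found now}},
\qquad x_i = x\,e^{-\beta d i}.
```

For k = 0 the product is empty and equals 1, so only r(x) and v(x) remain.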
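Based on Bayes' theorem,
$$ f_S(x \mid k)\,dx = \Pr\{x \le L < x + dx \mid L_D > x,\ \text{not detected by } k \text{ screens, detected by the } (k + 1)\text{th screen}\} = \frac{P(x,k)\,h(x)\,dx}{\int_{L_{\min}}^{L_{\max}} P(u,k)\,h(u)\,du} \tag{5} $$
where L is the diameter of the tumor, $L_{\min}$ and $L_{\max}$ are the minimum and maximum values of L, and $h(x)\,dx = \Pr\{x \le L < x + dx\}$. If k = 0, i.e., in patients with no history of screening, Eq. (5) reduces to
$$ f_S(x \mid 0) = \frac{P(x,0)\,h(x)}{\int_{L_{\min}}^{L_{\max}} P(u,0)\,h(u)\,du} \tag{6} $$
in which P(x,0) involves only r(x) and v(x).

Application
In this section, the above method is applied to gastric cancer in Japan.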

Materials and Methods
Table 1 shows the data on the distributions of diameter in gastric cancers with various histories of screening. The cases of gastric cancer detected through symptoms are from the Center for Adult Disease of Fukuoka City Medical Association (CAD-FCMA). They were diagnosed as having gastric cancer during the period 1978 to 1981. The cases of gastric cancer detected by screening are from CAD-FCMA and the Health Center of Karatsu City. They were diagnosed as having gastric cancer in these institutes after indirect X-ray examinations as the screening test during the period 1969 to 1981.
Since $f_D(x \mid 0)$ is related only to r(x) as shown in Eq. (3), r(x) is directly estimable from the observed distribution (column 1 in Table 1). For r(x), we chose polynomials of varying degrees as follows:
$$ r(x) = \sum_{i=0}^{I} r_i x^i \quad (I = 1, 2, \ldots) \tag{7} $$
Since $f_S(x \mid 0)$ is related only to r(x) and v(x) as shown in Eq. (6), v(x) is estimable from the observed distribution (column 2 in Table 1) by substituting $\hat{r}(x)$ for r(x) in Eq. (6). For v(x), we arbitrarily chose the following function:
$$ v(x) = \frac{b_n x^n}{1 + b_n x^n} \quad (n = 1, 2, \ldots) $$
The growth parameter $\beta$ is estimable from the observed distribution of diameter of tumors detected by screening among people who have had k negative results in consecutive screenings in the past [Eq. (5)] by substituting $\hat{r}(x)$ and $\hat{v}(x)$ for r(x) and v(x). We chose gastric cancers detected by screening after one negative result in a screen 1 year before (column 3 in Table 1) and those detected by screening after two negative results in screens 1 and 2 years before (column 4 in Table 1). A log likelihood function is formulated by combining these two types of data.
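The combined log likelihood for the growth parameter can be sketched numerically as follows. This is an illustration under stated assumptions, not the authors' program: the density of diameter at detection by the (k + 1)th screen is taken to be proportional to (1/x) times the probability of no symptomatic detection, times the probability of being missed by the k earlier screens, times v(x). The 1/x factor is what constant incidence and exponential growth would imply for h(x). The coefficients `R0` and `R1` of a linear r(x) are hypothetical placeholders rather than the Table 2 estimates, the integration limits are arbitrary, and individual diameters are used where the paper has tabulated frequencies.

```python
import math

# Illustrative inputs. R0 and R1 are hypothetical, NOT the paper's Table 2 values.
R0, R1 = 0.064, 0.039       # coefficients of an assumed linear r(x) = R0 + R1*x
B2 = 0.323                  # reported coefficient of v(x) = B2*x^2 / (1 + B2*x^2)
D = 12.0                    # months between annual screens
X_MIN, X_MAX = 0.05, 20.0   # hypothetical diameter range in cm


def v(x):
    """Screening sensitivity at diameter x (cm)."""
    return B2 * x * x / (1.0 + B2 * x * x)


def weight(x, k, beta):
    """Unnormalized density of diameter at detection by the (k+1)th screen."""
    no_symptoms = math.exp(-(R0 * x + 0.5 * R1 * x * x))  # starting diameter taken as 0
    missed = 1.0
    for i in range(1, k + 1):
        missed *= 1.0 - v(x * math.exp(-beta * D * i))    # diameter i screens earlier
    return no_symptoms * missed * v(x) / x                # 1/x is the assumed h(x)


def integrate(f, n=4000):
    """Trapezoidal rule on [X_MIN, X_MAX]."""
    h = (X_MAX - X_MIN) / n
    total = 0.5 * (f(X_MIN) + f(X_MAX))
    for j in range(1, n):
        total += f(X_MIN + j * h)
    return total * h


def log_likelihood(beta, diameters_by_k):
    """Combined log likelihood; diameters_by_k maps k to a list of diameters (cm)."""
    total = 0.0
    for k, xs in diameters_by_k.items():
        norm = integrate(lambda x: weight(x, k, beta))
        total += sum(math.log(weight(x, k, beta) / norm) for x in xs)
    return total


def mean_diameter(k, beta):
    """Mean diameter at detection by the (k+1)th screen under the assumed model."""
    norm = integrate(lambda x: weight(x, k, beta))
    return integrate(lambda x: x * weight(x, k, beta)) / norm
```

A call such as `log_likelihood(0.239, {1: [1.5, 2.0, 3.2], 2: [1.0, 2.4]})`, with made-up diameters, evaluates the combined likelihood at one value of the growth parameter; maximizing it over `beta` is the estimation step described above. The information about growth comes from the factor for missed screens: the slower the growth, the larger the tumor was at the earlier screens, and the more strongly large diameters are depleted among cases with negative results in the past.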
The maximum likelihood estimates (MLEs) of parameters are calculated by the Newton-Raphson method, and the best fit models are chosen from among various models based on the Akaike Information Criterion (AIC) (1).
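As an illustration of this estimation step, the following sketch fits a degree-one polynomial r(x) = r0 + r1 x by Newton-Raphson and reports its AIC. It is a minimal example under assumptions that are not taken from the paper: diameters are treated as individually observed rather than tabulated, the starting diameter is taken as 0, step halving is added as a safeguard, and the data are simulated by the hypothetical helper `simulate` with arbitrarily chosen true values.

```python
import math
import random


def log_likelihood(r0, r1, xs):
    """Log likelihood of diameters xs under the hazard r(x) = r0 + r1*x.

    log f(x) = log r(x) - (r0*x + r1*x^2/2), taking the starting diameter as 0.
    """
    return sum(math.log(r0 + r1 * x) - r0 * x - 0.5 * r1 * x * x for x in xs)


def fit_linear_hazard(xs, max_iter=100, tol=1e-10):
    """Newton-Raphson with step halving; returns (r0, r1, AIC)."""
    x_max = max(xs)
    r0, r1 = len(xs) / sum(xs), 0.0            # start from the constant-hazard MLE
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0        # gradient and Hessian entries
        for x in xs:
            r = r0 + r1 * x
            g0 += 1.0 / r - x
            g1 += x / r - 0.5 * x * x
            h00 -= 1.0 / (r * r)
            h01 -= x / (r * r)
            h11 -= x * x / (r * r)
        det = h00 * h11 - h01 * h01
        d0 = -(h11 * g0 - h01 * g1) / det       # Newton direction: -H^{-1} g
        d1 = -(h00 * g1 - h01 * g0) / det
        current = log_likelihood(r0, r1, xs)
        step = 1.0
        while step > 1e-8:
            n0, n1 = r0 + step * d0, r1 + step * d1
            # keep the hazard positive on [0, x_max] and do not decrease the likelihood
            if n0 > 0 and n0 + n1 * x_max > 0 and log_likelihood(n0, n1, xs) >= current:
                break
            step *= 0.5
        else:
            break                               # no acceptable step found: stop
        r0, r1 = n0, n1
        if max(abs(step * d0), abs(step * d1)) < tol:
            break
    aic = -2.0 * log_likelihood(r0, r1, xs) + 2 * 2   # two free parameters
    return r0, r1, aic


def simulate(n, r0, r1, seed=0):
    """Hypothetical data: invert the cumulative hazard r0*x + r1*x^2/2 = E, E ~ Exp(1)."""
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        e = rng.expovariate(1.0)
        out.append((-r0 + math.sqrt(r0 * r0 + 2.0 * r1 * e)) / r1)
    return out
```

Calling `fit_linear_hazard(simulate(3000, 0.06, 0.04))` should return estimates close to the simulated values. Fitting a higher-degree polynomial in the same way and comparing the two AIC values reproduces the kind of model choice described here, with the smaller AIC preferred.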
Centimeters and months are used as the units for diameter and time, respectively.

Results
Estimation of r(x). The AICs for polynomials of degrees 1 and 2 were 599.56 and 602.32, respectively. Therefore, we chose the polynomial of degree one as the best fit model for r(x). The MLEs and estimated covariance matrix of parameters in the polynomial are shown in Table 2.
The 95% confidence intervals of $r_0$ and $r_1$ are (0.0173, 0.1113) and (0.0230, 0.0546), respectively. Figure 1 illustrates the observed distribution function and the distribution function calculated from $\hat{r}(x)$.
Estimation of v(x). To calculate the MLEs of the parameters in v(x), we first estimated the true distribution of diameter of preclinical tumors that are not yet detected at a certain point in time by using Eq. (6) and then estimated v(x) based on the fact that the difference between the true and the observed relative frequencies can be explained by the function v(x). The AICs for various models of v(x) are shown in Table 3. We therefore chose the function of power 2 (model 2) as the best fit function. The MLE and SE of $b_2$ were 0.323 and 0.110, respectively. Figure 2 illustrates the observed distribution function of diameter in gastric cancers detected by first screenings and the distribution function calculated based on the estimated r(x) and v(x). Figure 3 illustrates the estimated v(x).
Estimation of Growth Parameter. Finally, the MLE for $\beta$ was obtained by substituting $\hat{r}(x)$ and $\hat{v}(x)$ for r(x) and v(x) in Eq. (5) and by maximizing the combined log likelihood of the data on diameter of gastric cancers detected by screening with one and two negative results in the past. The MLE and SE are 0.239 and 0.158, respectively. Based on the MLE, the doubling time of diameter in gastric cancer was estimated to be 2.90 months. By assuming that the diameter of a single cancer cell is 0.01 mm, the length of time from the start of growth to the point at which the diameter reaches 5 cm was estimated to be 3.0 years.
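The reported figures can be checked with a few lines of arithmetic. The sketch below assumes that the diameter grows as a single-cell diameter multiplied by exp(growth parameter × time), with time in months and diameter in centimeters, and that the detection probability has the form given in the abstract. The variable and function names are chosen here for illustration.

```python
import math

BETA = 0.239                # reported MLE of the growth parameter (per month)
B2 = 0.323                  # reported MLE of the coefficient in v(x)
CELL_DIAMETER_CM = 0.001    # 0.01 mm, the single-cell diameter assumed in the text

# Time for the diameter to double under exponential growth.
doubling_time_months = math.log(2) / BETA

# Time to grow from one cell to a diameter of 5 cm, converted to years.
years_to_5cm = math.log(5.0 / CELL_DIAMETER_CM) / BETA / 12.0


def detection_probability(diameter_cm):
    """Estimated probability that indirect X-ray detects a tumor of this diameter."""
    return B2 * diameter_cm ** 2 / (1.0 + B2 * diameter_cm ** 2)


print(f"doubling time: {doubling_time_months:.2f} months")   # 2.90 months
print(f"time to reach 5 cm: {years_to_5cm:.1f} years")       # 3.0 years
for d in (1.0, 2.0, 5.0):
    print(f"v({d:.0f} cm) = {detection_probability(d):.2f}")
```

Both values agree with those quoted in the text, which supports reading the reported 0.239 as the exponent of diameter growth per month.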

Discussion
Mathematical theories concerning the natural history of cancer and early detection by screening have been proposed by many researchers; one of the earliest studies was published by Zelen and Feinleib (2) in 1969. One of the finest mathematical theories is given in a series of papers by Albert et al. (3), Albert et al. (4), and Louis et al. (5). They formulated the natural history of a chronic disease state in terms of the distribution of the age of a person at the time of entering the disease state, the sojourn time in that disease state, and the person's present age. The formulation was used by many authors for various studies. However, in the case of cancer, it is difficult to observe the point in time at which a cancer enters the preclinical state and even more difficult to define the preclinical state itself in a fashion that is useful for biostatistical research.
In the present study, we took into account the fact that the primary determinant of the distribution of sojourn time was the pathological variables of the growing tumor, and based on this fact, we developed a mathematical model by which the change in the pathological variables over time can be estimated. As noted above, the pathological feature of a tumor determines not only whether it manifests any clinical symptoms but also whether it can be detected by a screening test and even determines the prognosis or curability of the patient with the tumor. Thus, pathological variables seem more useful in describing the natural history of cancer than the time measure itself.
Also, the present method enables us to estimate the probability of detection by screening test as a function of the pathological variables, in contrast to the models introduced so far by other researchers (6,7), in which they assume that the probability has the same value throughout the entire preclinical period. The ability of a screening test to detect a cancer of a certain size cannot be measured directly from screening data, since false negative cases are not accessible. The method used to calculate the false negative rate is usually to review the past X-ray films of individuals in whom cancers were later found. However, this method may give biased results, since the probability of detection by screening is related not only to the ability of the test used but also to the pathological features of the tumor, and this method cannot take into account the latter factor.
In the present study, we assumed that all cancers of a certain site have the same growth rate. This assumption may be replaced by a weaker assumption that the growth rate differs in individual cases following a certain distribution family such as the log-normal distribution. Although the results are not presented in this paper, a simulation we conducted to compare the fixed growth rate model with a random growth rate model, in which $\beta$ is log-normally distributed, showed good agreement for various values of $\beta$ and its standard deviation.
The data to which we applied our method were not obtained from a population to which a screening program was randomly assigned. If the method is applied to data obtained through a randomized controlled trial, the validity of results will be increased. However, it should be noted that an analysis of existing data based on the method presented in this study can provide important information on the natural history of cancer fairly efficiently if the assumptions made on the data and model are plausible from a biological viewpoint.