Robust Characterization of Multidimensional Scaling Relations between Size Measures for Business Firms

Although the sizes of business firms have been a subject of intensive research, the definition of a “size” of a firm remains unclear. In this study, we empirically characterize in detail the scaling relations between size measures of business firms, analyzing them based on allometric scaling. Using a large dataset of Japanese firms that tracked approximately one million firms annually for two decades (1994–2015), we examined up to the trivariate relations between corporate size measures: annual sales, capital stock, total assets, and numbers of employees and trading partners. The data were examined using a multivariate generalization of a previously proposed method for analyzing bivariate scalings. We found that relations between measures other than the capital stock are marked by allometric scaling relations. Power–law exponents for scalings and distributions of multiple firm size measures were mostly robust throughout the years but had fluctuations that appeared to correlate with national economic conditions. We established theoretical relations between the exponents. We expect these results to allow direct estimation of the effects of using alternative size measures of business firms in regression analyses, to facilitate the modeling of firms, and to enhance the current theoretical understanding of complex systems.


Introduction
An index of the size of a business firm has numerous implications, ranging from its method of corporate finance [1,2] and the quality of its CEO [3] to employee job satisfaction [4] and gender gaps in wages among employees [5]. The exact determinants of the size of a specific firm or a group of firms in an industry have been sought both empirically [6,7] and theoretically [8,9]. Researchers have also studied regularities in the distribution of firm size empirically [10][11][12][13][14][15], mainly motivated by the theory that the evolution of firm size can be modelled as a stochastic process [10,[16][17][18].
However, the question of what constitutes the "size" of a firm has rarely been addressed, and it is possible that statistical results regarding firm size depend on the measures used. Indeed, many of the effects previously found in empirical studies of corporate finance were not robust when different size measures were used as a control variable [19]. Researchers, who questioned the often implicit assumption that size measurements of firms are adequate indicators of their actual size [19][20][21], evidently agree that the relations of various size measures commonly used have not been well-explored in the literature. Moreover, some [20,21] propose the notion of multidimensional size for business firms, using various size measures to indicate manifold aspects. Although it might clarify the conceptual status of firm size, differences and relations between firm size measures should be explored empirically for the notion of multidimensional size to be fruitful.
Regarding the relations between corporate size measures, power-law scaling relations have been reported empirically [12,19,[22][23][24]. More interestingly, some size measures have been reported to demonstrate allometric scaling [23], which means that the distribution of relative deviations from the average relation between two size measures was invariant regardless of firm size. In other words, when the logarithms of a size measure were regressed against the logarithms of another measure, the residuals were found to be surprisingly homoscedastic and independent of the "explanatory" variable; note that the use of formulae for regression analysis is solely intended for the characterization of multivariate joint distributions and does not imply causal relations between the size measures. This suggests that we may be able to employ the residual as a variable indicating the adjusted "proportion" between size measures of a firm, allowing us to isolate the effects of using alternative size measures. Nevertheless, the size measures considered in previous studies are not intended to represent all possibilities. It is unknown as to what extent allometric scaling accurately describes empirical relations regarding other size measures. Theories regarding relations between multiple allometric scalings have been lacking as well. Therefore, empirical characterization and theoretical understanding of various size measures are still far from complete.
Analysis of the empirical relations between size measures of firms also has merit for advancing complex systems science, and a few studies [23,24] have indeed been pursued from such a perspective. Power laws of the form x ∝ y γ are a typical functional form of scaling found in general complex systems, such as animal bodies [25][26][27][28], ecological communities [29,30], and cities [31,32]. For instance, the value of exponent γ in the scaling of the metabolic rate (x) on the body mass (y) is close to 3/4 in mammals [25,27,28] and is theoretically related to the minimization of the energy consumption in blood pumping [33]. Theoretical considerations have successfully been used to predict numerous other scaling exponents in the natural systems of animal bodies. Similarly, the unique value of γ = 2 for humans (compared with γ = 3 for other animals) in the scaling of the body mass (x) against the body length (y) was theoretically accounted for by human bipedalism [34]. In this manner, studies on scaling relations provide a basis for a deeper understanding of a system's governing principles.
The present study is focused on multivariate relations among five quantities of firms that have often been used as size measures in research and practice to characterize scaling relations exhibited by firms across scales. We based our study on exhaustive annual data for more than 2 million unique Japanese firms over a 22-year period. After examining the properties of individual size measures, we empirically verified statistical models of up to three-dimensional allometric scaling for four of the five size measures, while establishing theoretical relations between multiple scalings. Building upon these results, we further developed a method for estimating scaling exponent γ as well as the range where the data fit well to the scaling relations, thereby obtaining insights into the temporal robustness of our results.
The remainder of this paper is organized as follows. First, we describe our selection of firm size measures and the employed data in Section 2. After the dataset and statistical properties of each size measure are characterized in Section 3.1, we inspect the bivariate scalings between the size measures and specify the pairs of variables that are in an allometric scaling in Section 3.2. We then address the problems of seeming inconsistency between the power-law exponents for the tails of marginal and conditional distributions of size measures with regard to bivariate scalings (Section 3.3). Next, we examine a trivariate scaling relation in Section 3.4. Trivariate allometric scalings were found for some combinations of size measures and shown to be related to the transitivity of allometric scalings both theoretically and empirically. This theoretical development allows robust estimation of bivariate and trivariate scaling exponents (Section 3.5). We found that the number of employees has a higher relevance to sales than does the number of trading partners, and there was a mild variation in the trivariate scaling exponents in the period 1994-2015, which appeared to be correlated to the gross domestic product (GDP) variation (Section 3.6).
Finally, we discuss our results and conclude the paper in Section 4. Our rationales for using the standard regression analysis are given in Appendix A, and problems related to the theoretical relations between several power-law exponents that appear in scalings, conditional distributions, and marginal distributions are discussed in detail in Appendix B.

Selection of Firm Size Measures
Despite the recurrent claim that the concept of firm size has not been sufficiently clarified [19][20][21], the variety of size measures employed in studies has remained mostly unchanged during the past half century. For example, in an economic study in 1975, the authors compared different firm size measures using the data of sales, total or net assets, employment, invested capital, and market value [22]. Almost contemporarily, a review of organizational studies in 1976 listed the number of employees as well as the number of clients, sales, and net assets as commonly used size measures and proposed physical capacities, such as the square footage (area) available for an organization's activities, as a possible measure [20]. Recent research on corporate finance mentions total assets, sales, market value, and the number of employees [19]. Meanwhile, the econophysics school of research has used sales, number of employees, total assets, net sales, net income, and number of trading partners (i.e., the reported number of firms that a firm directly trades with) [12,23,35]. Finally, as small and medium enterprises are legally defined according to their capital stock in Japan [36], capital stock data are widely available for Japanese firms, which is the subject of the database used in our study.
We considered only the firm size measures that did not take negative values and were readily measured for almost all firms. The former criterion concerns the very notion of size in broader contexts. For instance, when the size of an animal body is considered, it is non-negative whether it is measured by weight, length, or how many calories per day it takes for the organism to live. We excluded net assets, net sales, and net income based on this criterion, as these measures can take negative values. Other implications of the concept of a size measure include positive correlations to other size measures-e.g., the weight of a human body is usually positively correlated to the height-but this was not incorporated into our criteria due to the empirical nature of the question. We set the latter criterion pragmatically, allowing us to preserve the comprehensiveness of our original data and to minimize the sampling bias. Market value was rejected with this criterion because data were unavailable for firms that were not publicly traded, which comprised the vast majority by number. Thus, the measures of corporate size that were referred to in the literature and that met our criteria were sales, total assets, invested capital, capital stock, and the numbers of employees and trading partners. We omitted invested capital from our analysis because of its partial conceptual overlap with capital stock, relative scarcity of the data, and its rare appearance in the literature as a size measure.

Data and Preprocessing
The source of the data used in this study was a collection of brief descriptions of Japanese business firms, including the financial status, type of business, and physical location, and this was provided and maintained by a major Japanese private credit reporter, Teikoku Databank, Ltd., Japan (hereinafter referred to as TDB). The data were mostly based on annual questionnaires presented to Japanese business firms.
The data employed in this study were primarily from the electronic version of the COSMOS 2 database, which is updated every January by TDB. This database describes over two million business firms in Japan and includes two sub-databases: (1) "summaries" of firm profiles, including the sales, capital stock, number of employees, and business type category (approximately one million per year, from 1980) and (2) lists of several million trading relationships between firms for each year since 1993. The profile of the trading relationships included the direction of money flows, i.e., which firm (buyer) paid which (supplier). Our study was focused on the data after 1993, in which the COSMOS 2 database started to capture the trading relations between firms. We excluded the 1993 data to avoid boundary effects that may have been present at the beginning of the data collection. The data for sales, number of employees, and trading relations that we employed were last updated in January 2017, whereas the data for the capital stock were compiled in January 2020. This confined our analysis to the period ending in 2015 because the 2016 sales were not available for a substantial portion of the firms. Data for the total assets were from the COSMOS 1 database by TDB, in which financial statements of firms were included. This database was last updated in May 2020, and data since 2000 were available to us. Each firm has a unique anonymous ID that allows the firm to be identified in different databases.
To ensure time coherence between different types of data and to make the included firms sufficiently homogeneous for the purpose of our analyses, we performed several data-compilation steps, as follows. First, we excluded the firms that were categorized as governmental (e.g., local governments) or financial (e.g., banks and insurers). This is because the definition of sales for these "firms" differed significantly from that for construction, manufacturing, and wholesale firms, which constituted the majority of the firms in the data. In this step, we filtered out possible outliers in the database. This procedure was applied to both databases. Second, we did not use sales and capital stock data from financial statements published more than 8 years before data entry or without an adequate timestamp. Third, we set the sales to be unknown when the end of the fiscal year had changed because some such data were apparently not the annual sales (the value was sometimes considerably lower than those in previous or subsequent periods), whereas we accepted the data of capital stock and total assets in such cases if more than one instance of inconsistent data did not appear for the same year. Fourth, we determined the year to which a datum of sales, capital stock, or total assets was assigned according to the year in which the fiscal term ended. For example, when a firm had a fiscal year that started in April 2000 and ended in March 2001 (as for the majority of Japanese firms), the sales value of the fiscal term was considered to belong to the year 2001, regardless of whether the data appeared in the database in 2002 or later. In contrast, this principle was not applied to the employee or trading data; we always assigned them to the year prior to the data entry, because the COSMOS 2 database was updated in January of each year.
We directly used the raw data without normalization or adjustment for inflation. The number of trading partners of a firm was determined by counting the trading relations in which the firm participated throughout the year, regardless of whether the firm was a supplier or a buyer. This amounted to computing the sum of the in-degree and out-degree for each node in the directed network of trading.
The data for firm exits were additionally compiled to estimate the number of firm entries and exits. First, if any data of a firm existed in the COSMOS 2 database for a year, the firm was considered to exist in that year. Additionally, when the data of a firm were unavailable for up to two consecutive years, the firm was considered to exist in these years, assuming that its absence from the data was accidental. Before the compilation of the exit data, the first and fourth steps of data exclusion described earlier for the compilation of quantitative data were applied to ensure consistency between the datasets.
When multiple variables were used in the analyses, we excluded the data of firms that lacked any one of the variables. Results for the data amount are presented in Section 3.1 and tabulated in Table S1.

Characterization of Databases and Size Measures
The amounts of data before and after the compilation are plotted in Figure 1 with respect to the years in the 1994-2015 period. The amount of data generally increased. The number of firms with complete data for multiple size measures in the COSMOS 2 database (i.e., number of trading partners k, number of employees , and annual sales s) was more than 0.8 million per year in 1999 and later, and the data for total assets a were more than 0.2 million per year in 2004 and later, as shown in Figure 1a, after the filtering procedure described in Section 2.2. The number of firms considered to be existing based on the COSMOS 2 database was usually larger than the number of firms in the sub-database of sales and employees, as existence was assumed for some firms that only appeared in the trading data or those with temporarily missing data. Two sudden increases in the number of trading data (2007-2008 and 2010-2011) are observed in Figure 1b. The second jump was due to trading data involving financial or governmental organizations, as we observed no jump in the same period after the filtering. In contrast, the first jump was the consequence of revised data-collection methods that were specific to trading-relations data; trading relationships that had been mentioned only in on-demand reports started to be included in the database in 2008. Indeed, there was no comparable jump in the number of "summary" profile data. Additionally, the number of exiting or disappearing firms was stable compared with the number of entering firms, as shown in Figure 1c, implying that the former was less affected by the fluctuation of the data-collection efforts by TDB. All the size variables were distributed in fat-tailed manners on the right side, as shown in Figure 2. Their distribution functions for every year are plotted in the figure, with the color gradient from blue through black to red, indicating the direction from older data to newer ones. The distributions were fairly stable despite the increase in data and, in particular, the exponents of the power-law tails were evidently invariant when the tails could be approximated by a power-law distribution. They were -2.2 for k (the number of trading partners), -2.2 for (employee number), -2.0 for s (annual sales in million yen), and -1.85 for a (total assets in thousand yen). The exponent of the sales distribution was consistent with that in previous studies [13][14][15]. The distribution of capital stock (hereafter denoted by p) was also fairly fat-tailed, although it did not seem to be characterized by a power law because the curve of the estimated density function was convex, as seen in Figure 2e. Consequently, the arithmetic mean of s and a did not indicate the representative value, and the standard deviation of all the variables was not a robust index of the width of the distribution (Table S1). We alternatively indicated the representative value and width of distribution by the median and the interquartile range, respectively, as shown in the table. However, these values were only representative of firms of middle-range sizes, not the ones at the tails, which were characterized by power-law distributions and scaling trelations.
The capital stock was unique because the right tail did not exhibit a power law and there were multiple modes near the median value. A manual inspection of the data revealed that these multiple modes were the result of preference for rounded values such as 10,000. This phenomenon was also notable in other published data [36].

Bivariate Scalings between Size Measures
The dependencies between the size measures of business firms were characterized in terms of bivariate "allometric" scaling relations in a previous study [23], and we visualized them using the method described in the study. The data in the previous study were gathered by a credit reporting company independent of TDB. However, we expected the results to be similar between the two datasets, as the data were collected for the same system (Japanese firms).
Bivariate allometric scaling is denoted by x ∝ y γ , where γ represents the exponent, and x and y are two different size measures. We first consider the three size measures studied previously: the number of trading partners (k), number of employees ( ), and annual sales in millions of yen (s). As described in the previous study [23], the concept of allometric scaling relations incorporates scale-invariant relative fluctuations and is defined in terms of conditional distributions: where P(x|y) represents the probability density function of variable x conditional on a specific y-value; γ 1 , γ 2 , and γ 3 represent the scaling exponents; P x|y represents a scaling function for variable x conditional on the y-value, which indicates the distribution of the fluctuation ratio around the scaling line, x ∝ y γ .  Table S1). To avoid extreme values, which are usually observed in variables distributed in such a manner, we log-transformed the raw size data so that the variables were exponentially distributed. After the logtransformation, Equations (1)-(3) were more intuitive linear regression-type formulae, as discussed in detail in Appendix A: Here, the error terms, ε l|k , ε s|k , and ε s|l , can be regarded as stochastic variables that correspond to P l|k , P s|k , and P s|l , respectively. The mean of these "error terms" is generally not equal to zero, and the intercept term (as it is called in regression analysis) is included among them. These bivariate scaling laws, presented in a previous work [23], can be naturally interpreted as projections of a single three-dimensional scaling line onto two-dimensional planes, as illustrated in Figure 3a, around which the firms are densely distributed [37]. A similar concept of multivariate scaling was used in a study on the morphology of organisms [38]. To test the validity of these relations (Equations (1)-(6)) in our data, we examined both the conditional quantiles and the distributions of the error terms, where the conditional quantile was defined as follows: for any given number, q, in the range of (0, 1), the q-quantile of x conditional on y, which is denoted as x|y q , satisfies the following equation: If Equations (1-6) hold, all the quantiles of the same q-value conform to the power-law relation, x|y q ∝ y γ , and the curves of x|y q plotted against y for different q-values should collapse into a single curve when vertically shifted. Furthermore, the distributions of x/ x|y 0.5 conditional on y should be identical for all y-values and amount to the distribution specified by P x|y with normalization by its median. As shown in Figure 4a-c, the plots of log x|y q vs log y indicated linear relations for a wide range of y, indicating that the power-law relation, x|y q ∝ y γ (abbreviated as x ∝ y γ ), holds for all combinations of k, , and s. In Figure 4d-f, the distribution of relative fluctuation from the conditional median (x/ x|y 0.5 ) is plotted for eight intervals of equal length on the logarithmic scale of y. All the curves collapse into a single "scaling function" for all three cases, validating the scaling relations of Equations (1)-(3).
We next examined the total assets, a, with regard to k, , and s. The relation between k and the quantiles of a ( a|k q ) was not clearly linear, being slightly curved in a logarithmic plot, as shown in Figure 5a; however, the distributions of relative fluctuations from the conditional median (a/ a|k 0.5 ) were almost independent of k ( Figure 5d). More obvious allometric scaling can be seen for -a (Figure 5b,e) and s-a (Figure 5c,f) relations. The total assets seemed to scale superlinearly with (γ > 1) and linearly with s (γ = 1). The power-law scaling exponents of a with regard to and s are hereafter denoted by γ 4 and γ 5 , respectively. Note that the distributions of relative fluctuations of a (i.e., a/ a|s 0.5 ) were wider for small values of s (blue curves) than for large values (red curves), as seen in Figure 5f. We also noted an increase in a slightly larger than a linear function of s for sales values more than 10 5 . Therefore, allometric scaling most precisely approximated the s-a relation for the range of s between 10 2 and 10 5 . We conclude that total assets scaled allometrically with and s, and the relation between a and k did not conform to the allometric scaling.
The capital stock, p, was not found to be in an allometric scaling with any of k, , s, or a, as shown in Figure 6. The vertically shifted curves of conditional quantiles did not agree for k-p, -p, and s-p relations (see Figure 6a-c), indicating that the forms of conditional distributions of p differed substantially as the values of k, , or s varied. Whereas the a-p relation appeared to be close to allometric in the plot of conditional quantiles, as shown in Figure 6d, the conditional relative fluctuations (p/ p|a 0.5 ) were not constant over the range of a (Figure 6e). Quantiles other than those of q = 0.5 (i.e., medians) are plotted with horizontal shift, so that all the curves pass through a point whose x-axis is slightly above 100. (d-f) Probability distributions (PDFs) of or s conditional on k or (i.e., P(l|k), P(s|k), and P(s|l) ) for 8 different "bins" (i.e., mutually exclusive intervals) of the "explanatory" variable (k or ), normalized by their conditional medians, plotted on a log-log scale. The bins are obtained by evenly dividing the entire range (from the minimal value of unity to the maximum value of the variable shown in Table S1) into eight segments on the logarithmic scale. The color gradient of blue to red indicates low to high values of the explanatory variable. (d-f) Probability distributions (PDFs) of a conditional on k, , or s (i.e., P(a|k), P(a| ), and P(a|s), respectively) for 8 different "bins" of the "explanatory" variable (k, , or s), normalized by their conditional medians, plotted on a log-log scale. See the caption of Figure 4 for details on plotting.

Explaining Puzzling Scaling Exponents
Next, we studied the open question of the different power-law exponents in the conditional and marginal sales distributions, which has not been addressed previously. This is related to the asymmetric levels of relevance of the numbers of employees ( ) and trading partners (k) to the firm sales (s), as described in the following section.
It is well established that the distribution of annual sales s of firms roughly follows Zipf's law [13][14][15]; i.e., the probability density tail follows a power law, P(s) ∝ s -2 . This was also observed in our data (see Figure 2c). However, for the conditional sales distributions, the power-law exponents increased to approximately 2.4 (for the number of trading partners (k); see Figure 4e) and approximately 2.7 (for the number of employees ( ); see Figure 4f), substantially varying from 2.0 (see Figure 7).  (12)), respectively, plotted on log-log scales. The entire range of k and is divided into eight levels corresponding to eight curves, so that each interval has an identical range on the logarithmic scale. The weight of an interval is defined as the average probability density in the interval. All the data used in the figures are for 2014.
These seemingly contradictory results are explained by Bayes' theorem. Considering number of employees and annual sales s, we can approximate the integral of P(s| )P( ) with respect to using the maximum values of P(s| )P( ): where lead = argmax P(s| )P( ), such that P(s| lead )P( lead ) is the "leading order" contribution, and ∆ lead is the width of at lead , which is assumed to be a constant. In Figure 5a, the functional form of P(s| )P( ) is plotted for several typical values of based on real data. As shown, the envelope function of P(s| )P( ) followed a power law with an exponent close to -2.0 at its tail. Similar results were obtained for the number of trading partners (k), as shown in Figure 5b. A more rigorous derivation is presented in Appendix B.
Because the scaling relations between the size variables, i.e., ∝k 1.0 , s∝k 1.2 , and s∝ 1.2 (see the following sections for the determination of these exponents), suggest symmetric relevance of k and to annual sales s, it is surprising that different exponents of the powerlaw tails for different conditional distributions (shown in Figure 5a,b) suggest strong asymmetry between the levels of relevance of k and to s. We discuss this novel feature in more detail in the next section in connection with a trivariate scaling relation.

Multivariate Scaling with Scale-Invariant Fluctuation
So far, we have studied bivariate relations between size measures of business firms. In spite of the intuitive picture of triple bivariate scalings as the two-dimensional projections of a single three-dimensional scaling line (Figure 3b), the transitivity of allometric scalings is not theoretically guaranteed; for example, ∝ k γ 1 and s ∝ γ 3 does not necessarily mean k ∝ s γ 1 γ 3 . Although the transitivity might be justified if the distribution of s is constant for an -value regardless of the k-value, such an assumption directly contradicts the previous results [23]. Therefore, trivariate relations should be considered to understand the theoretical basis of the transitivity of allometric scaling relations in the empirical datasets.
We first inspected the scaling of annual sales s with regard to the numbers of trading partners and employees, k and . Our analysis assesses the relative relevance of the number of trading partners (k) and the number of employees ( ) to the prediction of annual sales s. We generalize the allometric scaling relations to a multivariate relation, as follows: or P(s|k, ) = P s|k, s/k α β /k α β , where α and β are the scaling exponents indicating the relative relevance of k and , respectively; ε s|k, is a stochastic fluctuation term of log s conditional on both k and ; P(s|k, ) represents the conditional probability density, which is dependent on both k and ; P s|k, represents the scaling function. This multivariate scaling relation was proposed in [23] but not confirmed directly by real data. Assuming that Equation (7) is satisfied, the median value can be straightforwardly derived: where s|k, 0.5 represents the median value of s conditional on a specific set of k and values. If Equation (9) holds, the contour plots of conditional median sales s|k, 0.5 on the klogarithmic coordinate plane should exhibit nearly regular and parallel contours. Because ε s|k, 0.5 in Equation (9) is a constant, the multivariate scaling relation can be represented by a plane, as shown in Figure 3a. Importantly, this differs from the "scaling line" [37], which is implied by the three scaling relations for the pairs of variables (Figure 3b). Although the relation is not a perfect plane but a surface because it is curved in the high-k and low-regions, as shown in Figure 8a, the data support the scaling of conditional median sales against the k and values (Equation (9)), particularly for medium or large values. Moreover, the statistical fluctuations around the median value are clearly independent of k and (Figure 8b). These facts indicate that the assumption of scaling represented by Equation (8) is valid for most k and values. When the distribution of scaled sales s/k α β conditional on k and was plotted using the values of α and β estimated according to the data (Figure 8c), a remarkable fraction of the curves scaled with each other. Additionally, function P s|k, was surprisingly stable across the years (Figure 8d). Thus, the scaling assumptions of Equations (7) and (8) are well supported by a large amount of available data. The scaling exponent for k (α) was smaller than the scaling exponent for (β), as indicated by the moderate slopes of the contours compared with -1.0 in Figure 8a. We quantitatively verify this statement in the following section. This finding can be linked to the asymmetricity of k and , as discussed in the previous section. In view of the magnitude of errors or fluctuations around the scaling relations, the distribution of residuals (ε s|k and ε s| , as defined in Equations (5) and (6)) was more fat-tailed when s was regressed against k ( P s|k ( s) ∝ s −2.4 ; see Figure 4e) than when s was regressed against ( P s| ( s) ∝ s −2.7 ; see Figure 4f). This difference in the tails of the error distributions implies that the number of trading partners (k) was less useful in predicting the sales value than the number of employees ( ), which was indeed the case.
The trivariate allometric scaling, s∝k α β , along with ∝ k γ 1 and a power-law tail of the distribution of k, is sufficient to derive s ∝ γ 3 and k ∝ s γ 1 γ 3 asymptotically under fairly mild assumptions (see Appendix B). In other words, the transitivity of bivariate scaling relations depends on whether the trivariate allometric scaling holds among the variables. We compared the triplets of (k, , a) and ( , s, a) to exemplify this point. Note that the transitivity held among the latter but not among the former, because allometric scaling holds between k and and between and a, but not between k and a. We expected that the tri-variate allometric scaling would hold only for the latter. The results for these combinations of variables that are analogous to Figure 8b are plotted in Figure 9a-d. Although the deviations of a from the conditional median were independent of k and (panel (b)), the contour curves representing the conditional median of a were notably curved against k and (panel (a)). In contrast, contours for a conditional on and s were straight, parallel to each other, and evenly spaced (panel (c)), and most of the deviations might be approximated by a single function (panel (d)). Therefore, the trivariate scaling of a with regard to and s can be approximated by the following equation: log a = α log + β log s + ε a| ,s .
However, the trivariate scaling among k, , and a cannot be described in an analogous way as expected. Finally, we inspected the trivariate scaling of the capital stock, p, against other size measures. The results for the three variables ( , s, p) are shown in Figure 9e,f. Both figures indicate clearly that these three variables were not in an allometric scaling, as we can expect from the non-allometric bivariate scaling property of p discussed in Section 3.2.

Robust Estimation of Scaling Exponents
Using the trivariate scaling with a scale-invariant relative fluctuation, we can derive several mathematical relations between the scaling exponents (see Appendix B). Here, we show that these relations allow the asymptotic scaling exponents to be estimated robustly against scaling for small firms that does not conform to the power law.
By assuming the trivariate scaling formulated by Equation (7), bivariate scaling of a similar form between k and , and power-law tails of several conditional and marginal distributions, we can determine the joint probability distribution of k, , and s. Thus, one can obtain a set of relations between numerous scaling exponents. Central to these are and where γ 1 , γ 2 , and γ 3 represent the k-, k-s, and -s scaling exponents, respectively (see Appendix B for the derivations). These theoretical equations are useful for estimating the exponents, as determining the threshold size over which the behaviors of firms can be approximated as asymptotic is critical to such an estimation. The aforementioned relations are satisfied only when the data with k < 100 or < 100 are excluded (see below), indicating that firms do not follow exactly the same power-law scaling below and above the threshold. We observed deviations from the scaling relations, particularly when the number of trading partners (k) and the number of employees ( ) were less than 10 or the annual sales (s) were less than 100 (Figure 4a-c, Figures 5a-c and 8a; also see [37]). Because smaller firms dominated the data (Figure 2), their deviation from the scaling relations substantially affected the estimation of the scaling exponents. When all the available data were included in the regression analyses, there was a clear difference between the expected value of γ 3 (dashed purple line) based on Equation (11) and the value obtained via direct estimation (solid purple line), as shown in Figure 10a. Estimated exponents γ 1 in ∝ k γ 1 (green), γ 2 in s ∝ k γ 2 (blue), γ 3 in s ∝ γ 3 (purple), and α (black) and β (red) in s ∝ k α l β are plotted. Additionally, the expected values of γ 2 and γ 3 derived mathematically from α, β, and γ 1 in the case of perfect scaling (Equations (10) and (11)) are juxtaposed (blue and purple dashed lines, respectively). (a) All the available data are used to estimate the exponents. (b) Only the data for which the "explanatory" variable is >10 are used. (c) Only the data for which the "explanatory" variable is >100 are used. (d) Plot of the size of samples from which the exponents are estimated against the year. The black, brown, and red points indicate the thresholds of 0 (no exclusion), 10, and 100, corresponding to panels (a-c), respectively. The hollow ( ) and filled (•) circles and upward ( ) and downward ( ) triangles represent the scaling of vs. k, s vs. k, s vs. , and s vs. k and , respectively.
We performed a standard regression analysis using R (ver. 3.1.2) to estimate the scaling exponents in the bivariate and multivariate scaling relations of the firm data. See Appendix A for the rationales for applying the regression analysis to the problem of estimating scaling exponents of power-law relations with scale-invariant relative fluctuations. To reject the data of small firms that did not fit the "linear" assumption of the model, we excluded the data with low values of "explanatory" variables in the regression formulae by applying a threshold. We used two values for possible thresholds: 10 and 100. The threshold was determined according to the consistency of the resulting multivariate scaling exponents with the bivariate ones. If the "explanatory" variables on the right-hand sides of Equations (4)-(7) were below the threshold, the datum was neglected. Although a considerable proportion of the data might be excluded from the analysis, this process ensured that the final set of exponents conformed to the model assumptions.
The results of regression of s against k and using thresholds of 10 and 100 are presented in Figure 10b,c, respectively. As shown, γ 1 nearly reached 1.0, and the direct and indirect estimates of γ 3 agreed only when the threshold was set as 100. This suggests that the threshold should be ≥100 for k and . We preferred to employ a smaller threshold value owing to the large sample size; thus, we adopted the value of 100. The sample sizes, before and after the threshold was applied, are shown in Figure 10d. After the threshold of 100 was applied, as shown in Figure 10c, the estimated γ 1 value was between 0.94 and 1.06, and the estimated γ 2 and γ 3 values were both between 1.14 and 1.25. We estimated the values of these exponents (averaged for the entire period) as γ 1 ≈ 1.0, γ 2 ≈ 1.2, and γ 3 ≈ 1.2. However, the slow and systematic fluctuations of the estimated α (black line) and β (red line) values, i.e., the trivariate scaling exponents defined in Equation (7), were noticeable.
Despite our use of linear regression, estimation of the uncertainty requires a nonparametric method, because the "error terms" are not normally distributed even with their homoscedasticity. Thus, we used the bootstrap method [39] to determine the confidence intervals (CIs). Resampling was performed 10,000 times, and the resampling size was identical to the sample size. The 95% CIs were determined as the 2.5-and 97.5-percentiles of the bootstrap distribution.
Considering the uncertainty of the estimations, our finding of β > α was true for all the years, as shown in Figure 12a, where the estimated scaling exponents for different years are plotted with the uncertainties based on the bootstrap method. For example, the results for 2014 were α = 0.49 (95% CI (0.455, 0.531)) and β = 0.72 (95% CI (0.697, 0.744)). Thus, we conclude that the number of employees ( ) has a higher relevance to sales s than the number of trading partners (k).
Although the changes were not large and inequality α < β was invariably satisfied, exponents α and β became smaller and larger, respectively, in the 2000-2005 period compared with the 2013-2015 period. The difference was "significant" in the sense that the 95% CIs did not overlap. Additionally, α and β were negatively correlated, which we expected from Equation (10) and the relatively robust γ 1 and γ 2 (Figure 10c).
We then applied the aforementioned procedure to the trivariate scaling of total assets a with and s. The results are shown in Figure 11. In contrast to the scaling of s with k and , there was no obvious improvement in the estimation of α and β by the use of thresholding. This should be related to better fitting of the trivariate data of ( , s, a) to the scaling surface hypothesis (see Figure 3a) compared to the (k, , s) data, as seen in Figures 8a and 9c. The degree of the time variation of α and β was noticeably smaller in the ( , s, a) data than in the (k, , s) data. The estimated α (exponent for ) and β (exponent for s) values were between 0.17 and 0.24 and between 0.82 and 0.92, respectively, in all years and for all threshold values that we applied. In contrast, the bivariate scaling exponents were estimated without data with the sor a-value less than 100 to exclude the data that did not conform to the power law, as seen in Figure 5b,c. Therefore, the final values of estimated γ 4 (for -a scaling) and γ 5 (for s-a scaling) were between 1.16 and 1.19 and between 1.01 and 1.04, respectively, as shown in Figure 11c.  (10) and (11)) are juxtaposed (blue and purple dashed lines, respectively). (a) All the available data are used to estimate the exponents. (b) Only the data for which the "explanatory" variable is >10 are used. (c) Only the data for which the "explanatory" variable is >100 are used. (d) Plot of the size of samples from which the exponents are estimated against the year. The black, brown, and red points indicate the thresholds of 0 (no exclusion), 10, and 100, corresponding to panels (a-c), respectively. The hollow ( ) and filled (•) circles and upward ( ) and downward ( ) triangles represent the scaling of s vs. , a vs. , a vs. s, and a vs. and s, respectively.

Scaling Exponents Related to National Economic Conditions
As indicated in the previous section, there were significant changes (i.e., changes beyond the CI) in α and β during the period 1994-2015, although the source of this fluctuation remains to be explored. Here, we only test the hypothesis that national macroeconomic conditions are related to the changes in these scaling exponents.
Interestingly, α and β appeared counter-and pro-cyclical, respectively; i.e., they were apparently positively and negatively correlated to the nominal GDP of the country [40], respectively, as shown in Figure 12a. The GDP was selected here because its fluctuation cycle is longer than those of other economic indices, such as the Indexes of Business Conditions reported by the Cabinet Office of the Government of Japan [41]. A closer inspection revealed that α was enlarged when the nominal GDP decreased (Figure 12b) and that β decreased almost simultaneously with the GDP, whereas its increase was delayed with respect to the GDP expansion (Figure 12c). We calculated the cross-correlation between the national GDP and the scaling exponents to evaluate the delay of the effect of the former on the latter. The normalized cross-correlation, CC τ [x, y], was defined by the Pearson's correlation coefficient applied to lagged time-series data x(t) and y(t + τ) defined for discrete time t 0 ≤ t(∈ Z) ≤ t end : where T τ ≡ {t ∈ Z|t 0 + τ ≤ t ≤ t end + τ}, N represents the length of the time series (the number of elements in truncated set T τ ), and As shown in Figure 12d, we plotted the cross correlations, CC τ [GDP,α] and CC τ GDP,β , with respect to τ (the time lag in years) to quantify the delay of the changes in the estimated α and β values with respect to the GDP. Both the cross correlations peaked at τ = 1 (negatively and positively, respectively), with Pearson correlation coefficients as large as -0.59 and 0.58. The second-highest peak was present in both cross-correlation series, indicating that the extent of the delay in the exponent changes differed between the increasing GDP and the decreasing GDP. The reversal of the sign at a τ value far from 0 was an artefact of the time window, which covered only slightly more than one economic cycle. The foregoing results suggest that the exponents can be predicted using the GDP of the preceding year. However, we could not eliminate the possibility that the correlations were mere coincidences, as the dataset covered only slightly more than one business cycle. Considering that the cycle was approximately 20 years long, an additional decade of data or a dataset from another country may be needed to verify this trend.
We suspect that the relative importance of inter-firm trades in predicting sales is mechanistically affected by the GDP. It is reasonable that in a recovering economy, selling goods produced with labor is relatively easy, and increasing the number of employees is advantageous for expanding production, which leads to an increase in β. In contrast, in an economic depression, the trading partners to which the firm can sell products may become more crucial for maintaining its sales; in such a situation, weight may be added to α. However, this statement is not supported by the present study and is yet to be verified.

Discussion
We analyzed the multivariate scaling relations between firm size measures. First, the marginal distributions of k, , s, a, and p were examined, and the former four were found to have a power-law tail while the right tail of the latter one did not exhibit a clear power law. Second, we examined the bivariate relations between the size measures to uncover that relations for pairs of k, , s, and a could all be approximated by an allometric scaling except the k-a relation, while the capital stock, p, was never in an allometric scaling with other size measures. Based on the empirically verified model of allometric scaling, seemingly inconsistent power-law exponents for the right tails of conditional and marginal distributions were theoretically explained. Furthermore, theoretical considerations clarified that the trivariate allometric scaling is related to the transitivity of bivariate allometric scalings. This was exemplified by the comparison of trivariate data of (k, , s) and ( , s, a), which conformed to trivariate allometric scaling, with the data of (k, , a) which did not fit the allometric scaling model. Building on the theoretical development, a method was proposed and applied to the data to estimate the scaling exponents for each year. We observed only small variations when data from different years were compared. The fluctuations of the scaling exponents α and β for the data of (k, , s) during a 22-year period apparently followed the country's GDP.
We argue that our results might be highly valuable in making the existing research results more robust and clearer. Some reported results regarding corporate finance were not robust when different size measures were used as a control variable indicating corporate size [19], suggesting that a single variable indicating the firm size is not sufficient to control the "size effects." An allometric scaling between size measures, e.g., x∝y γ , means that the fluctuation or deviation from the average relation (x/y γ ) is independent of y. Therefore, such an adjusted ratio between size measures in allometric scalings could be added to analyses as a possible factor improving the robustness of the existing methods against the use of alternative corporate size measures, bridging the studies using different size measures of firms as a control variable.
Scaling relations for data of more than three measures of size were not examined in our study. Although it is possible to conceive of such a theoretical development, empirical verification of the statistical model for a high-dimensional space is expected to face a difficulty known as the curse of dimensionality [42]. Furthermore, we do not expect theories regarding four-dimensional allometric scaling to be useful in understanding empirical data, since the allometric scaling was found to be broken in the three-dimensional data of (k, , a).
The current analyses are based on aggregated data from all industries. It is highly possible that firms in different industries or markets have different scaling relations. Nevertheless, our results would still serve as a basis for investigating the characteristics of different industries and markets, since the results can be regarded as representing average behaviors of firms across different sectors, calculated from nearly the most comprehensive dataset. Therefore, it would be a promising research direction to study the uniqueness of different groups of firms relative to the average behaviors.
It is our current expectation that similar results could be obtained if it were possible to conduct the same analysis on different datasets of business firms operating in areas other than Japan. Tests of the reproducibility of our results would be highly valuable for understanding the diversity and universality of business firms. While analysis regarding the sales, total assets, and employees of firms can be performed with minimal effort owing to the abundance of information, it would be more difficult to include the number of trading partners because data related to trading between firms are often missing in existing datasets. Text mining of published corporate reports may be a promising approach for compiling datasets of inter-firm trading [43]. We also encourage the application of our method to other systems that exhibit scale-invariant relative fluctuations, such as metropolises and cities [44,45].
We confirmed that the fluctuation of s-conditional to k and l and relative to the median-was invariant, directly verifying the hypothesis proposed in the previous work [23]. This scale-invariance of the relative fluctuation is highly suggestive of the similarity between firms of different sizes and the renormalizability of internal organizations or interfirm relationships with regard to complex networks [46,47].
Verifying previously hypothesized stylized facts and checking the mathematical consistency of independently known power-law exponents help to validate or refute possible theories regarding business firms. Numerous models [35,[48][49][50][51][52][53][54][55][56][57][58][59][60][61] have been suggested to explain only a few stylized facts relating to firms, and it appears that new criteria are needed to select the models empirically. Thus, our results for multivariate scaling should be incorporated into subsequent theoretical considerations. Theories for the scaling relations in other complex systems, such as animal bodies and cities, may benefit from this study because fractal-like hierarchical organization is a pervasive design in such systems (including business firms) [15,32,33,62]. However, the data are far more abundant for firms than for other systems. The multidimensional notion of size may introduce prospects for devising general theories and modeling principles for such complex systems. where P(x|y) represents the probability density of x conditional on y, P x|y represents a probability density function, and γ is a positive constant. When variable x is defined as x ≡ x/y γ , we have P( x|y) = P x|y ( x), given that the probability density should satisfy normalization P( x|y)d x = 1. Variable x does not depend on the y-value; thus, x is independent of y. This indicates that exponent γ can be estimated by finding the value that maximizes the degree of independence of x and y. One of the simplest indices for measuring the dependence between stochastic variables is the Pearson's product-moment correlation coefficient. We can obtainγ, i.e., the estimated value of γ, as follows: where Cor[·, ·] is the correlation coefficient of the two terms. We apply the log-transformation to the raw data because the distribution of x conditional on y is heavy-tailed, as shown in Figure 4d-f. In such heavy-tailed distributions, a few extreme values can substantially affect the correlation coefficient compared with the case of the normal or exponential distribution. The minimization of the absolute value of correlation coefficients amounts to the linear least-squares regression of log x against log y. Indeed, when the correlation coefficient is zero, the covariance is also zero, and if we apply transformations x ← x/x and y ← y/y , where x and y represent the geometric means of x and y, respectively, Here, Cov[·, ·] represents the covariance, N represents the number of samples, and x i and y i represent the i th samples of x and y, respectively. When the residual sum of squares is minimized, ∂ should hold. This is indeed satisfied by Equation (A2). Similar considerations are valid for the estimation of exponents in multivariate scaling. Assuming Equation (8) and defining s ≡ s/k α l β , the probability function of s is constant regardless of k and : P( s|k, l) = P s|k,l ( s). Therefore, α and β can be estimated as follows: Cor log s k α l β , log k 2 + Cor log s k α l β , log l 2 . (A3) When the correlation coefficients are zero so that the right-hand side of Equation (A3) is minimal and when we apply transformations s ← s/s , k ← k/k , and l ← l/l , where x represents the geometric mean of the variable x, where k i , i , and s i represent the ith samples of k, , and s, respectively. Under this condition, Therefore, the estimation in Equation (A3) can be equated to the linear least-squares regression of log s against log k and log l without the interaction term.
Although it is more typical to remove the correlations between "explanatory" variables in the regression analysis, the foregoing method provides a result equivalent to such an orthogonalized version of regression. Let us consider the regression of log s against log k and log[l/k γ 1 ], where γ 1 is a constant determined empirically (γ 1 ≈ 1.0, as shown in Figure 10c). The variables are again normalized with transformation x ← x/x , where x represents the geometric mean of the variable x. Here, exponents α and β are intended to fulfill the multivariate scaling, s ∝ k α l k γ 1 β , and are thus estimated as follows: When the correlations are zero, By adding Equation (A6) to Equation (A5) multiplied by γ 1 , we obtain Equations (A5) and (A7) are satisfied with α = α + βγ 1 and β = β when Equation (A4) holds. Therefore, the estimated values of the exponents in a formal regression analysis can be derived from the regression with the "explanatory" variables (not orthogonalized). Note that α is the mathematically expected value of γ 2 , as indicated by Equation (A16) in Appendix B.

Appendix B. Derivation of Asymptotic Power-Law or Scaling Exponents
Here, we derive the relations between asymptotic exponents for power-law distributions and scalings assuming the bi-and trivariate scaling relations. This allows us to check the consistency of results of regression analyses. First, given Equations (1)-(3), we derive the asymptotic power-law exponent of the marginal distribution, P(x), of the "explained" variable, x. Second, given Equations (1) and (8), we calculate the asymptotic power-law exponent of the marginal distribution of s. Third, we show that the bivariate scaling exponents, γ 2 and γ 3 (i.e., those of sales s against number of trading partners k and against number of employees , respectively), can be determined from α, β, and γ 1 .
Because Equations (1)-(3) have the general form: we can derive the power-law exponent at the tail of the density function of the marginal distribution, P x (x) ≡ P(x), for variable x given the following: positive scaling exponent γ, P x|y (the universal distribution function of x conditional on y), and marginal distribution P y (y) ≡ P(y). Assuming that P x|y and P y have power-law right tails of exponents δ x|y and δ y , respectively, we approximate the functions as and where c x|y and c y are positive constants, δ x|y and δ y represent the power-law exponents larger than 1, and d x|y and d y represent the distribution functions for the smaller side. This approximation is based on the actual distributions of the size variables (Figures 2a-d, 4d-f and 5d-f). We set the threshold beyond which the distribution follows the power law to unity because the same results are easily derived for the asymptotic value of exponents with linear transformations as long as the threshold is positive. By applying Bayes' theorem and assuming that x > 1 (because we are interested in the "tails" or asymptotic behaviors as x → ∞ ), we derive P x (x), i.e., the marginal distribution of x, from P(x, y), i.e., the joint distribution of x and y: P x (x) = ∞ 0 P(x, y)dy = ∞ 0 P(x|y)P(y)dy = (I 1 − C)x −δ x|y + (I 2 + C)x −1−(δ y −1)/γ , where I 1 , I 2 , and C are constants respectively defined as I 1 = c x|y 1 0 y γ(δ x|y −1) d y (y)dy, and substitution y = y/x 1/γ is applied in the last integral term. Therefore, the asymptotic behavior of P x (x) at positive infinity is determined by whether δ x|y or 1 + δ y − 1 /γ is smaller: as x → ∞ , the behavior of P x is approximated as where The change in δ x|y does not affect the power-law exponent as long as δ y is sufficiently large; this is consistent with the intuitive explanation ( Figure 7) presented in the main text.
Although rather complex, a similar strategy works for the case of multivariate scaling. Let us continue with the foregoing notations, except that x, y, and γ are replaced with , k, and γ 1 , respectively, and Equation (8) is assumed as well as the following: where c s|l,k is a positive constant, δ s|k,l represents a power-law exponent larger than 1, and d s|k,l represents the distribution function for the smaller side. Again, the threshold of the functional change of P s|k,l is set to unity without loss of generality. The marginal distribution of s, i.e., P s (s), for s > 1 is where substitution l = l/k γ 1 is applied and E 2 = (α + βγ 1 ) δ s|k,l − 1 − δ k + 1; where E 3 = (α/β + γ 1 ) δ l|k − 1 − δ k + 1; and substitutions k = l −1/γ 1 × k and l = s −γ 1 /(α+βγ 1 ) × l are applied. The sum of these 10 terms has the form P s (s) = A 1 × s −δ s|k,l + A 2 × s where A 1 , A 2 , and A 3 are constants independent of s. Therefore, the asymptotic behavior of P s as s → ∞ is expressed as follows: where δ s is the smallest of the three exponents: Finally, we evaluate the values of exponents γ 2 and γ 3 using α, β, and γ 1 , given Equations (1) and (7). Then, the joint distribution of k, , and s is given as follows: P(k, l, s) = P(s|k, l)P(l|k)P(k) = 1 k α l β P s|k,l s k α l β 1 k γ 1 P l|k l k γ 1 P k (k).
We can obtain the joint distribution of k and s by integrating this probability with respect to . P(k, s) = ∞ −∞ P(k, l, s)dl = ∞ 0 1 k α l β P s|k,l s k α l β 1 k γ 1 P l|k l k γ 1 P k (k)dl.
Comparing this result with Equation (3), we have for the right tail of s in a limited condition.