Sources of Artifacts in SLODR Detection

Background. Spearman's law of diminishing returns (SLODR) states that intercorrelations between scores on tests of intellectual abilities are higher when the data set is composed of subjects with lower intellectual abilities, and vice versa. After almost a hundred years of research, this trend has been detected only on average.

Objective. To determine whether the widely varying results obtained are due to variations in scaling and in the selection of subjects.

Design. We used three methods for SLODR detection based on moderated confirmatory factor analysis (MCFA) to test real data and three sets of simulated data. Of the latter group, the first simulated a real SLODR effect. The second simulated the case of a different density of tasks of varying difficulty; it did not contain a real SLODR effect. The third simulated a skewed selection of respondents with different abilities and also did not contain a real SLODR effect. We selected the simulation parameters so that the correlation matrix of the simulated data was similar to the matrix created from the real data, and all distributions had similar skewness parameters (about -0.3).

Results. The results of MCFA are contradictory: by this method we cannot clearly distinguish the dataset with real SLODR from datasets with a similar correlation structure and skewness but without a real SLODR effect. The results allow us to conclude that when effects like SLODR are very subtle and can be identified only with a large sample, the features of the psychometric scale become very important, because small variations in scale metrics may lead either to the masking of real SLODR or to its false identification.


Introduction
In 1927, British psychologist Charles Spearman (Spearman, 1927; Detterman & Daniel, 1989) formulated the hypothesis that, when measuring intellectual ability, one finds higher subtest correlations in the lower region of the general factor (g) distribution and, vice versa, lower subtest correlations in the higher region of the g distribution (the so-called Spearman's Law of Diminishing Returns, SLODR). Testing and discussion of this hypothesis have continued, and have even intensified over the past three decades. Many studies involving widely varying types of respondents, tests, and data-processing methods have been published over that period. The results were discordant.
Although the tendency for intercorrelations between the subtests to decrease as the g factor grows has been verified in a meta-analysis, only a little more than half of the studies reviewed directly confirmed Spearman's hypothesis (Blum & Holling, 2017). There is some criticism of this meta-analysis in a study by Hartung and colleagues (Hartung, Doebler, Schroeders, & Wilhelm, 2018). The first point of the criticism is that the meta-analysis did not include recent studies with new methods for examining the SLODR hypothesis. Indeed, the question of which statistical methods are appropriate to investigate the structure of intelligence is very important. It seems that the SLODR effect in general is not strong, and the ways to detect it statistically may not be as trivial as was originally supposed. To examine the differentiation and dedifferentiation of intelligence along the age dimension and/or growth of ability, current studies use the following methods: confirmatory factor analysis, moderated factor analysis (Molenaar, Dolan, Wicherts, & van der Maas, 2010), multi-group confirmatory factor analysis (Reynolds & Keith, 2007), factor mixture modeling (Reynolds, Keith, & Beretvas, 2010), and local structural equation models (Hildebrandt, Lüdtke, Robitzsch, Sommer, & Wilhelm, 2016) (for a brief review, see the introduction to Hartung et al., 2018). We believe that each of these procedures is worth a separate discussion in the context of SLODR detection, but in our study we will focus on moderated confirmatory factor analysis (MCFA). The goal of our study is to investigate the capabilities of MCFA to detect SLODR under different conditions.
As the SLODR effect is not strong, it can be statistically confirmed only when large data samples are used. We have seen such very large samples in SLODR studies (for instance, Arden & Plomin, 2007; Breit, Brunner, & Preckel, 2020; Dombrowski, Canivez, & Watkins, 2018; Hartmann & Reuter, 2006; Hartung et al., 2018; McGill, 2015). Using large samples not only allows the researcher to obtain more statistically reliable results, but it also increases the probability of the appearance of artifacts (Korneev, Krichevets, & Ushakov, 2019). In particular, artifacts may be generated by scale characteristics and skewed distributions of scores.
In a field close to SLODR research, the study of gene-environment interactions, artifacts have become the object of deep reflection and special investigation. Analysis so far indicates that some of the results may be explained equally well by either real interactions or subtle features of the distributions. For instance, A. Murray and colleagues conclude that "Estimates of gene-environment interactions (G×E) in behavioral genetic models depend on how a phenotype is scaled. Inappropriately scaled phenotypes result in biased estimates of G×E and can even suggest G×E sometimes in the direction opposite to its true direction" (Murray, Molenaar, Johnson, & Krueger, 2016, p. 552). The authors also point out two causes of the violation of the normal distribution of phenotypic characteristics, which then often lead to ambiguity in the measuring scales: the irregularity of the distribution of tasks according to their difficulty, and the selection of respondents according to their abilities. "The problem of dependency of G×E on phenotype scaling has been known since the time of R.A. Fisher who noted that G×E interaction could be manipulated by re-scaling the variable involved" (Murray et al., 2016, p. 553). Re-scaling here refers to a non-linear but monotonic scale transformation, which is permissible for ordinal scales of measurement as defined by S. Stevens (Stevens, 2017), but not for interval scales. However, only the latter can be used in the great majority of complex mathematical methods (for instance, in modeling by linear structural equations and in linear regression). Nevertheless, a great number of cases where those methods are employed do not contain any serious arguments in favor of interval scales.
Some researchers who understand the importance of this problem opt for measurement methods based on Item Response Theory (IRT) (Breit, Brunner, & Preckel, 2020; Embretson & McCollam, 2000; Tucker-Drob, 2009). For instance, I. Schwabe argues that the IRT approach generally provides greater reliability than the summation of scores (Schwabe, 2016). Another approach is to use item scores in factor analysis directly (Molenaar, Kő, Rózsa, & Mészáros, 2017). In our work, we do not discuss in detail the applications of IRT in the context of the SLODR effect, but we think it is an important direction of investigation.
In SLODR studies we see, first of all, interest in the effect of distribution skewness on SLODR detection. Murray, Dixon, and Johnson (2013) point out that subtest distribution skewness may result in SLODR detection when the effect is actually absent. The sources of the skewness are the same as in Murray et al. (2016): the selection of respondents in the data set, which is defined by external factors, and the difference in the number of easy and difficult tasks presented in a subtest (with floor and ceiling effects as special cases).
As we show in this paper, these different sources produce very different patterns in SLODR detection; the problem lies not in the skewness itself, but in the sources of the deformation of the distribution.

Now, let us consider the SLODR detection methods in more detail. The most interesting case for testing the SLODR hypothesis is when there is one data set with a continuous spectrum of respondents' test results. In this case the hypothesis is that there will be a weakening of the interdependence of the subtests, which may be expressed in intercorrelations that weaken along with the growth of the respondents' intellectual ability. But if there is just one data set, it is difficult to define what should be considered subtest correlation in different regions of the single set.
An earlier method of studying SLODR (which is now called "traditional") employs principal component analysis of the subtests. The first component (obtained with no rotation) is interpreted as the general intelligence factor g. SLODR may be expressed in several modes:
1. A denser distribution of respondents' g-factor scores at higher levels of g. If the subtest results have the standard normal distribution (or at least equal variances), then the g-factor scores have a negatively skewed distribution (Molenaar et al., 2010; Murray et al., 2013).
2. After the sample is divided into two halves according to the median of the g-factor scores, factor analysis is carried out separately for each group (see, for instance, Reynolds & Keith, 2007). A smaller eigenvalue of the first factor (less variance) and lower average factor loadings of the subtests on this factor in the high-g group are SLODR markers in this case.
3. The average subtest intercorrelation (in the two groups obtained as in item 2) may be compared between the groups. A higher average intercorrelation in the low-g group may also be considered evidence in favor of Spearman's law (Hartmann & Reuter, 2006; Sugonyaev & Radchenko, 2018). Variants of the method use some other variable to divide the whole sample, and then exclude it from the analyzed set of data.
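The median-split comparison of modes 2 and 3 is easy to sketch. The following Python simulation is illustrative only (the study's own computations used SPSS and Mplus); with symmetric, equicorrelated data and no SLODR built in, the two halves show similar mean intercorrelations, each reduced by restriction of range:

```python
import numpy as np

def first_pc_scores(X):
    """Scores on the first (unrotated) principal component."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    vals, vecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    return Z @ vecs[:, np.argmax(vals)]          # eigh sorts eigenvalues ascending

def mean_intercorrelation(X):
    """Average off-diagonal subtest correlation."""
    r = np.corrcoef(X, rowvar=False)
    n = r.shape[0]
    return (r.sum() - n) / (n * (n - 1))

rng = np.random.default_rng(0)
# four equicorrelated "subtests" (r = .5) with NO built-in SLODR
cov = np.full((4, 4), 0.5) + 0.5 * np.eye(4)
X = rng.multivariate_normal(np.zeros(4), cov, size=20_000)

g = first_pc_scores(X)
low, high = X[g <= np.median(g)], X[g > np.median(g)]
r_low, r_high = mean_intercorrelation(low), mean_intercorrelation(high)
print(round(r_low, 3), round(r_high, 3))
```

Both halves yield intercorrelations well below the population value of .5, and approximately equal to each other; a clearly lower value in the high-g half would be the SLODR marker of mode 3.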
In all three cases, the supposition that the values of all the mentioned indicators differ across subsamples due to the presence of the SLODR effect is based upon the presupposition that all subtest distributions are symmetrical; otherwise, differences in the SLODR indicators between the high and low groups might be generated by distribution skewness of any origin (Murray et al., 2013). As was shown in the 2019 study of "traditional" SLODR detection by Korneev, Krichevets, and Ushakov, skewness from different sources tends to produce different results, expressed in different combinations of the properties outlined in the three modes above.

The so-called modern methods employ structural equation modelling. There are two types of models in use: (a) second-order or higher-order models (in which g is loaded only by the factors of special abilities, which in turn are loaded by the subtests); and (b) bi-factor or nested models, in which g is directly loaded by the subtests, and then each of them loads its factor of special abilities. The advantages and disadvantages of the two types of models are discussed by Gignac (2016) and Molenaar (2016), but this question is beyond the scope of this article. We consider here only the second type (Figure 1).

Note to Figure 1. V11-V22 are results of (sub)tests (real and simulated); f1 and f2 are factors of special abilities; g is the general factor.
In such models, Spearman's law may be expressed, first of all, in factor loadings that become lower as the level of the general factor (g) increases, i.e., in a decrease of the factor loadings of the subtests on the factors of special abilities, and/or on the g factor, along the growth of the g-factor score. As g increases, residuals (on the level of subtests) may be expected to increase. The negative skewness of the g-factor score distribution may be connected with all these phenomena (Molenaar et al., 2010; Murray, Dixon, & Johnson, 2013). The modern method of SLODR detection, moderated factor analysis, allows us to uncover such phenomena (Bauer, 2017; Bauer & Hussong, 2009). It uses the moderation of structural model parameters, i.e., the linear dependence of model parameters (factor loadings and/or residuals and so on) on factor g or other moderators introduced. The task is to estimate the coefficients of a parameter's linear dependence on the moderator.
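For concreteness, one common way to write such a moderated measurement model (a sketch following the general scheme of Bauer & Hussong; the exact link functions vary across implementations) is:

```latex
% x_i: score on subtest i; f: latent ability factor; g: the moderator
\begin{aligned}
  x_i &= \nu_i + \lambda_i(g)\, f + \varepsilon_i, \\
  \lambda_i(g) &= \lambda_{0i} + \lambda_{1i}\, g, \qquad
  \operatorname{Var}(\varepsilon_i \mid g) = \sigma_{0i}^{2}\, \exp(\beta_i\, g).
\end{aligned}
```

Under SLODR one expects negative loading-moderation coefficients (lambda_1i < 0, loadings fall as g grows) and positive residual-moderation coefficients (beta_i > 0, residual variance grows); the exponential link is one way to keep the moderated residual variance positive.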
In many current studies, researchers use a hybrid methodology: the analysis of real data is compared with the results achieved by the same method applied to simulated data with similar parameters and a clear structure. When structural modeling of a simulated sample is performed to assess the same parameters that were used in the sample generation, this operation does not always recover values of the parameters close to those used in the simulations (Molenaar, Dolan, Wicherts, & van der Maas, 2010). That result shows the limits of the sensitivity of the method, so this methodology is a good supplement to the analysis of real data. Moving in parallel with Molenaar et al. (2017), we constructed simulations of data with different sources of skewness and then compared the results of applying the above-mentioned methods to the simulated and natural data.
Our research question can be formulated as follows: Can the moderation by the g factor of factor loadings or residuals in moderated confirmatory factor analysis help to distinguish different sources of skewness in the data? And, as a result, is it possible to distinguish a real SLODR effect from similar data patterns with other sources? Besides a simulation of "true" SLODR, we simulate a skewed distribution of respondents' ability and a skewed distribution of item difficulty. We analyze the moderation coefficients in MCFA of the raw simulated and real data sets, and of these sets after normalization, and try to check:
1. Whether MCFA can differentiate the sources of skewness in the simulated data sets and real data;
2. Whether, given the skewed real data of intelligence testing, we can differentiate the contributions of different sources of the skewness to the SLODR effect using MCFA.

Data
Real Data

The test data of 11,335 military school recruits were used in this study. The test battery of intellectual abilities, which was specially designed for candidate selection, consisted of 10 subtests, each of which included 30 tasks. A more detailed description appeared in studies by Korneev, Krichevets, and Ushakov (2019) and Sugonyaev and Radchenko (2018). We chose four subtests with a simple correlation structure for our comparative investigations, in order to make it possible to obtain similar correlational structures in the simulations. These subtests were: (a) analogies (An); (b) syllogisms (Syl); (c) memorization of shapes (SM); and (d) verbal memory (VM). Every task required the choice of one answer out of five suggested ones. The respondents' scores were evaluated in two ways: by the classical procedure (number of correct answers), and by the two-parameter IRT method. The process of IRT analysis was as follows: First, we estimated the difficulty and the discriminative ability of every task, and measured the ability score of each respondent on every scale. We used the two-parameter logistic model from the mirt R package (Chalmers, 2012; Lord & Novick, 1968). The next step was to exclude guessers from the data. We assessed the probability of a right answer for each respondent on each item; if that probability was lower than 0.2 (that is, the probability of guessing the right answer among five proposed ones) and the respondent nevertheless gave the correct answer, the respondent was marked as a potential guesser on this item. Those respondents who were marked as guessers more than 10 times within one scale were excluded from the analysis. In fact, the probability of guessing is often greater than 0.2, because some of the proposed answers can easily be rejected, so it is not surprising that our algorithm found as many as 164 guessers (1.44% of the sample); the final sample size was thus 11,171.
Next, we repeated the estimation of the task parameters and the ability of each respondent with the updated model on the cleaned sample.
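The guesser-exclusion rule described above can be sketched as follows (an illustrative Python re-implementation of the logic only; the actual analysis used the mirt R package, and names such as `flag_guessers` are ours):

```python
import numpy as np

def two_pl(theta, a, b):
    """Two-parameter logistic item response function P(correct)."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def flag_guessers(theta, a, b, responses, p_guess=0.2, max_flags=10):
    """Mark respondents for exclusion on one scale.

    theta:     (n_persons,) ability estimates
    a, b:      (n_items,) discrimination and difficulty parameters
    responses: (n_persons, n_items) matrix of 0/1 answers

    A correct answer is 'suspicious' when the model probability of success
    is below the blind-guessing level (1 out of 5 options); more than
    max_flags suspicious answers -> exclude the respondent.
    """
    p = two_pl(theta[:, None], a[None, :], b[None, :])
    suspicious = (p < p_guess) & (responses == 1)
    return suspicious.sum(axis=1) > max_flags

# toy check: a very low-ability "respondent" answering all 30 items
# correctly is flagged; a high-ability one is not
theta = np.array([-3.0, 2.0])
a, b = np.ones(30), np.zeros(30)
resp = np.ones((2, 30), dtype=int)
print(flag_guessers(theta, a, b, resp))   # -> [ True False]
```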

Simulated Data
In addition to the sample of real data, three sets of simulated data were produced. The random sample generation and data processing were based on SPSS version 22. The scripts for generating the data are available in Appendix S1 at http://mathpsy.com/slodr. We tried to select the parameters of the simulations so that the correlation matrices of the simulated data were similar to those of the natural data, and at the same time the variables had similar coefficients of skewness.
Simulation of the selection of respondents according to an external criterion (case selection). During the first step, a sample of 17,000 cases was drawn from a four-dimensional normal distribution with a definite correlation structure. Intragroup correlations were R12 = .71 and R34 = .66; the four intergroup correlations (R13, R14, R23, R24) were approximately equal to .57. The first factor score (using principal component analysis) was calculated for each "respondent". Then the respondents with factor scores under 0.3 (12,218 cases) were selected. The distributions of the variables we obtained had negative skewness.
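This case-selection scheme can be sketched in a few lines (illustrative Python; the original generation used SPSS, so the exact case count and skewness values below differ from the paper's):

```python
import numpy as np

rng = np.random.default_rng(1)

# target correlation structure: R12 = .71, R34 = .66, intergroup ~ .57
R = np.array([[1.00, 0.71, 0.57, 0.57],
              [0.71, 1.00, 0.57, 0.57],
              [0.57, 0.57, 1.00, 0.66],
              [0.57, 0.57, 0.66, 1.00]])
X = rng.multivariate_normal(np.zeros(4), R, size=17000)

# first principal-component score per "respondent", standardized
vals, vecs = np.linalg.eigh(np.corrcoef(X, rowvar=False))
pc1 = vecs[:, np.argmax(vals)]
pc1 = pc1 if pc1.sum() > 0 else -pc1        # fix the arbitrary eigenvector sign
g = X @ pc1
g = (g - g.mean()) / g.std()

# keep only cases below the cut-off: removing the right tail leaves
# negatively skewed marginal distributions
kept = X[g < 0.3]

def skew(v):
    v = v - v.mean()
    return (v ** 3).mean() / (v ** 2).mean() ** 1.5

skews = [skew(kept[:, j]) for j in range(4)]
print(kept.shape[0], np.round(skews, 2))
```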
Simulation of a skewed distribution of tasks according to their difficulty (different task density). Four standard normal variables (the correlations between them were similar to the correlations of our real data) were transformed by the following formulas: for OLDi < 0, NEWi = -(-OLDi)^1.08; for OLDi > 0, NEWi = (OLDi)^0.93. The transformation produced a negatively skewed distribution (the left semi-axis was stretched, and the right one was compressed). That distribution corresponded to a higher density of easy tasks than of difficult ones. (In reality, the modelling of different task density requires an additional supposition about the interdependency of respondents' answers to tasks of equal difficulty. Our stretching/compression corresponds to a strong correlation.) The size of the obtained sample was 11,338.

Simulation of the "true SLODR". An additional sample was obtained using a model with decreasing factor loadings and growing residuals along increasing g-factor scores (the script is available in the supplementary materials, Appendix S2). The size of the simulated sample was 10,000.
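The stretching/compressing transform above is straightforward to reproduce (illustrative Python applied to a single standard normal variable; the paper's version was applied in SPSS to four correlated variables):

```python
import numpy as np

def stretch_compress(x, left_exp=1.08, right_exp=0.93):
    """Monotone rescaling: stretch the negative semi-axis, compress
    the positive one, inducing negative skewness."""
    expo = np.where(x < 0, left_exp, right_exp)
    return np.sign(x) * np.abs(x) ** expo

rng = np.random.default_rng(2)
x = rng.standard_normal(100_000)
y = stretch_compress(x)

def skew(v):
    v = v - v.mean()
    return (v ** 3).mean() / (v ** 2).mean() ** 1.5

# the transform is strictly monotone, hence admissible for an ordinal
# scale, yet it turns a symmetric variable into a negatively skewed one
print(round(skew(x), 2), round(skew(y), 2))
```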
We also normalized the measured variables of the real data and the simulated variables of the "case selection" set. The value of the normalized variable V for a given case X, with rank R among all values of V in our set containing N cases, was calculated as Φ⁻¹((R - 0.5)/N), where Φ(x) is the distribution function of the standard normal distribution.
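A sketch of this rank-based normalization in Python (the name `rank_normalize` is ours; ties, which matter for real discrete sum scores, are broken arbitrarily here):

```python
from statistics import NormalDist
import numpy as np

def rank_normalize(v):
    """Map the value with rank R out of N to Phi^{-1}((R - 0.5) / N)."""
    v = np.asarray(v, dtype=float)
    n = len(v)
    ranks = v.argsort().argsort() + 1      # ranks 1..N
    inv = NormalDist().inv_cdf
    return np.array([inv((r - 0.5) / n) for r in ranks])

x = np.array([3.0, -1.0, 10.0, 0.5])
print(np.round(rank_normalize(x), 3))
```

Whatever the original marginal distribution, the result has the quantiles of a standard normal variable, which is exactly why this operation removes skewness of any origin.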

Construction and Assessment of the Model
We constructed a simple bi-factor model (see Figure 1) with four indicators (real or simulated results of four subtests), two factors of special abilities (f1 and f2), and one general factor. The variance of the latent factors was fixed to 1 in the models, to allow scaling and identification of the latent variables. Then we estimated the same model using seven data sets: (a) raw real data; (b) normalized real data; (c) IRT real data; (d-f) three simulations; and (g) one normalization of a simulated sample. Then, for every data set, we performed principal component analysis and used the first factor scores as moderators. The following types of moderation were used: (a) moderation of the factor loadings from the indicators on the special abilities; (b) moderation of the residuals of the indicators; and (c) moderation of both the factor loadings and the residuals of the same indicators.
In order to compare the baseline model without moderation with the moderated models (they can be considered nested models), we used the Bayesian Information Criterion (BIC) (Bollen, Harden, Ray, & Zavisca, 2014). This criterion is relative and does not have a standard scale, but a lower BIC is a sign of a better fit. The difference in fit between two nested models can be considered significant if it is greater than 10 (Raftery, 1995).
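The BIC comparison reduces to simple arithmetic; a sketch with made-up log-likelihoods (the numbers are purely illustrative, not values from our models):

```python
import math

def bic(loglik, n_params, n_obs):
    """Bayesian Information Criterion: k*ln(n) - 2*lnL (lower = better fit)."""
    return n_params * math.log(n_obs) - 2.0 * loglik

# hypothetical fits: adding 4 moderation parameters gains 50 log-likelihood units
baseline = bic(loglik=-52000.0, n_params=12, n_obs=11171)
moderated = bic(loglik=-51950.0, n_params=16, n_obs=11171)

delta = moderated - baseline
# Raftery (1995): a BIC difference greater than ~10 is very strong evidence
verdict = "moderated model preferred" if delta < -10 else "no clear gain"
print(round(delta, 1), verdict)
```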
We assessed our models in Mplus 8.3, using maximum likelihood estimation with robust standard errors (MLR), and used R version 3.6.0 (R Core Team, 2016) with the MplusAutomation package (Hallquist & Wiley, 2018) for automated processing of the models and summarizing of the results. The correlation structure of the real data is presented in Table 1. The bi-factor model, with factor g loaded by all variables and two factors of special abilities loaded by pairs of variables, corresponds to such a structure. The coefficient of skewness is -.309 for our real An variable, and -.022 after normalization. It is -.232 and -.002 for Syl; -.302 and -.032 for SM; and -.361 and -.066 for VM, respectively.

Simulation of Case Selection
Within this sample, the relatively high values of the test indicators were represented by a greater number of "respondents". The coefficients of skewness of the four "truncated" variables fluctuated around the mean value of -.305 (SD = .023). Correlations inside the subgroups were R12 = .55 and R34 = .48; the mean intergroup correlation (R13, R14, R23, R24) was .33 (SD = .064). For the normalized data, R12 = .53 and R34 = .46, and the mean intergroup correlation was .39 (SD = .054).

Simulation of Different Task Density
Within this set of data, the coefficients of skewness fluctuated around -.298 (SD = .015). The intragroup correlations were R12 = .53 and R34 = .48; the intergroup ones were equal on average to .32 (SD = .048).
Simulation of the "True SLODR"

As a result, the intragroup correlations were .54 and .54, and the intergroup ones were .33 (SD = .014). The variables were standardized (they did not need any normalization, since the sampling distribution differed little from the normal one; the absolute value of skewness did not exceed .03).
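The essential ingredients of our "true SLODR" generation scheme (Appendix S2, implemented in SPSS) are loadings that decrease and residual SDs that grow along g; this can be approximated in Python as follows (the particular coefficients are illustrative, not those of the paper):

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
g = rng.standard_normal(n)                    # general factor
f = rng.standard_normal((n, 2))               # two specific factors

lam_g = np.clip(0.75 - 0.10 * g, 0.2, None)   # loading on g falls with g
lam_f = np.clip(0.45 - 0.10 * g, 0.1, None)   # specific loading falls with g
res_sd = 0.50 + 0.10 * np.clip(g, -3, None)   # residual SD grows with g

X = np.empty((n, 4))
for j in range(4):
    spec = f[:, 0] if j < 2 else f[:, 1]      # variables 1-2 load f1, 3-4 load f2
    X[:, j] = lam_g * g + lam_f * spec + res_sd * rng.standard_normal(n)

def mean_intercorr(A):
    r = np.corrcoef(A, rowvar=False)
    return (r.sum() - 4) / 12

# the SLODR signature: weaker intercorrelations in the high-g half
low, high = X[g <= np.median(g)], X[g > np.median(g)]
print(round(mean_intercorr(low), 2), round(mean_intercorr(high), 2))
```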

Testing of the Model on Real and Simulated Data
The results are presented in Table 2. The results of MCFA showed that the coefficients of moderation of factor loadings, residuals, or both are often significant, but the patterns of moderation vary across the simulations. Before we discuss the specifics of these patterns, let us recall that the simulations of different task density and case selection, as distinct from "true" SLODR, were constructed without the SLODR effect. Starting from a multivariate normal distribution, the first of these is obtained by simple scale deformation; the second is obtained by case selection with a skew in favor of more productive "persons". Note that normalization of the first one returns the distribution to the original symmetrical state, so we do not include it in our comparison. Normalization of the second distribution leads to a more interesting symmetrization, which is created by deformation of the original scale. The result of the deformation is a product of the neutralization of two opposite asymmetrizations, neither of which can actually produce the SLODR effect. Table 2 contains the coefficients of moderation of the different latent variables, with the first factor scores as moderators. These scores were obtained by principal component analysis of the four measured variables and have a strong correlation with latent g.
Data set 1 ("true" SLODR) demonstrates the expected decrease of factor loadings and increase of residuals, all in accordance with Spearman's hypothesis. In this case, the moderation effect increases when the parameters are moderated together. These data give an example of what may be seen as a "good SLODR effect". In data set 2 ("different task density"), we see stronger moderation coefficients for factor loadings moderated alone than in the "good SLODR" case. As expected, the residuals decrease (in the direction opposite to that of SLODR), because the left side of the scale was stretched and the right one was compressed, which implies corresponding changes in the residuals. When these parameters are moderated together, the factor loadings lose the decreasing tendency and even change it to an increasing one, and the residuals decrease more strongly than when moderated alone. A similar effect may cause the inconsistent heteroscedasticity of residual variances (unexplained in Molenaar, 2011), even though all the other parameters show an effect consistent with Spearman's hypothesis.
In data set 3 ("case selection"), the picture is contradictory. It may be proposed a priori that neither factor loadings nor residuals should show any significant moderation coefficient, but the model shows the strongest decrease of factor loadings among all our data sets, and a not very strong decrease of residuals when these parameters are analyzed separately. If moderation is estimated for both parameters together, then the decrease of the factor loadings becomes weaker, and the residuals become roughly constant (which corresponds to the real residuals of our simulation).

Note to Table 2. * significant coefficients (p < 0.05). The column ΔBIC presents, for the three variants of moderation, the difference between the corresponding model BIC and the 'baseline' model BIC (a negative number corresponds to a better fit of the moderated model). The estimated standard error appears in brackets after each regression coefficient. The indicators of model fit are placed after the data set number and characterize the baseline models. The next four columns contain regression coefficients (RC) for four moderations: (a) factor loadings moderated alone; (b) residuals moderated alone; (c) and (d) factor loadings and residuals, respectively, moderated simultaneously. The four rows contain information for the four variables.
In data set 4, which was derived from data set 3 by the normalization procedure described above, we have a result very similar to the "true SLODR" set in all the moderated parameters. This most interesting case shows the importance of the source of skewness. The negative skewness of data set 2 is accompanied by decreasing residuals in the positive part of the g scale, while the negative skewness of data set 3 is accompanied by equal residuals over the entire g scale. When the second effect (data set 3) and the effect opposite to the first one (data set 2), which is just the effect the normalization produced, neutralize each other, we get a normal distribution with increasing residuals along the g score (cf. Murray et al., 2013), and hence a false SLODR effect.
In data set 5 (real data with the sum score), which contains variables with negative skewness from -.22 to -.36, the results differ across variables, reflecting their natural differences. Nevertheless, considering them as a whole, one can see negative residual moderation coefficients similar to those of simulation data sets 2 and 3.
Comparing the results obtained on the normalization of the original real data (sum score) (data set 6), we see that its positive residual moderation coefficients are weaker than those of data set 4. This makes the real data similar to the residual moderation coefficients of the normalized data set 2, which are theoretically equal to zero, because normalization restores those data to the simple form of correlated variables without any SLODR effect.
Thus the analysis of data sets 1-6 shows that the SLODR-like effect in the real data may have been produced by similar properties in both cases: the predominance of simpler tasks in the subtests, and the predominance of high-ability persons in the respondents' distribution (this is only a hypothesis). Actually, only 15% of the tasks were solved by fewer than 40% of the respondents, but this fact might be explained by either reason. Such a result coheres with the analysis of the same data produced by "traditional methods" (Korneev, Krichevets, & Ushakov, 2019).
Data set 7 (with the IRT scores) shows the SLODR effect in the factor loadings and the opposite tendency in the residuals. Such a situation is theoretically possible (the distribution of the results is similar to that of data set 2), but what it means in terms of intellectual abilities is not a simple question.

Discussion and Conclusion
We analyzed real data of intellectual tests, choosing four subtests from the total. In all cases, the distribution of results is skewed, with a skewness of about -0.3. The MCFA gave a contradictory result, with negative moderation coefficients for both loadings and residuals. Analysis of the item difficulties shows that the subtests contain more easy items than difficult ones. So the real data result can be explained, at least partly, by the irregularity of task difficulty. But the comparison of the real data and the data set 2 simulation shows that the loading moderation coefficients of the real data are greater in absolute value than those of data set 2, and, vice versa, the residual moderation coefficients of data set 2 are greater than those of the real data. So there may be some additional effect that can be explained either by SLODR or by case selection. The similarity of the "true" SLODR model and the normalized case-selection model raises the question of whether it is possible to differentiate such situations in principle within the classical psychometric framework. In that framework, normalization is considered a possible instrument for obtaining an interval scale (Furr & Bachrach, 2008), and just such rescaling converts data set 3 into data set 4, which is very similar to the "true" SLODR data set 1. We expect that the answer is "no", but the subject needs detailed exploration.
Not only normalization may lead to this effect. If there is a negatively skewed distribution of participant abilities, we may pick a set of tasks with a positively skewed distribution of difficulties to make the distribution of sum scores normal, as in the case of data set 4, with its spurious SLODR effect. Note that this effect can be detected by both the "traditional" methods (Korneev, Krichevets, & Ushakov, 2019) and the MCFA methods of detection used here, so skewness by itself cannot be fully responsible for spurious SLODR detection.
At the same time, the IRT approach may reveal both the different densities of easy and difficult tasks and a skewed distribution of respondents' ability, and so can help solve the problem mentioned above. The two-parameter IRT model could not be considered fully adequate for scoring our test tasks, due to the presence of a guessing strategy among some participants and its absence among others; but to the extent that it is appropriate, it shows a more complicated situation than an SLODR effect with normal subtest distributions. It shows a decrease of both intercorrelations and residuals as the ability level increases.
An interesting question is to what extent the different density of easy and difficult tasks in the IRT model may lead to spurious detection of a dedifferentiation effect, or to masking of the "true" SLODR effect within the IRT framework (such a hypothesis was formulated by Breit, Brunner, & Preckel, 2020, Study 2), due to the different variance of estimated ability at different loci of the ability scale. This may be a subject of future simulation studies.

Supplementary Materials
The following are available online at http://mathpsy.com/slodr: Table S1: Goodness of fit of the model obtained on different datasets. Appendix 1: Running variation estimate. Appendix 2: SPSS syntax for generation of a dataset with SLODR simulation.

Author Contributions
Anatoly Krichevets and Dmitry Ushakov developed the concept of the study and performed the theoretical analysis. Konstantin Sugonyaev collected the data. Aleksei Korneev, Alexander Vinogradov, and Aram Fomichev developed and performed the simulations and computations. Anatoly Krichevets, Aleksei Korneev, and Dmitry Ushakov prepared the original draft and performed review and editing of the manuscript. All authors discussed the results of the study and contributed to the final manuscript.

Conflicts of Interest

The authors declare no conflicts of interest.