THE USA MEDICAL INSURANCE AS A STIMULATING FACTOR TO INCREASE LABOUR EFFICIENCY

Medical insurance is critical for national labour efficiency. In many countries (including the United States of America), it is tightly connected to employment: workers hold valid insurance policies that give them free and constant access to medical aid. This protects workers’ health and sustains their high performance. Under state-supported insurance, citizens have universal access to medical services regardless of their employment type, so people can receive medical aid without worrying about prices.

Within the Scopus database, we found 1,525 sources via the queries «health insurance» and «productivity». Using the Bibliometrix and RStudio software (Aria and Cuccurullo 2017), we identified the top 20 keywords, countries and title terms in papers on labour efficiency (Figure 1).

Figure 1. Interrelation within «keyword (DE) - country (AU_CO) - headline (TI_TM)»
Sources: developed by the authors based on the RStudio and Bibliometrix software.
The most used keywords across all 1,525 Scopus articles were: human, medical insurance, article, people, labour efficiency, medical service costs, female, adult, male, the USA, economics, detailed clinical research, middle age, disease costs, priority register, elderly, control study, absenteeism, healthcare costs.
Methodology and research methods. The research draws on 1987-2021 USA statistics on state and private insurance (USA Facts, 2023), labour productivity (OECD, 2023), national employment (USA Bureau of Labour Statistics, 2023), life expectancy (Macrotrends, 2023), healthcare costs as a share of gross domestic product (USA Facts, 2023) and healthcare costs by volume (USA Facts, 2023).
Before any calculations, the input data must be normalised. Normalisation brings indexes measured in different units onto a comparable scale, which is necessary for adequate information processing in analyses and forecasts. The normalised data are easier to interpret and index, and the obtained results may be used for further analysis.
The data normalisation is conducted via Formula (1), where $K$ is the normalised value of the input variable; $x_i$ is the input index value ($i = \overline{1, 35}$); $md$ is the input index median; $mx$ is the maximal input index value. To confirm the statistical quality of the model, we ran a descriptive analysis in Statgraphics. This application reports the main numeric characteristics and regularities of the data.
To develop the regression model, we use Multiple Regression Backward Selection (MRBS), an iterative algorithm that selects variables from the set of candidate predictors within the regression model. It identifies how labour efficiency is affected by state and private insurance, national employment, life expectancy, healthcare costs as a share of gross domestic product and medical expenditures.
The procedure starts from the model with all predictors. At each step, the predictor with the highest p-value is removed, until all remaining predictors meet the predefined significance threshold. This approach is often used to assess the relative significance of predictors by analysing their influence on the overall model. The result is a simplified and accurate model with fewer variables.
In statistics, the MRBS belongs to the family of stepwise procedures, which build a regression model by gradually adding or removing predictors until there is no further statistical reason to include or exclude a variable. The process seeks a model in which every predictor variable has a statistically significant relation to the response variable.
The MRBS identifies the most significant variables within the regression model. It is used to increase the efficiency of the regression model, reduce the number of predictors and simplify the analysis. The Backward Stepwise Selection (BSS) is carried out in separate steps. Firstly, an assessment criterion for model coefficients is defined, and the predictors and target variable are selected. Secondly, correlations are examined to establish how strong the ties between predictors are. Thirdly, the predictors most strongly correlated with the target variable are selected. Fourthly, a regression model is built from the selected predictors, and its statistical quality and significance are checked (Efroymson, 1960).
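The backward elimination loop described above can be sketched in a few lines of Python with NumPy. This is only an illustration under stated assumptions, not the Statgraphics implementation: the function names `ols_t_stats` and `backward_select` are hypothetical, and the cut-off `t_crit = 2.0` stands in for a p-value threshold. The substitution is legitimate within one fitted model, because all coefficients share the same residual degrees of freedom, so the predictor with the highest p-value is the one with the smallest absolute t-statistic.

```python
import numpy as np


def ols_t_stats(X, y):
    """Fit OLS with an intercept; return coefficients and their t-statistics."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X])       # prepend the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    dof = n - A.shape[1]                       # residual degrees of freedom
    sigma2 = resid @ resid / dof               # residual variance estimate
    cov = sigma2 * np.linalg.inv(A.T @ A)      # covariance of the coefficients
    return beta, beta / np.sqrt(np.diag(cov))


def backward_select(X, y, names, t_crit=2.0):
    """Drop the weakest predictor until every remaining |t| >= t_crit.

    t_crit is an assumed cut-off (roughly a 5% two-sided level for
    moderately large samples); it is not taken from the paper.
    """
    keep = list(range(X.shape[1]))
    while keep:
        _, t = ols_t_stats(X[:, keep], y)
        abs_t = np.abs(t[1:])                  # ignore the intercept
        weakest = int(np.argmin(abs_t))
        if abs_t[weakest] >= t_crit:
            break                              # all remaining predictors pass
        del keep[weakest]                      # remove the least significant one
    return [names[j] for j in keep]
```

Calling `backward_select(X, y, names)` with a 35-row matrix of normalised predictors would return the names that survive elimination; a production analysis would normally use exact p-values from a statistics package instead of a fixed t cut-off.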
The statistical quality of the model is checked via the Fisher and Student criteria, p-values, $R^2$ and MAE (mean absolute error).
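Two of these quality measures have simple closed forms, which the following sketch spells out using their standard textbook definitions. The function names are hypothetical, and the inputs are assumed to be equal-length sequences of actual and model-predicted values.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - RSS / total sum of squares."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_y) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)
```

For example, with actual values 1, 2, 3, 4 and predictions 1.5, 2, 3, 3.5, the residual sum of squares is 0.5 and the total sum of squares is 5, so $R^2 = 0.9$ and the MAE is 0.25.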
The next research stage is to build multivariate adaptive regression splines on the most significant indexes, i.e. those that still affect the resulting variable (labour efficiency) after the selection procedure.
Multivariate Adaptive Regression Splines (MARS) is a statistical technique for analysing relations between dependent and independent variables, which can be linear or non-linear.
MARS is based on regression models in which spline functions describe non-linear relations between variables. These are piecewise linear functions joined at break points (knots), with a different slope and shift on each side. The MARS technique uses these splines to produce basis functions that can describe non-linear data.
The MARS algorithm searches for the optimal break points for each variable and for their optimal combinations, which may also reflect non-linear relations between variables. Through basis functions and adaptation to the data, MARS generates very flexible models that describe the relations between variables accurately.
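The break-point search can be illustrated for the simplest case of one predictor and one knot. The sketch below is a deliberate simplification and not the Salford Predictive Modeler algorithm: full MARS repeats such a search over all variables in a forward pass and then prunes terms in a backward pass. The function name `best_knot` is hypothetical.

```python
import numpy as np


def best_knot(x, y):
    """Return (knot, rss) for the single hinge pair with the lowest RSS."""
    best = (None, np.inf)
    for c in np.unique(x)[1:-1]:               # interior values as candidate knots
        basis = np.column_stack([
            np.ones_like(x),                   # intercept
            np.maximum(0.0, x - c),            # slope to the right of the knot
            np.maximum(0.0, c - x),            # slope to the left of the knot
        ])
        coef, *_ = np.linalg.lstsq(basis, y, rcond=None)
        rss = float(np.sum((y - basis @ coef) ** 2))
        if rss < best[1]:
            best = (float(c), rss)
    return best
```

For data that are flat up to some point and rise linearly afterwards, the function recovers that point as the knot, because only a break placed there reproduces both slopes exactly.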
An important MARS advantage is that it can handle missing data values and capture relations between variables that are difficult to trace through conventional linear models. Moreover, MARS selects variables automatically and reduces model size, which lowers the risk of overfitting.
MARS is a powerful machine learning method for generating non-linear regression models. It is well suited to data sets with many predictor variables and interrelations among them. The piecewise linear functions of MARS each describe a separate section of the data and combine into a model that captures the variation in the data. Compared with simple linear regression, this technique gives more accurate results when the relations are non-linear (Friedman, 1991).
The MARS model is defined as a weighted sum of basis functions $B_i(x)$:

$$\hat{f}(x) = \sum_{i=1}^{k} c_i B_i(x) \qquad (2)$$

where $c_i$ is a constant coefficient; $k$ is the number of basis functions.
The basis function is a hinge function. Hinge functions are also used in machine learning, e.g. in support vector machine classifiers, where the hinge loss measures the margin between classifier predictions and actual class labels and penalises low margins. The hinge function takes the form $\max(0, x - c)$ or $\max(0, c - x)$, where $c$ is the break point (knot). Thus, the MARS model automatically selects the hinge function shape, the variables and their knot values. It may also define an interaction between two or more variables as a product of hinge functions.
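To make the structure concrete, the sketch below evaluates a MARS-style model as an intercept plus a weighted sum of hinge products. The function names, the term layout and all coefficients and knots in the example are hypothetical illustrations; they are not the fitted values of Formula (6).

```python
def hinge(x, knot, direction=+1):
    """max(0, x - knot) for direction=+1, max(0, knot - x) for direction=-1."""
    return max(0.0, direction * (x - knot))


def mars_predict(x, intercept, terms):
    """Evaluate intercept + sum of coefficient * product of hinge factors.

    x     : dict mapping variable name to its value
    terms : list of (coefficient, [(variable, knot, direction), ...])
    """
    total = intercept
    for coef, factors in terms:
        basis = 1.0
        for var, knot, direction in factors:
            basis *= hinge(x[var], knot, direction)   # interaction = product
        total += coef * basis
    return total


# Hypothetical model: one main effect and one pairwise interaction.
example_terms = [
    (2.0, [("K3", 0.5, +1)]),
    (-1.0, [("K3", 0.5, +1), ("K4", 0.4, -1)]),
]
value = mars_predict({"K3": 0.6, "K4": 0.2}, intercept=0.1, terms=example_terms)
```

In this made-up example the first term contributes 2 × 0.1 and the interaction contributes −1 × (0.1 × 0.2), so `value` is approximately 0.28.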
The MARS procedure checks basis functions for overfitting via the Generalised Cross-Validation (GCV) criterion. To select the best model subset, the following rule is observed: lower GCV values mean better results. In other words, GCV is a regularisation method that trades off model simplicity against goodness of fit (Craven and Wahba, 1978). It is given by Formula (3), where RSS is the residual sum of squares (the sum of squared differences between actual and model-predicted values); $N$ is the number of observations.
The GCV criterion (3) adjusts the RSS to account for model flexibility. The flexibility penalty is necessary because overly flexible models fit the noise in the data rather than its systematic structure (Bottegal and Pillonetto, 2018).
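The sketch below shows how such a criterion penalises flexibility. It uses the form of GCV commonly quoted for MARS, $\mathrm{GCV} = \mathrm{RSS} / \left(N (1 - C/N)^2\right)$, where $C$ is the effective number of parameters. Both this form and the per-knot penalty of 3 are assumptions taken from the general MARS literature; the exact expression behind Formula (3) may differ.

```python
def gcv(rss, n_obs, n_terms, penalty=3.0):
    """Generalised cross-validation score; lower is better.

    n_terms counts all model terms including the intercept.
    penalty is the assumed cost per knot (values of 2 to 3 are typical).
    """
    effective = n_terms + penalty * (n_terms - 1) / 2.0
    if effective >= n_obs:
        raise ValueError("model is too flexible for the number of observations")
    return rss / (n_obs * (1.0 - effective / n_obs) ** 2)
```

With the same RSS and 35 observations, a model with seven terms scores worse than one with five, so an extra basis function is kept only if it reduces the RSS enough to offset the penalty.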
Results. Statistical input data are given in Table 1. Since each variable is measured in different units, the input indexes must be converted into a comparable format before modelling. To this end, they are normalised according to Formula (1). The data normalisation results are given in Table 2.
Sources: developed by the authors based on the Statgraphics software.
The R-squared value of 98.4915% shows that the model fits the data closely: practically all variation in the dependent variable is explained by the independent variables through the linear regression. Such a model is highly statistically significant.
The R-squared adjusted for degrees of freedom, 98.3455%, reflects the share of variance in the dependent variable that is explained by the model once the number of predictors is taken into account. This value is extremely high.
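The two reported values can be cross-checked with the standard adjustment formula $\bar{R}^2 = 1 - (1 - R^2)\,\frac{n - 1}{n - p - 1}$. The check below assumes $n = 35$ annual observations (1987-2021) and $p = 3$ predictors (K3, K4, K6); the function name is hypothetical.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjust R-squared for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


adj = adjusted_r2(0.984915, n_obs=35, n_predictors=3)
print(f"{100 * adj:.4f}%")  # prints 98.3455%
```

Under these assumptions the reported adjusted value is reproduced exactly, which is consistent with a three-predictor model fitted on 35 observations.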
The standard error of 0.0402679 indicates a low deviation of the observed data from the fitted values, so the estimates are reliable. The mean absolute error of the model is 0.0300303. The lag 1 residual autocorrelation of 0.68334 reflects the relation between previous and current residual values. It indicates that the residuals are serially correlated, so the model can be developed further.
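For reference, a lag 1 residual autocorrelation can be computed as in the sketch below, which uses the usual sample estimator. The function name is hypothetical, and Statgraphics may apply a slightly different normalisation.

```python
def lag1_autocorrelation(residuals):
    """Sample autocorrelation between each residual and the previous one."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    num = sum(dev[t] * dev[t - 1] for t in range(1, n))   # lagged cross-products
    den = sum(d * d for d in dev)                          # total variation
    return num / den
```

Values near zero suggest independent residuals, while a clearly positive value means that an over- or under-prediction in one year tends to persist into the next.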
The next stage is to generate the MARS model in the Salford Predictive Modeler 8 software, using K1, K3, K4 and K6. We chose the following settings: target and predictor variables, the regression analysis type, the MARS analysis engine, and a basis function search limited to 40 functions. Predictor variables are allowed to interact in pairs.
The MARS model generation results in 11 basis functions. Among them, the 5th basis function is optimal (according to the minimal GCV, Figure 3). Detailed statistical information about all produced basis functions is given in Table 6. Below we explain the optimal basis function No. 5. The MARS model consists of five basis functions. The optimal MARS model using four basis functions is shown in Formula (6). Statistical characteristics of the optimal MARS model (6) are indicated in Table 8.
The research results are compared in Table 9. It includes the input coverage of state and private insurance (USA Facts, 2023) and the values calculated by the regression model (5). The latter describes how state and private insurance (K1) depends on national employment (K3), life expectancy (K4) and public costs for medical goods and services (K6). Table 9 also presents the values of the MARS model (6) within six basis functions. According to Table 9, the MARS model is more accurate for 2000-2021 than for 1987-1999, whereas the values obtained via the regression analysis are more accurate for 1987-1999 than for 2000-2021.
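A comparison of this kind can be quantified by computing the mean absolute error of each model separately for the two periods. The sketch below is a hypothetical helper for that purpose; the Table 9 values are not reproduced here, so the inputs are left to the reader.

```python
def mae_by_period(years, actual, predicted, split_year=2000):
    """Return (MAE before split_year, MAE from split_year onwards)."""
    early, late = [], []
    for year, a, p in zip(years, actual, predicted):
        (early if year < split_year else late).append(abs(a - p))
    return sum(early) / len(early), sum(late) / len(late)
```

Applied once to the regression values and once to the MARS values, the function yields four numbers that make the period-by-period accuracy of the two models directly comparable.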
Conclusions. To understand how labour efficiency is affected by national employment, life expectancy, healthcare costs as a GDP percentage and public costs for medical goods and services within the USA medical insurance system, we conducted a three-stage study.
Firstly, we normalised and described the research data. Secondly, we applied the Statgraphics software to process the values of national employment, life expectancy, healthcare costs as a GDP percentage and public costs for medical goods and services.
Thirdly, we generated the MARS model in the Salford Predictive Modeler 8 software. Within the obtained basis functions, hinge points were detected, and the relations between the independent variables (K3, K4, K6) and the dependent variable (K1) were assessed.
Finally, we compared the initial values with those calculated through the regression and MARS models. The MARS model values are closer to the initial ones for 2000-2021, while they differ from the actual normalised indexes for 1987-1999. Conversely, the regression model is more accurate for 1987-1999 than for 2000-2021. Using these two methods together opens the way to a search for other significant variables.