ZEMCH 2019 International Conference Proceedings April.2020 | Page 21

Multiple linear regression is a linear function of several explanatory variables that predicts the response variable. When there are $p$ input variables $x_1, x_2, x_3, \cdots, x_p$ for the target variable $y$, the multiple linear regression model is as follows:
$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \cdots + \beta_p x_p + \varepsilon \qquad (1)$$
where $\beta_0, \beta_1, \beta_2, \cdots, \beta_p$ are the regression coefficients to be estimated and $\varepsilon$ is the random error term.
Regression coefficients are estimated by the least-squares method, which minimizes the sum of squared errors. The statistical significance of the model is verified by the F-statistic. If the p-value of the F-statistic is less than 0.05, the model is considered statistically significant. In this study, a stepwise regression model was used, with both the entry and stay significance levels set to 0.05, so that the final model retains only the significant predictors.
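The least-squares estimation and the overall F-statistic described above can be illustrated with a minimal sketch in plain Python. It is not the software used in the study. The data are invented toy values, the helper names (`solve_linear_system`, `fit_ols`, `f_statistic`) are hypothetical, and the stepwise selection step is not reproduced.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.

    Assumes the system is non-singular (no zero pivots are handled).
    """
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return [b0, b1, ..., bp] minimizing the sum of squared errors."""
    design = [[1.0] + list(r) for r in rows]  # leading 1 for the intercept
    k = len(design[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def f_statistic(rows, y, beta):
    """Overall F-statistic: (SSR / p) / (SSE / (n - p - 1))."""
    n, p = len(y), len(beta) - 1
    y_mean = sum(y) / n
    fitted = [beta[0] + sum(b * x for b, x in zip(beta[1:], r)) for r in rows]
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ssr = sum((fi - y_mean) ** 2 for fi in fitted)
    return (ssr / p) / (sse / (n - p - 1))


# Invented toy data: two explanatory variables and a noisy response.
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 6), (6, 5)]
y = [9.2, 7.9, 18.8, 18.1, 29.1, 27.9]

beta = fit_ols(rows, y)
f_stat = f_statistic(rows, y, beta)
print("coefficients:", beta)
print("F-statistic:", f_stat)
```

The p-value would then be read from the F distribution with (p, n − p − 1) degrees of freedom and compared with the 0.05 threshold. That lookup is left out here to keep the sketch free of external dependencies.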
2.2.2. Decision tree
A decision tree is a popular tool for classification and prediction. It splits the data repeatedly according to cutoff values of the explanatory variables. Each split creates subsets of the dataset, with each instance belonging to exactly one subset. The final subsets are called terminal nodes, and the average of the response variable in each terminal node is the prediction for that node. Tree-growing algorithms choose, at each step, the variable that best splits the set of items. Different algorithms (CHAID, CART, C4.5, C5.0) use different metrics for measuring "best". In this study, the tree was formed by the CHAID algorithm. CHAID has the advantage that both categorical and continuous variables can be used as the response variable. For the continuous response variables (EUIs for heating and cooling) used in this study, nodes were separated based on the mean and standard deviation of the response variable, and the best separation was the one with the smallest p-value of the F-statistic. In this study, the significance level for splitting was set to 0.05.
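The idea of choosing a split by an F-test on a continuous response can be sketched as follows. This is a simplified illustration, not the CHAID procedure itself: it considers only binary splits on one predictor, whereas CHAID allows multiway splits and merges categories. For binary splits of the same node the degrees of freedom are fixed, so the largest F-statistic corresponds to the smallest p-value. The data and the function names (`anova_f`, `best_binary_split`) are hypothetical.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for a list of groups of response values."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    if within == 0:
        return float("inf")  # groups are perfectly separated
    return (between / (k - 1)) / (within / (n - k))


def best_binary_split(x, y, min_leaf=2):
    """Return (cutoff, F) for the split x <= cutoff with the largest F."""
    best = None
    for cutoff in sorted(set(x))[:-1]:
        left = [yi for xi, yi in zip(x, y) if xi <= cutoff]
        right = [yi for xi, yi in zip(x, y) if xi > cutoff]
        if len(left) < min_leaf or len(right) < min_leaf:
            continue  # skip splits that leave a node too small
        f = anova_f([left, right])
        if best is None or f > best[1]:
            best = (cutoff, f)
    return best


# Invented toy data: a predictor and a continuous response.
set_temp = [18, 19, 20, 21, 22, 23, 24, 25]
eui = [60, 62, 61, 63, 80, 82, 81, 83]

cutoff, f_value = best_binary_split(set_temp, eui)
left_mean = sum(v for t, v in zip(set_temp, eui) if t <= cutoff) / sum(t <= cutoff for t in set_temp)
right_mean = sum(v for t, v in zip(set_temp, eui) if t > cutoff) / sum(t > cutoff for t in set_temp)
print("best cutoff:", cutoff, "F:", f_value)
print("terminal-node predictions:", left_mean, right_mean)
```

The two node means printed at the end play the role of the terminal-node predictions described above. Applying the 0.05 threshold would additionally require the p-value from the F distribution, which this sketch omits.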
3. Results
3.1. MLR analysis results
3.1.1. Determinants of EUI for heating
The stepwise regression result for the EUI for heating is given in Table 1. Among the building characteristics, surface area is the only significant factor. As the surface area increases by 1 m², the EUI for heating increases by 0.280 kWh/(m²∙y). This is due to the proportional increase in heat loss as the surface area facing cold outdoor air increases. Use of auxiliary heating devices, such as an electric heater or electric heating blanket, has a significant impact on the EUI for heating. (In this study, heating energy use is defined as the energy use of the main heating system (boiler), not including the energy use of auxiliary heating devices.) Households using such devices show a 32.315 kWh/(m²∙y) reduction in the EUI for heating compared to those not using them. Heating set temperature is a major factor among the occupant characteristics. As the set temperature increases by 1 ℃, the EUI for heating increases by 8.164 kWh/(m²∙y). The standardized beta shown in Table 1 allows the variables to be ranked by their level of influence. Heating set temperature (0.517), surface area (0.341), and use of auxiliary heating devices (‐0.265) affect the EUI for heating, in that order.
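As a worked illustration of how the reported coefficients are read, the short sketch below computes the predicted change in the EUI for heating for a hypothetical scenario and ranks the variables by the absolute value of their standardized beta. The coefficients are those quoted in the text. The dictionary keys and the scenario (10 m² more surface area, a set temperature 1 ℃ higher, auxiliary heating in use) are assumptions made for the example. Only differences are computed, because the intercept is not given in this passage.

```python
# Coefficients quoted in the text for the EUI for heating, in kWh/(m2*y).
BETA = {
    "surface_area_m2": 0.280,      # per 1 m2 of surface area
    "heating_set_temp_c": 8.164,   # per 1 degree Celsius of set temperature
    "uses_aux_heating": -32.315,   # 1 if auxiliary heating devices are used
}
STD_BETA = {
    "surface_area_m2": 0.341,
    "heating_set_temp_c": 0.517,
    "uses_aux_heating": -0.265,
}


def predicted_change(delta):
    """Predicted change in EUI for the given changes in each variable."""
    return sum(BETA[name] * change for name, change in delta.items())


# Hypothetical scenario, relative to a baseline household.
scenario = {"surface_area_m2": 10, "heating_set_temp_c": 1, "uses_aux_heating": 1}
change = predicted_change(scenario)

# Rank variables by influence: larger |standardized beta| means more influence.
ranking = sorted(STD_BETA, key=lambda name: abs(STD_BETA[name]), reverse=True)

print("predicted change in EUI for heating:", round(change, 3), "kWh/(m2*y)")
print("influence ranking:", ranking)
```

The ranking uses absolute values because the sign of a standardized beta indicates direction, not strength, of the effect.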
Table 1. Regression model for EUIs for heating and cooling (stepwise)

Explanatory variable         EUI for heating                   EUI for cooling
                             Beta    Std. beta    p‐value      Beta    Std. beta    p‐value
Building Characteristics
  Year of Building Permit    NS      NS           NS           NS      NS           NS
Analyzing Determinants of Energy Consumption for Heating and Cooling in Apartment Units – Comparison of Linear and Nonlinear Statistical Models 10