Comparative Study of Machine Learning Models for U.S. Housing Price Prediction

Authors

  • Wenguang Zhou, Szkoła Główna Handlowa, Poland
  • Wenjiao Zhou, Akademia Leona Koźmińskiego, Poland

DOI:

https://doi.org/10.71222/1s8mmk10

Keywords:

housing price prediction, machine learning, regression models, feature engineering

Abstract

This study compares machine learning models for predicting U.S. house prices on a large-scale real estate dataset covering all U.S. states, cities, and ZIP code regions. The goal is to evaluate prediction accuracy and to identify effective preprocessing strategies for high-cardinality location features. The data were cleaned through missing-value handling, deduplication, and outlier removal based on the interquartile range (IQR). Leakage-aware feature engineering was then applied, including aggregating ZIP codes into ZIP3, grouping cities by the top K, and decomposing dates, while excluding target-derived variables. Three models (linear regression, random forest, and XGBoost) were trained and evaluated on a held-out test set using mean absolute error (MAE), mean squared error (MSE), and the coefficient of determination (R²). XGBoost achieves the best performance, outperforming both linear regression and random forest, and feature importance analysis indicates that location indicators play the dominant role in predictive gain.
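
The preprocessing and evaluation steps named in the abstract can be sketched roughly as follows. This is a minimal illustration that uses only the Python standard library, not the authors' pipeline. All function names are hypothetical. The 1.5 × IQR fence multiplier, the reading of ZIP3 as the first three digits of a ZIP code, the "OTHER" bucket for cities outside the top K, and the choice of year, month, and quarter as date components are assumptions that the abstract does not state. The models themselves (linear regression, random forest, XGBoost) are not fitted here.

```python
from collections import Counter
from datetime import date
from statistics import mean, quantiles


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences at Q1 - k*IQR and Q3 + k*IQR.

    Assumption: k=1.5 is the conventional multiplier; the abstract gives none.
    """
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def zip3(zip_code):
    """Collapse a ZIP code to its first three digits (assumed meaning of ZIP3)."""
    return str(zip_code).zfill(5)[:3]


def top_k_mapper(train_cities, k):
    """Build a function that keeps the k most frequent cities.

    Every other city maps to "OTHER" (an assumed bucket name). Counting on
    the training split only is one way to keep this step leakage-aware.
    """
    keep = {city for city, _ in Counter(train_cities).most_common(k)}
    return lambda city: city if city in keep else "OTHER"


def date_parts(d):
    """Decompose a date into year, month, and quarter (assumed components)."""
    return {"year": d.year, "month": d.month, "quarter": (d.month - 1) // 3 + 1}


def regression_metrics(y_true, y_pred):
    """Compute MAE, MSE, and R² for paired true and predicted values."""
    errors = [t - p for t, p in zip(y_true, y_pred)]
    y_bar = mean(y_true)
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return {
        "mae": mean(abs(e) for e in errors),
        "mse": ss_res / len(errors),
        "r2": 1 - ss_res / ss_tot,
    }


# Toy usage with made-up values, purely for illustration.
prices = [100, 110, 120, 130, 140, 10_000]
low, high = iqr_bounds(prices)
kept_prices = [p for p in prices if low <= p <= high]
group_city = top_k_mapper(["Austin", "Austin", "Boise", "Boise", "Casper"], k=2)
features = {"zip3": zip3("02139"), "city": group_city("Casper"),
            **date_parts(date(2024, 5, 17))}
```

In a fuller workflow the fences and the top-K city set would normally be fitted on the training split and then reused unchanged on the test split, and a library such as pandas or scikit-learn would replace these helpers.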

Published

14 March 2026

Section

Article

How to Cite

Zhou, W., & Zhou, W. (2026). Comparative Study of Machine Learning Models for U.S. Housing Price Prediction. Journal of Computer, Signal, and System Research, 3(2), 66-73. https://doi.org/10.71222/1s8mmk10