Nowcasting Hotel Room Occupancy Rate using Google Trends Index and Online Traveler Reviews Given Lag Effect with Machine Learning (Case Research: East Kalimantan Province)

Authors

  • Adelina Rahmawati, Politeknik Statistika STIS
  • Erna Nurmawati, Politeknik Statistika STIS
  • Teguh Sugiyarto, Badan Pusat Statistik

DOI:

https://doi.org/10.15294/sji.v11i2.5553

Keywords:

Hotel room occupancy rate, Nowcasting, Traveler reviews, Lag effect, Machine learning

Abstract

Purpose: Hotel Room Occupancy Rate (TPK) data are available only with a two-month lag, so an alternative method is needed to track changes in the economic circumstances of the tourism industry in a timely way. TPK is linked to the influx of tourists, which makes it a valuable resource for assessing the tourism potential of an area and for making informed decisions about investment in the local tourism industry. This research therefore aimed to predict TPK trends using nowcasting, with two variables obtained from big data: the Google Trends Index (IGT) and online traveler reviews.

Methods: This research compared three machine learning models, Random Forest, LSTM, and CNN-BiLSTM-Attention, to determine the best one. The datasets were drawn from several secondary data sources. TPK was obtained from BPS-Statistics Indonesia. Traveler reviews were collected through web scraping from online travel agency websites such as Tripadvisor.com, and IGT was collected for the keywords “IKN”, “hotel”, and “banjir” (Indonesian for “flood”). For the sentiment variable derived from online reviews, lag effects of one, two, and three months were analyzed to determine the correlation with TPK. The lag with the highest correlation was then included in the prediction model for all machine learning methods.
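The lag selection step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (`pearson`, `lagged_correlations`, `best_lag`) are hypothetical, the monthly values are made up, the use of Pearson correlation is an assumption, and choosing the "highest" correlation by absolute value is also an assumption. Lag 0 (the current month) is included for comparison.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def lagged_correlations(sentiment, tpk, lags=(0, 1, 2, 3)):
    """Correlate TPK in month t with sentiment in month t - lag."""
    result = {}
    for lag in lags:
        # Drop the last `lag` sentiment months and the first `lag` TPK months
        # so that each pair is (sentiment[t - lag], tpk[t]).
        x = sentiment[: len(sentiment) - lag]
        y = tpk[lag:]
        result[lag] = pearson(x, y)
    return result

def best_lag(correlations):
    """Return the lag with the largest absolute correlation (assumed criterion)."""
    return max(correlations, key=lambda lag: abs(correlations[lag]))

# Illustrative, made-up monthly values; NOT data from the study.
sentiment = [0.62, 0.70, 0.55, 0.81, 0.66, 0.74, 0.59, 0.85]
tpk = [48.0, 52.5, 45.1, 58.3, 50.2, 54.0, 46.7, 60.1]

correlations = lagged_correlations(sentiment, tpk)
print(correlations)
print("selected lag:", best_lag(correlations))
```

The sentiment series at the selected lag would then be passed as a feature to each of the candidate models.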

Results: Adding IGT and online traveler reviews increased the precision of the forecasting models. The best model for TPK nowcasting was Random Forest Regression, which had the lowest MAPE (5.37%) and an accuracy of 94.63%.
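For reference, the evaluation metric can be sketched as below. MAPE follows its standard definition. The accuracy formula (100% minus MAPE) is an assumption inferred from the fact that the two reported figures sum to 100%. The function names and the sample values are hypothetical and are not taken from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Each term is the absolute error relative to the actual value.
    errors = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(errors) / len(errors)

def accuracy_from_mape(mape_percent):
    """Accuracy in percent, assuming accuracy = 100 - MAPE."""
    return 100.0 - mape_percent

# Illustrative, made-up TPK values (percent); NOT data from the study.
actual_tpk = [50.0, 55.0, 60.0]
predicted_tpk = [47.5, 57.2, 58.8]

score = mape(actual_tpk, predicted_tpk)
print(f"MAPE: {score:.2f}%, accuracy: {accuracy_from_mape(score):.2f}%")
```

Under this assumption, a MAPE of 5.37% corresponds to an accuracy of 94.63%, which matches the reported pair.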

Novelty: The proposed method shows strong potential to improve TPK prediction by leveraging new technology and extensive data sources. The correlation between review sentiment and TPK decreases as the time lag of the sentiment increases. Sentiment in the current month therefore has the highest correlation with TPK, compared with sentiment from one, two, or three months earlier.

Article ID

5553

Published

12-06-2024

Section

Articles

How to Cite

Rahmawati, A., Nurmawati, E., & Sugiyarto, T. (2024). Nowcasting Hotel Room Occupancy Rate using Google Trends Index and Online Traveler Reviews Given Lag Effect with Machine Learning (Case Research: East Kalimantan Province). Scientific Journal of Informatics, 11(2), 507-522. https://doi.org/10.15294/sji.v11i2.5553