Which Features Matter Most? Evaluating Numerical and Textual Features for Helpfulness Classification in Imbalance Dataset using XGBoost

Anindita Putri Kirani; Ristu Saptono; Rini Anggrainingsih

doi:10.15294/sji.v12i4.33443

Authors

Anindita Putri Kirani, Department of Informatics, Universitas Sebelas Maret, Indonesia
Ristu Saptono, Department of Informatics, Universitas Sebelas Maret, Indonesia
Rini Anggrainingsih, Department of Informatics, Universitas Sebelas Maret, Indonesia

DOI:

https://doi.org/10.15294/sji.v12i4.33443

Keywords:

Review helpfulness, Helpful vote, Time-based evaluation, Imbalanced data handling

Abstract

Purpose: This study aims to develop and realistically evaluate a reliable model for identifying helpful online reviews, particularly in Indonesian-language texts, which are often informal and therefore challenging to model.

Methods: This study addresses several key challenges in predicting review helpfulness: the relative effectiveness of numerical features from metadata compared with traditional text representations (TF-IDF, FastText) on noisy data; the impact of severe class imbalance; and the limitations of standard validation compared with time-based validation. To address these challenges, we built an XGBoost model and evaluated various feature combinations. A hybrid approach combining SMOTE and scale_pos_weight was applied to handle class imbalance, and the best configuration was further assessed using time-based validation to better simulate real-world conditions.
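Two of the steps named in the methods, time-based validation and class weighting, can be illustrated with a minimal sketch. It uses only the Python standard library and toy data. The field names `date` and `helpful`, the 80/20 cut, and both helper functions are assumptions made for illustration, not the authors' pipeline. The negative-to-positive ratio is a commonly used starting value for XGBoost's `scale_pos_weight`; how the study combined it with SMOTE is not described in this abstract.

```python
from datetime import date


def chronological_split(reviews, train_fraction=0.8):
    """Sort reviews by date and cut once, so every test review is
    newer than every training review (hypothetical helper)."""
    ordered = sorted(reviews, key=lambda r: r["date"])
    cut = int(len(ordered) * train_fraction)
    return ordered[:cut], ordered[cut:]


def suggested_scale_pos_weight(labels):
    """Ratio of negative to positive examples, a common starting
    value for XGBoost's scale_pos_weight (hypothetical helper)."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0:
        raise ValueError("no positive examples in the training labels")
    return negatives / positives


# Toy data: one review per month, deliberately out of order.
# Only the reviews from March and July are labeled helpful (1).
months = [4, 9, 1, 7, 10, 2, 6, 3, 8, 5]
reviews = [
    {"date": date(2023, m, 1), "helpful": 1 if m in (3, 7) else 0}
    for m in months
]

train, test = chronological_split(reviews)
weight = suggested_scale_pos_weight([r["helpful"] for r in train])

print(len(train), len(test), weight)
```

A random split would mix older and newer reviews across both sets, which is why it tends to hide the effect of changes over time that a chronological cut exposes.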

Results: The model based on numerical features consistently outperformed the text-based model, achieving a peak macro F1-score of 0.7214. It also exceeded the IndoBERT baseline (F1-score = 0.6400) and the RCNN FastText baseline (F1-score = 0.5317), which indicates that simpler feature-driven models can provide more reliable predictions on noisy review data. Time-based validation further revealed a performance decline of up to 8.06%, confirming the presence of concept drift and highlighting that standard validation tends to yield overly optimistic estimates.
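Since the headline metric is the macro F1-score, a short sketch of how it is computed may help: F1 is calculated separately for each class and then averaged without weighting, so the minority class counts as much as the majority class. The labels and predictions below are toy values chosen for illustration and are not taken from the study.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1-score for one class, treating that class as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1-scores."""
    classes = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Toy imbalanced example: 8 not-helpful (0) and 2 helpful (1) reviews.
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
score = macro_f1(y_true, y_pred)

print(accuracy, score)
```

In this toy case the accuracy is higher than the macro F1-score because the weaker performance on the minority class pulls the unweighted average down, which is the reason macro F1 is a common choice for imbalanced data.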

Novelty: The main contribution of this research lies in offering a robust methodology while demonstrating the superiority of metadata-based approaches in this context. By quantifying performance degradation through temporal validation, this study provides a more realistic benchmark for real-world applications and highlights the critical importance of regular model retraining.
