P. Susongko


The purpose of this study was to validate the items of a science achievement test used at the Pancasakti Science Competition, applying the Rasch model to obtain a sufficient number of valid items. The validation framework was Messick's validity, which covers five aspects: (1) content, (2) substantive, (3) structural, (4) external, and (5) consequential. To this end, the study examined item quality through matching items, Person-Items Folder, Person/Item Folder, Test Information Function, Person Fit Statistic, Collapsed Deviance, Casewise Deviance, Hosmer-Lemeshow, accuracy, sensitivity, specificity, unidimensionality, invariance, separation, and DIF. The test consisted of 40 multiple-choice items: 15 physics items, 10 chemistry items, and 15 biology items. The participants were 85 biology students, and the time allotted was 60 minutes. Item analysis was conducted in R 3:12 with the eRm package, version 0.15-6. The results showed that the science achievement test items were valid under the Rasch model. The items met construct validity according to Messick (1996) in all five aspects: (1) content, (2) substantive, (3) structural, (4) external, and (5) consequential.
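As a rough illustration of two ideas named in the abstract, the sketch below computes the dichotomous Rasch probability of a correct response and then the accuracy, sensitivity, and specificity of model-predicted responses against observed ones. It is not the study's eRm analysis: the function names, the 0.5 cutoff, and all ability and difficulty values are hypothetical and chosen only for the example.

```python
import math

def rasch_probability(theta, beta):
    """Rasch model: probability of a correct answer for a person with
    ability theta on an item with difficulty beta (both in logits)."""
    return 1.0 / (1.0 + math.exp(-(theta - beta)))

def classification_rates(observed, probabilities, cutoff=0.5):
    """Return (accuracy, sensitivity, specificity) when each probability
    at or above the cutoff is counted as a predicted correct response."""
    tp = tn = fp = fn = 0
    for y, p in zip(observed, probabilities):
        predicted = 1 if p >= cutoff else 0
        if y == 1 and predicted == 1:
            tp += 1
        elif y == 0 and predicted == 0:
            tn += 1
        elif y == 0 and predicted == 1:
            fp += 1
        else:
            fn += 1
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return accuracy, sensitivity, specificity

# Hypothetical person ability and item difficulties (assumed values).
theta = 0.5
betas = [-1.0, 0.0, 0.5, 2.0]
observed = [1, 1, 0, 0]  # hypothetical scored responses

probabilities = [rasch_probability(theta, b) for b in betas]
rates = classification_rates(observed, probabilities)
print(rates)
```

In an actual analysis the abilities and difficulties would be estimated from the response matrix rather than fixed by hand.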


validation; Messick; Rasch model; science achievement test




APA. 1999. Standards for educational and psychological test and manuals. Washington DC.

Ayele, Dawit G., Zewotir, Temesgen, & Mwambi, Henry. 2014. Using Rasch modeling to re-evaluate rapid malaria diagnosis test analyses. International Journal of Environmental Research and Public Health, 11(7): 23-32.

Bansilal, Sarah. 2015. A Rasch analysis of a grade 12 test written by mathematics teachers. South African Journal of Science, 111.5(6): 1-9.

Baghaei, P. 2008. The Rasch model as a construct validation tool. Rasch Measurement Transactions, 22(1): 1145-1146.

Baghaei, P. & Amrahi, N. 2011. Validation of a multiple choice English vocabulary test with the Rasch model. Journal of Language Teaching and Research, 2(5): 1052-1060.

Chyi Lo, Wen-Miin Liang, Liang-Wen Hang, Tai-Chin Wu, Yu-Jun Chang & Chih-Hung Chang. 2015. A psychometric assessment of the St. George’s respiratory questionnaire in patients with COPD using Rasch model analysis. Health and Quality of Life Outcomes, 13(1): 45-60.

Glynn, S.M. 2012. International assessment: A Rasch model and teachers’ evaluation of TIMSS science achievement items. Journal of Research in Science Teaching (JRST), 49(10): 1321–1344.

Grimbeek, P., & Nisbet, S. 2006. Surveying primary teachers about compulsory numeracy testing: Combining factor analysis with Rasch analysis. Mathematics Education Research Journal, 18(2): 27-39.

Hambleton, R.K., Swaminathan, H. & Rogers, H.J. 1991. Fundamentals of item response theory. Newbury Park London New Delhi: Sage Publication.

Long, Caroline, Bansilal, Sarah, & Debba, Rajan. 2014. An investigation of Mathematical Literacy assessment supported by an application of Rasch measurement. Pythagoras, 35(1): 1-17.

Lou, Yiping, Blanchard, Pamela, & Kennedy, Eugene. 2015. Development and validation of a science inquiry skills assessment. Journal of Geoscience Education, 63(1): 73-85.

Lu, Yen-Mou, Wu, Yuh-Yih, Hsieh, Ching-Lin, & Lin, Chih-Lung. 2013. Measurement precision of the disability for back pain scale-by applying Rasch analysis. Health and Quality of Life Outcomes, 11: 119.

Mair, P. & Hatzinger, R. 2007. CML based estimation of extended Rasch models with the eRm package for the application of IRT models in R. Journal of Statistical Software, 20(9): 1-20.

Mair, P., Reise, S.P. & Bentler, P.M. 2008. IRT goodness-of-fit using approaches from logistic regression. UCLA, California, US.

Mardapi, D. 2012. Pengukuran Penilaian dan Evaluasi Pendidikan. Yogyakarta: Nuha Medika.

Messick, S. 1996. Validity and washback in language testing. Language Testing, 13(3): 241-256.

Nettekoven, Michaela & Ledermüller, Karl. 2012. Assess the Assessment: An Automated Analysis of Multiple Choice Exams and Test Items. European Conference on e-Learning: 397-XV. Kidmore End: Academic Conferences International Limited. Austria.

Neumann, I., Neumann, K., & Nehm, R. 2010. Evaluating instrument quality in science education: Rasch-based analyses of a Nature of Science test. International Journal of Science Education, 33(10): 1373-1405.

Oon, P.T., & Subramaniam, R. 2012. Factors influencing Singapore students’ choice of physics as a tertiary field of study: A Rasch analysis. International Journal of Science Education, 35(1): 86-118. doi: 10.1080/09500693.2012.718098.

Reise, S.P., Waller, N.G. & Comrey, A.L. 2000. Factor analysis and scale revision. Psychological Assessment, 12(3): 287-297.

Sabah, Saed, Hammouri, Hind, & Akour, Mutasem. 2013. Validation of a scale of attitudes toward science across countries using Rasch model: Findings from TIMSS. Journal of Baltic Science Education, 12(5): 692-702. Retrieved from: ticles/2015/987-1425810820.pdf

Schulz, W., & Fraillon, J. 2011. The analysis of measurement equivalence in international studies using the Rasch model. Educational Research and Evaluation, 17(6): 447-464.

Sjaastad, J. 2012. Measuring the ways significant persons influence attitudes towards science and mathematics. International Journal of Science Education, 35(2): 192-212.

Smith, A.B., Fallowfield, L.J., Stark, D.P., & Velikova, G. 2010. A Rasch and confirmatory factor analysis of the General Health Questionnaire (GHQ-12). Health and Quality of Life Outcomes, 8(1): 45-56.

Smith, E.V. Jr. 2001. Evidence for the reliability of measures and validity of measure interpretation: A Rasch measurement perspective. Journal of Applied Measurement, 2(3): 281-311.

Stubbe, T. C. 2011. How do different versions of a test instrument function in a single language? A DIF analysis of the PIRLS 2006 German assessments. Educational Research and Evaluation, 17(6): 465-481.

Wendt, H., Bos, W., & Goy, M. 2011. On applications of Rasch models in international comparative large-scale assessments: A historical review. Educational Research and Evaluation, 17(6): 419-446.

Wolfe, E.W. & Smith, E.V. Jr. 2007. Instrument development tools and activities for measure validation using Rasch models: Part II-validation activities. Journal of Applied Measurement, 8(2): 204-234.

Wright, B.D. & Stone, M.H. 1999. Measurement essentials. Wilmington: Wide Range Inc.

Zain, A. N. M., Samsudin, M. A., Rohandi, & Jusoh, A. 2010. Using the Rasch model to measure students’ attitudes toward science in ‘low performing’ secondary schools in Malaysia. International Education Studies, 3(2): 56-63.

