Assessment of the Difficulty of Mathematics High-Level Reasoning Using Focus Group Discussion and Item Program Approaches

Rukli Rukli (1), Ma'rup Ma'rup (2*)

(1) Universitas Muhammadiyah Makassar
(2) Universitas Muhammadiyah Makassar
(*) Corresponding Author



This study estimates the difficulty level of high-level mathematical reasoning problems by comparing estimates from a Focus Group Discussion (FGD) approach with those from the Iteman program. The FGD approach rated difficulty on a 1-7 semantic differential scale, while the Iteman analysis used version 4.0. The study used a quantitative-comparative design involving 79 FGDs of SMP/MTs (junior secondary school) students and teachers in Soppeng Regency; each FGD consisted of four people, one teacher and three grade VIII students. Difficulty levels were compared using the Scheffé test at a significance level of 0.05. The results showed that the mean difficulty level from the FGD approach did not differ significantly from the mean difficulty level in the Iteman program output; that is, the examinees' FGD estimates agreed with the Iteman output computed from the same examinees' response data. This suggests that, for assessing the difficulty of high-level reasoning questions, the FGD approach can serve as an alternative for teachers in schools alongside the Iteman program.
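The two quantities being compared above can be illustrated with a minimal Python sketch: classical (CTT) item difficulty as the proportion of correct responses (the statistic Iteman reports), and a two-group Scheffé comparison of mean difficulty at alpha = 0.05. Function names and the pooled-variance two-group form of the Scheffé statistic are illustrative assumptions, not the study's actual computation:

```python
import numpy as np
from scipy import stats

def item_difficulty(responses):
    """Classical item difficulty: proportion of correct answers per item.
    responses: 2-D array-like (examinees x items) of 0/1 scores."""
    return np.asarray(responses, dtype=float).mean(axis=0)

def scheffe_contrast(group_a, group_b, alpha=0.05):
    """Scheffé test for the difference between two group means
    (two-group case, pooled within-group variance)."""
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    k = 2                                  # number of groups
    n_a, n_b = len(a), len(b)
    df_within = n_a + n_b - k
    # pooled within-group mean square
    msw = (((a - a.mean()) ** 2).sum()
           + ((b - b.mean()) ** 2).sum()) / df_within
    diff = a.mean() - b.mean()
    # Scheffé statistic, compared against (k - 1) * F critical value
    f_stat = diff ** 2 / (msw * (1 / n_a + 1 / n_b))
    f_crit = (k - 1) * stats.f.ppf(1 - alpha, k - 1, df_within)
    return diff, f_stat, bool(f_stat > f_crit)
```

Here each FGD's 1-7 semantic-differential rating (rescaled) would play the role of one group's difficulty estimates, and the Iteman proportions the other; a non-significant result, as in the study, means the two sets of estimates are statistically indistinguishable.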


Difficulty Level; High-Level Reasoning Questions; Focus Group Discussion; Iteman Program






Copyright (c) 2021 Rukli

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.



Magister Program of Mathematics Education

Postgraduate, Universitas Negeri Makassar


Daya Matematis: Jurnal Inovasi Pendidikan Matematika