Quality profile of Arabic final semester assessment items: A psychometric analysis
DOI: https://doi.org/10.30603/al.v11i1.7322
Keywords: Arabic language assessment; content and construct validity; psychometric analysis; final semester test
Abstract
Background: The quality of assessment instruments is essential to ensure that students’ learning outcomes are measured accurately. In Arabic language learning, Final Semester Assessments (PAS) must be supported by sound psychometric qualities to function as valid and reliable evaluation tools.
Aims: This study aims to examine the quality profile of Arabic PAS items at MAN 1 Gresik by analysing their psychometric characteristics and identifying items that are feasible, need revision, or are not feasible for use.
Methods: This research employed a quantitative descriptive design using psychometric item analysis. The data consisted of 40 multiple-choice PAS items and students’ response sheets. The analysis integrated content and construct validity with empirical indicators, including point-biserial validity, KR-20 reliability, item difficulty, and item discrimination, using Microsoft Excel and ANATES V4.
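The empirical indicators named in the Methods can be illustrated with a minimal sketch. It assumes a 0/1 scored response matrix (rows are students, columns are items); the toy data, the function names, and the 27% upper/lower group split are illustrative assumptions, not the study's data or the exact conventions applied by ANATES V4.

```python
from statistics import mean, pstdev, pvariance


def item_difficulty(responses, item):
    """Proportion of examinees who answered the item correctly (p)."""
    return mean(row[item] for row in responses)


def kr20(responses):
    """Kuder-Richardson 20 reliability for dichotomously scored items."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    ps = [item_difficulty(responses, i) for i in range(k)]
    sum_pq = sum(p * (1 - p) for p in ps)
    return (k / (k - 1)) * (1 - sum_pq / pvariance(totals))


def point_biserial(responses, item):
    """Correlation between a 0/1 item score and the total score.

    Uncorrected form: the item itself is included in the total.
    Undefined when every examinee gets the item right (or wrong).
    """
    totals = [sum(row) for row in responses]
    right = [t for row, t in zip(responses, totals) if row[item] == 1]
    wrong = [t for row, t in zip(responses, totals) if row[item] == 0]
    p = len(right) / len(responses)
    return (mean(right) - mean(wrong)) / pstdev(totals) * (p * (1 - p)) ** 0.5


def discrimination(responses, item, fraction=0.27):
    """Proportion correct in the upper group minus the lower group.

    The 27% split is an assumed convention for this sketch.
    """
    ranked = sorted(responses, key=sum, reverse=True)
    n = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    return mean(r[item] for r in upper) - mean(r[item] for r in lower)


# Hypothetical toy data: 6 students x 4 items (1 = correct, 0 = incorrect).
responses = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]

print(round(kr20(responses), 3))            # 0.667
print(round(item_difficulty(responses, 3), 3))
print(round(point_biserial(responses, 0), 3))
print(discrimination(responses, 0))
```

In practice each indicator would be compared against a cut-off chosen by the analyst to decide whether an item is kept, revised, or dropped; the abstract does not state the cut-offs used in the study, so none are encoded here.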
Results: Content validity reached 92.5%, construct validity reached 82.85%, and empirical validity was moderate (r = 0.60). Overall test reliability was high (r₁₁ = 0.75). Item difficulty was dominated by medium-level items, while item discrimination was the weakest aspect. Based on integrated psychometric criteria, 40% of items were feasible, 57.5% required revision, and 2.5% were not feasible. Items fell short on the following criteria: content validity (7.5%), construct validity (42.5%), empirical validity (2.5%), level of difficulty (12.5%), and discrimination index (22.5%).
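As a reading aid, the reported percentages can be converted back to item counts. The sketch below assumes every percentage is taken over the full set of 40 items, which the abstract does not state explicitly; the variable names are illustrative.

```python
TOTAL_ITEMS = 40  # number of multiple-choice items reported in the Methods


def to_count(percent, total=TOTAL_ITEMS):
    """Convert a reported percentage into a whole number of items."""
    return round(percent * total / 100)


# Final classification of items (percentages as reported in the Results).
classification = {"feasible": 40.0, "revise": 57.5, "not feasible": 2.5}
class_counts = {label: to_count(p) for label, p in classification.items()}

# Criteria on which items fell short (percentages as reported in the Results).
failures = {
    "content validity": 7.5,
    "construct validity": 42.5,
    "empirical validity": 2.5,
    "level of difficulty": 12.5,
    "discrimination index": 22.5,
}
failure_counts = {label: to_count(p) for label, p in failures.items()}

print(class_counts)
print(failure_counts)
print(sum(failure_counts.values()))
```

Under that assumption, the classification corresponds to 16, 23, and 1 items, and the per-criterion shortfalls add up to 35, which exceeds the 24 items that were not rated feasible. That would be consistent with some items falling short on more than one criterion.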
Implications: These findings highlight the importance of systematic psychometric evaluation in Arabic language assessment. Improvements are needed in construct validity, especially Arabic language accuracy, distractor effectiveness, and item discrimination. Such an approach supports the improvement of school-based Arabic assessments to ensure more valid and reliable measurement of students’ learning outcomes.
License
Copyright (c) 2026 Zuliyah Safitri, M. Baihaqi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Copyright Notice
Authors who publish in Al-Lisan: Jurnal Bahasa (e-Journal) agree to the following terms:
- Authors retain copyright and grant the journal the right of first publication, with the work simultaneously licensed under an Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) License that allows others to share the work with an acknowledgment of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgment of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.






