The Utilization of Sketch Engine in Concordance Studies: A Corpus Linguistics Perspective
DOI:
https://doi.org/10.15575/ta.v5i1.52011
Keywords:
Arabic Educational Terminology, Corpus Linguistics, Sketch Engine
Abstract
Corpus linguistics has developed rapidly in recent years, yet its application to Arabic language studies in Indonesia remains limited, particularly in examining the semantic distinctions among key educational terms. This study investigates the distribution, collocational patterns, and semantic functions of three central terms, تعليم (taʿlīm), تدريس (tadrīs), and تربية (tarbiyah), in a modern Arabic corpus, along with their implications for conceptual meaning and language pedagogy. Its novelty lies in integrating corpus-based analysis of authentic linguistic data with semantic interpretation within a pedagogical framework. The research takes a qualitative descriptive approach with a corpus-based discourse analysis design, using data drawn from the arTenTen24 corpus through Sketch Engine. Frequency, concordance, and collocation techniques are applied to identify lexical distributions and relationships. The findings show that تعليم is the dominant term and is associated primarily with educational systems and policy contexts; تدريس relates to instructional practice in classroom settings; and تربية covers broader dimensions, including values, character formation, and institutional contexts. These distinctions point to systematic differences in semantic function that reflect the conceptual structure of modern Arabic educational discourse. Theoretically, the study contributes to corpus-based lexical-semantic research in Arabic; pedagogically, it supports data-driven learning aimed at improving learners' accuracy in using educational terminology.
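The three techniques named in the abstract (frequency, concordance, collocation) can be illustrated with a minimal sketch. It is not the authors' procedure and does not call Sketch Engine: the token list is invented toy data rather than anything from arTenTen24, the function names `kwic` and `collocates` are hypothetical, and collocates are scored with logDice, the association score Sketch Engine reports, although the abstract does not say which score the study used.

```python
import math
from collections import Counter

# Invented toy data: transliterated terms with English placeholder context words.
tokens = (
    "the taʿlīm system supports taʿlīm policy while tadrīs methods "
    "shape classroom tadrīs and tarbiyah builds character"
).split()


def kwic(tokens, node, window=2):
    """Concordance: return (left, node, right) for every occurrence of `node`."""
    lines = []
    for i, tok in enumerate(tokens):
        if tok == node:
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, tok, right))
    return lines


def collocates(tokens, node, window=2):
    """Collocation: score words near `node` with logDice.

    logDice = 14 + log2(2 * f_xy / (f_x + f_y)), where f_xy is the number of
    co-occurrences and f_x, f_y are the corpus frequencies of the two words.
    Simplification (assumption): co-occurrence is counted in a plain token
    window, not over grammatical relations.
    """
    freq = Counter(tokens)  # frequency list for the whole toy corpus
    pair = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            pair.update(w for w in context if w != node)
    return {
        w: 14 + math.log2(2 * f_xy / (freq[node] + freq[w]))
        for w, f_xy in pair.items()
    }


if __name__ == "__main__":
    print(Counter(tokens).most_common(3))
    for left, node, right in kwic(tokens, "taʿlīm"):
        print(f"{left:>20} [{node}] {right}")
    print(collocates(tokens, "taʿlīm", window=1))
```

On a real corpus the same three views (a frequency list, keyword-in-context lines, and ranked collocates) would come from Sketch Engine itself. The sketch only makes explicit what each view computes.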
License
Copyright (c) 2026 Faiza Nur Khalida, Mochamad Mui'zzudin, Ahmad Suhaili

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.






