Adapting Falcon3-7B Language Model for Arabic: Methods, Challenges, and Outcomes
Basma El Amel Boussaha; Mohammed Alyafeai; Ahmed Alzubaidi; Leen Al Qadi; Shaikha Alsuwaidi; Hakim Hacid
ArabJobs: A Multinational Corpus of Arabic Job Ads
Mo El-Haj
Semitic Root Encoding: Tokenization Based on the Templatic Morphology of Semitic Languages in NMT
Brendan T. Hatch; Stephen D. Richardson
3LM: Bridging Arabic, STEM, and Code through Benchmarking
Basma El Amel Boussaha; Leen Al Qadi; Mugariya Farooq; Shaikha Alsuwaidi; Giulia Campesan; Ahmed Alzubaidi; Mohammed Alyafeai; Hakim Hacid
TuniFra: A Tunisian Arabic Speech Corpus with Orthographic Transcriptions and French Translations
Alex Choux; Marko Avila; Josep Crego; Fethi Bougares; Antoine Laurent
The Cross-Lingual Cost: Retrieval Biases in RAG over Arabic-English Corpora
Chen Amiraz; Yaroslav Fyodorov; Elad Haramaty; Zohar Karnin; Liane Lewin-Eytan
Open-domain Arabic Conversational Question Answering with Question Rewriting
Mariam E. Hassib; Nagwa El-Makky; Marwan Torki
ATHAR: A High-Quality and Diverse Dataset for Classical Arabic to English Translation
Mohammed Sabry Mohammed; Mohammed Khalil
A-SEA-L-DU: A Fully Automated Self-Evolving, Adversarial Agentic Framework for Arabic Long-Context Document Understanding
Kesen Wang; Daulet Toibazar; Pedro J Moreno Mengibar
Lemmatizing Dialectal Arabic with Sequence-to-Sequence Models
Mostafa Saeed; Nizar Habash
Saudi-Alignment Benchmark: Assessing LLMs Alignment with Cultural Norms and Domain Knowledge in the Saudi Context
Manal Alhassoun; Imaan Mohammed Alkhanen; Nouf Alshalawi; Ibtehal Baazeem; Waleed Alsanie
AraHalluEval: A Fine-grained Hallucination Evaluation Framework for Arabic LLMs
Aisha Alansari; Hamzah Luqman
Evaluating Prompt Relevance in Arabic Automatic Essay Scoring: Insights from Synthetic and Real-World Data
Chatrine Qwaider; Kirill Chirkunov; Bashar Alhafni; Nizar Habash; Ted Briscoe
InExOntology - Ontology-Driven LLM Prompting for Unified Information Extraction Tasks
Alaa Aljabari; Nagham Hamad; Mohammed Khalilia; Mustafa Jarrar
Tahḏīb: A Rhythm-Aware Phrase Insertion for Classical Arabic Poetry Composition
Mohamad Elzohbi; Richard Zhao
Can LLMs Directly Retrieve Passages for Answering Questions from Qur'an?
Sohaila Eltanbouly; Salam Albatarni; Shaimaa Hassanein; Tamer Elsayed
ArabEmoNet: A Lightweight Hybrid 2D CNN-BiLSTM Model with Attention for Robust Arabic Speech Emotion Recognition
Ali Abouzeid; Bilal Elbouardi; Mohamed Maged; Shady Shehata
Capturing Intra-Dialectal Variation in Qatari Arabic: A Corpus of Cultural and Gender Dimensions
Houda Bouamor; Sara Al-Emadi; Zeinab Ibrahim; Hany Fazzaa; Aisha Sultan
Feature Engineering is not Dead: A Step Towards State of the Art for Arabic Automated Essay Scoring
Marwan Sayed; Sohaila Eltanbouly; May Bashendy; Tamer Elsayed
Assessing Large Language Models on Islamic Legal Reasoning: Evidence from Inheritance Law Evaluation
Abdessalam Bouchekif; Samer Rashwani; Heba Sbahi; Shahd Gaben; Muetaz Al-Khatib; Mohammed Ghaly
BALSAM: A Platform for Benchmarking Arabic Large Language Models
Rawan Nasser Almatham; Kareem Mohamed Darwish; Raghad Al-Rasheed; Waad Thuwaini Alshammari; Muneera Alhoshan; Amal Almazrua; Asma Al Wazrah; Mais Alheraki; Firoj Alam; Preslav Nakov; Norah A. Alzahrani; Eman Albilali; Nizar Habash; Abdelrahman Mustafa El-Sheikh; Muhammad Elmallah; Hamdy Mubarak; Zaid Alyafeai; Mohamed Anwar; Haonan Li; Ahmed Abdelali; Nora Altwairesh; Maram Hasanain; Abdulmohsen Al-Thubaity; Shady Shehata; Bashar Alhafni; Injy Hamed; Go Inoue; Khalid Elmadani; Ossama Obeid; Fatima Haouari; Tamer Elsayed; Emad A. Alghamdi; Khalid Almubarak; Saied Alshahrani; Ola Aljarrah; Safa Alajlan; Areej Alshaqarawi; Maryam Alshihri; Sultana Alghurabi; Atikah Alzeghayer; Afrah Altamimi; Abdullah Alfaifi; Abdulrahman M Alosaimy
TEDxTN: TEDx Speech Translation Corpus for Code-Switched Tunisian Arabic - English
Fethi Bougares; Salima Mdhaffar; Haroun Elleuch; Yannick Estève
AutoArabic: A Three-Stage Framework for Localizing Video-Text Retrieval Benchmarks
Mohamed Eltahir; Osamah Sarraj; Abdulrahman Alfrihidi; Taha Alshatiri; Mohammed Khurd; Mohammed Bremoo; Tanveer Hussain
Shawarma Chats: A Benchmark Exact Dialogue & Evaluation Platter in Egyptian, Maghrebi & Modern Standard Arabic—A Triple-Dialect Feast for Hungry Language Models
Kamyar Zeinalipour; Mohamed Zaky Saad; Oumaima Attafi; Marco Maggini; Marco Gori
Zero-Shot and Fine-Tuned Evaluation of Generative LLMs for Arabic Word Sense Disambiguation
Yossra Noureldien; Abdelrazig Mohamed; Farah Attallah
Nile-Chat: Egyptian Language Models for Arabic and Latin Scripts
Guokan Shang; Hadi Abdine; Ahmad Chamma; Amr Mohamed; Mohamed Anwar; Abdelaziz Bounhar; Omar El Herraoui; Preslav Nakov; Michalis Vazirgiannis; Eric P. Xing
Mind the Gap: A Review of Arabic Post-Training Datasets and Their Limitations
Mohammed Alkhowaiter; Saied Alshahrani; Norah F Alshahrani; Reem I. Masoud; Alaa Alzahrani; Deema Alnuhait; Emad A. Alghamdi; Khalid Almubarak
Bridging Dialectal Gaps in Arabic Medical LLMs through Model Merging
Ahmed Ibrahim; Abdullah Hosseini; Hoda Helmy; Wafa la; Ahmed Serag
Tool Calling for Arabic LLMs: Data Strategies and Instruction Tuning
Asım Ersoy; Enes Altinisik; Kareem Mohamed Darwish; Husrev Taha Sencar
Toward Culturally-Aware Arabic Debate Platforms with NLP Support
Khalid Al Khatib; Mohammad Khader
Modeling North African Dialects from Standard Languages
Yassine Toughrai; Kamel Smaïli; David Langlois
Learning Word Embeddings from Glosses: A Multi-Loss Framework for Arabic Reverse Dictionary Tasks
Engy Ibrahim; Farhah Adel; Marwan Torki; Nagwa El-Makky
ALARB: An Arabic Legal Argument Reasoning Benchmark
Harethah Abu Shairah; Somayah S. Alharbi; Abdulaziz A. AlHussein; Sameer Alsabea; Omar Shaqaqi; Hebah A. Alshamlan; Omar Knio; George Turkiyyah
Transfer or Translate? Argument Mining in Arabic with No Native Annotations
Sara Nabhani; Khalid Al Khatib
An Exploration of Knowledge Editing for Arabic
Basel Mousi; Nadir Durrani; Fahim Dalvi
Octopus: Towards Building the Arabic Speech LLM Suite
Sara Althubaiti; Vasista Sai Lodagala; Tjad Ivan Clark; Yousseif Alshahawy; Daniel Izham; Abdullah S Alrajeh; Aljawahrah; Ahmed Ali
ArabicWeb-Edu: Educational Quality Data for Arabic LLM Training
Majd Hawasly; Tasnim Mohiuddin; Hamdy Mubarak; Sabri Boughorbel
AMCrawl: An Arabic Web-Scale Dataset of Interleaved Image-Text Documents and Image-Text Pairs
Mustafa Alturki; Shahad Mohammed Aboukozzana; Daulet Toibazar; Muhammad Kamran J Khan; Ahmed Ali
DialG2P: Dialectal Grapheme-to-Phoneme. Arabic as a Case Study
Majd Hawasly; Hamdy Mubarak; Ahmed Abdelali; Ahmed Ali