Wajdi Zaghouani | Hamad Bin Khalifa University

FACULTY BIOGRAPHIES

Dr. Wajdi Zaghouani


Assistant Professor in Master of Arts in Digital Humanities and Societies
College of Humanities and Social Sciences
Middle Eastern Studies Department (MESD)
Master of Arts in Digital Humanities and Societies

  • Phone: +974 4454 5601
  • Office location: A141, LAS

Biography

Dr. Zaghouani is an Assistant Professor in Digital Humanities within the Middle Eastern Studies department of the College of Humanities and Social Sciences (CHSS) at Hamad Bin Khalifa University (HBKU).

He received a Ph.D. in Natural Language Processing from the University of Paris Nanterre and an M.A. in Linguistics from the University of Montreal. His research interests span several areas of computational linguistics: Arabic data analytics, linguistic annotation, language resources and evaluation, fake news detection, sentiment analysis, lexical semantics, and computational morphology. Over the years, he has participated in multiple large-scale human language technology projects, such as the Penn Arabic TreeBank and PropBank, the Multi-Arabic Dialect Applications and Resources project (MADAR), and the Qatar Arabic Language Bank, at several universities and research institutions, including the University of Colorado Boulder, the Joint Research Center of the European Commission, the University of Montreal, Carnegie Mellon University, and the University of Pennsylvania. Dr. Zaghouani has worked as a consultant for various international companies specializing in big data and information management, such as Nuance, OpenText, Nstein Technologies, Temis France, and Lionbridge. He has co-organized several international conferences and workshops, including the Arabic Natural Language Processing workshops, the Social Media Analysis in the Arab World workshop at SocInfo 2019, the QICC Fake News Detection Contest, the CheckThat! Fact Checking CLEF Lab, the Arabic Author Profiling and Deception Detection FIRE 2019 Task, and the Association for Computational Linguistics (ACL) Conference. He has published more than 50 peer-reviewed journal and conference papers, which have been cited 912 times, with an h-index of 17.

 


Research Interests

  • Digital Humanities
  • Social Media Data Processing
  • Big Data Analytics
  • Corpus Linguistics
  • Language Resources Annotation and Evaluation
  • Computational Social Science
  • Arabic Natural Language Processing
  • Computational Linguistics
  • Rhetorical Analysis
  • Author Profiling
  • Open Source Tools and Resources
  • Educational Technologies

Experience

  • Postdoctoral Research Associate

    Information Systems Program, Carnegie Mellon University (Qatar)

    2015-2018
  • Senior Research Associate

    Computer Science Program, Carnegie Mellon University (Qatar)

    2012-2015
  • Visiting Scholar at the Linguistic Data Consortium (LDC)

    University of Pennsylvania, Philadelphia, USA

    2006-2010
  • Consultant (Computational Linguist)

    University of Colorado Boulder, Colorado, USA

    2009-2010
  • Researcher in Arabic Natural Language Processing

    The Language technology group of the Joint Research Center (JRC), Ispra, Italy

    2005-2006
  • Computational Linguist

    OpenText Inc., Montreal, Canada

    2002-2004

Education

  • PhD in Computational Linguistics

    University of Paris Nanterre La Defense (Paris, France)

    2015
  • MA in Computational Linguistics

    University of Montreal (Montreal, Canada)

    2009
  • BA in Computational Linguistics

    (Minor in Computer Science, Major in Linguistics), University of Quebec in Montreal (Montreal, Canada)

    2002
  • BA in French Literature

    Language and Civilization, Université de Kairouan (Kairouan, Tunisia)

    1999

Selected Publications

  • Wajdi Zaghouani and Anis Charfi

    ArapTweet: A Large Multi-Dialect Twitter Corpus for Gender, Age and Language Variety Identification (LREC 2018, Miyazaki, Japan, 7-12 May 2018)

    2018
  • P Rosso, F Rangel, IH Farías, L Cagnina, W Zaghouani, A Charfi

    A survey on author profiling, deception, and irony detection for the Arabic language. Language and Linguistics Compass, Vol 12 (Issue 4); First published: 11 April 2018 https://doi.org/1

    2018
  • Wajdi Zaghouani and Anis Charfi

    Guidelines and Annotation Framework for Arabic Author Profiling (OSACT3 Workshop, LREC 2018, Miyazaki, Japan, 7-12 May 2018)

    2018
  • Houda Bouamor, Nizar Habash, Mohammad Salameh, Wajdi Zaghouani, Owen Rambow, Dana Abdulrahim, Ossama Obeid, Salam Khalifa, Fadhl Eryani, Alexander Erdmann and Kemal Oflazer

    The MADAR Arabic Dialect Corpus and Lexicon (LREC 2018, Miyazaki, Japan, 7-12 May 2018)

    2018
  • Nizar Habash, Salam Khalifa, Fadhl Eryani, Owen Rambow, Dana Abdulrahim, Alexander Erdmann, Reem Faraj, Wajdi Zaghouani, Houda Bouamor, Nasser Zalmout, Sara Hassan, Faisal Al shargi, Sakhar Alkhereyf, Basma Abdulkareem, Ramy Eskander, Mohammad Salameh and

    Unified Guidelines and Resources for Arabic Dialect Orthography (LREC 2018, Miyazaki, Japan, 7-12 May 2018)

    2018
  • Ossama Obeid, Salam Khalifa, Nizar Habash, Houda Bouamor, Wajdi Zaghouani and Kemal Oflazer

    MADARi: A Web Interface for Joint Arabic Morphological Annotation and Spelling Correction (LREC 2018, Miyazaki, Japan, 7-12 May 2018)

    2018
  • Sawsan AlQahtani, Mona Diab and Wajdi Zaghouani

    ARLEX: A Large Scale Comprehensive Lexical Inventory for Modern Standard Arabic (OSACT3 Workshop, LREC 2018, Miyazaki, Japan, 7-12 May 2018)

    2018
  • Wajdi Zaghouani

    Language Technologies for Social Media. INFuture 2017, Zagreb, Croatia

    2017
  • Wajdi Zaghouani, Abdelati Hawwari, Sawsan Alqahtani, Houda Bouamor, Mahmoud Ghoneim, Mona Diab and Kemal Oflazer

    Using Ambiguity Detection to Streamline Linguistic Annotation. In Proceedings of the COLING Workshop "Computational Linguistics for Linguistic Complexity" (CL4LC), Osaka, Japan, December 2016

    2016
  • Wajdi Zaghouani, Nizar Habash, Houda Bouamor, Ossama Obeid, Sawsan Alqahtani, Mona Diab and Kemal Oflazer

    Filtering Dialectal Arabic Text in Two Large Scale Annotation Projects. The 2nd Workshop on Noisy User-generated Text (W-NUT), December 11 2016, Osaka, Japan

    2016
  • Wajdi Zaghouani, Abdelati Hawwari and Mona Diab

    AMPN: A Lexical Semantic Resource for Arabic Morphological Patterns. International Journal of Speech Technology (IJST).

    2016
  • Wajdi Zaghouani, Ahmed Abdelali, Francisco Guzman and Hassan Sajjad

    Normalizing Mathematical Expressions to Improve the Translation of Educational Content. In Proceedings of the AMTA 2016 Workshop on Semitic Machine Translation (SeMaT), co-located with the EMNLP 2016 Workshops, November 1, 2016, Austin, Texas, USA

    2016
  • Wajdi Zaghouani and Dana Awad

    Toward an Arabic Punctuated Corpus: Annotation Guidelines and Evaluation. In Proceedings of the 2nd Workshop on Arabic Corpora and Processing Tools 2016, Theme: Social Media.

    2016
  • Zaghouani, Wajdi, Houda Bouamor, Abdelati Hawwari, Mona Diab, Ossama Obeid, Mahmoud Ghoneim, Sawsan Alqahtani, Kemal Oflazer

    Guidelines and Framework for a Large Scale Arabic Diacritized Corpus. In Proceedings of the International Conference on Language Resources and Evaluation (LREC'2016).

    2016
  • Zaghouani Wajdi, Nizar Habash, Ossama Obeid, Behrang Mohit, Houda Bouamor, Kemal Oflazer

    Building an Arabic Machine Translation Post-Edited Corpus: Guidelines and Annotation. In Proceedings of the International Conference on Language Resources and Evaluation (LREC'2016).

    2016
  • Ossama Obeid, Houda Bouamor, Wajdi Zaghouani, Mahmoud Ghoneim, Abdelati Hawwari, Sawsan Alqahtani, Mona Diab, Kemal Oflazer

    MANDIAC: A Web-based Annotation System For Manual Arabic Diacritization. In Proceedings of The 2nd Workshop on Arabic Corpora and Processing Tools 2016 Theme: Social Media

    2016
  • E.A. Draffan, Mike Wald, Nawar Halabi, Ouadie Sabia, Wajdi Zaghouani, Amatullah Kadous, Amal Idris, Nadine Zeinoun, David Banes, Dana Lawand

    Generating acceptable Arabic Core Vocabularies and Symbols for AAC users. In Proceedings of the 6th Workshop on Speech and Language Processing for Assistive Technologies, Interspeech 2015, Dresden, Germany, 6-10 September 2015.

    2015
  • Zaghouani, Wajdi, Nizar Habash, Houda Bouamor, Alla Rozovskaya, Behrang Mohit, Abeer Heider and Kemal Oflazer

    Correction Annotation for Non-Native Arabic Texts: Guidelines and Corpus. In Proceedings of the 9th Linguistic Annotation Workshop, co-located with NAACL in Denver, Colorado, USA, 2015

    2015
  • Houda Bouamor, Wajdi Zaghouani, Mona Diab, Ossama Obeid, Kemal Oflazer, Mahmoud Ghoneim and Abdelati Hawwari

    A Pilot Study on Arabic Multi-Genre Corpus Diacritization. In Proceedings of the ACL 2015 Workshop on Arabic Natural Language Processing (ANLP), Beijing, China, July 2015.

    2015
  • Wajdi Zaghouani, Taha Zerrouki and Amar Balla

    SAHSOH@QALB-2015 Shared Task: A Rule-Based Correction Method of Common Arabic Native and Non-Native Speakers’ Errors. The Second QALB Shared Task on Automatic Text Correction for Arabic. In Proceedings of the ACL 2015 Workshop on Arabic Natural Language Processing.

    2015
  • Alla Rozovskaya; Houda Bouamor; Nizar Habash; Wajdi Zaghouani; Ossama Obeid; Behrang Mohit

    The Second QALB Shared Task on Automatic Text Correction for Arabic. In Proceedings of the ACL 2015 Workshop on Arabic Natural Language Processing (ANLP), Beijing, China, July 2015

    2015
  • Behrang Mohit; Alla Rozovskaya; Nizar Habash; Wajdi Zaghouani; Ossama Obeid

    The First QALB Shared Task on Automatic Text Correction for Arabic. In Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing (ANLP), Doha, Qatar, October 2014.

    2014
  • Serena Jeblee; Houda Bouamor; Wajdi Zaghouani; Kemal Oflazer

    CMUQ@QALB-2014: An SMT-based System for Automatic Arabic Error Correction. In Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing (ANLP), Doha, Qatar, October 2014.

    2014
  • Wajdi Zaghouani, Behrang Mohit, Nizar Habash, Ossama Obeid, Nadi Tomeh and Kemal Oflazer

    Large-scale Arabic Error Annotation: Guidelines and Framework. In Proceedings of the International Conference on Language Resources and Evaluation (LREC'2014), Reykjavik, Iceland, 26-31 May 2014

    2014
  • Wajdi Zaghouani and Kais Dukes

    Can Crowdsourcing be used for Effective Annotation of Arabic? In Proceedings of the International Conference on Language Resources and Evaluation (LREC'2014), Reykjavik, Iceland, 26-31 May 2014

    2014
  • Wajdi Zaghouani

    Critical Survey of the Freely Available Arabic Corpora. In Proceedings of the International Conference on Language Resources and Evaluation (LREC'2014), OSACT Workshop, Reykjavik, Iceland, 26-31 May 2014

    2014
  • Ossama Obeid, Wajdi Zaghouani, Behrang Mohit, Nizar Habash, Kemal Oflazer and Nadi Tomeh

    A Web-based Annotation Framework For Large-Scale Text Correction. In Proceedings of IJCNLP’2013, Nagoya, Japan.

    2013
  • Abdelaati Hawwari, Mohsen Rashwan and Wajdi Zaghouani

    A Lexical Semantic Resource for Quranic Morphological Patterns. The International Conference for the Development of Quranic Studies. http://www.quranicconferences.com/. Riyadh, KSA, 16-20 February 2013

    2013
  • Wajdi Zaghouani

    Arabic Natural Language Processing and the Future. In Proceedings of CECTAL’13, Montreal, Canada, Sept 26th 2013

    2013
  • Hawwari, A.; Zaghouani, W.; O'Gorman, T.; Badran, A.; Diab, M.

    Building a Lexical Semantic Resource for Arabic Morphological Patterns. Communications, Signal Processing, and their Applications (ICCSPA), 2013, pp. 1-6, 12-14 Feb. 2013.

    2013
  • Wajdi Zaghouani

    RENAR: A Rule-Based Arabic Named Entity Recognition System. ACM Trans. Asian Lang. Inf. Process. 11(1): 2 (2012).

    2012
  • Wajdi Zaghouani, Abdelati Hawwari and Mona Diab

    A Pilot PropBank Annotation for Quranic Arabic. In Proceedings of the first workshop on Computational Linguistics for Literature, NAACL-HLT 2012, Montreal, Canada

    2012
  • Mohammed Maamouri, Wajdi Zaghouani, Violetta Cavalli-Sforza, Dave Graff and Mike Ciul

    Developing ARET: An NLP-based Educational Tool Set for Arabic Reading Enhancement. In Proceedings of The 7th Workshop on Innovative Use of NLP for Building Educational Applications, NAACL-HLT 2012, Montreal, Canada

    2012
  • Wajdi Zaghouani

    Étude sur la composition des noms de personnes dans la langue arabe. In Proceedings of the 25th Journées de linguistique de Laval, 9-11 March 2011, Laval University, Québec, Canada.

    2011
  • Wajdi Zaghouani, Mona Diab, Aous Mansouri, Sameer Pradhan and Martha Palmer

    The Revised Arabic PropBank. In Proceedings of the 4th Linguistic Annotation Workshop (ACL), Uppsala, July 15-16, 2010

    2010
  • Eric Atwell, Kais Dukes, Abdul-Baquee Sharaf, Nizar Habash, Bill Louw, Bayan Abu Shawar, Tony McEnery, Wajdi Zaghouani, Mahmoud El-Haj

    Understanding the Quran: a new Grand Challenge for Computer Science and Artificial Intelligence. In Grand Challenges in Computing Research for 2010 and beyond, part of the ACM-BCS Visions of Computer Science conference, 13-16 April 2010, Edinburgh University

    2010
  • Mohamed Maamouri, Ann Bies, Seth Kulick, Wajdi Zaghouani, Dave Graff and Mike Ciul

    From Speech to Trees: Applying Treebank Annotation to Arabic Broadcast News. In Proceedings of LREC 2010, Valletta, Malta, May 17-23, 2010.

    2010
  • Wajdi Zaghouani, Bruno Pouliquen, Mohamed Ebrahim and Ralf Steinberger

    Adapting a resource-light highly multilingual Named Entity Recognition system to Arabic. In Proceedings of LREC 2010, Valletta, Malta, May 17-23, 2010.

    2010
  • Wajdi Zaghouani

    L'intégration d'un outil de repérage d'entités nommées pour la langue arabe dans un système de veille. Session Démo, TALN 2010, Montréal, 19-23 juillet 2010

    2010
  • Mona Diab, Aous Mansouri, Martha Palmer, Olga Babko-Malaya, Wajdi Zaghouani, Ann Bies, Mohammed Maamouri

    A Pilot Arabic Propbank. LREC 2008, Marrakech, Morocco, May 28-30, 2008

    2008
  • J. VÉRONIS, O. HAMON, C. AYACHE, R. BELMOUHOUB, O. KRAIF, D. LAURENT, T.M.H. NGUYEN, N. SEMMAR, F. STUCK, W. ZAGHOUANI

    La campagne d'évaluation ARCADE II. In Chaudiron, S. & Choukri, K. (Eds.) L'évaluation des technologies de traitement de la langue (pp. 47-69). Paris: Hermes Science Publications, IC2 Cognition Collection. ISBN 978-2-7462-1992-2.

    2008
  • Bruno Pouliquen, Marco Kimler, Ralf Steinberger, Camelia Ignat, Tamara Oellinger, Ken Blackler, Flavio Fuart, Wajdi Zaghouani, Anna Widiger, Ann-Charlotte Forslund, Clive Best

    Geocoding multilingual texts: Recognition, Disambiguation and Visualisation. Proceedings of LREC'2006, pp. 53-58. Genoa, Italy, 24-26 May 2006

    2006
  • Yun-Chuang Chiao, Olivier Kraif, Dominique Laurent, Thi Minh Huyen Nguyen, Nasredine Semmar, François Stuck, Jean Véronis, Wajdi Zaghouani

    Evaluation of multilingual text alignment systems: the ARCADE II project. Proceedings of the 5th International Conference on Language Resources and Evaluation (LREC'2006). Genoa, Italy, 24-26 May 2006

    2006
  • Pouliquen Bruno, Ralf Steinberger, Camelia Ignat, Irina Temnikova, Anna Widiger, Wajdi Zaghouani & Jan Žižka

    Multilingual person name recognition and transliteration. Journal CORELA - Cognition, Représentation, Langage. Numéros spéciaux, Le traitement lexicographique des noms propres. Available online at: http://edel.univ-poitiers.fr/corela/document.php?id=490.

    2005