Machine translation system broadens access to information using the power of deep learning

Entity: Qatar Computing Research Institute
HBKU QCRI’s Shaheen Achieves Milestone With Over 1 Billion Words Translated

Qatar Computing Research Institute (QCRI), part of Hamad Bin Khalifa University (HBKU), recently celebrated a milestone when its machine translation system, Shaheen, surpassed 1 billion words translated.

Using statistical and deep learning methods for language processing, Shaheen yields accurate Arabic versions of English content and vice versa. As the official language of 25 countries, Arabic is spoken by over 400 million people and widely used globally in scientific and artistic literature. By making Arabic content accessible to the outside world, the sophisticated system facilitates knowledge sharing and learning by providing wider access to information. 

Dr. Hassan Sajjad, Scientist, QCRI, said: “One of the priority areas of QCRI is Arabic language technologies with the intention of promoting the language in the information age. Shaheen, which has been widely used worldwide in different fields and applications, is one of QCRI's successes in line with this objective. Shaheen uses a state-of-the-art technology that preserves context in translating between languages, providing users with high quality content.”

Shaheen’s success was made possible by the QCRI Arabic Language Technologies (ALT) team at HBKU, including Dr. Hassan Sajjad, Dr. Nadir Durrani, Dr. Ahmed Abdelali and Fahim Dalvi. The team also had the support of Dr. Stephan Vogel and Dr. Francisco Guzmán during the initial stages of the system’s development.

The QCRI development team used a comprehensive collection of Arabic and English documents of various types, styles and topics, such as United Nations proceedings, news, TED talks, movie subtitles and educational lectures, and performed billions of computations to train and hone the system. They developed artificial intelligence-based domain adaptation and generalization techniques that allow the model to learn to translate between the two languages from heterogeneous data while maintaining high-quality translations.

Since its launch in 2018, Shaheen has been used in 46 countries across five continents by organizations such as Al Jazeera Media Network, the BBC and Deutsche Welle. To date, Shaheen has logged more than 171,363 computing hours, or approximately 7,000 days, producing translations that enhance the user’s understanding of the original document.

To access the platform, please go to https://mt.qcri.org/api. For more information on QCRI, please visit qcri.hbku.edu.qa.

 

