Visit the Data Analytics site for more information about the team's work. Some of our more recent projects and demos include:
NADEEF
NADEEF (which means "clean" in Arabic) is an extensible, generalized data cleaning system. Released as open source, NADEEF allows users to implement their own data repair algorithms to replace the default NADEEF repair implementation.
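To illustrate the idea of swapping out a default repair step, here is a minimal sketch in Python. It is not NADEEF's actual API: the names `find_fd_violations`, `majority_repair`, and `clean` are hypothetical, and the rule (one column determines another) is chosen only for illustration.

```python
from collections import Counter, defaultdict
from typing import Callable, Dict, List

Row = Dict[str, str]
RepairFn = Callable[[List[Row], str], None]


def find_fd_violations(rows: List[Row], lhs: str, rhs: str) -> List[List[Row]]:
    """Group rows by `lhs` and return the groups whose `rhs` values disagree."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[lhs]].append(row)
    return [g for g in groups.values() if len({r[rhs] for r in g}) > 1]


def majority_repair(group: List[Row], rhs: str) -> None:
    """Hypothetical default repair: overwrite `rhs` with the group's most common value."""
    winner, _ = Counter(r[rhs] for r in group).most_common(1)[0]
    for r in group:
        r[rhs] = winner


def clean(rows: List[Row], lhs: str, rhs: str,
          repair: RepairFn = majority_repair) -> List[Row]:
    """Detect violations of lhs -> rhs and fix each with the supplied repair function."""
    for group in find_fd_violations(rows, lhs, rhs):
        repair(group, rhs)
    return rows
```

A caller could keep the detection step and pass a different `repair` function, for example one that blanks out conflicting values for manual review instead of guessing.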
Rayyan
The Systematic Reviews Web App. Rayyan aims to build tools to support the process of creating, analyzing, and maintaining systematic reviews, in terms of data extraction, cleaning, integration, and mining of published clinical trials and journal articles. A production system is available here.
KATARA
KATARA aims to perform trusted data cleaning by using reliable knowledge bases augmented with crowdsourcing for validation.
Analytics on Data Anomalies
Oftentimes users face errors in the results of a query. We introduce DBRx, a system for discovering concise explanations of data anomalies.
Web Data Integration
Web data is a great opportunity, but using it in analytics requires new solutions to overcome its variety and volatility. In this project we exploit web data for data integration tasks.
World Bank's Auto Geotagger
The World Bank's Auto Geotagger is a prototype that automatically identifies locations in documents from the World Bank Projects Data API using the Stanford Named Entity Recognizer (NER) and Alchemy, geocodes them with the Google Geocoder, Yahoo! Placefinder, and Geonames, and visualizes them on a map.
Tamr
Tamr, a start-up founded in 2013, is based on technology developed at QCRI that allows for scalable data curation and integration by deduplicating the resulting composite dataset.
Join the Innovation!
QCRI’s vision is to be a global leader of computing research in identified areas that will bring positive impact to the lives of citizens and society.