-
CoPhIR
The CoPhIR (Content-based Photo Image Retrieval) Test-Collection was developed to enable meaningful tests of the scalability of the SAPIR project infrastructure (SAPIR:... -
The Italian Music Dataset
The dataset is built using the Spotify and SoundCloud APIs. It comprises over 14,500 songs by both famous and lesser-known Italian musicians. Each song...-
JSON
-
GERDAQ Dataset
This is a benchmark dataset of annotated search-engine queries. Mentions of entities in search-engine queries are tagged with the entity they refer to. Wikipedia is used as...-
XML
-
ArchiveSpark
ArchiveSpark is an Apache Spark framework for easy data access, processing, extraction, and derivation on Web archives and archival collections. It has a simple and... -
German Academic Web
The dataset contains regular crawls of the websites of German academic institutions. -
MSN Search query log
The data consists of an MSN Search query log excerpt with 15 million queries, from US users, sampled over one month of activity. Data attributes made available per query: 1)... -
Product Reviews for Ordinal Quantification
This data set comprises a labeled training set, validation samples, and testing samples for ordinal quantification. It appears in our research paper "Ordinal Quantification... -
Wikipedia Word Embeddings
Embeddings were created by applying word2vec skip-gram to a corpus of non-stub Wikipedia articles from a December 2015 English dump with the following parameters: -cbow 0... -
The Propagation of Misinformation in Social Media
There is growing awareness about how social media circulate extreme viewpoints and turn up the temperature of public debate. Posts that exhibit agitation garner... -
Explaining Explanation Methods
The most effective Artificial Intelligence (AI) systems rely on complex machine learning models because of their high performance. Unfortunately, the most...-
HTML
-
Word Sense Evolution Testset
This test set consists of 23 terms that have undergone word sense change over the past centuries. The main changes for each term were found using Wikipedia, dictionary.com...-
ZIP
