-
Private Illegal drug trafficking text classification dataset for PRESERVE (AI generated)
This dataset contains labelled conversations that correspond to conversations in forums, social media or instant messaging applications. It's a dataset for binary... -
Private Terrorists recruitment text classification dataset for PRESERVE (AI generated)
This dataset contains labelled conversations that correspond to conversations in forums, social media or instant messaging applications. It's a dataset for binary... -
Private Hate Speech text classification dataset for PRESERVE (AI generated)
This dataset contains labelled conversations that correspond to conversations in forums, social media or instant messaging applications. It's a dataset for hate speech... -
Private SubCat: A Dataset of Subordinate Categories in Human Mind and LLMs for the It...
People can categorize the same entity at multiple taxonomic levels, such as basic (bear), superordinate (animal), and subordinate (grizzly bear). While prior research has... -
Code and data accompanying the paper: Quantifying Privacy Risks in Synthetic ...
This repository contains the code and data for the paper “Quantifying Privacy Risks in Synthetic Data: A Study on Black-Box Membership Inference”. It enables full... -
Private AE-SAD
TensorFlow implementation of AE-SAD. This repository provides a TensorFlow implementation of the AE-SAD method for (semi-)supervised anomaly detection. Citation and Contact... -
Private Masking Models for Outlier Explanation (M2OE)
$\text{M}^2 \text{OE}$ - Masking Models for Outlier Explanation This repository provides a Python implementation of the Masking Models for Outlier Explanation ($\text{M}^2... -
Synthetic data for recruitment
The datasets consist of a pair of tables, 2000 curricula and 2000 job offers, generated by a trained generative causal model. The generation process followed a causal graph...-
CoSRec
CoSRec is the first dataset explicitly designed for joint Conversational Search and Recommendation (CSR) tasks. CoSRec comprises approximately 9,000 user-system conversations... -
CATALINA Model - Cognitive AgenT prActicaL reasonINg Architecture
This is the prototype of a novel agent architecture, called CATALINA (Cognitive AgenT prActicaL reasonINg Architecture), that expands Bratman's Belief-Desire-Intention (BDI)... -
Private Shifting LLMs style to fool Machine Generated Text detectors
Datasets of synthetic news articles generated by aligning LLMs using Direct Preference Optimization to shift the machine-generated texts' (MGT) style toward human-written text... -
LLM-Driven Explanations for Quantum Algorithms
This item contains the replication package of the paper Exploring LLM-Driven Explanations for Quantum Algorithms. In particular, it contains the explanations generated by a...-
Gender Equality Plans in Italian Universities
This dataset contains the documents describing the gender equality plans extracted for each public Italian university. Documents are divided by Italian regions and have been...-
Private Twitter users retweet
The dataset was collected using the tweepy API (http://docs.tweepy.org), a Python library for accessing the Twitter API. We selected 14 Twitter accounts, and we obtained all... -
EVALITA 2020 HT
This dataset is obtained by transforming the training and test data of the two EVALITA tasks into an LLM prompt following a template. The tasks involved are AMI2020 (misogyny...-
Private Battery State of Health in smart grids Dataset
Smart Grids are the evolution of traditional electric grids and allow two-way flows of electricity and information between different actors. At the edge of this network,... -
Experimental results from the Empirical Investigation of the Completeness of ...
This is the raw data from the empirical investigation of the paper “Completeness of Datasets Documentation on ML/AI repositories: an Empirical Investigation”. This work aim of... -
Private Optimizing Empty Container Repositioning and Fleet Deployment via Configurabl...
We introduce a novel framework, Configurable SemiPOMDPs, to model this type of problem. Furthermore, we provide a two-stage learning algorithm, “Configure & Conquer”... -
Bark Beetle Outbreak Czech Republic
Repository containing a satellite dataset created for bark beetle outbreak detection in satellite (Sentinel-1 and Sentinel-2) images. The dataset refers to scenes observed in... -
Private Vegetation of a basin of the Po river Dataset
We provide two climatological datasets composed of D = 136 (with 1038 samples) and D = 1991 (with 981 samples) continuous climatological features and a scalar target, which...
