-
German Credit
In the German Credit dataset, each of the 1,000 persons is classified as a good or bad creditor according to attributes such as age, sex, checking_account, credit_amount,...-
CSV
-
Twitter Dumps
The dataset consists of 10% of the daily stream of tweets produced on Twitter, filtered into three subsets: English, Italian, and geo-referenced. The tweets are a random sample of... -
Open data from NervousNet
This dataset contains anonymized proximity information sent by 154 mobile phones (both Android and iPhone) via phone apps. This information is sent by Bluetooth beacons every...-
ZIP
-
Car sharing dataset
The dataset comprises pickup and drop-off times and locations of vehicles in 10 European cities for one of the major free-floating car sharing operators. For nine of these... -
Twitter social bots
Spambots are automated accounts (i.e., accounts driven by a bot) that repeatedly advertise unsolicited and often harmful content (e.g., malware, URLs to phishing Web sites,... -
Broad Twitter Corpus
The Broad Twitter Corpus is a named entity-annotated dataset of tweets, collected in order to capture temporal, spatial and social diversity. The goal of the corpus is to...-
JSON
-
Estonian public sector electronic services and service providers and consumers
The dataset contains records of electronic services (aka X-Road services), service providers and consumers harvested in April 2014 from RIHA (https://riha.eesti.ee). The data... -
Twitter fake followers
Fake followers are fake accounts massively created to follow a target account and that can be bought from online markets. In other words, their goal is that of increasing the... -
Disease Twitter Dataset
This Twitter dataset covers two recent outbreaks: Ebola and Zika. About 60 million tweets were collected through a query-based access to the Twitter Streaming API, covering... -
e-MID interbank transactions
This dataset is an edge list containing daily interbank transactions as registered in the electronic Market for Interbank Deposits (e-MID) from 2010 to 2014. e-MID is... -
GeoLife - GPS trajectories dataset
This (link to a) GPS trajectory dataset was collected in the (Microsoft Research Asia) GeoLife project by 182 users over a period of more than five years (from April 2007 to August 2012)....-
ZIP
-
Russell 3000 stock prices
This dataset contains the price and volume of the 3000 stocks belonging to the Russell 3000 Index, roughly corresponding to the 3000 most capitalized stocks. Traded volume and... -
Mobility index for local quarantines in Chile
To fight the COVID-19 pandemic, most countries have implemented non-pharmaceutical interventions such as wearing masks, physical distancing, lockdowns, and travel restrictions....-
CSV
-
GPS Tracks - Tuscany 2011
This dataset contains GPS trajectories of private vehicles crossing the region of Tuscany in Italy. It is composed of about 11 million trips by 150,000 users collected in May... -
Twitter dataset about two premier UK music festivals
The dataset contains Twitter posts about two premier UK music festivals: Creamfields 2016 (August 25th-28th) and VFestival 2016 (August 20th-21st).-
Github
-
Food consumption data at the canteens of University of Pisa
A dataset storing all the meals consumed by students at the canteens of the University of Pisa over a six-year period. -
Retail Market Data
This dataset contains retail market data about food products, from 2007, for about 130 shops of an Italian distribution chain. The data cover about 1 million active clients, and... -
Compas
The COMPAS dataset contains the features used by the COMPAS algorithm for scoring defendants and their risk (Low, Medium and High), for over 4,000 individuals. We considered...-
CSV
-
-
XLSX
