-
Private Boilernet
Deploys an artificial neural network to remove the boilerplate from HTML files. Annotates the text content in the file or extracts the text from the HTML file. -
Private Distributed W2V
Accelerated training of word embeddings for large text corpora. Creates a word2vec model from an input corpus of tokenized texts through the use of parallel distributed... -
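As a rough illustration of what a word2vec model is trained on, the sketch below generates (center, context) pairs from tokenized texts, as used by the skip-gram variant of word2vec. The entry does not say which word2vec architecture the package uses or how it distributes the work, so the function name `skipgram_pairs`, the `window` parameter, and the toy corpus are all assumptions made for illustration.

```python
def skipgram_pairs(tokens, window=2):
    """Return (center, context) pairs for one tokenized text.

    Hypothetical helper: pairs each token with every other token
    that lies at most `window` positions away from it.
    """
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # a token is not its own context
                pairs.append((center, tokens[j]))
    return pairs


# Toy input corpus of tokenized texts (assumed, not from the dataset).
corpus = [["the", "cat", "sat"], ["dogs", "bark"]]

# Each text is processed independently, so texts could be handed to
# separate workers; that is only one possible way to parallelize.
all_pairs = [pair for text in corpus for pair in skipgram_pairs(text, window=1)]
print(all_pairs)
```

Because pairs never cross text boundaries, the corpus can be split by text without changing the result.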
Conversational search dataset with labels
The CAsT 2019 data is split into two files, one for training and one for testing. Training set: CAsT 2019 conversations from training set and from test set without... -
Ego Networks of Words in Twitter
This set of dataframes was used in our last paper: Ollivier K, Boldrini C, Passarella A, Conti M (2022) Structural invariants and semantic fingerprints in the “ego network”... -
Dataset for Evaluating Abstractive Summaries of Crisis-Related Social Media
This dataset was created to evaluate summaries generated from social media posts published during five natural disasters. The dataset contains: ground truth reports created by human... -
CoPhIR
The CoPhIR (Content-based Photo Image Retrieval) Test-Collection was developed to enable significant tests of the scalability of the SAPIR project infrastructure (SAPIR:... -
Social Network Analysis @MasterBigData2022
This course introduces students to the theories, concepts, and measures of Social Network Analysis (SNA), which is aimed at characterizing the structure of large-scale Online...-
Facebook EuroSys 2009
This dataset contains social and interaction graphs representing two large-scale Facebook regional networks. Social graphs describe Facebook friendships between users... -
Fast and scalable likelihood maximization for Exponential Random Graph Models
Exponential Random Graph Models (ERGMs) have gained increasing popularity over the years. Rooted in statistical physics, the ERGMs framework has been successfully employed...-
The Italian Music Dataset
The dataset is built by exploiting the Spotify and SoundCloud APIs. It is composed of over 14,500 different songs by both famous and less famous Italian musicians. Each song...-
WBiCM
The method implements a maximum entropy model tailored for weighted bipartite graphs. This model incorporates constraints on degree sequences/topology and strength sequences. -
Egonetworks
This package contains classes and functions for the structural analysis of ego networks. An ego network is a simple model that represents a social network from the point of... -
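To make the ego network idea concrete, here is a minimal sketch that extracts one from an undirected edge list: the ego, its direct contacts (alters), and the ties among them. It does not use the Egonetworks package, whose API is not shown in this listing; the function names `build_adjacency` and `ego_network` and the toy edge list are assumptions for illustration only.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs."""
    adjacency = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency[u].add(v)
        adjacency[v].add(u)
    return adjacency


def ego_network(adjacency, ego):
    """Return (alters, edges) for the ego network of `ego`.

    alters: the direct neighbours of the ego.
    edges:  every edge with both endpoints among the ego and its alters,
            stored as sorted tuples so each edge appears once.
    """
    alters = set(adjacency.get(ego, ()))
    members = alters | {ego}
    edges = set()
    for u in members:
        for v in adjacency.get(u, ()):
            if v in members:
                edges.add(tuple(sorted((u, v))))
    return alters, edges


# Toy social graph (assumed data, not taken from any dataset listed here).
edges = [("ana", "bo"), ("ana", "cy"), ("bo", "cy"), ("cy", "dee"), ("dee", "eli")]
adjacency = build_adjacency(edges)
alters, ego_edges = ego_network(adjacency, "ana")
print(sorted(alters))     # ['bo', 'cy']
print(sorted(ego_edges))  # [('ana', 'bo'), ('ana', 'cy'), ('bo', 'cy')]
```

The tie between "cy" and "dee" is left out because "dee" is not a direct contact of the ego.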
Facebook - New Orleans regional network
This dataset contains information about 90,269 users and 3,646,662 friendship links between those users. These users belong to the New Orleans Facebook regional network. The...-
MSN Search query log
The data consists of an MSN Search query log excerpt with 15 million queries, from US users, sampled over one month of activity. Data attributes made available per query: 1)... -
A dataset of gamers on Twitter
This gaming-related dataset consists of 8932 users (labeled as gamers) engaging in game-related conversations. In June 2018 we collected their timelines (the most recent 3200... -
Tail granger causality network construction
This method constructs a causality network by implementing Granger-causality tests for extreme events in multivariate time series. -
Detecting Content That Triggers Polarization in Social Networks
We provide a method that finds echo chambers in online social networks. The method considers controversial content and finds users of the network who discuss this content...
