User Guide
Community Group
Join our Groups.io Group to connect with other Media Cloud researchers.
Use our API
Our public API provides access to data from our archive, including search across our collection of over 5 billion sentences.
The API covers a broad array of our data, including:
- Over 25 thousand media sources
- Millions of stories collected from those media sources
- Over 5 billion sentences parsed from those stories
As described below, an authentication key is required to access the API. Register for an account here. Once you have an account, you can get your API key on your profile page. Our Python API client library makes it easy to call the API. The full API spec is available with the rest of the code on GitHub.
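As a rough illustration of what a key-authenticated request can look like, the sketch below assembles a request URL with the key passed as a query parameter. The base URL, the endpoint path, and the parameter names (`key`, `q`) are hypothetical placeholders rather than values taken from the API spec, so check the spec for the real ones. The snippet only builds the URL and makes no network call.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: check the API spec for the real base URL and endpoints.
BASE_URL = "https://api.example.org/api"


def build_request_url(endpoint: str, api_key: str, **params: str) -> str:
    """Return a URL for `endpoint` carrying the API key and any extra query parameters."""
    if not api_key:
        raise ValueError("an API key is required")
    # Using "key" as the parameter name is an assumption made for this sketch.
    query = {"key": api_key, **params}
    return f"{BASE_URL}/{endpoint.lstrip('/')}?{urlencode(query)}"


url = build_request_url("stories/list", "MY_API_KEY", q="climate change")
print(url)
```

In practice the Python client library would normally handle this step for you. The sketch is only meant to show where the key fits into a request.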
What is Media Cloud?
Media Cloud is an open source and open data platform for storing, retrieving, visualizing, and analyzing online news.
What type of data does Media Cloud collect?
The bulk of our data is news stories from media sites around the web. To support insightful analyses of media ecosystems, we also optionally collect data such as hyperlinks, Bitly clicks, Facebook shares, and Twitter shares.
How does Media Cloud get the data?
Media Cloud collects most of its content through the RSS feeds of the media sources we follow. We only have data for a source from the time we started scraping its RSS feeds.
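To make the idea of RSS-based collection concrete, here is a minimal sketch that pulls story titles, URLs, and publication dates out of an RSS 2.0 feed using Python's standard library. The sample feed and the `stories_from_rss` helper are made up for illustration. This is not Media Cloud's actual collection code.

```python
import xml.etree.ElementTree as ET

# Made-up sample feed for illustration only.
SAMPLE_RSS = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Example News</title>
    <item>
      <title>First story</title>
      <link>https://news.example.com/first</link>
      <pubDate>Mon, 01 Jan 2018 09:00:00 GMT</pubDate>
    </item>
    <item>
      <title>Second story</title>
      <link>https://news.example.com/second</link>
      <pubDate>Tue, 02 Jan 2018 10:30:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""


def stories_from_rss(xml_text: str) -> list[dict]:
    """Return one dict per <item> in the feed, with its title, URL, and publication date."""
    root = ET.fromstring(xml_text)
    stories = []
    for item in root.iter("item"):
        stories.append({
            "title": item.findtext("title", default=""),
            "url": item.findtext("link", default=""),
            "published": item.findtext("pubDate", default=""),
        })
    return stories


stories = stories_from_rss(SAMPLE_RSS)
for story in stories:
    print(story["published"], story["url"])
```

A feed typically lists only a site's most recent items, which is consistent with data for a source being available only from the time its feeds started being followed.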
What tools exist so that I can explore this data?
At the moment, we support three main tools:
- Dashboard allows you to search our database, visualize the results of your search, and download a CSV file with the URLs of the stories in our database that match your query.
- Topic Mapper takes the results of a Dashboard query, crawls the web in search of new relevant stories by following hyperlinks, and allows for different types of influence analysis and visualization.
- Source Manager lets you explore the different sources and media collections from which we collect data, and add new ones.
How do I get data?
Our tools are designed to visualize all the data we have in different ways, and also to let you download it and transfer it to other tools. In the top right corner of most of our tools you will find a menu with the download options. Due to copyright restrictions, we cannot release the actual text of a story.
What data can I have access to?
We are committed to sharing as much data as we possibly can, so you can access the data that we have and download it to your own computer. The one exception is the actual text of a story, which we cannot release due to copyright restrictions.
Can I download the content of the stories?
Due to copyright restrictions, we cannot provide the actual news content, but we can give you a complete list of URLs so you can check the content yourself.
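Once you have downloaded a list of story URLs, a few lines of Python are enough to load it for further work. The sketch below assumes a CSV file with a column named `url`. The column names and the sample rows are assumptions made for illustration, so adjust them to match the header of the file you actually download.

```python
import csv
import io

# Made-up sample standing in for a downloaded file; the column names are assumed.
SAMPLE_CSV = """stories_id,publish_date,title,url
1,2018-01-01,First story,https://news.example.com/first
2,2018-01-02,Second story,https://news.example.com/second
"""


def read_story_urls(csv_file) -> list[str]:
    """Return the non-empty values of the (assumed) "url" column, in file order."""
    reader = csv.DictReader(csv_file)
    return [row["url"] for row in reader if row.get("url")]


urls = read_story_urls(io.StringIO(SAMPLE_CSV))
print(urls)
```

For a file on disk, you could pass an open file handle instead, for example `open("stories.csv", newline="", encoding="utf-8")`, where `stories.csv` is a hypothetical file name.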
What can I do with the Dashboard tool?
You can find out how much the media have been talking about your subject of interest over time, which key events drove coverage of it, which words are most frequently used around the keywords you searched for, and which media sources have covered the issue. If you want to get into details, you can explore the list of stories. You can also compare queries, since the tool is designed to make this easy.
What can I do with the Topic Mapper tool?
Topic Mapper allows you to answer deeper questions than Dashboard, such as: Which are the most influential sources when covering a particular topic? Which were the most relevant stories about a specific issue? Which media form different linking communities? Are there groups of sources that use similar language when talking about an issue? Which stories have more social media traction? How does the structure of online news coverage about an issue evolve over time?
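One simple way to think about link-based influence is to count how many distinct sources link to each source. The sketch below does this for a handful of made-up link pairs. It is only an illustrative assumption about how such a ranking could be computed, not a description of the metrics Topic Mapper actually uses.

```python
from collections import defaultdict

# Made-up (linking_source, linked_source) pairs for illustration only.
LINKS = [
    ("blog-a.example", "paper-x.example"),
    ("blog-b.example", "paper-x.example"),
    ("blog-a.example", "paper-x.example"),  # repeat link from the same source
    ("paper-x.example", "paper-y.example"),
    ("blog-b.example", "blog-b.example"),   # self-link, ignored
]


def rank_by_inlinks(links):
    """Rank sources by the number of distinct other sources that link to them."""
    linkers = defaultdict(set)
    for src, dst in links:
        if src != dst:  # a source linking to itself says nothing about influence
            linkers[dst].add(src)
    counts = {source: len(srcs) for source, srcs in linkers.items()}
    # Highest count first; ties broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


ranking = rank_by_inlinks(LINKS)
for source, inlinks in ranking:
    print(source, inlinks)
```

Counting distinct linking sources rather than raw links keeps a single source that links many times from dominating the ranking.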
Can I run my own topic?
You will be able to create new topics very soon. For now you can explore a series of topics we’ve created for different research projects.
Can I add sources to the database?
If a source or a set of sources is not already part of our database, you can suggest its addition through the Source Manager tool, and we will carefully consider your suggestion. Our first inclination is to say yes to suggestions.