About

Tidbits from behind the scenes on Al Jazeera America's Interactive Multimedia team.

Team

Rhyne Piggott (senior exec. producer), Lam Thuy Vo (editor), Joanna Kao, Michael Keller, & Alex Newman.

Code

github.com/ajam

This article was cross-posted with the Tow Center for Digital Journalism at the Columbia Journalism School.
On the Interactive Multimedia team at Al Jazeera America (AJAM), we’re experimenting with reframing stories: how can we change the way people think about or understand a story?  Our most recent story is a map, of sorts, that aims to put the scope of the Syrian humanitarian crisis in a new context for Americans. 
The interactive allows the user to select a city, enter an address, or click on the map. Then it calculates where seven million people (the equivalent of all displaced Syrians) currently live around that point and shows this population on the map.
The shaded areas represent three groups: one million child refugees, one million adult refugees and five million internally displaced Syrians. 
Simply hearing that seven million Syrians have been affected by the war is not something many people can really grasp. But showing a New Yorker how Manhattan, Brooklyn, Queens and parts of the Bronx would be completely filled with displaced Syrians is a visual that hits home, so to speak.

Similarly, because the interactive takes into account population density, showing someone how those displaced people would be spread across a number of states if dropped in South Dakota makes the point another way.
Hearing readers
One of our main goals at AJAM is to produce stories that give a voice to the voiceless. On our team, we’re trying to tell similar stories in new formats, and, importantly, trying to extend our story into a conversation with readers.
One of the great things about working in interactive news is building projects that let readers see the story in their area. In that spirit, we built a screenshot server called Banquo that lets readers easily send us a picture of what they found and tell us why they found it interesting. We present these pictures and comments as a gallery below the interactive so readers can help us surface interesting views of the interactive and, more importantly, communicate their impressions to us and other readers.

Reading these comments actually made us aware of another angle to the story the interactive told. We always saw this project as making an equivalency: “Seven million Syrians is equal to X.” But a couple of readers saw it in a way I think tells a much more compelling story. As Douglas from Florida put it: 
“Looks like an asteroid hit the Heartland. Imagine if the entire population of Tennessee & Kentucky were displaced or had to seek refuge out of the country…” An anonymous reader similarly wrote “the number of refugees is more than the population of the state of tn, imagine if all of them were displaced?”
So, instead of just seeing numbers in context, these readers went a step further and imagined the effect of that many people losing their homes if it were to happen in the U.S.
Kelsey from DC noted that one of the only points of comparison we have is Hurricane Katrina:
“The last major event I can recall in the US that displaced thousands was Katrina. Unsurprisingly, this is orders of magnitude larger.”
We’ve been lucky to get such thoughtful comments, including ones that showed us things about the project that we hadn’t articulated as clearly while we were building it.
How it works - The map
One of the things we like about this project is it can process any data point in the country and output something simple and visual. To do that we used CartoDB to hold a table of Census tracts in a PostGIS database that we could run queries against on the fly. 
The PostGIS query does a few things. It first finds the tracts nearest to the selected point using an indexed nearest-neighbor search (which the helpful folks at CartoDB showed us and which makes the query much faster than what I had before) and then starts adding up the 2010 population values in those tracts. It then runs three separate queries to grab just the population ranges for each group, uses ST_Union() to merge all the tracts in a group into one shape, and combines the results with UNION ALL, which makes each of the three query results a row in a single larger table.
The original version of the query took around 3 to 5 seconds, but with CartoDB’s help it now runs in 0.5 to 2 seconds. The slower queries happen in rural areas, where it needs to union more tracts.
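As a rough illustration of the logic described above (not the actual PostGIS query), the grouping step can be sketched in plain Python. The tract fields (`id`, `x`, `y`, `pop`), the group names and the use of straight-line distance are all assumptions made for this example; the real query works on Census tract geometries inside the database.

```python
from math import hypot

# Group sizes from the article; the group names are hypothetical labels.
GROUPS = [
    ("child_refugees", 1_000_000),
    ("adult_refugees", 1_000_000),
    ("internally_displaced", 5_000_000),
]


def assign_tracts(tracts, point, groups=GROUPS):
    """Walk tracts from nearest to farthest and bucket them by running population."""
    # Turn group sizes into cumulative upper bounds: 1M, 2M, 7M.
    bounds = []
    total = 0
    for name, size in groups:
        total += size
        bounds.append((name, total))

    # Stand-in for the indexed nearest-neighbor search: sort by distance.
    ordered = sorted(
        tracts, key=lambda t: hypot(t["x"] - point[0], t["y"] - point[1])
    )

    assigned = {name: [] for name, _ in groups}
    running = 0
    for tract in ordered:
        if running >= total:
            break  # all seven million people are accounted for
        # The tract joins the first group whose bound is not yet reached.
        group = next(name for name, upper in bounds if running < upper)
        assigned[group].append(tract["id"])
        running += tract["pop"]
    return assigned


if __name__ == "__main__":
    demo = [
        {"id": "a", "x": 0, "y": 0, "pop": 600_000},
        {"id": "b", "x": 1, "y": 0, "pop": 600_000},
        {"id": "c", "x": 2, "y": 0, "pop": 900_000},
        {"id": "d", "x": 3, "y": 0, "pop": 5_000_000},
    ]
    print(assign_tracts(demo, (0, 0)))
```

Each list of tract ids corresponds to one of the shaded areas; in the database, the tracts in each list are what ST_Union() would merge into a single shape.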
The query returns three shapes in GeoJSON format, which Leaflet, the library that powers the map, can easily plot as a layer.
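For readers unfamiliar with GeoJSON, here is a small sketch of what a three-shape result might look like once wrapped as a FeatureCollection. The `group` property name and the placeholder square geometry are assumptions for illustration, not the fields the actual query returns.

```python
import json


def to_feature_collection(shapes):
    """Wrap (group_name, geometry) pairs as a GeoJSON FeatureCollection."""
    return {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "properties": {"group": name},  # hypothetical property name
                "geometry": geometry,
            }
            for name, geometry in shapes
        ],
    }


# Placeholder geometry: a closed unit square as [longitude, latitude] pairs.
square = {
    "type": "Polygon",
    "coordinates": [[[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]],
}

collection = to_feature_collection(
    [
        ("child_refugees", square),
        ("adult_refugees", square),
        ("internally_displaced", square),
    ]
)

if __name__ == "__main__":
    print(json.dumps(collection, indent=2))
```

A mapping library can style each feature by reading its properties, which is one way to get three differently shaded areas from a single response.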
The screenshot service
The screenshot service uses PhantomJS to visit the interactive from a server, take a screenshot of the map container and store the image. To do this, we made a Node.js library called Banquo, which passes some special options to a Node.js wrapper called node-phantom. It uses some tricks for targeting and hiding specific divs that Kevin Schaul developed in his screenshot module, Depict. Essentially, you can give Banquo a URL, a CSS selector to target and CSS selectors to hide (such as the map zoom controls) and it will give you back the base64-encoded version of the image.
Wait a minute, why return the base64 version of the image and not just write the image to a file?
We built Banquo to be really flexible and to be agnostic about what we do with this file. Actually doing stuff with this image data is where Banquo Server comes in.
Banquo Server is a simple Express.js server that our client calls with a bunch of options. It hands these options to Banquo, returns the resulting base64 image and a timestamp to the client as JSONP, and uploads the image data to S3 as a PNG.
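To make the base64 and JSONP round trip concrete, here is a minimal Python sketch of the idea. It is not Banquo Server's code (which is Express.js), and the payload keys `image_data` and `timestamp` and the callback name are hypothetical.

```python
import base64
import json


def jsonp_response(callback, image_bytes, timestamp):
    """Build a JSONP body carrying a base64-encoded image and a timestamp."""
    payload = {
        "image_data": base64.b64encode(image_bytes).decode("ascii"),
        "timestamp": timestamp,
    }
    # JSONP wraps the JSON in a call to a function the client has defined.
    return f"{callback}({json.dumps(payload)});"


def png_bytes_from_payload(payload):
    """Decode the base64 string back into the raw bytes to store as a PNG."""
    return base64.b64decode(payload["image_data"])


if __name__ == "__main__":
    fake_png = b"\x89PNG\r\n\x1a\n"  # just the PNG signature, for illustration
    print(jsonp_response("handleScreenshot", fake_png, 1383681600000))
```

Because the image travels as a string, the same data can be shown to the reader immediately and also decoded and written wherever it needs to live.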
Originally, we wanted to eliminate storing the image on S3 completely. The plan was to return the base64 data to the client and store that along with the reader’s comments. We didn’t end up going with that (more on that in the next section) but we do render the image for the reader when the screenshot is processed. Showing the image about to be submitted is good feedback: it clues the reader in to what’s going on and lets them confirm that the screenshot is accurate.
Storing comments and photos
Photos are stored on S3 using the timestamp as the file name, and we submit the rest of the reader’s comment data through a Google Form. We’re big fans of using Google Forms for reader submissions because 1) they’re really easy to customize and 2) we’re not database security experts, so we sleep better knowing that Google is already sanitizing the input against malicious code. Google Forms puts the results into a Google Spreadsheet, which everyone already knows how to use, so moderating the submissions is really easy. We have a column called “approved” and if a row is marked with a “y” it gets loaded into the client.
Unfortunately, the image data was too large to store in the Google Form; otherwise we would have stored everything in the spreadsheet. The ease of use and peace of mind, however, outweighed the extra code needed to store the images on S3.
Loading the comments
You can load data from Google Spreadsheets directly into your client using The Miso Project’s Dataset.js or Tabletop.js, but here be dragons. It’s best to use a service like Flatware or Table Service to download your spreadsheet into a flat JSON or CSV file and load your data from there.
For this particular task we used a library we made called Turntable, which copies the spreadsheet to S3 every five minutes. Turntable is a little different from those other libraries in that it has the option to copy over only moderator-approved rows and lets you specify a subset of columns to copy. The latter is nice because stripping out the submission time column that Google adds can reduce your file size.
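The moderation and column-pruning step can be sketched in a few lines of Python. This is an illustration of the idea rather than Turntable's implementation (Turntable is a Node.js script); the column names in the sample data are made up, apart from the “approved” column and its “y” marker described earlier.

```python
import csv
import io
import json


def publishable_rows(csv_text, keep_columns, approved_column="approved"):
    """Keep only moderator-approved rows, and only the listed columns."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        {col: row.get(col, "") for col in keep_columns}
        for row in reader
        if (row.get(approved_column) or "").strip().lower() == "y"
    ]


if __name__ == "__main__":
    # Hypothetical spreadsheet export; "Timestamp" stands in for the
    # submission time column that Google adds.
    sample = (
        "Timestamp,name,comment,email,approved\n"
        "11/5/2013 15:00:00,Douglas,Looks like an asteroid hit,d@example.com,y\n"
        "11/5/2013 15:05:00,Anon,Not reviewed yet,a@example.com,\n"
    )
    rows = publishable_rows(sample, ["name", "comment"])
    print(json.dumps(rows))
```

Dropping columns such as the email address before the file is published keeps non-public reader information out of the client, and leaving out unneeded columns also keeps the file small.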
There were a lot of moving parts, but in the end it all came together. You might say that’s way too much to figure out, but below is a cheatsheet of the libraries we built, which are 100% open source. Hopefully it will eliminate some of the startup cost of running your own project.
Banquo - A Node.js module that takes a screenshot of a given webpage.
Banquo Server - An Express.js server, deployable to EC2, that will return your screenshot image data as JSONP.
Turntable - A Node.js script that can copy a Google Doc onto S3 as CSV or JSON. It allows for moderation and for pruning non-public reader information such as contact details.
— Michael Keller
Extras
Here’s a sketch of how the reader comment machines all work together:

And here are some of our mock-ups:
Desktop:

Mobile:

Posted November 5, 2013 (3:00 pm)