The Meedan Blog Archive

Advancing New Media Research: Review of USIP Report

It was very interesting to read the conclusions of a special report written by a panel of new media experts and researchers for the US Institute of Peace.

The report advocates for new research tools and methods to understand the impact of new media - particularly tools that parse languages other than English, new 'open' approaches to collecting data, and new approaches to categorization:

Better research tools are urgently needed. Although some highly promising tools exist, they need to be developed so that they can parse languages other than English. New tools that can identify the tone of communication would help greatly but would also require major technological advances.

The disparity between publicly available data on new media and those held by private companies (or, in some cases, publicly owned companies in other countries) is considerable. Public-private partnerships, or initiatives sponsored by well-respected nongovernmental bodies, are needed to create frameworks that would allow research on the consequences of new media.

Being sensitive to the differences between, and relationships among, the various kinds of new media is also important. Blogs are different from text messages, and both are different from social networking sites. Categorizing these media in terms of their form and likely consequences would help advance research and policy.

The report goes on to cite a set of tools that provide useful templates for future research:

the Berkman Center's Media Cloud, which uses language processing to help conduct content analysis on an archive of news stories and blog posts.

Stanford and Cornell's Meme Tracker, which shows the spread and change of 'memes' (short-term behaviour patterns, such as phrases that spread through a community over the course of a day or week).

Morningside Analytics' linkage maps of the Persian and Arabic blogospheres.

The authors are impressed by these tools, but acknowledge that they have limitations. Three concerns crop up. First, the authors want to see tools that can be deployed in multilingual contexts. As a Meedani, this is music to my ears. But I also know how hard it is to do content analysis in fluid linguistic contexts. Arabic web use, for example, shows lots of code switching between different styles of the language, different alphabets, and even different languages (see Dina Basnaly's Twitter Stream as an example).
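To make the alphabet-switching point concrete, here is a minimal Python sketch that counts the letters in a piece of text by script. It is only an illustration, not a tool from the report: the function names are my own, and it assumes that the first word of a character's Unicode name (for example "ARABIC" or "LATIN") is a good enough proxy for its script. It says nothing about switches between styles of Arabic or between languages written in the same alphabet, which is the harder problem.

```python
import unicodedata
from collections import Counter


def script_counts(text: str) -> Counter:
    """Count alphabetic characters by rough script.

    Assumption: the first word of the Unicode character name
    (e.g. "ARABIC LETTER SEEN" -> "ARABIC") approximates the script.
    """
    counts = Counter()
    for ch in text:
        if not ch.isalpha():
            # Skip spaces, digits and punctuation.
            continue
        name = unicodedata.name(ch, "")
        script = name.split(" ")[0] if name else "UNKNOWN"
        counts[script] += 1
    return counts


def is_mixed_script(text: str) -> bool:
    """True if the text contains letters from more than one script."""
    return len(script_counts(text)) > 1


if __name__ == "__main__":
    sample = "salam سلام"  # the same greeting in Latin and Arabic letters
    print(dict(script_counts(sample)))
    print(is_mixed_script(sample))
```

Note that digits are skipped, so Arabic written in Latin letters with numerals standing in for some sounds would be counted as plain Latin text here.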

Second, they want to ensure that tools can respond to changing circumstances. This is all the more important given that the media environment can now change within a day (I am thinking of the example of Twitter during the 2009 Iran protests). Meme Tracker does well on this, but the other tools tend to take snapshots.

Third, the authors are concerned about the sources that these tools are able to work on. There is always going to be some labour involved in building a database of sources. Can any set of sources truly be 'comprehensive'?

I am surprised, though, that the report did not look at two really interesting projects, Swift and Journalisted, which have very powerful solutions to some of these problems.

At Meedan, we spent some time trying to tackle these issues back in 2009. We developed a structured data set, all open source, comprising hundreds of Middle East media sources on Freebase. The idea was that we could crowdsource data (such as publisher, location, and language) around all kinds of web-based media, from blogs to news sites to Twitter streams. For each source, you should be able to see some nutritional information that could help you distinguish between, for the sake of example, Al Manar and Annahar. To date, though, the database has not developed the community we need to maintain it.
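As a rough illustration of what that per-source 'nutritional information' might look like, here is a small Python sketch. It is not the Freebase schema we used: the class name, the helper function and the two example records are all hypothetical placeholders, and the only fields taken from the description above are publisher, location and language (plus the kind of medium).

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class MediaSource:
    """One crowdsourced record describing a web-based media source."""
    name: str
    kind: str        # e.g. "blog", "news site", "Twitter stream"
    publisher: str
    location: str
    language: str


def differing_fields(a: MediaSource, b: MediaSource) -> dict:
    """Return the fields on which two sources differ, with both values."""
    da, db = asdict(a), asdict(b)
    return {key: (da[key], db[key]) for key in da if da[key] != db[key]}


# Hypothetical placeholder records, not real data about any outlet.
source_a = MediaSource("Example Daily", "news site", "Publisher A", "City X", "Arabic")
source_b = MediaSource("Example TV", "news site", "Publisher B", "City X", "Arabic")

if __name__ == "__main__":
    for field, (left, right) in differing_fields(source_a, source_b).items():
        print(f"{field}: {left} / {right}")
```

A flat record like this is easy for volunteers to fill in and easy to compare side by side, which is the point of the label; keeping it accurate over time is the part that depends on having a community.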

To close my thoughts on this, a question: what could Meedan be doing today to help researchers and web users better understand the changing new media environment? What simple tools would help improve awareness of social and political trends in diverse communities around the world?