COSMOS One

COSMOS One was my first iteration: a test bed for working through ideas about user interface and visualization design, uses for Twitter data, and how that data could be mashed together with other sources.

The visualizations in COSMOS One are based on four analytics:

For more information on identifying demographics in Twitter data, please read:

The COSMOS One user interface is composed of four areas:

  • The data source control tabs
  • The filters panel
  • The stats panel
  • The visualization tabs

The COSMOS One user interface

The Data Source Control Tabs

The data source control tabs enable users to select either real time or archive search mode.

Real Time Mode

In real time mode, COSMOS One presents Twitter data from the one percent streaming API. The stats panel and many of the visualization tabs update continuously to reflect changes in the Twitter stream.

The real time tab provides two continually updated statistics:

  • The number of incoming tweets per second (typically ranging from 30 to 100)
  • The number of incoming tweets per second that match the filter settings
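
A per-second rate like the two statistics above can be kept with a sliding one-second window. The following is a minimal sketch, not the COSMOS One code: the class name TweetRateCounter is hypothetical, and it assumes timestamps arrive in non-decreasing order.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Counts the tweets seen during the most recent second (sliding window). */
final class TweetRateCounter {
    private static final long WINDOW_MILLIS = 1_000;
    private final Deque<Long> timestamps = new ArrayDeque<>();

    /** Records one tweet arriving at the given time in milliseconds. */
    void record(long nowMillis) {
        timestamps.addLast(nowMillis);
        evict(nowMillis);
    }

    /** Returns how many tweets arrived in the second ending at nowMillis. */
    int ratePerSecond(long nowMillis) {
        evict(nowMillis);
        return timestamps.size();
    }

    /** Drops timestamps that have fallen out of the one-second window. */
    private void evict(long nowMillis) {
        while (!timestamps.isEmpty()
                && timestamps.peekFirst() <= nowMillis - WINDOW_MILLIS) {
            timestamps.removeFirst();
        }
    }
}
```

Two instances would be enough for the real time tab: one that records every incoming tweet and one that records only the tweets passing the active filters.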

The COSMOS One Real Time tab

Archive Search Mode

In archive search mode, COSMOS One enables users to search several collections of Twitter data.

Archive searches are entered through a natural language query input that arranges the query clauses into an English sentence.

The COSMOS One Archive Search tab

The Filters Panel

The filters panel contains a set of individually controllable filters; checked filters are active, unchecked filters are inactive. Each active filter adds a clause to the query that retrieves the results. As filters are checked or unchecked and filter parameters are updated, an explanation of the query is built dynamically in plain English.
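
The idea of turning the active filters into an English sentence can be sketched as follows. This is an illustration under my own assumptions rather than the COSMOS One implementation: FilterExplainer, Filter and the example phrases are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class FilterExplainer {
    /** One filter row: its checkbox state plus the English phrase for its clause. */
    static final class Filter {
        final String phrase;
        final boolean active;

        Filter(String phrase, boolean active) {
            this.phrase = phrase;
            this.active = active;
        }
    }

    /** Joins the phrases of the active filters into a single English sentence. */
    static String explain(List<Filter> filters) {
        List<String> phrases = new ArrayList<>();
        for (Filter filter : filters) {
            if (filter.active) {
                phrases.add(filter.phrase); // inactive filters add no clause
            }
        }
        if (phrases.isEmpty()) {
            return "Show all tweets.";
        }
        return "Show tweets " + String.join(" and ", phrases) + ".";
    }
}
```

Calling explain again whenever a checkbox or parameter changes keeps the sentence in step with the query.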

The COSMOS One Filters panel

The Stats Panel

The stats panel contains three tables that provide users with an overview of the data in three categories:

  • The gender breakdown
  • The language breakdown
  • The occurrence of hashtags

The COSMOS One Stats panel

The Visualization Tabs

COSMOS One provides seven tabs that visualize Twitter data in a variety of styles:

  • Gender Profanity
  • Sentiment and Tension
  • Gender Geotagging
  • Frequency Analysis
  • Network Analysis
  • Census
  • Clustering

The Gender Profanity Tab

The Gender Profanity tab presents a tabular view of the Twitter data. Each row provides the following information for one tweet:

  • The text
  • The gender of the author
  • The sentiment of the text

Gender is presented with a color-coded background:

  • Blue - male
  • Pink - female
  • Green - unisex
  • White - unknown

Sentiment is presented as a double-ended bar chart centred on zero: the red negative sentiment bar grows to the left, and the blue positive sentiment bar grows to the right. The chart is a custom-built Java component implemented as a Swing JTable cell renderer.
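
A double-ended bar of this kind could be drawn with a renderer along the following lines. This is a minimal sketch, not the original component: the class names and the assumed score range of -5 to +5 are mine.

```java
import java.awt.Color;
import java.awt.Component;
import java.awt.Graphics;
import javax.swing.JComponent;
import javax.swing.JTable;
import javax.swing.table.TableCellRenderer;

/** Pure geometry, kept apart from Swing so it is easy to check. */
final class SentimentBarGeometry {
    static final double MAX_MAGNITUDE = 5.0; // assumed score range: -5 to +5

    /** Bar length in pixels; a full-strength score fills half the cell. */
    static int barLength(double score, int cellWidth) {
        double clamped = Math.max(-MAX_MAGNITUDE, Math.min(MAX_MAGNITUDE, score));
        return (int) Math.round(Math.abs(clamped) / MAX_MAGNITUDE * (cellWidth / 2.0));
    }
}

/** Draws negative scores as a red bar to the left of centre, positive as blue to the right. */
final class SentimentBarRenderer extends JComponent implements TableCellRenderer {
    private double score;

    @Override
    public Component getTableCellRendererComponent(JTable table, Object value,
            boolean isSelected, boolean hasFocus, int row, int column) {
        score = (value instanceof Number) ? ((Number) value).doubleValue() : 0.0;
        return this;
    }

    @Override
    protected void paintComponent(Graphics g) {
        int centre = getWidth() / 2;
        int length = SentimentBarGeometry.barLength(score, getWidth());
        int height = Math.max(0, getHeight() - 4);
        if (score < 0) {
            g.setColor(Color.RED);
            g.fillRect(centre - length, 2, length, height);
        } else {
            g.setColor(Color.BLUE);
            g.fillRect(centre, 2, length, height);
        }
        g.setColor(Color.GRAY);
        g.drawLine(centre, 0, centre, getHeight()); // zero axis
    }
}
```

The renderer would be attached to the sentiment column with `table.getColumnModel().getColumn(index).setCellRenderer(new SentimentBarRenderer())`, where `index` is whichever column holds the score.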

Rows with white text on a red background highlight profanity within the tweet text. Profanity is identified by matching the words in the tweet text with words in a profanity dictionary.
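
Dictionary matching of this kind can be sketched in a few lines. The tokenization rule and the class name ProfanityMatcher are assumptions for illustration, and the dictionary is expected to hold lower-case words.

```java
import java.util.Locale;
import java.util.Set;

final class ProfanityMatcher {
    private final Set<String> dictionary; // lower-case words

    ProfanityMatcher(Set<String> dictionary) {
        this.dictionary = dictionary;
    }

    /** Returns true if any whole word of the tweet appears in the dictionary. */
    boolean containsProfanity(String tweetText) {
        // Split on anything that is not a letter, a digit or an apostrophe.
        String[] tokens = tweetText.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}']+");
        for (String token : tokens) {
            if (dictionary.contains(token)) {
                return true;
            }
        }
        return false;
    }
}
```

Matching whole tokens rather than substrings avoids flagging innocent words that merely contain a dictionary entry.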

The COSMOS One Gender Profanity tab

The Sentiment and Tension Tab

The Sentiment and Tension tab presents the sentiment and tension of the tweet text as line charts over time.

The first line chart presents the sentiment and tension over time aggregated over all genders. The second and third line charts present the sentiment and tension for male and female authors, respectively.

The data used to build the three sentiment and tension line charts is exportable in CSV format. Exporting the data enables further manipulation and visualization of the data in other applications such as spreadsheets.
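
As an illustration of what such an export might look like, here is a small sketch that writes one row per time point. The column names and the class name ChartCsvExporter are assumptions; the actual COSMOS One file layout is not documented here.

```java
final class ChartCsvExporter {
    /** Builds CSV text with a header row and one line per time point. */
    static String toCsv(String[] times, double[] sentiment, double[] tension) {
        StringBuilder csv = new StringBuilder("time,sentiment,tension\n");
        for (int i = 0; i < times.length; i++) {
            csv.append(quote(times[i])).append(',')
               .append(sentiment[i]).append(',')
               .append(tension[i]).append('\n');
        }
        return csv.toString();
    }

    /** Quotes a field only when it contains a comma, a quote or a line break. */
    static String quote(String field) {
        if (field.contains(",") || field.contains("\"") || field.contains("\n")) {
            return "\"" + field.replace("\"", "\"\"") + "\"";
        }
        return field;
    }
}
```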

The COSMOS One Sentiment and Tension tab

The Gender Geotagging Tab

The Gender Geotagging tab presents geolocated tweets on a map as color-coded circular markers:

  • Blue - male
  • Pink - female
  • Green - unisex
  • White - unknown

The map is provided by OpenStreetMap and is implemented using the JMapViewer Swing component. Both map and terrain views are user selectable.

The COSMOS One Gender Geotagging tab

The Frequency Analysis Tab

The Frequency Analysis tab provides frequency views at three levels of temporal granularity:

  • The day frequency chart shows the number of tweets authored per day
  • The hour frequency chart shows the number of tweets authored per hour
  • The minute frequency chart shows the number of tweets authored per minute
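
The three charts amount to the same count taken at different bucket sizes. A minimal sketch, assuming tweet creation times are available as java.time.Instant values and that buckets are aligned to UTC (the class name FrequencyBins is hypothetical):

```java
import java.time.Instant;
import java.time.temporal.ChronoUnit;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class FrequencyBins {
    /**
     * Counts tweets per bucket. Pass ChronoUnit.DAYS, HOURS or MINUTES
     * to get the day, hour or minute frequency respectively.
     */
    static Map<Instant, Integer> count(List<Instant> createdAt, ChronoUnit unit) {
        Map<Instant, Integer> bins = new TreeMap<>(); // sorted by bucket start
        for (Instant time : createdAt) {
            bins.merge(time.truncatedTo(unit), 1, Integer::sum);
        }
        return bins;
    }
}
```

Each key is the start of a bucket, so the map can be fed directly to a bar chart with one bar per entry.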

Moving the mouse over the bars lets users scrub through the data and see the exact day, hour or minute represented by the current bar. As users scrub over the bars, the tweets represented by each bar are displayed in the table to the right. This table is similar to the one used by the Gender Profanity tab, except that it has no gender column; gender is instead represented by color-coding the background of each row.

The day, hour and minute frequency charts are connected by the double-ended sliders below each chart. The green slider below the day frequency chart selects a day range that controls the data displayed in the hour frequency chart: the hour frequency chart displays the number of tweets authored per hour within the selected day range. Similarly, the yellow slider below the hour frequency chart selects an hour range that controls the data displayed in the minute frequency chart: the minute frequency chart displays the number of tweets authored per minute within the selected hour range.

The data used to build the three bar charts is exportable in CSV format. Exporting the data enables further manipulation and visualization of the data in other applications such as spreadsheets.

The COSMOS One Frequency Analysis tab

The Network Analysis Tab

The Network Analysis tab provides a social network graph based on retweets or mentions.

The network graph is built automatically using a force-directed layout algorithm, without requiring users to input configuration parameters or other settings. This approach to user interface design enables users to work with the network immediately, in contrast to tools such as Gephi that require considerable knowledge and user input to produce a network graph.
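
To give a feel for how a force-directed layout positions nodes without any user input, here is a basic sketch in the style of the Fruchterman-Reingold algorithm. It is not the algorithm used in COSMOS One, which is not specified here; the class name, the 100 by 100 layout area and the cooling schedule are all assumptions.

```java
import java.util.Random;

final class ForceLayout {
    /** edges[i] = {source, target}; returns positions as [node][0 = x, 1 = y]. */
    static double[][] layout(int nodeCount, int[][] edges, int iterations, long seed) {
        double size = 100.0;                                  // side of the layout square
        double k = size / Math.sqrt(Math.max(1, nodeCount));  // ideal edge length
        Random random = new Random(seed);
        double[][] pos = new double[nodeCount][2];
        for (double[] p : pos) {
            p[0] = random.nextDouble() * size;
            p[1] = random.nextDouble() * size;
        }
        double temperature = size / 10.0; // caps how far a node moves per step
        for (int it = 0; it < iterations; it++) {
            double[][] disp = new double[nodeCount][2];
            // Every pair of nodes repels.
            for (int a = 0; a < nodeCount; a++) {
                for (int b = a + 1; b < nodeCount; b++) {
                    double dx = pos[a][0] - pos[b][0], dy = pos[a][1] - pos[b][1];
                    double dist = Math.max(0.01, Math.hypot(dx, dy));
                    double force = k * k / dist;
                    disp[a][0] += dx / dist * force; disp[a][1] += dy / dist * force;
                    disp[b][0] -= dx / dist * force; disp[b][1] -= dy / dist * force;
                }
            }
            // Connected nodes attract.
            for (int[] e : edges) {
                double dx = pos[e[0]][0] - pos[e[1]][0], dy = pos[e[0]][1] - pos[e[1]][1];
                double dist = Math.max(0.01, Math.hypot(dx, dy));
                double force = dist * dist / k;
                disp[e[0]][0] -= dx / dist * force; disp[e[0]][1] -= dy / dist * force;
                disp[e[1]][0] += dx / dist * force; disp[e[1]][1] += dy / dist * force;
            }
            // Move each node by at most the current temperature, then cool down.
            for (int n = 0; n < nodeCount; n++) {
                double len = Math.max(0.01, Math.hypot(disp[n][0], disp[n][1]));
                double step = Math.min(len, temperature);
                pos[n][0] += disp[n][0] / len * step;
                pos[n][1] += disp[n][1] / len * step;
            }
            temperature *= 0.95;
        }
        return pos;
    }
}
```

All the tuning values are derived from the node count, which is the property that lets a tool produce a usable graph without asking the user for settings.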

The size of the nodes represents one of three user-selectable network metrics:

Nodes are filtered with a slider that removes nodes with a metric value greater than the slider value. Nodes that have been filtered out and the edges that connect them are displayed in two user-selectable styles:

  • Dimmed - filtered-out nodes and edges are drawn semi-transparently to maintain their context in the network
  • Hidden - filtered-out nodes and edges are not drawn, to provide a clean view of the remaining nodes and edges
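
The two display styles can be expressed as an opacity chosen per node and edge. This is a sketch under my own assumptions: the class name NodeFilter and the 0.2 opacity used for dimmed elements are illustrative values, not taken from COSMOS One.

```java
final class NodeFilter {
    enum Style { DIMMED, HIDDEN }

    /** Opacity for a node: 1 = fully drawn, 0 = not drawn. */
    static float nodeAlpha(double metric, double sliderValue, Style style) {
        boolean filteredOut = metric > sliderValue; // above the slider: filtered out
        if (!filteredOut) {
            return 1.0f;
        }
        return style == Style.DIMMED ? 0.2f : 0.0f;
    }

    /** An edge is only as visible as its least visible endpoint. */
    static float edgeAlpha(float sourceAlpha, float targetAlpha) {
        return Math.min(sourceAlpha, targetAlpha);
    }
}
```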

Node labels are filtered with a slider that hides the labels of nodes with a metric value greater than the slider value.

The data used to build the social network graph is exportable in GEXF, GraphML and JSON format. Exporting the network graph data enables further manipulation and visualization in applications such as Gephi or libraries such as D3.

The COSMOS One Network Analysis tab

The Census Tab

The Census tab presents ideas for mashing together naturally occurring social media data with the curated data of the 2001 UK census (the results of the 2011 census were not available at the time).

A choropleth map of the London boroughs visualizes the level of unemployment in each borough with a shade of green; the lighter the shade, the higher the level of unemployment. Mousing over each borough highlights the border of the borough in red and updates the ethnicity bar chart to the right of the choropleth.

The ethnicity bar chart shows the number of inhabitants of the borough broken down by the ethnicity categories used by the UK census; the longer the bar, the greater the number of inhabitants with that ethnicity.

Unemployment was a useful statistic because it provides a numeric range of data suitable for calculating the color range of the choropleth. Similarly, ethnicity was a useful statistic because it provides a numeric range for calculating the lengths of the bars over a fixed number of categories.
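
Calculating a color from a numeric range can be sketched as a linear interpolation between a dark and a light green, with lighter shades for higher values as described above. The specific RGB endpoints and the class name ChoroplethShade are assumptions chosen for illustration.

```java
final class ChoroplethShade {
    /**
     * Maps a value within [min, max] to a packed 0xRRGGBB shade of green.
     * The minimum maps to dark green (0, 100, 0) and the maximum to
     * light green (200, 255, 200).
     */
    static int rgbFor(double value, double min, double max) {
        double t = (max > min) ? (value - min) / (max - min) : 0.0;
        t = Math.max(0.0, Math.min(1.0, t)); // clamp values outside the range
        int redAndBlue = (int) Math.round(t * 200);
        int green = 100 + (int) Math.round(t * 155);
        return (redAndBlue << 16) | (green << 8) | redAndBlue;
    }
}
```

In Swing the packed value can be turned into a paintable color with `new java.awt.Color(rgb)`. The same normalization step, applied to bar lengths instead of colors, covers the ethnicity bar chart.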

Clicking a point on the boroughs choropleth displays the crime statistics for the district containing the point. The crime statistics are retrieved from the UK Police API and broken down by the eight major crime types used by the Metropolitan police force.

The COSMOS One Census tab

The Clustering Tab

The Clustering tab enables COSMOS One users to log in to a remote clustering service running on a Jenkins server. The Jenkins web application is rendered in a JavaFX WebView component.

The COSMOS One Clustering tab