Assignment 5

Goals

Interaction and Linked Views

Instructions

In this assignment, we will be working with interactions and linked views. Visualizations may be created using Observable Plot or D3. You may use other libraries (e.g. lodash.js or jQuery) for non-visualization tasks, but you must credit them in the HTML file you turn in. Observable Plot has documentation and examples. For D3, there is extensive documentation available as well as examples, and Vadim Ogievetsky’s example-based introduction that we went through in class is also a useful reference. Our in-class example showing linked highlighting will also be useful.

Due Date

The assignment is due at 11:59pm on Friday, December 6.

Submission

You should submit any files required for this assignment on Blackboard. If you use Observable, submit the .tar.gz or .tgz file that is generated from the export menu and rename it to a5.tar.gz or a5.tgz. If you create your own files, please make sure the filename of the main HTML document is a5.html. Any other files should be linked from the main HTML document using relative paths. Blackboard may complain about individual files; if so, please zip the files and submit the zip file instead.

Details

In this assignment, we will examine articles about musical artists on Wikipedia, specifically those classified as best-selling. Because Wikipedia encourages links between entities mentioned in an article, there are links from one artist’s page to others, so we can build a network between artists in which two artists are connected if one is mentioned on the other’s page. In addition, we can examine page views of each artist over time. We will use an adjacency matrix visualization coupled with a multiple line graph to examine relationships between artists and Wikipedia readers’ interest over time.

Data

The data is a subset of the best-selling artists list for those artists whose first charted record was between 1980 and 1985 or after 2004.

The artist links dataset is a two-attribute CSV file where an entry indicates that the second artist is mentioned on the first artist’s page; each attribute is an artist’s name. The monthly page views dataset has three attributes:

  • artist: the artist’s name
  • views: the number of page views
  • timestamp: the month the page views were recorded, formatted as a “YYYYMMDD” string

0. Info

Make sure to include the following information in your notebook or main html file:

  • Your name
  • Your student id
  • The course title (“Data Visualization (CSCI 627/490)”), and
  • The assignment title (“Assignment 5”)

If you used any additional JavaScript libraries or code references, please append a note to the text above indicating how you used them (e.g. “I used the Lodash library to partition an array.”). Include links to the projects used. You do not need to adhere to any particular style for this text, but I would suggest using headings to separate the sections of the assignment.

1. Line Plot (10 pts)

Create a line plot for each artist showing their page views over the past twelve months. All of the plots should be superimposed on a single set of axes. Observable Plot’s Line Mark should make this straightforward. Use a logarithmic scale for the number of views. Do not vary the stroke colors. If you wish, you may also use D3 for this, but it will take more work.
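As a rough sketch of the description above, the plot could be assembled along these lines. The function name buildLinePlot and the date field are hypothetical: the sketch assumes each row already carries a parsed Date in date and a numeric views value. The Plot module is passed in as a parameter so the snippet does not depend on how Plot is loaded in your page or notebook.

```javascript
// Sketch only: superimpose one line per artist on shared axes.
// `Plot` is the Observable Plot module; `data` is an array of
// {artist, views, date} rows (the `date` field name is an assumption).
function buildLinePlot(Plot, data) {
  return Plot.plot({
    // Logarithmic scale for the number of views.
    y: {type: "log", label: "Monthly page views"},
    marks: [
      // z groups rows into one line per artist; no stroke channel is set,
      // so every line keeps the same default color.
      Plot.line(data, {x: "date", y: "views", z: "artist"})
    ]
  });
}
```

In a notebook this might be called as buildLinePlot(Plot, pageViews), where pageViews is whatever name you gave the converted dataset.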

Hints:
  • You will want to convert types for views and timestamps, but note that the timestamp is a YYYYMMDD string. Use substring to convert this to either a “YYYY-MM-DD” string or the individual date components.
  • Use the z attribute.
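
The type conversion in the first hint might look like the following sketch. The function name parseViewsRow and the output field date are hypothetical, and the sample values are made up for illustration; only the attribute names artist, views, and timestamp come from the data description.

```javascript
// Convert one raw CSV row (all values are strings) into typed values.
// Assumes `timestamp` is an 8-character "YYYYMMDD" string.
function parseViewsRow(row) {
  const ts = row.timestamp;
  // Rebuild the timestamp as "YYYY-MM-DD" using substring.
  const iso = `${ts.substring(0, 4)}-${ts.substring(4, 6)}-${ts.substring(6, 8)}`;
  return {
    artist: row.artist,
    views: Number(row.views),
    date: new Date(iso) // a date-only ISO string is parsed as midnight UTC
  };
}

// Made-up sample row, not taken from the real dataset.
const sample = parseViewsRow({
  artist: "Taylor Swift",
  views: "1234",
  timestamp: "20240101"
});
```

Applying this to every row (for example with Array.prototype.map) yields data whose views are numbers and whose dates can be placed on a time axis.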

2. Adjacency Matrix (20 pts)

Create an adjacency matrix for the artists where each row and column is an artist and the matrix cell is filled when the artist represented by the column is mentioned on the Wikipedia page of the artist represented by the row. You can use Observable Plot’s Cell Mark for this. Order the rows and columns by the number of other artists mentioned on an artist’s page. If you wish, you may also use D3 for this, but it will take more work.

Hints:
  • Rotate the column labels (x-axis) using tickRotate
  • Make sure the labels are visible using the margin[Left|Right|Top|Bottom] attributes
  • You can set the domain for each axis to the sorted order from the artists
  • Consider using d3.rollup to count the number of links an artist has.
  • Remember that an array sort requires a comparison function
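
The counting and sorting hints could be combined as in this sketch. The column names source and target are assumptions, since the real CSV's attribute names may differ, and the three rows are placeholders. The counting is written in plain JavaScript so the snippet runs on its own; d3.rollup(links, v => v.length, d => d.source) produces the same counts for every artist that appears in the first column.

```javascript
// Placeholder rows; the real file's column names may not be source/target.
const links = [
  {source: "A", target: "B"},
  {source: "A", target: "C"},
  {source: "B", target: "A"}
];

// Count how many other artists each artist's page mentions.
function countMentions(rows) {
  const counts = new Map();
  for (const {source, target} of rows) {
    counts.set(source, (counts.get(source) ?? 0) + 1);
    // Keep artists that mention no one, so they still get a row and column.
    if (!counts.has(target)) counts.set(target, 0);
  }
  return counts;
}

// Array sort needs a comparison function: descending by count,
// with ties broken alphabetically so the order is stable across runs.
function orderArtists(counts) {
  return [...counts.keys()].sort(
    (a, b) => counts.get(b) - counts.get(a) || a.localeCompare(b)
  );
}

const domain = orderArtists(countMentions(links));
```

The resulting array can then be supplied as the domain of both the x and y scales so that rows and columns share the same order.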

3. Linked Highlighting (30 pts)

Combine the two visualizations from Parts 1 and 2 in a single view. Assuming you have assigned the plots to variables adjMatrix and linePlot, you can do this with a flex layout on an outer div (remember to use a style rule here) and code that looks something like:

combined = html`<div id="all">
  ${adjMatrix}
  ${linePlot}
</div>`

In order to investigate whether any links between artists correlate with their page views, we wish to make it possible to highlight the page views for the two artists that are linked in the adjacency matrix. Thus, if the pointer is over a particular cell, the lines for the page views of the two artists that intersect at that cell should be highlighted. Unfortunately, Observable Plot, unlike D3, does not set the data or attributes on cells, so we need to find a way to embed the information needed to link these views. The title property offers a possibility. We can create a title for each cell that indicates the link and can also be used to locate the corresponding lines in the line plot (e.g. “Celine Dion -- Taylor Swift”). Of course, we must also create a title for each line (path) in the line plot with the artist’s name.

Then, we can use a similar approach to the one from class to highlight the current cell and the corresponding lines. D3’s .on(<event>, <callback>) functions will help handle pointerover and pointerout events to implement this functionality. In the handlers, we can obtain the title by selecting the title element from the currentTarget and accessing its text(), but then we need to split it on the delimiter we used in the cell title (e.g. " -- "). Then, we need to filter the lines to highlight only the two named in the cell’s title. Here, we need to compare each of the artists to the title of the path elements in the line plot. D3 supports filtering elements using <selection>.filter, but note that to access the current element as this inside the filter, you cannot use arrow functions; you must define the function using the function keyword. Make sure to highlight both the currently selected cell and the two corresponding lines.
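
The title convention described above can be sketched with a few small helpers. All names here (DELIMITER, cellTitle, artistsFromTitle, lineMatchesCell) are hypothetical, and " -- " is just the example delimiter; any string that cannot occur inside an artist's name would do.

```javascript
// Example delimiter; it must never appear inside an artist's name.
const DELIMITER = " -- ";

// Title for a matrix cell: row artist, delimiter, column artist.
function cellTitle(rowArtist, columnArtist) {
  return `${rowArtist}${DELIMITER}${columnArtist}`;
}

// Recover the two artist names from a cell title.
function artistsFromTitle(title) {
  return title.split(DELIMITER);
}

// True when a line's title (an artist name) is one of the two artists
// named in the cell title.
function lineMatchesCell(lineTitle, cellTitleText) {
  return artistsFromTitle(cellTitleText).includes(lineTitle);
}
```

A function like cellTitle can be passed as the title option of the cell mark, and lineMatchesCell expresses the comparison the filter needs to make for each path.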

Finally, enable linked highlighting in the other direction: when the pointer is over a line, highlight the entire row and column of the adjacency matrix for that artist.

Hints:
  • When you get one direction working, the other direction will be similar.
  • In a pointerover or pointerout event handler, you can get the selected node as event.currentTarget and can treat it as a D3 selection using d3.select(event.currentTarget).
  • In a filter function defined in a function() { ... } style, we can do d3.select(this) to access the element that is the subject of filtering. This allows d3.select(this).select("title").text(), for example.
  • Be careful, however, as some elements may not have a title, and this access will cause an error for them. You can test whether a selection is empty using the selection’s empty() method.
  • You can do selections inside the event handlers to select elements from the other views.
  • Remember to remove highlights from lines and cells other than the selected one.
  • If you use a CSS rule for highlighting, look at D3’s classed function for adding and removing a class.
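
Several of these hints fit together as in the sketch below, which covers only the cell-to-line direction. It makes a number of assumptions: that cells are rendered as rect elements and lines as path elements, that a CSS rule exists for a class named highlight, and that the delimiter is " -- ". The names titleText and linkCellsToLines are hypothetical, and d3 and the container element are passed in as parameters rather than read from globals.

```javascript
// Return the text of an element's <title> child, or null if it has none.
// `selection` is expected to be a D3 selection of a single element.
function titleText(selection) {
  const title = selection.select("title");
  return title.empty() ? null : title.text();
}

// Sketch: highlight a hovered cell and the two lines named in its title.
function linkCellsToLines(d3, root, delimiter = " -- ") {
  const container = d3.select(root);
  const cells = container.selectAll("rect"); // assumes cells are <rect>
  const lines = container.selectAll("path"); // assumes lines are <path>

  cells
    .on("pointerover", (event) => {
      const cell = d3.select(event.currentTarget);
      const text = titleText(cell);
      if (text === null) return; // e.g. a rect without a title
      const names = text.split(delimiter);
      cell.classed("highlight", true);
      lines
        // function keyword so that `this` is the <path> being tested
        .filter(function () {
          return names.includes(titleText(d3.select(this)));
        })
        .classed("highlight", true);
    })
    .on("pointerout", () => {
      // Remove highlights from every cell and line.
      cells.classed("highlight", false);
      lines.classed("highlight", false);
    });
}
```

Paths without a title (axis ticks, for instance) yield null from titleText and are therefore never matched.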

4. [627 Only] Filtering (25 pts)

In order to better investigate trends between more than two artists, we can implement filtering by allowing users to select/deselect artists by clicking the adjacency matrix row labels. When a user selects a particular artist, the label’s text color (fill) should change to a previously unused color (use a meaningful color scheme), and the corresponding line should be highlighted with the same color. When a user deselects that artist, the label’s color and corresponding line should revert to the default (empty or "currentColor"). Note that you should not allow the user to select too many (more than 10) artists as it will be difficult to differentiate the colors.

Hints:
  • Locating the tick labels in Plot is a bit tricky and requires an attribute selector. Specifically, Plot uses the aria-label attribute with value "y-axis tick label" for the y-axis labels.
  • Instead of pointer events, use the click event
  • It is probably easier to use the fill attribute directly instead of CSS
  • Remember to check if the clicked text has a fill set or not in order to determine whether to unset the fill and stroke or to assign and set a color
  • You can filter all text labels that do not have a fill color set
  • You can iterate through a selection, but remember that you will need to use d3.select(elt) to wrap the node for D3 calls like .attr('fill', ...)
  • A d3 color scheme can be indexed as an array so d3.<scheme>[i] evaluates to a color string that can be used for a fill.
  • Use .raise() to make sure the most recently selected artist’s line is visible (and not lost behind others)
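
The bookkeeping behind selecting and deselecting artists can be kept separate from the DOM updates, as in this sketch. The function toggleArtist and the Map of selections are hypothetical, and the palette is a stand-in for a categorical D3 scheme such as d3.schemeCategory10; its length enforces the limit of 10 selected artists.

```javascript
// Ten colors standing in for a categorical scheme such as d3.schemeCategory10.
const palette = [
  "#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd",
  "#8c564b", "#e377c2", "#7f7f7f", "#bcbd22", "#17becf"
];

// `selected` maps artist name -> assigned color.
// Returns the artist and the color to apply; color is null when the artist
// was deselected or when every color is already in use.
function toggleArtist(selected, artist, colors = palette) {
  if (selected.has(artist)) {
    selected.delete(artist); // deselect and free the color
    return {artist, color: null};
  }
  const inUse = new Set(selected.values());
  const color = colors.find((c) => !inUse.has(c)); // first unused color
  if (color === undefined) return {artist, color: null}; // limit reached
  selected.set(artist, color);
  return {artist, color};
}
```

In a click handler, the returned color could be written to the label's fill and the matching line's stroke, with null meaning the attribute should be reset to its default.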

Extra Credit

  • CSCI 490 students may do Part 4 for extra credit
  • When selecting lines, create an option to hide unselected lines (Parts 3 and 4)
  • Color the column text corresponding to the selected row label in Part 4.
  • Color the cells of the selected artists from Part 4. Where a row and column meet, design a way to show both colors.
  • Create methods to allow linked highlighting and filtering to work better together. In other words, don’t change the stroke color with linked highlighting but find another way to show the selection