Interaction and Linked Views
In this assignment, we will be working with interactions and linked views. Visualizations may be created using Observable Plot or D3. You may use other libraries (e.g. lodash.js or jQuery) for non-visualization tasks, but you must credit them in the HTML file you turn in. Observable Plot has documentation and examples. For D3, there is extensive documentation available as well as examples, and Vadim Ogievetsky’s example-based introduction that we went through in class is also a useful reference. Our in-class example showing linked highlighting will also be useful.
The assignment is due at 11:59pm on Wednesday, April 23.
You should submit any files required for this assignment on Blackboard. If you use Observable, submit the .tar.gz or .tgz file that is generated from the export menu and rename it to a5.tar.gz or a5.tgz. If you create your own files, please make sure the filename of the main HTML document is a5.html. Any other files should be linked from the main HTML document using relative paths. Blackboard may complain about individual files; if so, please zip the files and submit the zip file instead.
In this assignment, we will examine data from the Citi Bike System in New York City. Each bike rental is logged with its start station, end station, and trip duration. We will look at a subset of these trips to find patterns between stations across the city, aggregating the individual trips into routes. We will use both a directed node-link diagram and an adjacency matrix visualization. We will use filtering and linked highlighting to help deal with the large amount of data.
The trip data contains information about an individual bike rental, including:
start_station_id: the start station’s identifier
end_station_id: the end station’s identifier
duration: the amount of time the trip took
started_at: the start timestamp
ended_at: the end timestamp
The stations data contains information about each location, including:
station_id: station identifier
name: station name
lat: station latitude
lon: station longitude
district: the identifier of the NYC community district where the station is located
The community district boundaries are stored in a GeoJSON file where each feature is a district and has the following properties:
district: the district identifier
centroid_lat: the latitude of the centroid of the district
centroid_lon: the longitude of the centroid of the district
Make sure to include the following information in your notebook or main html file:
If you used any additional JavaScript libraries or code references, please append a note to this section indicating their usage to the text above (e.g. “I used the Lodash library to partition an array.”) Include links to the projects used. You do not need to adhere to any particular style for this text, but I would suggest using headings to separate the sections of the assignment.
I have created an Observable notebook (you need to be in the NIU Team) that you can fork to begin. (If you wish to use raw HTML/JS instead of a notebook, you can copy the Plot code from the notebook.) This notebook contains some data processing as well as two visualizations. The data processing loads the data, creates a lookup for stations, and aggregates the trips into objects that correspond to the arrows, each having start and end station ids as well as the count of the number of trips.
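The aggregation step described above might look roughly like the following. This is only a sketch, not the notebook's actual code: the function name aggregateTrips and the sample records are hypothetical, and it assumes the trip fields listed earlier plus the exclusion of trips that start and end at the same station.

```javascript
// Hypothetical helper (not taken from the notebook): collapse individual
// trips into one object per directed route, shaped like {start, end, count}.
function aggregateTrips(trips) {
  const routes = new Map();
  for (const trip of trips) {
    const start = trip.start_station_id;
    const end = trip.end_station_id;
    if (start === end) continue; // assumption: same-station trips are dropped
    const key = JSON.stringify([start, end]); // direction matters: A->B differs from B->A
    const route = routes.get(key);
    if (route) route.count += 1;
    else routes.set(key, { start, end, count: 1 });
  }
  return Array.from(routes.values());
}

// Made-up sample records, for illustration only.
const sampleTrips = [
  { start_station_id: "A", end_station_id: "B" },
  { start_station_id: "A", end_station_id: "B" },
  { start_station_id: "B", end_station_id: "A" },
  { start_station_id: "C", end_station_id: "C" },
];
console.log(aggregateTrips(sampleTrips));
```

Keying on the ordered pair keeps the two directions of a route separate, which is what a directed node-link diagram needs.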
The first visualization is a map showing the community districts with a
directed node-link diagram that encodes the number of trips between any
pair of stations using stroke width. The second visualization is an
adjacency matrix that encodes if there are any trips between any pair of
stations.
Combine these two visualizations into a single visualization with the two views juxtaposed horizontally. Consult the in-class notebook from lecture if you are not sure how to do this.
Update the adjacency matrix to encode the count of trips between the start and end station using an appropriate color scale. Add a legend.
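One possible way to set up the color encoding in Observable Plot is to pass a color scale configuration to the plot. The sketch below only builds that options object; the helper name matrixColorOptions, the square-root scale, and the "blues" scheme are illustrative choices, not requirements, and the x/y channel assignment in the comment is an assumption about how the matrix is laid out.

```javascript
// Hypothetical helper: build Plot color-scale options for the matrix cells.
// Assumes each item in tripCounts has a numeric `count` field.
function matrixColorOptions(tripCounts) {
  const maxCount = tripCounts.reduce((m, d) => Math.max(m, d.count), 1);
  return {
    type: "sqrt",          // softens the effect of a few very busy routes
    scheme: "blues",       // sequential scheme, suited to counts
    domain: [0, maxCount],
    legend: true,          // asks Plot to render a color legend
    label: "Number of trips",
  };
}

// Possible use inside a notebook cell (Plot is not loaded in this sketch):
// Plot.plot({
//   color: matrixColorOptions(tripCounts),
//   marks: [Plot.cell(tripCounts, { x: "end", y: "start", fill: "count" })],
// });

console.log(matrixColorOptions([{ count: 3 }, { count: 12 }]));
```

A sequential scale fits here because the count is an ordered quantity; whether a linear, square-root, or log scale reads best depends on how skewed the counts are.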
You may have noticed that it can be rather difficult to see the individual trips in the node-link diagram due to the number of arrows. Add a slider to set a threshold for the number of trips an arrow must have to be shown. Filter out any arrows that have fewer than that number of trips. You can use Observable’s Inputs to create a slider. If you are using Observable, you can rely on reactive execution to update the visualization (referencing the threshold value in the visualization). Otherwise, you will need to update the display of the arrows for those marks with counts below the threshold. For Part 3, it will work better to find a way to hide the arrow rather than remove it from the visualization!
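As a rough illustration of hiding rather than removing, the decision for each arrow can be isolated in a small function and then applied to the arrow elements. The name arrowVisibility and the variable names in the comment are hypothetical, and the D3 line assumes the trip data has already been bound to the arrow paths.

```javascript
// Hypothetical helper: decide whether an arrow stays visible for a given
// threshold. Arrows with fewer trips than the threshold are hidden.
function arrowVisibility(d, threshold) {
  return d.count >= threshold ? "visible" : "hidden";
}

// Possible use with D3, assuming `map` is the rendered plot element and each
// arrow path carries its {start, end, count} datum:
// d3.select(map)
//   .selectAll(".arrows path")
//   .attr("visibility", (d) => arrowVisibility(d, threshold));

const demoArrows = [{ count: 1 }, { count: 5 }, { count: 9 }];
console.log(demoArrows.map((d) => arrowVisibility(d, 5)));
```

Because the hidden paths remain in the document, their geometry is still available later, for example when a filtered-out arrow needs to be highlighted.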
It can be difficult to match the station names in the adjacency matrix with their locations on the map. We will use linked highlighting from the matrix to the node-link diagram. If the pointer is over a particular cell, both that cell and the corresponding stations and arrow in the node-link diagram should be highlighted.
Add event handlers to the cells to highlight the currently selected cell. Think about a good way to do this that doesn’t interfere with the existing encoding (remember we added a fill color in Part 1b).
In a pointerover or pointerout event handler, you can get the selected node as event.currentTarget and can treat it as a D3 selection using d3.select(event.currentTarget).
Now, use the information from the currently selected cell to highlight the start and end station for this edge in the node-link diagram. Unfortunately, Observable Plot, unlike D3, does not set the data or attributes on cells or dots. However, we have used the z attribute to order the elements according to the order in tripCounts. Thus, we can bind the data to the marks ourselves using D3. Plot has a className property that adds the specified class to the group (g) element containing the plot marks. We have specified the class for station dots as stations, the class for arrows as arrows, and the class for cells as edges. Then, we just need to know the types of elements we are going to bind for our selectAll calls. The dot mark creates circle elements, the arrow mark creates path elements, and the cell mark creates rect elements. For example, to select the cells, we have d3.select(adjMatrix).selectAll(".edges rect"). Now, we need to bind the data. Note that the stations dots are created from the stations data while the other two marks are created from the tripCounts. Once the data is bound to the elements, we can use D3’s methods to extract information or update attributes.
You can filter a selection (the points), but we need the correct boolean expression.
The cell data stores a start_station_id and end_station_id while the dot data stores just a station_id.
D3 has a classed function for adding and removing a class.
Finally, we want to highlight the arrow corresponding to the selected cell. However, we also want to show an arrow if it is currently filtered out (Part 2). One option is to add a class and use CSS to change the style. However, this will still leave the arrow below others, and changing the drawing order could potentially cause issues. Another option is to add a new arrow that basically copies the (potentially hidden) arrow and restyles it. Remember that we can grab any attribute from an existing graphical element using attr, so for the arrow, we can grab its definition d and append a new path to the map that highlights it. We can also remove that path when the highlighted arrow is no longer needed.
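The boolean expression for matching stations to the selected cell could be sketched as below. The function name isEndpoint and the "highlighted" class are hypothetical, and the field names follow the hint above; if your bound edge data uses start and end instead, adjust accordingly.

```javascript
// Hypothetical predicate: is this station one of the two endpoints of the
// selected edge? Assumes the edge datum has start_station_id and
// end_station_id and the station datum has station_id.
function isEndpoint(station, edge) {
  return (
    station.station_id === edge.start_station_id ||
    station.station_id === edge.end_station_id
  );
}

// Possible use with D3 once data is bound to the dots (`map` and `edge` are
// assumed names for the plot element and the hovered cell's datum):
// d3.select(map).selectAll(".stations circle")
//   .classed("highlighted", (d) => isEndpoint(d, edge));
// On pointerout, the class can be cleared again:
// d3.select(map).selectAll(".stations circle").classed("highlighted", false);

const demoEdge = { start_station_id: "A", end_station_id: "B" };
const demoStations = [{ station_id: "A" }, { station_id: "B" }, { station_id: "C" }];
console.log(demoStations.filter((s) => isEndpoint(s, demoEdge)));
```

Passing a function to classed sets the class on matching dots and removes it from the rest in one call, so no separate reset is needed while moving between cells.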
There is still a bit too much data for the visualization. Let’s create a new visualization that aggregates the trips by district. You can start from the original visualizations, but remember to give them new identifiers. We wish to change our adjacency matrix to aggregate by district, and our linked highlighting should highlight all arrows that correspond to the specified district connections.
We want a new matrix view that shows the districts and sums all trips between those districts. Note that while we excluded trips that started and ended at the same station, we can now have trips that start and end in the same district (the diagonal will not be empty). Create a new array (tripDistrictCounts) that specifies the counts between start and end districts and use that to draw a new matrix. Add a color mapping as before.
Use stationLookup to map from station id to district.
Finally, we want to create a second multiple view visualization using the same map as before (with a new identifier) and the district matrix we just created. Then, add linked highlighting so that selections in the district matrix highlight all arrows in the map that correspond to paths between stations in those districts.
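The district aggregation and the arrow matching for the linked highlighting could be sketched as follows. This is an illustration under stated assumptions, not the notebook's code: it assumes stationLookup is a Map from station id to a station object with a district field, that tripCounts items look like {start, end, count}, and the names aggregateByDistrict, arrowMatchesDistricts, start_district, and end_district are all hypothetical.

```javascript
// Hypothetical helper: sum route counts by (start district, end district).
function aggregateByDistrict(tripCounts, stationLookup) {
  const totals = new Map();
  for (const { start, end, count } of tripCounts) {
    const from = stationLookup.get(start);
    const to = stationLookup.get(end);
    if (!from || !to) continue; // skip routes whose stations are unknown
    const key = JSON.stringify([from.district, to.district]);
    const entry = totals.get(key);
    if (entry) entry.count += count;
    else {
      totals.set(key, {
        start_district: from.district,
        end_district: to.district,
        count,
      });
    }
  }
  return Array.from(totals.values());
}

// Hypothetical predicate: does this arrow run between the districts of the
// selected matrix cell?
function arrowMatchesDistricts(arrow, cell, stationLookup) {
  const from = stationLookup.get(arrow.start);
  const to = stationLookup.get(arrow.end);
  return (
    Boolean(from && to) &&
    from.district === cell.start_district &&
    to.district === cell.end_district
  );
}

// Made-up sample data, for illustration only.
const demoLookup = new Map([
  ["s1", { station_id: "s1", district: 101 }],
  ["s2", { station_id: "s2", district: 101 }],
  ["s3", { station_id: "s3", district: 102 }],
]);
const demoCounts = [
  { start: "s1", end: "s2", count: 3 },
  { start: "s2", end: "s1", count: 2 },
  { start: "s1", end: "s3", count: 4 },
];
const tripDistrictCounts = aggregateByDistrict(demoCounts, demoLookup);
console.log(tripDistrictCounts);
```

With data bound to the arrow paths, the predicate can drive a filter or classed call over all arrows, so a single district cell highlights every matching route at once.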