Assignment 3

Goals

The goal of this assignment is to work with lists and dictionaries in Python.

Instructions

You will do your work for this assignment in a Jupyter notebook. You may work in a hosted environment (e.g. tiger) or on your own local installation of Jupyter and Python. You should use Python 3.14 for your work. To use tiger, use the credentials you received. If you work remotely, make sure to download the .ipynb file to turn in. If you choose to work locally, miniforge, Anaconda, pixi, or uv are probably the easiest ways to install and manage Python. If you work locally, you may launch Jupyter Lab either from the Navigator application (Anaconda) or from the command line as jupyter-lab or jupyter lab.

In this assignment, we will be working with data about US Senators’ stock trading practices, using publicly available data collated by InsiderFinance. I have created a version of this data that can be read as a list of dictionaries. That data is located here, but it is compressed, so you may use the following code to download it and read it into a Python list of dictionaries in your notebook (copy and paste into a cell):

from pathlib import Path
import json
from urllib.request import urlretrieve
import gzip

# download the data if we don't have it locally
url = "http://faculty.cs.niu.edu/~dakoop/cs503-2026sp/a3/senate-stock-trades.json.gz"
local_fname = "senate-stock-trades.json.gz"
if not Path(local_fname).exists():
    urlretrieve(url, local_fname)

# read the gzip-compressed JSON into a list of dictionaries
with gzip.open(local_fname, "rt", encoding="utf-8") as f:
    data = json.load(f)

Once loaded, the data is a list of dictionaries where each dictionary has nine key-value pairs. Those keys and a brief description are:

  • office: the name of the senator reporting the transaction
  • owner: the owner of the asset (the senator or a family member)
  • transaction_date: the date of the transaction as a string in mm/dd/yyyy format
  • type: the type of transaction (purchase or sale)
  • asset_type: whether the asset is a stock, bond, cryptocurrency, etc.
  • symbol: the stock ticker symbol if it exists (e.g. AAPL)
  • amount_range: the amount of the transaction, given as a range in a two-element list [min_amount, max_amount]
  • party: the political party of the senator reporting the transaction
  • link: a link to the report
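
As a quick orientation, the sketch below shows how the fields of one record might be accessed. The sample record is invented for illustration (its values are not taken from the dataset) and stands in for an element such as data[0].

```python
# Hypothetical record with the same keys as the dataset (all values invented)
sample = {
    "office": "Jane Doe",
    "owner": "Spouse",
    "transaction_date": "01/15/2025",
    "type": "Purchase",
    "asset_type": "Stock",
    "symbol": "AAPL",
    "amount_range": [1001, 15000],
    "party": "Independent",
    "link": "https://example.com/report",
}

# Values are looked up by key; amount_range unpacks into its two bounds
low, high = sample["amount_range"]
print(sample["office"], sample["symbol"], low, high)
```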

You will be answering queries and writing functions to help analyze this data. You may not use other libraries, including statistics, collections, datetime, polars, or pandas, for this assignment (gzip, pathlib, json, and urlretrieve, as used in the snippet above, are ok for that purpose).

Due Date

The assignment is due at 11:59pm on Monday, February 16.

Submission

You should submit the completed notebook file required for this assignment on Blackboard. The filename of the notebook should be a3.ipynb.

Details

Please make sure to follow instructions to receive full credit. Use a markdown cell to label each part of the assignment with the number of the section you are completing. You may put the code for each part into one or more cells.

0. Name & Z-ID (5 pts)

The first cell of your notebook should be a markdown cell with a line for your name and a line for your Z-ID. If you wish to add other information (the assignment name, a description of the assignment), you may do so after these two lines.

1. REIT Transactions (10 pts)

List the names of all senators who have been involved in transactions involving real estate investment trusts (REITs). These will have an asset_type of “REIT”. List each senator only once!

Hints
  • Iterate through all of the list elements, and examine the asset type from each element
  • Use a set
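
To illustrate the set hint, here is a minimal sketch on invented toy records. The keys name and category are hypothetical and are not the keys of the assignment data; the point is only that adding to a set keeps each value once.

```python
# Toy records (invented); "name" and "category" are hypothetical keys
records = [
    {"name": "Ana", "category": "fruit"},
    {"name": "Ben", "category": "grain"},
    {"name": "Ana", "category": "fruit"},
    {"name": "Cal", "category": "fruit"},
]

names = set()
for rec in records:
    if rec["category"] == "fruit":
        names.add(rec["name"])  # adding a duplicate has no effect

print(sorted(names))  # ['Ana', 'Cal']
```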

2. Largest Transaction (10 pts)

Write code to find the trades in the dataset that involved the most money. The dataset only specifies a range for each transaction, so you will need to determine the maximum range; ranges do not overlap. Extract only the trades in that range, and output the names of the senators (the offices) involved in those trades.

Hints
  • There are only two senators with transactions in the maximum range in this dataset
  • You can do this with only one iteration through the data…
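
One way a single pass could work is to track the largest value seen so far together with everything that matched it. The sketch below uses invented (name, range) pairs rather than the assignment data, and relies on the fact that Python compares lists element by element.

```python
# Invented (name, [low, high]) pairs, not the assignment data
items = [("A", [1, 5]), ("B", [6, 10]), ("C", [1, 5]), ("D", [6, 10])]

best = None      # largest range seen so far
winners = set()  # names seen with that range

for name, rng in items:
    if best is None or rng > best:
        best = rng        # new maximum: discard earlier winners
        winners = {name}
    elif rng == best:
        winners.add(name)  # ties with the current maximum

print(best, sorted(winners))  # [6, 10] ['B', 'D']
```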

3. Transaction Counts (10 pts)

Write code to create a dictionary that keeps track of how many sales (“Sale”, “Sale (Full)”, “Sale (Partial)”) transactions each senator has been involved in. You should find that Sen. Tuberville has 418 sales while Sen. Coons has 1 sale in the dataset.

Hints
  • Make sure to filter to only capture sales.
  • Can you check with only one boolean condition (i.e. no ors)?
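
The counting pattern itself can be sketched on toy data. The pairs and the wanted set below are invented; dict.get with a default of 0 avoids a separate check for keys that have not been seen yet.

```python
# Invented (name, label) pairs, not the assignment data
events = [("A", "red"), ("B", "blue"), ("A", "dark red"),
          ("A", "green"), ("B", "red")]
wanted = {"red", "dark red"}  # labels that should be counted

counts = {}
for name, label in events:
    if label in wanted:
        # start from 0 the first time a name appears
        counts[name] = counts.get(name, 0) + 1

print(counts)  # {'A': 2, 'B': 1}
```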

4. Sales Sums (15 pts)

Write code to create a dictionary that keeps track of the sum of the sales that each senator has made. Since we only have ranges, your output should also be a range. For example, if a senator has two sales of [1001,15000] and [100001, 250000], the result will be [101002, 265000]. Your result should be a dictionary whose keys are the senators’ names and whose values are their sales sums. Sen. Whitehouse’s sum should be [882104, 3455000].
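
The arithmetic in the example above can be checked with a short sketch that adds the lower bounds and the upper bounds separately. The variable names are illustrative only.

```python
# The two sales from the example above
sales = [[1001, 15000], [100001, 250000]]

total = [0, 0]
for low, high in sales:
    total[0] += low    # sum of lower bounds
    total[1] += high   # sum of upper bounds

print(total)  # [101002, 265000]
```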

5. Median Transaction by Ticker Symbol (15 pts)

Write a function get_symbol_median that, given a ticker symbol, returns the median transaction range for that ticker symbol. Recall the median is the middle value. For a sorted list of values [1, 3, 4, 7, 21], the median is 4; for [1, 3, 4, 7, 13, 21], it is the average of the two middle values, 4 and 7, which is 5.5. The median range, unlike the sum, will be the middle range (after sorting) if we have an odd number of ranges for a particular symbol, and the union of the two middle ranges (the lower bound from the lower middle range and the upper bound from the higher middle range) if we have an even number of ranges. For example, the median of [[0,1], [1,3], [4,7]] is [1,3] while the median of [[0,1], [1,3], [4,7], [8,15]] is [1,7].

For example,

get_symbol_median('NVDA') # returns [1001, 15000]
get_symbol_median('MPWR') # returns [1001, 50000]
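
The median-range rule described above can be sketched independently of the dataset. The helper name median_range is hypothetical (it is not the required get_symbol_median), and it assumes a non-empty list of [low, high] lists.

```python
def median_range(ranges):
    """Return the median range of a non-empty list of [low, high] lists."""
    ordered = sorted(ranges)  # lists sort by low bound, then high bound
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]   # odd count: the middle range
    # even count: low bound of lower middle, high bound of upper middle
    return [ordered[mid - 1][0], ordered[mid][1]]

print(median_range([[4, 7], [0, 1], [1, 3]]))           # [1, 3]
print(median_range([[8, 15], [0, 1], [4, 7], [1, 3]]))  # [1, 7]
```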

6. [CSCI 503 Only] Filter by Date (20 pts)

Only CSCI 503 students need to complete this part. CSCI 490 students may complete it for extra credit.

Write a function transactions_in_range that will filter the sales by date (inclusive). Specifically, given a start date and an end date, return the transactions that fall in that range. Note that you will need to parse the date strings, and then compare the dates in the correct order. Do not use a Python library for this, but rather create a tuple that encodes the date and makes the comparison operators work as desired. For example, transactions_in_range("10/27/2025", "10/28/2025") returns

[{'office': 'Linda Sanchez',
  'owner': '',
  'transaction_date': '10/28/2025',
  'type': 'Sale',
  'asset_type': 'Stock',
  'symbol': 'CSCO',
  'amount_range': [1001, 15000],
  'party': 'Democrat',
  'link': 'https://disclosures-clerk.house.gov/public_disc/ptr-pdfs/2026/20033755.pdf'},
 {'office': 'Sheldon Whitehouse',
  'owner': 'Spouse',
  'transaction_date': '10/27/2025',
  'type': 'Sale',
  'asset_type': 'Stock',
  'symbol': 'PGR',
  'amount_range': [15001, 50000],
  'party': 'Democrat',
  'link': 'https://efdsearch.senate.gov/search/view/ptr/51a44263-fbff-415d-81fb-abf32d197db9/'},
 {'office': 'Sheldon Whitehouse',
  'owner': 'Spouse',
  'transaction_date': '10/27/2025',
  'type': 'Sale (Full)',
  'asset_type': 'Stock',
  'symbol': 'PGR',
  'amount_range': [15001, 50000],
  'party': 'Democrat',
  'link': 'https://efdsearch.senate.gov/search/view/ptr/51a44263-fbff-415d-81fb-abf32d197db9/'}]

Hints
  • str.split will help
  • To compare dates, compare the year, then month, then day
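
As a minimal sketch of the tuple idea, the hypothetical helper date_key below turns an mm/dd/yyyy string into a (year, month, day) tuple. Tuples compare element by element, so the ordinary comparison operators then order dates correctly, which the raw strings do not.

```python
def date_key(date_str):
    """Convert 'mm/dd/yyyy' into a (year, month, day) tuple of ints."""
    month, day, year = date_str.split("/")
    return (int(year), int(month), int(day))

# As raw strings the order is wrong, because '1' sorts after '0'
print("12/31/2024" < "01/01/2025")                      # False
# As tuples the year is compared first, then month, then day
print(date_key("12/31/2024") < date_key("01/01/2025"))  # True

# An inclusive range check can use chained comparisons
start, end = date_key("10/27/2025"), date_key("10/28/2025")
print(start <= date_key("10/28/2025") <= end)           # True
```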