Assignment 3

Goals

The goal of this assignment is to work with lists and dictionaries in Python.

Instructions

You will do your work for this assignment in a Jupyter notebook. You may work in a hosted environment (e.g. tiger) or in your own local installation of Jupyter and Python. You should use Python 3.12 for your work. (Older versions may work, but your code will be checked with Python 3.12.) To use tiger, use the credentials you received. If you work remotely, make sure to download the .ipynb file to turn in. If you choose to work locally, Anaconda or miniforge are probably the easiest ways to install and manage Python, and you may launch Jupyter Lab either from the Navigator application (Anaconda) or from the command line as jupyter-lab or jupyter lab.

In this assignment, we will be working with data from the United States Department of Agriculture’s FoodData Central. Rather than using this dataset directly, I have created a subset of this data, which can be read as a list of dictionaries. That data is located here, and I have created a template notebook, a3.ipynb, that contains a cell that will download and read that data. You can right-click and save-as the a3.ipynb file, and, if working on tiger, upload that file. Once loaded, the data is a list of dictionaries where each dictionary has nine key-value pairs. Those keys and a brief description are:

  • fdc_id: a unique identifier assigned by FoodData Central
  • brand_owner: the company that makes the product
  • brand_name: a brand name, if different from the company
  • description: the product’s name or description
  • branded_food_category: the category for the food product
  • ingredients: a comma-separated string of ingredients in the product
  • serving_size: the serving size of the product in the units specified by serving_size_unit
  • serving_size_unit: the units for the serving size value
  • nutrition: a list of dictionaries containing nutrition information; each dictionary contains the keys name, amount, and unit_name and their associated values
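
To make the nesting concrete, here is a minimal sketch of one record and how its parts can be reached. The record and all of its values are invented for illustration; only the key names come from the list above. The unit key is spelled unit_name here, following the sample record shown in Part 4, so check the actual data before relying on it.

```python
# Hypothetical record shaped like one element of the data list (values invented).
item = {
    "fdc_id": 1,
    "brand_owner": "Example Foods",
    "brand_name": None,
    "description": "EXAMPLE CRACKERS",
    "branded_food_category": "Crackers & Biscotti",
    "ingredients": "WHEAT FLOUR, SUGAR, SALT",
    "serving_size": 15.0,
    "serving_size_unit": "g",
    "nutrition": [
        {"name": "Total Fat", "amount": 6.67, "unit_name": "G"},
        {"name": "Fiber", "amount": 0.0, "unit_name": "G"},
    ],
}

# Top-level values are looked up directly by key.
print(item["description"], item["serving_size"], item["serving_size_unit"])

# The nutrition value is itself a list of dictionaries, so it needs a loop.
for entry in item["nutrition"]:
    print(entry["name"], entry["amount"], entry["unit_name"])
```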

You will be answering queries and writing functions to help analyze this data. You may not use external libraries including statistics, collections, datetime, or pandas for this assignment.

Due Date

The assignment is due at 11:59pm on Monday, September 30.

Submission

You should submit the completed notebook file required for this assignment on Blackboard. The filename of the notebook should be a3.ipynb.

Details

Please make sure to follow instructions to receive full credit. Use a markdown cell to label each part of the assignment with the number of the section you are completing. You may put the code for each part into one or more cells.

0. Name & Z-ID (5 pts)

The first cell of your notebook should be a markdown cell with a line for your name and a line for your Z-ID. If you wish to add other information (the assignment name, a description of the assignment), you may do so after these two lines.

1. Serving Size Units (5 pts)

Find all of the possible values for serving size units. List each type only once!

Hints
  • Iterate through all of the list elements, and extract the serving size unit from each element
  • Which data type will work best for this?
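
One way to read the hints above is that a set discards duplicates automatically. The sketch below shows the idea on a tiny, made-up list named sample_items (a hypothetical stand-in for the real data); it is one possible approach, not the required one.

```python
# Hypothetical three-item stand-in for the real list of dictionaries.
sample_items = [
    {"description": "A", "serving_size_unit": "g"},
    {"description": "B", "serving_size_unit": "ml"},
    {"description": "C", "serving_size_unit": "g"},
]

# A set keeps each distinct value once, no matter how often it is added.
units = set()
for item in sample_items:
    units.add(item["serving_size_unit"])

print(units)
```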

2. Largest Serving Size (10 pts)

Write code to find the food items in the dataset with the largest serving size among those measured in milliliters (serving_size_unit is ‘ml’). There may be multiple items with the same maximum serving size. Output the description and brand_owner of each such food item. Remember that you will need to iterate through each element of the list, and each element is a dictionary which has various keys including serving_size_unit and description.
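
The tie handling can be done in a single pass by keeping a running maximum together with the list of items that reach it. This is a sketch on invented data (sample_items is hypothetical), and it assumes every serving_size is a number; the real data may need an extra check.

```python
# Hypothetical stand-in data; two items tie for the largest 'ml' serving.
sample_items = [
    {"description": "JUICE", "brand_owner": "Acme", "serving_size": 240.0, "serving_size_unit": "ml"},
    {"description": "SODA", "brand_owner": "Fizz Co", "serving_size": 355.0, "serving_size_unit": "ml"},
    {"description": "COLA", "brand_owner": "Pop Inc", "serving_size": 355.0, "serving_size_unit": "ml"},
    {"description": "CEREAL", "brand_owner": "Grain Co", "serving_size": 400.0, "serving_size_unit": "g"},
]

max_size = None   # largest 'ml' serving size seen so far
largest = []      # every item that has that serving size

for item in sample_items:
    if item["serving_size_unit"] != "ml":
        continue  # ignore items measured in other units
    size = item["serving_size"]
    if max_size is None or size > max_size:
        max_size = size
        largest = [item]      # new maximum: start the list over
    elif size == max_size:
        largest.append(item)  # tie: keep this item as well

for item in largest:
    print(item["description"], item["brand_owner"])
```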

3. Category Counts (10 pts)

Write code to create a dictionary, category_counts, that keeps track of how many items each food category (branded_food_category) has listed in our sample dataset. Next, use this dictionary to find and display the name of the category that has the largest number of items.

Hints
  • The count_letters example from class may be useful
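
As a rough illustration of the counting pattern (not necessarily the same as the class example), the sketch below tallies categories in a plain dictionary and then picks the key with the largest count. The data in sample_items is invented.

```python
# Hypothetical stand-in data with only the key this part needs.
sample_items = [
    {"branded_food_category": "Cereal"},
    {"branded_food_category": "Candy"},
    {"branded_food_category": "Cereal"},
]

category_counts = {}
for item in sample_items:
    category = item["branded_food_category"]
    # dict.get supplies 0 the first time a category is seen.
    category_counts[category] = category_counts.get(category, 0) + 1

# max over the keys, ranked by their counts.
top_category = max(category_counts, key=category_counts.get)
print(top_category, category_counts[top_category])
```

If two categories tie for the largest count, max returns only one of them, so a tie would need separate handling.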

4. Add Unsaturated Fat (15 pts)

Update each food item’s list of nutrition information to include the amount of unsaturated fat. This can be computed by subtracting the amount of saturated fat from the amount of total fat. You will need to add a new dictionary to the list of nutrition information. Its name should be “Unsaturated Fat” and its unit (the unit_name key in the example below) should be “G”. The amount is what you are computing via the subtraction. After computing this for all items, an item would, for example, now look like this:

{'fdc_id': 1106099,
 'brand_owner': 'Rovira Biscuit Corporation',
 'brand_name': None,
 'description': 'TITA CRACKERS',
 'branded_food_category': 'Crackers & Biscotti',
 'ingredients': 'ENRICHED WHEAT FLOUR (NIACIN, IRON, THIAMINE MONONITRATE (VITAMIN B1), RIBOFLAVIN (VITAMIN B2), FOLIC ACID), SUGAR, VEGETABLE SHORTENING (CONTAINS PARTIALLY HYDROGENATED SOYBEAN OIL, AND/OR COTTONSEED OIL, AND/OR CANOLA OIL) *ADDS A DIETARILY INSIGNIFICANT AMOUNT OF SATURATED FAT, GLUCOSE, MALT, AMMONIUM BICARBONATE, SALT, GINGER, SODIUM BICARBONATE, SODIUM SULFITE, ARTIFICIAL FLAVOR, ARTIFICIAL COLORS (YELLOW #5, YELLOW #6, RED #40).',
 'serving_size': 15.0,
 'serving_size_unit': 'g',
 'nutrition': [{'name': 'Carbohydrates', 'amount': 80.0, 'unit_name': 'G'},
  {'name': 'Saturated Fat', 'amount': 0.0, 'unit_name': 'G'},
  {'name': 'Calories', 'amount': 400.0, 'unit_name': 'KCAL'},
  {'name': 'Protein', 'amount': 6.67, 'unit_name': 'G'},
  {'name': 'Sugar', 'amount': 20.0, 'unit_name': 'G'},
  {'name': 'Fiber', 'amount': 0.0, 'unit_name': 'G'},
  {'name': 'Sodium', 'amount': 267.0, 'unit_name': 'MG'},
  {'name': 'Total Fat', 'amount': 6.67, 'unit_name': 'G'},
  {'name': 'Unsaturated Fat', 'amount': 6.67, 'unit_name': 'G'}]}
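
A possible shape for this update, sketched on one invented item: find both fat entries, then append the new dictionary. The unit key is held in UNIT_KEY, which is set to unit_name to match the record above; that choice is an assumption, as is the decision to skip items that lack either fat entry.

```python
UNIT_KEY = "unit_name"  # assumption: matches the sample record; adjust if the data differs

# Hypothetical stand-in data with round numbers.
sample_items = [
    {"description": "EXAMPLE CHIPS",
     "nutrition": [
         {"name": "Saturated Fat", "amount": 4.0, UNIT_KEY: "G"},
         {"name": "Total Fat", "amount": 10.0, UNIT_KEY: "G"},
     ]},
]

for item in sample_items:
    total_fat = None
    saturated_fat = None
    # Look the two entries up by name rather than by position.
    for entry in item["nutrition"]:
        if entry["name"] == "Total Fat":
            total_fat = entry["amount"]
        elif entry["name"] == "Saturated Fat":
            saturated_fat = entry["amount"]
    # Only add the new entry when both inputs were found.
    if total_fat is not None and saturated_fat is not None:
        item["nutrition"].append(
            {"name": "Unsaturated Fat", "amount": total_fat - saturated_fat, UNIT_KEY: "G"}
        )

print(sample_items[0]["nutrition"][-1])
```

Running the loop a second time would append a second “Unsaturated Fat” entry, so in a notebook it is worth reloading the data before re-running a cell like this.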

5. Filter by Fiber Range (15 pts)

Write a function filter_by_fiber that takes two arguments, min_fiber and max_fiber, and returns a list of food items whose amount of fiber is in the specified range, inclusive. For each item, you will need to find the Fiber entry in the nutrition list. Do not assume that this entry will be at a particular index of the list! Then, test whether the item’s amount of fiber is in the specified range, and include the item in the returned list only if it satisfies the condition. For example, the list comprehension [d['description'] for d in filter_by_fiber(7.3,7.35)] should evaluate to:

['ATHLETE FUEL ORGANIC MUESLI',
 "WILBUR'S OF MAINE, ALL NATURAL DARK CHOCOLATE CRANBERRIES",
 'MULTIGRAIN PIZZA DOUGH',
 'APPLE CINNAMON GRANOLA',
 'VANILLA ICED LATTE CHILLED COFFEE DRINK, VANILLA',
 'KODIAK CAKES, GRANOLA UNLEASHED, VERMONT MAPLE PECAN',
 'PROTEIN CEREAL, OATS & HONEY',
 'PEACE CEREAL, SUPERGRAINS CEREAL, MAPLE BUCKWHEAT HEMP',
 'ARTISAN BLEND GRANOLA',
 'HONEY ALMOND GRANOLA, HONEY ALMOND',
 'HONEY ALMOND CEREAL, HONEY ALMOND']

Hints
  • Python allows chained comparisons
  • For each item, you don’t need to check any other entries in the nutrition information once you’ve found the one for fiber
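
Both hints can be seen together in a small sketch: a chained comparison for the inclusive range, and break to stop scanning once the Fiber entry has been found. The function reads from a made-up global list, sample_items, which stands in for the real data; items without a Fiber entry are simply left out, which is an assumption about the intended behavior.

```python
# Hypothetical stand-in data; the Fiber entry is not always at the same index.
sample_items = [
    {"description": "OAT BAR",
     "nutrition": [{"name": "Protein", "amount": 5.0}, {"name": "Fiber", "amount": 7.3}]},
    {"description": "WHITE BREAD",
     "nutrition": [{"name": "Fiber", "amount": 1.0}]},
    {"description": "WATER", "nutrition": []},
]

def filter_by_fiber(min_fiber, max_fiber):
    """Return the items whose fiber amount lies in [min_fiber, max_fiber]."""
    matches = []
    for item in sample_items:
        for entry in item["nutrition"]:
            if entry["name"] == "Fiber":
                # Chained comparison: both bounds are checked, endpoints included.
                if min_fiber <= entry["amount"] <= max_fiber:
                    matches.append(item)
                break  # the Fiber entry was found, so stop scanning this item
    return matches

print([d["description"] for d in filter_by_fiber(7.0, 8.0)])
```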

6. [CSCI 503 Only] Filter by Ingredients (15 pts)

Only CSCI 503 students need to complete this part. CSCI 490 students may complete it for extra credit.

Write a function filter_by_ingredients that will filter the food items by their ingredients. Specifically, given an ingredient (e.g. “Apple”), return the food items that have that ingredient. Note that you should do a case-insensitive comparison, so the ingredient “apple” should return food items that list “APPLE”, “Apple”, “apple”, etc. Do not worry about “apple” also matching “pineapple” (handling this is extra credit). For example, len(filter_by_ingredients('apple')) should evaluate to 639 and the list comprehension [d['description'] for d in filter_by_ingredients('saffron')] should evaluate to:

['SHISH KABOB SEASONING',
 'MANITOU TRADING COMPANY, ALL NATURAL PAELLA RICE',
 'SEASONED YELLOW RICE',
 'SEASONED YELLOWRICE']

Hints
  • Make sure to handle the case where the item has no ingredients (e.g. the value is None)
  • Make sure your comparison allows case differences in both the argument and the ingredients
  • str.upper may help
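
The three hints combine into a short pattern: skip None, upper-case both sides, then test for a substring. The sketch below uses an invented global list, sample_items, in place of the real data, and it deliberately lets “apple” match “PINEAPPLE”, as the basic version of this part allows.

```python
# Hypothetical stand-in data, including one item with no ingredients.
sample_items = [
    {"description": "APPLE PIE", "ingredients": "WHEAT FLOUR, Apples, SUGAR"},
    {"description": "FRUIT CUP", "ingredients": "PINEAPPLE, WATER"},
    {"description": "MYSTERY SNACK", "ingredients": None},
    {"description": "RICE", "ingredients": "RICE, SAFFRON"},
]

def filter_by_ingredients(ingredient):
    """Return the items whose ingredient string contains the given text."""
    target = ingredient.upper()  # normalize the argument once
    matches = []
    for item in sample_items:
        ingredients = item["ingredients"]
        # Skip items with no ingredient string, then compare in upper case.
        if ingredients is not None and target in ingredients.upper():
            matches.append(item)
    return matches

print([d["description"] for d in filter_by_ingredients("apple")])
```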

Extra Credit

  • CSCI 490 students may complete Part 6 for extra credit.
  • Update Part 6 so that it differentiates between individual ingredients. Then, for example, “apple” should not match “pineapple”.
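
One possible way to separate ingredients, offered only as a sketch: break the ingredient string into whole words and test for membership instead of searching for a substring. The helper names ingredient_words and has_ingredient are hypothetical, and this version only handles single-word ingredients; multi-word ingredients such as “canola oil” and plural forms such as “apples” would need more thought.

```python
def ingredient_words(ingredients):
    """Split an ingredient string into a set of upper-case words."""
    # Replace every non-letter (commas, parentheses, digits) with a space.
    cleaned = "".join(ch if ch.isalpha() else " " for ch in ingredients.upper())
    return set(cleaned.split())

def has_ingredient(item, ingredient):
    """True if the single-word ingredient appears as a whole word."""
    ingredients = item["ingredients"]
    if ingredients is None:
        return False
    return ingredient.upper() in ingredient_words(ingredients)

# Hypothetical items used only to exercise the helpers.
fruit_cup = {"description": "FRUIT CUP", "ingredients": "PINEAPPLE, WATER"}
apple_chips = {"description": "APPLE CHIPS", "ingredients": "ENRICHED FLOUR (NIACIN), Apple, SALT."}

print(has_ingredient(fruit_cup, "apple"), has_ingredient(apple_chips, "apple"))
```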