The goal of this assignment is to work with lists and dictionaries in Python.
You will be doing your work in a Jupyter notebook for this
assignment. You may choose to work on this assignment on a hosted
environment (e.g. tiger)
or on your own local installation of Jupyter and Python. You should use
Python 3.9 or higher for your work. To use tiger, use the credentials
you received. If you work remotely, make sure to download the .ipynb
file to turn in. If you choose to work locally, Anaconda is the easiest way
to install and manage Python. If you work locally, you may launch
Jupyter Lab either from the Navigator application or via the
command line as jupyter-lab.
In this assignment, we will be working with data from the United States Department of Agriculture’s FoodData Central. Rather than using this dataset directly, I have created a subset of this data, which can be read as a list of dictionaries. That data is located here, and I have created a template notebook, a3.ipynb, that contains a cell that will download and read that data. You can right-click and save-as the a3.ipynb file, and, if working on tiger, upload that file. Once loaded, the data is a list of dictionaries where each dictionary has nine key-value pairs. Those keys and a brief description are:
fdc_id: a unique identifier assigned by FoodData Central
brand_owner: the company that makes the product
brand_name: a brand name, if different from the company
description: the product’s name or description
branded_food_category: the category for the food product
ingredients: a comma-separated string of ingredients in the product
serving_size: the serving size of the product in the units specified by serving_size_unit
serving_size_unit: the units for the serving size value
nutrition: a list of dictionaries containing nutrition information; each dictionary contains the keys name, amount, and unit_type and their associated values
You will be answering queries and writing functions to help analyze this data. You may not use external libraries including statistics, collections, datetime, or pandas for this assignment.
The assignment is due at 11:59pm on Monday, February 21.
You should submit the completed notebook file required for this
assignment on Blackboard. The filename of the notebook should be a3.ipynb.
Please make sure to follow instructions to receive full credit. Use a markdown cell to label each part of the assignment with the number of the section you are completing. You may put the code for each part into one or more cells.
The first cell of your notebook should be a markdown cell with a line for your name and a line for your Z-ID. If you wish to add other information (the assignment name, a description of the assignment), you may do so after these two lines.
Find all of the possible values for serving size units. List each type only once!
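One way to approach this is to collect the values in a set, which stores each distinct value only once. The sketch below is a minimal illustration on a toy list; sample_items and its contents are hypothetical stand-ins for the data loaded by the template notebook.

```python
# Toy stand-in for the loaded dataset (hypothetical values);
# only the keys needed for this step are included.
sample_items = [
    {"description": "PEPPERONI", "serving_size": 28.0, "serving_size_unit": "g"},
    {"description": "ORANGE JUICE", "serving_size": 240.0, "serving_size_unit": "ml"},
    {"description": "GRANOLA", "serving_size": 55.0, "serving_size_unit": "g"},
]

# A set keeps each value once, no matter how often it is added.
units = set()
for item in sample_items:
    units.add(item["serving_size_unit"])

# Sorting is optional; it only makes the display order predictable.
print(sorted(units))  # ['g', 'ml']
```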
Write code to find the food item in the dataset with the largest
serving size among those in grams (serving size unit is ‘g’). Output the
description and brand_owner of the food item. Remember that you will
need to iterate through each element of the list, and each element is a
dictionary which has various keys including serving_size_unit and
description.
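The loop described above can be sketched as a running-maximum search. This is only an illustration on toy data: sample_items, the made-up brand owners, and the decision to skip items whose serving size is None are assumptions, not part of the assignment.

```python
# Toy stand-in for the loaded dataset (hypothetical values).
sample_items = [
    {"description": "PEPPERONI", "brand_owner": "Swift-Eckrich Inc.",
     "serving_size": 28.0, "serving_size_unit": "g"},
    {"description": "ORANGE JUICE", "brand_owner": "Hypothetical Juice Co.",
     "serving_size": 240.0, "serving_size_unit": "ml"},
    {"description": "GRANOLA", "brand_owner": "Hypothetical Oats LLC",
     "serving_size": 55.0, "serving_size_unit": "g"},
]

largest = None  # best item seen so far, or None before any match
for item in sample_items:
    # Only items measured in grams are candidates.
    if item["serving_size_unit"] != "g":
        continue
    # Assumption: skip items with a missing serving size.
    if item["serving_size"] is None:
        continue
    if largest is None or item["serving_size"] > largest["serving_size"]:
        largest = item

if largest is not None:
    print(largest["description"], largest["brand_owner"])
```

The 240 ml item is ignored even though its number is larger, because the comparison is restricted to items in grams.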
Write code to create a dictionary, category_counts, that
keeps track of how many items each food category
(branded_food_category) has listed in our sample dataset.
Next, use this dictionary to find and display the name of the category
that has the largest number of items.
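A common counting pattern is to look up the current count with a default of zero and add one. The sketch below shows that pattern and a second loop that finds the most frequent key; sample_items and the category names are hypothetical toy data.

```python
# Toy stand-in for the loaded dataset (hypothetical categories).
sample_items = [
    {"branded_food_category": "Cereal"},
    {"branded_food_category": "Candy"},
    {"branded_food_category": "Cereal"},
]

# Count items per category; dict.get supplies 0 for unseen keys.
category_counts = {}
for item in sample_items:
    category = item["branded_food_category"]
    category_counts[category] = category_counts.get(category, 0) + 1

# Find the key with the largest count by scanning the dictionary.
top_category = None
for category, count in category_counts.items():
    if top_category is None or count > category_counts[top_category]:
        top_category = category

print(top_category)  # Cereal
```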
The count_letters example from class may be useful.
Update the list of each food item’s nutrition
information to include the amount of unsaturated fat. This can be
computed by subtracting the amount of saturated fat from the amount of
total fat. You will need to add a new dictionary to the
list of nutrition information. The keys for name and
unit_type should be “Unsaturated Fat” and “G”,
respectively. The amount is what you are computing via the subtraction.
After computing this for all items, an item would, for example, now look
like this:
{'fdc_id': 374367,
 'brand_owner': 'Swift-Eckrich Inc.',
 'brand_name': None,
 'description': 'PEPPERONI',
 'branded_food_category': 'Pepperoni, Salami & Cold Cuts',
 'ingredients': 'PORK, BEEF, SALT, CONTAINS 2% OR LESS OF FLAVORINGS, LACTIC ACID STARTER CULTURE, OLEORESIN OF PAPRIKA, SODIUM NITRITE, SPICES, SUGAR, BHA, BHT, CITRIC ACID.',
 'serving_size': 28.0,
 'serving_size_unit': 'g',
 'nutrition': [{'name': 'Fiber', 'amount': 0.0, 'unit_name': 'G'},
               {'name': 'Saturated Fat', 'amount': 14.29, 'unit_name': 'G'},
               {'name': 'Carbohydrates', 'amount': 3.57, 'unit_name': 'G'},
               {'name': 'Sodium', 'amount': 1786.0, 'unit_name': 'MG'},
               {'name': 'Total Fat', 'amount': 39.29, 'unit_name': 'G'},
               {'name': 'Protein', 'amount': 21.43, 'unit_name': 'G'},
               {'name': 'Calories', 'amount': 464.0, 'unit_name': 'KCAL'},
               {'name': 'Sugar', 'amount': 0.0, 'unit_name': 'G'},
               {'name': 'Unsaturated Fat', 'amount': 25.0, 'unit_name': 'G'}]}
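The update can be sketched as a lookup by nutrient name followed by an append. This is a minimal illustration, not a reference solution: sample_items and the helper find_amount are hypothetical, the unit key is written as unit_name to match the example item shown above, and storing None when either fat value is missing is an assumed policy.

```python
# Toy stand-in for the loaded dataset, trimmed to the relevant entries.
sample_items = [
    {"description": "PEPPERONI",
     "nutrition": [
         {"name": "Saturated Fat", "amount": 14.29, "unit_name": "G"},
         {"name": "Total Fat", "amount": 39.29, "unit_name": "G"},
     ]},
]

def find_amount(nutrition, nutrient_name):
    """Hypothetical helper: return the amount for a nutrient, or None."""
    for entry in nutrition:
        if entry["name"] == nutrient_name:
            return entry["amount"]
    return None

for item in sample_items:
    total = find_amount(item["nutrition"], "Total Fat")
    saturated = find_amount(item["nutrition"], "Saturated Fat")
    # Assumption: propagate None when either value is unavailable.
    if total is None or saturated is None:
        unsaturated = None
    else:
        unsaturated = total - saturated
    # Key spelled unit_name here to mirror the example item above.
    item["nutrition"].append(
        {"name": "Unsaturated Fat", "amount": unsaturated, "unit_name": "G"}
    )
```

Because the amounts are floats, the difference may carry a tiny rounding error; whether to round it is a separate choice that this sketch leaves open. Running the loop twice would append a second entry, so in a notebook it is worth reloading the data before re-running such a cell.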
None value in the upper part of the range, and set the sum of two values to None whenever either value is None.
Write a function filter_by_fiber that
takes two arguments, min_fiber and max_fiber,
and returns a list of food items whose amount of fiber is in the
specified range, inclusive. For each item, you will
need to find the Fiber listing in the nutrition list. Do
not assume that item will be in a particular index of the list!
Then, test whether the item’s amount of fiber is in the specified range,
only including it in the returned list if it satisfies the condition.
For example, the list comprehension
[d['description'] for d in filter_by_fiber(6.3,6.35)]
should evaluate to:
['SISTERS FRUIT COMPANY, RED DELICIOUS SLICED APPLE CHIPS, LIGHT & CRISPY',
 'PINTO BEANS',
 'VANILLA ALMOND PREMIUM NATURALLY FLAVORED GRANOLA',
 "BUSH'S Red Beans in a Mild Chili Sauce 16 oz",
 'FRUIT & NUT GRANOLA, FRUIT & NUT',
 'VANILLA ALMOND WARM VANILLA FLAVOR PERFECTLY MIXED WITH SWEET HONEY AND SATISFYING ALMONDS PREMIUM GRANOLA, VANILLA ALMOND']
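The search for the Fiber entry and the inclusive range test can be sketched as follows. The sketch assumes the loaded list is bound to a global named data (a hypothetical name) and uses toy items with made-up amounts; skipping items with no Fiber entry or a None amount is also an assumption.

```python
# Toy stand-in for the loaded dataset (hypothetical amounts).
data = [
    {"description": "PINTO BEANS",
     "nutrition": [{"name": "Protein", "amount": 7.0, "unit_name": "G"},
                   {"name": "Fiber", "amount": 6.3, "unit_name": "G"}]},
    {"description": "PEPPERONI",
     "nutrition": [{"name": "Fiber", "amount": 0.0, "unit_name": "G"}]},
    {"description": "MYSTERY SNACK",  # no Fiber entry at all
     "nutrition": [{"name": "Protein", "amount": 2.0, "unit_name": "G"}]},
]

def filter_by_fiber(min_fiber, max_fiber):
    """Return items whose fiber amount lies in [min_fiber, max_fiber]."""
    matches = []
    for item in data:
        # Search by name; the Fiber entry may be at any index.
        for entry in item["nutrition"]:
            if entry["name"] == "Fiber":
                amount = entry["amount"]
                # Chained comparison tests both bounds, inclusive.
                if amount is not None and min_fiber <= amount <= max_fiber:
                    matches.append(item)
                break  # stop after the first Fiber entry for this item
    return matches

print([d["description"] for d in filter_by_fiber(6.3, 6.35)])  # ['PINTO BEANS']
```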
Only CSCI 503 students need to complete this part. CSCI 490 students may complete it for extra credit.
Write a function filter_by_ingredients that will filter
the food items by their ingredients. Specifically, given an ingredient
(e.g. “Apple”), return the food items that have that ingredient. Note
that you should do a case-insensitive comparison so the
ingredient “apple” should return food items that list “APPLE”, “Apple”,
“apple”, etc. Do not worry about “apple” also matching “pineapple”
(avoiding such matches is extra credit). For example,
len(filter_by_ingredients('apple'))
should evaluate to
605
and the list comprehension
[d['description'] for d in filter_by_ingredients('saffron')]
should evaluate to:
['WILD MUSHROOM & TRUFFLE',
 'ARTICHOKE HEARTS',
 'FLAN DESSERT MIX',
 'CHICKEN TIKKA MASALA WITH SAFFRON RICE, MEDIUM',
 'CON AZAFRAN SEASONING',
 'SEASONED YELLOWRICE']
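A case-insensitive substring test can be sketched by lowercasing both sides before comparing. As before, the global name data and the toy items are hypothetical, and guarding against a None ingredients value is an assumption about the data rather than something stated in the assignment.

```python
# Toy stand-in for the loaded dataset (hypothetical items).
data = [
    {"description": "APPLE CHIPS", "ingredients": "APPLES, SUGAR"},
    {"description": "FRUIT CUP", "ingredients": "Pineapple, Water"},
    {"description": "PEPPERONI", "ingredients": "PORK, BEEF, SALT"},
    {"description": "UNKNOWN", "ingredients": None},
]

def filter_by_ingredients(ingredient):
    """Return items whose ingredients string contains the ingredient."""
    needle = ingredient.lower()
    matches = []
    for item in data:
        ingredients = item["ingredients"]
        # Assumption: skip items whose ingredients value is missing.
        if ingredients is not None and needle in ingredients.lower():
            matches.append(item)
    return matches

print(len(filter_by_ingredients("Apple")))  # 2
```

The count is 2 because this simple substring test also matches “Pineapple”, which is the behavior the basic version of the task permits.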