CBrien99 · CBrien99 · Jan 25, 2025 · Jan 26, 2025 · kelichiu · Jan 27, 2025
diff --git a/02_activities/assignments/Assignment1.md b/02_activities/assignments/Assignment1.md
@@ -205,5 +205,8 @@ Consider, for example, concepts of fariness, inequality, social structures, marg
 
 
 ```
-Your thoughts...
+Databases and data systems are deeply integrated into our daily lives, shaping how we interact with the world, especially in an era where technology is so present. Data systems reflect the values and assumptions of their creators and, as a result, often carry built-in biases. In my everyday life, I encounter databases in various forms, from targeted advertisements on the internet to credit checks when applying for loans. As the world becomes increasingly interconnected through the internet, the scale of data creation and usage has grown immensely. This expansion demands that we remain cognisant of how we use data to avoid discriminating against marginalized groups and perpetuating systemic biases.
+Value systems are embedded in many different databases and can have negative effects that are often overlooked. Data systems such as Canada’s healthcare and tax records, for example, should aim to serve all citizens equitably. Unfortunately, these databases do not accurately represent the full diversity of our population. Healthcare databases frequently categorize gender as strictly male or female, excluding non-binary and transgender individuals from accurate representation. Similarly, tax and benefits systems are built around traditional family structures, which can disadvantage single parents or those in non-conventional family arrangements. These oversights reveal how outdated value systems limit fairness within these systems. Social structures are further reinforced by data systems in subtle but significant ways. Employment databases and hiring platforms, for example, often prioritize conventional career trajectories, penalizing women who have taken time off to raise their families or people who have faced unexpected hardships throughout their lives. People who have followed non-linear career paths can struggle when they are evaluated by algorithms trained on databases that reflect traditional value systems. Automated hiring systems also often favour male candidates when trained on data from industries historically dominated by men. Similarly, educational databases, such as standardized testing systems, tend to favour students from higher socio-economic backgrounds, perpetuating existing cycles of advantage. Many educational databases reward the already privileged and fail to account for the structural barriers faced by marginalized groups. These examples highlight how technological systems trained on biased databases are active participants in shaping and perpetuating societal values.
+Although overt discrimination may no longer be legal, outdated value systems are still embedded in databases and can have grave effects on certain populations. Policing and justice systems often rely heavily on historical crime data that disproportionately target marginalized communities, particularly Indigenous and Black populations in Canada. Predictive policing tools trained on such data perpetuate cycles of surveillance and discrimination, embedding systemic inequities into the very algorithms that claim to be objective. Similarly, financial systems use databases to generate credit scores that continue to reflect historical biases against women and other underrepresented groups. Perhaps one of the most significant issues with databases is their failure to fully account for certain groups, rendering them invisible within these systems and therefore not accurately represented. Databases in fields like science and technology frequently underrepresent women, reinforcing the misconception that they are less capable or less interested in these areas. This invisibility not only perpetuates inequities but also hinders progress by failing to capture the full diversity of human experience.
+
 ```
diff --git a/02_activities/assignments/SQL_assignment1_Entity_Relationship_Diagram.png b/02_activities/assignments/SQL_assignment1_Entity_Relationship_Diagram.png
diff --git a/02_activities/assignments/assignment1.sql b/02_activities/assignments/assignment1.sql
@@ -2,24 +2,57 @@
 /* SECTION 2 */
 
 
+
 --SELECT
 /* 1. Write a query that returns everything in the customer table. */
 
-
+SELECT 
+* 
+FROM 
+customer;
 
 /* 2. Write a query that displays all of the columns and 10 rows from the cus- tomer table, 
 sorted by customer_last_name, then customer_first_ name. */
 
-
+SELECT 
+customer_id, 
+customer_first_name, 
+customer_last_name,
+customer_postal_code 
+FROM 
+customer 
+ORDER BY 
+customer_last_name, 
+customer_first_name
+LIMIT 10;
 
 --WHERE
 /* 1. Write a query that returns all customer purchases of product IDs 4 and 9. */
 -- option 1
 
+SELECT 
+* 
+FROM 
+customer_purchases
+WHERE 
+product_id 
+IN 
+(4, 9);
 
 -- option 2
 
-
+SELECT 
+* 
+FROM 
+customer_purchases
+WHERE 
+product_id 
+=
+4
+OR
+product_id 
+= 
+9;
 
 /*2. Write a query that returns all customer purchases and a new calculated column 'price' (quantity * cost_to_customer_per_qty), 
 filtered by vendor IDs between 8 and 10 (inclusive) using either:
@@ -28,47 +61,132 @@ filtered by vendor IDs between 8 and 10 (inclusive) using either:
 */
 -- option 1
 
+SELECT 
+*,
+(quantity * cost_to_customer_per_qty) 
+AS 
+price
+FROM
+customer_purchases
+WHERE 
+vendor_id >= 8 
+AND 
+vendor_id <= 10;
 
 -- option 2
 
-
+SELECT 
+*,
+(quantity * cost_to_customer_per_qty) 
+AS 
+price
+FROM 
+customer_purchases
+WHERE 
+vendor_id 
+BETWEEN 
+8 AND 10;
 
 --CASE
 /* 1. Products can be sold by the individual unit or by bulk measures like lbs. or oz. 
 Using the product table, write a query that outputs the product_id and product_name
 columns and add a column called prod_qty_type_condensed that displays the word “unit” 
 if the product_qty_type is “unit,” and otherwise displays the word “bulk.” */
 
-
+SELECT 
+product_id, 
+product_name, 
+CASE 
+    WHEN 
+    product_qty_type = 'unit' 
+    THEN 'unit'
+    ELSE 'bulk'
+END AS prod_qty_type_condensed
+FROM 
+product;
 
 /* 2. We want to flag all of the different types of pepper products that are sold at the market. 
 add a column to the previous query called pepper_flag that outputs a 1 if the product_name 
 contains the word “pepper” (regardless of capitalization), and otherwise outputs 0. */
 
-
+SELECT 
+product_id, 
+product_name, 
+CASE 
+    WHEN 
+    LOWER(product_name) 
+    LIKE '%pepper%' 
+    THEN 1
+    ELSE 0
+END AS pepper_flag
+FROM 
+product;
 
 --JOIN
 /* 1. Write a query that INNER JOINs the vendor table to the vendor_booth_assignments table on the 
 vendor_id field they both have in common, and sorts the result by vendor_name, then market_date. */
 
-
-
+SELECT 
+vendor.*, 
+vendor_booth_assignments.*
+FROM vendor
+INNER JOIN 
+vendor_booth_assignments 
+ON 
+vendor.vendor_id = vendor_booth_assignments.vendor_id
+ORDER BY 
+vendor.vendor_name, 
+vendor_booth_assignments.market_date;
 
 /* SECTION 3 */
 
 -- AGGREGATE
 /* 1. Write a query that determines how many times each vendor has rented a booth 
 at the farmer’s market by counting the vendor booth assignments per vendor_id. */
 
-
+SELECT 
+vendor.vendor_id, 
+vendor.vendor_name, 
+COUNT(vendor_booth_assignments.vendor_id) 
+AS booth_rentals
+FROM 
+vendor
+INNER JOIN 
+vendor_booth_assignments 
+ON 
+vendor.vendor_id = vendor_booth_assignments.vendor_id
+GROUP BY 
+vendor.vendor_id, 
+vendor.vendor_name
+ORDER BY 
+booth_rentals DESC;
 
 /* 2. The Farmer’s Market Customer Appreciation Committee wants to give a bumper 
 sticker to everyone who has ever spent more than $2000 at the market. Write a query that generates a list 
 of customers for them to give stickers to, sorted by last name, then first name. 
 
 HINT: This query requires you to join two tables, use an aggregate function, and use the HAVING keyword. */
 
-
+SELECT 
+customer.customer_id, 
+customer.customer_last_name, 
+customer.customer_first_name, 
+SUM(customer_purchases.quantity * customer_purchases.cost_to_customer_per_qty) AS total_spent
+FROM 
+customer
+INNER JOIN 
+customer_purchases 
+ON 
+customer.customer_id = customer_purchases.customer_id
+GROUP BY 
+customer.customer_id, 
+customer.customer_last_name, 
+customer.customer_first_name
+HAVING 
+total_spent > 2000
+ORDER BY 
+customer.customer_last_name, 
+customer.customer_first_name;
 
 --Temp Table
 /* 1. Insert the original vendor table into a temp.new_vendor and then add a 10th vendor: 
@@ -82,19 +200,58 @@ When inserting the new vendor, you need to appropriately align the columns to be
 VALUES(col1,col2,col3,col4,col5) 
 */
 
-
+CREATE 
+TEMPORARY TABLE new_vendor AS
+SELECT 
+* 
+FROM 
+vendor;
+
+INSERT INTO 
+new_vendor 
+(
+    vendor_id, 
+    vendor_name, 
+    vendor_type, 
+    vendor_owner_first_name, 
+    vendor_owner_last_name
+    )
+VALUES 
+(
+    10, 
+    'Thomass Superfood Store', 
+    'Fresh Focused', 
+    'Thomas', 
+    'Rosenthal'
+    );
 
 -- Date
 /*1. Get the customer_id, month, and year (in separate columns) of every purchase in the customer_purchases table.
 
 HINT: you might need to search for strfrtime modifers sqlite on the web to know what the modifers for month 
 and year are! */
 
-
+SELECT 
+customer_id,
+STRFTIME('%m', market_date) AS month,
+STRFTIME('%Y', market_date) AS year
+FROM 
+customer_purchases;
 
 /* 2. Using the previous query as a base, determine how much money each customer spent in April 2022. 
 Remember that money spent is quantity*cost_to_customer_per_qty. 
 
 HINTS: you will need to AGGREGATE, GROUP BY, and filter...
 but remember, STRFTIME returns a STRING for your WHERE statement!! */
 
+SELECT 
+customer_id,
+SUM(quantity * cost_to_customer_per_qty) AS total_spent
+FROM 
+customer_purchases
+WHERE 
+STRFTIME('%m', market_date) = '04'
+AND 
+STRFTIME('%Y', market_date) = '2022'
+GROUP BY 
+customer_id;
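
A note on the last hint ("STRFTIME returns a STRING"): the filter above compares against `'04'` and `'2022'` as text, which is what makes it match. The sketch below is one way to check that behaviour locally. It uses Python's built-in `sqlite3` module and a made-up, four-row stand-in for `customer_purchases`. The rows and the `YYYY-MM-DD` text format for `market_date` are assumptions for illustration, not data from the course database.

```python
import sqlite3

# Hypothetical miniature of customer_purchases; the rows are invented
# for illustration and market_date is assumed to be 'YYYY-MM-DD' text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer_purchases ("
    "customer_id INTEGER, market_date TEXT, "
    "quantity REAL, cost_to_customer_per_qty REAL)"
)
conn.executemany(
    "INSERT INTO customer_purchases VALUES (?, ?, ?, ?)",
    [
        (1, "2022-04-02", 2.0, 3.5),
        (1, "2022-04-09", 1.0, 4.0),
        (2, "2022-04-02", 3.0, 2.0),
        (2, "2022-05-07", 5.0, 1.0),
    ],
)

# Same filter as the query in the diff: month and year compared as text.
as_text = conn.execute(
    "SELECT customer_id, SUM(quantity * cost_to_customer_per_qty) "
    "FROM customer_purchases "
    "WHERE STRFTIME('%m', market_date) = '04' "
    "AND STRFTIME('%Y', market_date) = '2022' "
    "GROUP BY customer_id ORDER BY customer_id"
).fetchall()

# Comparing the text result of STRFTIME to the integer 4 matches nothing,
# because SQLite does not coerce the two sides here.
as_int = conn.execute(
    "SELECT COUNT(*) FROM customer_purchases "
    "WHERE STRFTIME('%m', market_date) = 4"
).fetchone()[0]

print(as_text)
print(as_int)
```

With these sample rows, the text comparison returns one total per customer for April 2022, while the integer comparison matches no rows at all. That silent empty result is the pitfall the hint warns about.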