32 changes: 30 additions & 2 deletions 02_activities/assignments/Assignment2.md

**HINT:** You do not need to create any data for this prompt. This is a conceptual model only.

##### Answer:
<img src="assignment_2_bookstore-prompt1.png" width="500">

#### Prompt 2
We want to create employee shifts, splitting up the day into morning and evening. Add this to the ERD.

##### Answer:
<img src="assignment_2_bookstore-prompt2.png" width="500">

#### Prompt 3
The store wants to keep customer addresses. Propose two architectures for the CUSTOMER_ADDRESS table, one that will retain changes, and another that will overwrite. Which is type 1, which is type 2?

**HINT:** search type 1 vs type 2 slowly changing dimensions.

##### Answer:
<img src="assignment_2_bookstore-prompt3.png" width="500">

```
From my research, a type 1 model would overwrite the old customer address with the new one, while a type 2 model would retain the
change history. In my opinion, in a bookstore database, it would be more useful to overwrite old addresses with new ones, since
keeping data that is not used would be redundant. However, if address types are not being tracked (i.e., home, billing, or shipping),
a type 1 model might result in the loss of important data. In my model, I have incorporated another table to define address types --
this can come in handy if a customer's home and billing addresses are different, so the correct address type can be updated when needed.
```
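As a minimal DDL sketch of the two architectures (the table and column names here are my own assumptions, not taken from the ERD): the type 1 table keeps one row per customer and address type and is updated in place, while the type 2 table inserts a new row on each change and tracks validity dates.

```
-- Type 1: overwrite in place; one current row per customer/address type
CREATE TABLE customer_address_type1 (
    customer_id     INT,
    address_type_id INT,   -- e.g. home, billing, shipping
    street          TEXT,
    city            TEXT,
    postal_code     TEXT,
    PRIMARY KEY (customer_id, address_type_id)
);

-- Type 2: retain history; each change inserts a new row
CREATE TABLE customer_address_type2 (
    customer_address_id INTEGER PRIMARY KEY,
    customer_id         INT,
    address_type_id     INT,
    street              TEXT,
    city                TEXT,
    postal_code         TEXT,
    valid_from          TIMESTAMP,
    valid_to            TIMESTAMP, -- NULL marks the current row
    is_current          INT        -- 1 for the active address
);
```

Under type 1 an address change is a plain UPDATE; under type 2 it is an UPDATE that closes the old row's `valid_to`, followed by an INSERT of the new row.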

***


```
The assigned article discusses the limitations of automation and the reliance on human labour to train neural nets. It is
well known that major fast-fashion brands rely on workers who are paid very low wages and work in oppressive conditions.
On top of the factors that make sewing difficult to automate -- robots' lack of dexterity and the challenge of training
automated models to keep up with new styles -- the current system prefers this exploitative approach to maximize profits. Further,
training datasets, and reliable human coding of those datasets, are imperative to building neural nets such as large language models (LLMs).
The performance and outputs of these models are only as good as their training data. If the implicit racial bias and sexist
stereotypes that the trainers hold affect the coding of the training datasets, those biases are carried into the models that
use them. For example, in 2019, Google's Vision AI was at the centre of online discourse when the model labelled
a hand-held device differently based on skin tone: when a Black person was holding the item, it was labelled a "gun"; in contrast,
it was labelled a "monocular" when a white person was holding it. Google Translate is another example, where translating from
a gender-neutral language such as Turkish into English results in stereotypical generalizations about professions (e.g., a sentence
referring to a doctor gets he/him pronouns even though the input sentence does not indicate gender). Overall, minimizing bias and
bigotry in technology and automated models boils down to people acknowledging their implicit biases and addressing them through
further discussion and better education.

On a side note, the robot attempting to fold a towel really mirrors my struggles putting a duvet cover on.
```
143 changes: 136 additions & 7 deletions 02_activities/assignments/assignment2.sql
The `||` values concatenate the columns into strings.
Edit the appropriate columns -- you're making two edits -- and the NULL rows will be fixed.
All the other rows will remain the same.) */


SELECT
product_name || ', ' || COALESCE(product_size,' ') || ' (' || COALESCE(product_qty_type, 'unit') || ')'
FROM product;

--Windowed Functions
/* 1. Write a query that selects from the customer_purchases table and numbers each customer’s
each new market date for each customer, or select only the unique market dates per customer
(without purchase details) and number those visits.
HINT: One of these approaches uses ROW_NUMBER() and one uses DENSE_RANK(). */

-- option with row_number

SELECT
customer_id
,market_date
,ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY market_date) AS num_of_visits
FROM customer_purchases;
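
-- alternative sketch with DENSE_RANK(), as the hint suggests: purchases made on the
-- same market_date share one visit number, so visits are still numbered 1, 2, 3, ...
SELECT
	customer_id
	,market_date
	,DENSE_RANK() OVER (PARTITION BY customer_id ORDER BY market_date) AS num_of_visits
FROM customer_purchases;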

/* 2. Reverse the numbering of the query from a part so each customer’s most recent visit is labeled 1,
then write another query that uses this one as a subquery (or temp table) and filters the results to
only the customer’s most recent visit. */


SELECT
customer_id
,market_date
FROM (
SELECT
customer_id
,market_date
,ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY market_date DESC) AS most_recent_visit
FROM customer_purchases
) x
WHERE most_recent_visit = 1;

/* 3. Using a COUNT() window function, include a value along with each row of the
customer_purchases table that indicates how many different times that customer has purchased that product_id. */

SELECT DISTINCT
customer_id
,product_id
,COUNT(*) OVER (PARTITION BY customer_id, product_id) AS customer_purchase_count
FROM customer_purchases
ORDER BY customer_id, product_id;


-- String manipulations
Remove any trailing or leading whitespaces. Don't just use a case statement for each product!

Hint: you might need to use INSTR(product_name,'-') to find the hyphens. INSTR will help split the column. */

SELECT
	product_name
	,CASE WHEN INSTR(product_name, '-') > 0
		THEN TRIM(SUBSTR(product_name, INSTR(product_name, '-') + 1))
		ELSE NULL
	END AS description
FROM product;


/* 2. Filter the query to show any product_size value that contain a number with REGEXP. */

SELECT
	product_name
	,product_size
FROM product
-- note: stock SQLite has no built-in regexp() function; GUI tools such as DB Browser for SQLite provide one
WHERE product_size REGEXP '[0-9]';


-- UNION
HINT: There are possibly a few ways to do this query, but if you're struggling, try the following:
3) Query the second temp table twice, once for the best day, once for the worst day,
with a UNION binding them. */


SELECT
market_date
,daily_sales
,rank AS [rank]
,'the min' AS [preserve]
FROM (
SELECT DISTINCT
market_date
,SUM(quantity * cost_to_customer_per_qty) AS daily_sales
,RANK() OVER (ORDER BY SUM(quantity * cost_to_customer_per_qty) ASC) AS rank
FROM customer_purchases
GROUP BY market_date
) x
WHERE rank = 1

UNION

SELECT *
,'the max' AS [preserve]
FROM (
SELECT DISTINCT
market_date
,SUM(quantity * cost_to_customer_per_qty) AS daily_sales
,RANK() OVER (ORDER BY SUM(quantity * cost_to_customer_per_qty) DESC) AS rank
FROM customer_purchases
GROUP BY market_date
) x
WHERE rank = 1;


/* SECTION 3 */
Expand All @@ -89,27 +152,64 @@ Think a bit about the row counts: how many distinct vendors, product names are t
How many customers are there (y).
Before your final group by you should have the product of those two queries (x*y). */


WITH vendor_product AS (
SELECT DISTINCT
vendor_name
,product_name
,original_price
FROM vendor_inventory AS vi
INNER JOIN vendor AS v
ON vi.vendor_id = v.vendor_id
INNER JOIN product AS p
ON vi.product_id = p.product_id
),
big_customer_sales AS (
SELECT
vendor_name
,product_name
,original_price
,customer_id
FROM vendor_product
CROSS JOIN customer
)
SELECT
vendor_name
,product_name
,SUM(5 * original_price) AS surge_earnings
FROM big_customer_sales
GROUP BY vendor_name, product_name
ORDER BY vendor_name, product_name;

-- INSERT
/*1. Create a new table "product_units".
This table will contain only products where the `product_qty_type = 'unit'`.
It should use all of the columns from the product table, as well as a new column for the `CURRENT_TIMESTAMP`.
Name the timestamp column `snapshot_timestamp`. */

DROP TABLE IF EXISTS product_units;
CREATE TABLE product_units AS
SELECT p.*
FROM product AS p
WHERE product_qty_type = 'unit';

ALTER TABLE product_units
ADD snapshot_timestamp TIMESTAMP; -- existing rows keep NULL here; the INSERT below supplies CURRENT_TIMESTAMP

/*2. Using `INSERT`, add a new row to the product_units table (with an updated timestamp).
This can be any product you desire (e.g. add another record for Apple Pie). */


INSERT INTO product_units
VALUES(10, 'Eggs', '1 dozen', 6, 'unit', CURRENT_TIMESTAMP);

-- DELETE
/* 1. Delete the older record for the whatever product you added.

HINT: If you don't specify a WHERE clause, you are going to have a bad time.*/


DELETE FROM product_units
--SELECT * FROM product_units -- just for testing purposes
WHERE product_id = 10
AND snapshot_timestamp IS NULL;

-- UPDATE
/* 1.We want to add the current_quantity to the product_units table.
Finally, make sure you have a WHERE statement to update the right row,
you'll need to use product_units.product_id to refer to the correct row within the product_units table.
When you have all of these components, you can run the update statement. */

ALTER TABLE product_units
ADD current_quantity INT;


-- part one, getting the last quantity per product
DROP TABLE IF EXISTS last_quantity_per_product;
CREATE TEMP TABLE last_quantity_per_product AS
SELECT
product_id
,quantity
FROM (
SELECT *
,ROW_NUMBER() OVER (PARTITION BY product_id ORDER BY market_date DESC) AS most_recent_day
FROM vendor_inventory
)x
WHERE most_recent_day = 1; -- temp table holds the most recent quantity of each product in vendor_inventory

-- part two, left join to add current quantity values to view the nulls
SELECT *
FROM product_units AS pu
LEFT JOIN last_quantity_per_product AS lqpp
ON pu.product_id = lqpp.product_id;

-- part three, actual update
UPDATE product_units AS pu
-- set current_quantity to most recent quantity or 0 if null
SET current_quantity = COALESCE(( --use coalesce to replace any null values with 0
SELECT
quantity
FROM last_quantity_per_product AS lqpp
WHERE lqpp.product_id = pu.product_id
), 0);
