feat: crud functionality and aggregations for python backend by shuangela · Pull Request #5 · mongodb/docs-sample-apps

shuangela · 2025-10-24T19:17:24Z

Basic crud functionality and aggregation pipelines for python backend

This PR introduces basic CRUD functionality and aggregation pipelines for the FastAPI application. It sets up find, insert, delete, and find and delete operations. It also adds aggregations for movies by year, most recent comments (joining the movies collection with the comments collection), and by director.

Key Changes

Added CRUD functionality
Added three aggregation pipelines

Testing

Verified all endpoints locally via FastAPI docs (/docs)
Confirmed database connection and data persistence
Checked error responses for validation and connection issues

cbullinger

a couple comments/questions. nice job!

cbullinger · 2025-10-24T21:33:43Z

+@router.get("/{id}", response_model=SuccessResponse[Movie])
+async def get_movie_by_id(id: str):
+    # Validate ObjectId format
+    object_id = ObjectId(id)


should we wrap this in a try-except? i.e. what happens if id isn't valid?
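A minimal sketch of the guard this comment asks for. In the real handler the parse would be `ObjectId(id)` from `bson`, which raises `bson.errors.InvalidId` on a malformed string; the regex check, `InvalidMovieId`, `parse_movie_id`, and `get_movie_status` below are hypothetical stand-ins so the example runs without PyMongo installed.

```python
import re

# Stand-in for bson.ObjectId parsing: an ObjectId string is 24 hex characters.
_OBJECT_ID_HEX = re.compile(r"[0-9a-fA-F]{24}")


class InvalidMovieId(ValueError):
    """Hypothetical error raised when a path parameter is not a valid id."""


def parse_movie_id(raw_id: str) -> str:
    """Return the normalized id, or raise InvalidMovieId."""
    if not isinstance(raw_id, str) or not _OBJECT_ID_HEX.fullmatch(raw_id):
        raise InvalidMovieId(f"Invalid movie id: {raw_id!r}")
    return raw_id.lower()


def get_movie_status(raw_id: str) -> int:
    """Map a malformed id to 400 instead of letting it surface as a 500."""
    try:
        parse_movie_id(raw_id)
    except InvalidMovieId:
        return 400
    return 200
```

Without the try/except, a malformed id would raise inside the handler and the client would most likely see a generic 500 rather than a validation error.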

cbullinger · 2025-10-24T21:38:11Z

+    print(f"Database name: {db.name if hasattr(db, 'name') else 'unknown'}")
+    print(f"Collection name: movies")
+
+    # For motor (async MongoDB driver), we need to await the aggregate call


I thought we weren't using Motor (instead using PyMongo's native async)

Suggested change

# For motor (async MongoDB driver), we need to await the aggregate call

# For async PyMongo driver, we need to await the aggregate call

Correct, we aren't using motor. Just async within the PyMongo driver.

i am not, i think copilot added this comment for some reason despite not using motor 😓

cbullinger · 2025-10-24T21:42:27Z

+
+    # For motor (async MongoDB driver), we need to await the aggregate call
+    cursor = await db.movies.aggregate(pipeline)
+    results = await cursor.to_list(length=None)  # Convert cursor to list


do we want to point out why we're using to_list() for aggregations vs. "async for" for find queries (what you did in lines 105-108)?
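One way to illustrate the difference being asked about: `to_list()` buffers every remaining document into a single list, while `async for` handles documents one at a time as they arrive. The sketch below uses a hypothetical `FakeAsyncCursor` stand-in (not a PyMongo class) so it runs with the standard library only; `stream_titles` and `buffer_all` are illustrative names.

```python
import asyncio


class FakeAsyncCursor:
    """Hypothetical stand-in for an async driver cursor."""

    def __init__(self, docs):
        self._docs = list(docs)
        self._pos = 0

    def __aiter__(self):
        return self

    async def __anext__(self):
        if self._pos >= len(self._docs):
            raise StopAsyncIteration
        doc = self._docs[self._pos]
        self._pos += 1
        return doc

    async def to_list(self, length=None):
        # Drain the remaining documents into one in-memory list.
        end = len(self._docs) if length is None else self._pos + length
        remaining = self._docs[self._pos:end]
        self._pos += len(remaining)
        return remaining


async def stream_titles(cursor):
    # Streaming: only one document is handled at a time.
    titles = []
    async for doc in cursor:
        titles.append(doc["title"])
    return titles


async def buffer_all(cursor):
    # Buffering: the whole result set is materialized before returning.
    return await cursor.to_list(length=None)


async def main():
    docs = [{"title": "A"}, {"title": "B"}, {"title": "C"}]
    streamed = await stream_titles(FakeAsyncCursor(docs))
    buffered = await buffer_all(FakeAsyncCursor(docs))
    return streamed, buffered


streamed, buffered = asyncio.run(main())
```

Buffering is usually fine for aggregation results that end in a small `$limit`; streaming is the safer default when the result size is unbounded.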

cbullinger · 2025-10-24T21:44:17Z

+    cursor = await db.movies.aggregate(pipeline)
+    results = await cursor.to_list(length=None)  # Convert cursor to list
+
+    print(f"Aggregation returned {len(results)} results")  # Debug logging


is there a ticket to add proper logging?

I don't believe there's an official logging ticket, this was just my logging for my own testing purposes locally. I can remove it if that makes the code cleaner.

tmcneil-mdb

Great job!
These are some minor changes, mostly to keep the code consistent and to add validation. I didn't get to the end of the file. I will get to find and delete on Monday.

I haven't written an aggregation yet, so I might leave those for now. I will ping you if I get to them.

tmcneil-mdb · 2025-10-24T22:59:32Z

+    object_id = ObjectId(id)
+
+    # Use findOne() to get a single document by _id
+    movie = await db.movies.find_one({"_id": object_id})


To grab the db, I added a function called get_collection from mongo_client.py file to make unit testing easier later. I am calling the db like this in the rest of the functions:

movies_collection = get_collection("movies")

tmcneil-mdb · 2025-10-24T23:04:11Z

+        genre (str): The genre of the movie.
+        year (int): The year the movie was released.
+        min_rating (float): The minimum IMDB rating.
+        max_rating (float): The maximum IMDB rating.


The request body for this is the CreateMovieRequest object.

tmcneil-mdb · 2025-10-24T23:09:15Z

+    result = await db.movies.insert_one(movie_data)
+
+    # Retrieve the created document to return complete data
+    created_movie = await db.movies.find_one({"_id": result.inserted_id})


We need to verify that the document was created before querying it, i.e. check that the result is acknowledged.
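A sketch of the check described here, assuming the insert result exposes `acknowledged` and `inserted_id` as PyMongo's `InsertOneResult` does. `FakeInsertResult` and `FakeMovies` are hypothetical in-memory stand-ins so the example runs without a database, and the `RuntimeError`s stand in for whatever error response the app standardizes on.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Dict


@dataclass
class FakeInsertResult:
    """Hypothetical stand-in exposing the two attributes the handler reads."""
    acknowledged: bool
    inserted_id: Any


class FakeMovies:
    """In-memory stand-in for the movies collection (not a driver class)."""

    def __init__(self, acknowledged: bool = True) -> None:
        self._acknowledged = acknowledged
        self._docs: Dict[int, dict] = {}

    async def insert_one(self, doc: dict) -> FakeInsertResult:
        new_id = len(self._docs) + 1
        self._docs[new_id] = {"_id": new_id, **doc}
        return FakeInsertResult(self._acknowledged, new_id)

    async def find_one(self, query: dict):
        return self._docs.get(query["_id"])


async def create_movie(movies, movie_data: dict) -> dict:
    result = await movies.insert_one(movie_data)
    # Guard 1: the write must have been acknowledged.
    if not result.acknowledged:
        raise RuntimeError("Insert was not acknowledged")
    created = await movies.find_one({"_id": result.inserted_id})
    # Guard 2: the follow-up read can still come back empty.
    if created is None:
        raise RuntimeError("Inserted movie could not be retrieved")
    return created


created = asyncio.run(create_movie(FakeMovies(), {"title": "Example"}))
```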

tmcneil-mdb · 2025-10-24T23:16:43Z

+        SuccessResponse[Movie]: A response object containing the created movie data.
+"""
+
+@router.post("/", response_model=SuccessResponse[CreateMovieRequest], status_code=201)


Should be:
response_model=SuccessResponse[Movie]

We are returning the movie

tmcneil-mdb · 2025-10-24T23:29:58Z

+
+@router.delete("/{id}", response_model=SuccessResponse[dict])
+async def delete_movie_by_id(id: str):
+    object_id = ObjectId(id)


Wrap in a try/except. The id might not be valid.

tmcneil-mdb · 2025-10-24T23:32:17Z

+    result = await db.movies.delete_one({"_id": object_id})
+
+    if result.deleted_count == 0:
+        raise HTTPException(status_code=404, detail="Movie not found")


Let's use create_error_response() to keep the errors consistent.
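For illustration only, here is what a shared error envelope might look like. The thread names `create_error_response()` but does not show its signature, so the parameters and response fields below are assumptions, as is the `delete_result_to_response` helper.

```python
from datetime import datetime, timezone
from typing import Any, Optional, Tuple


def create_error_response(
    message: str, code: str, details: Optional[Any] = None
) -> dict:
    """Hypothetical signature; the project's real helper may differ."""
    return {
        "success": False,
        "message": message,
        "error": {"code": code, "details": details},
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }


def delete_result_to_response(deleted_count: int) -> Tuple[int, dict]:
    """Turn a delete count into (status code, body) with one error shape."""
    if deleted_count == 0:
        return 404, create_error_response("Movie not found", "NOT_FOUND")
    return 200, {"success": True, "data": {"deletedCount": deleted_count}}
```

Routing every failure through one helper means clients can rely on a single response shape instead of a mix of framework defaults and hand-built bodies.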

tmcneil-mdb · 2025-10-24T23:32:43Z

+    object_id = ObjectId(id)
+
+    # Use deleteOne() to remove a single document
+    result = await db.movies.delete_one({"_id": object_id})


Same comment as above about accessing the db.

tmcneil-mdb · 2025-10-27T18:26:21Z

+
+@router.delete("/{id}/find-and-delete", response_model=SuccessResponse[Movie])
+async def find_and_delete_movie(id: str):
+    object_id = ObjectId(id)


Wrap in try/except.

tmcneil-mdb · 2025-10-27T18:27:18Z

+    deleted_movie = await db.movies.find_one_and_delete({"_id": object_id})
+
+    if deleted_movie is None:
+        raise HTTPException(status_code=404, detail="Movie not found")


convert to our standard error response

tmcneil-mdb · 2025-10-27T19:19:10Z

+        SuccessResponse[List[dict]]: A response object containing aggregated genre statistics.
+"""
+
+@router.get("/aggregate/by-genre", response_model=SuccessResponse[List[dict]])


Not sure if we are doing by genre?

Either way the endpoint should be /api/movies/reportingByGenre

I'm not against using /aggregate. I think that would look nicer, but it's a change we all have to agree on.

good point, i will remove

tmcneil-mdb · 2025-10-27T19:20:19Z

+        }
+    ]
+
+    # Execute the aggregation


removed this code

tmcneil-mdb · 2025-10-27T19:21:53Z

+        SuccessResponse[List[dict]]: A response object containing movies with their most recent comments.
+"""
+
+@router.get("/aggregate/recent-commented", response_model=SuccessResponse[List[dict]])


same as above.
I think the endpoint is /api/movies/reportingByYear

tmcneil-mdb · 2025-10-27T19:28:17Z

+            object_id = ObjectId(movie_id)
+            pipeline[0]["$match"]["_id"] = object_id
+        except Exception:
+            raise HTTPException(status_code=400, detail="Invalid movie_id format")


Standardize error response.

tmcneil-mdb · 2025-10-27T19:28:48Z

+    ])
+
+    # Execute the aggregation
+    results = await execute_aggregation(pipeline)


try / except.
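A sketch of the requested wrapper. `execute_aggregation` is the helper named in the diff; here it is passed in as a parameter so the example runs without a database. `run_report` and the response fields are hypothetical. In real code the `except` clause would probably be narrowed to driver errors (PyMongo's derive from `pymongo.errors.PyMongoError`).

```python
import asyncio
import logging

logger = logging.getLogger("movies")


async def run_report(execute_aggregation, pipeline):
    """Run a pipeline and convert failures into an error payload."""
    try:
        results = await execute_aggregation(pipeline)
    except Exception as exc:  # broad on purpose in this sketch
        # Log the traceback instead of using print() for debugging.
        logger.exception("Aggregation failed")
        return {"success": False, "message": "Aggregation failed", "error": str(exc)}
    return {"success": True, "data": results}
```

Using `logger.exception` here also covers the earlier question about replacing the debug `print()` calls with proper logging.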

tmcneil-mdb · 2025-10-27T19:33:40Z

+            "$sort": {"mostRecentCommentDate": -1}
+        },
+        {
+            "$limit": 50 if movie_id else 20


Why not just use limit? You defined it earlier.

good point, i think this is some copilot weirdness i should've caught! fixing

tmcneil-mdb · 2025-10-27T19:34:21Z

+        SuccessResponse[List[dict]]: A response object containing yearly movie statistics.
+"""
+
+@router.get("/aggregate/by-year", response_model=SuccessResponse[List[dict]])


/api/movies/reportingByYear

tmcneil-mdb · 2025-10-27T19:36:01Z

+    ]
+
+    # Execute the aggregation
+    results = await execute_aggregation(pipeline)


try/except

tmcneil-mdb · 2025-10-27T19:39:12Z

+    ]
+
+    # Execute the aggregation
+    results = await execute_aggregation(pipeline)


try / except

tmcneil-mdb · 2025-10-27T19:39:52Z

+        SuccessResponse[List[dict]]: A response object containing director statistics.
+"""
+
+@router.get("/aggregate/directors", response_model=SuccessResponse[List[dict]])


/api/movies/reportingByDirector

tmcneil-mdb

N: I would consider adding more comments to the aggregations to better explain what is happening at each stage, and improving the JSON formatting so it's easier to read.

cbullinger · 2025-10-28T18:37:01Z

+
+@router.post("/", response_model=SuccessResponse[Movie], status_code=201)
+async def create_movie(movie: CreateMovieRequest):
+    # Pydantic will automatically validate the structure


Suggested change

# Pydantic will automatically validate the structure

# Pydantic automatically validates the structure

cbullinger · 2025-10-28T18:52:52Z

+    # Add lookup and additional pipeline stages
+    pipeline.extend([
+        {
+            "$lookup": {
+                "from": "comments",
+                "localField": "_id",
+                "foreignField": "movie_id",
+                "as": "comments"
+            }
+        },
+        {
+            "$match": {
+                "comments": {"$ne": []}
+            }
+        },
+        {
+            "$addFields": {
+                "recentComments": {
+                    "$slice": [
+                        {
+                            "$sortArray": {
+                                "input": "$comments",
+                                "sortBy": {"date": -1}
+                            }
+                        },
+                        limit
+                    ]
+                },
+                "mostRecentCommentDate": {
+                    "$max": "$comments.date"
+                }
+            }
+        },
+        {
+            "$sort": {"mostRecentCommentDate": -1}
+        },
+        {
+            "$limit": limit
+        },
+        {
+            "$project": {
+                "title": 1,
+                "year": 1,
+                "genres": 1,
+                "imdbRating": "$imdb.rating",
+                "recentComments": {
+                    "$map": {
+                        "input": "$recentComments",
+                        "as": "comment",
+                        "in": {
+                            "userName": "$$comment.name",
+                            "userEmail": "$$comment.email",
+                            "text": "$$comment.text",
+                            "date": "$$comment.date"
+                        }
+                    }
+                },
+                "totalComments": {"$size": "$comments"},
+                "_id": 1
+            }
+        }
+    ])


Agree with Taylor that these would all benefit from more comments. Here's an example of how we might document the stages and what kind of info to provide (aggregation is confusing).

Suggested change

# Add a multi-stage aggregation that:
# 1. Filters movies by valid year range
# 2. Joins with comments collection (like SQL JOIN)
# 3. Filters to only movies that have comments
# 4. Sorts comments by date and extracts most recent ones
# 5. Sorts movies by their most recent comment date
# 6. Shapes the final output with transformed comment structure
pipeline = [
    # STAGE 1: $match - Initial Filter
    # Filter movies to only those with valid year data
    # Tip: Use $match early to reduce the initial dataset for better performance
    {
        "$match": {
            "year": {"$type": "number", "$gte": 1800, "$lte": 2030}
        }
    }
]

# Add movie_id filter if provided (optional single movie lookup)
if movie_id:
    try:
        object_id = ObjectId(movie_id)
        # Add _id filter to the existing $match stage
        pipeline[0]["$match"]["_id"] = object_id
    except Exception:
        raise HTTPException(status_code=400, detail="Invalid movie_id format")

# Add remaining pipeline stages
pipeline.extend([
    # STAGE 2: $lookup - Join with the 'comments' Collection
    # This gives each movie document a 'comments' array containing all its comments
    {
        "$lookup": {
            "from": "comments",
            "localField": "_id",
            "foreignField": "movie_id",
            "as": "comments"
        }
    },
    # STAGE 3: $match - Filter Movies with at Least One Comment
    # This reduces the dataset to only movies with user engagement
    {
        "$match": {
            "comments": {"$ne": []}
        }
    },
    # STAGE 4: $addFields - Add New Computed Fields
    {
        "$addFields": {
            # Add computed field 'recentComments' that extracts only the N most recent comments (up to 'limit')
            "recentComments": {
                "$slice": [
                    {
                        "$sortArray": {
                            "input": "$comments",
                            "sortBy": {"date": -1}  # -1 = descending (newest first)
                        }
                    },
                    limit  # Number of comments to keep
                ]
            },
            # Add computed field 'mostRecentCommentDate' that gets the date of the most recent comment (to use in the next $sort stage)
            "mostRecentCommentDate": {
                "$max": "$comments.date"
            }
        }
    },
    # STAGE 5: $sort - Sort Movies by Most Recent Comment Date
    {
        "$sort": {"mostRecentCommentDate": -1}
    },
    # STAGE 6: $limit - Restrict Result Set Size
    # - If querying single movie: return up to 50 results
    # - If querying all movies: return up to 20 results
    # Tip: This prevents overwhelming the client with too much data
    {
        "$limit": 50 if movie_id else 20
    },
    # STAGE 7: $project - Shape Final Response Output
    {
        "$project": {
            # Include basic movie fields
            "title": 1,
            "year": 1,
            "genres": 1,
            "_id": 1,
            # Extract nested field: imdb.rating -> imdbRating
            "imdbRating": "$imdb.rating",
            # Use $map to reshape computed 'recentComments' field with cleaner field names
            "recentComments": {
                "$map": {
                    "input": "$recentComments",
                    "as": "comment",
                    "in": {
                        "userName": "$$comment.name",  # Rename: name -> userName
                        "userEmail": "$$comment.email",  # Rename: email -> userEmail
                        "text": "$$comment.text",  # Keep: text
                        "date": "$$comment.date"  # Keep: date
                    }
                }
            },
            # Calculate the total number of comments into 'totalComments' (not just 'recentComments')
            # Used in display (e.g., "Showing 5 of 127 comments")
            "totalComments": {"$size": "$comments"}
        }
    }
])

thanks corry! i'll add comments to all the agg pipelines before merging
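To make the computed fields easier to reason about, here is a rough plain-Python equivalent of what the `$addFields` stage and the `totalComments` projection produce for one movie after the `$lookup`. `summarize_comments` and the sample data are hypothetical, and the sketch assumes every comment has a `date` and that the list is non-empty (which the preceding `$match` on `comments` ensures).

```python
from datetime import datetime


def summarize_comments(comments, limit):
    """Mirror the computed comment fields for one movie's joined comments."""
    # $sortArray with sortBy {"date": -1}: newest first.
    newest_first = sorted(comments, key=lambda c: c["date"], reverse=True)
    return {
        # $slice: [array, limit] keeps the first `limit` elements.
        "recentComments": newest_first[:limit],
        # $max over "$comments.date": the latest comment date.
        "mostRecentCommentDate": max(c["date"] for c in comments),
        # $size counts every joined comment, not just the recent ones.
        "totalComments": len(comments),
    }


comments = [
    {"name": "Ann", "date": datetime(2020, 1, 5)},
    {"name": "Bo", "date": datetime(2021, 3, 1)},
    {"name": "Cy", "date": datetime(2019, 7, 9)},
]
summary = summarize_comments(comments, limit=2)
```

Here `summary["recentComments"]` holds Bo's and Ann's comments in that order, while `totalComments` still reports all three.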

- #1 Movie Cards: Make entire card clickable, enforce consistent heights, tone down checkbox
- #2 Top Toolbar: Remove batch buttons, add contextual bottom selection bar
- #3 Filters Bar: Replace mint/green with neutral gray borders and backgrounds
- #4 Navbar: Remove full-width green border and animated underline effect
- #5 Aggregations: Use light gray for row hover, tone down comment pills and show more button
- Additional: Remove bright green border from aggregations section headers

All changes improve visual hierarchy and reduce competing visual elements per reviewer feedback on PR #75.

add new methods

5ba4f9e

shuangela changed the title ~~Crud and Aggregations for Python~~ feat: crud functionality and aggregations for python backend Oct 24, 2025

shuangela added 2 commits October 24, 2025 15:20

remove comment line

1731299

add comment for vector search

303fecc

shuangela requested review from cbullinger, jordan-smith721 and tmcneil-mdb October 24, 2025 20:00

cbullinger requested changes Oct 24, 2025

tmcneil-mdb requested changes Oct 24, 2025

tmcneil-mdb reviewed Oct 27, 2025

shuangela added 3 commits October 27, 2025 15:40

feedback

e6942b2

pr feedback

aa1db24

remove unneeded imports

0362be7

shuangela requested review from cbullinger and tmcneil-mdb October 27, 2025 20:42

tmcneil-mdb approved these changes Oct 28, 2025

shuangela changed the base branch from main to development October 28, 2025 18:20

cbullinger reviewed Oct 28, 2025

add agg stage comments

1cc125a

fix broken code

49b967a

shuangela merged commit f884b21 into development Oct 28, 2025
1 check passed

shuangela mentioned this pull request Nov 6, 2025

feat: Clean up sample app and finalize endpoints #26

Merged

dacharyc deleted the crud-aggregations-python branch November 10, 2025 14:50
