feat: crud functionality and aggregations for python backend#5
feat: crud functionality and aggregations for python backend#5shuangela merged 8 commits intodevelopmentfrom
Conversation
cbullinger
left a comment
There was a problem hiding this comment.
a couple comments/questions. nice job!
| @router.get("/{id}", response_model=SuccessResponse[Movie]) | ||
| async def get_movie_by_id(id: str): | ||
| # Validate ObjectId format | ||
| object_id = ObjectId(id) |
There was a problem hiding this comment.
should we wrap this in a try-except? i.e. what happens if id isn't valid?
| print(f"Database name: {db.name if hasattr(db, 'name') else 'unknown'}") | ||
| print(f"Collection name: movies") | ||
|
|
||
| # For motor (async MongoDB driver), we need to await the aggregate call |
There was a problem hiding this comment.
I thought we weren't using Motor (instead using PyMongo's native async)
| # For motor (async MongoDB driver), we need to await the aggregate call | |
| # For async PyMongo driver, we need to await the aggregate call |
There was a problem hiding this comment.
Correct, we aren't using motor. Just async within the PyMongo driver.
There was a problem hiding this comment.
i am not, i think copilot added this comment for some reason despite not using motor 😓
|
|
||
| # For motor (async MongoDB driver), we need to await the aggregate call | ||
| cursor = await db.movies.aggregate(pipeline) | ||
| results = await cursor.to_list(length=None) # Convert cursor to list |
There was a problem hiding this comment.
do we want to point out why we're using to_list() for aggregations vs. async for for find queries (what you did in lines 105-108)
| cursor = await db.movies.aggregate(pipeline) | ||
| results = await cursor.to_list(length=None) # Convert cursor to list | ||
|
|
||
| print(f"Aggregation returned {len(results)} results") # Debug logging |
There was a problem hiding this comment.
is there a ticket to add proper logging?
There was a problem hiding this comment.
I don't believe there's an official logging ticket, this was just my logging for my own testing purposes locally. I can remove it if that makes the code cleaner.
tmcneil-mdb
left a comment
There was a problem hiding this comment.
Great job!
These are some minor changes. Mostly to keep the code similar and adding in validation. I didn't get to the end of the file. I will get to find and delete on Monday.
I havent written an aggregation yet, so I might leave those for now. I will ping you, if I get to them.
| object_id = ObjectId(id) | ||
|
|
||
| # Use findOne() to get a single document by _id | ||
| movie = await db.movies.find_one({"_id": object_id}) |
There was a problem hiding this comment.
To grab the db, I added a function called get_collection from mongo_client.py file to make unit testing easier later. I am calling the db like this in the rest of the functions:
movies_collection = get_collection("movies")
| genre (str): The genre of the movie. | ||
| year (int): The year the movie was released. | ||
| min_rating (float): The minimum IMDB rating. | ||
| max_rating (float): The maximum IMDB rating. |
There was a problem hiding this comment.
The request body for this the CreateMovieRequest object.
| result = await db.movies.insert_one(movie_data) | ||
|
|
||
| # Retrieve the created document to return complete data | ||
| created_movie = await db.movies.find_one({"_id": result.inserted_id}) |
There was a problem hiding this comment.
We need to verify that the document was created before querying it. A check that result is acknowledged.
| SuccessResponse[Movie]: A response object containing the created movie data. | ||
| """ | ||
|
|
||
| @router.post("/", response_model=SuccessResponse[CreateMovieRequest], status_code=201) |
There was a problem hiding this comment.
Should be:
response_model=SuccessResponse[Movie]
We are returning the movie
|
|
||
| @router.delete("/{id}", response_model=SuccessResponse[dict]) | ||
| async def delete_movie_by_id(id: str): | ||
| object_id = ObjectId(id) |
There was a problem hiding this comment.
Wrap in a try/catch. Id might not be valid.
| result = await db.movies.delete_one({"_id": object_id}) | ||
|
|
||
| if result.deleted_count == 0: | ||
| raise HTTPException(status_code=404, detail="Movie not found") |
There was a problem hiding this comment.
Lets use the create_error_response() to keep the errors consistent.
| object_id = ObjectId(id) | ||
|
|
||
| # Use deleteOne() to remove a single document | ||
| result = await db.movies.delete_one({"_id": object_id}) |
There was a problem hiding this comment.
Same comment as above about accessing the db.
|
|
||
| @router.delete("/{id}/find-and-delete", response_model=SuccessResponse[Movie]) | ||
| async def find_and_delete_movie(id: str): | ||
| object_id = ObjectId(id) |
| deleted_movie = await db.movies.find_one_and_delete({"_id": object_id}) | ||
|
|
||
| if deleted_movie is None: | ||
| raise HTTPException(status_code=404, detail="Movie not found") |
There was a problem hiding this comment.
convert to our standard error response
| SuccessResponse[List[dict]]: A response object containing aggregated genre statistics. | ||
| """ | ||
|
|
||
| @router.get("/aggregate/by-genre", response_model=SuccessResponse[List[dict]]) |
There was a problem hiding this comment.
Not sure if we are doing by genre?
Either way the endpoint should be /api/movies/reportingByGenre
I'm not against using /aggregate. I think that would look nicer, but its a change we all have to agree on.
There was a problem hiding this comment.
good point, i will remove
| } | ||
| ] | ||
|
|
||
| # Execute the aggregation |
There was a problem hiding this comment.
removed this code
| SuccessResponse[List[dict]]: A response object containing movies with their most recent comments. | ||
| """ | ||
|
|
||
| @router.get("/aggregate/recent-commented", response_model=SuccessResponse[List[dict]]) |
There was a problem hiding this comment.
same as above.
I think the endpoint is /api/movies/reportingByYear
| object_id = ObjectId(movie_id) | ||
| pipeline[0]["$match"]["_id"] = object_id | ||
| except Exception: | ||
| raise HTTPException(status_code=400, detail="Invalid movie_id format") |
There was a problem hiding this comment.
Standardize error response.
| ]) | ||
|
|
||
| # Execute the aggregation | ||
| results = await execute_aggregation(pipeline) |
| "$sort": {"mostRecentCommentDate": -1} | ||
| }, | ||
| { | ||
| "$limit": 50 if movie_id else 20 |
There was a problem hiding this comment.
Why not just use limit? You defined it earlier.
There was a problem hiding this comment.
good point, i think this is some copilot weirdness i should've caught! fixing
| SuccessResponse[List[dict]]: A response object containing yearly movie statistics. | ||
| """ | ||
|
|
||
| @router.get("/aggregate/by-year", response_model=SuccessResponse[List[dict]]) |
There was a problem hiding this comment.
/api/movies/reportingByYear
| ] | ||
|
|
||
| # Execute the aggregation | ||
| results = await execute_aggregation(pipeline) |
| ] | ||
|
|
||
| # Execute the aggregation | ||
| results = await execute_aggregation(pipeline) |
| SuccessResponse[List[dict]]: A response object containing director statistics. | ||
| """ | ||
|
|
||
| @router.get("/aggregate/directors", response_model=SuccessResponse[List[dict]]) |
There was a problem hiding this comment.
/api/movies/reportingByDirector
tmcneil-mdb
left a comment
There was a problem hiding this comment.
N: I would consider adding more comments to the aggregations to better explain what is happening at the stages & improving the JSON formatting so its easier to read.
|
|
||
| @router.post("/", response_model=SuccessResponse[Movie], status_code=201) | ||
| async def create_movie(movie: CreateMovieRequest): | ||
| # Pydantic will automatically validate the structure |
There was a problem hiding this comment.
| # Pydantic will automatically validate the structure | |
| # Pydantic automatically validates the structure |
| # Add lookup and additional pipeline stages | ||
| pipeline.extend([ | ||
| { | ||
| "$lookup": { | ||
| "from": "comments", | ||
| "localField": "_id", | ||
| "foreignField": "movie_id", | ||
| "as": "comments" | ||
| } | ||
| }, | ||
| { | ||
| "$match": { | ||
| "comments": {"$ne": []} | ||
| } | ||
| }, | ||
| { | ||
| "$addFields": { | ||
| "recentComments": { | ||
| "$slice": [ | ||
| { | ||
| "$sortArray": { | ||
| "input": "$comments", | ||
| "sortBy": {"date": -1} | ||
| } | ||
| }, | ||
| limit | ||
| ] | ||
| }, | ||
| "mostRecentCommentDate": { | ||
| "$max": "$comments.date" | ||
| } | ||
| } | ||
| }, | ||
| { | ||
| "$sort": {"mostRecentCommentDate": -1} | ||
| }, | ||
| { | ||
| "$limit": limit | ||
| }, | ||
| { | ||
| "$project": { | ||
| "title": 1, | ||
| "year": 1, | ||
| "genres": 1, | ||
| "imdbRating": "$imdb.rating", | ||
| "recentComments": { | ||
| "$map": { | ||
| "input": "$recentComments", | ||
| "as": "comment", | ||
| "in": { | ||
| "userName": "$$comment.name", | ||
| "userEmail": "$$comment.email", | ||
| "text": "$$comment.text", | ||
| "date": "$$comment.date" | ||
| } | ||
| } | ||
| }, | ||
| "totalComments": {"$size": "$comments"}, | ||
| "_id": 1 | ||
| } | ||
| } | ||
| ]) |
There was a problem hiding this comment.
Agree with Taylor that these would all benefit from more comments. Here's an example of how we might document the stages and what kind of info to provide (aggregation is confusing)
| # Add lookup and additional pipeline stages | |
| pipeline.extend([ | |
| { | |
| "$lookup": { | |
| "from": "comments", | |
| "localField": "_id", | |
| "foreignField": "movie_id", | |
| "as": "comments" | |
| } | |
| }, | |
| { | |
| "$match": { | |
| "comments": {"$ne": []} | |
| } | |
| }, | |
| { | |
| "$addFields": { | |
| "recentComments": { | |
| "$slice": [ | |
| { | |
| "$sortArray": { | |
| "input": "$comments", | |
| "sortBy": {"date": -1} | |
| } | |
| }, | |
| limit | |
| ] | |
| }, | |
| "mostRecentCommentDate": { | |
| "$max": "$comments.date" | |
| } | |
| } | |
| }, | |
| { | |
| "$sort": {"mostRecentCommentDate": -1} | |
| }, | |
| { | |
| "$limit": limit | |
| }, | |
| { | |
| "$project": { | |
| "title": 1, | |
| "year": 1, | |
| "genres": 1, | |
| "imdbRating": "$imdb.rating", | |
| "recentComments": { | |
| "$map": { | |
| "input": "$recentComments", | |
| "as": "comment", | |
| "in": { | |
| "userName": "$$comment.name", | |
| "userEmail": "$$comment.email", | |
| "text": "$$comment.text", | |
| "date": "$$comment.date" | |
| } | |
| } | |
| }, | |
| "totalComments": {"$size": "$comments"}, | |
| "_id": 1 | |
| } | |
| } | |
| ]) | |
| # Add a multi-stage aggregation that: | |
| # 1. Filters movies by valid year range | |
| # 2. Joins with comments collection (like SQL JOIN) | |
| # 3. Filters to only movies that have comments | |
| # 4. Sorts comments by date and extracts most recent ones | |
| # 5. Sorts movies by their most recent comment date | |
| # 6. Shapes the final output with transformed comment structure | |
| pipeline = [ | |
| # STAGE 1: $match - Initial Filter | |
| # Filter movies to only those with valid year data | |
| # Tip: Use $match early to reduce the initial dataset for better performance | |
| { | |
| "$match": { | |
| "year": {"$type": "number", "$gte": 1800, "$lte": 2030} | |
| } | |
| } | |
| ] | |
| # Add movie_id filter if provided (optional single movie lookup) | |
| if movie_id: | |
| try: | |
| object_id = ObjectId(movie_id) | |
| # Add _id filter to the existing $match stage | |
| pipeline[0]["$match"]["_id"] = object_id | |
| except Exception: | |
| raise HTTPException(status_code=400, detail="Invalid movie_id format") | |
| # Add remaining pipeline stages | |
| pipeline.extend([ | |
| # STAGE 2: $lookup - Join with the 'comments' Collection | |
| # This gives each movie document a 'comments' array containing all its comments | |
| { | |
| "$lookup": { | |
| "from": "comments", | |
| "localField": "_id", | |
| "foreignField": "movie_id", | |
| "as": "comments" | |
| } | |
| }, | |
| # STAGE 3: $match - Filter Movies with at Least One Comment | |
| # This helps reduces dataset to only movies with user engagement | |
| { | |
| "$match": { | |
| "comments": {"$ne": []} | |
| } | |
| }, | |
| # STAGE 4: $addFields - Add New Computed Fields | |
| { | |
| "$addFields": { | |
| # Add computed field 'recentComments' that extracts only the N most recent comments (up to 'limit') | |
| "recentComments": { | |
| "$slice": [ | |
| { | |
| "$sortArray": { | |
| "input": "$comments", | |
| "sortBy": {"date": -1} # -1 = descending (newest first) | |
| } | |
| }, | |
| limit # Number of comments to keep | |
| ] | |
| }, | |
| # Add computed field 'mostRecentCommentDate' that gets the date of the most recent comment (to use in the next $sort stage) | |
| "mostRecentCommentDate": { | |
| "$max": "$comments.date" | |
| } | |
| } | |
| }, | |
| # STAGE 5: $sort - Sort Movies by Most Recent Comment Date | |
| { | |
| "$sort": {"mostRecentCommentDate": -1} | |
| }, | |
| # STAGE 6: $limit - Restrict Result Set Size | |
| # - If querying single movie: return up to 50 results | |
| # - If querying all movies: return up to 20 results | |
| # Tip: This prevents overwhelming the client with too much data | |
| { | |
| "$limit": 50 if movie_id else 20 | |
| }, | |
| # STAGE 7: $project - Shape Final Response Output | |
| { | |
| "$project": { | |
| # Include basic movie fields | |
| "title": 1, | |
| "year": 1, | |
| "genres": 1, | |
| "_id": 1, | |
| # Extract nested field: imdb.rating -> imdbRating | |
| "imdbRating": "$imdb.rating", | |
| # Use $map to reshape computed 'recentComments' field with cleaner field names | |
| "recentComments": { | |
| "$map": { | |
| "input": "$recentComments", | |
| "as": "comment", | |
| "in": { | |
| "userName": "$$comment.name", # Rename: name -> userName | |
| "userEmail": "$$comment.email", # Rename: email -> userEmail | |
| "text": "$$comment.text", # Keep: text | |
| "date": "$$comment.date" # Keep: date | |
| } | |
| } | |
| }, | |
| # Calculate the total number of comments into 'totalComments' (not just 'recentComments') | |
| # Used in display (e.g., "Showing 5 of 127 comments") | |
| "totalComments": {"$size": "$comments"} | |
| } | |
| } | |
| ]) |
There was a problem hiding this comment.
thanks corry! i'll add comments to all the agg pipelines before merging
- #1 Movie Cards: Make entire card clickable, enforce consistent heights, tone down checkbox - #2 Top Toolbar: Remove batch buttons, add contextual bottom selection bar - #3 Filters Bar: Replace mint/green with neutral gray borders and backgrounds - #4 Navbar: Remove full-width green border and animated underline effect - #5 Aggregations: Use light gray for row hover, tone down comment pills and show more button - Additional: Remove bright green border from aggregations section headers All changes improve visual hierarchy and reduce competing visual elements per reviewer feedback on PR #75.
Basic crud functionality and aggregation pipelines for python backend
This PR introduces basic CRUD functionality and aggregation pipelines for the FastAPI application. It sets up find, insert, delete, and find and delete operations. It also adds aggregations for movies by year, most recent comments (joining the movies collection with the comments collection), and by director.
Key Changes
Testing