[Cosmos] Adds support for non streaming ORDER BY #35468

Merged
simorenoh merged 69 commits into Azure:main from simorenoh:vector-search-query
May 15, 2024

Conversation

@simorenoh simorenoh commented May 3, 2024

Python follow-up to the .NET PR: Azure/azure-cosmos-dotnet-v3#4362

Using the flag for nonStreamingOrderBy that is now present in the query plan, we choose to create a separate query execution context for these types of operations.

The process starts as a normal order-by query, creating one document producer per physical partition involved. However, since there are no ordering guarantees in this case, we first need to drain the results from these document producers. The current approach is to initialize a priority queue that serves as the ordering mechanism: it receives one document producer's batch of items at a time and is re-balanced as each new document producer is processed. As a result, we hold at most 2*items_per_partition items in memory at any given time. Once everything is drained, we return a priority queue containing only the top k items.
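The drain-and-trim idea described above can be illustrated with a minimal sketch. This is not the SDK's implementation: the function name `drain_top_k`, its parameters, and the sample documents are hypothetical, and each partition's results are assumed to arrive as a plain list rather than through a document producer. It only shows how merging one batch at a time and trimming back to k keeps the working set bounded.

```python
import heapq
from itertools import chain
from typing import Any, Callable, Iterable, List


def drain_top_k(
    partition_batches: Iterable[List[dict]],
    k: int,
    sort_key: Callable[[dict], Any],
) -> List[dict]:
    """Return the k smallest items by sort_key across all partition batches.

    Hypothetical helper: each element of partition_batches stands in for the
    items drained from one partition's document producer.
    """
    top: List[dict] = []
    for batch in partition_batches:
        # Merge the running top-k with this partition's batch, then trim back
        # to k, so at most k + len(batch) items are held at once.
        top = heapq.nsmallest(k, chain(top, batch), key=sort_key)
    return top


# Made-up sample data: three partitions with unordered results.
batches = [
    [{"id": "a", "score": 0.9}, {"id": "b", "score": 0.2}],
    [{"id": "c", "score": 0.5}, {"id": "d", "score": 0.1}],
    [{"id": "e", "score": 0.3}],
]
result = drain_top_k(batches, k=3, sort_key=lambda item: item["score"])
print([item["id"] for item in result])  # ['d', 'b', 'e']
```

`heapq.nsmallest` is used here for brevity; the sketch assumes an ascending sort, and a descending ORDER BY would use `heapq.nlargest` instead.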

This PR includes changes for the following:

  • sync client support for dealing with non streaming order by
  • async client support for dealing with non streaming order by
  • samples
  • tests
  • readme

This branch was built on top of #34882, which includes only the changes for the vector policies; please ignore those files.

@github-actions github-actions Bot added the Cosmos label May 3, 2024
@azure-sdk
Collaborator

API change check

APIView has identified API-level changes in this PR and created the following API reviews.

azure-cosmos

Comment thread sdk/cosmos/azure-cosmos/azure/cosmos/_execution_context/query_execution_info.py Outdated
Comment thread sdk/cosmos/azure-cosmos/azure/cosmos/_execution_context/document_producer.py Outdated
Comment thread sdk/cosmos/azure-cosmos/azure/cosmos/_execution_context/document_producer.py Outdated
@simorenoh simorenoh marked this pull request as ready for review May 3, 2024 19:16
Comment thread sdk/cosmos/azure-cosmos/CHANGELOG.md Outdated
@simorenoh
Member Author

/azp run python - cosmos - tests

@azure-pipelines

Azure Pipelines successfully started running 1 pipeline(s).
