[WX-1675] Record cloud quota delay to GroupMetrics table#7501
salonishah11 merged 23 commits into develop
New actor whose purpose is to receive messages that:
- record which group has run into quota exhaustion
- ask whether a given group is quota exhausted (this will be implemented in a follow-up PR)
 * Checks if the job has run into any cloud quota exhaustion and records it to GroupMetrics table
 * @param runStatus The run status
 */
def checkAndRecordQuotaExhaustion(runStatus: StandardAsyncRunState): Unit = ()
Backend-agnostic method; each backend can implement its own way of detecting quota exhaustion. Currently it is only implemented in PipelinesApiAsyncBackendJobExecutionActor.scala.
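To illustrate the hook described above, here is a minimal sketch in plain Scala. Everything except the method name checkAndRecordQuotaExhaustion and the AwaitingCloudQuota state name is hypothetical: RunState is a simplified stand-in for StandardAsyncRunState, and the record callback stands in for sending a message to the metrics actor. It is not the code from this PR.

```scala
object QuotaHookSketch {
  // Simplified, hypothetical stand-in for StandardAsyncRunState.
  sealed trait RunState
  case object Running extends RunState
  case object AwaitingCloudQuota extends RunState

  trait AsyncBackendSketch {
    // Default is a no-op, so backends without quota detection need no changes.
    def checkAndRecordQuotaExhaustion(runStatus: RunState): Unit = ()
  }

  // A backend that does not override the hook.
  class NoQuotaBackend extends AsyncBackendSketch

  // A backend that detects quota exhaustion from its run state.
  // `record` is a hypothetical callback standing in for a message send.
  class QuotaAwareBackend(record: String => Unit, group: String) extends AsyncBackendSketch {
    override def checkAndRecordQuotaExhaustion(runStatus: RunState): Unit =
      runStatus match {
        case AwaitingCloudQuota => record(group)
        case _                  => ()
      }
  }
}
```

The default implementation keeps the base trait free of backend-specific knowledge; only a backend that can recognize the quota state overrides it.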
lazy val groupMetricsActor: ActorRef =
  context.actorOf(GroupMetricsActor.props(EngineServicesStore.engineDatabaseInterface))
This is where the singleton GroupMetricsActor is created; it is then passed through other actors until it reaches the backend actor.
Not sure how this was created, will delete it.
aednichols
left a comment
I think this branch would be a good mob-time demo. I'm curious to see it in action, but I also don't want to deal with migrating/resetting my local Cromwell DB.
Have you thought about where you'd like to place the conversion from "quota timestamp is recent" to Boolean "quota is exhausted"?
I was thinking of quotaExhaustionForGroupId in the component, but maybe that's too low level and it should wait for the next PR.
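One way the "timestamp is recent" to Boolean conversion could look, wherever it ends up living, is sketched below. The function name, the Option-based signature, and the window parameter are all assumptions made for illustration; the PR does not define a threshold.

```scala
import java.time.{Duration, Instant}

object QuotaRecencySketch {
  // Hypothetical helper: a group counts as quota exhausted if its last
  // recorded exhaustion timestamp is no older than `window` relative to `now`.
  // A group with no recorded timestamp is treated as not exhausted.
  def isQuotaExhausted(lastSeen: Option[Instant], now: Instant, window: Duration): Boolean =
    lastSeen.exists(ts => Duration.between(ts, now).compareTo(window) <= 0)
}
```

Passing `now` in explicitly, rather than calling Instant.now() inside the function, keeps the conversion deterministic and easy to test.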
case 0 => dataAccess.groupMetricsEntryIdsAutoInc += groupMetricsEntry
case _ => assertUpdateCount("recordGroupMetricsEntry", updateCount, 1)
I hadn't seen this pattern before but I see that it's common in our DB. TIL.
If you had a different pattern in mind to achieve this upsert let me know and I am happy to look into it!
That was not a leading question, it seems reasonable and I don't know of anything better.
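For readers unfamiliar with the update-then-insert upsert discussed in this thread, here is a minimal sketch of the control flow using an in-memory map in place of the database table. The map, the update and insert helpers, and the Long timestamp are hypothetical stand-ins; the real code runs against the database and would need to execute transactionally, which this sketch ignores.

```scala
import scala.collection.mutable

object UpsertSketch {
  // In-memory stand-in for the table: group id -> exhaustion timestamp (epoch millis).
  val table: mutable.Map[String, Long] = mutable.Map.empty

  // Mimics a database UPDATE by returning the number of rows affected.
  def update(group: String, ts: Long): Int =
    if (table.contains(group)) { table.update(group, ts); 1 } else 0

  // Mimics a database INSERT.
  def insert(group: String, ts: Long): Unit = table.update(group, ts)

  // Try the update first; insert only when no existing row was touched.
  def recordGroupMetricsEntry(group: String, ts: Long): Unit =
    update(group, ts) match {
      case 0 => insert(group, ts)
      case n => require(n == 1, s"expected exactly 1 updated row, got $n")
    }
}
```

Calling recordGroupMetricsEntry twice for the same group leaves a single row holding the later timestamp.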
@aednichols I was thinking probably in the |
Currently this is only a simple test file that checks that the actor receives the messages sent to it and calls the appropriate database method. In the follow-up PR, GroupMetricsActor will handle another message for determining whether a group is quota exhausted, and I expect this test file to gain more useful tests then.
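The kind of check described here can be sketched without Akka by swapping the database for a recording stub. The GroupMetricsDb trait, RecordingDb, and Handler below are hypothetical simplifications; only the RecordGroupQuotaExhaustion message name comes from this PR, and the real test exercises the actual actor.

```scala
import scala.collection.mutable.ListBuffer

object GroupMetricsHandlerTestSketch {
  // Hypothetical, narrowed view of the database interface.
  trait GroupMetricsDb {
    def recordGroupMetricsEntry(group: String): Unit
  }

  // Stub that remembers every group it was asked to record.
  class RecordingDb extends GroupMetricsDb {
    val recorded: ListBuffer[String] = ListBuffer.empty
    def recordGroupMetricsEntry(group: String): Unit = { recorded += group; () }
  }

  final case class RecordGroupQuotaExhaustion(group: String)

  // Plain-Scala stand-in for the actor's receive block.
  class Handler(db: GroupMetricsDb) {
    def receive(msg: Any): Unit = msg match {
      case RecordGroupQuotaExhaustion(group) => db.recordGroupMetricsEntry(group)
      case _                                 => () // unknown messages are ignored
    }
  }
}
```

A test then sends a message to the handler and asserts on what the stub recorded.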
override def receive: Receive = {
  case RecordGroupQuotaExhaustion(group) =>
    log.info(s"Recording quota exhaustion for group '$group'.")
From discussion - remove this log line
jgainerdewar
left a comment
LGTM once that log line is removed. Thanks for the walkthrough!
…dinstitute/cromwell into sps_record_quota_exhaustion merge origin to local
Jira: https://broadworkbench.atlassian.net/browse/WX-1675
Description
- Records to the GroupMetrics table when a job runs into the AwaitingCloudQuota state
- Adds GroupMetricsActor, which currently takes in a RecordGroupQuotaExhaustion message to record the group that ran into quota exhaustion into the new table
Example screenshot:

Release Notes Confirmation
CHANGELOG.md
- CHANGELOG.md in this PR
- CHANGELOG.md because it doesn't impact community users
Terra Release Notes