Add tool for analyzing and reporting random CDash test failures

## Related issues

* [TRILFRAME-614](https://sems-atlassian-son.sandia.gov/jira/browse/TRILFRAME-614): Tool to analyze dashboard failures
* #598 
* #597 
* Trilinos/Trilinos#12391

## Description

Random test failures regularly bring down entire CI iterations and waste resources each time a retest is requested to get a pull request through its checks.

Spotting a randomly failing test requires a lot of manual CDash querying and analysis by the developer. In most cases, a developer does not have the time to trace, identify, and report the randomly failing test, and instead opts to ignore it and request a retest, which wastes resources as described above. This lack of reporting also leads to a bigger issue: the randomly failing test lingers in the code base and continues to affect developers in the future.

## Proposed Solution

This issue proposes a new tool (which for now would live inside TriBITS under [`tribits/ci_support`](https://github.com/TriBITSPub/TriBITS/tree/master/tribits/ci_support)) that runs automatically to query, scrape, and analyze CDash results, and that reports tests deemed to be "randomly failing" to an operations team via email or automated issue creation in the repository.

A randomly failing test is defined as a test that intermittently reports as passing or failing between CI testing iterations without any changes made to the topic or target branch being tested (that is, the topic and target tip SHA1s are the same across those iterations).
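The definition above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the proposed implementation: the `CiTestResult` record, its field names, and the `"Passed"`/`"Failed"` status strings are hypothetical stand-ins for whatever the CDash query actually returns.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class CiTestResult(NamedTuple):
    """One test outcome from one CI iteration (hypothetical record layout)."""
    test_name: str
    topic_sha1: str
    target_sha1: str
    status: str  # assumed to be "Passed" or "Failed"


def find_randomly_failing_tests(results: Iterable[CiTestResult]) -> list[str]:
    """Return the sorted names of tests that both passed and failed while the
    topic and target tip SHA1s stayed the same."""
    # Collect every status seen for each (test, topic SHA1, target SHA1) group.
    statuses_by_key = defaultdict(set)
    for r in results:
        key = (r.test_name, r.topic_sha1, r.target_sha1)
        statuses_by_key[key].add(r.status)

    # A group containing both outcomes means the result changed with no
    # change to the code under test.
    random_failures = {
        name
        for (name, _topic, _target), statuses in statuses_by_key.items()
        if {"Passed", "Failed"} <= statuses
    }
    return sorted(random_failures)
```

A test that fails on one SHA1 pair and passes on a different pair is not flagged, since the code under test changed between those iterations.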

Fortunately, much of the work needed to build this tool in Python already exists inside `tribits/ci_support`. Notably, the module [`CreateIssueTrackerFromCDashQuery.py`](https://github.com/TriBITSPub/TriBITS/blob/master/tribits/ci_support/CreateIssueTrackerFromCDashQuery.py), which is used in the template example [`example_test_failure_github_issue.py`](https://github.com/TriBITSPub/TriBITS/blob/master/test/ci_support/example_test_failure_github_issue.py), and the module [`CDashQueryAnalyzeReport.py`](https://github.com/TriBITSPub/TriBITS/blob/master/tribits/ci_support/CDashQueryAnalyzeReport.py), which contains most of the heavy CDash querying functionality. Thus, once these existing modules are reused, the core remaining work is to implement the algorithm that identifies a random failure in a way that is customizable on a per-project basis.

The goal is for this tool to be able to look for randomly failing tests in any project that posts its test results to CDash. How the tool gathers the version information for the builds in CDash will be unique to each project and will need to be implemented on a per-project basis.
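One possible way to keep the version lookup project-specific is a small registry of extractor functions, sketched below. Everything here is an assumption for illustration: the `notes` key, the `Topic SHA1:` / `Target SHA1:` line format, the project name `ExampleProject`, and the function names are hypothetical and do not reflect any existing TriBITS or CDash API.

```python
import re
from typing import Callable, Mapping, Optional, Tuple

VersionInfo = Tuple[str, str]  # (topic SHA1, target SHA1)
VersionExtractor = Callable[[Mapping[str, str]], Optional[VersionInfo]]

# Hypothetical line format: "Topic SHA1: <hex>" and "Target SHA1: <hex>".
_SHA1_LINE = re.compile(
    r"^(topic|target)\s+sha1\s*:\s*([0-9a-f]{7,40})\s*$",
    re.IGNORECASE | re.MULTILINE,
)


def extract_from_notes(build: Mapping[str, str]) -> Optional[VersionInfo]:
    """Pull the topic/target SHA1s out of a build's free-form notes text."""
    found = {
        kind.lower(): sha
        for kind, sha in _SHA1_LINE.findall(build.get("notes", ""))
    }
    if "topic" in found and "target" in found:
        return found["topic"], found["target"]
    return None  # version info not available for this build


# Each project registers its own extractor (project name is hypothetical).
EXTRACTORS: dict[str, VersionExtractor] = {
    "ExampleProject": extract_from_notes,
}


def get_version_info(project: str, build: Mapping[str, str]) -> Optional[VersionInfo]:
    """Dispatch to the project's extractor; None if the project has none."""
    extractor = EXTRACTORS.get(project)
    return extractor(build) if extractor else None
```

With this shape, supporting a new project only requires adding one function to the registry, while the random-failure analysis itself stays unchanged.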

Ideally, this tool can later be extended to analyze and report randomly failing configures and builds as well as tests. Starting with randomly failing tests should produce a framework that can be reused for those other cases.
 
## Requirements
* ~~Posts a GitHub issue upon identifying a randomly failing test~~ (per TRILFRAME-614, any reporting should start with an email first)
* Is able to query CDash results over a period of time
* All functionality is tested
* Usage is documented
