[llvm] Attempt to fix libc++ actions runner restarter. (PR #120627)
via llvm-commits
llvm-commits at lists.llvm.org
Thu Dec 19 11:42:05 PST 2024
https://github.com/EricWF created https://github.com/llvm/llvm-project/pull/120627
It appears that introducing docker containers has broken the restarter
job since additional failure messages appear with the preemption
messages.
This should get jobs restarting on preemption again, but it may also
restart jobs that contain unrelated failures.
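The per-job behavior that the patch comments describe as the desired approach (ignore a failure message only when it appears on the same job as a preemption message) could be sketched roughly as follows. This is an illustration under stated assumptions, not the code in this PR: `shouldRestart`, the `jobs` input shape, and both regexes are hypothetical placeholders, and the real workflow defines its own patterns and fetches annotations through the GitHub API.

```javascript
// Hypothetical patterns for illustration only; the real workflow has its own.
const PREEMPTION_REGEX = /shutdown signal/;
const FAILURE_REGEX = /exit code 1/;

// jobs: array of { name, annotations: [{ annotation_level, message }] }
// (an assumed shape, one entry per job with its annotations already fetched).
function shouldRestart(jobs) {
  let sawPreemption = false;
  for (const job of jobs) {
    let preempted = false;
    let failureMessage = null;
    for (const annotation of job.annotations) {
      if (annotation.annotation_level !== 'failure') {
        continue;
      }
      if (PREEMPTION_REGEX.test(annotation.message)) {
        preempted = true;
      } else if (FAILURE_REGEX.test(annotation.message)) {
        failureMessage = annotation.message;
      }
    }
    // A failure counts against restarting only when its own job was not preempted.
    if (failureMessage !== null && !preempted) {
      return { restart: false, reason: failureMessage };
    }
    sawPreemption = sawPreemption || preempted;
  }
  // Restart only if at least one job was actually preempted.
  return { restart: sawPreemption, reason: null };
}
```

Tracking the two flags per job rather than per workflow is what separates this sketch from the patch, which keeps a single pair of flags and so may skip all failures once any preemption message is seen.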
From 5c33c2dd9e0a9b135b6dde46ec559cc42e8d835e Mon Sep 17 00:00:00 2001
From: Eric Fiselier <eric at efcs.ca>
Date: Thu, 19 Dec 2024 14:40:40 -0500
Subject: [PATCH] Attempt to fix libc++ actions runner restarter.
It appears that introducing docker containers has broken the restarter
job since additional failure messages appear with the preemption
messages.
This should get jobs restarting on preemption again, but it may also
restart jobs that contain unrelated failures.
---
.../libcxx-restart-preempted-jobs.yaml | 37 +++++++++++++++----
1 file changed, 30 insertions(+), 7 deletions(-)
diff --git a/.github/workflows/libcxx-restart-preempted-jobs.yaml b/.github/workflows/libcxx-restart-preempted-jobs.yaml
index 82d84c01c92af2..b27debd0e6fe71 100644
--- a/.github/workflows/libcxx-restart-preempted-jobs.yaml
+++ b/.github/workflows/libcxx-restart-preempted-jobs.yaml
@@ -92,6 +92,12 @@ jobs:
check_run_id: check_run_id
})
+ // For temporary debugging purposes to see the structure of the annotations.
+ console.log(annotations);
+
+ has_failed_job = false;
+ saved_failure_message = null;
+
for (annotation of annotations.data) {
if (annotation.annotation_level != 'failure') {
continue;
@@ -106,15 +112,32 @@ jobs:
const failure_match = annotation.message.match(failure_regex);
if (failure_match != null) {
- // We only want to restart the workflow if all of the failures were due to preemption.
- // We don't want to restart the workflow if there were other failures.
- core.notice('Choosing not to rerun workflow because we found a non-preemption failure' +
- 'Failure message: "' + annotation.message + '"');
- await create_check_run('skipped', 'Choosing not to rerun workflow because we found a non-preemption failure\n'
- + 'Failure message: ' + annotation.message)
- return;
+ has_failed_job = true;
+ saved_failure_message = annotation.message;
}
}
+ if (has_failed_job && !has_preempted_job) {
+ // We only want to restart the workflow if all of the failures were due to preemption.
+ // We don't want to restart the workflow if there were other failures.
+ //
+ // However, libcxx runners running inside docker containers produce both a preemption message and failure message.
+ //
+ // The desired approach is to ignore failure messages which appear on the same job as a preemption message
+ // (A job is a single run with a specific configuration, e.g. generic-gcc, gcc-14).
+ //
+ // However, it's unclear that this code achieves the desired approach, and it may ignore all failures
+ // if a preemption message is found at all on any run.
+ //
+ // For now, it's more important to restart preempted workflows than to avoid restarting workflows with
+ // non-preemption failures.
+ //
+ // TODO Figure this out.
+ core.notice('Choosing not to rerun workflow because we found a non-preemption failure. ' +
+ 'Failure message: "' + saved_failure_message + '"');
+ await create_check_run('skipped', 'Choosing not to rerun workflow because we found a non-preemption failure\n'
+ + 'Failure message: ' + saved_failure_message)
+ return;
+ }
}
if (!has_preempted_job) {