https://github.com/JonChesterfield approved this pull request.

Good fix. We might want a different strategy for host threads later but staying with one per device looks good for now.

https://github.com/llvm/llvm-project/pull/126067