Pull requests: kubeflow/trainer
Pull requests list
chore(deps): Bump Go 1.25, k8s v1.35, and controller-runtime v0.23.1
dependencies
Pull requests that update a dependency file
size/XL
#3127
opened Jan 26, 2026 by
andreyvelich
feat(runtimes): add support for ClusterTrainingRuntimes in Helm chart
size/XL
#3124
opened Jan 24, 2026 by
khushiiagrawal
1 task done
fix: allow atomic update of podTemplateOverrides when unsuspending TrainJob
kind/bug
kind/feature
ok-to-test
size/M
#3122
opened Jan 24, 2026 by
NarayanaSabari
feat(docs): KEP-2598 XGBoost Runtime for Trainer V2
size/XL
#3118
opened Jan 22, 2026 by
Krishna-kg732
1 of 5 tasks
chore(deps): bump sigs.k8s.io/controller-runtime from 0.22.4 to 0.23.0 in the kubernetes group
dependencies
go
Pull requests that update Go code
size/L
#3104
opened Jan 20, 2026 by
dependabot
bot
feat: Add CI pipeline to validate manifests and helm chart
size/M
#3101
opened Jan 18, 2026 by
juyterman1000
feat(runtimes): Support Elastic PyTorch in TrainJob
do-not-merge/hold
size/L
#3099
opened Jan 17, 2026 by
SoumyaRaikwar
feat(runtimes): add AMD ROCm torch distributed runtime ref #2335
size/S
#3097
opened Jan 16, 2026 by
JEETDESAI25
feat: add unit tests for config package
size/XL
#3095
opened Jan 15, 2026 by
Denyme24
1 task
chore: add unit tests for configuration loading and controller options
size/M
#3094
opened Jan 14, 2026 by
Goku2099
feat(api): add immutability for TrainingRuntimes types
size/XXL
#3082
opened Jan 11, 2026 by
Misha6Sharma
1 task
feat(examples): add torch.compile to PyTorch local examples
ok-to-test
size/XL
#3076
opened Jan 8, 2026 by
Ishtiyaque-Alam
1 task
chore(examples): add GPU passthrough support to container backend example
size/L
#3075
opened Jan 6, 2026 by
muzzlol
feat(docs): proposal for adding TTLSecondsAfterFinished and ActiveDeadlineSeconds fields to TrainJob CRD
size/L
#3068
opened Jan 5, 2026 by
XploY04
feat: replaced vm runner with test gpu arc from cncf
ok-to-test-gpu-runner
size/XL
#3067
opened Jan 5, 2026 by
jaiakash
1 task done
feat: add TTLSecondsAfterFinished and ActiveDeadlineSeconds fields to TrainJob CRD
size/XL
#3065
opened Jan 4, 2026 by
XploY04
feat: support for Flux Framework as HPC manager
size/XXL
#3064
opened Jan 4, 2026 by
vsoch
1 task done
feat: add production-ready MNIST example for PyTorch
size/XL
#3063
opened Jan 3, 2026 by
Snehadas2005
1 task done
fix(runtimes): propagate Trainer.NumNodes into TemplateSpec (Parallelism/Completions)
kind/bug
ok-to-test
size/L
#3057
opened Dec 24, 2025 by
NarayanaSabari
fix(operator): fix TrainJob suspend/resume webhook error (#3008)
ok-to-test
size/L
#3041
opened Dec 16, 2025 by
JEETDESAI25
feat: Add the manager field to the podTemplateOverride object
ok-to-test
size/L
#3020
opened Dec 4, 2025 by
kaisoz
1 task
feat(examples): Add kubectl-friendly YAML examples for TrainJob and TrainingRuntime
size/XXL
#2925
opened Nov 6, 2025 by
NarayanaSabari
chore: Add comprehensive unit tests for Config API
size/XXL
#2893
opened Oct 16, 2025 by
kapil27
1 task