metal3-io incubating

Provision bare metal hardware via k8s-native APIs, including integration with the Cluster API.

v0.13.1

Changes since v0.13.0

🐛 Bug Fixes

  • Fix reconciling HFS created before BMH and add e2e tests (#3259)

🌱 Others

  • Bump cluster-api to v1.13.3 (#3376)
  • fix dependabots build workflow step (#3367)
  • Use proper IPA cache address (#3359)
  • Bump go.etcd.io/etcd/client/pkg/v3 from 3.6.11 to 3.6.12 (#3323)
  • Bump x/net to v0.55.0 (#3320)
  • Bump Go version to 1.25.11 (#3315)
  • Bump github.com/metal3-io/ironic-standalone-operator/api from 0.8.1 to 0.8.2 in /test (#3295)
  • Bump github.com/metal3-io/ironic-standalone-operator/api from 0.8.1 to 0.8.2 (#3294)
  • Bump github.com/onsi/ginkgo/v2 from 2.28.1 to 2.28.3 in /test (#3231)
  • Bump github.com/onsi/ginkgo/v2 from 2.28.1 to 2.28.3 (#3228)
  • Bump the kubernetes group across 3 directories with 5 updates (#3351)
  • Fix e2e test to use irso v0.9 and ironic v35.0 (#3257)
  • Bump the github-actions group with 2 updates (#3297)
  • remove obsoleted release action (#3234)
  • Bump the capi group across 2 directories with 1 update (#3225)
  • Use hard reboot in firmware settings test (#3316)

The image for this release is: v0.13.1

Thanks to all our contributors! 😊

Open Policy Agent (OPA) graduated

v1.18.0

This release contains a mix of bugfixes and small features. Notably:

  • A breaking fix to the outbound User-Agent header so it conforms to RFC 9110 (see below)
  • Container-aware resource limits: automatic GOMAXPROCS is restored and automatic GOMEMLIMIT is now supported
  • Several opa fmt correctness fixes
  • Improvements to opa test --coverage (ranges in report, inline rule head tracking, conjunction-expression coverage)
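
As a rough illustration of what "container-aware" limits mean, the sketch below derives a CPU worker count and a memory ceiling from cgroup v2 values. The function names, the round-down rule, and the 0.9 memory ratio are assumptions for illustration only; the release notes do not state how OPA computes either value.

```python
import math
from typing import Optional


def cpu_limit_from_cgroup(cpu_max: str) -> Optional[int]:
    """Derive a worker count from the contents of a cgroup v2 cpu.max file.

    The file holds "<quota> <period>" in microseconds, or "max <period>"
    when the container has no CPU limit.
    """
    quota, period = cpu_max.split()
    if quota == "max":
        return None  # no limit set: the caller keeps its own default
    # Assumption: round down, but never go below one worker.
    return max(1, math.floor(int(quota) / int(period)))


def memory_limit_from_cgroup(memory_max: str, ratio: float = 0.9) -> Optional[int]:
    """Derive a soft memory ceiling in bytes from a cgroup v2 memory.max value.

    The ratio is a hypothetical headroom factor, not a documented OPA setting.
    """
    value = memory_max.strip()
    if value == "max":
        return None  # no limit set
    return int(int(value) * ratio)
```

A quota of "200000 100000" yields two workers under these assumptions, while "max 100000" leaves the default untouched.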

Breaking: Fix User-Agent according to RFC 9110 (#8792)

OPA's outbound HTTP requests (bundle, discovery, decision log, status, http.send, AWS KMS/ECR)
previously sent User-Agent: Open Policy Agent/<version> (<os>, <arch>), which is not a valid
RFC 9110 User-Agent value because the product token cannot contain spaces. The header is now
Open-Policy-Agent/<version> (<os>, <arch>). Server-side log filters or WAF rules that
exact-match the old string will need to be updated.
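
For log filters that must keep working during a rollout, one option is to accept both header forms. The sketch below is a minimal example under that assumption; the pattern and the function name parse_opa_user_agent are illustrative and not part of OPA.

```python
import re
from typing import Dict, Optional

# Old form: "Open Policy Agent/<version> (<os>, <arch>)"
# New form: "Open-Policy-Agent/<version> (<os>, <arch>)"
OPA_USER_AGENT = re.compile(
    r"^(?:Open Policy Agent|Open-Policy-Agent)"
    r"/(?P<version>\S+) \((?P<os>[^,]+), (?P<arch>[^)]+)\)$"
)


def parse_opa_user_agent(header: str) -> Optional[Dict[str, str]]:
    """Return version, os, and arch for either header form, or None."""
    match = OPA_USER_AGENT.match(header)
    if match is None:
        return None
    return match.groupdict()
```

Once every OPA instance runs v1.18.0 or later, the old alternative can be dropped from the pattern.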

Authored by @sspaink, reported by @SpecLad

Miscellaneous

  • benchmarks: smaller tweaks (#8759) authored by @srenatus
  • benchmarks: split off script, emit markdown table (#8812) authored by @srenatus
  • benchmarks: use details+summary comments for benchlab results (#8811) authored by @srenatus
  • capabilities: Integrate 1.17.1 patch release (#8798) authored by @sspaink
  • chore: tidy go.mod to remove untagged versions (#8791) authored by @thaJeztah
  • e2e: Add proto schemas for the IR plan and bundle manifest (#8766) reported and authored by @sspaink
  • gha: deduplicate change-detection output in pr CI checks (#8808) authored by @sspaink
  • nightly: use regal@main (#8735) authored by @srenatus
  • workflow: remove tests from docker (edge) image build (#8721) authored by @srenatus
  • workflows: bring back docker edge tags for post-merge (#8718) authored by @srenatus
  • workflows: use go-version-file with actions/setup-go (#8751) authored by @srenatus
  • Dependency updates; notably:
    • build(deps): Add github.com/KimMachineGun/automemlimit v0.7.5
    • build(deps): Add go.uber.org/automaxprocs v1.6.0
    • build(deps): Bump github.com/dgraph-io/badger/v4 from v4.9.1 to v4.9.2
    • build(deps): Bump github.com/vektah/gqlparser/v2 from v2.5.33 to v2.5.34
    • build(deps): Bump go.opentelemetry.io/contrib/bridges/prometheus from v0.68.0 to v0.69.0
    • build(deps): Bump go.opentelemetry.io/contrib/instrumentation/net/http/otelhttp from v0.68.0 to v0.69.0
    • build(deps): Bump go.opentelemetry.io/otel from v1.43.0 to v1.44.0
    • build(deps): Bump go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetricgrpc from v1.43.0 to v1.44.0
    • build(deps): Bump go.opentelemetry.io/otel/exporters/otlp/otlpmetric/otlpmetrichttp from v1.43.0 to v1.44.0
    • build(deps): Bump go.opentelemetry.io/otel/exporters/otlp/otlptrace from v1.43.0 to v1.44.0
    • build(deps): Bump go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracegrpc from v1.43.0 to v1.44.0
    • build(deps): Bump go.opentelemetry.io/otel/exporters/otlp/otlptrace/otlptracehttp from v1.43.0 to v1.44.0
    • build(deps): Bump go.opentelemetry.io/otel/sdk from v1.43.0 to v1.44.0
    • build(deps): Bump go.opentelemetry.io/otel/sdk/metric from v1.43.0 to v1.44.0
    • build(deps): Bump go.opentelemetry.io/otel/trace from v1.43.0 to v1.44.0
    • build(deps): Bump golang.org/x/sync from v0.20.0 to v0.21.0
    • build(deps): Bump golang.org/x/text from v0.37.0 to v0.38.0
    • build(deps): Bump google.golang.org/grpc from v1.81.0 to v1.81.1
    • build(deps): Bump gopkg.in/ini.v1 from v1.67.2 to v1.67.3
    • build(deps): Bump oras.land/oras-go/v2 from v2.6.0 to v2.6.1
    • build(deps): bump golang.org/x/crypto to v0.52.0 and golang.org/x/net to v0.55.0 (#8745) authored by @BGebken
    • build: bump go 1.26.3 -> 1.26.4 (#8726) authored by @srenatus

Atlantis sandbox

Terraform Pull Request Automation for Teams

v0.44.1

This patch release focuses on provider correctness, safer apply behavior, plan/output rendering fixes, and refreshed runtime foundations.

Highlights

  • Broader GitHub Enterprise support: existing GitHub Enterprise Server behavior is preserved, and GitHub Enterprise Cloud *.ghe.com tenants now use the correct REST and GraphQL API endpoint patterns.
  • The Debian-based Atlantis image now uses Debian 13 Trixie with refreshed pinned system packages including curl, git, OpenSSH, GnuPG, OpenSSL, and libcap.
  • Apply safety is improved by failing closed when apply-lock backends cannot be reached and by correctly evaluating pull request status for API apply requirements.
  • Mergeability and status behavior is more accurate across GitHub, GitLab, GitHub App checkout flows, and no-change apply statuses.
  • Terraform/OpenTofu plan output rendering now handles "to forget" statistics, multi-unit summaries, heredoc/multiline-string diffs, and very long single-line command output more reliably.

Provider fixes

  • GitHub: support GitHub Enterprise Cloud *.ghe.com API URL patterns. (#6339)
  • GitHub: truncate status contexts to GitHub's 255-character limit. (#6541)
  • GitHub App: skip unused source remotes on fork-safe checkout paths. (#6568)
  • GitLab: filter mergeability status checks by the merge request ref/SHA. (#6557)
  • GitLab: scope project apply mergeability to the project being applied. (#6543)
  • Azure DevOps: guard nil pull request status values while parsing pull events. (#6496)
  • Gitea: avoid nil response dereferences while logging API errors. (#6442)

Apply, plan, and output fixes

  • Fail closed when the apply lock backend cannot be reached instead of allowing applies to proceed on lock-check errors. (#6533)
  • Populate pull request status before API apply requirement evaluation so the approved and mergeable requirements are enforced correctly. (#6535)
  • Show no-change plans as up to date in apply statuses instead of implying they were applied. (#6498)
  • Preserve and aggregate "to forget" plan statistics. (#6570)
  • Aggregate multi-unit plan summaries correctly. (#6490)
  • Remove stale .tfplan files when refreshing a working directory to a new ref. (#6358)
  • Skip non-git directories when finding pending plans. (#6453)
  • Preserve command output lines longer than 64 KiB. (#6544)
  • Color changed lines inside heredoc and multiline-string diffs. (#6561)
  • Fix a divergence check order issue. (#6452)
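
The fail-closed behavior described for the apply lock can be sketched as follows. This is not Atlantis code: LockBackendError, may_apply, and the check_lock callable are hypothetical names used only to show the decision rule.

```python
from typing import Callable


class LockBackendError(Exception):
    """Hypothetical error raised when the lock backend cannot be reached."""


def may_apply(check_lock: Callable[[], bool]) -> bool:
    """Allow an apply only when the lock check succeeds and finds no lock.

    check_lock is a hypothetical callable that returns True when an apply
    lock is held and raises LockBackendError when the backend is unreachable.
    """
    try:
        locked = check_lock()
    except LockBackendError:
        # Fail closed: an unanswered lock check is treated as a refusal.
        return False
    return not locked
```

The fail-open alternative would return True in the except branch, which is the behavior the fix removes.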

Runtime and maintenance

  • Updated the Debian image base to Debian 13 Trixie. (#6572)
  • Updated OpenTofu and CI/test image digests. (#6549, #6580, #6581)
  • Removed the obsolete check-lint Makefile target. (#6554)
  • Pinned scorecard-related Go and npm commands for OpenSSF compliance. (#6480)
  • Updated website and CI dependencies, including security-related frontend dependency updates. (#6563, #6564, #6565, #6566)

Documentation

  • Clarified GitHub App merge checkout behavior. (#6577)
  • Clarified recent status, locking, VCS, pending-plan, and output-handling behavior. (#6578, #6582)

Full Changelog: v0.44.0...v0.44.1

Keycloak incubating

Keycloak is an open-source identity and access management solution for modern applications and services, built on top of industry security standard protocols.

nightly

[OID4VCI-HAIP] Pass oid4vci-1_0-issuer-batch-issuance

closes #50216

Signed-off-by: Thomas Diesler <tdiesler@proton.me>

* Extend Proofs.create(...) to accept multiple proof values (varargs) and reject empty input.
* Update the basic wallet to generate multiple JWT proofs instead of a single proof
* Add/extend HAIP issuer conformance coverage for batch issuance

Meshery sandbox

As a self-service engineering platform, Meshery enables collaborative design and operation of cloud and cloud native infrastructure.

Meshery v1.0.48

What's New

🖥 Meshery UI

  • chore(k8s-context): remove dead context code and fix persist logging @leecalcote (#20260)

👨🏽‍💻 Contributors

Thank you to our contributors for making this release possible:
@Harishrs2006, @Katotodan, @MrDadhich456, @aabidsofi19, @carlosriosilva, @fitzergerald, @jamieplu, @leecalcote, @marblom007, @pontusringblom, @ritzorama, @simihablo, @winkletinkle and @yi-nuo426

xRegistry sandbox

The xRegistry project defines an abstract model for managing metadata about resources and provides a REST-based interface to discover, create, modify and delete those resources.

dev

Latest development build of the 'xr(server)' executables. The commit pointer and zip/tar files are old; do not use them.

Kube-OVN sandbox

v1.15.16

v1.15.16 (2026-06-26)

Contributors

  • Mengxin Liu
  • renovate[bot]
  • 张祖建

Cozystack sandbox

Cozystack is a free PaaS platform and framework for building private clouds and providing users/customers with managed Kubernetes, KubeVirt-based VMs, databases as a service, NATS, message brokers, etc. with GPU support in VMs and Kubernetes clusters.

v1.4.4

v1.4.4 (2026-06-18)

A patch release that fixes the dashboard build on the release-1.4 branch and ships a new talm v0.31.0 with stability and usability improvements.

Fixes

  • fix(dashboard): pin UI source branch to release-1.4: Patch builds of Cozystack 1.4.x were cloning cozystack-ui@main instead of cozystack-ui@release-1.4, which could pull in unreleased UI features and break the patch release gate entirely. CONSOLE_BRANCH is now pinned to release-1.4 in the dashboard Makefile. The PR also adds documentation explaining that cutting a new minor release (vX.Y.0) requires creating the matching cozystack-ui/release-X.Y branch and pinning CONSOLE_BRANCH in the dashboard Makefile (@myasnikovdaniil in #2945).

Other repositories

talm v0.31.0

  • [talm] feat(cozystack): add DRBD-oriented sysctl and etcd backend defaults: Adds a set of production-tested TCP sysctls (tcp_orphan_retries, tcp_fin_timeout, netdev_max_backlog, netdev_budget, netdev_budget_usecs) to the cozystack preset that prevent TCP port exhaustion during DRBD reconnect storms on node reboots or resyncs. An opt-in tcpKeepaliveTuning group is also available for deployments that need faster idle-socket failure detection across all long-lived TCP connections. The etcd backend quota is exposed as a configurable etcd.quotaBackendBytes value (default 8 GiB) so control planes holding many DRBD-resource CRDs do not trip etcd's 2 GiB NOSPACE alarm (@IvanHunters in cozystack/talm#131).

  • [talm] fix(cli, engine): emit progress on stderr, enrich lookup error chain, retry transient failures: Fixes talm template -f X > Y producing YAML files with a stray progress line on stdout. Lookup failures now surface the resource kind, namespace, id, and dialed endpoints, with a six-class taxonomy of hints (TLS handshake, connection refused, deadline, auth, resource, unknown) so operators receive actionable guidance instead of a raw gRPC error. Transient connectivity classes (refused, deadline) are retried up to three times with exponential backoff; permanent errors fail immediately (@lexfrei in cozystack/talm#212).

  • [talm] fix(engine): require a template before rendering the chart: Running talm template without --file/--template previously rendered the entire chart before checking that any template was actually requested, which in online mode triggered live node discovery and produced a misleading "connection refused" error instead of the actionable "templates are not set" message. The guard is now evaluated before any chart render or node dial (@lexfrei in cozystack/talm#217).

  • [talm] fix(cozystack): exclude loop devices from LVM global_filter: Adds /dev/loop.* to the LVM global_filter in the generated Talos machine config so the host does not scan or activate volume groups inside loop-mounted images, preventing unexpected LVM activation on boot (@kvaps in cozystack/talm#215).
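
The retry policy described for talm lookups (retry transient classes with exponential backoff, fail immediately on permanent ones) can be sketched as below. talm itself is not written in Python, and LookupFailure, lookup_with_retry, the 0.5 s base delay, and reading "up to three times" as three attempts in total are all assumptions for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

# Error classes the release notes describe as transient.
TRANSIENT = {"refused", "deadline"}


class LookupFailure(Exception):
    """Hypothetical lookup error tagged with an error class."""

    def __init__(self, kind: str):
        super().__init__(kind)
        self.kind = kind


def lookup_with_retry(
    lookup: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call lookup, retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return lookup()
        except LookupFailure as err:
            is_last = attempt == attempts - 1
            if err.kind not in TRANSIENT or is_last:
                raise  # permanent error, or retries exhausted
            sleep(base_delay * 2 ** attempt)  # 0.5 s, 1 s, ...
    raise ValueError("attempts must be at least 1")
```

Passing sleep as a parameter keeps the policy testable without real delays.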

Documentation

  • [website] feat(blog): add Cozystack vs OpenStack comparison post: Publishes a vendor-neutral blog post comparing Cozystack and OpenStack across seven dimensions — architecture, compute, networking, storage, managed services, operations, and multi-tenancy — for teams evaluating private cloud options (@tym83 in cozystack/website#580).

  • [website] fix(layout): keep fixed-header offset after banner removal: Restores the correct top-offset for the fixed navigation header after the CozySummit banner was removed, preventing content from being obscured behind the navbar (@kvaps in cozystack/website#579).

  • [website] chore(banner): remove CozySummit Virtual 2026 announcement: Removes the expired event banner from the website (@kvaps in cozystack/website#576).

Full Changelog: v1.4.3...v1.4.4

Cozystack sandbox

Cozystack is a free PaaS platform and framework for building private clouds and providing users/customers with managed Kubernetes, KubeVirt-based VMs, databases as a service, NATS, message brokers, etc. with GPU support in VMs and Kubernetes clusters.

v1.5.0

Cozystack v1.5.0

Cozystack v1.5.0 brings:

  • Gateway API support via Cilium as an opt-in ingress layer alongside ingress-nginx
  • TLS for managed databases and messaging (Kafka, NATS, Qdrant, and PostgreSQL external endpoints)
  • Backups that work out of the box, with a platform-managed default BackupClass, a shared backups bucket, a new etcd backup strategy, and a generic Job strategy
  • The Flux v2.8 upgrade with strict server-side apply and kstatus health checking
  • A new flux-shard-operator that spreads tenant HelmReleases across helm-controller shards so one noisy tenant can no longer stall the others
  • Operator-provided wildcard certificates for platform and root-tenant ingress
  • GPU passthrough that works without manual KubeVirt patching
  • A deletion-protection guardrail for critical platform objects
  • Runtime-populated dashboard dropdowns via a new Option API

The release also rolls up every fix from v1.4.1 through v1.4.4.

Platform components bumped in this release: Flux v2.7.3 → v2.8.0 (flux-operator/flux-instance charts v0.33.0 → v0.50.0), MetalLB v0.15.2 → v0.16.1 (FRR-K8s is now the default BGP backend), SeaweedFS 4.05 → 4.31, etcd-operator v0.4.3 → v0.4.5, ouroboros v0.7.2 → v0.8.0, seaweedfs-cosi-driver v0.3.1, and the new kuberture system package.

Note: Items marked (backported to v1.4.x) were also shipped in the v1.4.1, v1.4.2, v1.4.3, or v1.4.4 patch releases.

Feature Highlights

Gateway API Support via Cilium

Cozystack-native services can now be exposed through the Gateway API backed by Cilium, as an opt-in alternative to the existing per-tenant ingress-nginx controllers. The feature is materialized per tenant through a new gateway.cozystack.io/v1alpha1 TenantGateway CRD reconciled by cozystack-controller.

Enable it at the platform level with publishing.gateway.enabled=true, then either give a tenant its own Gateway, LoadBalancer IP, and certificate with tenant.spec.gateway=true, or leave it unset and let the tenant inherit the nearest ancestor's Gateway through the same label-based selector model that already drives ingress inheritance. Two certificate solver modes are supported: HTTP-01 (the default — a per-app certificate with zero platform configuration for new apps) and DNS-01 (opt-in — a single wildcard certificate covering an apex, with cloudflare, route53, digitalocean, and rfc2136 providers).
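
The "own Gateway or nearest ancestor" rule can be modeled with a short sketch. It uses an explicit parent map instead of the label-based selectors the platform actually relies on, and resolve_gateway, parents, and owns_gateway are hypothetical names.

```python
from typing import Dict, Optional


def resolve_gateway(
    tenant: str,
    parents: Dict[str, Optional[str]],
    owns_gateway: Dict[str, bool],
) -> Optional[str]:
    """Return the tenant whose Gateway serves the given tenant.

    parents maps each tenant to its parent (None for the root), and
    owns_gateway marks tenants that opted in to their own Gateway.
    """
    current: Optional[str] = tenant
    while current is not None:
        if owns_gateway.get(current, False):
            return current  # the tenant itself or its nearest opted-in ancestor
        current = parents.get(current)
    return None  # no Gateway anywhere up the chain
```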

Defaults stay on ingress-nginx, so existing clusters are unchanged. Two things to be aware of: Cilium Envoy / Gateway API is now always enabled (an extra cilium-envoy DaemonSet, roughly 100 MB RAM per node at idle), and cozystack-api now invokes admission (createValidation / deleteValidation) on Create and Delete for apps.cozystack.io/* — so any custom ValidatingAdmissionPolicies or webhooks on those kinds will now fire on all three verbs. See the Gateway API guide (@lexfrei in #2470).

TLS for Managed Databases and Messaging

Four managed-app charts gain TLS support driven by a single tls.enabled value with consistent tri-state semantics: when unset it inherits external (TLS auto-on when the service is published externally, off when cluster-internal), and an explicit true/false always wins. In every case the trust anchor is a chart- or operator-managed self-signed CA that clients retrieve and pin — there is no publicly trusted CA. The one upgrade-time behaviour change to plan for: existing instances with external: true flip to TLS-on after upgrade; cluster-internal instances are unaffected.

  • Kafka serves TLS on its external LoadBalancer listener (port 9094), with certificates managed end-to-end by the Strimzi operator; clients trust via the operator-published <release>-cluster-ca-cert / <release>-clients-ca-cert secrets. The external listener is now gated only on external: true, decoupled from tls.enabled (@Arsolitt in #2681).
  • NATS and Qdrant gain TLS via a self-contained cert-manager chain (self-signed Issuer → CA → leaf) rendered in the tenant namespace; NATS covers both client connections and cluster routes, Qdrant covers REST and gRPC. Clients trust the <release>-ca secret (@Arsolitt in #2684, #2685).
  • PostgreSQL — CNPG already serves TLS unconditionally, so tls.enabled here injects the external hostname into the operator-managed server certificate's SANs (when external: true), so sslmode=verify-full works against the external endpoint. Clients retrieve ca.crt from the <release>-credentials secret (@Arsolitt in #2686).
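
The tri-state semantics of tls.enabled reduce to a small decision function. The sketch below restates the rule from the notes; tls_effective is a hypothetical name, and None stands for an unset value.

```python
from typing import Optional


def tls_effective(tls_enabled: Optional[bool], external: bool) -> bool:
    """Resolve whether TLS is on for a managed app instance.

    An unset tls.enabled (None) inherits the external flag; an explicit
    True or False always wins.
    """
    if tls_enabled is None:
        return external
    return tls_enabled
```

This also shows the upgrade-time change: an existing instance with external set to true and tls.enabled unset now resolves to TLS on.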

Backups That Work Out of the Box

This release closes the gap between "backup machinery is installed" and "backups actually work without per-app S3 configuration."

A platform-managed default BackupClass (cozy-default) is now shipped, backed by a system bucket (cozy-backups). Apps opt in with a useSystemBucket flag, after which the platform projects shared backup credentials into the tenant namespace (with RBAC isolation and projection metrics) and skips per-release credential Secrets. Default strategies are provided for every backup-capable app — Velero for VMDisk/VMInstance, CNPG for PostgreSQL, MariaDB, Altinity for ClickHouse, FoundationDB, and etcd — and a Velero BackupStorageLocation is wired to the system bucket. The legacy per-tenant S3 fields on Postgres and ClickHouse are deprecated in favour of this default flow (@androndo in #2716).

To make that default flow reliable, Velero is now a default system package rather than optional. This fixes a deterministic failure where the default backupstrategy-controller (which hard-depends on Velero) sat in DependenciesNotReady and kept the platform HelmRelease from ever reaching Ready. Existing clusters get Velero in the cozy-velero namespace on upgrade; opt out via bundles.disabledPackages if you do not back up VMs (@myasnikovdaniil in #2833).

Two new backup strategies join the catalog: an etcd strategy (cluster-scoped strategy.backups.cozystack.io Etcd CRD, S3-only, with snapshot BackupJob and a destructive in-place RestoreJob), and a generic application-agnostic Job strategy where the operator supplies a Kubernetes Job template that Cozystack renders and runs as a one-shot backup, then re-renders with .Mode == "restore" for recovery — the generic counterpart of the app-specific drivers (@androndo in #2641, @lllamnyp in #1721).

Flux v2.8 Upgrade with Strict Server-Side Apply

Flux is upgraded from v2.7.3 to v2.8.0 across both the embedded management-cluster Flux and the optional tenant Flux addon (flux-operator / flux-instance charts move v0.33.0 → v0.50.0). Flux v2.8's helm-controller v1.5 ships Server-Side Apply with --force-conflicts and kstatus-based health checking by default — so misplaced chart fields that v2.7 silently dropped are now hard errors (fixed here for foundationdb, kafka, kubevirt-instancetypes, vm-instance, and the platform chart), and parent HelmReleases now wait for every child resource to be Ready before reporting Ready themselves.

Action required on upgrade: Kubernetes 1.33+ is now required for the management cluster (and for any tenant cluster enabling the Flux addon). The upgrade.force: true knob is removed, so immutable-field changes (for example StatefulSet volumeClaimTemplates / serviceName) no longer self-heal and must be recreated manually (kubectl delete sts <name> --cascade=orphan). Persistent TPM/EFI is re-enabled for Windows KubeVirt preferences (each affected VM provisions one extra RWO PVC), and FoundationDB imageType is pinned to split to keep upgrades non-disruptive (@myasnikovdaniil in #2602).

flux-shard-operator: Tenant helm-controller Sharding

A new flux-shard-operator spreads tenant HelmReleases across multiple helm-controller shards, so one noisy tenant — for example a HelmRelease stuck in infinite remediation — can no longer degrade reconciliation for everyone else. Placement is per-tenant (all of a tenant's HelmReleases share one shard), assigned greedily by least load, with a CREATE-time mutating webhook stamping the shard label on each HelmRelease.

It ships with shardCount: auto by default, which sizes shards from the tenant HelmRelease count: small clusters stay at a single shard (today's behaviour) while large fleets shard out automatically, and an integer pins the count explicitly. The legacy hand-rolled flux-tenants deployment is drained and retired automatically by migration 44 (@kvaps in #2821).
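
Per-tenant, least-loaded greedy placement can be sketched as follows. This is an illustration of the placement rule only, not the operator's code: assign_shards is a hypothetical name, load is counted in HelmReleases, and breaking ties toward the lowest shard index is an assumption.

```python
from typing import Dict, Iterable, List, Tuple


def assign_shards(
    releases: Iterable[Tuple[str, str]], shard_count: int
) -> Dict[Tuple[str, str], int]:
    """Place (tenant, release) pairs on shards in creation order.

    A tenant's first release picks the least-loaded shard; every later
    release from the same tenant follows it to that shard.
    """
    load: List[int] = [0] * shard_count
    tenant_shard: Dict[str, int] = {}
    placement: Dict[Tuple[str, str], int] = {}
    for tenant, name in releases:
        if tenant not in tenant_shard:
            # min() returns the first minimum, so ties go to the lowest index.
            tenant_shard[tenant] = min(range(shard_count), key=lambda i: load[i])
        shard = tenant_shard[tenant]
        load[shard] += 1
        placement[(tenant, name)] = shard
    return placement
```

Because placement is sticky per tenant, a tenant stuck in remediation only affects the shard it was assigned to.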

Operator-Provided Wildcard Certificates

Operators can now serve platform services and the root tenant's ingress under a pre-existing wildcard TLS certificate instead of minting per-host ACME certificates. Set publishing.certificates.wildcardSecretName to the name of a TLS Secret already created in the publishing namespace (tenant-root by default) — only the Secret name travels over the cozystack-values channel, never the key material.

It works on both ingress paths: with ingress-nginx the controller serves it as --default-ssl-certificate and platform Ingresses drop their cert-manager annotations; with Gateway API a new existingSecret TenantGateway cert mode references the Secret directly and provisions no Issuer or Certificate. Scope is the root tenant only for now; extending wildcard mode to child tenants is a follow-up (@lexfrei in #2819).

GPU Passthrough Out of the Box

GPU enablement is wired up across all three paths a GPU can reach a workload, each of which previously needed manual reconciliation.

For tenant Kubernetes, node-groups declaring gpus automatically get the gpu=on kubelet label (so HAMi's device plugin schedules and advertises nvidia.com/gpu), and the tenant gpu-operator loads the driver with NVreg_NvLinkDisable=1, fixing single-SXM-GPU passthrough that previously hung at "Fabric State: In Progress" with CUDA "system not yet initialized." Both defaults are overridable via addons.gpuOperator.valuesOverride (@kvaps in #2780).

For KubeVirt VMs, enabling cozystack.gpu-operator now auto-populates the KubeVirt CR — injecting the HostDevices feature gate and filling permittedHostDevices (plus mediatedDevicesConfiguration for vGPU) from shipped NVIDIA default tables — so GPU VMs schedule without a manual kubectl patch. Action required: the bundle now owns spec.configuration.permittedHostDevices, so the first reconcile after upgrade overwrites any hand-edited entries; move custom device entries into .gpu.permittedHostDevices before upgrading and verify each resourceName matches what nodes advertise (@lexfrei in #2768). A third gpu-operator container variant is also added for hosts where the NVIDIA driver and container-toolkit are already installed by the OS, exposing GPUs to regular containerized pods via the device plugin only (@lexfrei in #2766).

Deletion-Protection Guardrail

A new deletion-protection guardrail blocks DELETE on critical platform objects labeled platform.cozystack.io/no-delete=true, evaluated in-process by the kube-apiserver via a ValidatingAdmissionPolicy — no webhook, DaemonSet, TLS, or extra image. Protected objects in this release include the cozy-system and tenant-root namespaces, the tenant-root HelmRelease, the cozystack-version ConfigMap, the cozystack-packages OCIRepository, the cert-manager ClusterIssuers, the LinstorCluster, and the packages CRDs; a migration backfills the label onto existing resources.

To delete a protected object, remove the label first (kubectl label <kind> <name> platform.cozystack.io/no-delete-). Requires Kubernetes 1.30+ (@myasnikovdaniil in #2650).
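
The guardrail's decision rule can be modeled in a few lines. The real check is a ValidatingAdmissionPolicy evaluated by the kube-apiserver; the Python function below only mirrors the logic, and admit is a hypothetical name.

```python
from typing import Dict

NO_DELETE_LABEL = "platform.cozystack.io/no-delete"


def admit(operation: str, labels: Dict[str, str]) -> bool:
    """Return False only for a DELETE on an object labeled no-delete=true."""
    if operation != "DELETE":
        return True
    return labels.get(NO_DELETE_LABEL) != "true"
```

Removing the label, as the kubectl command above does, makes the same DELETE request pass.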

Runtime-Populated Dashboard Dropdowns

A new generic mechanism powers runtime-populated dropdowns in dashboard create/edit forms, so fields that reference live cluster resources — GPU devices, KubeVirt instancetypes and preferences, Multus networks, VM images, storage pools, storage classes, backup classes and plans — become accurate dropdowns instead of free text or stale static enums. It introduces a namespaced, read-only Option resource (core.cozystack.io/v1alpha1) computed on read by a privileged in-process provider registry, and an x-cozystack-options schema keyword that app charts declare via the cozyvalues-gen @x-cozystack-options directive. Tenants get read-only access to options in their own namespace (from cozy:tenant:view:base upward), so curated lists populate without granting broad cluster reads (@kvaps in #2778).

Upgrade Notes and Required Actions

Most operators can take v1.5.0 with no manual action — the in-platform migrations handle config rewrites automatically — but five changes warrant attention.

  • Kubernetes 1.33+ required for the management cluster. With #2602, Flux jumps v2.7.3 → v2.8.0 and helm-controller v1.5 requires Kubernetes 1.33 or newer on the management cluster (and on any tenant cluster that enables the Flux addon). Upgrade Kubernetes before taking this release if you are below 1.33.

  • upgrade.force is gone — immutable-field changes no longer self-heal. Also with #2602, the helm-controller upgrade.force: true knob is removed. If a chart upgrade changes an immutable field (StatefulSet volumeClaimTemplates, serviceName, etc.) the apply now fails instead of silently force-replacing; recreate the object manually with kubectl delete sts <name> --cascade=orphan and let Flux re-reconcile.

  • GPU VM operators must move custom host devices before upgrading. With #2768, when cozystack.gpu-operator is enabled the platform takes ownership of KubeVirt.spec.configuration.permittedHostDevices and overwrites it on the first reconcile. Move any hand-edited permittedHostDevices / mediatedDevicesConfiguration entries into the .gpu.permittedHostDevices value before upgrading, and confirm each resourceName matches what your nodes advertise.

  • MetalLB switches to the FRR-K8s BGP backend; metrics are now HTTPS-only. With #2699, MetalLB jumps v0.15.2 → v0.16.1 and adopts the upstream-default FRR-K8s backend (the classic FRR mode is deprecated). Metrics endpoints moved from plain HTTP to HTTPS only (kube-rbac-proxy replaced by native TLS + RBAC), so any scrape config pointing at the old HTTP metrics endpoints must be updated. The host-network port denylist is rotated to match the new listener set, including the dedicated /healthz / /readyz probe port 17472.

  • Externally published databases and messaging gain TLS automatically. With #2681 / #2684 / #2685 / #2686, instances with external: true flip to TLS-on after upgrade. Because the trust anchor is a self-signed CA, external clients must retrieve and pin the CA (and PostgreSQL clients using sslmode=verify-full benefit from the new external-hostname SAN). Cluster-internal instances are unaffected.

Platform Components

  • Flux: v2.7.3 → v2.8.0 (flux-operator / flux-instance charts v0.33.0 → v0.50.0). helm-controller v1.5 with embedded Helm v4 brings strict Server-Side Apply and kstatus health checking; see Feature Highlights for the upgrade impact. (release notes) (@myasnikovdaniil in #2602)

    • flux-operator gained an optional Flux Status Page Web UI with anonymous or OIDC auth and RBAC-gated actions (suspend/resume, rollout restart, run job, pod delete) — shipped opt-in in the cozystack package (v0.37.0+).
    • Security (CVE-2026-23990 / GHSA-4xh5-jcj2-ch8q, v0.40.0): fixed a Web UI impersonation bypass via empty OIDC claims — relevant only if the new Web UI is exposed with OIDC.
    • ResourceSetInputProvider gained GitLab Environments, Gitea/Forgejo, AWS CodeCommit, and ExternalService provider types; ResourceSet gained checksumFrom (rollout restart on external secret/configmap change) and includeEmptyProviders.
    • Breaking: the --disable-wait-interruption flag and DISABLE_WAIT_INTERRUPTION env var were removed (v0.39.0); flux-operator's CRD migration on Flux minor upgrades was fixed (v0.41.0) and a flux-operator migrate command added (v0.48.0).
  • MetalLB: v0.15.2 → v0.16.1. FRR-K8s is now the default BGP backend (Cozystack vendors the matching frr-k8s subchart at v0.0.25); classic FRR mode is deprecated. See Upgrade Notes for the HTTPS-only metrics change. (release notes) (@lexfrei in #2699)

    • New BGP features: per-peer localASN on BGPPeer, configurable FRR config-reload debounce, and a ServiceSelector on advertisements to target a subset of services.
    • v0.15.3 added a ConfigurationState CRD to surface configuration errors, NetworkPolicy support in the chart, and fixed CVE-2025-22874 in the images with a hardened pod security context.
  • SeaweedFS: 4.05 → 4.31 (chart 4.0.405 → 4.31.0). The bump clears the upstream 4.23 hazard flagged "not safe for erasure coding and multi-disk volume servers" and lands a large batch of S3-API correctness fixes (versioned-object semantics, Hadoop S3A multipart-ETag compatibility, bucket-quota read-only enforcement) plus EC bitrot detection, volume-server write-stall fixes, and new /healthz / /readyz probes. (release notes) (@lexfrei in #2834)

    • Follow-up fixes required after the bump (within this release): S3 TLS and the COSI provisioner ServiceAccount were restored (#2916), and the -lock BucketClass / s3 service name were restored with volumeSizeLimitMB dropped (#2943).
  • etcd-operator: v0.4.3 → v0.4.5. v0.4.4 added an EtcdBackupStatus.snapshot field describing the created backup artifact and moved 127.0.0.1 into ipAddresses; v0.4.5 fixed a broken restore-datadir path (restore was non-functional before this). (release notes) (bumped alongside the etcd backup-strategy controller by @androndo in #2641)

  • ouroboros: v0.7.2 → v0.8.0. Now logs an explicit reason when its TCP backend readiness check fails — making stuck-proxy situations immediately diagnosable instead of silently NotReady — and migrates the kubectl sidecar image from Docker Hub to mirror.gcr.io to avoid anonymous pull rate-limits (backported to v1.4.3 via #2835) (@lexfrei in #2807).

  • seaweedfs-cosi-driver: bumped to v0.3.1 with a stale-socket self-heal — the COSI driver removes any leftover UNIX socket before binding, so the objectstorage provisioner recovers automatically from CrashLoopBackOff after a non-graceful exit instead of wedging on "bind: address already in use" (backported to v1.4.3 via #2827) (@lexfrei in #2791).

  • kuberture: new optional system package (v0.1.1). A controller that bridges an external-dns gap (external-dns cannot read EndpointSlices): it watches the default/kubernetes API-server EndpointSlice and emits annotated headless Services that external-dns consumes to publish the Kubernetes API endpoint to DNS. Off by default; enable via bundles.enabledPackages and declare at least one config.outputs entry. (sources) (@lexfrei in #2647)
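
The stale-socket self-heal described for seaweedfs-cosi-driver v0.3.1 above can be illustrated with a minimal sketch. This is not the driver's code; the socket path and the function names `bind_unix_socket` and `demo` are hypothetical, and the sketch only assumes the behaviour stated in the notes (remove any leftover UNIX socket, then bind).

```python
import os
import socket
import stat
import tempfile


def bind_unix_socket(path: str) -> socket.socket:
    """Bind a UNIX listening socket, removing a stale socket file first."""
    try:
        # Only remove the path if it really is a socket; leave other files alone.
        if stat.S_ISSOCK(os.lstat(path).st_mode):
            os.unlink(path)
    except FileNotFoundError:
        pass  # Nothing left over from a previous run.

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(path)  # Without the cleanup this raises "Address already in use".
    server.listen(1)
    return server


def demo() -> bool:
    """Simulate a non-graceful exit, then rebind on the same path."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cosi.sock")  # hypothetical socket path
        first = bind_unix_socket(path)
        first.close()  # Closing the socket does not remove the file on disk.
        leftover = os.path.exists(path)
        second = bind_unix_socket(path)  # Succeeds because the stale file is removed.
        second.close()
        return leftover
```

Checking that the path is a socket before unlinking is a defensive choice of this sketch, so that a misconfigured path never deletes a regular file.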

Major Features and Improvements

  • [networking] Gateway API support via Cilium: See Feature Highlights — opt-in TenantGateway CRD, platform publishing.gateway.enabled toggle, per-tenant opt-in with HTTP-01/DNS-01 cert modes; supersedes #2213 (@lexfrei in #2470).

  • [fluxcd] Add flux-shard-operator for tenant helm-controller sharding: See Feature Highlights — spreads tenant HelmReleases across helm-controller shards with shardCount: auto; retires the hand-rolled flux-tenants deployment via migration 44 (@kvaps in #2821).

  • [flux] Upgrade to Flux v2.8.0: See Feature Highlights and Platform Components — strict SSA + kstatus, chart fixes for foundationdb/kafka/kubevirt-instancetypes/vm-instance/platform, Kubernetes 1.33+ requirement, upgrade.force removed (folds in #2612) (@myasnikovdaniil in #2602).

  • [platform] Operator-provided wildcard certificate for platform and tenant ingress: See Feature Highlights — publishing.certificates.wildcardSecretName serves a pre-existing wildcard cert on ingress-nginx (--default-ssl-certificate) and Gateway API (existingSecret mode); root tenant only (@lexfrei in #2819).

  • [platform] Add default backupclass: See Feature Highlights — platform-managed cozy-default BackupClass backed by the cozy-backups system bucket, useSystemBucket opt-in with projected shared credentials, default strategies for every backup-capable app (@androndo in #2716).

  • [platform] Add backup-strategy controller for etcd: See Feature Highlights — cluster-scoped strategy.backups.cozystack.io Etcd CRD with snapshot BackupJob and in-place RestoreJob; bumps etcd-operator to v0.4.5 (@androndo in #2641).

  • [backups] Implement Job backup strategy: A generic, application-agnostic strategy.backups.cozystack.io/Job strategy — the operator supplies a Kubernetes Job template that Cozystack renders and runs as a one-shot backup, emitting a Backup artifact on completion; restore re-renders the same template with .Mode == "restore" (cross-namespace restore is not supported) (@lllamnyp in #1721).

  • [platform] Install velero by default: See Feature Highlights — Velero moves from optional to default so the default backupstrategy-controller no longer blocks the platform HelmRelease; opt out via bundles.disabledPackages (@myasnikovdaniil in #2833).

  • [api] Add Option resource and x-cozystack-options for dynamic form dropdowns: See Feature Highlights — read-only core.cozystack.io/v1alpha1 Option resource computed on read, x-cozystack-options schema keyword, tenant-scoped read access (@kvaps in #2778).

  • [api] Expose HelmRelease generation knobs as cozystack-api flags: A follow-up to v1.4.0's #2509 that brings the same HelmRelease generation knobs to the second HelmRelease-generating path. The cozystack-api convertApplicationToHelmRelease path previously hardcoded Interval: 5m, Remediation{Retries: -1}, and no Strategy/MaxHistory; it now honours the --helmrelease-interval, --helmrelease-retry-interval, --helmrelease-install-timeout, --helmrelease-upgrade-timeout, and --helmrelease-max-history flags so the operator-generated and api-generated HelmReleases behave identically (@myasnikovdaniil in #2571).

  • [platform] Add deletion-protection guardrail via ValidatingAdmissionPolicy: See Feature Highlights — platform.cozystack.io/no-delete=true blocks DELETE on critical platform objects in-apiserver; remove the label to delete; requires Kubernetes 1.30+ (@myasnikovdaniil in #2650).

  • [kafka] Add TLS support via Strimzi listener configuration: See Feature Highlights — TLS on the external listener (9094), Strimzi-managed certs, external listener decoupled from TLS (@Arsolitt in #2681).

  • [nats] Add TLS support via cert-manager: See Feature Highlights — self-contained cert-manager chain for client connections and cluster routes (@Arsolitt in #2684).

  • [qdrant] Add TLS support via cert-manager: See Feature Highlights — TLS for REST and gRPC endpoints via a single switch (@Arsolitt in #2685).

  • [postgres] Add TLS support via CNPG operator-managed certificates: See Feature Highlights — injects the external hostname into the CNPG server certificate SANs so sslmode=verify-full works against the external endpoint (@Arsolitt in #2686).

  • [kubernetes] Enable GPU passthrough out-of-the-box: See Feature Highlights — gpu=on kubelet label on GPU node-groups and NVreg_NvLinkDisable=1 driver flag fix single-SXM-GPU passthrough (@kvaps in #2780).

  • [platform] Auto-wire KubeVirt permittedHostDevices and HostDevices feature gate: See Feature Highlights — GPU VMs schedule without manual kubectl patch; the bundle now owns permittedHostDevices (see Upgrade Notes) (@lexfrei in #2768).

  • [gpu-operator] Add container variant for preinstalled host driver: New container gpu-operator variant for hosts where the NVIDIA driver and container-toolkit are already OS-installed — exposes GPUs to containerized pods via the device plugin only (@lexfrei in #2766).

  • [rbac] Allow tenants to start/stop/restart their VMs: New RBAC rule grants update on the virtualmachines/start, /stop, and /restart KubeVirt subresources at the cozy:tenant:use:base level, so the dashboard's VM power buttons (which previously returned 403 for every tenant role) now work (@kvaps in #2777).

  • [monitoring] Add tenant overview dashboard: A new "Tenant Overview" Grafana dashboard for platform admins, deployed only to the root/infra Grafana in cozy-monitoring (never to per-tenant Grafanas, so no cross-tenant data leak). Gives a cross-tenant fleet summary, per-tenant leaderboard, top-N consumers, usage trends, and health signals (@myasnikovdaniil in #2809).

  • [dashboard] Add cluster-usage RBAC for the new admin page: New cozystack-dashboard-cluster-usage ClusterRole (cluster-wide read on nodes/pods + metrics.k8s.io nodes) bound to the cozystack-cluster-admin group, backing the new Console → Administration → Cluster Usage page (cluster-wide and per-node utilization, including GPUs). The sidebar entry is fail-closed without the binding (@lexfrei in #2743).

  • [apps] Mark stateful-app storageClass fields as immutable: 16 stateful apps (clickhouse, foundationdb, harbor, http-cache, kafka, kubernetes, mariadb, mongodb, nats, openbao, opensearch, postgres, qdrant, rabbitmq, redis, vm-disk) declare storageClass immutable via an x-kubernetes-validations rule in their chart schema and the dashboard renders the field read-only on edit forms, because changing storageClass never migrates data (PVCs pin storageClassName at creation). Enforcement is UI-only in this release — the aggregated apiserver does not yet evaluate the CEL rule on Update, so a direct kubectl patch is still accepted; apiserver-level enforcement is tracked in #2657. kubernetes.nodeGroups[].storageClass is intentionally excluded (@lexfrei in #2639).

  • [seaweedfs] Bump SeaweedFS to 4.31: See Platform Components — chart 4.0.405 → 4.31.0, clears the 4.23 EC hazard (@lexfrei in #2834).

  • [metallb] Bump to v0.16.1 and rotate host-network port denylist: See Platform Components and Upgrade Notes — FRR-K8s default backend, HTTPS-only metrics, rebuilt controller/speaker images, rotated port denylist (@lexfrei in #2699).

  • [ouroboros] Bump to v0.8.0: See Platform Components (backported to v1.4.3 via #2835) (@lexfrei in #2807).

  • [platform] Add kuberture as optional system package: See Platform Components — publishes the Kubernetes API endpoint to DNS via external-dns (@lexfrei in #2647).
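
The deletion-protection guardrail listed above is implemented as a ValidatingAdmissionPolicy evaluated in the apiserver. The decision it describes (block DELETE while the platform.cozystack.io/no-delete=true label is present) can be sketched as plain logic. The function name `admit` and its arguments are hypothetical and only model the rule as stated in these notes, not the actual policy expression.

```python
NO_DELETE_LABEL = "platform.cozystack.io/no-delete"


def admit(operation: str, labels: dict) -> bool:
    """Return True if the request is allowed under the guardrail as described."""
    if operation != "DELETE":
        return True  # The guardrail only concerns DELETE requests.
    # DELETE is rejected while the label is set to "true"; removing the
    # label (or setting another value) lets the deletion through.
    return labels.get(NO_DELETE_LABEL) != "true"
```

For example, `admit("DELETE", {NO_DELETE_LABEL: "true"})` is rejected, while the same call after removing the label is admitted, which matches the "remove the label to delete" workflow.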

Bug Fixes

  • [api] Publish open spec as x-kubernetes-preserve-unknown-fields, not additionalProperties:true: Fixes a cluster-wide kube-controller-manager CrashLoopBackOff (nil-pointer panic) that stalled reconciliation and timed out installs/upgrades. cozystack-api previously published the free-form .spec of apps.cozystack.io resources as additionalProperties: true, whose nil inner schema crashed the VAP type-checker run by KCM. The fix publishes x-kubernetes-preserve-unknown-fields: true instead and adds a recursive sanitizer that rewrites any boolean additionalProperties anywhere — including in untrusted ApplicationDefinition.openAPISchema input — closing the whole crash class (@myasnikovdaniil in #2867).

  • [prometheus-operator-crds] Ship full upstream CRD bundle: The package previously shipped only the four service-discovery CRDs VictoriaMetrics needs. This adds the six previously-stripped CRDs (Alertmanager, AlertmanagerConfig, Prometheus, PrometheusAgent, ScrapeConfig, ThanosRuler), so third-party apps in tenant clusters that ship their own prometheus-operator (e.g. kube-prometheus-stack) can create those CRs, or set crds.enabled: false and consume the platform-managed CRDs (which Flux previously kept reverting) (@myasnikovdaniil in #2660).

  • [info] Use root-host for Keycloak OIDC issuer URL in tenant kubeconfig: The dashboard-issued kubeconfig for non-root tenants built --oidc-issuer-url from the tenant subdomain, but Keycloak's ingress and TLS cert live only at the root host, so kubectl oidc-login failed TLS verification and the kubeconfig was unusable. The issuer now always uses the root host (https://keycloak.<root-domain>/realms/cozy), so non-root tenants can authenticate with the downloaded kubeconfig (@myasnikovdaniil in #2704).

  • [kubernetes] Add config_path patch for containerd 2.x: Per-registry mirror/credential config (/etc/containerd/certs.d) did not take effect on containerd 2.x tenant nodes because the CRI plugin config section was renamed and the old single sed expression was a no-op. The fix runs version-tolerant sed expressions and loosens quote matching (also fixing Ubuntu 24.04, which emits single-quoted section headers), so config_path registry config works on both containerd 1.x and 2.x without manual patching (@elaugaste in #2723).

  • [csi] Detach orphan hot-plug volumes when VMI outlives its VM: When a VMI outlives its owning VM, hot-plug volumes were left attached, blocking the volume from being re-attached elsewhere; the CSI wrapper now detaches the orphaned hot-plug volumes so the PVC can be reused (@kvaps in #2866).

  • [csi] Route RWX Block volumes to upstream hotplug detach: KubeVirt live-migration disks (RWX Block PVCs) were incorrectly matched by the NFS-cleanup branch in ControllerUnpublishVolume and never detached, so a later attach to a different VM was rejected by linstor-csi's anti-split-brain check. A new isNFSVolume predicate requires both ReadWriteMany and VolumeMode=Filesystem, accurately matching only NFS-backed PVCs, and the same predicate fixes in-VM disk expansion being skipped for RWX Block volumes (backported to v1.4.2 via #2749) (@myasnikovdaniil in #2658).

  • [csi] Verify VMI Ready after kubevirt-csi Publish to surface stuck-PVC failures: Upstream ControllerPublishVolume has a fast path that reports success as soon as the volume entry appears in VM.spec.template.spec.volumes, even if the backing PVC is stuck in ClaimPending and the hotplug never completed; the failure then surfaces later as a confusing kubelet "couldn't find device by serial id" error. The wrapper now re-reads VMI.Status.VolumeStatus and returns codes.Unavailable with the upstream reason if the volume is not VolumeReady, which keeps external-attacher retrying and surfaces the real provisioning failure at the CSI layer (backported to v1.4.2 via #2748) (@myasnikovdaniil in #2659).

  • [capi] Add startupProbe to capi-controller-manager to survive cert provisioning delay: The CAPI controller-manager could CrashLoop during initial certificate provisioning because the readiness probe failed while the controller was still booting; a startup probe now gives it room to come up (backported to v1.4.x stabilization) (@myasnikovdaniil in #2946).

  • [cluster-api] Fix kamaji OOM and set limits on unset providers: The Kamaji control-plane-provider resource override targeted a container named manager, but the upstream image names it controller, so the intended limits were dropped and the pod OOMKilled (exit 137) on the upstream 128 Mi default. The container name is corrected and modest requests/limits are set for the core, kubeadm-bootstrap, and kubevirt-infrastructure providers that previously ran as BestEffort (backported to v1.4.1 via #2709) (@myasnikovdaniil in #2708).

  • [platform] Migrate ephemeralStorage to diskSize via pre-upgrade hook: The #2454 rename of nodeGroups[*].ephemeralStorage to diskSize added a hard {{ fail }} guard that blocked reconciliation of any HelmRelease still carrying the legacy field. A new platform pre-upgrade migration (migration 41) walks all kuberneteses.apps.cozystack.io Application CRs cluster-wide and renames the field automatically before chart resources are applied; the migration is idempotent and best-effort (backported to v1.4.1 via #2712) (@IvanHunters in #2688).

  • [postgres] Accept integer values for postgresql.parameters in schema: PostgreSQL parameters such as max_connections are naturally integers, but the chart schema declared {type: string} only, so bare integers were rejected at schema-validation time. An intOrString alias in cozyvalues-gen now emits anyOf: [integer, string] plus x-kubernetes-int-or-string: true, making both forms valid; the $dangerousParams blocklist is also extended with archive_cleanup_command and recovery_end_command (backported to v1.4.1 via #2715) (@IvanHunters in #2687).

  • [kafka] Remove ZooKeeper PVCs on uninstall: The Strimzi Kafka CR set deleteClaim: true on broker JBOD volumes but left ZooKeeper persistent-claim storage at deleteClaim: false, orphaning the ZooKeeper PVCs on uninstall and requiring manual cleanup before the release name could be reused. deleteClaim: true is now set on the ZooKeeper storage as well (backported to v1.4.1 via #2705) (@Arsolitt in #2679).

  • [opensearch-operator] Replace deprecated kube-rbac-proxy image: The gcr.io/kubebuilder/kube-rbac-proxy image has been unavailable since the kubebuilder GCR registry was sunset. It is replaced with quay.io/brancz/kube-rbac-proxy (the source already used by other Cozystack components), applied via a values.yaml entry instead of a vendored-chart patch (backported to v1.4.1 via #2695) (@myasnikovdaniil in #2689).

  • [platform] Add OpenSearch to PaaS bundle: The OpenSearch packages existed in the repo for several releases but the PaaS bundle template never referenced the two cozystack.opensearch-* PackageSources, so the operator was never deployed and the dashboard catalog showed no OpenSearch entry. The two missing includes are now added (backported to v1.4.2 via #2757) (@myasnikovdaniil in #2648).

  • [dashboard] Grant tenant dashboard read on cozy-public PVCs: The VM disk source-image dropdown returned 403 and stayed empty even when golden images existed, because the cozy:tenant:dashboard Role only granted read on Flux HelmRepositories and HelmCharts. get/list/watch on PersistentVolumeClaims is added so tenant identities can list the vm-default-images-* PVCs (backported to v1.4.3 via #2858) (@myasnikovdaniil in #2843).

  • [api] Emit initial-events-end bookmark for core.cozystack.io watches: The TenantSecret, TenantModule, and TenantNamespace aggregated API resources never sent the k8s.io/initial-events-end bookmark required by the WatchList protocol, so client-go informers using WatchListClient (on by default since v1.35) never reached HasSynced. The bookmark is now emitted after initial ADDED events (backported to v1.4.3 via #2844) (@sunib in #2786).

  • [networking] Point host ouroboros proxy at the root-tenant ingress: When publishing.proxyProtocol was enabled, the host-level ouroboros proxy inherited a default backend FQDN that describes a managed tenant cluster and never resolved on the host. The host ouroboros Package is now emitted with a proxy.target override derived from publishing.ingressName (backported to v1.4.3 via #2846) (@lexfrei in #2800).

  • [ingress,platform] Deliver publishing.proxyProtocol to host ingress-nginx: Threads the publishing.proxyProtocol setting through to the host ingress-nginx controller so PROXY-protocol mode is actually applied on the host path (@lexfrei in #2799).

  • [objectstorage-controller] Propagate Bucket readiness to BucketClaim: The vendored COSI controller hardcoded bucketReady=false after dynamic provisioning and never re-read the Bucket, so BucketAccess was never granted and provisioned buckets ended up without credentials. The controller now re-reads the live Bucket and propagates its readiness (backported to v1.4.3 via #2828) (@lexfrei in #2792).

  • [kubernetes] Stamp application lineage labels on worker node VMs: Worker-node VMs created by Cluster API and the KubeVirt provider were never stamped with apps.cozystack.io/application.{group,kind,name} lineage labels, so the dashboard could not attribute those pods to their owning Kubernetes application. The labels are now applied to the KubevirtMachineTemplate, and application.name is quoted so a purely-numeric cluster name renders as a YAML string (backported to v1.4.3 via #2790) (@kvaps in #2779).

  • [kubernetes] Add spec.timeout to tenant CSI HelmRelease: Adds an explicit spec.timeout to the tenant CSI HelmRelease so slow first installs are not classified as failures under the new Flux kstatus waits (@myasnikovdaniil in #2727).

  • [platform] Add cert-manager dependency to webhook-cert consumers: Adds an explicit dependsOn cert-manager to the HelmReleases that consume webhook certificates, so they no longer race cert-manager during cold install (@myasnikovdaniil in #2726).

  • [platform] Order cozystack-basics after the APIs its admission policies type-check: Reorders cozystack-basics so it installs after the APIs its ValidatingAdmissionPolicies reference, avoiding type-check failures when the referenced APIs are not yet registered (@myasnikovdaniil in #2842).

  • [dashboard] Unblock token-proxy startup when JWKS is briefly unreachable: The dashboard token-proxy refused to start if the JWKS endpoint was momentarily unreachable; startup is now resilient to a brief JWKS outage so the dashboard comes up and recovers once JWKS is reachable (@lexfrei in #2745).

  • [kubevirt-cdi-operator,grafana-operator] Add startupProbes: Adds startup probes to the kubevirt-cdi-operator and grafana-operator so slow boots are not killed by the readiness probe before initialization finishes (@myasnikovdaniil in #2725).

  • [seaweedfs] Restore S3 TLS and COSI provisioner SA after 4.31 bump: Restores the S3 TLS configuration and the COSI provisioner ServiceAccount that regressed during the SeaweedFS 4.31 bump (@kvaps in #2916).

  • [seaweedfs] Restore -lock BucketClass, s3 service name, and drop volumeSizeLimitMB: Restores the -lock BucketClass and the s3 service name and drops the now-invalid volumeSizeLimitMB field after the 4.31 bump (@myasnikovdaniil in #2943).

  • [seaweedfs] Bump seaweedfs-cosi-driver to v0.3.1: See Platform Components — stale-socket self-heal (backported to v1.4.3 via #2827) (@lexfrei in #2791).

  • [backups] Carry dropdown option sources in CRD annotations, not schema: Moves the backup dropdown option sources from the chart schema into CRD annotations so they are consumed by the new x-cozystack-options mechanism without polluting the schema (@lexfrei in #2823).

  • [backups] Fix CI broken after merge: Repairs the backups test suite that broke after a merge conflict between concurrent backup-strategy PRs (@androndo in #2762).
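
The isNFSVolume predicate from the RWX Block fix above requires both conditions at once. A minimal sketch of that rule follows; the `VolumeCapability` type and the snake_case function name are illustrative stand-ins, not the CSI wrapper's real types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VolumeCapability:
    """Hypothetical, simplified view of a PVC's access and volume mode."""
    access_mode: str  # e.g. "ReadWriteMany" or "ReadWriteOnce"
    volume_mode: str  # "Filesystem" or "Block"


def is_nfs_volume(cap: VolumeCapability) -> bool:
    """True only for RWX Filesystem volumes, as the release note describes."""
    return cap.access_mode == "ReadWriteMany" and cap.volume_mode == "Filesystem"
```

Under this rule a KubeVirt live-migration disk (ReadWriteMany with Block mode) no longer matches the NFS-cleanup branch, so it follows the regular hotplug detach path.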

Security

  • [deps] Bump Go to 1.26.4 and x/net to v0.55.0 to close OSV advisories: Bumps the Go toolchain to 1.26.4 and golang.org/x/net to v0.55.0 to clear reported OSV/Dependabot advisories across the first-party modules (@myasnikovdaniil in #2852).

  • [ci] Add CodeQL workflow for Go (SAST): Adds a CodeQL static-analysis workflow for the Go codebase so security issues are flagged on PRs (@myasnikovdaniil in #2851).

  • [ci] Add OpenSSF Scorecard workflow: Adds the OpenSSF Scorecard workflow to track and publish the project's supply-chain security posture; a follow-up pins the Scorecard workflow actions by SHA (@tym83 in #2720, #2721).

  • [platform] Deletion-protection guardrail: See Feature Highlights — blocks accidental DELETE of critical platform objects in-apiserver (@myasnikovdaniil in #2650).

System Configuration

  • [platform] Velero installed by default: See Feature Highlights — Velero moves to a default system package; opt out via bundles.disabledPackages (@myasnikovdaniil in #2833).

  • [platform] Register kuberture as optional system package: See Platform Components — off by default, enabled via bundles.enabledPackages with a required config.outputs entry (@lexfrei in #2647).

  • [seaweedfs] Split seaweedfs-system into seaweedfs-db + seaweedfs-system: Extracts the SeaweedFS CNPG postgres Cluster into its own seaweedfs-db HelmRelease that reports Ready only when postgres actually serves connections, fixing a fresh-install race where the filer CrashLooped against an endpoint-less ClusterIP under Cilium. An automatic, data-safe migration adopts the existing postgres cluster in place (re-annotates Helm ownership and stamps helm.sh/resource-policy: keep); no data is moved and no PV/PVC is re-provisioned (@myasnikovdaniil in #2601).

Dependencies & Version Updates

See the Platform Components section near the top of this changelog for the full list of upstream bumps with user-facing impact summaries (Flux v2.8.0, MetalLB v0.16.1, SeaweedFS 4.31, etcd-operator v0.4.5, ouroboros v0.8.0, seaweedfs-cosi-driver v0.3.1, kuberture v0.1.1).

Additional dependency-related changes:

  • [deps] Bump Go to 1.26.4 and x/net to v0.55.0 — see Security (@myasnikovdaniil in #2852).
  • [ci] Bump cozyvalues-gen pin to v1.5.0, then v1.6.0 — schema-generation tooling refresh (@lexfrei in #2730, #2784).

Development, Testing, and CI/CD

  • [ci/build] Isolate each PR build on its own ephemeral runner VM: Moves PR builds onto per-job ephemeral runner VMs to fix the cross-job buildkit single-writer-lock contention that caused intermittent build hangs on the shared runner host (@myasnikovdaniil in #2939).

  • [refactor/build] mode=max registry cache with a main-only warmer: Switches the build cache to mode=max with a main-branch-only cache warmer so PR builds warm-start without each PR re-pushing cache layers (@myasnikovdaniil in #2938).

  • [build] Read --cache-from from ghcr :latest so PR builds warm-start: Points --cache-from at the ghcr :latest tag so PR builds reuse the last published cache (@myasnikovdaniil in #2855).

  • [refactor/build] Standardize image tagging to fix concurrent PR push conflicts: Standardizes the build image-tagging scheme so concurrent PR builds no longer collide when pushing to the registry (@myasnikovdaniil in #2711).

  • [ci] Create buildx builder in the build step's DOCKER_CONFIG: Fixes buildx builder creation by scoping it to the build step's DOCKER_CONFIG (@myasnikovdaniil in #2962).

  • [feat/ci] Test-impact analysis E2E (default-on) + release E2E workflow: Adds test-impact analysis that selects which E2E suites to run based on the PR diff (default-on), plus a dedicated release E2E workflow (@myasnikovdaniil in #2559).

  • [ci/tests] Fix OCIR registry login in Release E2E workflow: Corrects the OCIR registry login step in the Release E2E workflow so the release E2E run authenticates correctly (@myasnikovdaniil in #2973).

  • [ci/release] Auto-patch-release only the 2 newest release lines: The automated patch-release cron now targets only the two newest release lines, matching the supported-version window, and documents the support window (@myasnikovdaniil in #2856).

  • [ci/release] Repair orphaned draft tag_name on retag: When a tag was deleted and re-created between draft creation and merge, GitHub orphaned the draft release by setting tag_name to untagged-<hash>. The finalize step now detects the orphaned form, matches by name, repairs tag_name, and publishes (backported to v1.4.3 via #2829) (@myasnikovdaniil in #2761).

  • [ci] Add Go-scoped Renovate configuration: Adds a Renovate configuration scoped to the Go modules so Go dependency updates are managed automatically (@myasnikovdaniil in #2850).

  • [ci/tags] Include data/versions in website docs commit: The tags workflow now includes data/versions in the website docs commit so version metadata stays in sync (@myasnikovdaniil in #2707).

  • [ci] Consolidated platform & e2e stabilization batch: A consolidated batch of platform and E2E stabilization changes folded into a single PR (@myasnikovdaniil in #2948).

  • [ci/e2e] Drop 3x retry on Run E2E + Install Cozystack, wait out gateway.bats tenant teardown: Removes the blanket 3x retry on deterministic E2E steps and instead waits out the gateway.bats tenant teardown explicitly (@myasnikovdaniil in #2558).

  • [ci/workflows] Add SSH breakpoint on E2E failure for debug PRs: Adds an opt-in SSH breakpoint on E2E failure for debug PRs so failures can be inspected live (@kvaps in #2535).

  • [test/e2e] Self-heal Cilium orphaned-endpoint leak across install and apps: Adds an in-cluster self-heal Job that recovers from the Cilium agent's orphaned-endpoint leak (the "IP already in use" flake) during install and app churn (@myasnikovdaniil in #2874).

  • [test/e2e] Pin Kubernetes to v1.33.12 to avoid KCM VAP type-checker panic: Pins the E2E Kubernetes version to v1.33.12 to avoid the kube-controller-manager VAP type-checker panic on affected Kubernetes builds (the durable apiserver-side fix ships in #2867) (@myasnikovdaniil in #2868).

  • [test/e2e] Drive LINSTOR post-install waits off a single 15m deadline: Replaces per-step LINSTOR waits with a single 15-minute deadline so the slowest path determines runtime (@myasnikovdaniil in #2928).

  • [test/e2e] Pre-pull platform images via staged-busybox DaemonSet: Pre-pulls timing-sensitive platform images via a staged-busybox DaemonSet before install so first-install reconciliation does not race image pulls (@myasnikovdaniil in #2724).

  • [test/e2e] Fix bucket.bats port-forward and S3 client reliability: Hardens the bucket.bats test's port-forward and S3 client interaction for reliability (@myasnikovdaniil in #2944).

  • [test/e2e] ouroboros fold + CSI/NFS ordering + OIDC keycloakInternalUrl: Folds the ouroboros E2E case, fixes CSI/NFS ordering, and corrects the OIDC keycloakInternalUrl in the suite (@myasnikovdaniil in #2728).

  • [test/e2e] Silence helm-template render in install-cozystack trace: Silences the noisy helm-template render output in the install trace so failures are easier to read (@myasnikovdaniil in #2615).

  • [tests] Exclude loop devices from host LVM scanning: Excludes loop devices from host LVM scanning in the test harness so loop-mounted images do not get their volume groups activated (@kvaps in #2798).

  • [e2e-sandbox] Use curl -fsSL when downloading mc: The MinIO client was downloaded without -L, so a 302 redirect from dl.min.io wrote the HTML redirect body to /usr/local/bin/mc and the bucket E2E test then failed trying to execute it. Adding -fsSL follows the redirect and fails fast on HTTP errors (shipped in v1.4.1) (@myasnikovdaniil in #2690).

  • [test/metallb] Assert digest-pinned image form, not exact version literal: Updates the metallb test to assert the digest-pinned image form rather than an exact version literal, so the test survives version bumps (@myasnikovdaniil in #2873).

  • [test/api] Add read-path authorization tests for TenantNamespace registry: Adds read-path authorization tests for the TenantNamespace registry to lock down the IDOR fixes (@lexfrei in #2864).

  • [feat/hack] Rewrite check-readiness as a Go command with test suite: Rewrites the check-readiness helper as a Go command with a test suite, replacing the previous shell implementation (@myasnikovdaniil in #2755).

  • [chore/controller] Remove dead dashboard controller: Removes the now-dead dashboard controller code left behind after the schema-driven UI rewrite (@kvaps in #2694).

  • [chore/img] Refresh Cozystack logo and add icon/stacked variants: Refreshes the Cozystack logo and adds icon and stacked logo variants (@kvapsova in #2746).

  • [chore/maintainers] Update roster, add Emeritus section, mark Timofei Larkin (@lllamnyp) as Core Maintainer: Updates the maintainer roster, adds an Emeritus section, documents Emeritus technical offboarding and reactivation, and marks @lllamnyp as a Core Maintainer (@tym83 in #2717, #2718, #2719).

  • [docs/agents] Add E2E/CI testing conventions and consolidate build/test/verify guidance: Adds the E2E and CI testing conventions doc and consolidates the build, test, and verification guidance for agents and contributors (@myasnikovdaniil in #2932, #2933).

  • [docs/release] Expand release.md into a release-engineer playbook: Expands release.md into a full release-engineer playbook covering the release process end to end (@myasnikovdaniil in #2732).
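
The single-deadline approach mentioned for the LINSTOR post-install waits above can be sketched as follows. The E2E suite's own implementation is not shown in these notes, so `wait_all` and its parameters are hypothetical; the sketch only demonstrates several sequential waits sharing one overall time budget instead of each having its own timeout.

```python
import time
from typing import Callable, Iterable


def wait_all(
    checks: Iterable[Callable[[], bool]],
    deadline_seconds: float,
    poll_interval: float = 0.01,
) -> bool:
    """Run readiness checks in order against one shared deadline."""
    deadline = time.monotonic() + deadline_seconds
    for check in checks:
        # Poll this check until it passes or the shared budget runs out.
        while not check():
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_interval)
    return True
```

With a shared deadline, a slow early step simply leaves less time for later ones, so the total wait is bounded by one value (15 minutes in the suite) rather than the sum of per-step timeouts.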

Documentation

  • [harbor] Document S3 object-storage (SeaweedFS) prerequisite: Documents that Harbor requires an S3 object-storage backend (SeaweedFS) and how to satisfy the prerequisite (@lexfrei in #2818).

  • [gpu-operator] Document passthrough variant host-driver prerequisite: Documents the host-driver prerequisite for the gpu-operator passthrough variant (@lexfrei in #2767).

  • [website] KubeVirt VM Disk and VM Instance how-to: New how-to explaining the VMDisk and VMInstance primitives — independent disk/compute lifecycles, golden images, cloning, fast provisioning, with Dashboard and kubectl examples and VM access methods (serial console, SSH, VNC) (@tym83 in cozystack/website#586).

  • [website] etcd-operator v1alpha2 announcement and migration guide: Announces the donation of etcd-operator to Cozystack and the new etcd-operator.cozystack.io/v1alpha2 API — a Membership-API lifecycle replacing the StatefulSet model — with the in-place v1alpha1 → v1alpha2 migration path (@kvaps in cozystack/website#573).

The following documentation shipped in the v1.4.x patch releases and is included for completeness:

  • [website] Add backup and recovery guides for managed applications: Operator and tenant guides for managed-application backups covering PostgreSQL, MariaDB, ClickHouse, and FoundationDB (shipped in v1.4.1) (@androndo in cozystack/website#536).
  • [website] Cilium Gateway API — architecture, security, and migration guide: Comprehensive networking/gateway-api.md page for the Cilium-backed Gateway API feature (shipped in v1.4.2) (@lexfrei in cozystack/website#509).
  • [website] Publish Kubernetes API endpoint via external-dns with kuberture: Documents exposing the managed Kubernetes API endpoint through external-dns using the kuberture system package (shipped in v1.4.3) (@lexfrei in cozystack/website#539).
  • [ingress] Explain how ingress works in the platform: Adds a "How ingress works" overview to the ingress package README (shipped in v1.4.3) (@myasnikovdaniil in #2770).
  • [website] Managed Kubernetes how-to: Practical guide for deploying and using managed Kubernetes clusters within Cozystack (shipped in v1.4.3) (@tym83 in cozystack/website#565).
  • [website] Platform-managed backups introduction: Introduces Cozystack's platform-managed backup capabilities for stateful workloads (shipped in v1.4.3) (@tym83 in cozystack/website#566).
  • [website] talm: document DRBD sysctl tuning, keepalive toggle, and etcd quota: Documents the talm DRBD sysctl performance tuning, the DRBD keepalive toggle, and the etcd backend quota configuration that ship in talm v0.31.0 (shipped in v1.4.3) (@lexfrei in cozystack/website#567).
  • [website] Cozystack vs OpenStack comparison: Vendor-neutral comparison post covering seven dimensions — architecture, compute, networking, storage, managed services, operations, and multi-tenancy — for teams evaluating private cloud options (shipped in v1.4.4) (@tym83 in cozystack/website#580).

Other Repositories

talm v0.31.0 (shipped in v1.4.4)

  • [talm] Add DRBD-oriented sysctl and etcd backend defaults: Adds production-tested TCP sysctls to the cozystack preset that prevent TCP port exhaustion during DRBD reconnect storms, an opt-in tcpKeepaliveTuning group, and a configurable etcd.quotaBackendBytes (default 8 GiB) (@IvanHunters in cozystack/talm#131).
  • [talm] Emit progress on stderr, enrich lookup error chain, retry transient failures: Fixes talm template -f X > Y emitting a stray progress line on stdout, enriches lookup-failure errors with a six-class hint taxonomy, and retries transient connectivity failures with exponential backoff (@lexfrei in cozystack/talm#212).
  • [talm] Require a template before rendering the chart: Running talm template without --file/--template now fails fast with "templates are not set" instead of triggering live node discovery and a misleading "connection refused" (@lexfrei in cozystack/talm#217).
  • [talm] Exclude loop devices from LVM global_filter: Adds /dev/loop.* to the LVM global_filter in the generated Talos machine config to prevent unexpected LVM activation inside loop-mounted images (@kvaps in cozystack/talm#215).
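
The retry behaviour mentioned in the talm notes above (transient connectivity failures retried with exponential backoff) can be sketched generically as follows. This is a minimal illustration, not talm's implementation: the TransientError class, the function name, and the delay parameters are all hypothetical.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (e.g. connection refused)."""


def retry_with_backoff(operation, attempts=4, base_delay=0.5, factor=2.0, sleep=time.sleep):
    """Call operation(); on TransientError wait base_delay * factor**n, then retry."""
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * factor ** attempt)


# Usage: an operation that fails twice, then succeeds.
calls = {"n": 0}


def flaky_lookup():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("connection refused")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = retry_with_backoff(flaky_lookup, sleep=delays.append)
print(result, delays)
```

Passing the sleep function as a parameter keeps the sketch testable without real waits; a production version would usually also cap the delay and add jitter.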

ansible-cozystack (shipped in v1.4.x)

  • [ansible-cozystack] Exclude loop and virtual devices from host LVM scanning: Sets an LVM global_filter on all prepare playbooks (Ubuntu, RHEL, SUSE) so the host does not scan or activate DRBD, device-mapper, zd, or loop-backed volume groups; exposed as cozystack_lvm_global_filter (@kvaps in cozystack/ansible-cozystack#49).
  • [ansible-cozystack] Enable containerd device_ownership_from_security_context for CDI block imports: Adds a k3s containerd drop-in enabling device_ownership_from_security_context so the KubeVirt CDI importer can write VM disk images into raw block volumes instead of failing with "Permission denied" (@lexfrei in cozystack/ansible-cozystack#48).

Contributors

We'd like to thank all contributors who made this release possible:

New Contributors

We're excited to welcome our first-time contributor:

Full Changelog: v1.4.0...v1.5.0

Download cozystack

k3s sandbox

Lightweight Kubernetes

v1.36.2+k3s1

This release updates Kubernetes to v1.36.2, and fixes a number of issues.

For more details on what's new, see the Kubernetes release notes.

Changes since v1.36.1+k3s1:

  • Backport GitHub Action SHA pin updates from main (#14127)
  • Backports for 2026-06 (#14151)
  • Bump v3.7.4 Traefik (#14193)
  • More backports for 2026-06 (#14211)
  • Testing Backports 2026-06 (#14213)
  • Bump klipper-helm for CVE reasons (#14235) (#14236)
  • Bump containerd with the fix for []byte (#14243)
  • Update to v1.36.2-k3s1 and Go 1.26.4 (#14230)
  • Bump containerd to v2.3.2-k3s1 (#14254)
  • Bump cri-api and containerd for upstream env string fix (#14277)

Embedded Component Versions

Component Version
Kubernetes v1.36.2
Kine v0.16.1
SQLite 3.53.0
Etcd v3.6.12-k3s1
Containerd v2.3.2-k3s2
Runc v1.4.2
Flannel v0.28.4
Metrics-server v0.8.1
Traefik v3.7.4
CoreDNS v1.14.4
Helm-controller v0.17.1
Local-path-provisioner v0.0.36

Helpful Links

As always, we welcome and appreciate feedback from our community of users. Please feel free to:

k3s sandbox

Lightweight Kubernetes

v1.35.6+k3s1

This release updates Kubernetes to v1.35.6, and fixes a number of issues.

For more details on what's new, see the Kubernetes release notes.

Changes since v1.35.5+k3s1:

  • Backport GitHub Action SHA pin updates from main (#14126)
  • Backports for 2026-06 (#14152)
  • Bump v3.7.4 Traefik (#14194)
  • More backports for 2026-06 (#14212)
  • Testing Backports 2026-06 (#14214)
  • Bump klipper-helm for CVE reasons (#14235) (#14237)
  • Bump containerd to fix []byte envvar value (#14241)
  • Update to v1.35.6-k3s1 and Go 1.25.11 (#14229)
  • Bump containerd for 1.35 (#14251)
  • Bump cri-api and containerd for upstream env string fix (#14278)

Embedded Component Versions

Component Version
Kubernetes v1.35.6
Kine v0.16.1
SQLite 3.53.0
Etcd v3.6.12-k3s1
Containerd v2.2.5-k3s2
Runc v1.4.2
Flannel v0.28.4
Metrics-server v0.8.1
Traefik v3.7.4
CoreDNS v1.14.4
Helm-controller v0.17.1
Local-path-provisioner v0.0.36

k3s sandbox

Lightweight Kubernetes

v1.34.9+k3s1

This release updates Kubernetes to v1.34.9, and fixes a number of issues.

For more details on what's new, see the Kubernetes release notes.

Changes since v1.34.8+k3s1:

  • Backport GitHub Action SHA pin updates from main (#14125)
  • Backports for 2026-06 (#14154)
  • Bump v3.7.4 Traefik (#14195)
  • More backports for 2026-06 (#14216)
  • Testing Backports 2026-06 (#14215)
  • Bump klipper-helm for CVE reasons (#14239)
  • Bump containerd with the fix for []byte (#14242)
  • Update to v1.34.9-k3s1 and Go 1.25.11 (#14228)
  • Bump containerd to v2.2.5-k3s1 (#14255)
  • Bump cri-api and containerd for upstream env string fix (#14279)

Embedded Component Versions

Component Version
Kubernetes v1.34.9
Kine v0.16.1
SQLite 3.53.0
Etcd v3.6.12-k3s1
Containerd v2.2.5-k3s2
Runc v1.4.2
Flannel v0.28.4
Metrics-server v0.8.1
Traefik v3.7.4
CoreDNS v1.14.4
Helm-controller v0.17.1
Local-path-provisioner v0.0.36

k3s sandbox

Lightweight Kubernetes

v1.33.13+k3s1

This release updates Kubernetes to v1.33.13, and fixes a number of issues.

For more details on what's new, see the Kubernetes release notes.

Changes since v1.33.12+k3s1:

Embedded Component Versions

Component Version
Kubernetes v1.33.13
Kine v0.16.1
SQLite 3.53.0
Etcd v3.6.12-k3s1
Containerd v2.2.5-k3s1.33
Runc v1.4.2
Flannel v0.28.4
Metrics-server v0.8.1
Traefik v3.7.4
CoreDNS v1.14.4
Helm-controller v0.17.1
Local-path-provisioner v0.0.36

LoxiLB sandbox

eBPF based cloud-native load-balancer. Powering Kubernetes|Edge|5G|IoT|XaaS Apps.

vlatest

Merge pull request #874 from TrekkieCoder/main

gh-868 Generate packages runnable with systemd

Keycloak incubating

Keycloak is an open-source identity and access management solution for modern applications and services, built on top of industry security standard protocols.

26.6.4

Set version to 26.6.4

Kgateway sandbox

An Envoy-powered, Kubernetes-native API Gateway that integrates Kubernetes Gateway API with a control plane for API connectivity in any cloud environment.

v2.4.0-alpha.2

🎉 Welcome to the v2.4.0-alpha.2 release of the kgateway project!

Release Notes

Changes since v2.4.0-alpha.1

New Features

  • Added BackendConfigPolicy zone-aware routing with native Envoy prefer-local and force-local support, including bootstrap locality wiring for Envoy proxies. (#13978)
  • Add requestAttributes field to GatewayExtension ext_proc config, allowing Envoy attributes (e.g. source.address) to be forwarded to ext_proc servers. (#14109)
  • Added Envoy local reply configuration to ListenerPolicy (#14146)
  • Added an AssumeRole AWS auth type to the Backend API (spec.aws.auth.assumeRole) for per-Backend STS role chaining, used for both Lambda request signing and EC2 instance discovery. The previous spec.aws.ec2.roleArn field is replaced by spec.aws.auth.assumeRole.roleArn. (#14148)
  • EC2 backends now report an EndpointsDiscovered status condition reflecting whether runtime endpoint discovery succeeded, including credential, authorization, and zero-match failures. (#14173)
  • Added reference grant mode (#14209)
  • Added support for GatewayHTTPListenerIsolation conformance behavior for HTTP listeners. (#14234)

Bug Fixes

  • kgateway no longer overwrites an existing Kubernetes Service for a Gateway unless the Service has a matching Gateway ownerReference or kgateway ownership metadata. (#14145)
  • Fixed a bug where a BackendConfigPolicy health check host (HTTP host or gRPC authority) was ignored for Static backends because the endpoint-level hostname overrode it, causing health checks to use the wrong Host header. (#14201)
  • Fixed route Hostname/ServiceEntry backendRefs to the requested port on multi-port hosts (#14212)
  • Advertise support for the Gateway API BackendTLSPolicySANValidation conformance feature. (#14214)
  • ServiceEntry clusters with workloadSelector-backed pod endpoints now respect pod readiness: NotReady pods are excluded from routing, matching the EndpointSlice-based Service behavior and enabling locality failover when all locally selected pods are NotReady. WorkloadEntry and inline ServiceEntry endpoints are unaffected. (#14222)
  • Fixed EC2 backend discovery serving endpoints resolved under an outdated config (e.g. the old port) for up to a refresh interval after a Backend spec change. Spec changes and newly created EC2 backends now trigger an immediate discovery refresh, and a credential rotation combined with a transient AWS API failure no longer drops healthy endpoints. (#14228)
  • Fixed FrontendTLS CA certificate references so that they are resolved in the Gateway's namespace when listeners are contributed by a ListenerSet, rather than incorrectly in the ListenerSet's namespace. (#14232)
  • Strict validation (KGW_VALIDATION_MODE=STRICT) now caches validation verdicts by config content hash, eliminating redundant envoy invocations across per-client translation and recomputes. New settings: KGW_VALIDATOR_MODE (CACHE [default] | BINARY) and KGW_VALIDATOR_CACHE_SIZE (default 4096). (#14253)
  • Fix ListenerPolicy with clientCertificateValidation not being marked as Attached if there are other policies applied to the same target (#14278)
  • Fix excessive DNS queries from gateway pods by rendering the xDS cluster address as a rooted FQDN (trailing dot), preventing DNS search-domain expansion under the default ndots:5 resolver config. (#14291)
  • Fixed a bug where a Gateway, Route, Backend, or ListenerSet status observedGeneration could intermittently freeze at a stale value after a spec change, due to a skew between the translation cache and the status syncer's cache. (#14302)
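
The strict-validation change above caches verdicts by a hash of the config content. A minimal sketch of that idea follows. It is not kgateway's code: the class name, the validator callback, and the least-recently-used eviction policy are assumptions made for illustration, and the release notes do not state how entries are evicted or whether the cache size counts entries.

```python
import hashlib
from collections import OrderedDict


class VerdictCache:
    """Cache of validation verdicts keyed by a SHA-256 hash of the config content."""

    def __init__(self, validate, max_size=4096):
        self._validate = validate          # expensive validator (hypothetical callback)
        self._max_size = max_size
        self._verdicts = OrderedDict()

    def check(self, config: bytes) -> bool:
        key = hashlib.sha256(config).hexdigest()
        if key in self._verdicts:
            self._verdicts.move_to_end(key)     # mark as recently used
            return self._verdicts[key]
        verdict = self._validate(config)        # only runs on a cache miss
        self._verdicts[key] = verdict
        if len(self._verdicts) > self._max_size:
            self._verdicts.popitem(last=False)  # evict the least recently used entry
        return verdict


invocations = []


def slow_validator(config: bytes) -> bool:
    """Stand-in for an expensive validation call; records each invocation."""
    invocations.append(config)
    return b"invalid" not in config


cache = VerdictCache(slow_validator, max_size=2)
results = [cache.check(b"listener: a"), cache.check(b"listener: a"), cache.check(b"invalid: x")]
print(results, len(invocations))
```

Identical content hashes to the same key, so the second check of the same config never reaches the validator.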

Cleanup

  • Added kgateway validation metrics for Envoy validation calls, cache behavior, results, and duration by validation caller. (#14026)
  • Reduced controller memory usage by interning policy ref ID strings retained in policy merge tracking. (#14217)
  • Fixed GatewayExtension equality to include the object source, and Listener equality to ignore parent object metadata churn (e.g. resourceVersion bumps from status writes), preventing missed updates and spurious recomputation. (#14248)
  • Improved strict HTTPRoute validation performance by batching full-route Envoy validation per virtual host. (#14269)

Dependency Updates

  • Upgraded Envoy to v1.38.3 (#14314)

Contributors

Thanks to all the contributors who made this release possible:

@andy-fong @artberger @ayushi-work @chandler-solo @danehans @davidjumani @JCigan @livegrenier @marvin-roesch @MayorFaj @nmnellis @puertomontt @Soham271 @ymesika

Installation

The kgateway project is available as a Helm chart and docker images.

Helm Charts

The Helm charts are available at:

Docker Images

The docker images are available at:

  • cr.kgateway.dev/kgateway-dev/kgateway:v2.4.0-alpha.2
  • cr.kgateway.dev/kgateway-dev/sds:v2.4.0-alpha.2
  • cr.kgateway.dev/kgateway-dev/envoy-wrapper:v2.4.0-alpha.2

Quickstart

Try installing this release:

helm install kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds --version v2.4.0-alpha.2 --namespace kgateway-system --create-namespace
helm install kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --version v2.4.0-alpha.2 --namespace kgateway-system --create-namespace

For detailed installation instructions and next steps, please visit our quickstart guide.

Meshery sandbox

As a self-service engineering platform, Meshery enables collaborative design and operation of cloud and cloud native infrastructure.

Meshery v1.0.47

What's New

🔤 General

🖥 Meshery UI

🧰 Maintenance

  • Add Telemetry support with Grafana dashboards and Prometheus metrics @aabidsofi19 (#20161)

📖 Documentation

👨🏽‍💻 Contributors

Thank you to our contributors for making this release possible:
@YASHMAHAKAL, @aabidsofi19, @alexquincy, @iyush05, @leecalcote, @sangramrath, @saurabhraghuvanshii and @vedant21-ctr

Dalec sandbox

Dalec provides a declarative format for building system packages and containers from those packages in a secure way for supply chain security.

v0.21.2

What's Changed

  • Make generated LLB deterministic by sorting map iteration by @cpuguy83 in #1111
  • testrunner: use buildkit PassthroughOp for validation-only steps by @cpuguy83 in #1099
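
The first change above makes generated output deterministic by sorting map iteration. Dalec is written in Go, where map iteration order is not stable between runs. The Python sketch below only illustrates the general technique of visiting keys in sorted order so that the output does not depend on how the map was built. The render function and the "ENV" lines are hypothetical and are not Dalec's LLB.

```python
def render(env: dict) -> list:
    """Emit one line per entry, always visiting keys in sorted order."""
    return [f"ENV {key}={env[key]}" for key in sorted(env)]


# Two maps with the same contents but built in a different order.
first = render({"PATH": "/usr/bin", "HOME": "/root"})
second = render({"HOME": "/root", "PATH": "/usr/bin"})

print(first == second)  # True: the output depends only on the contents
```

Stable output of this kind matters for build systems because identical inputs then produce identical cache keys.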

Full Changelog: v0.21.1...v0.21.2

OpenCost incubating

OpenCost provides visibility into current and historical Kubernetes spend and resource allocation.

v1.120.4

What's Changed

New Contributors

Full Changelog: v1.120.3...v1.120.4

OpenCost incubating

OpenCost provides visibility into current and historical Kubernetes spend and resource allocation.

v1.120

Release OpenCost v1.120.4

kube-vip sandbox

Kubernetes Virtual IP and Load-Balancer for both control plane and Kubernetes services

v1.2.1

What's Changed

New Contributors

Full Changelog: v1.2.0...v1.2.1

cert-manager graduated

cert-manager is a powerful and extensible X.509 certificate controller for Kubernetes and OpenShift workloads. It will obtain certificates from a variety of Issuers, both popular public Issuers as well as private Issuers, and ensure the certificates are valid and up-to-date, and will attempt to renew certificates at a configured time before expiry.

v1.19.6

cert-manager is the easiest way to automatically manage certificates in Kubernetes and OpenShift clusters.

This patch release fixes a security issue (GHSA-8rvj-mm4h-c258, HIGH) where the default cert-manager-edit aggregate ClusterRole granted namespace users permission to create ACME Challenge and Order resources directly. A user who could create a Challenge referencing a ClusterIssuer could supply attacker-controlled solver configuration while cert-manager loaded credentials from the ClusterIssuer's namespace, bypassing Issuer solver selectors (dnsZones, dnsNames, matchLabels). With the acme-dns provider specifically, this could disclose DNS credentials to an attacker-controlled endpoint.

This release also includes Go version bumps to address reported CVEs. All users should upgrade.

Warning

Potentially breaking change: The cert-manager-edit aggregate ClusterRole no longer grants create for challenges.acme.cert-manager.io or create, patch, update for orders.acme.cert-manager.io. These resources are internal to cert-manager's ACME workflow and are not intended to be created or modified directly by users. If you have tooling or workflows that create Challenge or Order resources directly (outside of the normal Certificate → CertificateRequest → Order → Challenge flow), you will need to grant those permissions explicitly.
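
For tooling that genuinely needs the removed permissions, one way to grant them explicitly is a separate ClusterRole. The sketch below builds such a manifest as JSON. The role name acme-internal-editor is hypothetical, and the rules simply mirror the verbs and resources listed in the warning above. Treat it as an illustration under those assumptions rather than a recommended configuration.

```python
import json

# Rules mirror what the cert-manager-edit role no longer grants.
cluster_role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "ClusterRole",
    "metadata": {"name": "acme-internal-editor"},  # hypothetical name
    "rules": [
        {
            "apiGroups": ["acme.cert-manager.io"],
            "resources": ["challenges"],
            "verbs": ["create"],
        },
        {
            "apiGroups": ["acme.cert-manager.io"],
            "resources": ["orders"],
            "verbs": ["create", "patch", "update"],
        },
    ],
}

print(json.dumps(cluster_role, indent=2))
```

The printed JSON can be applied like any other manifest. Because these permissions are what the security fix removed from the default role, binding the new role only to the specific trusted subjects that need it is the safer choice.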

Changes by Kind

Bug or Regression

Other (Cleanup or Flake)

cert-manager graduated

cert-manager is a powerful and extensible X.509 certificate controller for Kubernetes and OpenShift workloads. It will obtain certificates from a variety of Issuers, both popular public Issuers as well as private Issuers, and ensure the certificates are valid and up-to-date, and will attempt to renew certificates at a configured time before expiry.

v1.20.3

cert-manager is the easiest way to automatically manage certificates in Kubernetes and OpenShift clusters.

This patch release fixes a security issue (GHSA-8rvj-mm4h-c258, HIGH) where the default cert-manager-edit aggregate ClusterRole granted namespace users permission to create ACME Challenge and Order resources directly. A user who could create a Challenge referencing a ClusterIssuer could supply attacker-controlled solver configuration while cert-manager loaded credentials from the ClusterIssuer's namespace, bypassing Issuer solver selectors (dnsZones, dnsNames, matchLabels). With the acme-dns provider specifically, this could disclose DNS credentials to an attacker-controlled endpoint.

This release also removes the issuer owner reference from Challenges which was blocking Challenge garbage collection, and updates Go to fix reported CVEs.

All users should upgrade.

Warning

Potentially breaking change: The cert-manager-edit aggregate ClusterRole no longer grants create for challenges.acme.cert-manager.io or create, patch, update for orders.acme.cert-manager.io. These resources are internal to cert-manager's ACME workflow and are not intended to be created or modified directly by users. If you have tooling or workflows that create Challenge or Order resources directly (outside of the normal Certificate → CertificateRequest → Order → Challenge flow), you will need to grant those permissions explicitly.

Changes by Kind

Bug or Regression

Other (Cleanup or Flake)

Kgateway sandbox

An Envoy-powered, Kubernetes-native API Gateway that integrates Kubernetes Gateway API with a control plane for API connectivity in any cloud environment.

v2.3.5

🎉 Welcome to the v2.3.5 release of the kgateway project!

Release Notes

Changes since v2.3.4

Bug Fixes

  • Fixed BackendConfigPolicy health check host and gRPC authority overrides for static backends. (#14290)
  • Use a rooted FQDN for the default xDS service address to avoid DNS search-domain expansion during Envoy DNS resolution. (#14296)
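
To see why a rooted FQDN avoids search-domain expansion, the sketch below models, in simplified form, how a stub resolver with an ndots setting chooses which names to query. The service name and search domains are hypothetical examples of what a pod in a kgateway-system namespace might see. The model ignores details such as query types and early termination on the first successful answer.

```python
def candidate_queries(name, search_domains, ndots=5):
    """Return the names a stub resolver would try, in order (simplified model)."""
    if name.endswith("."):
        return [name]  # rooted name: treated as absolute, no search expansion
    expanded = [f"{name}.{domain}." for domain in search_domains]
    absolute = name + "."
    if name.count(".") >= ndots:
        return [absolute] + expanded  # enough dots: try the name as given first
    return expanded + [absolute]      # too few dots: search domains come first


# Hypothetical pod resolver configuration and service name (4 dots, below ndots:5).
search = ["kgateway-system.svc.cluster.local", "svc.cluster.local", "cluster.local"]
host = "xds.kgateway-system.svc.cluster.local"

relative = candidate_queries(host, search)
rooted = candidate_queries(host + ".", search)
print(len(relative), len(rooted))
```

With the trailing dot, the resolver issues a single lookup instead of first walking every search domain.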

Dependency Updates

  • Upgraded Envoy to 1.37.5 (#14300)

Contributors

Thanks to all the contributors who made this release possible:

@andy-fong @JCigan @puertomontt

Installation

The kgateway project is available as a Helm chart and docker images.

Helm Charts

The Helm charts are available at:

Docker Images

The docker images are available at:

  • cr.kgateway.dev/kgateway-dev/kgateway:v2.3.5
  • cr.kgateway.dev/kgateway-dev/sds:v2.3.5
  • cr.kgateway.dev/kgateway-dev/envoy-wrapper:v2.3.5

Quickstart

Try installing this release:

helm install kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds --version v2.3.5 --namespace kgateway-system --create-namespace
helm install kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --version v2.3.5 --namespace kgateway-system --create-namespace

For detailed installation instructions and next steps, please visit our quickstart guide.

Kgateway sandbox

An Envoy-powered, Kubernetes-native API Gateway that integrates Kubernetes Gateway API with a control plane for API connectivity in any cloud environment.

v2.2.8

🎉 Welcome to the v2.2.8 release of the kgateway project!

Release Notes

Changes since v2.2.7

Dependency Updates

  • Upgraded Envoy to 1.36.9 (#14301)

Installation

The kgateway project is available as a Helm chart and docker images.

Helm Charts

The Helm charts are available at:

Docker Images

The docker images are available at:

  • cr.kgateway.dev/kgateway-dev/kgateway:v2.2.8
  • cr.kgateway.dev/kgateway-dev/sds:v2.2.8
  • cr.kgateway.dev/kgateway-dev/envoy-wrapper:v2.2.8

Quickstart

Try installing this release:

helm install kgateway-crds oci://cr.kgateway.dev/kgateway-dev/charts/kgateway-crds --version v2.2.8 --namespace kgateway-system --create-namespace
helm install kgateway oci://cr.kgateway.dev/kgateway-dev/charts/kgateway --version v2.2.8 --namespace kgateway-system --create-namespace

For detailed installation instructions and next steps, please visit our quickstart guide.

gRPC incubating

A high performance, open source universal RPC framework.

Release v1.82.0-pre2

This is a prerelease of gRPC Core 1.82.0 (glacier).

For gRPC documentation, see grpc.io. For previous releases, see Releases.

This prerelease contains refinements, improvements, and bug fixes.

Cadence Workflow sandbox

Cadence is a distributed, scalable, durable, and highly available fault-oblivious stateful code platform.

v1.4.1-prerelease32: fix(dlq): read nil task panic in ReadDLQMessages (#8280)

What changed?

"cadence admin dlq read" calls this endpoint and gets a nil pointer error.

Why?

cadence --ct 100 --env productionX admin dlq read --shards X --source_cluster X

Error: failed to read DLQ messages in shard 10269: fail to read dlq message for shard: X: code:unknown message:code:unknown message:panic: runtime error: invalid memory address or nil pointer dereference

How did you test it?

Unit test

Co-authored-by: Cursor cursoragent@cursor.com

KitOps sandbox

An open standard for packaging, managing, and deploying ML models and artifacts across different systems

Release v1.15.0

Welcome to the v1.15.0 release of Kit! This release improves support for the CNCF ModelPack format, adds support for more options when packing ModelKit data, and adds support for MCP bundles.

New Features

Improved CNCF ModelPack support

Kit is now better at handling CNCF ModelPack artifacts. When working with a ModelPack that was not generated by Kit, Kit can interpret the fields normally found within a Kitfile from the ModelPack's configuration and annotations.

To use Kit to package artifacts in the ModelPack format instead of ModelKit, you can use the flag --use-model-pack for the pack command. Note that not all KitOps features are supported by ModelPacks currently.

Ultimately, you should be able to use ModelPacks with KitOps, regardless of how they were created. For example, the Docker CLI recently added support for creating ModelPacks, and those are now compatible with KitOps.

For more details, see PR #1202

Additional format and compression options for artifact layers

By default, Kit packages each layer (model weights, datasets, code, etc.) as an uncompressed tarball, which provides a good tradeoff in terms of usability, size, and speed. With KitOps v1.15.0, the following extra options are now available while packing ModelKits (or ModelPacks):

  • Compression options: in addition to the existing --compression=gzip option, KitOps now supports using Zstandard compression via the flag --compression=zstd.

  • Layer format options: instead of packaging files and/or directories as tarballs, Kit can now package single files as-is, i.e. with no tar wrapper. The primary benefit of the raw format is that the layer DiffID (and the digest, in the uncompressed case) matches the actual SHA256 sum of the file on disk, making it easier to track which files are stored in which ModelKits.

    To enable raw layers, use the flag --layer-format=raw on the pack command, though note that this flag currently applies to all layers in the Kitfile.

These changes were added as part of improving ModelPack support; for more information, see PR #1202
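
The digest property of raw layers described above can be illustrated with a short sketch. It does not use Kit itself: it only compares the SHA256 of some example bytes with the SHA256 of the same bytes wrapped in a tar archive. The payload and the file name model.bin are stand-ins.

```python
import hashlib
import io
import tarfile

payload = b"example model weights"  # stand-in for a file on disk
file_sha256 = hashlib.sha256(payload).hexdigest()

# Raw layer: the layer bytes are exactly the file bytes.
raw_layer_digest = hashlib.sha256(payload).hexdigest()

# Tar layer: the same file wrapped in an uncompressed tar archive.
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as archive:
    info = tarfile.TarInfo(name="model.bin")
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))
tar_layer_digest = hashlib.sha256(buffer.getvalue()).hexdigest()

print(raw_layer_digest == file_sha256)  # True
print(tar_layer_digest == file_sha256)  # False: tar headers and padding change the bytes
```

This is why a raw layer's digest can be checked directly against the output of a tool such as sha256sum run on the original file.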

MCP support in ModelKits

With Kit v1.15.0, the Kitfile now has an mcpServers section, which can be used to package MCP bundle (.mcpb) files in a new layer format. These bundles are a common way to distribute MCP servers, and adding them to the Kitfile allows for building automation and management around MCP servers specifically.

For more details on how MCP bundles are handled, see PR #1214

Bug Fixes

New Contributors

Full Changelog: v1.14.0...v1.15.0