Security Hardening Guide

Introduction

openDesk Edu is built on a defense-in-depth security model. The platform integrates multiple overlapping security layers — Kubernetes-native controls, service-level authentication via Keycloak, network segmentation through Kubernetes NetworkPolicies, encrypted backups via k8up/restic, and TLS termination with automated certificate management. This layered approach ensures that the compromise of any single component does not automatically lead to a full system breach.

This guide extends beyond the default deployment configuration. The base Helm chart values provide a functional and reasonably secure starting point, but production deployments — particularly those at German universities handling personal data under DSGVO and BDSG — require additional hardening. Each section below addresses a specific attack surface and provides concrete, actionable configuration.

Target audience: University IT security teams, DevOps engineers, and system administrators responsible for operating openDesk Edu in compliance with institutional security policies and German data protection law.

Prerequisites: You should have a running openDesk Edu deployment and administrative access to the Kubernetes cluster (kubectl, helmfile access).

Kubernetes Cluster Security

Pod Security Standards

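openDesk Edu services should run under the Kubernetes restricted Pod Security Standard (PSS). This profile prohibits privileged containers, host namespaces (including host networking), and hostPath volumes. It also requires containers to run as non-root, drop all Linux capabilities, disallow privilege escalation, and use a RuntimeDefault or Localhost seccomp profile. A read-only root filesystem is not part of the profile, but it is recommended as an additional measure.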

Enforce the restricted profile at the namespace level:

apiVersion: v1
kind: Namespace
metadata:
  name: opendesk-edu
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/audit: restricted
    pod-security.kubernetes.io/warn: restricted
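
A workload is only admitted into a namespace enforcing the restricted profile if its pod spec sets the required security context fields. The following is a minimal sketch, not taken from the openDesk Edu charts: the Deployment name, labels, and image are placeholders, and readOnlyRootFilesystem goes beyond what the profile itself demands.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-service            # hypothetical name
  namespace: opendesk-edu
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: example-service
  template:
    metadata:
      labels:
        app.kubernetes.io/name: example-service
    spec:
      securityContext:
        runAsNonRoot: true
        seccompProfile:
          type: RuntimeDefault
      containers:
        - name: app
          image: registry.example.org/example-service:1.0.0   # placeholder image
          securityContext:
            allowPrivilegeEscalation: false
            readOnlyRootFilesystem: true   # optional hardening, not required by the profile
            capabilities:
              drop: ["ALL"]
          volumeMounts:
            - name: tmp
              mountPath: /tmp              # writable scratch space for the read-only root
      volumes:
        - name: tmp
          emptyDir: {}
```

Applications that write outside /tmp will need further emptyDir or PVC mounts before the read-only root filesystem works.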

Some services (e.g., BigBlueButton's webcams-needed container, certain monitoring agents) may require baseline profile exceptions. For these, create dedicated namespaces with the appropriate label and isolate them with NetworkPolicies. Never grant privileged access without explicit security review.

Network Policies

The default deployment applies a permissive network policy. For production, implement a default-deny policy and whitelist only required traffic.

Default deny ingress and egress:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: opendesk-edu
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress

Service-specific allow rules — example: ILIAS can only communicate with its database and Keycloak:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ilias-egress
  namespace: opendesk-edu
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: ilias
  policyTypes:
    - Egress
  egress:
    # Database access (PostgreSQL)
    - ports:
        - port: 5432
          protocol: TCP
      to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: postgresql
              app.kubernetes.io/component: ilias-db
    # Keycloak authentication
    - ports:
        - port: 8080
          protocol: TCP
      to:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: keycloak
    # DNS resolution (cluster DNS)
    - ports:
        - port: 53
          protocol: UDP
        - port: 53          # DNS falls back to TCP for large responses
          protocol: TCP
      to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns

The same principle applies to all services. The following table summarizes recommended egress rules:

Service	DB Access	Keycloak	Internet	Other Services
ILIAS	Yes	Yes	No	No
Moodle	Yes	Yes	No	No
Nextcloud	Yes	Yes	Limited*	Collabora
Grommunio	Yes	Yes	Yes†	DNS, NTP
BigBlueButton	No	Yes	No	No
Element	Yes	Yes	Yes	Matrix fed.
XWiki	Yes	Yes	No	No
OpenProject	Yes	Yes	No	No
Monitoring	No	No	No	API server

* Nextcloud requires internet access only for app store updates and external storage. Block on non-admin networks.
† Grommunio requires SMTP/IMAP access to external mail servers; restrict egress to the IP ranges of partner mail exchangers.
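
An IP-restricted egress rule such as the one described for Grommunio can be expressed with an ipBlock peer. This is a sketch under assumptions: the pod label is assumed to follow the same convention as the ILIAS example, the CIDR is a documentation range standing in for the real mail exchanger ranges, and the port list should be trimmed to the protocols actually in use.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-grommunio-mail-egress   # hypothetical name
  namespace: opendesk-edu
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: grommunio   # assumed label
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 192.0.2.0/24   # placeholder: replace with partner mail exchanger ranges
      ports:
        - port: 25     # SMTP relay
          protocol: TCP
        - port: 587    # SMTP submission
          protocol: TCP
        - port: 993    # IMAPS
          protocol: TCP
```

NetworkPolicies are additive, so this rule complements the default-deny policy rather than replacing it.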

Pod Resource Limits

Resource limits prevent denial-of-service via resource starvation from a compromised pod or a misbehaving service. Every workload must have both requests and limits set.

Example resource block for Keycloak:

resources:
  requests:
    memory: "2Gi"
    cpu: "1"
  limits:
    memory: "4Gi"
    cpu: "2"

Use a LimitRange in the namespace to apply default requests and limits to containers that do not declare their own:

apiVersion: v1
kind: LimitRange
metadata:
  name: opendesk-edu-limits
  namespace: opendesk-edu
spec:
  limits:
    - default:
        memory: "2Gi"
        cpu: "2"
      defaultRequest:
        memory: "512Mi"
        cpu: "250m"
      type: Container

Seccomp and AppArmor

Enable runtime-default seccomp profiles for all workloads. For sensitive services (Keycloak, Nextcloud, Grommunio), consider custom seccomp profiles that narrow allowed syscalls further.

securityContext:
  seccompProfile:
    type: RuntimeDefault

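Where AppArmor is available (for example on Ubuntu or Debian nodes), apply profiles to critical services. The profile must already be loaded on the node, and the annotation value takes the form localhost/<profile-name>. On Kubernetes 1.30 and later, the securityContext.appArmorProfile field supersedes this annotation.

annotations:
  container.apparmor.security.beta.kubernetes.io/keycloak: localhost/opendesk-keycloak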

A typical custom AppArmor profile for Keycloak should deny mount, ptrace, and exec of unexpected binaries.

Node Hardening

Run nodes with CIS Benchmark compliance (use kube-bench for validation).
Keep node OS images updated; automate with a weekly update window.
Remove unnecessary packages and services from node images. Minimize the attack surface.
Restrict SSH access to nodes: use a bastion host or VPN. No direct public SSH.
Enable auditd on nodes to capture kernel-level events (syscall auditing).
Use dedicated node pools for sensitive services (Keycloak, database) with taints and tolerations to prevent co-location with less trusted workloads.
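
The dedicated node pool recommendation relies on a matching pair: a taint on the nodes that repels ordinary pods, and a toleration plus node selector on the sensitive workloads. The fragment below sketches the workload side only. The label and taint key opendesk.example/pool is a hypothetical name, and the nodes are assumed to carry both the label and a NoSchedule taint with the same key and value.

```yaml
# Pod template fragment for a workload that belongs on the dedicated pool.
spec:
  nodeSelector:
    opendesk.example/pool: sensitive      # hypothetical node label
  tolerations:
    - key: opendesk.example/pool          # must match the taint on the nodes
      operator: Equal
      value: sensitive
      effect: NoSchedule
```

The taint keeps other workloads off the pool, while the nodeSelector keeps the sensitive workload from landing elsewhere. Both are needed for real isolation.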

Secret Management

Current Approach

The default deployment stores secrets as Kubernetes Secrets created via Helm values or helmfile. While this works for initial setup, it presents several risks:

Secrets may be committed to Git repositories (plaintext in values files).
No rotation lifecycle management.
Limited audit trail for who accessed which secret.

Recommended: Sealed Secrets

Sealed Secrets allows encrypting Kubernetes Secrets into SealedSecret custom resources that can safely be stored in Git. Only the Sealed Secrets controller (running in the cluster) can decrypt them.

apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: keycloak-admin-secret
  namespace: opendesk-edu
spec:
  encryptedData:
    admin-password: AgBy3i4OJSWK+...  # encrypted payload produced by kubeseal (truncated)
  template:
    type: Opaque
    metadata:
      labels:
        app.kubernetes.io/name: keycloak

Alternative: External Secrets Operator

For teams already using HashiCorp Vault, AWS Secrets Manager, or Azure Key Vault, the External Secrets Operator synchronizes secrets from external stores into Kubernetes:

apiVersion: external-secrets.io/v1beta1
kind: SecretStore
metadata:
  name: vault-backend
  namespace: opendesk-edu
spec:
  provider:
    vault:
      server: "https://vault.internal:8200"
      path: "opendesk-edu"
      version: "v2"
      auth:
        kubernetes:
          mountPath: "kubernetes"
          role: "opendesk-edu"
---
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: keycloak-admin
  namespace: opendesk-edu
spec:
  refreshInterval: "1h"
  secretStoreRef:
    name: vault-backend
    kind: SecretStore
  target:
    name: keycloak-admin-secret
  data:
    - secretKey: admin-password
      remoteRef:
        key: "opendesk-edu/keycloak/admin"
        property: "password"

Credential Rotation

Credential	Recommended rotation interval	Method
Keycloak admin password	90 days	Manual via UI/API or External Secrets Operator
Database passwords	90-180 days	Helmfile re-run with new values, or automated via Vault
TLS private keys	1 year (or at revocation)	cert-manager renewal (Let's Encrypt certificates are valid for 90 days and renewed automatically)
SMTP/API tokens	Per service policy	External Secrets + periodic renewal
k8up/restic repository	At breach or annually	Generate new key; re-encrypt all backups
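
For the TLS row, certificate renewal and private key rotation are separate concerns in cert-manager: whether a new key is generated at renewal is controlled by privateKey.rotationPolicy. The sketch below assumes a ClusterIssuer named letsencrypt-production and a placeholder hostname. Neither comes from the openDesk Edu charts.

```yaml
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: opendesk-edu-tls
  namespace: opendesk-edu
spec:
  secretName: opendesk-edu-tls
  dnsNames:
    - portal.university.example        # placeholder hostname
  issuerRef:
    name: letsencrypt-production       # hypothetical ClusterIssuer
    kind: ClusterIssuer
  privateKey:
    algorithm: ECDSA
    size: 256
    rotationPolicy: Always             # generate a fresh key at every renewal
```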

Procedure for database password rotation:

Generate new password (e.g., openssl rand -base64 32).
Update the database user password via SQL (ALTER USER ... PASSWORD '...').
Update the Kubernetes Secret or SealedSecret.
Restart the affected service pods to pick up the new secret.
Verify connectivity from the service to the database.
Rotate the old password out of credential stores.

TLS Private Key Protection

Use cert-manager with automated renewal to minimize the window of manual key handling.
Store private keys exclusively in Kubernetes Secrets — never export them to files.
For openDesk Certificates (Bundesdruckerei), import the private key directly into a Secret via a one-time, air-gapped procedure.
Restrict access to TLS Secrets via RBAC:

apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: tls-secret-reader
  namespace: opendesk-edu
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    resourceNames: ["opendesk-edu-tls"]
    verbs: ["get"]  # resourceNames does not meaningfully restrict "list"
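
A Role grants nothing until it is bound to a subject. The following sketch binds the role above to a single service account. The account name tls-consumer is a hypothetical placeholder for whichever workload legitimately needs to read the certificate.

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: tls-secret-reader-binding   # hypothetical name
  namespace: opendesk-edu
subjects:
  - kind: ServiceAccount
    name: tls-consumer              # hypothetical service account
    namespace: opendesk-edu
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: tls-secret-reader
```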

Network Segmentation

Service Mesh vs. Kubernetes Network Policies

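For most openDesk Edu deployments, Kubernetes NetworkPolicies provide sufficient network segmentation. They are simpler to manage, require no additional control plane, and integrate natively with your existing RBAC. Note that NetworkPolicies are only enforced if the cluster's CNI plugin implements them (for example Calico or Cilium); on a CNI without support they are silently ignored.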

Consider a service mesh (Istio, Linkerd, Cilium) only if you need:

mTLS between all pods (NetworkPolicies filter traffic but do not encrypt or authenticate it).
Fine-grained L7 traffic policies (HTTP method filtering, path-based routing).
Observability with distributed tracing across services.

Recommended NetworkPolicy Architecture

The recommended isolation model consists of the following tiers:

Ingress tier — The ingress controller (nginx-ingress or similar) is the only pod with a public-facing port. It forwards traffic based on hostname rules.
Keycloak — Exposed to the ingress controller and all services that require authentication. Keycloak should not initiate outbound connections to arbitrary services.
Database tier — Each database Pod is accessible only from its corresponding application service and the backup operator (k8up). No other pod, including the ingress controller, should reach databases directly.
LMS services (ILIAS, Moodle) — Limited egress: database, Keycloak, and Shibboleth SP only. No general internet access.
Monitoring — Prometheus, Loki, and Grafana should only be accessible from an admin VPN or a bastion network.

Database tier — access restricted to service + backup operator:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-db-access
  namespace: opendesk-edu
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/component: database
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/component: app
      ports:
        - port: 5432
        - port: 3306
    - from:
        - podSelector:
            matchLabels:
              app.kubernetes.io/name: k8up
      ports:
        - port: 5432
        - port: 3306

Keycloak — exposed to ingress and authenticated services:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: keycloak-ingress
  namespace: opendesk-edu
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: keycloak
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx
      ports:
        - port: 8080
        - port: 8443
    - from:
        - podSelector: {}  # All pods in namespace can reach Keycloak for auth
      ports:
        - port: 8080

Monitoring — restricted to admin VPN:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: restrict-monitoring
  namespace: opendesk-edu
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/component: monitoring
  policyTypes:
    - Ingress
  ingress:
    - from:
        - ipBlock:
            cidr: 10.0.0.0/8   # Replace with your admin VPN CIDR
      ports:
        - port: 9090  # Prometheus
        - port: 3100  # Loki
        - port: 3000  # Grafana

Egress Filtering

Map out which services require internet access. The principle: deny by default, allow explicitly.

Service	Needs Internet?	Details
Keycloak	Conditional	Federation metadata refresh; block if using static metadata
Nextcloud	Limited	App store, external storage — block from internal networks
Grommunio	Yes	SMTP/IMAP to external mail servers — whitelist IP ranges
Element	Yes	Matrix federation — restrict to known servers if possible
Moodle/ILIAS	No	All content hosted internally
BigBlueButton	No	No external dependencies
All databases	No	Strictly internal
k8up (restic)	Yes	S3-compatible storage endpoint — whitelist the S3 URL only
Cert-manager	Yes	ACME challenge endpoint or openDesk CA API

Authentication Hardening

Keycloak Session Management

Configure reasonable session timeouts to limit the window of a stolen session token.

Setting	Value	Rationale
SSO Session Max	8 hours	Aligns with a single workday
SSO Session Idle	30 minutes	Auto-logout after inactivity
Access Token Lifetime	5 minutes	Minimizes stolen-token window
Refresh Token Max	12 hours	Require re-authentication daily
Refresh Token Idle	30 minutes	Refresh tokens expire on inactivity
Client Session Max	1 day	Bound per-application sessions
Offline Session Max	30 days	Only for approved clients; default to 0 (offline disabled)

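Enable refresh token rotation to detect token theft: in the realm's token settings, turn on Revoke Refresh Token with Refresh Token Max Reuse set to 0, so that each refresh token is single-use and a replayed token is rejected. Keycloak bounds refresh tokens and client sessions by the SSO session, so the Refresh Token Max and Client Session Max values above only take effect where they do not exceed SSO Session Max.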

Password Policy

Configure in Keycloak's built-in password policy:

Policy	Value	Rationale
Minimum Length	12 characters	Exceeds the 8-character minimum of NIST SP 800-63B
One Digit	Enabled	Complexity requirement
One Lowercase	Enabled	Complexity requirement
One Uppercase	Enabled	Complexity requirement
One Special Character	Enabled	Complexity requirement
Not Recently Used (password history)	5 passwords	Prevents password reuse
Expire Password	180 days	Regular rotation
Not Recently Used (In Days)	Disabled	Avoid usability impact without security gain

Multi-Factor Authentication

openDesk Edu supports MFA through Keycloak's authentication flows. Two approaches are recommended:

WebAuthn/FIDO2 for administrative accounts:

Administrators (Keycloak realm administrators, service operators) should be required to register a hardware security key (YubiKey, Nitrokey, or platform authenticator). This prevents credential theft from turning into a full administrative takeover.

# In Keycloak admin console:
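# 1. Create an authentication flow "Admin MFA" (for example, a copy of the built-in browser flow)
# 2. Add a conditional sub-flow with "Condition - user role" matching the admin role
# 3. Inside the sub-flow, add "WebAuthn Authenticator" as REQUIRED
# 4. Bind the flow as the realm's browser flow
# 5. Set the "Webauthn Register" required action so users register their security key on next login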

TOTP for regular users:

Time-based one-time passwords (via authenticator apps) provide an accessible MFA method for the general user base.

Brute Force Protection

Keycloak's built-in brute force detection:

Max Login Failures: 5
Failure Reset Time: 300 seconds
Wait Increment: 60 seconds
Quick Login Check: Enabled

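Additionally, deploy fail2ban on the ingress node to ban client IPs that repeatedly receive HTTP 401 responses from the Keycloak auth endpoint. The jail below references a keycloak-auth filter, which must be defined separately to match those log entries: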

[keycloak-auth]
enabled  = true
port     = http,https
filter   = keycloak-auth
logpath  = /var/log/nginx/access.log
maxretry = 10
findtime = 60
bantime  = 600

Rate Limiting at Ingress Level

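Configure rate limits on the ingress controller. With ingress-nginx, request rate limits are set per Ingress through annotations rather than through ConfigMap keys. The controller ConfigMap only holds related global options, such as the status code returned to throttled clients:

apiVersion: v1
kind: ConfigMap
metadata:
  name: ingress-nginx-controller   # assumed name; must be the ConfigMap the controller is started with
  namespace: ingress-nginx
data:
  limit-req-status-code: "429"     # default is 503

As a baseline, annotate each Ingress as follows:

nginx.ingress.kubernetes.io/limit-rps: "50"
nginx.ingress.kubernetes.io/limit-burst-multiplier: "2"   # burst = 50 x 2 = 100
nginx.ingress.kubernetes.io/limit-whitelist: "10.0.0.0/8"  # Internal networks exempt from rate limits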

For the Keycloak auth endpoint specifically, apply a stricter annotation:

nginx.ingress.kubernetes.io/limit-rps: "10"
nginx.ingress.kubernetes.io/limit-burst-multiplier: "2"  # burst = limit-rps x 2 = 20
nginx.ingress.kubernetes.io/limit-whitelist: "10.0.0.0/8"
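
In context, these annotations sit in the metadata of the Ingress that routes to Keycloak. ingress-nginx derives the burst size from limit-rps multiplied by limit-burst-multiplier. The sketch below assumes a placeholder hostname, an IngressClass named nginx, and a Service called keycloak listening on port 8080. Adjust all three to your deployment.

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: keycloak-public              # hypothetical name
  namespace: opendesk-edu
  annotations:
    nginx.ingress.kubernetes.io/limit-rps: "10"
    nginx.ingress.kubernetes.io/limit-burst-multiplier: "2"   # burst of 20 requests
    nginx.ingress.kubernetes.io/limit-whitelist: "10.0.0.0/8"
spec:
  ingressClassName: nginx            # assumed IngressClass
  tls:
    - hosts:
        - id.university.example      # placeholder hostname
      secretName: opendesk-edu-tls
  rules:
    - host: id.university.example
      http:
        paths:
          - path: /realms/
            pathType: Prefix
            backend:
              service:
                name: keycloak       # assumed Service name
                port:
                  number: 8080
```

The limits are applied per client IP, so clients behind a shared NAT (such as a campus network) count as one source.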

Audit Logging and Monitoring

What to Log

Event Type	Log Source	Retention	Sensitivity
Authentication success/failure	Keycloak	1 year	Personal data
Admin console actions	Keycloak, Kubernetes	2 years	High
Pod creation/deletion/modification	Kubernetes audit log	1 year	Metadata
Network policy violations	Cilium/Calico network	90 days	Metadata
Backup success/failure	k8up	1 year	Metadata
TLS certificate expiry	cert-manager	1 year	Metadata
Resource quota violations	Kubernetes	90 days	Metadata
API server access	Kubernetes audit log	2 years	High

Log Aggregation

Deploy the Loki stack (Grafana Loki, Promtail, Grafana) for centralized log collection. This is preferred over Elasticsearch for Kubernetes-native log aggregation due to its lighter resource footprint and native integration with Prometheus metrics.

# Promtail configuration snippet: keep logs only from selected services
scrape_configs:
  - job_name: kubernetes-pods
    kubernetes_sd_configs:
      - role: pod
    pipeline_stages:
      - cri: {}
    relabel_configs:
      - source_labels: [__meta_kubernetes_pod_label_app_kubernetes_io_name]
        action: keep
        regex: keycloak|nextcloud|ilias|moodle|grommunio|matrix

Alerting Rules

Define security-relevant alerting rules in Prometheus and route the resulting alerts through Alertmanager:

Alert Name	Condition	Severity
`HighFailedLogins`	Rate of 401 errors on Keycloak > 10/min for 5 minutes	Critical
`KeycloakAdminLogin`	Admin console login event	Warning
`PodCrashLooping`	Pod in CrashLoopBackOff for > 5 minutes	Critical
`TLSExpiringSoon`	Certificate expires in < 30 days	Warning
`BackupFailure`	k8up backup job fails	Critical
`PersistentVolumeUsageHigh`	PV usage > 85%	Warning
`NetworkPolicyViolation`	Denied connection logged (e.g., Cilium `policy-verdict` events)	Info
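
Three of the alerts in the table map directly onto widely available metrics and can be sketched as a Prometheus rule file. This assumes kube-state-metrics, the cert-manager metrics endpoint, and kubelet volume metrics are being scraped. The group name and the for durations are illustrative choices, and the Keycloak and k8up alerts are omitted because their metric names depend on how those components are instrumented.

```yaml
groups:
  - name: opendesk-edu-security      # hypothetical group name
    rules:
      - alert: PodCrashLooping
        # kube-state-metrics: 1 while a container waits in CrashLoopBackOff
        expr: kube_pod_container_status_waiting_reason{namespace="opendesk-edu", reason="CrashLoopBackOff"} == 1
        for: 5m
        labels:
          severity: critical
        annotations:
          summary: "Container {{ $labels.container }} in pod {{ $labels.pod }} is crash looping"
      - alert: TLSExpiringSoon
        # cert-manager: expiry timestamp minus now, compared against 30 days in seconds
        expr: certmanager_certificate_expiration_timestamp_seconds - time() < 30 * 24 * 3600
        labels:
          severity: warning
        annotations:
          summary: "Certificate {{ $labels.name }} expires in less than 30 days"
      - alert: PersistentVolumeUsageHigh
        # kubelet: fraction of volume capacity in use
        expr: kubelet_volume_stats_used_bytes / kubelet_volume_stats_capacity_bytes > 0.85
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Volume {{ $labels.persistentvolumeclaim }} is more than 85% full"
```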

Log Retention and DSGVO Compliance

Logs containing personal data (e.g., IP addresses in authentication logs) must be retained only as long as necessary.

Access logs (IP addresses): 7 days raw, 30 days anonymized (truncate last octet).
Authentication logs: 1 year (required for forensic analysis).
Keycloak admin actions: 2 years (confirm the legal basis and retention period with your data protection officer).
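Kubernetes audit logs: Configure an audit policy with a log or webhook backend on the API server (the dynamic audit webhook feature was removed in Kubernetes 1.19), with retention aligned to institutional policy.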
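
For the Kubernetes audit log, what gets recorded is determined by an audit policy file passed to the API server with --audit-policy-file. The policy below is a starting-point sketch rather than a complete policy. The choice of which resources get request-level logging is an assumption to adapt locally. With the log backend, retention is bounded with the --audit-log-maxage flag.

```yaml
apiVersion: audit.k8s.io/v1
kind: Policy
omitStages:
  - RequestReceived            # avoid duplicate entries per request
rules:
  # Record who accessed Secrets and ConfigMaps, but never their contents
  - level: Metadata
    resources:
      - group: ""
        resources: ["secrets", "configmaps"]
  # Keep request bodies for changes to RBAC and NetworkPolicies
  - level: Request
    verbs: ["create", "update", "patch", "delete"]
    resources:
      - group: "rbac.authorization.k8s.io"
      - group: "networking.k8s.io"
        resources: ["networkpolicies"]
  # Everything else: metadata only
  - level: Metadata
```

Rules are evaluated in order and the first match wins, which is why the Secrets rule comes first.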

Use log masking to redact sensitive fields (passwords, session tokens, personal identifiers) before logs reach the aggregation layer. Promtail supports regex-based replace stages for this purpose.
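
A Promtail replace stage substitutes the content of each capture group with a fixed string, which covers both credential masking and IP truncation. The regular expressions below are illustrative assumptions about the log format and will need adjusting to the actual output of each service.

```yaml
pipeline_stages:
  - cri: {}
  # Mask values of password=... and token=... style fields
  - replace:
      expression: '(?i)(?:password|token)=([^&\s"]+)'
      replace: '***'
  # Zero the last octet of IPv4 addresses
  - replace:
      expression: '\b\d{1,3}\.\d{1,3}\.\d{1,3}\.(\d{1,3})\b'
      replace: '0'
```

Because masking happens before shipping, the original values never reach Loki and cannot be recovered later. Apply the IP rule only to streams where raw addresses are not needed.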

Data Protection (DSGVO Compliance)

Encryption at Rest

All persistent volumes should be encrypted. For on-premise deployments:

Use CSI drivers with volume encryption (e.g., LUKS on the storage backend, or the CSI driver manages LUKS per volume).
For cloud deployments, use encrypted cloud disks (AWS EBS encryption, Azure Disk Encryption at rest, GCE PD encryption).

Database-level encryption should be enabled:

-- PostgreSQL: community releases have no built-in TDE; rely on volume
-- encryption and use pgcrypto for column-level encryption
CREATE EXTENSION IF NOT EXISTS pgcrypto;

-- MySQL (InnoDB): per-table encryption; MariaDB uses ENCRYPTED=YES instead
CREATE TABLE sensitive_data (...) ENCRYPTION='Y';

Data Classification

Service	Personal Data Stored	Metadata Stored
Keycloak	User profiles, email, group memberships	Login timestamps, IP addresses
Nextcloud	Files, user profiles, sharing metadata	Access logs, file metadata
OpenCloud	Files, user profiles	Access logs, file metadata
Grommunio	Emails, contacts, calendar data	Mail headers, access logs
OX App Suite	Emails, contacts, calendar, files	Access logs, usage analytics
SOGo	Emails, contacts, calendar	Access logs
Moodle/ILIAS	User profiles, course data, grades	Activity logs, quiz data
BigBlueButton	Recordings (if enabled), chat logs	Attendance data, session metadata
Element/Matrix	Chat messages, user profiles	Room metadata, federation logs
XWiki	User profiles, page content	Edit history, page view logs
OpenProject	User profiles, project data	Activity logs, time tracking
Zammad	User profiles, ticket content	Access logs, agent activity
CryptPad	Encrypted documents (zero-knowledge)	Session metadata (minimal)

Data Retention and Deletion

Data Type	Retention Period	Deletion Method
User profiles	Account deletion + 30 days	Keycloak user deletion + DB purge
Course/LMS content	2 years after course end	Soft delete + 90-day grace period
Email content	Per user retention policy	IMAP delete or administrative purge
File storage	Per user retention policy	Immediate delete from PV + backup
Authentication logs	1 year	Partition drop or log rotation
Session data	Session expiry + 1 day	Automatic cleanup by Keycloak
Backup data	Per policy (30-365 days)	restic forget + prune

Right to Erasure and Data Portability

openDesk Edu supports DSGVO Article 17 (right to erasure) and Article 20 (data portability) through:

Keycloak's User Self-Service for profile export and account deletion.
Nextcloud/OpenCloud's OCM API for file portability between instances.
Grommunio's IMAP export capabilities.
Moodle/ILIAS user data export plugins.

Establish an operational procedure:

Receive deletion/portability request.
Verify identity (e.g., via institutional email or in-person verification).
Respond within one month of receipt, including the data export for portability requests (Article 12(3)).
Execute data deletion across all services.
Wait for backup retention window to expire (or perform emergency backup rotation if immediate deletion is required).
Document the erasure request and fulfillment.

Backup Encryption

The k8up/restic backup system already encrypts backups at rest and in transit. Ensure:

The restic repository password is stored in a SealedSecret or Vault — never in plaintext.
Backup data is encrypted with AES-256 in counter mode and authenticated with Poly1305-AES (restic's built-in scheme, which is not configurable).
The backup S3 endpoint uses TLS (HTTPS).
Restore procedures are tested quarterly, including decryption verification.
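
These requirements come together in the k8up Schedule resource, which references the repository password and S3 credentials by Secret rather than embedding them. The sketch below uses hypothetical Secret names, a placeholder HTTPS endpoint and bucket, and illustrative schedules and retention values.

```yaml
apiVersion: k8up.io/v1
kind: Schedule
metadata:
  name: opendesk-edu-backup          # hypothetical name
  namespace: opendesk-edu
spec:
  backend:
    repoPasswordSecretRef:           # restic repository password, e.g. from a SealedSecret
      name: backup-repo              # hypothetical Secret
      key: password
    s3:
      endpoint: https://s3.backup.example   # placeholder; HTTPS keeps transport encrypted
      bucket: opendesk-edu-backups          # placeholder bucket
      accessKeyIDSecretRef:
        name: backup-s3-credentials  # hypothetical Secret
        key: access-key-id
      secretAccessKeySecretRef:
        name: backup-s3-credentials
        key: secret-access-key
  backup:
    schedule: '0 2 * * *'            # nightly
    failedJobsHistoryLimit: 3
    successfulJobsHistoryLimit: 3
  check:
    schedule: '0 4 * * 1'            # weekly repository integrity check
  prune:
    schedule: '0 5 * * 0'            # weekly prune
    retention:
      keepDaily: 30
      keepWeekly: 12
```

The check job verifies repository integrity but is not a substitute for the quarterly restore test, which exercises decryption with the stored password end to end.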

Incident Response Plan Template

Every openDesk Edu deployment should maintain an incident response plan. The following template covers DSGVO-specific obligations:

# INCIDENT RESPONSE RECORD

## 1. Detection
- Detected by: [monitoring alert / user report / external notification]
- Date/Time: YYYY-MM-DD HH:MM UTC
- Description:
- Severity (Critical / High / Medium / Low):

## 2. Containment
- Actions taken to limit damage:
- Compromised systems isolated via NetworkPolicy:
- Credentials rotated:
- Service pods restarted:

## 3. Assessment
- Data types involved: [personal data / credentials / metadata / none]
- Estimated number of affected users:
- Is this a DSGVO-reportable breach? [Yes / No]
- If yes: supervisory authority notified within 72 hours of becoming aware of the breach (Article 33)? [Yes / No, timestamp]

## 4. Eradication
- Root cause identified:
- Vulnerability patched:
- Backdoor or persistence mechanism removed:

## 5. Recovery
- Restored from backup: [yes / no, timestamp]
- Service verified operational:
- Monitoring re-enabled with additional alerts:

## 6. Post-Mortem
- Lessons learned:
- Remediation items (with owners and deadlines):
- Report written by:

Service-Specific Hardening

Keycloak

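Admin console restriction: Keycloak serves the admin console under /admin on the same HTTP(S) port as the login endpoints, so a port-based NetworkPolicy alone cannot separate the two. Restrict the console at the ingress instead: route only the authentication endpoints (/realms/, or /auth on legacy deployments) publicly, and expose /admin solely to a dedicated admin VPN CIDR.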
Custom theme: Replace the default Keycloak theme to avoid version fingerprinting. At minimum, change the logo, footer text, and page title. This complicates targeted attacks against known Keycloak CVEs.
Disable default user registration if not required (universities typically provision users via LDAP/DFN-AAI federation).
Restrict grant types to only those needed (authorization code flow for web apps; disable implicit and password grants).

Nextcloud

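Brute force protection: Nextcloud ships with built-in brute force protection and rate limiting, both enabled by default. Make sure neither has been switched off in config.php:

'auth.bruteforce.protection.enabled' => true,
'ratelimit.protection.enabled' => true,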

File access control: Deploy the Nextcloud Files Access Control app to restrict file operations based on group membership, IP range, or time of day.
Sharing policies: Disable public link sharing by default; allow only group-sharing. Require password and expiration date on any external share.
App management: Remove unused apps and audit installed apps regularly. If the app store is not needed, set 'appstoreenabled' => false in config.php.

Moodle / ILIAS

LTI security: Configure LTI 1.3 (not 1.0/1.1) for tool interoperability. Use signed messages and enforce TLS.
SSO enforcement: Disable local password-based login. All authentication must go through Keycloak (SAML or OIDC). In Moodle: disable manual and email authentication plugins; keep only saml2 or oidc.
```
// Moodle config.php
// Comma-separated list of enabled auth plugins, named without the auth_ prefix
$CFG->auth = 'saml2';
```
In ILIAS: disable local authentication in ilias.ini.php.
Role auditing: Regularly audit administrator and manager role assignments. Export role lists monthly and diff against expected state.
Session security: Enforce HTTPS-only cookies; set session timeout to 30 minutes of inactivity.

Grommunio

SMTP submission restrictions: Disable open relay. Require authentication for SMTP submission (port 587). Restrict SMTP AUTH to known users only.

Rate limiting: Grommunio uses Postfix as its MTA, so per-client limits are set with Postfix parameters in main.cf (counted per anvil_rate_time_unit, 60 seconds by default):

smtpd_client_connection_rate_limit = 100
smtpd_client_message_rate_limit = 50

SPF/DKIM/DMARC: Configure email authentication for all outgoing domains. This prevents your domain from being used in spoofing attacks:
- Publish SPF records for all sending domains.
- Generate DKIM keys and publish the public key in DNS.
- Set up DMARC with p=quarantine initially, then move to p=reject after monitoring.

BigBlueButton

Recording access control: Store recordings in an access-controlled bucket. Require authentication for playback. Delete recordings after 90 days or per institutional policy.
API secret rotation: Rotate the BBB shared secret periodically (or use per-session HMAC keys).
Disable guest access in the default configuration; enable only after explicit approval workflow.
Greenlight (frontend): Restrict room creation to authenticated users only.

Element / Matrix

Federation policy: Decide based on institutional requirements:
- Restricted federation: Only allow federation with known universities (DFN Matrix homeservers). Block the rest.
- Open federation: Allow federation with all servers but monitor for spam and abuse.
Room access controls: By default, create rooms as invite-only. Disable public room listing unless explicitly needed.
Content scanning: Deploy a Matrix content scanner to block malware in shared files.
Rate limiting on registration: Prevent account creation abuse by configuring Synapse rate limits.
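
For a Synapse homeserver, the restricted federation, room listing, and registration points above correspond to settings in homeserver.yaml. The fragment below is a sketch. The domain names are placeholders for the partner homeservers you decide to allow, and the rate values are illustrative.

```yaml
# homeserver.yaml (Synapse) fragment
# Restricted federation: only the listed homeservers may federate
federation_domain_whitelist:
  - matrix.university-a.example    # placeholder partner homeserver
  - matrix.university-b.example    # placeholder partner homeserver

# Keep the public room directory private
allow_public_rooms_without_auth: false
allow_public_rooms_over_federation: false

# Accounts are provisioned through SSO, so local registration stays off
enable_registration: false

# Throttle registration attempts in case registration is ever enabled
rc_registration:
  per_second: 0.05
  burst_count: 3
```

Omitting federation_domain_whitelist entirely corresponds to the open federation option.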

Vulnerability Management

Keeping Dependencies Updated

openDesk Edu uses Helm chart dependencies. Run helmfile deps regularly to refresh them, and test updates in a staging environment before applying to production.

Recommended update cadence:

Dependency type	Update frequency	Critical CVE response
Base Kubernetes	Quarterly	Within 7 days
Helm charts	Monthly	Within 48 hours
Container images	Monthly	Within 24 hours
Keycloak	Per patch release	Within 7 days
Service apps	Per patch release	Within 14 days

Monitoring CVEs

Track CVEs for the stack components:

Keycloak: Subscribe to Keycloak security advisories.
Kubernetes: Watch the Kubernetes security mailing list.
Service components: Use Trivy for automated container image scanning in CI/CD:

# GitLab CI example — scan container images on every build
trivy image:
  script:
    - trivy image --severity CRITICAL,HIGH --exit-code 1 $CI_REGISTRY_IMAGE:$CI_COMMIT_SHA

Patch Procedure

Staging test: Deploy the update to a staging environment that mirrors production. Run integration tests and security scans.
Rolling update: Push to production using Helmfile's incremental update (one service at a time).
Verify: Check service health endpoints, Keycloak authentication flow, and database connectivity. Monitor Prometheus for anomaly detection.

# Example rolling update for a single service
helmfile -e production -l name=nextcloud apply --suppress-diff
# Wait for rollout to complete
kubectl rollout status deployment/nextcloud -n opendesk-edu --timeout=10m
# Verify
kubectl get events -n opendesk-edu --sort-by='.lastTimestamp'

Security Scanning

Trivy: Scan all container images in the deployment pipeline. Block deployments on CRITICAL or HIGH findings where a fix is available.
Harbor: If using Harbor as a container registry, enable its integrated scanner for continuous scanning (Trivy in current releases; Clair was the default in older versions).
kube-bench: Run weekly CIS Benchmark checks against the cluster nodes.
kube-hunter: Run periodically to identify Kubernetes-level vulnerabilities.
Network policy audit: Verify that all services have network policies defined. Use kubectl get networkpolicies -n opendesk-edu and diff against your intended policy document.

Incident Response

Detection

The following alerts should be configured in your monitoring stack:

Alert	Trigger	Response SLA
Brute force attack	> 10 failed Keycloak logins/min for 5 min	< 15 min
Unknown admin login	Keycloak admin console login outside business hrs	< 30 min
Pod compromise	Pod crash loop + unexpected network connections	< 15 min
Data exfiltration	Unusual egress traffic volume (e.g., > 1 GB/h)	< 10 min
Secret exposure	Secret store reports unauthorized access	Immediate
Backup failure	k8up job fails for 2+ consecutive runs	< 30 min
TLS certificate expired	Certificate expiry check fails	< 1 hour

Containment

When a compromise is detected:

Apply an isolating NetworkPolicy to the compromised namespace or pod:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: isolate-compromised
  namespace: opendesk-edu
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/name: compromised-service
  policyTypes:
    - Ingress
    - Egress

Preserve forensic data first: save the pod's logs and resource descriptions before any pod is restarted, scaled down, or replaced, because those actions discard container state.
Rotate all secrets used by the compromised service (database credentials, API tokens, Keycloak client secrets).
Scale the compromised deployment to zero (if the service can be taken offline) or replace pods with a known-good version.
Block the attacker IP in the ingress controller via fail2ban or manual ipBlock rule.

Eradication

Identify the root cause (unpatched CVE, misconfiguration, leaked credential).
Apply the necessary patch or configuration fix.
Remove any backdoors or persistence mechanisms (unknown cron jobs, sidecar containers, modified images).
Audit all service accounts and RBAC bindings in the affected namespace.

Recovery

Restore the service from a known-good backup (k8up restic restore).
Verify service integrity — check data consistency, user authentication, and all integrations.
Gradually restore network access following the principle of least privilege.
Notify affected users per institutional policy and DSGVO Article 34 (if personal data is involved).

Post-Mortem

After the incident is resolved, document the full timeline, root cause, actions taken, and lessons learned. Use the Incident Response Plan Template section above. The post-mortem should be reviewed by the security team and used to update this hardening guide.