SQLcl Project Orchestration Made Easy

Technische Dokumentation eines Tools zur Web-UI-basierten Orchestrierung von Oracle SQLcl Project Deployments. Single-Server Node-Anwendung, YAML-konfigurierte Pipelines, SSE-Streams für Live-Output, integrierter Editor für Fehlerkorrekturen.

1. Scope und Constraints

Eingangsbedingungen:

Oracle SQLcl Project (project export/stage/release/gen-artifact/deploy) als Versionierungs-Backbone für ein DB-Schema mit ~10 Schemas, ~7000 Changesets pro Deploy, Liquibase als Execution-Engine.
Pro Release-Zyklus: 5 Pipelines (candidat, testmig-rerun, create-release, test, prod) mit ~25 Stages, davon mehrere project-Subkommandos und domain-spezifische Cleanup-Steps (KON-Stammdaten, APEX-Automations, Liquibase-Changelog-Verifikation).
Bisher: Shell-Skripte + manuelles Anklicken in SQL Developer.
Anforderungen: Approval-Gates pro Stage, Live-Log, Retry/Skip pro Stage, Multi-Connection-Handling, kein zusätzlicher Build-Server.

Non-Goals:

Multi-Tenant/Multi-User mit Auth (lokales Netz, single-team)
Verteilte Worker (alles in einem Node-Process)
Generischer Workflow-Engine (SQLcl-spezifisch)

Stack: Node 22, Fastify 5, TypeScript (tsc build, tsx für dev), SQLite via node:sqlite (native), simple-git, yaml, vanilla JS im Browser (4500 LoC, kein Framework, direktes DOM).

2. Datenmodell

2.1 Konfiguration

Drei YAML-Dateigruppen unter config/:

config/
├── connections.yaml          # YAML-Definitionen (optional)
├── settings.yaml             # gitignored: workdir, sqlclPath, ddlInspector
├── features/                 # Wiederverwendbare Stage-Blöcke
│   ├── copy-kon-stammdaten.yaml
│   ├── sync-tab-map.yaml
│   └── …
└── pipelines/                # Pro Datei eine Pipeline-Definition
    ├── candidat.yaml
    ├── testmig-rerun.yaml
    └── …

config/ wird beim Startup einmal in cfg.pipelines: Map<id, PipelineDef> und cfg.features: Map<id, FeatureDef> geladen. Mutationen aus dem UI-Editor schreiben das jeweilige File und ersetzen den Map-Eintrag in-place (hot-reload, kein Restart, laufende Runs sehen den alten Stand weil sie ihren StageDef-Snapshot im RunRecord halten).

2.2 Runtime

Persistente Daten unter data/ (gitignored):

data/
├── runs.db                   # SQLite: runs, stages, logs
├── connections.json          # Bundled Connections (AES-256-GCM)
├── secrets.json              # Pipeline-Secrets (AES-256-GCM)
├── workdir.json              # Workdir-Pfad (overrides settings.yaml)
├── sqlcl-runtime.json        # JVM Heap-Settings, hot-reload
├── self-update.log           # Letzter Self-Update-Lauf
├── .master.key               # 32 Byte random, von Secrets+Connections geteilt
└── logs/                     # Plain-Text-Log pro Run (zodis-run-<id>.log)

SQLite-Schema:

CREATE TABLE runs (
  id TEXT PRIMARY KEY,        -- 8 hex chars
  pipelineId TEXT NOT NULL,
  inputs TEXT NOT NULL,       -- JSON: User-Eingaben
  vars TEXT NOT NULL,         -- JSON: interpolierte Variablen
  status TEXT NOT NULL,       -- pending|running|blocked|done|failed|cancelled
  currentStageIdx INTEGER,
  createdAt INTEGER,
  updatedAt INTEGER
);

CREATE TABLE stages (
  runId TEXT, stageId TEXT,
  idx INTEGER,
  status TEXT,                -- pending|running|blocked|done|failed|skipped
  blockedReason TEXT,         -- approval|approvalAfter|ddl-risk|error|input
  blockedMessage TEXT,
  blockedData TEXT,           -- JSON: Stage-spezifischer Block-Payload
  startedAt INTEGER, finishedAt INTEGER,
  error TEXT, result TEXT,
  PRIMARY KEY (runId, stageId)
);

CREATE TABLE logs (
  id INTEGER PRIMARY KEY AUTOINCREMENT,
  runId TEXT NOT NULL,
  stageId TEXT,
  ts INTEGER NOT NULL,
  level TEXT NOT NULL,        -- stdout|stderr|info|warn|error
  text TEXT NOT NULL
);
CREATE INDEX logs_run_idx ON logs(runId, id);

WAL-Mode (PRAGMA journal_mode = WAL), ein Schreiber (Engine), mehrere Leser (HTTP-Endpoints + SSE-Replay).

3. Stage-Handler-Modell

Ein Handler ist ein Objekt mit type: string und run(ctx):

interface StageHandler {
  type: string;
  run(ctx: StageContext): Promise<void>;
}

interface StageContext {
  runId: string;
  stage: StageDef;
  stageIdx: number;
  workdir: string;
  vars: Record<string, string>;
  signal: AbortSignal;
  emit(level: LogLevel, text: string): void;
  log(text: string): void;
  warn(text: string): void;
  err(text: string): void;
  blockForApproval(message: string, data?: any): Promise<void>;
  blockForInput(prompt: string, recentOutput: string): Promise<string>;
  setResult(result: any): void;
}

Implementierte Handler:

Type	Implementation
`approval`	`await ctx.blockForApproval(message)` — wartet auf POST `/api/runs/:id/stages/:stageId/approve`
`sqlcl-exec`	Spawnt SQLcl mit inline-SQL via stdin
`sqlcl-script`	Spawnt SQLcl mit `@<file>.sql`
`sqlcl-spool`	Spawnt SQLcl + SPOOL-Capture in einer Datei
`project`	Spawnt SQLcl mit `project <subcmd>`, optional Quiet-Mode-Marker
`parallel-sqlcl-script`	Splittet Skript in N gleichmäßige Command-Chunks, ein Worker pro Chunk
`ddl-inspect`	Scannt subdir, blockt mit Findings als blockedData
`review-files`	Listet alle Files unter subdir, blockt für UI-Editor
`patch-changeset-attrs`	Inline-Patch von `-- changeset`-Zeilen
`summarize-manual-changes`	Walks workdir, listet `.review-bak`/`.workfile-bak` + git status
`sync-project-version`	Patcht `.dbtools/project.config.json`'s `sqlcl.version`
`compile-schemas`	`dbms_utility.compile_schema` für jeden Schema-Eintrag
`git`	Wraps `simple-git` (commit, push, merge, tag, checkout, …)
`delete-file`	Pfad-validierter `unlinkSync` im Workdir
`show-file`	Cat-äquivalent ins Log

Stage-Definition in YAML:

- id: deploy-testmig-test
  name: Deploy .test-Artifact auf TESTMIG
  type: project
  timeout: 30m
  idleTimeout: 2m
  params:
    connection: zodis_developer-testmig-installer
    command: project deploy -file artifact/zodis_erp-${release_version}.test.zip
    retryFrom: gen-test-artifact
  approvalAfter:
    message: Mit Post-Deployment fortfahren?

timeout ist wall-clock pro Stage (Abort via AbortController). idleTimeout killt den Child-Process wenn N ms keine Stdout-Bytes ankommen — Liquibase schreibt nach einem ORA-Fehler manchmal 60s an seinem error-log, ohne Stdout. Default ist 5 Min, kann pro Stage hochgesetzt werden.

Variable-Interpolation läuft auf zwei Ebenen:

Feature-Expansion (config-load): ${param} aus with:-Block wird in den Feature-Stages substituiert. Implementation in config.ts — substituteVars lässt unbekannte ${var} unangetastet stehen.
Runtime (per Stage): interpolateDeep(stage.params, run.vars) — ersetzt verbleibende ${var} aus run.vars (inputs + derived + secrets).

4. Engine

4.1 Run-Lifecycle

class Engine {
  createRun(pipelineId: string, inputs: Record<string, string>): RunRecord
  startRun(runId: string): void
  cancel(runId: string): void
  retry(runId: string, stageId: string, fromStageId?: string): boolean
  approve(runId: string, stageId: string): boolean
  reject(runId: string, stageId: string, reason: string): boolean
  provideInput(runId: string, stageId: string, value: string): boolean
}

createRun validiert die Inputs gegen pipeline.inputs[].validate (Regex), berechnet vars aus inputs + derived + Secrets-Store, persistiert Run + Stages in SQLite.

startRun startet eine async-Schleife (executeRun) im Hintergrund. Die Schleife iteriert über pipeline.stages[currentStageIdx..] und ruft pro Stage runStage(...) auf. Rückgabewert:

type Outcome = 'ok' | 'failed' | 'retry';

'retry' löst eine Sonderbehandlung in der Loop aus: wenn pendingRetryFrom.get(runId) gesetzt ist, springt i auf den Index der Ziel-Stage und resettet alle Stages dazwischen auf pending. Sonst wird i-- und die gleiche Stage läuft erneut.

4.2 Approval/Input-Gates

private pending = new Map<string, PendingItem>();
// key: `\({runId}:\){stageId}`
// PendingItem: { runId, stageId, kind: 'approval'|'input', resolve, reject }

waitForApproval(runId, stageId, reason, message, data?): Promise<void> {
  return new Promise((resolve, reject) => {
    this.pending.set(`\({runId}:\){stageId}`, { resolve, reject, ... });
    this.updateStageState(runId, stageId, {
      status: 'blocked',
      blockedReason: reason,
      blockedMessage: message,
      blockedData: data
    });
  });
}

Resolve via approve() → continue. Reject via reject() → throw aus runStage, der catch-Block markiert die Stage failed. Spezial-Marker: new Error('__retry__') löst die Retry-Outcome aus.

4.3 Abort-Propagation

private activeControllers = new Map<string, AbortController>();
// key: `\({runId}:\){stageId}`

cancel(runId: string): void {
  // 1. Abort aktive Stage-Controller (killt SQLcl-Children via SIGTERM/SIGKILL)
  for (const [k, ctrl] of this.activeControllers) {
    if (k.startsWith(runId + ':')) {
      ctrl.abort(new Error('run cancelled'));
    }
  }
  // 2. Reject pending Approvals
  for (const [k, p] of this.pending) {
    if (p.runId === runId) p.reject(new Error('run cancelled'));
  }
  // 3. Run-Status auf cancelled
  this.updateRunStatus(runId, 'cancelled');
}

runSqlcl propagiert das Signal an die Child-Process-Lifecycle:

opts.signal.addEventListener('abort', () => {
  killedByAbort = true;
  child.kill('SIGTERM');
  setTimeout(() => { if (!child.killed) child.kill('SIGKILL'); }, 2000);
});

4.4 Stage-Output-Pipeline

Output von runSqlcl läuft zeilenweise durch onOutput(level, text). Im project deploy-Modus (Quiet-Mode aktiv) durchläuft jede Zeile mehrere Transformationsschritte:

SQLcl stdout
   ↓
sqlcl.ts: split by \r?\n, ANSI-strip, ERROR_PATTERN-detect
   ↓
onOutput(level, line)
   ↓
runWithLog.outputBuf += line + '\n'    (50KB-rolling, für post-hoc Scan)
   ↓
suppressing ?
  → puffern bis 'UPDATE SUMMARY' im Buffer → ab Marker emit
  → bei Fehler: bevor Throw extractFailureContext(buf) emittieren
emitFiltered(level, line):
  if STACK_LINE_RE.test(line) → drop, droppedStackLines++
  else → ctx.emit(level, line)

STACK_LINE_RE = /^(?:\s*at\s+[\w$.]+|\s*\.\.\.\s+\d+\s+more\b)/ — matcht Java-Stacktraces und ... N more-Continuations. Filter ist nur aktiv wenn quietUntilMarker gesetzt ist (also in project deploy-Stages).

extractFailureContext(buf) sucht in der Reihenfolge:

Letzter Index von Migration failed, error reported:
Letzter Index von liquibase.exception.DatabaseException:
Letzter Index von ERROR at line

Vom höchsten Match bis zum nächsten Running Changeset: / UPDATE SUMMARY / An error has occurred: — das ist der ORA-Block den Liquibase unterhalb des Quiet-Markers gedruckt hat. Max 8KB.

findFailureReason(buf) sucht in dieser Priorität:

Letzter ORA-NNNN: ... / SP2-N: ... / PLS-NNNNN: ... / TNS-NNN: ...
error reported: <msg>
Reason: <msg>

Resultat landet als failureReason am Error-Objekt, von der Engine gelesen und in die blockedMessage der Error-Box gehängt.

findLastRunningChangeset(buf) sucht Running Changeset: <path>.sql — letzter Match ist der Verursacher (Liquibase druckt's vor jedem Changeset).

5. SSE-Stream und Backpressure

GET /api/runs/:id/stream öffnet einen Server-Sent-Events-Stream:

reply.raw.writeHead(200, {
  'Content-Type': 'text/event-stream',
  'Cache-Control': 'no-cache, no-transform',
  Connection: 'keep-alive',
  'X-Accel-Buffering': 'no',
});

Replay-Strategie: beim Connect werden stage-batch und log-batch als JE EIN SSE-Event mit dem kompletten Array gesendet — nicht ein Event pro Stage/Log. Bei großen Logs (~5000 Zeilen) würde N Events einen Browser-Main-Thread-Block von mehreren Sekunden auslösen, weil appendLogLine pro Aufruf pane.scrollHeight liest (forced reflow):

// Client:
es.addEventListener('log-batch', (e) => {
  const logs = JSON.parse(e.data);
  state.logs.push(...logs);
  // DocumentFragment: ein Append, ein Scroll-Read
  const frag = document.createDocumentFragment();
  for (const log of logs) frag.appendChild(buildLogLine(log));
  pane.appendChild(frag);
});
es.addEventListener('log', (e) => {
  // Live-Events einzeln, pro Event ein appendChild
  appendLogLine(JSON.parse(e.data));
});

Heartbeat-Comment alle 15s damit Reverse-Proxies (nginx, traefik) die idle-Connection nicht schließen.

6. SQLcl-Spawn-Layer

src/sqlcl.ts kapselt das Process-Lifecycle inkl. Prompts:

async function runSqlcl(opts: {
  sqlclPath: string;
  workdir: string;
  connection?: ConnectionDef;
  script: string;
  signal?: AbortSignal;
  idleTimeoutMs?: number;
  onOutput: (level: LogLevel, text: string) => void;
  onPrompt?: (info: PromptInfo) => Promise<string>;
}): Promise<SqlclResult>

Das ausgeführte Skript:

SET TERM ON
SET PAGESIZE 10000
SET LINESIZE 200
CONNECT <user>/<pwd>@<host>:<port>/<svc>   -- wenn connection.connectString
<opts.script>
EXIT;

stdio: ['pipe', 'pipe', 'pipe'] — stdin wird offen gehalten für SQLcl-Prompts (Substitution-Variables, Passwort-Eingaben). Detection läuft auf tailBuffer (partial line buffer, alles nach dem letzten \n):

const PROMPT_PATTERNS = [
  { re: /(?:^|\s)Enter\s+value\s+for\s+\S+\s*:\s*$/i,    kind: 'substitution' },
  { re: /(?:^|\s)Password:\s*$/i,                          kind: 'password' },
  { re: /(?:^|\s)([0-9]+\s+rows\s+selected.|\bPause\b)\s*$/i, kind: 'pause' },
];

Bei Match wird onPrompt(info) aufgerufen — wenn der Caller ein Input liefert (über ctx.blockForInput-Flow), wird es per child.stdin.write in den Process geschrieben. Reentrancy-safe via promptInFlight-Flag.

ENV für jeden Spawn aus buildSqlclEnv():

// In src/util.ts:
const utf8Flags = '-Dfile.encoding=UTF-8 -Dstdout.encoding=UTF-8 -Dstderr.encoding=UTF-8';
const rt = getSqlclRuntimeSettings();   // hot-reload aus data/sqlcl-runtime.json
const heapFlags = `-Xmx\({rt.heapMaxMb}m -Xms\){rt.heapMinMb}m`;
const existing = base.JAVA_TOOL_OPTIONS ?? '';
const parts = existing ? [existing] : [];
if (!existing.includes('file.encoding')) parts.push(utf8Flags);
if (!/-Xmx/.test(existing)) parts.push(heapFlags);
if (rt.extraJavaOpts) parts.push(rt.extraJavaOpts);

Heap-Cap (default 512m max / 64m min) verhindert OOM-Kill bei parallel-sqlcl-Workern auf RAM-limitierten VMs. User kann via UI hochsetzen (z.B. 1024m wenn Migration-Skripte mehr brauchen).

7. DDL-Inspector

scanDistNext(workdir, subdir, scans) walk't rekursiv durch alle .sql unter <workdir>/<subdir> und führt drei Scan-Typen aus:

7.1 `uncomment-drop`

const UNCOMMENT_RE = /\/\*\s*Uncomment[^\n]*\n([\s\S]+?)\*\//g;

Matcht Liquibase's auskommentierte DROP-Blöcke. Apply verschiebt den DROP-Body an den File-Anfang (nach dem Liquibase-Header):

function uncommentDrop(match: InspectorMatch): boolean {
  const content = readFileSync(match.fullPath, 'utf8');
  const inner = extractDropBody(content, match);
  const withoutBlock = content.slice(0, start) + content.slice(end);
  const insertAt = findHeaderEnd(withoutBlock);  // walks `--` lines + already-prepended DROPs
  const dropPayload = inner.trimEnd() + (hasSlash(inner) ? '\n' : '\n/\n');
  return writeFileSync(...slice + dropPayload + slice...);
}

findHeaderEnd ist character-basiert: konsumiert führende ---Zeilen, Leerzeilen und schon prepended DROP …; / -Blöcke. Damit mehrfache Applies sich nicht überlappen sondern hintereinander stehen.

7.2 `comment-out-flashback`

/^\s*alter\s+table\s+[\w.""]+\s+no\s+flashback\s+archive\s*[;\/]/im über Zeilen. Apply: wickelt die Zeile + folgendes / in /* ... */.

Detection ignoriert Statements innerhalb anderer Block-Kommentare (isInsideBlockComment(content, offset)).

7.3 `prepend-create-sequence`

Findet CREATE TABLE mit <seq>.NEXTVAL-Referenzen und prüft, ob das File DAVOR ein CREATE SEQUENCE <seq> enthält. Regex akzeptiert quoted und bareword Identifier:

const CREATE_TABLE_RE =
  /CREATE\s+TABLE\s+(?:(?:"([^"]+)"|(\w+))\s*\.\s*)?(?:"([^"]+)"|(\w+))\s*\(([\s\S]+?)\)\s*[;\/]/gi;
const NEXTVAL_RE =
  /(?:(?:"([^"]+)"|(\w+))\s*\.\s*)?(?:"([^"]+)"|(\w+))\s*\.\s*NEXTVAL\b/gi;

Match-Result-Schema:

interface InspectorMatch {
  action: 'uncomment-drop' | 'comment-out-flashback' | 'prepend-create-sequence';
  actionLabel: string;
  file: string;        // relativ zum workdir
  fullPath: string;
  startLine: number; endLine: number;
  matchStart: number; matchEnd: number;
  statement: string;   // Preview im UI
  hint?: string;
  extra?: Record<string, any>;  // action-spezifisch
}

Stage ddl-inspect ruft scanDistNext, schreibt Findings in blockedData.report, blockt für Approval. Der Frontend rendert die Matches als Liste, jeder mit [📝 Datei öffnen] [✓ Anwenden]. Apply ruft POST /api/runs/:id/stages/:stageId/ddl/apply mit {file, startLine, action} — die Engine validiert dass die Stage geblockt ist, applyMatch patcht das File, dann re-läuft scanDistNext und der Block-Status wird aktualisiert (verbleibende Treffer angezeigt).

Genehmigte DDLs werden zusätzlich in <workdir>/activated_ddls_<release>.sql append'ed — Audit-Trail.

8. Retry-from-Stage

Manche Stage-Fehler erfordern Re-Run einer vorgelagerten Stage. Beispiel: project deploy -file artifact.zip failed mit ORA-Fehler, der User korrigiert die Source-SQL in dist/.../foo.sql, jetzt muss aber das Artifact neu gebaut werden bevor der Deploy nochmal läuft.

Lösung: retryFrom-Annotation auf der Stage-YAML:

- id: deploy-testmig-test
  type: project
  params:
    command: project deploy -file artifact/zodis_erp-${release_version}.test.zip
    retryFrom: gen-test-artifact

API: POST /api/runs/:id/stages/:stageId/retry mit {from: "gen-test-artifact"}. Engine:

retry(runId, stageId, fromStageId?): boolean {
  if (fromStageId) this.pendingRetryFrom.set(runId, fromStageId);
  this.pending.get(`\({runId}:\){stageId}`)?.reject(new Error('__retry__'));
}

// in runStage catch:
if (rejectErr?.message === '__retry__') {
  this.updateStageState(...{ status: 'pending', error: undefined, ... });
  return 'retry';
}

// in executeRun loop:
if (outcome === 'retry') {
  const rewindTo = this.pendingRetryFrom.get(runId);
  if (rewindTo) {
    const targetIdx = pipeline.stages.findIndex(s => s.id === rewindTo);
    if (targetIdx >= 0 && targetIdx < i) {
      // Stages targetIdx..i auf pending zurücksetzen
      for (let j = targetIdx; j <= i; j++) {
        this.updateStageState(runId, pipeline.stages[j].id, { status: 'pending', ... });
      }
      i = targetIdx - 1;  // wird auf targetIdx hochgezählt
      continue;
    }
  }
  i--;  // default: gleiche Stage nochmal
  continue;
}

9. Pipeline Editor und Auto-Commit

Der Pipeline/Feature-Editor ist eine YAML-Textarea mit Validate-Button. Save-Flow:

PUT /api/pipelines/files/:id (body: { yaml })
  → parseSinglePipeline(id, yaml, cfg.features)   // throws bei Parse-Fehler
  → writeFileSync(path + '.bak', oldContent)
  → writeFileSync(path, yaml)
  → cfg.pipelines.set(id, parsed)                 // hot-reload
  → autoCommitConfigChange([relPath], message)

autoCommitConfigChange:

async function autoCommitConfigChange(relPaths, message) {
  const g = simpleGit(projectRoot);
  if (!await g.checkIsRepo()) return { committed: false };
  await g.add(relPaths);   // nur die spezifischen Dateien, kein git add .
  const status = await g.status();
  if (stagedCount === 0) return { committed: false };  // identischer Inhalt
  await g.commit(message);
  try {
    await Promise.race([
      g.push(),
      timeoutPromise(15000)
    ]);
    return { committed: true, pushed: true };
  } catch (pushErr) {
    return { committed: true, pushed: false, warning: pushErr.message };
  }
}

Push-Fehler ist nicht-fatal — der Commit bleibt lokal, Response-Body enthält die Warning. Damit der Push-Fehler beim nächsten Tool-Update nicht in einen Divergent-Branch-Konflikt mündet, ruft der Self-Update- Pfad git pull --rebase --autostash statt --ff-only.

Deaktivierbar via Environment=AUTO_COMMIT_CONFIG=0 in der systemd-Unit.

10. Connection-Modell

Drei Quellen, in dieser Reihenfolge konsolidiert:

const cfg.connections = new Map<string, ConnectionDef>();

// 1. YAML (config/connections.yaml)
for (const [name, def] of yamlConnections) cfg.connections.set(name, def);

// 2. Bundled (data/connections.json, AES-256-GCM)
for (const [name, def] of connStore.toConnectionMap()) cfg.connections.set(name, def);
// → bundled überschreibt yaml bei Name-Konflikt

ConnectionDef:

interface ConnectionDef {
  connectString?: string;     // "user/pwd@host:port/svc" — bundled oder direkt YAML
  namedConnection?: string;   // SQLcl-Wallet-Alias, wird zu CONNECT -n <name>
  jdbcUrl?: string;           // Reserviert
}

Beim SQLcl-Spawn wird einer von beiden ins generierte CONNECT-Statement gepiped (über stdin, nicht argv — keine Klartext-Passwörter im Process-Listing).

Bundled-Store verschlüsselt mit AES-256-GCM (12-Byte IV + 16-Byte Tag + Ciphertext, alles base64). Master-Key 32 random Bytes in data/.master.key (chmod 0600), beim ersten Start generiert.

Connection-Check-Tool im Admin:

checkConnections(cfg): {
  configured: Array<{ name, source: 'bundled' | 'wallet-named' | 'missing' }>;
  walletNames: string[];        // aus `connmgr list`
  missing: string[];            // configured aber weder bundled noch wallet
  referenced: Array<{           // in Pipelines via params.connection: <name>
    name, configured: boolean, usedByPipelines: string[]
  }>;
  referencedMissing: string[];  // referenced aber nicht in cfg.connections
}

Damit fallen drei Klassen von Konfigurations-Drift auf:

Connection im YAML aber nicht im Wallet → SQLcl-CONNECT failed
Connection in Pipeline referenziert aber nirgends konfiguriert → Stage failed
Connection im Wallet aber nirgends genutzt → toter Eintrag (nur Info)

11. Secrets-Store

Verschlüsseltes Key/Value (gleicher Master-Key, gleiche AES-256-GCM-Routine). Beim createRun Merge-Order:

private deriveVars(pipeline, inputs): Record<string, string> {
  const vars: Record<string, string> = {};
  // 1. Secrets (niedrigste Priorität)
  for (const [k, v] of this.secretsStore.toVarsMap()) vars[k] = v;
  // 2. User-Inputs
  for (const [k, v] of Object.entries(inputs)) vars[k] = v;
  // 3. Derived (computed from inputs)
  for (const [k, expr] of Object.entries(pipeline.derived)) {
    vars[k] = interpolate(expr, vars);
  }
  return vars;
}

Damit User-Inputs einer Pipeline (z.B. zdbPassword explizit eingegeben) einen Secret-Eintrag mit gleichem Namen überschreiben können. Das übliche Setup hat aber den Secret-Eintrag — und die Pipeline-YAML hat KEIN entsprechendes Input mehr, also läuft's transparent.

12. Workdir-Management

Default-Quelle:

// 1. workdir.json (UI-Override)
const overrideWorkdir = workdirStore.get();
if (overrideWorkdir) cfg.settings.defaultWorkdir = overrideWorkdir;
// 2. settings.yaml's defaultWorkdir
// 3. process.cwd() als Fallback

Admin-UI deckt ab:

GET /api/admin/workdir — Status (Pfad, isGitRepo, branch, ahead/behind, clean, lastCommit, remoteUrl)
PUT /api/admin/workdir — Pfad ändern (validiert isAbsolute)
POST /api/admin/workdir/clone — simpleGit().clone() für initial setup
POST /api/admin/workdir/fetch — git fetch
POST /api/admin/workdir/pull — git pull (returns merge-summary)
GET /api/admin/workdir/changes — strukturierte Liste modifizierter Dateien (für Commit-Form)
POST /api/admin/workdir/commit — git add -A; git commit -m <msg>; git push? (mit 30s push-Timeout)
POST /api/admin/workdir/push — push only

Pipeline-Inputs mit id: workdir werden im API-Response von /api/pipelines mit dem aktuellen cfg.settings.defaultWorkdir als default: überschrieben — das YAML-Original bleibt unangetastet, aber das Frontend sieht den serverspezifischen Default.

Navbar-Badge zeigt Branch via 60s-Polling auf /api/admin/workdir.

13. Self-Update

Trigger via Admin-Button. Script wird als detached bash spawn'd:

set -e
echo "[\((date -Iseconds)] === self-update start (pid \)PARENT_PID) ==="
cd "${projectRoot}"
git pull --rebase --autostash
npm install --include=dev      # --include=dev wegen NODE_ENV=production
npm run build                  # tsc
npm prune --omit=dev           # production-footprint
sleep 1
kill -TERM ${parentPid}        # systemd Restart=always

Tool-Side-State:

const selfUpdateState = { running: false, startedAt: 0, finishedAt: 0 };

child.on('close', ...) setzt running = false falls der Script vor dem SIGTERM exit'd (= Build-Fehler), damit der Flag nicht hängen bleibt. Bei Erfolg sieht der Tool-Process das close-Event nicht mehr weil er schon tot ist — systemd startet ihn neu, neuer Process initialisiert mit running: false.

Pre-Check Endpoint: GET /api/admin/self-update/check — git fetch (15s timeout) + git log HEAD..origin/<branch> für Pending-Commits. Wird beim Öffnen der Admin-Page automatisch gepingt, Button-Label zeigt (N commits) wenn was anliegt.

14. Beobachtete Schwachstellen

Während des Building's:

Output-Buffer ohne Newline-Separator — sqlcl.ts emittiert via onOutput(level, text) Lines OHNE trailing \n. Im Output-Buffer für post-hoc Pattern-Matching wurden sie ursprünglich einfach konkateniert (buf += text). Effekt: line-basierte Regexes (/[^\n]+/) matched über hunderte Lines hinweg, weil sie nie einen Trenner sehen. Fix: buf += text + '\n' — und der Stacktrace-Filter funktioniert plötzlich.

Detached Child + nicht-gesetzter close-Handler — der Self-Update- Child wurde initial mit child.unref() versehen damit er den Parent überlebt. Folge: kein close-Event wurde im Parent observed, also blieb running=true hängen wenn der Script vor SIGTERM failed. Lösung: NICHT unref'en — Child outliv't den Parent trotzdem (er wird vom Parent per SIGTERM beendet, wenn der Parent vorher tot ist gibt's halt kein close-Event mehr, was auch ok ist).

Audit-Trail vs. Cache-Invalidation — state.logs auf Client-Side wird bei ↻ Re-Build & Retry geleert für visuelle Klarheit. Bei Hard-Reload kommen die alten Logs via SSE-Replay zurück — DB-State bleibt. Trade-off: Audit-Trail bleibt, visuelles Noise weg.

Permission-Drift im git-Working-Tree — wenn mal als root und mal als Service-User git pull läuft, gehören manche Objects in .git/objects/ root, andere zodis. Service kann dann nicht mehr schreiben (insufficient permission for adding an object). Fix unidirektional: chown -R zodis:zodis /app/maintenance und ab dann NUR noch sudo -u zodis -H git ....

Heap-Cap-Interaktion mit JAVA_TOOL_OPTIONS — die UTF-8-Encoding- Flags wurden in buildSqlclEnv an existierende JAVA_TOOL_OPTIONS nur angehängt wenn file.encoding noch nicht drin war. Ein User der für SQLcl manuell JAVA_TOOL_OPTIONS=-Dfile.encoding=UTF-8 gesetzt hatte, verlor unbemerkt den Heap-Cap → exit 137. Fix: Encoding- und Heap-Flags unabhängig voneinander prüfen (!/-Xmx/.test(existing)).

15. Setup

Dev-Loop:

git clone https://gitlab.zgonc.at/zodis/maintenance.git
cd maintenance/deployment
npm install
npm run setup:sqlcl       # lädt SQLcl in vendor/sqlcl/
cp config/settings.example.yaml config/settings.yaml
npm run dev               # tsx watch src/server.ts

Production (systemd):

[Service]
Type=simple
User=zodis
WorkingDirectory=/app/maintenance/deployment
ExecStart=/usr/bin/node dist/server.js
Environment=PORT=3001
Environment=HOST=0.0.0.0
Environment=NODE_ENV=production
Environment=HOME=/home/zodis
Environment=JAVA_HOME=/usr/lib/jvm/jre-17-openjdk
Environment=PATH=/usr/lib/jvm/jre-17-openjdk/bin:/usr/bin:/bin
Restart=always
RestartSec=2s

Token-Setup (~/.git-credentials als Service-User):

sudo -u zodis -H -i bash
git config --global credential.helper store
git config --global user.name "ZodisDeploy"
git config --global user.email "zodis-deploy@..."
echo 'https://oauth2:glpat-XXX@gitlab.zgonc.at' > ~/.git-credentials
chmod 600 ~/.git-credentials

16. Quantitatives

Backend: ~6500 TypeScript LoC, davon ~2000 in api.ts (Endpoint-Dichte), ~1500 in stages.ts, ~700 in engine.ts.
Frontend: ~4500 JavaScript LoC, monolithisch (web/app.js), direkt-DOM via Helper el(tag, attrs, ...children).
Config-Volumen (Production-Setup): ~14 Features × ~50 Zeilen + 9 Pipelines × ~60 Zeilen = ~1200 Zeilen YAML.
Build: tsc ~3-5s, npm install --include=dev ~30s mit gefülltem Cache.
Cold-Start eines SQLcl-Spawns: 5-15s (JVM-Bootstrap).
DB-Footprint pro Run: 10-50 KB in runs.db, plus Plain-Text-Log unter data/logs/zodis-run-<id>.log (variabel, typisch 50KB-5MB).