feat: database detection + AKS workload identity support by gambtho · Pull Request #630 · Azure/containerization-assist

gambtho · 2026-03-12T21:32:18Z

Summary

Database detector (analyze-repo): new detectDatabases() function identifies databases from dependency lists across Node, Python, Java, .NET, Go, and Rust ecosystems. Results surface as detectedDatabases on each module in the analysis output and in the summary line.
Gradle parser fix: expanded dependency regex to capture runtimeOnly, api, and compileOnly scopes (previously only implementation), so DB drivers like org.postgresql:postgresql declared as runtimeOnly are no longer missed.
K8s manifest generation: when database dependencies are detected, automatically includes serviceaccount.yaml and appends workload identity guidance (annotations, pod labels, passwordless auth) to the manifest plan.
AKS loop prompt: inserts a new "Check for database dependencies" step after repo analysis, and passes DB types as detectedDependencies to manifest generation.
Knowledge packs: 3 new Kubernetes entries (workload identity SA, pod label, passwordless DB auth) and 1 new database entry (Azure managed DB migration pattern with ConfigMap + workload identity).
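A minimal sketch of the dependency-to-database mapping described in the summary, assuming a small anchored pattern table. The `DatabaseType` union, the `DB_PATTERNS` table, and the specific package names are illustrative assumptions, not the contents of `database-detector.ts`.

```typescript
// Illustrative sketch only: the pattern table is an assumption, not the
// contents of database-detector.ts.
type DatabaseType = "postgresql" | "mysql" | "mongodb" | "redis";

// Patterns are anchored with ^...$ so that short names such as "pg" do not
// match unrelated packages like "pgp-lib".
const DB_PATTERNS: ReadonlyArray<{ db: DatabaseType; pattern: RegExp }> = [
  { db: "postgresql", pattern: /^(pg|psycopg2(-binary)?|npgsql|org\.postgresql:postgresql|github\.com\/lib\/pq)$/ },
  { db: "mysql", pattern: /^(mysql2?|pymysql|com\.mysql:mysql-connector-j|github\.com\/go-sql-driver\/mysql)$/ },
  { db: "mongodb", pattern: /^(mongodb|mongoose|pymongo)$/ },
  { db: "redis", pattern: /^(redis|ioredis|github\.com\/go-redis\/redis(\/v\d+)?)$/ },
];

function detectDatabases(dependencies: string[]): DatabaseType[] {
  const found = new Set<DatabaseType>();
  for (const dep of dependencies) {
    // Normalize so "Npgsql" and "npgsql" are treated the same.
    const name = dep.trim().toLowerCase();
    for (const { db, pattern } of DB_PATTERNS) {
      if (pattern.test(name)) found.add(db);
    }
  }
  // A Set removes duplicates when several drivers map to one database.
  return Array.from(found);
}
```

With this table, `detectDatabases(["PG", "org.postgresql:postgresql", "ioredis"])` returns `["postgresql", "redis"]`: the two PostgreSQL drivers collapse into one entry and matching ignores case.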

Test plan

27 unit tests for detectDatabases covering all ecosystems, groupId:artifactId (Java), Go module paths, dedup, and case-insensitivity
Existing analyze-repo tests pass (41 total)
Manual validation against tmp/spring-petclinic — detects MySQL + PostgreSQL from Gradle runtimeOnly deps
Run npm run validate (pre-existing embedded-packs build issue unrelated to this PR)
End-to-end test via MCP inspector: analyze-repo → generate-k8s-manifests with detected DB types

Copilot

Pull request overview

Adds database dependency detection to analyze-repo and threads that signal into AKS/Kubernetes manifest generation guidance (including workload identity ServiceAccount hints), plus supporting prompt steps and knowledge-pack recommendations.

Changes:

Introduces a detectDatabases() mapper and exposes results as modules[].detectedDatabases in analyze-repo output (and summary text).
Updates k8s manifest planning to optionally include serviceaccount.yaml and workload identity instructions when DB types are detected.
Expands Gradle dependency extraction and adds knowledge-pack entries + new unit tests for DB detection.
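The Gradle change can be pictured with a sketch like the one below. It assumes a line-oriented regex over the four scopes named in the PR description; the function name `extractGradleDependencies` and the exact pattern are hypothetical and may differ from what `parsers.ts` does.

```typescript
// Hypothetical sketch of a broadened Gradle dependency matcher.
// Scopes listed here are the ones named in the PR description.
function extractGradleDependencies(buildFile: string): string[] {
  // Matches e.g.  runtimeOnly 'group:artifact'  or  api("group:artifact:1.0")
  const pattern =
    /^\s*(?:implementation|runtimeOnly|api|compileOnly)\s*\(?\s*['"]([^'"]+)['"]/gm;
  const deps: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(buildFile)) !== null) {
    const coordinate = match[1];
    if (coordinate === undefined) continue;
    // Keep "group:artifact" and drop a trailing version, if present.
    deps.push(coordinate.split(":").slice(0, 2).join(":"));
  }
  return deps;
}
```

Because the scope alternation is anchored to the start of the line, configurations such as `testImplementation` are not picked up by this sketch.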

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
test/unit/tools/database-detector.test.ts	Adds unit coverage for the new DB detection function.
src/tools/generate-k8s-manifests/tool.ts	Conditionally adds ServiceAccount manifest + workload identity instruction based on detected DB types.
src/tools/analyze-repo/tool.ts	Calls DB detector and adds detected DB types to the user-facing analysis summary.
src/tools/analyze-repo/schema.ts	Extends module schema with optional `detectedDatabases` output.
src/tools/analyze-repo/parsers.ts	Broadens Gradle dependency keyword matching.
src/tools/analyze-repo/database-detector.ts	New dependency-to-database detection/normalization logic (core of feature).
src/prompts/shared/steps.ts	Adds a prompt step instructing the workflow to check detected DBs.
src/prompts/aks-loop/prompt.ts	Updates AKS loop prompt to pass detected DB info into manifest generation.
knowledge/packs/kubernetes-pack.json	Adds AKS workload identity ServiceAccount + pod-label recommendations.
knowledge/packs/database-pack.json	Adds Azure-managed DB migration/config guidance for k8s + workload identity.

src/tools/generate-k8s-manifests/tool.ts

src/tools/analyze-repo/database-detector.ts

test/unit/tools/database-detector.test.ts

src/prompts/shared/steps.ts

src/prompts/aks-loop/prompt.ts

src/tools/analyze-repo/database-detector.ts

Copilot

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.

src/tools/generate-k8s-manifests/tool.ts

src/tools/generate-k8s-manifests/schema.ts

src/tools/analyze-repo/schema.ts

src/tools/analyze-repo/parsers.ts

Copilot

Pull request overview

Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.

test/unit/tools/database-detector.test.ts

knowledge/packs/kubernetes-pack.json

src/tools/generate-k8s-manifests/schema.ts

src/tools/analyze-repo/database-detector.ts

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.

src/tools/generate-dockerfile/tool.ts

src/tools/analyze-repo/env-detector.ts

src/tools/analyze-repo/tool.ts

src/tools/analyze-repo/env-detector.ts

src/tools/generate-k8s-manifests/tool.ts

… quality
- Return { vars, warning? } from detectEnvVarsFromDockerCompose instead of silently swallowing YAML parse errors
- Add debug logging for undefined vs empty detectedDatabases in generate-k8s-manifests to aid orchestration debugging
- Extract partitionEnvVarNames helper to deduplicate env var classification across generate-dockerfile and generate-k8s-manifests
- Pre-compile regex patterns in env-detector for better performance

davidgamero · 2026-03-26T01:44:59Z

src/prompts/shared/steps.ts

+      '  1. Confirm the classifications (secret, database, config) are correct.',
+      '  2. For secret-classified vars, confirm they will be injected at runtime (not baked into the image).',
+      '  3. For config-classified vars, confirm default values or ask for correct values.',
+      '- Pass the confirmed `detectedEnvVars` to downstream tools (generate-dockerfile, generate-k8s-manifests).',


let's use TOOL_NAME var properties to ref the tools here so they don't drift
ex: TOOL_NAME.VERIFY_DEPLOY
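One way to read this suggestion, sketched under assumptions: the `TOOL_NAME` object below is a stand-in with two hypothetical keys, and `envVarHandoffStep` is an invented helper. The real constant in the repo may have a different shape.

```typescript
// Hypothetical stand-in for the repo's TOOL_NAME constant.
const TOOL_NAME = {
  GENERATE_DOCKERFILE: "generate-dockerfile",
  GENERATE_K8S_MANIFESTS: "generate-k8s-manifests",
} as const;

// Builds the prompt line from the constants so a tool rename cannot leave
// stale names behind in the prompt text.
function envVarHandoffStep(): string {
  return (
    "- Pass the confirmed `detectedEnvVars` to downstream tools " +
    `(${TOOL_NAME.GENERATE_DOCKERFILE}, ${TOOL_NAME.GENERATE_K8S_MANIFESTS}).`
  );
}
```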

davidgamero · 2026-03-26T02:04:11Z

src/tools/analyze-repo/env-detector.ts

+    case 'javascript':
+    case 'typescript':
+      add('NODE_ENV', 'production');
+      add('PORT', '3000');


we can probably move this out into a separate detection task within analyze repo next

Copilot

Pull request overview

Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.

Copilot · 2026-04-09T19:57:31Z

src/tools/analyze-repo/env-detector.ts

+/**
+ * Partition env vars into config names and secret/database names in a single pass.
+ */
+export function partitionEnvVarNames(vars: DetectedEnvVar[]): { configNames: string[]; secretNames: string[] } {
+  const configNames: string[] = [];
+  const secretNames: string[] = [];
+  for (const v of vars) {
+    if (v.classification === 'config') {
+      configNames.push(v.name);
+    } else {
+      secretNames.push(v.name);
+    }
+  }
+  return { configNames, secretNames };
+}


partitionEnvVarNames currently groups both 'secret' and 'database' classifications into secretNames. This causes DB_* vars like DB_HOST/DB_PORT to be treated as secrets downstream (Secret generation + 'do not bake' warnings), which conflicts with the separate 'database' classification and common practice of putting non-secret DB connection details into ConfigMaps. Consider returning separate arrays (config/database/secret) or at least grouping 'database' with configNames for downstream guidance.
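A sketch of the three-way split this comment proposes. The `DetectedEnvVar` type here is trimmed to the two fields the function reads, and the `databaseNames` field name is an assumption about how a fix might be shaped.

```typescript
// Trimmed-down type for illustration; the real DetectedEnvVar has more fields.
type Classification = "config" | "database" | "secret";
interface DetectedEnvVar {
  name: string;
  classification: Classification;
}

interface PartitionedEnvVarNames {
  configNames: string[];
  databaseNames: string[];
  secretNames: string[];
}

// Partition env vars into three buckets in a single pass, so that
// database-classified vars are no longer lumped in with secrets.
function partitionEnvVarNames(vars: DetectedEnvVar[]): PartitionedEnvVarNames {
  const result: PartitionedEnvVarNames = {
    configNames: [],
    databaseNames: [],
    secretNames: [],
  };
  for (const v of vars) {
    switch (v.classification) {
      case "config":
        result.configNames.push(v.name);
        break;
      case "database":
        result.databaseNames.push(v.name);
        break;
      case "secret":
        result.secretNames.push(v.name);
        break;
    }
  }
  return result;
}
```

Keeping `database` as its own bucket leaves the ConfigMap-versus-Secret decision to the caller instead of encoding it in the partition step.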

Copilot · 2026-04-09T19:57:32Z

src/tools/analyze-repo/env-detector.ts

+export function detectEnvVarsFromSpringConfig(content: string): DetectedEnvVar[] {
+  const vars: DetectedEnvVar[] = [];
+  const source = 'application.properties';
+  const seen = new Set<string>();
+
+  // Match ${VAR_NAME} or ${VAR_NAME:defaultValue}
+  const pattern = /\$\{([A-Za-z_][A-Za-z0-9_.]*?)(?::([^}]*))?\}/g;
+  let match: RegExpExecArray | null;
+
+  while ((match = pattern.exec(content)) !== null) {
+    const name = match[1]!;
+    if (seen.has(name)) continue;
+    seen.add(name);
+
+    const defaultValue = match[2];
+    vars.push({
+      name,
+      classification: classifyEnvVar(name),
+      source,
+      required: defaultValue === undefined,
+      ...(defaultValue !== undefined && { defaultValue }),
+    });
+  }


detectEnvVarsFromSpringConfig hard-codes source='application.properties', so vars extracted from application.yml will be attributed to the wrong file. Also, the placeholder regex allows dots in the variable name ([A-Za-z0-9_.]), which will capture property placeholders like ${spring.profiles.active} as "env vars". Consider accepting a source param and restricting names to env-var-like tokens (e.g., [A-Za-z_][A-Za-z0-9_]*), or filtering out keys with '.' to reduce false positives.
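A sketch of the two changes suggested here: a `source` parameter and a name pattern without dots. Classification is omitted because `classifyEnvVar` is not shown in this thread, and the `SpringEnvVar` type is a simplified, hypothetical stand-in for `DetectedEnvVar`.

```typescript
// Simplified stand-in for DetectedEnvVar (classification omitted).
interface SpringEnvVar {
  name: string;
  source: string;
  required: boolean;
  defaultValue?: string;
}

function detectEnvVarsFromSpringConfig(content: string, source: string): SpringEnvVar[] {
  const vars: SpringEnvVar[] = [];
  const seen = new Set<string>();

  // Match ${VAR_NAME} or ${VAR_NAME:defaultValue}. Dots are not allowed in the
  // name, so Spring property placeholders such as ${spring.profiles.active}
  // do not match at all.
  const pattern = /\$\{([A-Za-z_][A-Za-z0-9_]*)(?::([^}]*))?\}/g;
  let match: RegExpExecArray | null;

  while ((match = pattern.exec(content)) !== null) {
    const name = match[1];
    if (name === undefined || seen.has(name)) continue;
    seen.add(name);

    const defaultValue = match[2];
    const entry: SpringEnvVar = { name, source, required: defaultValue === undefined };
    if (defaultValue !== undefined) entry.defaultValue = defaultValue;
    vars.push(entry);
  }
  return vars;
}
```

The caller passes the file it actually read, for example `"application.yml"`, so each var is attributed to the right file.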

Copilot · 2026-04-09T19:57:32Z

src/tools/generate-k8s-manifests/tool.ts

+        envVarInstruction += ` Create ConfigMap with config vars: ${configNames.join(', ')}.`;
+      }
+      if (secretNames.length > 0) {
+        envVarInstruction += ` Reference Secrets for secret vars via envFrom: ${secretNames.join(', ')}.`;


The instruction text says "Reference Secrets for secret vars via envFrom: <VAR1, VAR2>" but envFrom references Secret/ConfigMap resources, not individual environment variable names/keys. This is likely to generate invalid manifests. Rephrase to something like "Create a Secret with keys: ... and reference it via envFrom/secretKeyRef" (and similarly for ConfigMap) so the generated YAML is syntactically correct.

Suggested change

-        envVarInstruction += ` Create ConfigMap with config vars: ${configNames.join(', ')}.`;
-      }
-      if (secretNames.length > 0) {
-        envVarInstruction += ` Reference Secrets for secret vars via envFrom: ${secretNames.join(', ')}.`;
+        envVarInstruction += ` Create a ConfigMap with keys for config vars: ${configNames.join(', ')} and reference it from the workload via envFrom/configMapRef or individual env entries with configMapKeyRef as appropriate.`;
+      }
+      if (secretNames.length > 0) {
+        envVarInstruction += ` Create a Secret with keys for secret vars: ${secretNames.join(', ')} and reference it from the workload via envFrom/secretRef or individual env entries with secretKeyRef as appropriate.`;

Copilot · 2026-04-09T19:57:32Z

src/tools/generate-k8s-manifests/tool.ts

+      // Partition env vars once for manifest decisions and instruction building
+      const { configNames, secretNames } = partitionEnvVarNames(input.detectedEnvVars ?? []);
+
+      // Add configmap if there are ports or config-classified environment variables
+      if ((input.ports && input.ports.length > 0) || configNames.length > 0) {
        manifestFiles.push({ path: './k8s/configmap.yaml', purpose: 'Configuration management' });
      }

+      // Add secret if secret-classified environment variables are detected
+      if (secretNames.length > 0) {
+        manifestFiles.push({ path: './k8s/secret.yaml', purpose: 'Secret management' });
+      }


Because partitionEnvVarNames treats classification==='database' as a secret, secretNames will include DB_HOST/DB_PORT/etc. This will (1) create secret.yaml instead of configmap.yaml for non-secret DB settings, and (2) omit configmap.yaml entirely when only database vars exist and ports are empty. Once partitioning is fixed, consider adjusting these manifest file decisions so 'database' vars map to a ConfigMap (and only credentials/connection strings map to a Secret).
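A sketch of how the manifest decisions might look once partitioning is three-way. It assumes database-classified names are routed to the ConfigMap, as this comment suggests; `planEnvManifests` and the `EnvVarNames` shape are hypothetical, while the file paths and purposes are taken from the diff above.

```typescript
interface ManifestFile {
  path: string;
  purpose: string;
}

// Hypothetical shape of a three-way partition result.
interface EnvVarNames {
  configNames: string[];
  databaseNames: string[];
  secretNames: string[];
}

function planEnvManifests(names: EnvVarNames, ports: number[] = []): ManifestFile[] {
  const manifestFiles: ManifestFile[] = [];

  // Non-secret DB settings (host, port, name) count toward the ConfigMap,
  // so it is still emitted when only database vars are present.
  const needsConfigMap =
    ports.length > 0 || names.configNames.length > 0 || names.databaseNames.length > 0;
  if (needsConfigMap) {
    manifestFiles.push({ path: "./k8s/configmap.yaml", purpose: "Configuration management" });
  }

  // Only secret-classified vars trigger a Secret manifest.
  if (names.secretNames.length > 0) {
    manifestFiles.push({ path: "./k8s/secret.yaml", purpose: "Secret management" });
  }
  return manifestFiles;
}
```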

Copilot · 2026-04-09T19:57:32Z

src/tools/analyze-repo/tool.ts

+import { detectDatabases } from './database-detector';
+import {
+  detectEnvVarsFromEnvFile,
+  detectEnvVarsFromDockerCompose,
+  detectEnvVarsFromSpringConfig,
+  inferFrameworkEnvVars,
+  deduplicateEnvVars,
+} from './env-detector';


PR description focuses on database detection + workload identity, but this PR also introduces a full env var detection/inference pipeline (new analyze-repo output fields, prompt step, Dockerfile/K8s instruction changes). Please update the PR description/test plan to explicitly call out env var detection changes and any expected user-facing behavior (e.g., always-inferred vars) so reviewers/users know what’s being introduced.

- Merge main (1 commit behind, no conflicts)
- Use TOOL_NAME constants in envVarCheckStep (steps.ts)
- Separate database from secret in partitionEnvVarNames (3-way split)
- Fix detectEnvVarsFromSpringConfig: accept source param, reject dotted Spring property names from regex
- Fix envFrom instruction syntax in generate-k8s-manifests to use proper configMapRef/secretRef K8s syntax
- Add start anchors to unanchored regex patterns in database-detector
- Expand CONFIG_FILE_PATTERN to match .yaml extension variants
- Redact defaultValue for secret-classified env vars to prevent credential leakage in tool output
- Update generate-dockerfile to handle databaseNames as config
- Add TODO for future env-detector extraction
- Fix pre-existing ESLint warnings in env-detector.ts

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
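The redaction item in this commit could look roughly like the sketch below. The `[REDACTED]` placeholder, the `redactSecretDefaults` name, and the trimmed `DetectedEnvVar` type are all assumptions; the commit message does not say how the redaction is implemented.

```typescript
// Trimmed-down type for illustration only.
type Classification = "config" | "database" | "secret";
interface DetectedEnvVar {
  name: string;
  classification: Classification;
  defaultValue?: string;
}

// Assumed placeholder text; the actual value used in the PR is not shown.
const REDACTED = "[REDACTED]";

// Returns a copy of the list in which secret-classified vars keep their name
// but never expose a default value in tool output.
function redactSecretDefaults(vars: DetectedEnvVar[]): DetectedEnvVar[] {
  return vars.map((v) =>
    v.classification === "secret" && v.defaultValue !== undefined
      ? { ...v, defaultValue: REDACTED }
      : v,
  );
}
```

Vars without a default are passed through unchanged, so the output still distinguishes "required, no default" from "has a default that was hidden".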

Copilot AI review requested due to automatic review settings March 12, 2026 21:32

gambtho changed the title ~~feat: enhance db detection and associated manifests~~ feat: database detection + AKS workload identity support Mar 12, 2026

Copilot started reviewing on behalf of gambtho March 12, 2026 21:33

Copilot AI reviewed Mar 12, 2026

github-advanced-security bot found potential problems Mar 12, 2026

src/tools/analyze-repo/database-detector.ts: 3 alerts, all marked Fixed

gambtho force-pushed the thgamble/dbmigration branch from 4247bfe to a0ecd1c Compare March 12, 2026 21:45

Copilot AI review requested due to automatic review settings March 13, 2026 14:33

Copilot started reviewing on behalf of gambtho March 13, 2026 14:33

Copilot AI reviewed Mar 13, 2026

src/tools/generate-k8s-manifests/tool.ts (outdated, resolved)

src/tools/generate-k8s-manifests/schema.ts (resolved)

src/tools/analyze-repo/schema.ts (resolved)

src/tools/analyze-repo/parsers.ts (outdated, resolved)

Copilot AI review requested due to automatic review settings March 13, 2026 15:01

Copilot started reviewing on behalf of gambtho March 13, 2026 15:02

Copilot AI reviewed Mar 13, 2026

gambtho and others added 4 commits March 25, 2026 15:04

feat: enhance db detection and associated manifests

2dc1b57

pr feedback

ec10dfa

Apply suggestions from code review

980a2d1

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

more pr feedback

54b63ee

Copilot AI review requested due to automatic review settings March 25, 2026 19:33

gambtho force-pushed the thgamble/dbmigration branch from 3124d49 to 54b63ee Compare March 25, 2026 19:33

Copilot started reviewing on behalf of gambtho March 25, 2026 19:36

Copilot AI reviewed Mar 25, 2026

gambtho force-pushed the thgamble/dbmigration branch from 6c48f57 to 49864b8 Compare March 25, 2026 19:43

gambtho enabled auto-merge (squash) March 26, 2026 01:08

davidgamero reviewed Mar 26, 2026

Merge branch 'main' into thgamble/dbmigration

ab416af

Copilot AI review requested due to automatic review settings April 9, 2026 19:53

Copilot started reviewing on behalf of davidgamero April 9, 2026 19:53

Copilot AI reviewed Apr 9, 2026

Merge remote-tracking branch 'gambtho/main' into pr-630-update

dc375d9
