feat: database detection + AKS workload identity support#630
feat: database detection + AKS workload identity support#630
Conversation
There was a problem hiding this comment.
Pull request overview
Adds database dependency detection to analyze-repo and threads that signal into AKS/Kubernetes manifest generation guidance (including workload identity ServiceAccount hints), plus supporting prompt steps and knowledge-pack recommendations.
Changes:
- Introduces a
detectDatabases()mapper and exposes results asmodules[].detectedDatabasesin analyze-repo output (and summary text). - Updates k8s manifest planning to optionally include
serviceaccount.yamland workload identity instructions when DB types are detected. - Expands Gradle dependency extraction and adds knowledge-pack entries + new unit tests for DB detection.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 5 comments.
Show a summary per file
| File | Description |
|---|---|
| test/unit/tools/database-detector.test.ts | Adds unit coverage for the new DB detection function. |
| src/tools/generate-k8s-manifests/tool.ts | Conditionally adds ServiceAccount manifest + workload identity instruction based on detected DB types. |
| src/tools/analyze-repo/tool.ts | Calls DB detector and adds detected DB types to the user-facing analysis summary. |
| src/tools/analyze-repo/schema.ts | Extends module schema with optional detectedDatabases output. |
| src/tools/analyze-repo/parsers.ts | Broadens Gradle dependency keyword matching. |
| src/tools/analyze-repo/database-detector.ts | New dependency-to-database detection/normalization logic (core of feature). |
| src/prompts/shared/steps.ts | Adds a prompt step instructing the workflow to check detected DBs. |
| src/prompts/aks-loop/prompt.ts | Updates AKS loop prompt to pass detected DB info into manifest generation. |
| knowledge/packs/kubernetes-pack.json | Adds AKS workload identity ServiceAccount + pod-label recommendations. |
| knowledge/packs/database-pack.json | Adds Azure-managed DB migration/config guidance for k8s + workload identity. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
4247bfe to
a0ecd1c
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 4 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 11 out of 11 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
You can also share your feedback on Copilot code review. Take the survey.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
3124d49 to
54b63ee
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 7 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
… quality
- Return { vars, warning? } from detectEnvVarsFromDockerCompose instead
of silently swallowing YAML parse errors
- Add debug logging for undefined vs empty detectedDatabases in
generate-k8s-manifests to aid orchestration debugging
- Extract partitionEnvVarNames helper to deduplicate env var
classification across generate-dockerfile and generate-k8s-manifests
- Pre-compile regex patterns in env-detector for better performance
6c48f57 to
49864b8
Compare
src/prompts/shared/steps.ts
Outdated
| ' 1. Confirm the classifications (secret, database, config) are correct.', | ||
| ' 2. For secret-classified vars, confirm they will be injected at runtime (not baked into the image).', | ||
| ' 3. For config-classified vars, confirm default values or ask for correct values.', | ||
| '- Pass the confirmed `detectedEnvVars` to downstream tools (generate-dockerfile, generate-k8s-manifests).', |
There was a problem hiding this comment.
let's use TOOL_NAME var properties to ref the tools here so they don't drift
ex: TOOL_NAME.VERIFY_DEPLOY
| case 'javascript': | ||
| case 'typescript': | ||
| add('NODE_ENV', 'production'); | ||
| add('PORT', '3000'); |
There was a problem hiding this comment.
we can probably move this out into a separate detection task within analyze repo next
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 15 out of 15 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| /** | ||
| * Partition env vars into config names and secret/database names in a single pass. | ||
| */ | ||
| export function partitionEnvVarNames(vars: DetectedEnvVar[]): { configNames: string[]; secretNames: string[] } { | ||
| const configNames: string[] = []; | ||
| const secretNames: string[] = []; | ||
| for (const v of vars) { | ||
| if (v.classification === 'config') { | ||
| configNames.push(v.name); | ||
| } else { | ||
| secretNames.push(v.name); | ||
| } | ||
| } | ||
| return { configNames, secretNames }; | ||
| } |
There was a problem hiding this comment.
partitionEnvVarNames currently groups both 'secret' and 'database' classifications into secretNames. This causes DB_* vars like DB_HOST/DB_PORT to be treated as secrets downstream (Secret generation + 'do not bake' warnings), which conflicts with the separate 'database' classification and common practice of putting non-secret DB connection details into ConfigMaps. Consider returning separate arrays (config/database/secret) or at least grouping 'database' with configNames for downstream guidance.
| export function detectEnvVarsFromSpringConfig(content: string): DetectedEnvVar[] { | ||
| const vars: DetectedEnvVar[] = []; | ||
| const source = 'application.properties'; | ||
| const seen = new Set<string>(); | ||
|
|
||
| // Match ${VAR_NAME} or ${VAR_NAME:defaultValue} | ||
| const pattern = /\$\{([A-Za-z_][A-Za-z0-9_.]*?)(?::([^}]*))?\}/g; | ||
| let match: RegExpExecArray | null; | ||
|
|
||
| while ((match = pattern.exec(content)) !== null) { | ||
| const name = match[1]!; | ||
| if (seen.has(name)) continue; | ||
| seen.add(name); | ||
|
|
||
| const defaultValue = match[2]; | ||
| vars.push({ | ||
| name, | ||
| classification: classifyEnvVar(name), | ||
| source, | ||
| required: defaultValue === undefined, | ||
| ...(defaultValue !== undefined && { defaultValue }), | ||
| }); | ||
| } |
There was a problem hiding this comment.
detectEnvVarsFromSpringConfig hard-codes source='application.properties', so vars extracted from application.yml will be attributed to the wrong file. Also, the placeholder regex allows dots in the variable name ([A-Za-z0-9_.]), which will capture property placeholders like ${spring.profiles.active} as "env vars". Consider accepting a source param and restricting names to env-var-like tokens (e.g., [A-Za-z_][A-Za-z0-9_]), or filtering out keys with '.' to reduce false positives.
| envVarInstruction += ` Create ConfigMap with config vars: ${configNames.join(', ')}.`; | ||
| } | ||
| if (secretNames.length > 0) { | ||
| envVarInstruction += ` Reference Secrets for secret vars via envFrom: ${secretNames.join(', ')}.`; |
There was a problem hiding this comment.
The instruction text says "Reference Secrets for secret vars via envFrom: <VAR1, VAR2>" but envFrom references Secret/ConfigMap resources, not individual environment variable names/keys. This is likely to generate invalid manifests. Rephrase to something like "Create a Secret with keys: ... and reference it via envFrom/secretKeyRef" (and similarly for ConfigMap) so the generated YAML is syntactically correct.
| envVarInstruction += ` Create ConfigMap with config vars: ${configNames.join(', ')}.`; | |
| } | |
| if (secretNames.length > 0) { | |
| envVarInstruction += ` Reference Secrets for secret vars via envFrom: ${secretNames.join(', ')}.`; | |
| envVarInstruction += ` Create a ConfigMap with keys for config vars: ${configNames.join(', ')} and reference it from the workload via envFrom/configMapRef or individual env entries with configMapKeyRef as appropriate.`; | |
| } | |
| if (secretNames.length > 0) { | |
| envVarInstruction += ` Create a Secret with keys for secret vars: ${secretNames.join(', ')} and reference it from the workload via envFrom/secretRef or individual env entries with secretKeyRef as appropriate.`; |
| // Partition env vars once for manifest decisions and instruction building | ||
| const { configNames, secretNames } = partitionEnvVarNames(input.detectedEnvVars ?? []); | ||
|
|
||
| // Add configmap if there are ports or config-classified environment variables | ||
| if ((input.ports && input.ports.length > 0) || configNames.length > 0) { | ||
| manifestFiles.push({ path: './k8s/configmap.yaml', purpose: 'Configuration management' }); | ||
| } | ||
|
|
||
| // Add secret if secret-classified environment variables are detected | ||
| if (secretNames.length > 0) { | ||
| manifestFiles.push({ path: './k8s/secret.yaml', purpose: 'Secret management' }); | ||
| } |
There was a problem hiding this comment.
Because partitionEnvVarNames treats classification==='database' as a secret, secretNames will include DB_HOST/DB_PORT/etc. This will (1) create secret.yaml instead of configmap.yaml for non-secret DB settings, and (2) omit configmap.yaml entirely when only database vars exist and ports are empty. Once partitioning is fixed, consider adjusting these manifest file decisions so 'database' vars map to a ConfigMap (and only credentials/connection strings map to a Secret).
| import { detectDatabases } from './database-detector'; | ||
| import { | ||
| detectEnvVarsFromEnvFile, | ||
| detectEnvVarsFromDockerCompose, | ||
| detectEnvVarsFromSpringConfig, | ||
| inferFrameworkEnvVars, | ||
| deduplicateEnvVars, | ||
| } from './env-detector'; |
There was a problem hiding this comment.
PR description focuses on database detection + workload identity, but this PR also introduces a full env var detection/inference pipeline (new analyze-repo output fields, prompt step, Dockerfile/K8s instruction changes). Please update the PR description/test plan to explicitly call out env var detection changes and any expected user-facing behavior (e.g., always-inferred vars) so reviewers/users know what’s being introduced.
- Merge main (1 commit behind, no conflicts) - Use TOOL_NAME constants in envVarCheckStep (steps.ts) - Separate database from secret in partitionEnvVarNames (3-way split) - Fix detectEnvVarsFromSpringConfig: accept source param, reject dotted Spring property names from regex - Fix envFrom instruction syntax in generate-k8s-manifests to use proper configMapRef/secretRef K8s syntax - Add start anchors to unanchored regex patterns in database-detector - Expand CONFIG_FILE_PATTERN to match .yaml extension variants - Redact defaultValue for secret-classified env vars to prevent credential leakage in tool output - Update generate-dockerfile to handle databaseNames as config - Add TODO for future env-detector extraction - Fix pre-existing ESLint warnings in env-detector.ts Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Summary
analyze-repo): newdetectDatabases()function identifies databases from dependency lists across Node, Python, Java, .NET, Go, and Rust ecosystems. Results surface asdetectedDatabaseson each module in the analysis output and in the summary line.runtimeOnly,api, andcompileOnlyscopes (previously onlyimplementation), so DB drivers likeorg.postgresql:postgresqldeclared asruntimeOnlyare no longer missed.serviceaccount.yamland appends workload identity guidance (annotations, pod labels, passwordless auth) to the manifest plan.detectedDependenciesto manifest generation.Test plan
detectDatabasescovering all ecosystems,groupId:artifactId(Java), Go module paths, dedup, and case-insensitivitytmp/spring-petclinic— detects MySQL + PostgreSQL from GradleruntimeOnlydepsnpm run validate(pre-existingembedded-packsbuild issue unrelated to this PR)analyze-repo→generate-k8s-manifestswith detected DB types