jinx (development version)

Post-review hardening

  • First-time contributor check actually works. is_first_time_contributor() was passing creator= to GET /repos/{owner}/{repo}/pulls, which the API silently ignores — every author looked like a first-timer until a repo had more than one PR total, and like a returning contributor after that. Now uses GET /search/issues with is:pr/is:issue + author: to count authored items correctly.
  • Slack mention stripping handles all forms. slack_event_strip_mention() now removes <@U…|display> user mentions, <!subteam^S…|name> group mentions, and <!channel|here|everyone> broadcasts in addition to the bare <@U…> form. Previously the bot's own past mentions could leak into thread history as user turns.
  • Slack thread history fails closed without bot_user_id. If the team KV record has no cached bot_user_id (older installs), slack_thread_history() now drops the whole history rather than relabelling every prior bot reply as role: "user".
  • Slack API retries 429 + 5xx once. slack_api_call() honours Retry-After (capped at 5s) before raising, so a single rate-limited post no longer aborts the answer path with the failure quip.
  • Reaction events deduped by event_id. slack_event_handle_reaction() now skips duplicate Slack retries of the same reaction_added event so the daily counters stop double-incrementing.
  • Workflow-input injection lane closed. ops-chapter-onboarding, ops-global-team-invite, ops-global-team-finalize, ops-global-team-offboard, ops-update-contributors, and ops-airtable-sync now move every ${{ inputs.* }} into an env: block and read it from Sys.getenv() inside Rscript. The invite workflow also builds its JSON artifact with jq instead of echo so a ' in a name no longer crashes the run.
  • Workflow hardening. All actions/checkout steps now set persist-credentials: false, all ops/CI/infra/reusable workflows get timeout-minutes: 15, and every self-racing ops workflow gets a concurrency: group key (the directory sync, contributors update, per-username invite/offboard, etc.).
  • gh_branch_upsert(force = FALSE) returns the real head. Previously it returned the base SHA when the branch already existed, which was wrong for any caller that wanted to act on the branch tip. Now reads the existing branch's head SHA instead.
  • Airtable sync retries. airtable_list_records() now wraps each paginated request in req_retry(max_tries = 3), so a 429 mid-sync no longer aborts the whole pull.
  • Directory slug collisions. Two records named "Maria" used to collide on maria.json. airtable_to_directory_entry() now appends a 6-char hash of the Airtable record id to the slug.
  • Contributors update stops daily commit churn. The equality check used to compare the rendered file verbatim, including _Last updated: {Sys.Date()}_, so it always diffed and committed daily even when no contributors had changed. The check now strips that line before comparing.
  • is_team_member fails open on transient errors. A real 404 still returns FALSE, but a network blip or 5xx returns TRUE (with a warning) so we don't spam the global team with first-timer welcomes during a GitHub hiccup.
  • review_assign_onboarding surfaces errors. Comment-post failures now emit a cli_alert_warning instead of being swallowed silently, so a failed cc @rladies/... tag is visible in the run log.
  • Event-embedding KV write gets a 30-day TTL so a stale vector doesn't outlive the model behind it.
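
The mention forms listed in the Slack bullet above can be stripped with three patterns. This is a minimal sketch, not the package's slack_event_strip_mention(): the helper name strip_mentions() and the exact patterns are assumptions for illustration.

```r
# Hypothetical helper (not the package's implementation): remove Slack
# mention markup from a message before it is stored as thread history.
strip_mentions <- function(text) {
  patterns <- c(
    "<@[A-Z0-9]+(\\|[^>]*)?>",               # <@U...> and <@U...|display>
    "<!subteam\\^[A-Z0-9]+(\\|[^>]*)?>",     # <!subteam^S...|name>
    "<!(channel|here|everyone)(\\|[^>]*)?>"  # broadcast mentions
  )
  for (p in patterns) {
    text <- gsub(p, "", text, perl = TRUE)
  }
  # Collapse the gaps left behind and trim the ends.
  trimws(gsub("[ \t]+", " ", text))
}

strip_mentions("<@U123ABC|jinx> what is R?")
```

Treating the optional |label suffix the same way in all three patterns is what lets one helper cover the labelled and the bare forms.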

Slack bot: remember the thread

  • The Slack Q&A bot now reads prior turns in the same thread via conversations.replies before calling the LLM, so follow-up questions no longer require restating the whole conversation. Up to the last 8 turns (capped at ~4k chars) are passed in as user/assistant history. Applies to both assistant DM threads and channel app_mention threads. The user's two most recent prompts are also folded into the retrieval embedding so the right sources surface when the new question is a one-word follow-up.
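
One way to apply both limits is to keep the last 8 turns and then drop the oldest until the character budget fits. This is a sketch under assumptions: trim_history() and the list(role, content) turn shape are hypothetical, and the package may truncate rather than drop turns.

```r
# Hypothetical sketch: cap thread history by turn count, then by characters.
# Each turn is assumed to be list(role = "user" | "assistant", content = "...").
trim_history <- function(turns, max_turns = 8L, max_chars = 4000L) {
  turns <- utils::tail(turns, max_turns)
  sizes <- vapply(turns, function(t) nchar(t$content), integer(1))
  # Drop the oldest remaining turn until the total fits the budget.
  while (length(turns) > 0L && sum(sizes) > max_chars) {
    turns <- turns[-1L]
    sizes <- sizes[-1L]
  }
  turns
}

thread <- list(
  list(role = "user", content = "How do I start a chapter?"),
  list(role = "assistant", content = "See the onboarding guide."),
  list(role = "user", content = "And after that?")
)
length(trim_history(thread))
```

Dropping from the front keeps the most recent context, which is the part a short follow-up question depends on.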

Airtable directory sync: actually open a PR

  • directory_create_pr() was a stub that returned the directory's pulls-page URL without creating a branch, writing files, or opening a PR. The function now creates a dated jinx/airtable-sync-YYYYMMDD branch, commits each changed contact/{slug}.json file via the contents API, and opens a PR back into main (returning the URL of an existing open PR if the branch was already in flight).
  • write_directory_entries() now returns a list of {filename, path, content, sha} records for changed entries instead of only their filenames, so the PR step can actually write them.

Reports: stop publishing to global-team

  • The weekly activity, monthly chapter-health, analytics dashboard, and GitHub Actions dashboard ops workflows no longer open issues in rladies/global-team. The publish helpers (report_publish(), analytics_publish_dashboard(), gha_publish_dashboard(), event_publish_summary()) and the matching scheduled workflows have been removed.
  • chapter_report_health() now returns the formatted markdown body instead of opening an issue.
  • ops-event-sync.yml keeps the meetup sync but no longer publishes a summary issue.

Contributors: push directly to main

  • contributor_update() now commits the rendered contributors file straight to the default branch (main by default) and returns the commit URL. The previous behaviour of opening a jinx/update-contributors branch and PR has been removed. The ops-update-contributors workflow now declares contents: write.

RAG: tolerate malformed JSON fields

  • The awesome-rladies-creations packages feed has 41 entries where pkdown_url and/or repo_url parse as empty JSON objects ({}) rather than strings or null. The scheduled indexer crashed mid-run on 2026-05-24 with the error "missing value where TRUE/FALSE needed" when one of these reached nzchar(). The custom %or% operator and the downstream nzchar() checks now go through a new is_blank() helper that treats NULL, zero-length vectors and lists, NA, and the empty string uniformly as "missing". Regression tests pin the empty-list and NA cases.
  • parse_unix_date() no longer errors when handed an empty list or any other non-character / non-numeric value; it returns NULL, matching its behaviour for NULL and "".
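
A helper with the behaviour described above could look like the following. This is a sketch only: the names is_blank() and %or% come from the entry above, but the bodies are assumptions and the package's definitions may differ.

```r
# Sketch of a blank check that treats NULL, zero-length values, NA, and ""
# uniformly as "missing". The function bodies here are illustrative.
is_blank <- function(x) {
  if (is.null(x) || length(x) == 0L) return(TRUE)  # NULL, list(), character(0)
  if (is.list(x)) return(FALSE)                    # non-empty list counts as present
  if (length(x) == 1L && is.na(x)) return(TRUE)    # NA of any atomic type
  is.character(x) && length(x) == 1L && !nzchar(x) # the empty string
}

# Fallback operator: use the right-hand side when the left is blank.
`%or%` <- function(lhs, rhs) if (is_blank(lhs)) rhs else lhs

# An empty JSON object typically parses to an empty list.
entry <- list(repo_url = list(), name = "example-pkg")
entry$repo_url %or% "unknown"
```

Checking length before calling is.na() or nzchar() is what keeps an empty list from ever reaching a condition that expects a single TRUE or FALSE.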

jinx 0.1.1

RAG: indexer moved to R

  • The content indexer that feeds the Slack bot's Cloudflare Vectorize store has been moved from the standalone Node indexer/ directory into the R package. All 8 sources (hugo-site, github-org, pkgdown-llms, github-files, github-remote-files, events-json, awesome-creations, youtube-channel) are now configured in inst/config/rag-sources.yml and implemented as gather_<type>() functions under R/rag-source-*.R. Adding a new source is one new R file plus a YAML entry — see the RAG indexer article.
  • Vector IDs remain sha256("{repo}|{path}|{chunk_idx}")[1:32], so the R re-index updates the existing rladies-content index in place rather than orphaning vectors.
  • Hugo pages are now extracted with rvest + rmarkdown::pandoc_convert() (html → gfm-raw_html) instead of cheerio + turndown. Pandoc emits proper GFM pipe tables where turndown produced flat key/value text; other output differs only cosmetically (- vs * bullets).
  • bot-index-content.yml now uses r-lib/actions/setup-r and calls jinx::rag_index_build().
  • Hugo page fetches are parallelised via httr2::req_perform_parallel (max_active = 8) to match the throughput of the JS pool the indexer replaced.
  • gather_rag_source() now hard-errors on an unknown source type so a YAML typo in inst/config/rag-sources.yml aborts the run rather than silently skipping a source on the weekly cron.
  • Cloudflare API calls (embed, upsert, account-id discovery) inherit a req_retry(max_tries = 3) policy via the base cloudflare_request() helper, so a transient 5xx no longer kills the whole indexer.
  • Fixed NA propagation in extract_hugo_page() that could embed the literal string "NA" into chunk text when a Hugo page was missing a <title> or <meta name="description"> tag.
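
The NA propagation fixed in the last bullet is easy to reproduce in base R, because paste() converts a missing value to the literal string "NA". The sketch below shows the failure and one way to guard against it; join_fields() is a hypothetical helper, not the code in extract_hugo_page().

```r
# paste() turns a missing title into the text "NA".
title <- NA_character_
paste(title, "Body text")

# Hypothetical guard: drop missing or empty fields before joining.
join_fields <- function(...) {
  parts <- unlist(list(...), use.names = FALSE)
  parts <- parts[!is.na(parts) & nzchar(parts)]
  paste(parts, collapse = "\n\n")
}

join_fields(title, "Body text")
```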

RAG: surface upcoming events

  • The reranker now applies a 1.6× boost to events chunks whose date is in the future, so the handful of upcoming events in the index float to the top of the top-5 instead of being drowned out by the much larger pool of past events kept on the 365-day trailing window.
  • For questions that look like event queries ("upcoming events", "when's the next meetup", "any workshops soon?", etc.), the retriever now runs a second targeted query against the index using a fixed event-shaped prompt and merges the resulting events chunks into the candidate pool before reranking. This rescues upcoming events whose cosine similarity to the user's casual phrasing would otherwise have left them outside the top-20. The fixed event-shaped prompt's embedding is memoised in KV under rag:event_embedding:v1 so the second retrieval only costs one extra vector query (no second embed call) after the first warm-up.
  • The Jinx system prompt now tells the model to use the "When:" / "Status:" lines on event chunks, prefer Status: upcoming when the user asks about future events, and own it honestly if no upcoming events are in the retrieved sources rather than substituting a past one.
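
The boost from the first bullet can be sketched as a score multiplier applied before sorting. The data frame columns (source, date, score), the function name boost_upcoming(), and the strict "later than today" test are all assumptions for illustration; only the 1.6 factor comes from the entry above.

```r
# Hypothetical sketch of the upcoming-event boost applied before ranking.
# `chunks` is assumed to have columns: source (chr), date (Date), score (dbl).
boost_upcoming <- function(chunks, today = Sys.Date(), boost = 1.6) {
  is_upcoming <- chunks$source == "events" &
    !is.na(chunks$date) & chunks$date > today
  chunks$score <- ifelse(is_upcoming, chunks$score * boost, chunks$score)
  chunks[order(chunks$score, decreasing = TRUE), , drop = FALSE]
}

chunks <- data.frame(
  id = c("past-event", "upcoming-event", "site-page"),
  source = c("events", "events", "hugo-site"),
  date = as.Date(c("2024-01-01", "2030-01-01", NA)),
  score = c(0.80, 0.60, 0.90),
  stringsAsFactors = FALSE
)
boost_upcoming(chunks, today = as.Date("2025-01-01"))
```

In this example the upcoming event starts with the lowest score and ends up ranked first, which is the effect the boost is meant to have on a small pool of future events.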

Slash-command reply routing

  • /jinx slash commands from the community workspace previously failed with channel_not_found because the GHA workflow used the organisers' bot token to post into a community channel ID. The worker now forwards team_id in the dispatch payload, the workflow verifies it against SLACK_ORGANIZER_TEAM_ID / SLACK_COMMUNITY_TEAM_ID repo vars, and the reply step posts via the workspace-agnostic response_url instead of chat.postMessage.

Welcome reusable improvements

  • gh_welcome_contributor() and gh_greet_contributor() gained an extra_message argument. The string is appended after the standard welcome and before the jinx signature, so callers can add a project-specific reminder (e.g. "remember to add yourself to .zenodo.json") without forking the function.
  • reusable-welcome-contributor.yml now accepts a matching extra_message workflow input and forwards it via an environment variable (no string interpolation into the R source, so the input is safe to set to any markdown).
  • The reusable workflow also fires when a pull_request_target event opens a PR, so callers that need to welcome fork-PR authors can switch their trigger without losing the welcome step.
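
The ordering and the environment-variable hand-off described above can be sketched as follows. Everything named here is hypothetical: compose_welcome(), the welcome wording, the signature text, and the EXTRA_MESSAGE variable stand in for whatever the package and workflow actually use.

```r
# Hypothetical sketch: the extra message sits between the standard welcome
# and the signature, and is read as data from the environment.
compose_welcome <- function(author, extra_message = NULL,
                            signature = "(posted by jinx)") {
  parts <- sprintf("Welcome, @%s! Thank you for your first contribution.", author)
  if (!is.null(extra_message) && nzchar(extra_message)) {
    parts <- c(parts, extra_message)
  }
  paste(c(parts, signature), collapse = "\n\n")
}

# The workflow input arrives via an environment variable (name assumed), so
# quotes or backticks in it are never parsed as R source.
extra <- Sys.getenv("EXTRA_MESSAGE", unset = "")
cat(compose_welcome("octocat", extra_message = extra), "\n")
```

Reading the value with Sys.getenv() keeps the markdown as an ordinary string, which is why any input is safe to pass through.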

Module reorganisation

  • New gh_* module for reusable GitHub PR/issue automation. Functions moved out of the contributor/website modules so they can be wired into any repo, not just the website:
    • contributor_welcome() → gh_welcome_contributor()
    • contributor_thank() → gh_thank_contributor()
    • contributor_greet() → gh_greet_contributor()
    • blog_post_checklist() → gh_post_checklist() (also generalised: the message now opens with "Thank you for submitting a post" so it applies to blog and news content equally; the path filter on the caller workflow already covers both).
  • chapter_get_language() → i18n_get_chapter_language() (it has always lived in R/i18n.R and now follows the <module>_<verb>_* pattern).
  • Internal helpers directory_validation_row() and directory_empty_validation_df() in R/i18n-validate.R renamed to i18n_validation_row() / i18n_empty_validation_df() (they validate translations, not directory entries).
  • Reusable workflows: reusable-website-blog-checklist.yml → reusable-post-checklist.yml. reusable-welcome-contributor.yml and reusable-thank-contributor.yml updated to call the new names.

jinx 0.1.0

Initial release. jinx automates organisational workflows for the RLadies+ GitHub org: chapter onboarding and health, directory validation, blog and announcement publishing, PR review, Airtable sync, Slack, analytics, events, CFP coordination, and i18n.

Exported functions follow a <module>_<verb>[_<object>] schema so they group cleanly by autocomplete. Short prefixes are used for established acronyms (cmd_, gt_, li_, cfp_, gha_, i18n_).

Triggered via /jinx issue comments (cmd_parse() / cmd_execute()) and 24 GitHub Actions workflows. Ships with a pkgdown site, Getting Started and Workflow Reference vignettes, and starter translations for Spanish, Portuguese, and French.