Conversation
* Consolidate fetch logic into generalized funcs
* Move URL munging and config to models
* Refactor models to better represent API entity hierarchy
* Factor json dep out
* Add model to handle entity envelope
* Move more url munging out of api module
* Add generic api functions for incremental models with an exception for Stories since we fetch those per task
* Make incremental models aware of their event_type
* Fetch fresh copy of entity whenever we see an event action other than "deleted"
Alex-Bair left a comment
Thanks for the PR @enlore! I apologize for the delay in reviewing it. It looks pretty good! I didn't realize the Asana API was so complex, especially around sync token handling.
I have a handful of questions, comments, and suggested changes. Please let me know if I was unclear anywhere or you have any questions.
Like Nicolas did with source-posthog, I'm happy to do the finishing touches (e.g. add snapshot tests that use our encrypted dev Asana credentials) once my comments are addressed. Thanks again!
```python
from source_asana_native import Connector

if __name__ == "__main__":
```
nit: the `if __name__ == "__main__"` clause isn't needed here in the `__main__.py` file since this file only serves as the connector's entry point.
```python
api_key: Annotated[
    str,
    Field(
        description="Personal Access Token for Asana.",
        title="API Key",
        json_schema_extra={"secret": True},
    ),
]
```
Asana supports multiple types of authentication, not just personal access tokens. We'll eventually want to support more than one in source-asana-native, and it's easier to add a new authentication method if we nest all of them under a `credentials` field in the `EndpointConfig`. Instead of leaving `api_key` as it is now, can it be nested inside a `credentials` field so it's easier to add OAuth support later? Leveraging the `AccessToken` class from `estuary_cdk.flow` can help here too:
```diff
-api_key: Annotated[
-    str,
-    Field(
-        description="Personal Access Token for Asana.",
-        title="API Key",
-        json_schema_extra={"secret": True},
-    ),
-]
+credentials: AccessToken = Field(
+    discriminator="credentials_title",
+    title="Authentication",
+)
```
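To illustrate the value of nesting, here's a hedged sketch of a discriminated-union `credentials` field. The `AccessToken`, `OAuth2Credentials`, and `auth_header` names here are illustrative stand-ins, not the actual `estuary_cdk` classes, and plain dataclasses are used instead of pydantic to keep the sketch self-contained:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class AccessToken:
    # credentials_title is the discriminator field, as in the suggestion above.
    credentials_title: str
    access_token: str

@dataclass
class OAuth2Credentials:
    credentials_title: str
    client_id: str
    client_secret: str
    refresh_token: str

@dataclass
class EndpointConfig:
    # Nesting all auth methods under one `credentials` field means adding
    # OAuth later is a change inside this field, not a new top-level setting.
    credentials: Union[AccessToken, OAuth2Credentials]

def auth_header(config: EndpointConfig) -> dict:
    # Dispatch on the credential type, as a tagged-union validator would.
    if isinstance(config.credentials, AccessToken):
        return {"Authorization": f"Bearer {config.credentials.access_token}"}
    raise NotImplementedError("OAuth flow would be handled separately")
```

With pydantic (which the connector actually uses), the same shape falls out of a discriminated union on `credentials_title`.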
```python
        json_schema_extra={"advanced": True},
    ),
]
```
Per the Asana docs, all API requests use the same base URL, so `base_url` does not need to be a user-facing setting. It can just be a constant in `api.py`.
That simplification should also allow the removal of the `advanced` field from the `EndpointConfig`.
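For reference, the constant could be as simple as the following (this is Asana's documented base URL; the constant name is just a suggestion):

```python
# All Asana API requests go through this base URL per the Asana docs,
# so it can live as a module-level constant in api.py.
API_BASE_URL = "https://app.asana.com/api/1.0"
```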
```python
state,
task,
fetch_snapshot=fetch_fn,
tombstone=TOMBSTONE,
```
For snapshot bindings, it's sufficient to use `BaseDocument(_meta=BaseDocument.Meta(op="d"))` as the `tombstone` argument here instead of a subclass of `BaseDocument`. The `tombstone` argument is only used by the CDK here as part of the deletion inference mechanism described in the CDK's README, and I don't see many situations where it would be helpful to use a subclass of `BaseDocument` instead of `BaseDocument` itself as the tombstone.
```diff
-tombstone=TOMBSTONE,
+tombstone=BaseDocument(_meta=BaseDocument.Meta(op="d")),
```
Honestly, it makes sense to me for the CDK to fall back to using `BaseDocument` as a tombstone if one isn't provided in a `common.open_binding` call. I'll update the CDK to allow that so developing connectors is a smidge easier and more consistent.
```python
resources.append(
    common.Resource(
        name=name,
        key=["/gid"],
```
For the CDK's deletion inference for snapshot resources to work, the key needs to be the synthetic key `["/_meta/row_id"]`.
```diff
-key=["/gid"],
+key=["/_meta/row_id"],
```
This allows the CDK's deletion inference mechanism to work end-to-end through the Estuary platform. The CDK assigns each document an incrementing `row_id` on every snapshot cycle and emits tombstones for trailing positions when the count decreases. If the key is `["/gid"]`, those tombstones won't match any existing document and deletions won't propagate through to destination systems.
This is another thing I could simplify in the CDK since all snapshot bindings will use the same key. I'll make that update to the CDK too.
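To make the mechanism concrete, here is an illustrative sketch (not the CDK's actual code) of positional `row_id` assignment and trailing tombstones. Note the tombstone documents carry only `_meta`, which is why a `/gid` key could never match them:

```python
# Each snapshot cycle re-assigns row_ids positionally, then emits a
# tombstone (op="d") for every trailing row_id that existed last cycle
# but has no document this cycle.
def snapshot_cycle(docs, prev_count):
    out = []
    for row_id, doc in enumerate(docs):
        out.append({"_meta": {"row_id": row_id, "op": "u"}, **doc})
    for row_id in range(len(docs), prev_count):
        # Tombstone: keyed only by row_id, it overwrites-and-deletes the
        # position that "b" occupied in the previous cycle.
        out.append({"_meta": {"row_id": row_id, "op": "d"}})
    return out, len(docs)

# First cycle sees three docs; second cycle sees two (doc "b" was deleted).
docs, count = snapshot_cycle([{"gid": "a"}, {"gid": "b"}, {"gid": "c"}], prev_count=0)
docs, count = snapshot_cycle([{"gid": "a"}, {"gid": "c"}], prev_count=count)
```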
```python
common.open_binding(
    binding,
    binding_index,
    state,
    task,
    fetch_page=fp,
    fetch_changes=fc,
    tombstone=TOMBSTONE,
)
```
`tombstone` doesn't need to be provided to `open_binding` when there's no `fetch_snapshot` argument.
```diff
 common.open_binding(
     binding,
     binding_index,
     state,
     task,
     fetch_page=fp,
     fetch_changes=fc,
-    tombstone=TOMBSTONE,
 )
```
```python
projects = await _collect_projects(http, config, log)

for model in INCREMENTAL_MODELS:
    backfill_fn = _get_backfill_fn(model)

    fetch_page: dict[str, Any] = {}
    fetch_changes: dict[str, Any] = {}
    initial_inc: dict[str, ResourceState.Incremental] = {}
    initial_backfill: dict[str, ResourceState.Backfill] = {}

    for project in projects:
        fetch_page[project.gid] = functools.partial(
            backfill_fn,
            http,
            base_url,
            project.gid,
        )
        fetch_changes[project.gid] = functools.partial(
            fetch_events,
            model,
            http,
            base_url,
            project.gid,
        )
        initial_inc[project.gid] = ResourceState.Incremental(cursor=("",))
        initial_backfill[project.gid] = ResourceState.Backfill(
            cutoff=("",),
            next_page=None,
        )
```
Since the number of subtasks for each resource is dynamically determined by how many projects there are, we'll need to handle the case where a project is created after a capture has already been set up in Estuary. In that situation, the `initial_state` with the added project isn't used by the connector; it uses the state sent to it by the runtime instead. I talked about that challenge in this comment with Nicolas and pointed out how we've solved it elsewhere. Let me know if that comment or the examples are unclear and I can explain further.
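One way this is commonly handled (a hedged sketch of the general idea, not the solution linked above or the connector's actual code) is to diff the live project list against the per-project states the runtime sent back at startup, seeding initial state for any project the runtime doesn't know about yet:

```python
# Hypothetical helper: merge runtime-provided per-project state with
# projects discovered at startup, so projects created after the capture
# was set up still get a subtask with fresh initial state.
def seed_new_projects(runtime_state, live_project_gids):
    state = dict(runtime_state)
    for gid in live_project_gids:
        if gid not in state:
            # New project: start from the same initial state a fresh
            # capture would use (empty cursor => bootstrap a sync token).
            state[gid] = {"cursor": ("",)}
    return state

# "p1" already has a sync token from the runtime; "p2" is newly created.
state = seed_new_projects({"p1": {"cursor": ("tok1",)}}, ["p1", "p2"])
```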
```python
if e.code == 412:
    new_token = SyncTokenResponse.model_validate_json(e.body).sync
    yield (new_token,)
    return
```
If the sync token expires and we fetch a fresh one, we'll miss all the incremental changes between when the connector last checked for changes and when we fetch that new sync token, right? That seems to be the case, and the binding should be backfilled if that happens since that's a possible way for the connector to miss data. The estuary-cdk has triggers that can be yielded for the exact purpose of automatically triggering a backfill:
```diff
 if e.code == 412:
-    new_token = SyncTokenResponse.model_validate_json(e.body).sync
-    yield (new_token,)
-    return
+    log.warning("triggering automatic backfill due to sync token expiration")
+    yield Triggers.BACKFILL
```
You can see an example of that trigger in action in source-hubspot-native here. It essentially causes the binding's state to be wiped to the resource's `initial_state` the next time the connector restarts.
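The behavior described above can be sketched as follows. This is a toy simulation of the trigger semantics, not the CDK's implementation; the `Triggers` enum and `run_binding` driver here are stand-ins:

```python
import enum

class Triggers(enum.Enum):
    BACKFILL = enum.auto()

def fetch_changes(sync_token_expired):
    if sync_token_expired:
        # Fetching a fresh token here would silently skip every event
        # between the last poll and now, so request a full backfill instead.
        yield Triggers.BACKFILL
        return
    yield {"gid": "123", "action": "changed"}

def run_binding(state, initial_state, expired):
    # Stand-in for the framework: on BACKFILL, wipe the binding's state
    # back to initial_state; otherwise record documents as usual.
    for item in fetch_changes(expired):
        if item is Triggers.BACKFILL:
            return dict(initial_state)
        state["last_doc"] = item
    return state
```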
```python
async for task in _fetch_paginated(tasks_url, Task, http, log):
    url = Story.get_url(base_url, task.gid)
    async for story in _fetch_paginated(url, Story, http, log):
        yield story
```
Should the API calls to fetch stories be wrapped in a try/except to handle `Stories.tolerated_errors`, similar to what's done in `fetch_team_memberships` and a handful of the other functions above?
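For illustration, a hedged sketch of that pattern is below. `HTTPError`, the `404`-is-tolerated choice, and the helper names are assumptions standing in for the connector's actual types:

```python
import asyncio

class HTTPError(Exception):
    def __init__(self, code):
        self.code = code

# Assumption: a deleted task returning 404 between listing and fetching
# is a tolerable, expected error.
TOLERATED_ERRORS = {404}

async def _fetch_paginated(task_gid):
    # Fake paginated fetch: one task is gone by the time we fetch it.
    if task_gid == "gone":
        raise HTTPError(404)
    yield {"task": task_gid, "text": "a story"}

async def fetch_stories_for_tasks(task_gids):
    stories = []
    for gid in task_gids:
        try:
            async for story in _fetch_paginated(gid):
                stories.append(story)
        except HTTPError as e:
            if e.code not in TOLERATED_ERRORS:
                raise
            # Tolerated error: skip this task's stories and keep going.
    return stories

result = asyncio.run(fetch_stories_for_tasks(["t1", "gone", "t2"]))
```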
```python
async def fetch_stories(
    http: HTTPSession,
    config: EndpointConfig,
    log: Logger,
) -> AsyncGenerator[Story]:
    base_url = config.advanced.base_url

    for project in await _collect_projects(http, config, log):
        tasks_url = Task.get_url(base_url, project.gid)
        async for task in _fetch_paginated(tasks_url, Task, http, log):
            url = Story.get_url(base_url, task.gid)
            async for story in _fetch_paginated(url, Story, http, log):
                yield story
```
The `fetch_stories` function is unused. If it's not needed anywhere, can it be removed?
Following up on the action items I took.
@Alex-Bair I think you should take this forward if possible. Ping me on slack if you want to discuss.
Description:
MVP of an Asana source connector.
Workflow steps:
An Estuary Flow source connector for Asana.
Captured Resources
Not captured (potentially useful)
Setup
Prerequisites
flowctl (brew install estuary/tap/flowctl)
Installation
Install dependencies:
Configure credentials:
# Edit config.yaml with your Asana personal access token (gitignored)
Development
Run tests
Test with flowctl
Asana Event API Sync Token Flow
```mermaid
---
config:
  theme: redux-dark
---
graph TD
    subgraph CursorDict["Cursor Dict Structure"]
        direction LR
        CD["cursor_dict:<br/>{project_gid_1: sync_token_1,<br/> project_gid_2: sync_token_2,<br/> ...}"]
        CDNote["CDK manages each key as an<br/>independent concurrent subtask"]
        CD --- CDNote
    end
    subgraph Bootstrap["Bootstrap Phase"]
        direction TB
        B1["Fetch list of projects<br/>GET /projects"]
        B2["For each project_gid"]
        B3["GET /projects/{project_gid}/events<br/>(no sync token)"]
        B4["Asana returns 412<br/>Precondition Failed"]
        B5["Extract initial sync_token<br/>from 412 response body"]
        B6["Store in cursor dict:<br/>cursor_dict[project_gid] = sync_token"]
        B7["FetchPageFn runs concurrently:<br/>full paginated backfill of<br/>tasks for this project"]
        B1 --> B2
        B2 --> B3
        B3 --> B4
        B4 --> B5
        B5 --> B6
        B6 --> B7
    end
    subgraph Incremental["Incremental Phase"]
        direction TB
        I1["CDK spawns concurrent subtasks<br/>(one per project_gid in cursor dict)"]
        I2["GET /projects/{project_gid}/events<br/>?sync={token}"]
        I3{{"Asana returns 200?"}}
        I4["Receive events +<br/>has_more flag +<br/>new sync_token"]
        I5{"has_more?"}
        I6["Loop: call again<br/>with new token"]
        I7["Collect changed task GIDs<br/>from events, deduplicate"]
        I8{"Action type?"}
        I9["'deleted'<br/>Yield tombstone"]
        I10["'created' / 'changed' /<br/>'added' / 'removed'"]
        I11["Batch re-fetch full task docs<br/>GET /tasks/{gid}"]
        I12["Yield full documents"]
        I13["Yield new sync_token<br/>as updated cursor"]
        I1 --> I2
        I2 --> I3
        I3 -- "Yes (200)" --> I4
        I4 --> I5
        I5 -- "true" --> I6
        I6 --> I2
        I5 -- "false" --> I7
        I7 --> I8
        I8 -- "deleted" --> I9
        I8 -- "created / changed /<br/>added / removed" --> I10
        I10 --> I11
        I11 --> I12
        I9 --> I13
        I12 --> I13
    end
    subgraph Expiry["Token Expiry / Re-Backfill"]
        direction TB
        E1["Asana returns 412<br/>instead of 200"]
        E2["Extract new sync_token<br/>from 412 response body"]
        E3["Store new token in cursor dict:<br/>cursor_dict[project_gid] = new_token"]
        E4["Trigger re-backfill<br/>for that project scope"]
        E1 --> E2
        E2 --> E3
        E3 --> E4
    end
    B7 --> I1
    I3 -- "No (412)" --> E1
    E4 --> I1
    style CursorDict fill:#2d2d3d,stroke:#7c7ce0,color:#e0e0e0
    style Bootstrap fill:#1a2e1a,stroke:#4caf50,color:#e0e0e0
    style Incremental fill:#1a1a2e,stroke:#5c6bc0,color:#e0e0e0
    style Expiry fill:#2e1a1a,stroke:#ef5350,color:#e0e0e0
```
Documentation links affected:
N/A
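The bootstrap phase from the sync-token flow above can be sketched as follows. This is a toy simulation, with `fake_events_api` standing in for `GET /projects/{project_gid}/events`:

```python
def fake_events_api(project_gid, sync=None):
    # A call without a sync token gets 412 Precondition Failed, and the
    # response body carries the initial sync token.
    if not sync:
        return 412, {"sync": f"tok-{project_gid}-0"}
    return 200, {"data": [], "sync": f"tok-{project_gid}-1", "has_more": False}

def bootstrap(project_gids):
    # Seed the cursor dict: one sync token per project, obtained by
    # deliberately triggering the 412 on a token-less request.
    cursor_dict = {}
    for gid in project_gids:
        status, body = fake_events_api(gid, sync=None)
        assert status == 412
        cursor_dict[gid] = body["sync"]
    return cursor_dict

cursors = bootstrap(["p1", "p2"])
```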
Notes for reviewers:
TODO
Known Issues
flowctl preview: possibly due to the rate limit on a free tier account.