fix: scroll element into view before click, hover, tap, and drag#1073
fix: scroll element into view before click, hover, tap, and drag#1073jin-2-kakaoent wants to merge 12 commits into
Conversation
Clicking off-screen elements silently dispatched mouse events to coordinates outside the viewport, causing the click to have no effect without returning an error. - Add `scroll_into_view_if_needed` using CDP `DOM.scrollIntoViewIfNeeded` with JS `scrollIntoView` fallback for unsupported environments - Add `assert_in_viewport` to return an error when an element cannot be scrolled into view - Unify `resolve_element_center` to always use `getBoundingClientRect` (viewport-relative) instead of `DOM.getBoxModel` (page-absolute) - Extract `get_center_and_viewport` to fetch element center and viewport dimensions in a single CDP call Closes vercel-labs#1044
|
@hyunjinee is attempting to deploy a commit to the Vercel Labs Team on Vercel. A member of the Team first needs to authorize it. |
…abs#1044) Verify that click, hover, and click-by-ref on off-screen elements auto-scroll into view before dispatching the interaction.
DOM.resolveNode succeeds for detached nodes, which broke the stale-ref fallback path. Instead of adding an extra CDP round-trip to check isConnected, fold the check into the existing getBoundingClientRect JS call (zero overhead on the happy path) and retry with a fresh accessibility tree lookup when a detached node is detected.
|
@jin-2-kakaoent LGTM! One minor follow-up to consider: |
…ck-1044 # Conflicts: # cli/src/native/e2e_tests.rs
…alidation - tap_touch and handle_drag now use resolve_scroll_and_center for auto-scroll + detached-node retry (addresses review feedback on vercel-labs#1073) - Fix drag ordering: mouseDown at source before scrolling to target, preventing source coordinates from being invalidated - Remove unused resolve_element_center (dead code after migration) - Unify e2e offscreen tests into a single e2e_offscreen_scroll_before_interactions
…alidation - tap_touch and handle_drag now use resolve_scroll_and_center for auto-scroll + detached-node retry (addresses review feedback on vercel-labs#1073) - Fix drag ordering: mouseDown at source before scrolling to target, preventing source coordinates from being invalidated - Remove unused resolve_element_center (dead code after migration) - Unify e2e offscreen tests into a single e2e_offscreen_scroll_before_interactions
…alidation - tap_touch and handle_drag now use resolve_scroll_and_center for auto-scroll + detached-node retry (addresses review feedback on vercel-labs#1073) - Fix drag ordering: mouseDown at source before scrolling to target, preventing source coordinates from being invalidated - Remove unused resolve_element_center (dead code after migration) - Unify e2e offscreen tests into a single e2e_offscreen_scroll_before_interactions
2ee77ed to
f2c4d7d
Compare
- Remove unused box_model_center and its test - Propagate JS scrollIntoView fallback error instead of silently ignoring
|
@ctate Thanks for the feedback! I've applied All e2e + manual tests pass. |
Resolve conflict in actions.rs: keep resolve_scroll_and_center call from branch with improved comment from main.
…ck-1044 # Conflicts: # cli/src/native/e2e_tests.rs
…lick-1044 # Conflicts: # cli/src/native/e2e_tests.rs
|
Really need this! Cost me an hour to debug the issue. |
|
Agreed on this being essential. I just ran into this issue and made multiple changes trying to work around it, without realizing it was a fundamental difference between agent browser and Playwright behavior. |
|
@ctate Could you take a look when you have a chance? |
|
@jin-2-kakaoent Thanks for your patience. Implementation looks great. Can you please sign your commits before merging? |
bcd4b70 to
82b72be
Compare
…alidation - tap_touch and handle_drag now use resolve_scroll_and_center for auto-scroll + detached-node retry (addresses review feedback on vercel-labs#1073) - Fix drag ordering: mouseDown at source before scrolling to target, preventing source coordinates from being invalidated - Remove unused resolve_element_center (dead code after migration) - Unify e2e offscreen tests into a single e2e_offscreen_scroll_before_interactions
82b72be to
1fd4226
Compare
|
@ctate Sorry for the delay! All commits are now signed and you can confirm the Verified badges on the Commits tab: |
|
@ctate any chance we can get this merged now that the commits are signed? Our agents keep hitting this issue (we've got special guidance to help them avoid it as much as possible, but it'd be so much better encoded into the library). |
Closes #1044
Summary
click,hover,tap, anddragusing CDPDOM.scrollIntoViewIfNeededwith JS fallbackisConnectedcheck inside the existinggetBoundingClientRectJS call — zero extra CDP round-trips on the happy pathRoot cause
Two issues combined to produce the silent failure:
1. No auto-scroll before interactions
snapshotusesAccessibility.getFullAXTreewhich returns all elements regardless of viewport position. When interacting with an off-screen element,Input.dispatchMouseEvent/Input.dispatchTouchEventwas dispatched to coordinates outside the viewport — CDP doesn't error on this, so the interaction silently had no effect.2. Coordinate system mismatch in
resolve_element_centerDOM.getBoxModelreturns page-absolute coordinates whileInput.dispatchMouseEventexpects viewport-relative coordinates. After scrolling, these diverge and interactions miss the target.Changes
get_center_and_viewport: usesgetBoundingClientRectfor viewport-relative coordinates, returnsCenterResult::FoundorCenterResult::Detachedscroll_into_view_if_needed: CDPDOM.scrollIntoViewIfNeededwith JSscrollIntoViewfallback (errors propagated)assert_in_viewport: validates element is within viewport after scrollingresolve_element_object_id_fresh: skips cachedbackend_node_idfor retry after detached node detectionresolve_scroll_and_center: shared helper for click/hover/tap/drag — resolve, scroll, get center, retry on detachhandle_drag: reordered to mouseDown at source before scrolling to target, preventing source coordinate invalidationresolve_element_center,box_model_centerTest plan
e2e_offscreen_scroll_before_interactions— click, hover, click-by-ref, tap, and drag on off-screen elements in a single unified teste2e_click_stale_ref_falls_back_to_role_name— stale ref fallback still workscargo fmt+cargo clippyclean