Skip to content

Return selector after consuming combinator#115

Merged
philss merged 2 commits into
philss:masterfrom
jjcarstens:fix_combinator
May 25, 2017
Merged

Return selector after consuming combinator#115
philss merged 2 commits into
philss:masterfrom
jjcarstens:fix_combinator

Conversation

@jjcarstens

Copy link
Copy Markdown
Contributor

Since the change to parse all tokens, a selector is returned when encountering a :comma token.

When we consume_combinators, that calls do_parse which gets a selector and continues on parsing tokens. However, the selector is returned as a result of matching :comma token and removes it from the remaining tokens. So when parsing continues after consume_combinator, it continues parsing the previous selector instead of starting a new one.

This changes so consume_combinator functions return a selector tuple since they will get a fully completed selector after parsing.

# before the change
iex> "a b.foo c, ul a[class=bar]" |> Floki.SelectorTokenizer.tokenize |> Floki.SelectorParser.parse |> length
1
iex> "a b.foo c, ul a[class=bar]" |> Floki.SelectorTokenizer.tokenize |> Floki.SelectorParser.parse
[%Floki.Selector{attributes: [], classes: [],
  combinator: %Floki.Combinator{match_type: :descendant,
   selector: %Floki.Selector{attributes: [], classes: ["foo"],
    combinator: %Floki.Combinator{match_type: :descendant,
     selector: %Floki.Selector{attributes: [%Floki.AttributeSelector{attribute: "class",
        match_type: :equal, value: "bar"}], classes: [], combinator: nil,
      id: nil, namespace: nil, pseudo_classes: [], type: "a"}}, id: nil,
    namespace: nil, pseudo_classes: [], type: "ul"}}, id: nil, namespace: nil,
  pseudo_classes: [], type: "a"}]

# With proposed changes
iex> "a b.foo c, ul a[class=bar]" |> Floki.SelectorTokenizer.tokenize |> Floki.SelectorParser.parse |> length
2
iex> "a b.foo c, ul a[class=bar]" |> Floki.SelectorTokenizer.tokenize |> Floki.SelectorParser.parse
[%Floki.Selector{attributes: [], classes: [],
  combinator: %Floki.Combinator{match_type: :descendant,
   selector: %Floki.Selector{attributes: [], classes: ["foo"],
    combinator: %Floki.Combinator{match_type: :descendant,
     selector: %Floki.Selector{attributes: [], classes: [], combinator: nil,
      id: nil, namespace: nil, pseudo_classes: [], type: "c"}}, id: nil,
    namespace: nil, pseudo_classes: [], type: "b"}}, id: nil, namespace: nil,
  pseudo_classes: [], type: "a"},
 %Floki.Selector{attributes: [], classes: [],
  combinator: %Floki.Combinator{match_type: :descendant,
   selector: %Floki.Selector{attributes: [%Floki.AttributeSelector{attribute: "class",
      match_type: :equal, value: "bar"}], classes: [], combinator: nil, id: nil,
    namespace: nil, pseudo_classes: [], type: "a"}}, id: nil, namespace: nil,
  pseudo_classes: [], type: "ul"}]

I think this covers all the missing cases, but would definitely love close 👀 and any suggestions you believe may still need to be tested.

Since [the change to parse all tokens a selector is returned when encountering a `:comma` token](https://github.com/philss/floki/pull/106/files#diff-7f26dd84090995cad0112f38a801b18bR32). However, when we `consume_combinators`, that calls `do_parse` which gets a selector and continues on parsing tokens. However, the selector is returned as a result of matching `:comma` token and removes it from the remaining tokens. So when parsing continues after `consume_combinator`, it continues parsing the previous selector instead of starting a new one.

This changes so `consume_combinator` functions return a selector tuple since they will get a fully completed selector after parsing.
@mischov

mischov commented May 25, 2017

Copy link
Copy Markdown
Contributor

Good catch. Your changes should be correct.

In the code that #106 is based on I append a comma onto the tokens returned from consuming a combinator, forcing a return of the selector when I try to continue to parse the selector (because reasons).

The behavior of appending a comma to the tokens was overkill in Floki so I removed it, but I (clearly) forgot to change the code to return the {selector, remaining_tokens} tuple like you have instead of continuing to parse the selector.

@philss philss merged commit c2187af into philss:master May 25, 2017
@philss

philss commented May 25, 2017

Copy link
Copy Markdown
Owner

I think this is correct! Thank you again, @jjcarstens!

@philss

philss commented May 25, 2017

Copy link
Copy Markdown
Owner

This fix was released in version 0.17.2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants