Commit 9388d94

author
Sylvain Pace
authored
Enhancing DocSearch FAQ (#388)
* enhancing DocSearch FAQ * fixing issues * fixing issues
1 parent e364dfc commit 9388d94

File tree

3 files changed

+55
-6
lines changed

docs/source/documentation/2-docsearch-scraper/2-config-options.html.md.erb

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -205,12 +205,18 @@ as `lvl1` and `h3` as `lvl2`. `text` is usually any `p` of text.
205205

206206
We recommend making use of at least the three first levels for better relevancy.
207207

208-
### Global selectors _Optional_
208+
### `global` selectors _Optional_
209209

210210
It's possible to make a selector global which means that all records from the page will have
211211
this value. This is useful when you have a title that is in the right sidebar and
212212
the sidebar is placed after the content in the DOM.
213213

214+
The `global` attribute should be seen as a way to extract the matching elements from the HTML flow. Global elements are not treated as breaking elements when we encounter them in the flow. If this attribute is enabled:
215+
- we will not create a new record from the currently built stack.
216+
- we will apply its contextual value (or `default_value` if nothing matches) to every record.
217+
218+
We mostly use this parameter for pages that lack context. It lets us pick up the context from another shared part of the page without creating duplicates.
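
The behaviour described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the scraper's actual code: `build_records`, `LEVELS` and the `(level, text)` element tuples are hypothetical names invented for this example. Global matches are pulled out of the flow first, then merged into every record built from the remaining elements.

```python
# Hypothetical model of `global` selectors; not the real scraper implementation.
LEVELS = ["lvl0", "lvl1", "lvl2", "text"]


def build_records(elements, global_levels=(), default_values=None):
    """elements: list of (level, text) tuples in DOM order."""
    default_values = default_values or {}

    # Pass 1: extract global matches from the flow (or fall back to a default).
    global_values = {}
    for level in global_levels:
        matches = [text for lvl, text in elements if lvl == level]
        global_values[level] = matches[0] if matches else default_values.get(level)

    # Pass 2: walk the remaining elements; global ones never break the stack.
    records = []
    hierarchy = {}
    for level, text in elements:
        if level in global_levels:
            continue  # no new record is created for a global element
        # A new element resets its own level and every deeper one.
        for deeper in LEVELS[LEVELS.index(level):]:
            hierarchy.pop(deeper, None)
        hierarchy[level] = text
        # Every record receives the global (contextual) values.
        records.append({**hierarchy, **global_values})
    return records


# The sidebar title comes after the content in the DOM, yet reaches every record.
page = [("lvl1", "Install"), ("text", "Run the installer."), ("lvl0", "Guides")]
records = build_records(page, global_levels=["lvl0"])
```

With `lvl0` marked as global, both records built from `page` carry `"lvl0": "Guides"`, and the sidebar title itself does not produce a record.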
219+
214220
```json
215221
"selectors": {
216222
"lvl0": {

docs/source/documentation/4-docsearch-FAQ/1-customize-configuration-file.html.md.erb

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,8 @@
11
---
2-
title: Customize my configuration file
2+
title: Customize my DocSearch
33
---
44

5-
## What content do we recommend indexing?
5+
## What content do we recommend to index?
66

77
### Code blocks
88

@@ -14,7 +14,7 @@ A good practice would be to emphasize the meaningful underlying part of it
1414
thanks to a dedicated class, which you will want to add to the `text` selector
1515
in your configuration file.
1616

17-
For example, in this code snippet, you can exclude the code except the parameter of the function:
17+
For example, in this code snippet, you can exclude the code but the parameters of the function:
1818
```
1919
<code>
2020
function(<toIndex>something useful to search for</toIndex>)
@@ -24,7 +24,7 @@ function(<toIndex>something useful to search for</toIndex>)
2424

2525
### Table of contents
2626

27-
The elements of the table of contents only *target* but do not provide actual content.
27+
The elements of the table of contents only *target* but do not provide content.
2828
They are merely a step on the way to reaching relevant content. As a result, they are
2929
almost always in a different place from the payload. This is why we consider the
3030
table of contents an obstacle for DocSearch to find the coveted information.
@@ -76,4 +76,4 @@ You can override the way these elements are impacting the search thanks to `cust
7676
"desc(weight.level)"
7777
]
7878
}
79-
```
79+
```
Lines changed: 43 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,43 @@
1+
---
2+
title: Global wonderings
3+
---
4+
5+
## What is the general behaviour?
6+
7+
### How often do you scrape my website?
8+
9+
We scrape your website at least every 24 hours, since documentation is not meant to change frequently.
10+
Furthermore, we update it after every pull request to your configuration. This is the best way to ask us to trigger a new crawl, so please use your configuration file to do so.
11+
12+
### Can I add search to my whole website?
13+
DocSearch is suited for documentation content. If you want to widen the scope of your search to non-documentation content, you have two options:
14+
- [Run it on your own](https://community.algolia.com/docsearch/documentation/docsearch-scraper/overview/)
15+
- Build your own search UI [thanks to InstantSearch](https://community.algolia.com/instantsearch.js/) ([example here](https://jsfiddle.net/s_pace/965a4w3o/))
16+
17+
### Does the crawl encompass several domains/sub-domains?
18+
19+
The `start_urls` define the allowed domains for our crawler. Basically, we take the main domain of every URL. We will not go outside this whitelisted domain list. If you want to cover a wider domain, please add it as a new entry in `start_urls`.
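
As a rough sketch of this whitelisting rule, assuming "main domain" means the hostname of each start URL (the real crawler may resolve domains differently), the check could look like this. `allowed_domains` and `is_crawlable` are hypothetical helper names, and the `example.com` URLs are placeholders.

```python
from urllib.parse import urlparse


def allowed_domains(start_urls):
    # Assumption: the "main domain" is the hostname of each start URL.
    return {urlparse(url).hostname for url in start_urls}


def is_crawlable(url, start_urls):
    # A URL is followed only if its hostname is in the whitelist.
    return urlparse(url).hostname in allowed_domains(start_urls)


start_urls = ["https://docs.example.com/"]
inside = is_crawlable("https://docs.example.com/api/", start_urls)
outside = is_crawlable("https://blog.example.com/post", start_urls)

# Adding the other sub-domain as a new start URL widens the whitelist.
widened = is_crawlable(
    "https://blog.example.com/post",
    start_urls + ["https://blog.example.com/"],
)
```

Under this assumption a sub-domain counts as a separate domain, which is why it has to be listed explicitly.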
20+
21+
### Where is the data hosted?
22+
23+
As with every Algolia index, everything is stored on our servers with the required security and privacy. You can [find more details in the global documentation pages.](https://www.algolia.com/doc/guides/infrastructure/servers/)
24+
25+
## Feature
26+
27+
### Will it become paid?
28+
29+
Open source is great and we want to support it as much as we can. Since these projects mostly have limited resources, every accepted project will remain free forever.
30+
If you want to upgrade your search experience, you can do it with your own Algolia implementation and [apply for a free community plan](https://www.algolia.com/pricing). DocSearch will no longer be used, but we are still happy to help.
31+
32+
### What data are you collecting, and who can access it?
33+
34+
We only scrape publicly available data according to your custom selectors (see the "How it works" section). DocSearch provides [Algolia analytics](https://www.algolia.com/doc/guides/insights-and-analytics/analytics-overview/) for your DocSearch indices.
35+
You can request access to this data through the private email thread by sending us the email addresses of the people who should be granted access.
36+
37+
### Where can I see the analytics?
38+
39+
Once you have been granted access (see above), analytics will be available in the Algolia dashboard like any regular application. You will need to select the Analytics tab.
40+
41+
## I am not the owner, what can I do?
42+
43+
You can definitely help improve the search experience of your favourite documentation website by opening issues on the repo. You can also mention the GitHub handle @s-pace if you want us to chime in or give you a quick demo. You can encourage the owner of the repo to request DocSearch. We would be happy to support you in this. Feel free to ping us.
