Full Text Search

A Node.js demonstration application showcasing full-text search capabilities using Elasticlunr.js for URL pattern matching. This project demonstrates how to efficiently match URLs against a set of rules with wildcard support.

Built in November 2018 as a practical example of implementing a full-text search engine for URL classification and matching.

Features

  • Full-text search indexing using Elasticlunr.js
  • URL pattern matching with wildcard support
  • Best match algorithm for overlapping patterns
  • Fast lookup performance with indexed search
  • Simple rule-based configuration
  • Built-in test cases for validation

Architecture

graph TD
    A[URL Rules] -->|Index| B[Elasticlunr Engine]
    C[Input URL] -->|Search| B
    B -->|Match Score| D[UrlManager]
    D -->|Best Match| E[Service Name]
    
    F[core/urlManager.js] -->|Defines| A
    F -->|Provides| C
    G[models/UrlManager.js] -->|Implements| D
    H[index.js] -->|Initializes| F

Flow Diagram

sequenceDiagram
    participant User
    participant UrlManager
    participant Elasticlunr
    participant Index
    
    User->>UrlManager: Load Rules
    UrlManager->>Elasticlunr: Create Index
    loop For Each Rule
        UrlManager->>Index: Add Document
    end
    
    User->>UrlManager: findBestMatchForURL(url)
    UrlManager->>Elasticlunr: search(url)
    Elasticlunr->>Index: Query
    Index-->>Elasticlunr: Match Results
    Elasticlunr-->>UrlManager: Best Match
    UrlManager-->>User: Service Name

Getting Started

Prerequisites

  • Node.js (v12 or higher)
  • npm

Installation

  1. Clone the repository:
git clone https://github.com/orassayag/full-text-search.git
cd full-text-search
  2. Navigate to the server directory:
cd server
  3. Install dependencies:
npm install

Running the Application

npm start

This will run the demo with predefined URL rules and test cases, showing how the search engine matches URLs to their corresponding services.

Usage Example

The application includes example rules for common services:

const setOfRules = {
    'www.facebook.com/connect.js': 'Facebook',
    'www.google-analytics.com/*': 'Google Analytics',
    'www.twitter.com/scripts/v1/index.js': 'Twitter'
};

When you query a URL like www.facebook.com/connect.js, the engine returns Facebook.
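
To make the expected behaviour of these rules concrete, here is a minimal sketch of the lookup written as plain string matching. It is not the repository's implementation, which delegates matching to Elasticlunr; the helper names `matchesRule` and `lookupService` are hypothetical, and the sketch assumes a wildcard only ever appears as a trailing `*`.

```javascript
// Illustration only: plain string matching, NOT the Elasticlunr-based
// matching used by the project. Helper names are hypothetical.
const setOfRules = {
    'www.facebook.com/connect.js': 'Facebook',
    'www.google-analytics.com/*': 'Google Analytics',
    'www.twitter.com/scripts/v1/index.js': 'Twitter'
};

// Returns true when `url` satisfies `pattern`.
// Assumption: a trailing '*' matches any suffix; otherwise the match is exact.
function matchesRule(pattern, url) {
    if (pattern.endsWith('*')) {
        return url.startsWith(pattern.slice(0, -1));
    }
    return pattern === url;
}

// Returns the service name of the first matching rule, or null if none match.
function lookupService(rules, url) {
    for (const [pattern, name] of Object.entries(rules)) {
        if (matchesRule(pattern, url)) {
            return name;
        }
    }
    return null;
}

console.log(lookupService(setOfRules, 'www.facebook.com/connect.js'));           // Facebook
console.log(lookupService(setOfRules, 'www.google-analytics.com/analytics.js')); // Google Analytics
console.log(lookupService(setOfRules, 'www.example.com/app.js'));                // null
```

A scan like this costs one comparison per rule for every query, which is the cost an indexed search is meant to avoid as the rule set grows.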

Project Structure

full-text-search/
├── server/
│   ├── core/
│   │   └── urlManager.js      # URL rules and test cases
│   ├── models/
│   │   └── UrlManager.js      # URL matching logic
│   ├── index.js               # Entry point
│   └── package.json
└── README.md

How It Works

  1. Indexing Phase:

    • URL rules are loaded and indexed using Elasticlunr
    • Each rule gets a unique ID and is stored as a document
  2. Search Phase:

    • Input URL is queried against the index
    • Search engine returns relevance scores for all matching rules
    • The rule with the highest score is selected
  3. Result:

    • Returns the name/tag associated with the best matching rule
    • Supports exact matches and wildcard patterns
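
The three phases above can be sketched with a tiny hand-rolled inverted index. This is a dependency-free stand-in for Elasticlunr, not its API and not the project's code: the tokenizer, the scoring formula (each shared token contributes one divided by the number of rules containing it), and the function names `buildIndex`, `search`, and `findBestMatch` are all assumptions made for illustration.

```javascript
// Stand-in for Elasticlunr: a minimal inverted index, for illustration only.

// Splits a rule or URL into lowercase tokens on '.', '/' and '*'.
function tokenize(text) {
    return text.toLowerCase().split(/[.\/*]+/).filter(Boolean);
}

// Indexing phase: each rule gets a numeric ID and is stored as a document;
// every token maps to the list of rule IDs that contain it.
function buildIndex(rules) {
    const docs = [];
    const postings = new Map();
    Object.entries(rules).forEach(([pattern, name], id) => {
        docs.push({ id, pattern, name });
        for (const token of new Set(tokenize(pattern))) {
            if (!postings.has(token)) postings.set(token, []);
            postings.get(token).push(id);
        }
    });
    return { docs, postings };
}

// Search phase: score every rule that shares a token with the URL.
// Rare tokens (found in few rules) weigh more than common ones like 'www'.
function search(index, url) {
    const scores = new Map();
    for (const token of new Set(tokenize(url))) {
        const ids = index.postings.get(token) || [];
        for (const id of ids) {
            scores.set(id, (scores.get(id) || 0) + 1 / ids.length);
        }
    }
    return [...scores.entries()]
        .map(([ref, score]) => ({ ref, score }))
        .sort((a, b) => b.score - a.score);
}

// Result: the name attached to the highest-scoring rule, or null if no token matched.
function findBestMatch(index, url) {
    const results = search(index, url);
    return results.length > 0 ? index.docs[results[0].ref].name : null;
}

const index = buildIndex({
    'www.facebook.com/connect.js': 'Facebook',
    'www.google-analytics.com/*': 'Google Analytics',
    'www.twitter.com/scripts/v1/index.js': 'Twitter'
});

console.log(findBestMatch(index, 'www.google-analytics.com/analytics.js')); // Google Analytics
console.log(findBestMatch(index, 'localhost'));                             // null
```

One consequence of score-based selection in this sketch is that an unrelated URL sharing only common tokens such as `www` or `com` still receives a non-zero score, so a minimum-score threshold may be worth adding before trusting the top result.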

Configuration

To customize URL rules, edit server/core/urlManager.js:

const setOfRules = {
    'your.domain.com/path': 'Your Service',
    'another.domain.com/*': 'Another Service'
};
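
When adding custom rules, it can help to know which patterns overlap, since those are the cases where best-match selection decides the outcome. The following is a hypothetical pre-flight check, not part of the repository; the function name `findOverlappingRules` and the third rule in `customRules` are made up for the example, and the check assumes wildcards appear only as a trailing `*`.

```javascript
// Hypothetical helper, not part of the repository: lists pairs of rules
// where a trailing-wildcard rule also covers another, more specific rule.
function findOverlappingRules(rules) {
    const patterns = Object.keys(rules);
    const overlaps = [];
    for (const wide of patterns) {
        if (!wide.endsWith('*')) continue;   // only wildcard rules can cover others
        const prefix = wide.slice(0, -1);    // drop the trailing '*'
        for (const narrow of patterns) {
            if (narrow !== wide && narrow.startsWith(prefix)) {
                overlaps.push({ wide, narrow });
            }
        }
    }
    return overlaps;
}

// The last rule is an invented addition that overlaps with the wildcard above it.
const customRules = {
    'your.domain.com/path': 'Your Service',
    'another.domain.com/*': 'Another Service',
    'another.domain.com/api/*': 'Another Service API'
};

console.log(findOverlappingRules(customRules));
// [ { wide: 'another.domain.com/*', narrow: 'another.domain.com/api/*' } ]
```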

Built With

Use Cases

  • URL classification and categorization
  • Third-party script detection
  • Analytics tracking identification
  • Content delivery network (CDN) recognition
  • API endpoint routing

Contributing

Contributions to this project are released to the public under the project's open source license.

Everyone is welcome to contribute. Contributing doesn't just mean submitting pull requests; there are many different ways to get involved, including answering questions and reporting issues.

Feel free to contact me with any question, comment, pull request, issue, or other feedback.

See CONTRIBUTING.md for details.

Author

License

This application is released under the MIT License; see the LICENSE file for details.
