Data processing using Analyzers

An analyzer is a component responsible for processing and converting raw textual data into a format that is conducive to effective full-text search. It includes various sub-processes, such as tokenization, stemming, and removal of stop words.

How is data processed?

In Atlas Search, analyzers are like language experts that process and organize your data for effective searching. Think of them as tools with two main jobs:

  1. Tokenizer (Word Extractor):

    • The tokenizer takes your text and extracts meaningful words, like breaking a sentence into individual words.

  2. Filters (Cleanup Crew):

    • Filters are like a cleanup crew. They normalize capitalization, strip punctuation, and drop unnecessary words, so queries match the indexed text more reliably.
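To make the two jobs concrete, here is a minimal sketch in plain Python of a tokenizer followed by a chain of filters. It is only an illustration of the idea, not how Atlas Search is implemented internally; the function names and the stop-word list are hypothetical.

```python
import re

# Hypothetical stop-word list, chosen only for this illustration.
STOP_WORDS = {"the", "a", "an", "is", "of", "and"}


def tokenize(text):
    """Tokenizer: extract word tokens, discarding punctuation and whitespace."""
    return re.findall(r"\w+", text)


def lowercase_filter(tokens):
    """Filter: normalize every token to lowercase."""
    return [token.lower() for token in tokens]


def stop_word_filter(tokens):
    """Filter: drop tokens that appear in the stop-word list."""
    return [token for token in tokens if token not in STOP_WORDS]


def analyze(text):
    """Run the tokenizer first, then apply each filter in order."""
    tokens = tokenize(text)
    for token_filter in (lowercase_filter, stop_word_filter):
        tokens = token_filter(tokens)
    return tokens


print(analyze("The Quick Brown Fox is FAST!"))
# prints ['quick', 'brown', 'fox', 'fast']
```

The order matters in this sketch: lowercasing runs before stop-word removal, so "The" is matched against the lowercase stop-word list.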

By setting up an analyzer for a specific field, you decide how these language experts should do their job. They handle challenges such as:

  • ignoring case (uppercase or lowercase)

  • removing unnecessary words

  • understanding word variations (for example, matching "run", "runs", and "running")

The result? Your data is indexed in a form that makes searches more accurate and relevant. It's like having a language pro make your data search-friendly!
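As a rough sketch of what setting up an analyzer for a specific field can look like, the Python snippet below builds an Atlas Search index definition as a dictionary and prints it as JSON. The analyzer name "englishTextAnalyzer", the field name "description", and the stop-word tokens are assumptions made up for this example; check the tokenizer and token filter options against the Atlas Search documentation before using them.

```python
import json

# Hypothetical index definition: a custom analyzer applied to one field.
index_definition = {
    "analyzers": [
        {
            "name": "englishTextAnalyzer",  # hypothetical analyzer name
            "tokenizer": {"type": "standard"},  # splits text into word tokens
            "tokenFilters": [
                {"type": "lowercase"},  # ignore case
                # remove unnecessary words (illustrative token list)
                {"type": "stopword", "tokens": ["the", "a", "an", "is"]},
                # reduce word variations to a common stem
                {"type": "snowballStemming", "stemmerName": "english"},
            ],
        }
    ],
    "mappings": {
        "dynamic": False,
        "fields": {
            # "description" is an assumed field name for this sketch
            "description": {
                "type": "string",
                "analyzer": "englishTextAnalyzer",
            }
        },
    },
}

# Print the definition as JSON, the format used for index definitions.
print(json.dumps(index_definition, indent=2))
```

Each token filter in the sketch corresponds to one of the challenges listed above: case, unnecessary words, and word variations.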

For In-Depth Details
