Understanding Search

When a learner's Percipio site language is set to any language other than English (US) and the learner searches in that language, the search results also include relevant English (US) content.

Search engines

Elasticsearch

Percipio uses Elasticsearch, an open-source search engine based on Apache Lucene. Elasticsearch is one of the most widely used enterprise search engines and is used by sites such as Facebook, Netflix, GitHub, and Skillport 8i. Elasticsearch supports many features expected of modern search engines, such as type-ahead suggestions, word stemming/word forms, synonyms, and fuzzy matching (typo correction).

Azure AI Models

Percipio also uses Azure AI models to vectorize content and search queries. The AI models generate numerical vectors, which encode the meaning of a section of text. We compute and store vectors for all content published in Percipio. These vectors can be compared to find content that has similar meaning to the search query, even if the search query does not use the exact same words that are used to describe the content.
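
As an illustration of how vectors can be compared, here is a minimal sketch using cosine similarity. This is an assumption made for the example: the comparison method Percipio actually uses is not stated here, and the titles and three-number vectors are invented stand-ins for real AI-generated vectors, which have many more dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)

# Hypothetical stored vectors, keyed by a made-up asset title.
content_vectors = {
    "Leading Teams Through Change": [0.9, 0.1, 0.2],
    "Introduction to Python": [0.1, 0.9, 0.3],
}

# Hypothetical vector for the learner's search query.
query_vector = [0.8, 0.2, 0.1]

# Rank content by similarity to the query, most similar first.
ranked = sorted(
    content_vectors,
    key=lambda title: cosine_similarity(query_vector, content_vectors[title]),
    reverse=True,
)
```

Content whose vector points in nearly the same direction as the query vector ranks first, even when the query shares no words with the title.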

How search works

When you enter a search term, the search engine looks for matches in the indexed asset metadata fields and content types. When matches are found, a relevance score is calculated for each matching asset. If your language is set to any language other than English (US), you can also see English content in the search results along with results in your selected language, provided your admin has enabled Include English (US) content in localized search results. A Language filter option displays.

In addition, when a learner's Percipio language is set to any non-English language and the learner performs a search in that language, the search includes matching results in any of the locale variants for that language. For example, when Percipio is set to French (FR) and a learner performs a search in French, the search returns matching results in both French (FR) and French (CA) as well as English. The languages that currently support this functionality are:

Spanish (DO and ES)
French (CA and FR)
Portuguese (BR and PT)
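
The locale behavior described above can be sketched as a simple lookup. This is only an illustration: the locale codes (such as fr-FR), the function name, and the include_english flag are assumptions made for the example, not Percipio settings or APIs.

```python
# Locale variants searched together, taken from the list above.
# The code format (language-REGION) is an assumption for this sketch.
LOCALE_VARIANTS = {
    "es": ["es-DO", "es-ES"],
    "fr": ["fr-CA", "fr-FR"],
    "pt": ["pt-BR", "pt-PT"],
}

def locales_to_search(site_locale, include_english=True):
    """Return the locales whose content a search should cover."""
    language = site_locale.split("-")[0]
    # Languages without listed variants fall back to the site locale alone.
    locales = list(LOCALE_VARIANTS.get(language, [site_locale]))
    # Hypothetical flag standing in for the admin setting that adds English (US).
    if include_english and "en-US" not in locales:
        locales.append("en-US")
    return locales
```

For example, locales_to_search("fr-FR") returns the two French variants followed by en-US.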

Keyword search

For a single-word search term, Percipio looks for simple word matches. For multiple words, it also searches for phrase matches and for assets that contain most of the terms. For example, if you search for the phrase project management, the search returns assets that match both words, not every asset that contains either project or management.

Quotes are not necessary for an “exact phrase match” since this is done automatically when a multiple word search term is entered.
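
A minimal sketch of the three kinds of keyword match described above (single word, phrase, and most terms) might look like the following. The function names and the 0.75 threshold for "most of the terms" are assumptions chosen for the example; the actual threshold is not documented here.

```python
def tokenize(text):
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()

def contains_phrase(tokens, phrase):
    """True if the phrase tokens appear consecutively and in order."""
    n = len(phrase)
    return any(tokens[i:i + n] == phrase for i in range(len(tokens) - n + 1))

def keyword_match(query, text, min_fraction=0.75):
    """Classify a match as 'word', 'phrase', 'most_terms', or None."""
    query_tokens = tokenize(query)
    text_tokens = tokenize(text)
    if not query_tokens:
        return None
    if len(query_tokens) == 1:
        return "word" if query_tokens[0] in text_tokens else None
    if contains_phrase(text_tokens, query_tokens):
        return "phrase"
    found = sum(1 for term in query_tokens if term in text_tokens)
    # min_fraction is a hypothetical cut-off for "most of the terms".
    return "most_terms" if found / len(query_tokens) >= min_fraction else None
```

With these assumptions, a title that contains only one of the two words in project management is not returned, which mirrors the example above.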

Content Type search

Search will recognize certain content types when they are included in the search query. For example, a search for leadership audiobooks will match on the term leadership and apply a boost to all matching audiobooks. Searching for just audiobooks will return a list of only audiobooks the learner is entitled to.

Recognized content types include:

Audiobooks
Aspire Journeys
Live Events
Testpreps
Live Courses
Practice lab
Skill benchmark
AI Simulator (CAISY™)

Searching on these terms returns all items of that content type.
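
Content type recognition can be pictured as splitting the query into a content type and the remaining keywords. The sketch below is an assumption about how that might work: the type identifiers and the parse_query function are invented for illustration, and matching is limited to the exact terms listed above.

```python
# Recognized terms from the list above, mapped to hypothetical type identifiers.
CONTENT_TYPES = {
    "audiobooks": "audiobook",
    "aspire journeys": "aspire_journey",
    "live events": "live_event",
    "testpreps": "testprep",
    "live courses": "live_course",
    "practice lab": "practice_lab",
    "skill benchmark": "skill_benchmark",
    "ai simulator": "ai_simulator",
}

def parse_query(query):
    """Return (remaining keywords, recognized content type or None)."""
    normalized = " ".join(query.lower().split())
    padded = f" {normalized} "
    # Check longer terms first so multi-word types are not shadowed.
    for term in sorted(CONTENT_TYPES, key=len, reverse=True):
        needle = f" {term} "
        if needle in padded:
            remaining = " ".join(padded.replace(needle, " ", 1).split())
            return remaining, CONTENT_TYPES[term]
    return normalized, None
```

A query with no remaining keywords, such as audiobooks on its own, corresponds to listing every item of that type; a query with keywords left over corresponds to searching on those keywords and boosting the recognized type.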

Metadata fields used in search

The asset metadata fields used in a search include:

Asset and channel titles
Asset and channel descriptions (overview)
Book author / Video speaker / Course instructor names
Book ISBNs
Publisher names
Course, video and book IDs
Certification exam names and numbers
Technology and version
Video transcripts and book full text
Content source
Job role family
Skills

Type-ahead

Type-ahead displays suggested terms as you type in the search field. These suggestions are compiled from several sources:

A curated list of common search terms
Certification exam names and numbers
Channel and asset titles
Author and instructor names

Selecting a suggested term enters it into the search field and executes the search.
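
A simple way to picture type-ahead is as a prefix lookup across several suggestion sources. The sketch below assumes prefix-only matching and uses invented sample terms; the real suggestion logic and data are not described here.

```python
def suggest(prefix, sources, limit=5):
    """Return up to `limit` terms that start with the typed prefix.

    Sources are checked in the order given, and duplicates
    (ignoring case) are skipped.
    """
    prefix = prefix.strip().lower()
    if not prefix:
        return []
    seen = set()
    suggestions = []
    for terms in sources:
        for term in terms:
            key = term.lower()
            if key.startswith(prefix) and key not in seen:
                seen.add(key)
                suggestions.append(term)
    return suggestions[:limit]

# Hypothetical sample data for each kind of source listed above.
curated_terms = ["project management", "python"]
exam_names = ["PMP"]
titles = ["Project Management Essentials", "Python"]
author_names = ["Peter Drucker"]

all_sources = [curated_terms, exam_names, titles, author_names]
```

Calling suggest("pro", all_sources) returns the curated term first and the matching title second, because sources are searched in order.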

Word stemming / word forms

Word stemming looks for different forms of words, so that relevant results are not omitted. In addition to searching for exact word/phrase matches, the search engine reduces words to their common form. For instance, the words "programming," "programmer," "programmed," "programs," and "program" will all count as matches.

Exact matches are scored higher than stemmed matches.
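
To make the idea concrete, here is a toy stemmer that handles the example words above. It is deliberately simplistic and is not the stemmer Elasticsearch uses; the suffix list and the score values 2.0 and 1.0 are assumptions chosen only to show exact matches outranking stemmed ones.

```python
# Toy suffix list; real stemmers use far more elaborate rules.
SUFFIXES = ("ing", "ed", "er", "s")

def toy_stem(word):
    """Reduce a word to a rough common form."""
    word = word.lower()
    for suffix in SUFFIXES:
        # Strip at most one suffix, keeping at least three letters.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[:-len(suffix)]
            break
    # Collapse a doubled final consonant: "programm" -> "program".
    if len(word) >= 2 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]
    return word

def match_score(query_word, field_word):
    """Score an exact match above a stemmed match (hypothetical values)."""
    if query_word.lower() == field_word.lower():
        return 2.0
    if toy_stem(query_word) == toy_stem(field_word):
        return 1.0
    return 0.0
```

All five forms in the example reduce to program, so a search for any one of them matches assets that use another, at a lower score than an exact match.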

Synonyms

If the search term is not used in any asset metadata, a match may not be found. Therefore, the search engine uses synonyms to define equivalent or related terms.

For example, suppose you want to find content on the Internet of Things. You enter the search term iot, but some content may not use this acronym in the descriptive metadata. The search engine has a defined synonym that makes iot equivalent to the text “internet of things”, so the search for iot matches content that contains “internet of things”.

Additionally, terms such as "accessibility," "wcag," and "section 508" are considered related. A course on accessibility may not use the term "section 508" in the asset metadata, but a search for that term will return accessibility-related assets.

"Coaching" and "mentoring" are very closely related, but an asset might use only one of these terms in the metadata. Using synonyms to associate these related terms fosters a successful search.

Skillsoft periodically reviews common search term history and works with the curators to keep the list of synonyms updated.
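
Synonym handling can be sketched as expanding the search term into its group of related terms before matching. The groups below come from the examples in this section; the function names and the simple substring matching are assumptions for illustration only.

```python
# Synonym groups taken from the examples above.
SYNONYM_GROUPS = [
    {"iot", "internet of things"},
    {"accessibility", "wcag", "section 508"},
    {"coaching", "mentoring"},
]

def expand_query(term):
    """Return the term plus every term in the same synonym group."""
    term = " ".join(term.lower().split())
    expanded = {term}
    for group in SYNONYM_GROUPS:
        if term in group:
            expanded |= group
    return expanded

def matches(term, metadata_text):
    """True if the term or any of its synonyms appears in the metadata.

    Plain substring matching is used to keep the sketch short;
    a real engine matches on whole tokens.
    """
    text = metadata_text.lower()
    return any(variant in text for variant in expand_query(term))
```

With these groups, a search for section 508 matches an asset described only with the word accessibility.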

Fuzzy match and Slop

Both the search term and the type-ahead suggestions use fuzziness to identify misspellings or typos. If no exact match results are found for the original search query, the search engine applies fuzziness to the term and looks for close matches.

The Elasticsearch Slop function is used for finding matches to author and instructor names that may not be an exact match. For example, 'Peter Drucker' and 'Peter F. Drucker' would be considered a match by using Slop.
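
The two ideas can be illustrated in plain Python. The sketch below assumes fuzziness is measured by edit distance (the number of single-character changes between two words) and treats slop as the number of extra words allowed between the query words. Both functions are simplified stand-ins written for this example, not calls into Elasticsearch, and the max_edits and slop defaults are arbitrary.

```python
def edit_distance(a, b):
    """Levenshtein distance: insertions, deletions, and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def fuzzy_lookup(term, vocabulary, max_edits=1):
    """Return exact matches if any; otherwise fall back to close matches."""
    term = term.lower()
    exact = [word for word in vocabulary if word.lower() == term]
    if exact:
        return exact
    return [word for word in vocabulary
            if edit_distance(term, word.lower()) <= max_edits]

def matches_with_slop(query, field, slop=1):
    """True if the query words appear in order with at most `slop` extra
    words between them. Simplified: reordering is not considered."""
    query_tokens = query.lower().split()
    field_tokens = field.lower().split()
    if not query_tokens:
        return False
    for start, token in enumerate(field_tokens):
        if token != query_tokens[0]:
            continue
        position, skipped, matched = start + 1, 0, 1
        while matched < len(query_tokens) and position < len(field_tokens):
            if field_tokens[position] == query_tokens[matched]:
                matched += 1
            else:
                skipped += 1
            position += 1
        if matched == len(query_tokens) and skipped <= slop:
            return True
    return False
```

Under these assumptions, the misspelling managment finds management, and Peter Drucker matches Peter F. Drucker when one extra word is allowed.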

Relevance score

The search engine calculates a relevance score for all matching assets. This relevance score is calculated based on a number of factors:

The more matches, the higher the score
The more words matched together in a given field (phrase match), the higher the score
Matches in a shorter field are weighted higher than matches in a longer field (for example, matches in titles are weighted higher than matches in a description, which are weighted higher than matches in full text)

Each match is scored, then adjusted, to fine-tune the relevance. Percipio boosts the asset relevance score based on the following factors, in descending order (highest boost to lowest):

Phrase or word match in channel title
Phrase or word match in custom content title
Phrase or word match in video title
Phrase or word match in course title
Phrase or word match in certification exam
Phrase or word match in custom content source
Phrase match in book text
Phrase or word match in book title
Phrase or word match in instructor/author/presenter name
Phrase or word match in content ID
Phrase or word match in parent channel title
Phrase or word match in publisher name
Phrase or word match in child titles (video titles in a course)
Word match in custom content description
Word match in channel description (for channel relevance)
Word match to book ISBN
Word match in technology title or version
Word match in book text
Word match in video transcripts

The match scores are combined to give a final relevance score for each matched asset. The results display in descending relevance order.
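
The scoring steps above can be sketched as follows. Every number here is an assumption: the boost values, the phrase bonus, and the length normalization (dividing by the square root of the field length) are invented to show the mechanics, and only a few of the fields from the boost list are included. The actual weights Percipio uses are not published in this article.

```python
import math

# Hypothetical boosts, ordered as in the list above (highest first).
FIELD_BOOSTS = {
    "channel_title": 10.0,
    "course_title": 7.0,
    "book_title": 4.0,
    "description": 2.0,
    "video_transcript": 1.0,
}
PHRASE_BONUS = 2.0  # hypothetical multiplier for a phrase match

def field_score(query, text):
    """Score one field: more matches, phrase matches, and shorter
    fields all raise the score."""
    query_tokens = query.lower().split()
    text_tokens = text.lower().split()
    if not query_tokens or not text_tokens:
        return 0.0
    word_hits = sum(text_tokens.count(term) for term in query_tokens)
    if word_hits == 0:
        return 0.0
    n = len(query_tokens)
    is_phrase = n > 1 and any(
        text_tokens[i:i + n] == query_tokens
        for i in range(len(text_tokens) - n + 1)
    )
    score = word_hits * (PHRASE_BONUS if is_phrase else 1.0)
    # Assumed length normalization: shorter fields weigh more.
    return score / math.sqrt(len(text_tokens))

def relevance(query, asset):
    """Combine the boosted field scores into one relevance score."""
    return sum(boost * field_score(query, asset.get(field, ""))
               for field, boost in FIELD_BOOSTS.items())

def rank(query, assets):
    """Return matching assets in descending relevance order."""
    scored = [(relevance(query, asset), asset) for asset in assets]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [asset for _, asset in scored]
```

In this sketch, an asset with the phrase in a short course title outranks one that mentions the same phrase only in a longer description.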

Age decay

A matching asset’s relevance score is reduced based on its age, whether it is archived, and whether it has a scheduled retirement date.
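
One way to picture age decay is as a set of multipliers applied to the relevance score. The sketch below is purely illustrative: the exponential half-life formula, the two-year half-life, and the penalty values are assumptions made for the example, since the actual decay calculation is not documented here.

```python
from datetime import date

ARCHIVED_PENALTY = 0.5   # hypothetical multiplier for archived assets
RETIRING_PENALTY = 0.8   # hypothetical multiplier for assets scheduled to retire

def apply_age_decay(score, published, today, archived=False,
                    retirement_date=None, half_life_days=730):
    """Reduce a relevance score for older, archived, or retiring assets."""
    age_days = max((today - published).days, 0)
    # Assumed exponential decay: the score halves every half_life_days.
    decayed = score * 0.5 ** (age_days / half_life_days)
    if archived:
        decayed *= ARCHIVED_PENALTY
    if retirement_date is not None:
        decayed *= RETIRING_PENALTY
    return decayed
```

With these assumed values, an asset published exactly 730 days ago keeps half of its original score, and an archived asset of the same age keeps a quarter.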

Advanced features

To keep search as simple to use as possible, advanced features such as “quoted strings” for exact phrase matches, Boolean operators, wildcards, and proximity indicators are not currently supported. However, Skillsoft monitors how learners use search and may add support for these features in the future.

AI-assisted Search

Skillsoft Percipio's AI-assisted search uses GPT technology to enhance your search experience by helping to refine ambiguous search queries, recommending related search queries, and providing direct AI-generated responses to questions or keywords.

Because the AI-assisted search is powered by a generative AI language model, it is not intended to replace professional advice or human interaction. The responses generated by this interface are based on statistical patterns learned from large data sets of text, and may not always be accurate or relevant to your specific query. Please use your own judgment when interpreting the responses and seek additional information or expert advice if unsure.

If you want more details about how AI-assisted search works, see the FAQ.