Hello! We continue to study the possibilities of SQL Server from Microsoft, and the next step is the Full-Text Search component, in the Russian version it is “Full-text search”. And now we will find out why it is needed, and how to implement this same full-text search in Microsoft SQL Server using this component.
And we will begin, of course, by considering the basics of full-text search, i.e. what is it and why is it needed at all.
What is full-text search?
A full-text search is a search for words or phrases in text data. Typically, this type of search is used to search for text in a large amount of data, for example, a table with a million or more rows, since it is much faster than a regular search that can be performed using the LIKE construct.
Full-text search involves creating a special index ( it differs from ordinary indexes ) of text data, which is a kind of dictionary of words that are found in this data.
Using full-text search, you can implement a kind of search engine for documents (i.e. strings), by words or phrases in your enterprise database. Since in addition to his quick work, he also has the ability to rank the documents found, i.e. rank each row found, in other words, you can find the most relevant entries, i.e. the most suitable for your request.
Full-text search capabilities in Microsoft SQL Server
- In the full-text search of the SQL server, you can search not only for individual words or phrases, but also for prefix expressions , for example, specify the text of the beginning of a word or phrase;
- You can also search for words by word forms, for example, various forms of verbs or nouns in the singular and in the plural, i.e. by derived expressions ;
- You can build a query to find words or phrases that are next to other words or phrases, i.e. location-specific expressions ;
- It is possible to search for synonymous forms of a particular word ( thesaurus ), i.e., for example, if the thesaurus defines that “ Car ” and “ Machine ” are synonyms, then when searching for the word “ Car ”, the result set will include lines containing the word ” Machine “;
- You can specify words or phrases with weighted meanings in the query, for example, if several words or phrases are indicated in the query, they can be assigned importance from 0.0 to 1.0 ( 1.0 means that this is the most important word or phrase );
- In order not to take into account some words in the search, you can use the “ list of stop words ” ie according to the words included in this list, the search will not be performed.
Preparing to implement full-text search in Microsoft SQL Server
Before you start creating a full-text search, you need to know a few important points:
- To implement full-text search component Full-Text Search ( text search ) should be set;
- A table can have only one full-text index;
- To create a full-text index, the table must contain one unique index, which includes one column and does not allow null values. It is recommended to use a unique clustered index ( or just a primary key ), the first column of which should have an integer data type;
- A full-text index can be created on columns with a data type: char, varchar, nchar, nvarchar, text, ntext, image, xml, varbinary or varbinary (max);
- In order to create a full-text index, you must first create a full-text catalog. Starting with SQL Server 2008, a full-text catalog is a logical concept that refers to a group of full-text indexes, i.e. is a virtual object and is not included in the file group ( there is a way to create a full-text index using the “Wizard”, in which a directory can be created at the same time as the index, we will consider this method below ).