Original document: http://www.sai.msu.su/~megera/oddmuse/index.cgi/unaccent
This module provides the unaccent text search dictionary and a function that removes accents from input text.
The unaccent dictionary is a filtering dictionary: unlike a standard dictionary, its output is always passed to the next dictionary (if any). It currently supports the most important accents from European languages. To change the set of accents, edit the accents.src file (it must be UTF-8 encoded).
Compatibility: PostgreSQL version 8.4+
Installation:
cd unaccent && make && make install
psql DB_NAME < unaccent.sql
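On newer servers the SQL script step usually looks different. The sketch below assumes PostgreSQL 9.1 or later with the contrib packages installed, where unaccent is available as an extension. It does not apply to the standalone 8.4 build described above.

```sql
-- Assumption: PostgreSQL 9.1+ with contrib installed.
-- Registers the unaccent dictionary and function in the current database.
CREATE EXTENSION IF NOT EXISTS unaccent;

-- Quick check that the function is callable.
SELECT unaccent('Hôtel');
```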
Examples:
1. The unaccent dictionary finds no accent, does nothing, and returns NULL. (The lexeme 'Hotels' is passed unchanged to the next dictionary, if any.)
=# select ts_lexize('unaccent','Hotels') is NULL;
?column?
----------
t
(1 row)
2. The unaccent dictionary removes the accent and returns 'Hotel'. (The lexeme 'Hotel' is passed to the next dictionary, if any.)
=# select ts_lexize('unaccent','Hôtel') is NULL;
?column?
----------
f
(1 row)
3. A simple configuration for the French language
CREATE TEXT SEARCH CONFIGURATION fr ( COPY = french );
ALTER TEXT SEARCH CONFIGURATION fr
ALTER MAPPING FOR hword, hword_part, word
WITH unaccent, french_stem;
=# select to_tsvector('fr','Hôtels de la Mer');
to_tsvector
-------------------
'hotel':1 'mer':4
(1 row)
'Hôtels' --(unaccent)--> 'Hotels' --(french_stem)--> 'hotel'
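To see how each token moves through the dictionary chain, the standard ts_debug function can be run against the same configuration. This is a minimal sketch that assumes the 'fr' configuration from example 3 already exists. The exact contents of the output columns depend on the server version, so no output is shown.

```sql
-- Show, for each token, the dictionaries consulted and the resulting lexemes.
-- Assumes the 'fr' configuration created in example 3.
SELECT alias, token, dictionaries, lexemes
FROM ts_debug('fr', 'Hôtels de la Mer');
```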
=# select to_tsvector('fr','Hôtel de la Mer') @@ to_tsquery('fr','Hotels');
?column?
----------
t
(1 row)
=# select ts_headline('fr','Hôtel de la Mer',to_tsquery('fr','Hotels'));
ts_headline
------------------------
<b>Hôtel</b> de la Mer
(1 row)
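The same configuration can back an accent-insensitive search over a table. The sketch below is illustrative only: the table places, its columns, and the index name are hypothetical, and the 'fr' configuration from example 3 is assumed to exist.

```sql
-- Hypothetical table, for illustration only.
CREATE TABLE places (id serial PRIMARY KEY, name text);
INSERT INTO places (name) VALUES ('Hôtel de la Mer'), ('Hotel du Nord');

-- Expression index over the tsvector built with the 'fr' configuration.
CREATE INDEX places_name_fts ON places USING gin (to_tsvector('fr', name));

-- Both rows should match, since 'Hôtel' and 'Hotel' reduce to the same lexeme.
SELECT name
FROM places
WHERE to_tsvector('fr', name) @@ to_tsquery('fr', 'hotel');
```

The query repeats the indexed expression verbatim, which is what allows the planner to consider the index.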
Function: text unaccent(text) - removes accents from the input text
=# select unaccent('Hôtels');
unaccent
----------
Hotels
(1 row)
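Outside of full text search, unaccent() can be applied to both sides of an ordinary comparison. The following is a minimal sketch on inline sample values (the strings are made up for illustration); it assumes the accents involved are covered by the installed accent rules.

```sql
-- Accent-insensitive equality: strip accents from both operands.
SELECT unaccent('Hôtel') = unaccent('Hotel') AS same_word;

-- Accent- and case-insensitive substring match over sample values.
SELECT name
FROM (VALUES ('Hôtel de la Mer'), ('Hotel du Nord'), ('Café Central')) AS t(name)
WHERE unaccent(name) ILIKE '%hotel%';
```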