PostgreSQL: Documentation: 17: 9.13. Text Search Functions and Operators


9.13. Text Search Functions and Operators

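Table 9.42, Table 9.43 and Table 9.44 summarize the functions and operators that are provided for full text searching. See Chapter 12 for a detailed explanation of PostgreSQL's text search facility.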

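Table 9.42. Text Search Operators

| Operator | Description | Example(s) |
| --- | --- | --- |
| `tsvector @@ tsquery → boolean`; `tsquery @@ tsvector → boolean` | Does `tsvector` match `tsquery`? (The arguments can be given in either order.) | `to_tsvector('fat cats ate rats') @@ to_tsquery('cat & rat')` → `t` |
| `text @@ tsquery → boolean` | Does the text string, after implicit invocation of `to_tsvector()`, match `tsquery`? | `'fat cats ate rats' @@ to_tsquery('cat & rat')` → `t` |
| `tsvector \|\| tsvector → tsvector` | Concatenates two `tsvector`s. If both inputs contain lexeme positions, the second input's positions are adjusted accordingly. | `'a:1 b:2'::tsvector \|\| 'c:1 d:2 b:3'::tsvector` → `'a':1 'b':2,5 'c':3 'd':4` |
| `tsquery && tsquery → tsquery` | ANDs two `tsquery`s together, producing a query that matches documents that match both input queries. | `'fat \| rat'::tsquery && 'cat'::tsquery` → `( 'fat' \| 'rat' ) & 'cat'` |
| `tsquery \|\| tsquery → tsquery` | ORs two `tsquery`s together, producing a query that matches documents that match either input query. | `'fat \| rat'::tsquery \|\| 'cat'::tsquery` → `'fat' \| 'rat' \| 'cat'` |
| `!! tsquery → tsquery` | Negates a `tsquery`, producing a query that matches documents that do not match the input query. | `!! 'cat'::tsquery` → `!'cat'` |
| `tsquery <-> tsquery → tsquery` | Constructs a phrase query, which matches if the two input queries match at successive lexemes. | `to_tsquery('fat') <-> to_tsquery('rat')` → `'fat' <-> 'rat'` |
| `tsquery @> tsquery → boolean` | Does the first `tsquery` contain the second? (This considers only whether all the lexemes appearing in one query appear in the other, ignoring the combining operators.) | `'cat'::tsquery @> 'cat & rat'::tsquery` → `f` |
| `tsquery <@ tsquery → boolean` | Is the first `tsquery` contained in the second? (This considers only whether all the lexemes appearing in one query appear in the other, ignoring the combining operators.) | `'cat'::tsquery <@ 'cat & rat'::tsquery` → `t`; `'cat'::tsquery <@ '!cat & rat'::tsquery` → `t` |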


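In addition to these specialized operators, the usual comparison operators shown in Table 9.1 are available for types tsvector and tsquery. These are not very useful for text searching but allow, for example, unique indexes to be built on columns of these types.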
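
The operators in Table 9.42 can be tried directly in psql. The following is a minimal sketch rather than a canonical recipe: the `english` configuration is named explicitly so the results do not depend on the session default, and the table `ts_demo` and index `ts_demo_body_key` are hypothetical names introduced only for illustration.

```sql
-- Match operator: does the document vector satisfy the query?
SELECT to_tsvector('english', 'fat cats ate rats')
       @@ to_tsquery('english', 'cat & rat') AS matches;      -- t

-- Concatenation: positions of the second vector are shifted
-- past the largest position in the first one.
SELECT 'a:1 b:2'::tsvector || 'c:1 d:2 b:3'::tsvector AS combined;
-- 'a':1 'b':2,5 'c':3 'd':4

-- Containment compares only the sets of lexemes and ignores
-- the combining operators, including negation.
SELECT 'cat'::tsquery <@ '!cat & rat'::tsquery AS contained;  -- t

-- The ordinary comparison operators make a unique index possible.
-- (ts_demo and ts_demo_body_key are hypothetical names.)
CREATE TEMP TABLE ts_demo (body tsvector);
CREATE UNIQUE INDEX ts_demo_body_key ON ts_demo (body);
INSERT INTO ts_demo VALUES ('cat:1 rat:2');
-- Inserting the same vector a second time would now fail
-- with a unique-constraint violation.
```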

Table 9.43. Text Search Functions

| Function | Description | Example(s) |
| --- | --- | --- |
| `array_to_tsvector ( text[] ) → tsvector` | Converts an array of text strings to a `tsvector`. The given strings are used as lexemes as-is, without further processing. Array elements must not be empty strings or `NULL`. | `array_to_tsvector('{fat,cat,rat}'::text[])` → `'cat' 'fat' 'rat'` |
| `get_current_ts_config ( ) → regconfig` | Returns the OID of the current default text search configuration (as set by default_text_search_config). | `get_current_ts_config()` → `english` |
| `length ( tsvector ) → integer` | Returns the number of lexemes in the `tsvector`. | `length('fat:2,4 cat:3 rat:5A'::tsvector)` → `3` |
| `numnode ( tsquery ) → integer` | Returns the number of lexemes plus operators in the `tsquery`. | `numnode('(fat & rat) \| cat'::tsquery)` → `5` |
| `plainto_tsquery ( [ config regconfig, ] query text ) → tsquery` | Converts text to a `tsquery`, normalizing words according to the specified or default configuration. Any punctuation in the string is ignored (it does not determine query operators). The resulting query matches documents containing all non-stopwords in the text. | `plainto_tsquery('english', 'The Fat Rats')` → `'fat' & 'rat'` |
| `phraseto_tsquery ( [ config regconfig, ] query text ) → tsquery` | Converts text to a `tsquery`, normalizing words according to the specified or default configuration. Any punctuation in the string is ignored (it does not determine query operators). The resulting query matches phrases containing all non-stopwords in the text. | `phraseto_tsquery('english', 'The Fat Rats')` → `'fat' <-> 'rat'`; `phraseto_tsquery('english', 'The Cat and Rats')` → `'cat' <2> 'rat'` |
| `websearch_to_tsquery ( [ config regconfig, ] query text ) → tsquery` | Converts text to a `tsquery`, normalizing words according to the specified or default configuration. Quoted word sequences are converted to phrase tests. The word "or" is understood as producing an OR operator, and a dash produces a NOT operator; other punctuation is ignored. This approximates the behavior of some common web search tools. | `websearch_to_tsquery('english', '"fat rat" or cat dog')` → `'fat' <-> 'rat' \| 'cat' & 'dog'` |
| `querytree ( tsquery ) → text` | Produces a representation of the indexable portion of a `tsquery`. A result that is empty or just `T` indicates a non-indexable query. | `querytree('foo & ! bar'::tsquery)` → `'foo'` |
| `setweight ( vector tsvector, weight "char" ) → tsvector` | Assigns the specified `weight` to each element of the `vector`. | `setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A')` → `'cat':3A 'fat':2A,4A 'rat':5A` |
| `setweight ( vector tsvector, weight "char", lexemes text[] ) → tsvector` | Assigns the specified `weight` to elements of the `vector` that are listed in `lexemes`. The strings in `lexemes` are taken as lexemes as-is, without further processing. Strings that do not match any lexeme in `vector` are ignored. | `setweight('fat:2,4 cat:3 rat:5,6B'::tsvector, 'A', '{cat,rat}')` → `'cat':3A 'fat':2,4 'rat':5A,6A` |
| `strip ( tsvector ) → tsvector` | Removes positions and weights from the `tsvector`. | `strip('fat:2,4 cat:3 rat:5A'::tsvector)` → `'cat' 'fat' 'rat'` |
| `to_tsquery ( [ config regconfig, ] query text ) → tsquery` | Converts text to a `tsquery`, normalizing words according to the specified or default configuration. The words must be combined by valid `tsquery` operators. | `to_tsquery('english', 'The & Fat & Rats')` → `'fat' & 'rat'` |
| `to_tsvector ( [ config regconfig, ] document text ) → tsvector` | Converts text to a `tsvector`, normalizing words according to the specified or default configuration. Position information is included in the result. | `to_tsvector('english', 'The Fat Rats')` → `'fat':2 'rat':3` |
| `to_tsvector ( [ config regconfig, ] document json ) → tsvector`; `to_tsvector ( [ config regconfig, ] document jsonb ) → tsvector` | Converts each string value in the JSON document to a `tsvector`, normalizing words according to the specified or default configuration. The results are then concatenated in document order to produce the output. Position information is generated as though one stopword exists between each pair of string values. (Beware that "document order" of the fields of a JSON object is implementation-dependent when the input is `jsonb`; observe the difference in the examples.) | `to_tsvector('english', '{"aa": "The Fat Rats", "b": "dog"}'::json)` → `'dog':5 'fat':2 'rat':3`; `to_tsvector('english', '{"aa": "The Fat Rats", "b": "dog"}'::jsonb)` → `'dog':1 'fat':4 'rat':5` |
| `json_to_tsvector ( [ config regconfig, ] document json, filter jsonb ) → tsvector`; `jsonb_to_tsvector ( [ config regconfig, ] document jsonb, filter jsonb ) → tsvector` | Selects each item in the JSON document that is requested by the `filter` and converts each one to a `tsvector`, normalizing words according to the specified or default configuration. The results are then concatenated in document order to produce the output. Position information is generated as though one stopword exists between each pair of selected items. (Beware that "document order" of the fields of a JSON object is implementation-dependent when the input is `jsonb`.) The `filter` must be a `jsonb` array containing zero or more of these keywords: `"string"` (to include all string values), `"numeric"` (to include all numeric values), `"boolean"` (to include all boolean values), `"key"` (to include all keys), or `"all"` (to include all the above). As a special case, the `filter` can also be a simple JSON value that is one of these keywords. | `json_to_tsvector('english', '{"a": "The Fat Rats", "b": 123}'::json, '["string", "numeric"]')` → `'123':5 'fat':2 'rat':3`; `json_to_tsvector('english', '{"cat": "The Fat Rats", "dog": 123}'::json, '"all"')` → `'123':9 'cat':1 'dog':7 'fat':4 'rat':5` |
| `ts_delete ( vector tsvector, lexeme text ) → tsvector` | Removes any occurrence of the given `lexeme` from the `vector`. The `lexeme` string is treated as a lexeme as-is, without further processing. | `ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, 'fat')` → `'cat':3 'rat':5A` |
| `ts_delete ( vector tsvector, lexemes text[] ) → tsvector` | Removes any occurrences of the lexemes in `lexemes` from the `vector`. The strings in `lexemes` are taken as lexemes as-is, without further processing. Strings that do not match any lexeme in `vector` are ignored. | `ts_delete('fat:2,4 cat:3 rat:5A'::tsvector, ARRAY['fat','rat'])` → `'cat':3` |
| `ts_filter ( vector tsvector, weights "char"[] ) → tsvector` | Selects only elements with the given `weights` from the `vector`. | `ts_filter('fat:2,4 cat:3b,7c rat:5A'::tsvector, '{a,b}')` → `'cat':3B 'rat':5A` |
| `ts_headline ( [ config regconfig, ] document text, query tsquery [, options text ] ) → text` | Displays, in an abbreviated form, the match(es) for the `query` in the `document`, which must be raw text, not a `tsvector`. Words in the document are normalized according to the specified or default configuration before matching to the query. Use of this function is discussed in Section 12.3.4, which also describes the available `options`. | `ts_headline('The fat cat ate the rat.', 'cat')` → `The fat <b>cat</b> ate the rat.` |
| `ts_headline ( [ config regconfig, ] document json, query tsquery [, options text ] ) → text`; `ts_headline ( [ config regconfig, ] document jsonb, query tsquery [, options text ] ) → text` | Displays, in an abbreviated form, match(es) for the `query` that occur in string values within the JSON `document`. See Section 12.3.4 for more details. | `ts_headline('{"cat":"raining cats and dogs"}'::jsonb, 'cat')` → `{"cat": "raining <b>cats</b> and dogs"}` |
| `ts_rank ( [ weights real[], ] vector tsvector, query tsquery [, normalization integer ] ) → real` | Computes a score showing how well the `vector` matches the `query`. See Section 12.3.3 for details. | `ts_rank(to_tsvector('raining cats and dogs'), 'cat')` → `0.06079271` |
| `ts_rank_cd ( [ weights real[], ] vector tsvector, query tsquery [, normalization integer ] ) → real` | Computes a score showing how well the `vector` matches the `query`, using a cover density algorithm. See Section 12.3.3 for details. | `ts_rank_cd(to_tsvector('raining cats and dogs'), 'cat')` → `0.1` |
| `ts_rewrite ( query tsquery, target tsquery, substitute tsquery ) → tsquery` | Replaces occurrences of `target` with `substitute` within the `query`. See Section 12.4.2.1 for details. | `ts_rewrite('a & b'::tsquery, 'a'::tsquery, 'foo\|bar'::tsquery)` → `'b' & ( 'foo' \| 'bar' )` |
| `ts_rewrite ( query tsquery, select text ) → tsquery` | Replaces portions of the `query` according to target(s) and substitute(s) obtained by executing a `SELECT` command. See Section 12.4.2.1 for details. | `SELECT ts_rewrite('a & b'::tsquery, 'SELECT t,s FROM aliases')` → `'b' & ( 'foo' \| 'bar' )` |
| `tsquery_phrase ( query1 tsquery, query2 tsquery ) → tsquery` | Constructs a phrase query that searches for matches of `query1` and `query2` at successive lexemes (same as the `<->` operator). | `tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'))` → `'fat' <-> 'cat'` |
| `tsquery_phrase ( query1 tsquery, query2 tsquery, distance integer ) → tsquery` | Constructs a phrase query that searches for matches of `query1` and `query2` that occur exactly `distance` lexemes apart. | `tsquery_phrase(to_tsquery('fat'), to_tsquery('cat'), 10)` → `'fat' <10> 'cat'` |
| `tsvector_to_array ( tsvector ) → text[]` | Converts a `tsvector` to an array of lexemes. | `tsvector_to_array('fat:2,4 cat:3 rat:5A'::tsvector)` → `{cat,fat,rat}` |
| `unnest ( tsvector ) → setof record ( lexeme text, positions smallint[], weights text )` | Expands a `tsvector` into a set of rows, one per lexeme. | `select * from unnest('cat:3 fat:2,4 rat:5A'::tsvector)` → one row per lexeme (full output shown after the table) |

select * from unnest('cat:3 fat:2,4 rat:5A'::tsvector) →

 lexeme | positions | weights
--------+-----------+---------
 cat    | {3}       | {D}
 fat    | {2,4}     | {D,D}
 rat    | {5}       | {A}
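
Several functions from Table 9.43 are often used together. The query below is a rough sketch under stated assumptions, not a recommended design: the two sample strings stand in for a title and a body, and the column aliases are arbitrary names chosen for this example. No expected output is shown because ranking values depend on the weights and normalization in effect.

```sql
-- Build one vector from two parts, tagging the first part with
-- weight A and the second with weight B, then inspect it.
WITH doc AS (
    SELECT setweight(to_tsvector('english', 'The Fat Rats'), 'A')
        || setweight(to_tsvector('english', 'raining cats and dogs'), 'B') AS vec
)
SELECT vec,                                        -- combined vector with weights
       ts_filter(vec, '{a}')      AS weight_a_only, -- keep only weight-A entries
       strip(vec)                 AS lexemes_only,  -- drop positions and weights
       tsvector_to_array(vec)     AS lexeme_array,  -- lexemes as text[]
       vec @@ websearch_to_tsquery('english', '"fat rat" or dog') AS matches,
       ts_rank(vec, to_tsquery('english', 'rat | dog'))           AS rank
FROM doc;
```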

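Note

All the text search functions that accept an optional regconfig argument will use the configuration specified by default_text_search_config when that argument is omitted.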
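
The effect of omitting the configuration argument can be seen by changing the session default. This is a small sketch that assumes the built-in `english` and `simple` configurations are installed, as they are in a standard installation.

```sql
-- With the English configuration, stop words are dropped and words are stemmed.
SET default_text_search_config = 'pg_catalog.english';
SELECT get_current_ts_config();         -- english
SELECT to_tsvector('The Fat Rats');     -- 'fat':2 'rat':3

-- With the simple configuration, words are only lower-cased.
SET default_text_search_config = 'pg_catalog.simple';
SELECT get_current_ts_config();         -- simple
SELECT to_tsvector('The Fat Rats');     -- 'fat':2 'rats':3 'the':1

-- Passing the configuration explicitly overrides the session default.
SELECT to_tsvector('english', 'The Fat Rats');  -- 'fat':2 'rat':3

RESET default_text_search_config;
```

Naming the configuration explicitly is usually the safer choice in stored expressions such as index definitions, since the result then does not depend on session settings.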

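The functions in Table 9.44 are listed separately because they are not usually used in everyday text searching operations. They are primarily helpful for development and debugging of new text search configurations.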

Table 9.44. Text Search Debugging Functions

| Function | Description | Example(s) |
| --- | --- | --- |
| `ts_debug ( [ config regconfig, ] document text ) → setof record ( alias text, description text, token text, dictionaries regdictionary[], dictionary regdictionary, lexemes text[] )` | Extracts and normalizes tokens from the `document` according to the specified or default text search configuration, and returns information about how each token was processed. See Section 12.8.1 for details. | `ts_debug('english', 'The Brightest supernovaes')` → `(asciiword,"Word, all ASCII",The,{english_stem},english_stem,{}) ...` |
| `ts_lexize ( dict regdictionary, token text ) → text[]` | Returns an array of replacement lexemes if the input token is known to the dictionary, or an empty array if the token is known to the dictionary but it is a stop word, or NULL if it is not a known word. See Section 12.8.3 for details. | `ts_lexize('english_stem', 'stars')` → `{star}` |
| `ts_parse ( parser_name text, document text ) → setof record ( tokid integer, token text )` | Extracts tokens from the `document` using the named parser. See Section 12.8.2 for details. | `ts_parse('default', 'foo - bar')` → `(1,foo) ...` |
| `ts_parse ( parser_oid oid, document text ) → setof record ( tokid integer, token text )` | Extracts tokens from the `document` using a parser specified by OID. See Section 12.8.2 for details. | `ts_parse(3722, 'foo - bar')` → `(1,foo) ...` |
| `ts_token_type ( parser_name text ) → setof record ( tokid integer, alias text, description text )` | Returns a table that describes each type of token the named parser can recognize. See Section 12.8.2 for details. | `ts_token_type('default')` → `(1,asciiword,"Word, all ASCII") ...` |
| `ts_token_type ( parser_oid oid ) → setof record ( tokid integer, alias text, description text )` | Returns a table that describes each type of token a parser specified by OID can recognize. See Section 12.8.2 for details. | `ts_token_type(3722)` → `(1,asciiword,"Word, all ASCII") ...` |
| `ts_stat ( sqlquery text [, weights text ] ) → setof record ( word text, ndoc integer, nentry integer )` | Executes the `sqlquery`, which must return a single `tsvector` column, and returns statistics about each distinct lexeme contained in the data. See Section 12.4.4 for details. | `ts_stat('SELECT vector FROM apod')` → `(foo,10,15) ...` |
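
When checking how a configuration treats a piece of text, the debugging functions are typically called in the `FROM` clause so that their result columns can be selected individually. The statements below are an illustrative sketch; `apod_demo` is a hypothetical temporary table created only so that `ts_stat` has something to read.

```sql
-- Ask a single dictionary what it does with a token.
SELECT ts_lexize('english_stem', 'stars');  -- {star}
SELECT ts_lexize('english_stem', 'a');      -- {} (known, but a stop word)

-- See the raw tokens produced by the default parser.
SELECT tokid, token FROM ts_parse('default', 'foo - bar');

-- Trace each token through the dictionaries of a configuration.
SELECT alias, token, dictionary, lexemes
FROM ts_debug('english', 'The Brightest supernovaes');

-- Collect lexeme statistics over a column of vectors.
-- (apod_demo is a hypothetical table used only for this example.)
CREATE TEMP TABLE apod_demo (vector tsvector);
INSERT INTO apod_demo VALUES
    (to_tsvector('english', 'The Fat Rats')),
    (to_tsvector('english', 'fat cats ate rats'));
SELECT word, ndoc, nentry
FROM ts_stat('SELECT vector FROM apod_demo')
ORDER BY nentry DESC, word;
```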

