How to implement document boosting with a second index

Learn how to implement promoted search results with Meilisearch by using a second index for pinned documents.

August 9, 2023 · 3 min read

Laurent Cazanove, Developer Experience Engineer (@StriftCodes)

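This guide walks you through implementing promoted search results with Meilisearch. The goal is for search results to prioritize specific documents when the user's query matches certain keywords. These boosted documents should appear at the top of the search results.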

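This guide explains how to implement promoted documents on the back end. For a front-end-first implementation, see Implementing promoted search results with React InstantSearch.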

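Overview

Here is a simplified outline of how to implement document boosting with a second index for "pinned documents" and the multi-search feature.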

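Create the indexes: Set up two indexes, one for regular search and one for boosted results. The boosted index has a special attribute, keywords, that triggers the boost.
Populate the games index: Load the dataset into the games index from the provided JSON file. This index is the source of the documents we boost.
Configure the pinned_games index: Configure the pinned_games index so that it displays document attributes without exposing the keywords. Adjust the searchable and displayed attributes accordingly.
Boost documents: Identify the documents you want to boost and assign relevant keywords to them. For example, you can assign the keywords fps and shooter to the game Counter-Strike.
Implement the multi-search: Use Meilisearch's multi-search feature to run the query against both the regular index and the boosted index. This way, boosted documents that match the keywords appear first.
Display the results: Present the search results in a user-friendly format and highlight promoted documents with a visual indicator.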

Implementation

Installation

Before diving in, make sure you have Meilisearch up and running. If you have not installed it yet, follow these steps:

Start a Meilisearch instance: you can run Meilisearch locally or through Meilisearch Cloud.
Make sure you have installed the SDK (or framework integration) for your preferred language.

This guide uses the Python SDK, but it works the same way with any other Meilisearch integration. 🎉

Initializing the indexes

Our example uses a dataset of Steam games. You can adapt the process to your own data.

Download the steam-games.json and settings.json files for our Steam games dataset.
Load the dataset into your Meilisearch instance by adding the documents from the steam-games.json file.

The `games` index

import meilisearch
import json

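# Assumes a local instance on the default port; replace with your own URL
# (and pass an API key) if you use Meilisearch Cloud.
client = meilisearch.Client(url="http://localhost:7700")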
games = client.index("games")

# helper to wait for Meilisearch tasks
def wait_with_progress(client: meilisearch.Client, task_uid: int):
    while True:
        try:
            client.wait_for_task(task_uid, timeout_in_ms=1000)
            break
        except meilisearch.errors.MeilisearchTimeoutError:
            print(".", end="")
    task = client.get_task(task_uid)
    print(f" {task.status}")
    if task.error is not None:
        print(f"{task.error}")
            
print("Adding settings...", end="")
with open("settings.json") as settings_file:
    settings = json.load(settings_file)
    task = games.update_settings(settings)
    wait_with_progress(client, task.task_uid)


with open("steam-games.json") as documents_file:
    documents = json.load(documents_file)
    # json.load returns parsed Python objects, so send them with add_documents
    task = games.add_documents(documents)
    print("Adding documents...", end="")
    wait_with_progress(client, task.task_uid)

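The `pinned_games` index

This index will contain the promoted documents. The settings of the pinned_games index are the same as those of the games index, with the following differences: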

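The only entry in searchableAttributes is the keywords attribute, which contains the words that trigger pinning of the document.
displayedAttributes contains every attribute of the document except keywords (we do not want to show the keywords to end users).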

pinned = client.index("pinned_games")

print("Adding settings...", end="")
with open("settings.json") as settings_file:
    settings = json.load(settings_file)
    settings["searchableAttributes"] = ["keywords"]
    # all but "keywords"
    settings["displayedAttributes"] = ["name", "description", "id", "price", "image", "releaseDate", "recommendationCount", "platforms", "players", "genres", "misc"]
    task = pinned.update_settings(settings)
    # see `wait_with_progress` implementation in previous code sample
    wait_with_progress(client, task.task_uid)
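
The hard-coded attribute list above has to be kept in sync with the dataset by hand. As a minimal sketch, assuming every document shares the same top-level fields, the settings for the pinned index could instead be derived from a sample document. The helper name `build_pinned_settings` and the sample data are hypothetical; they are not part of the Meilisearch SDK.

```python
from copy import deepcopy
from typing import Any, Dict


def build_pinned_settings(
    base_settings: Dict[str, Any],
    sample_document: Dict[str, Any],
    keyword_field: str = "keywords",
) -> Dict[str, Any]:
    """Return a copy of base_settings adapted for the pinned index."""
    settings = deepcopy(base_settings)
    # Only the keyword field should trigger a match in the pinned index.
    settings["searchableAttributes"] = [keyword_field]
    # Display every field of the sample document except the keyword field.
    settings["displayedAttributes"] = [
        field for field in sample_document if field != keyword_field
    ]
    return settings


# Illustrative data only; a real call would pass the loaded settings.json
# and a document taken from the games index.
base = {
    "searchableAttributes": ["name", "description"],
    "filterableAttributes": ["genres"],
}
sample = {"id": 10, "name": "Counter-Strike", "genres": ["Action"]}
pinned_settings = build_pinned_settings(base, sample)
print(pinned_settings["displayedAttributes"])
```

The returned dictionary has the same shape as the `settings` object passed to `update_settings` above, and the base settings are left unmodified.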

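Updating the promoted documents index

Now we populate the pinned_games index with the documents from the games index that we want to promote.

For example, say we want to pin the game "Counter-Strike" to the keywords "fps", "first", "person", and "shooter".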

counter_strike = games.get_document(document_id=10)
counter_strike.keywords = ["fps", "first", "person", "shooter"]

print("Adding pinned document...", end="")
# add_documents expects a list of documents, so wrap the single document
task = pinned.add_documents([dict(counter_strike)])
wait_with_progress(client, task.task_uid)
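
If you prefer to work with plain dictionaries rather than the SDK's document object, the pinned copy can be built with a small helper. This is only a sketch: `make_pinned_document` is a hypothetical name, the sample document merely mimics the shape of a record in the dataset, and the lowercasing is purely for tidiness.

```python
from copy import deepcopy
from typing import Any, Dict, Iterable


def make_pinned_document(
    document: Dict[str, Any], keywords: Iterable[str]
) -> Dict[str, Any]:
    """Return a copy of document with a cleaned, de-duplicated keywords list."""
    pinned_document = deepcopy(document)
    seen = set()
    normalized = []
    for keyword in keywords:
        cleaned = keyword.strip().lower()
        # Skip empty strings and keywords that were already added.
        if cleaned and cleaned not in seen:
            seen.add(cleaned)
            normalized.append(cleaned)
    pinned_document["keywords"] = normalized
    return pinned_document


# Illustrative document; a real one would come from the games index.
game = {"id": 10, "name": "Counter-Strike"}
pinned_game = make_pinned_document(game, ["FPS", "first", "person", "shooter", "fps "])
print(pinned_game["keywords"])
```

The resulting dictionary can be passed in a list to `pinned.add_documents`, and the original document is left untouched.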

Customizing search results

Now let's create a function that returns search results with the pinned documents included.

from copy import deepcopy
from typing import Any, Dict, List
from dataclasses import dataclass

@dataclass
class SearchResults:
    pinned: List[Dict[str, Any]]
    regular: List[Dict[str, Any]]

def search_with_pinned(client: meilisearch.Client, query: Dict[str, Any]) -> SearchResults:
    pinned_query = deepcopy(query)
    pinned_query["indexUid"] = "pinned_games"
    regular_query = deepcopy(query)
    regular_query["indexUid"] = "games"
    results = client.multi_search([pinned_query, regular_query])
    # fetch the limit that was passed to each query so that we can respect that value when getting the results from each source
    limit = results["results"][0]["limit"]
    # fetch as many results from the pinned source as possible
    pinned_results = results["results"][0]["hits"]
    # only fetch results from the regular source up to limit
    regular_results = results["results"][1]["hits"][:(limit-len(pinned_results))]
    return SearchResults(pinned=pinned_results, regular=regular_results)

We can use this function to retrieve search results with promoted documents:

results = search_with_pinned(client, {"q": "first person shoot", "attributesToRetrieve": ["name"]})

The results object should look like this:

SearchResults(pinned=[{'name': 'Counter-Strike'}], regular=[{'name': 'Rogue Shooter: The FPS Roguelike'}, {'name': 'Rocket Shooter'}, {'name': 'Masked Shooters 2'}, {'name': 'Alpha Decay'}, {'name': 'Red Trigger'}, {'name': 'RAGE'}, {'name': 'BRINK'}, {'name': 'Voice of Pripyat'}, {'name': 'HAWKEN'}, {'name': 'Ziggurat'}, {'name': 'Dirty Bomb'}, {'name': 'Gunscape'}, {'name': 'Descent: Underground'}, {'name': 'Putrefaction'}, {'name': 'Killing Room'}, {'name': 'Hard Reset Redux'}, {'name': 'Bunny Hop League'}, {'name': 'Kimulator : Fight for your destiny'}, {'name': 'Intrude'}])
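
Because pinned documents are picked from the games index, a pinned game can also match the regular query and show up twice. The sketch below flattens the two hit lists into one, drops regular hits whose `id` is already pinned, and marks each hit with a flag that a front end could use for a visual indicator. The function name `merge_for_display`, the `pinned` flag, and the assumption that hits include an `id` field (which requires adding `id` to `attributesToRetrieve`) are illustrative choices, not part of Meilisearch.

```python
from typing import Any, Dict, List


def merge_for_display(
    pinned_hits: List[Dict[str, Any]],
    regular_hits: List[Dict[str, Any]],
    limit: int,
    id_field: str = "id",
) -> List[Dict[str, Any]]:
    """Combine pinned and regular hits into one ordered, de-duplicated list."""
    merged: List[Dict[str, Any]] = []
    seen_ids = set()
    for hit in pinned_hits:
        seen_ids.add(hit[id_field])
        merged.append({**hit, "pinned": True})
    for hit in regular_hits:
        if hit[id_field] in seen_ids:
            continue  # already shown as a pinned result
        seen_ids.add(hit[id_field])
        merged.append({**hit, "pinned": False})
    # Never return more hits than the caller asked for.
    return merged[:limit]


pinned_hits = [{"id": 10, "name": "Counter-Strike"}]
regular_hits = [
    {"id": 10, "name": "Counter-Strike"},
    {"id": 2200, "name": "Example Shooter"},  # made-up entry for illustration
]
for hit in merge_for_display(pinned_hits, regular_hits, limit=20):
    marker = "[pinned] " if hit["pinned"] else ""
    print(f"{marker}{hit['name']}")
```

With the `SearchResults` object from above, the call would be `merge_for_display(results.pinned, results.regular, limit)`, where `limit` is whatever page size your application uses.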