Hugging Face Transformers PyTorch Tutorial: Load, Predict, and Serve/Deploy

Many of you have probably heard of BERT, or of transformers.
You may also know Hugging Face.
In this tutorial, we'll play with its PyTorch transformer model and serve it through a REST API.

How the model works

Given an incomplete sentence as input, the model predicts the missing word.
Input:

Paris is the [MASK] of France.

Output:

Paris is the capital of France.

Let's try it out

Prerequisites

Mac users

If you are working on an M1 Mac like me, you need to install cmake and rust:

brew install cmake

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

Install dependencies

You can install the dependencies with pip. PyTorch itself is also required, since the code below imports torch.

pip install tqdm boto3 requests regex sentencepiece sacremoses

Alternatively, you can use a Docker image:

docker run -it -p 8000:8000 -v $(pwd):/opt/workspace huggingface/transformers-pytorch-cpu:4.18.0 bash

Load the model

This loads the tokenizer and the model. The first run may take a while because both need to be downloaded.

import torch

# load tokenizer
tokenizer = torch.hub.load(
    "huggingface/pytorch-transformers",
    "tokenizer",
    "bert-base-cased",
)
# load masked model
masked_lm_model = torch.hub.load(
    "huggingface/pytorch-transformers",
    "modelForMaskedLM",
    "bert-base-cased",
)

Define the predict function

The input text is: Paris is the [MASK] of France.

input_text = "Paris is the [MASK] of France."

First, we tokenize the input text.

tokens = tokenizer(input_text)

Next, let's find the index of each masked token.

mask_index = [
    i
    for i, token_id in enumerate(tokens["input_ids"])
    if token_id == tokenizer.mask_token_id
]
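
To make the list comprehension above concrete, here is a minimal sketch that runs without downloading anything. The token ids are made-up placeholders rather than real tokenizer output, and MASK_TOKEN_ID and find_mask_positions are hypothetical names standing in for tokenizer.mask_token_id and the inline comprehension.

```python
# Hypothetical stand-in for tokenizer.mask_token_id (an assumption, not read from a real vocabulary).
MASK_TOKEN_ID = 103


def find_mask_positions(input_ids, mask_token_id):
    """Return every position whose token id equals the mask token id."""
    return [i for i, token_id in enumerate(input_ids) if token_id == mask_token_id]


# Made-up ids standing in for tokens["input_ids"]; position 4 holds the mask.
example_ids = [101, 7, 8, 9, 103, 10, 11, 12, 102]
print(find_mask_positions(example_ids, MASK_TOKEN_ID))  # prints [4]
```

Because the result is a list, a sentence with several [MASK] tokens yields several positions, and each one is filled in later.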

Prepare the tensors

segments_tensors = torch.tensor([tokens["token_type_ids"]])
tokens_tensor = torch.tensor([tokens["input_ids"]])

Predict

with torch.no_grad():
    predictions = masked_lm_model(
        tokens_tensor, token_type_ids=segments_tensors
    )

Now, let's look at the result.

pred_tokens = torch.argmax(predictions[0][0], dim=1)

# replace the initial input text's mask with the predicted token (as a plain int)
for i in mask_index:
    tokens["input_ids"][i] = pred_tokens[i].item()
tokenizer.decode(tokens["input_ids"], skip_special_tokens=True)

Output:

'Paris is the capital of France.'
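
The indexing in the argmax step can be hard to follow, so here is a small pure-Python sketch of the same idea. In the code above, predictions[0] holds the model's scores, the second [0] selects the first (and only) sentence in the batch, and the argmax over dim=1 picks, for each token position, the vocabulary entry with the highest score. The scores below are made-up numbers and the helper names are hypothetical; this mirrors what torch.argmax does on a 2-D tensor, not how PyTorch implements it.

```python
def argmax(values):
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def most_likely_ids(logits):
    """For each token position, pick the vocabulary index with the highest score."""
    return [argmax(scores) for scores in logits]


# Toy scores for one sentence: 3 token positions, vocabulary of 4 entries (made-up numbers).
toy_logits = [
    [0.1, 2.0, -1.0, 0.3],
    [1.5, 0.2, 0.1, 0.0],
    [-0.5, 0.0, 0.4, 3.2],
]
print(most_likely_ids(toy_logits))  # prints [1, 0, 3]
```

The model produces a prediction for every position, not only the masked one, which is why the tutorial code copies back only the entries listed in mask_index.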

Let's organize the code into a predict function

def predict(input_text):
    # tokenize the input text
    tokens = tokenizer(input_text)

    # get all the mask index
    mask_index = [
        i
        for i, token_id in enumerate(tokens["input_ids"])
        if token_id == tokenizer.mask_token_id
    ]

    # convert the input ids and type ids to tensor
    segments_tensors = torch.tensor([tokens["token_type_ids"]])
    tokens_tensor = torch.tensor([tokens["input_ids"]])

    # run predictions
    with torch.no_grad():
        predictions = masked_lm_model(
            tokens_tensor, token_type_ids=segments_tensors
        )

    # pick the most likely predictions
    pred_tokens = torch.argmax(predictions[0][0], dim=1)

    # replace the initial input text's mask with the predicted token (as a plain int)
    for i in mask_index:
        tokens["input_ids"][i] = pred_tokens[i].item()
    return tokenizer.decode(tokens["input_ids"], skip_special_tokens=True)

Run:

predict("Paris is the [MASK] of France.")

Output:

'Paris is the capital of France.'

Serve it through a REST API

First, let's install Pinferencia.

pip install "pinferencia[uvicorn]"

If you haven't heard of Pinferencia, check out its GitHub page at https://github.com/underneathall/pinferencia or its homepage at https://pinferencia.underneathall.app/. It is a library that helps you deploy your model with little effort.

Let's save our predict function in a file named app.py and add a few lines to register it.

import torch
from pinferencia import Server

# load tokenizer
tokenizer = torch.hub.load(
    "huggingface/pytorch-transformers",
    "tokenizer",
    "bert-base-cased",
)
# load masked model
masked_lm_model = torch.hub.load(
    "huggingface/pytorch-transformers",
    "modelForMaskedLM",
    "bert-base-cased",
)


def predict(input_text):
    # tokenize the input text
    tokens = tokenizer(input_text)

    # get all the mask index
    mask_index = [
        i
        for i, token_id in enumerate(tokens["input_ids"])
        if token_id == tokenizer.mask_token_id
    ]

    # convert the input ids and type ids to tensor
    segments_tensors = torch.tensor([tokens["token_type_ids"]])
    tokens_tensor = torch.tensor([tokens["input_ids"]])

    # run predictions
    with torch.no_grad():
        predictions = masked_lm_model(
            tokens_tensor, token_type_ids=segments_tensors
        )

    # pick the most likely predictions
    pred_tokens = torch.argmax(predictions[0][0], dim=1)

    # replace the initial input text's mask with the predicted token (as a plain int)
    for i in mask_index:
        tokens["input_ids"][i] = pred_tokens[i].item()
    return tokenizer.decode(tokens["input_ids"], skip_special_tokens=True)


service = Server()
service.register(model_name="transformer", model=predict)

Run the service, then wait for it to load the model and start the server.

uvicorn app:service --reload

Test the service.

Using curl

curl --location --request POST 'http://127.0.0.1:8000/v1/models/transformer/predict' \
--header 'Content-Type: application/json' \
--data-raw '{
    "data": "Paris is the [MASK] of France."
}'

Response

{
    "model_name":"transformer",
    "data":"Paris is the capital of France."
}
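
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; it assumes the server from the previous step is listening on 127.0.0.1:8000 and that the response has the shape shown above. The names build_request and predict_remote are hypothetical helpers for illustration, not part of Pinferencia.

```python
import json
import urllib.request

# Endpoint copied from the curl example above.
PREDICT_URL = "http://127.0.0.1:8000/v1/models/transformer/predict"


def build_request(text, url=PREDICT_URL):
    """Build a POST request whose JSON body is {"data": text}."""
    body = json.dumps({"data": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict_remote(text, url=PREDICT_URL):
    """Send the request and return the "data" field of the JSON response."""
    with urllib.request.urlopen(build_request(text, url)) as response:
        return json.loads(response.read().decode("utf-8"))["data"]
```

With the server running, calling predict_remote("Paris is the [MASK] of France.") should return the same completed sentence as the curl call. Building the request separately from sending it makes it possible to inspect the body and headers without a live server.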

Cool! Even cooler:
You can use the Swagger UI at http://127.0.0.1:8000 (the server's address) to try out predictions.

Reference

Original article: https://dev.to/wjiuhe/huggingface-transformers-pytorch-tutorial-load-predict-and-servedeploy-k7i
