Hello world

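This tutorial demonstrates how to deploy a text summarization model from Hugging Face. In this tutorial, you will:

  • Set up the BentoML environment

  • Create a BentoML Service

  • Deploy the model locally

You can find the source code in the quickstart GitHub repository.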

Set up the environment

  1. Clone the project repository.

    git clone https://github.com/bentoml/quickstart.git
    cd quickstart
    
  2. Create a virtual environment and activate it.

    On macOS or Linux:

    python3 -m venv quickstart
    source quickstart/bin/activate
    
    On Windows:

    python -m venv quickstart
    quickstart\Scripts\activate
    

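    Note

    We recommend creating a virtual environment to isolate dependencies. If you don't want to set up a local development environment, skip to the BentoCloud deployment documentation.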

  3. Install BentoML and the dependencies required by the model.

    # Python 3.11 is recommended
    pip install bentoml torch transformers
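Before moving on, it can help to confirm that the three packages actually landed in the active virtual environment. The following is a minimal sketch that uses only the Python standard library; the helper name `installed_versions` is hypothetical and not part of BentoML.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The package is not installed in the current environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # These are the packages installed by the pip command in this step.
    for name, version in installed_versions(["bentoml", "torch", "transformers"]).items():
        print(f"{name}: {version or 'NOT INSTALLED'}")
```

If any package is reported as not installed, check that the virtual environment is activated and rerun the pip command.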
    

Create a BentoML Service

You define the serving logic for the model in a service.py file. The following is the example from this project:

service.py
from __future__ import annotations
import bentoml

with bentoml.importing():
    from transformers import pipeline


EXAMPLE_INPUT = "Breaking News: In an astonishing turn of events, the small town of Willow Creek has been taken by storm as local resident Jerry Thompson's cat, Whiskers, performed what witnesses are calling a 'miraculous and gravity-defying leap.' Eyewitnesses report that Whiskers, an otherwise unremarkable tabby cat, jumped a record-breaking 20 feet into the air to catch a fly. The event, which took place in Thompson's backyard, is now being investigated by scientists for potential breaches in the laws of physics. Local authorities are considering a town festival to celebrate what is being hailed as 'The Leap of the Century.'"


@bentoml.service
class Summarization:
    def __init__(self) -> None:
        self.pipeline = pipeline('summarization')

    @bentoml.api
    def summarize(self, text: str = EXAMPLE_INPUT) -> str:
        result = self.pipeline(text)
        return f"Hello world! Here's your summary: {result[0]['summary_text']}"

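In the Summarization class, the BentoML Service retrieves a pre-trained model and initializes a text summarization pipeline. The summarize method serves as the API endpoint. It accepts a string input (an example is provided), processes it through the pipeline, and returns the summarized text.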
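To make the return shape concrete, here is a minimal sketch of the post-processing step in isolation. It assumes, as the indexing in summarize implies, that the pipeline returns a list with one dict per input, keyed by summary_text. The function `fake_pipeline` is a hypothetical stand-in so the snippet runs without downloading a model; it is not part of transformers or BentoML.

```python
from __future__ import annotations

from typing import Callable


def fake_pipeline(text: str) -> list[dict[str, str]]:
    """Hypothetical stand-in for the real pipeline: 'summarizes' to the first sentence."""
    first_sentence = text.split(". ")[0].rstrip(".") + "."
    # Mimic the assumed output shape: a list of dicts keyed by 'summary_text'.
    return [{"summary_text": first_sentence}]


def summarize(text: str, pipeline: Callable[[str], list[dict[str, str]]] = fake_pipeline) -> str:
    """Run the pipeline and format the first result the same way service.py does."""
    result = pipeline(text)
    return f"Hello world! Here's your summary: {result[0]['summary_text']}"


print(summarize("Whiskers jumped 20 feet. Scientists are investigating."))
# prints: Hello world! Here's your summary: Whiskers jumped 20 feet.
```

Passing the pipeline in as a parameter is only a convenience for this sketch; the Service in service.py stores the pipeline on the instance instead.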

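In BentoML, a Service is a deployable and scalable unit, defined as a Python class with the @bentoml.service decorator. It can manage state and its lifecycle, and expose one or more APIs over HTTP. Each API in a Service is defined as a Python function with the @bentoml.api decorator.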

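The bentoml.importing() context manager handles import statements for dependencies that are required at serving time but may not be available in other situations.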

Deploy the model locally

  1. Run bentoml serve to start the BentoML server.

    $ bentoml serve
    
    2024-02-02T07:16:14+0000 [WARNING] [cli] Converting 'Summarization' to lowercase: 'summarization'.
    2024-02-02T07:16:15+0000 [INFO] [cli] Starting production HTTP BentoServer from "service:Summarization" listening on http://localhost:3000 (Press CTRL+C to quit)
    
  2. You can call the exposed summarize endpoint at http://localhost:3000.

    curl -X 'POST' \
        'http://localhost:3000/summarize' \
        -H 'accept: text/plain' \
        -H 'Content-Type: application/json' \
        -d '{
        "text": "Breaking News: In an astonishing turn of events, the small town of Willow Creek has been taken by storm as local resident Jerry Thompson'\''s cat, Whiskers, performed what witnesses are calling a '\''miraculous and gravity-defying leap.'\'' Eyewitnesses report that Whiskers, an otherwise unremarkable tabby cat, jumped a record-breaking 20 feet into the air to catch a fly. The event, which took place in Thompson'\''s backyard, is now being investigated by scientists for potential breaches in the laws of physics. Local authorities are considering a town festival to celebrate what is being hailed as '\''The Leap of the Century.'\''"
    }'
    
    import bentoml
    
    with bentoml.SyncHTTPClient("http://localhost:3000") as client:
        result = client.summarize(
            text="Breaking News: In an astonishing turn of events, the small town of Willow Creek has been taken by storm as local resident Jerry Thompson's cat, Whiskers, performed what witnesses are calling a 'miraculous and gravity-defying leap.' Eyewitnesses report that Whiskers, an otherwise unremarkable tabby cat, jumped a record-breaking 20 feet into the air to catch a fly. The event, which took place in Thompson's backyard, is now being investigated by scientists for potential breaches in the laws of physics. Local authorities are considering a town festival to celebrate what is being hailed as 'The Leap of the Century.'"
        )
        print(result)
    

    Visit http://localhost:3000, scroll down to Service APIs, and click Try it out. In the Request body box, enter your prompt and click Execute.

    [Image: BentoML hello world example in the Swagger UI]

    Expected output

    Hello world! Here's your summary: Whiskers, an otherwise unremarkable tabby cat, jumped a record-breaking 20 feet into the air to catch a fly . The event is now being investigated by scientists for potential breaches in the laws of physics . Local authorities considering a town festival to celebrate what is being hailed as 'The Leap of the Century'
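The curl call above can also be reproduced with only the Python standard library, which is handy when the bentoml client package is not installed on the calling machine. This is a sketch under stated assumptions: the server is assumed to listen at http://localhost:3000, the endpoint path and headers are copied from the curl example, and the helper names `build_summarize_request` and `summarize_via_http` are hypothetical.

```python
import json
import urllib.request


def build_summarize_request(text: str, base_url: str = "http://localhost:3000") -> urllib.request.Request:
    """Build (but do not send) the same POST request as the curl example."""
    payload = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/summarize",
        data=payload,
        headers={"accept": "text/plain", "Content-Type": "application/json"},
        method="POST",
    )


def summarize_via_http(text: str, base_url: str = "http://localhost:3000") -> str:
    """Send the request and return the response body; needs a running server."""
    request = build_summarize_request(text, base_url)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    request = build_summarize_request("Whiskers jumped 20 feet into the air to catch a fly.")
    print(request.get_method(), request.full_url)
    # Uncomment once `bentoml serve` is running in another terminal:
    # print(summarize_via_http("Whiskers jumped 20 feet into the air to catch a fly."))
```

Building the request separately from sending it keeps the payload and headers easy to inspect without a live server.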
    

What's next