Manage Deployments

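After you deploy a Bento on BentoCloud, you can manage the resulting Deployment with the BentoML CLI or API. Available operations include viewing, updating, applying, terminating, and deleting Deployments.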

View

To list all Deployments in your BentoCloud account:

bentoml deployment list

Expected output:

Deployment                                         created_at           Bento                                                                      Status      Region
sentence-transformers-f8ng                         2024-02-20 17:11:29  sentence_transformers:zf6jipgbyom3denz                                     running     google-cloud-us-central-1
mistralai-mistral-7-b-instruct-v-0-2-service-cld5  2024-02-20 16:40:16  mistralai--mistral-7b-instruct-v0.2-service:2024-02-03                     running     google-cloud-us-central-1
summarization                                      2024-02-20 09:27:52  summarization:ghfvclwp2kwm5e56                                             running     aws-ca-1
control-net-gtb6                                   2024-02-20 01:53:29  control_net:cpvweqwbsgjswpmu                                               terminated  google-cloud-us-central-1
latent-consistency-4hno                            2024-02-19 03:02:34  latent_consistency:p3ltylgo2kxbwv6m                                        terminated  google-cloud-us-central-1

To retrieve details about a specific Deployment:

Choose one of the following commands as needed.

bentoml deployment get <deployment-name>

# To output the details in JSON
bentoml deployment get <deployment-name> -o json

# To output the details in YAML (Default)
bentoml deployment get <deployment-name> -o yaml

Expected output in YAML:

name: summarization
bento: summarization:ghfvclwp2kwm5e56
cluster: aws-ca-1
endpoint_urls:
- https://summarization-test--aws-ca-1.mt1.bentoml.ai
admin_console: https://test.cloud.bentoml.com/deployments/summarization/access?cluster=aws-ca-1&namespace=test--aws-ca-1
created_at: '2024-02-20 09:27:52'
created_by: bentoml-user
config:
  envs: []
  services:
    Summarization:
      instance_type: cpu.2
      scaling:
        min_replicas: 1
        max_replicas: 2
      envs: []
      deployment_strategy: Recreate
      extras: {}
      config_overrides:
        traffic:
          timeout: 10
status:
  status: running
  created_at: '2024-02-20 09:27:52'
  updated_at: '2024-02-21 05:46:18'

To retrieve details about a Deployment with the Python API:

import bentoml

dep = bentoml.deployment.get(name="deploy-1")
print(dep.to_dict())  # To output the details in JSON
print(dep.to_yaml())  # To output the details in YAML

Expected output in JSON:

{
 "name": "deploy-1",
 "bento": "summarization:5vsa3ywqsoefgl7l",
 "cluster": "aws-ca-1",
 "endpoint_urls": [
   "https://deploy-1-test--aws-ca-1.mt1.bentoml.ai"
 ],
 "admin_console": "https://test.cloud.bentoml.com/deployments/deploy-1/access?cluster=aws-ca-1&namespace=test--aws-ca-1",
 "created_at": "2024-03-01 05:00:19",
 "created_by": "bentoml-user",
 "config": {
   "envs": [],
   "services": {
     "Summarization": {
       "instance_type": "cpu.2",
       "scaling": {
         "min_replicas": 1,
         "max_replicas": 1
       },
       "envs": [],
       "deployment_strategy": "Recreate",
       "extras": {},
       "config_overrides": {
         "traffic": {
           "timeout": 10
         }
       }
     }
   }
 },
 "status": {
   "status": "running",
   "created_at": "2024-03-01 05:00:19",
   "updated_at": "2024-03-06 06:22:53"
 }
}
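
As a minimal sketch of working with this output, the snippet below reads a few fields from a dictionary shaped like the sample above. The `details` dict is a trimmed copy of that sample, and `summarize` is a hypothetical helper written for this illustration, not part of BentoML. With a real Deployment, `dep.to_dict()` would supply the dictionary instead.

```python
# Trimmed copy of the sample JSON above, written as a Python dict.
details = {
    "name": "deploy-1",
    "bento": "summarization:5vsa3ywqsoefgl7l",
    "endpoint_urls": ["https://deploy-1-test--aws-ca-1.mt1.bentoml.ai"],
    "config": {
        "services": {
            "Summarization": {
                "instance_type": "cpu.2",
                "scaling": {"min_replicas": 1, "max_replicas": 1},
            }
        }
    },
    "status": {"status": "running"},
}


def summarize(details):
    """Hypothetical helper: collect commonly needed fields into one small dict."""
    services = details["config"]["services"]
    urls = details["endpoint_urls"]
    return {
        "name": details["name"],
        "endpoint": urls[0] if urls else None,
        "status": details["status"]["status"],
        # Map each Service name to its (min_replicas, max_replicas) pair.
        "replicas": {
            svc: (cfg["scaling"]["min_replicas"], cfg["scaling"]["max_replicas"])
            for svc, cfg in services.items()
        },
    }


summary = summarize(details)
print(summary["endpoint"])
print(summary["replicas"])
```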

To check the status of a Deployment:

import bentoml

dep = bentoml.deployment.get(name="deploy-1")
status = dep.get_status()
print(status.to_dict()) # Show the current status of the Deployment
# Output: {'status': 'running', 'created_at': '2024-03-01 05:00:19', 'updated_at': '2024-03-06 03:55:17'}

get_status() takes a refetch parameter that controls whether the status is refreshed automatically. It defaults to True. To disable it, call dep.get_status(refetch=False).
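
Because each call refreshes the status by default, get_status() lends itself to simple polling. The following is a minimal sketch under stated assumptions: `wait_until` is a hypothetical helper, and the status sequence is a stand-in so that the code runs without BentoCloud access ("pending" is a placeholder value, not a documented status).

```python
import time


def wait_until(read_status, target="running", timeout=300.0, interval=5.0):
    """Call read_status() repeatedly until it returns target or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        if read_status() == target:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Stand-in for a real Deployment: yields two placeholder values, then "running".
fake_statuses = iter(["pending", "pending", "running"])
reached = wait_until(lambda: next(fake_statuses), interval=0.01)
print(reached)
```

With a real Deployment, `read_status` could be `lambda: dep.get_status().to_dict()["status"]`, which relies on the default refetch=True so that every call sees fresh data.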

To get the Bento of a Deployment:

import bentoml

dep = bentoml.deployment.get(name="deploy-1")
bento = dep.get_bento()
print(bento) # Show the Bento of the Deployment
# Output: summarization:5vsa3ywqsoefgl7l

get_bento() takes a refetch parameter that controls whether the Bento information is refreshed automatically. It defaults to True. To disable it, call dep.get_bento(refetch=False).

To get the configuration details:

import bentoml

dep = bentoml.deployment.get(name="deploy-1")
config = dep.get_config()
print(config.to_dict()) # Show the Deployment's configuration details in JSON
print(config.to_yaml()) # Show the Deployment's configuration details in YAML

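Note

The output is the same as the config value in the example output above.

get_config() takes a refetch parameter that controls whether the configuration data is refreshed automatically. It defaults to True. To disable it, call dep.get_config(refetch=False).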

Update

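Updating a Deployment is essentially a patch operation. When you run an update command, it modifies only the fields that the command explicitly includes. All other existing fields and configurations of the Deployment stay unchanged. This is useful for making incremental changes without redefining the entire configuration.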
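
The patch behavior can be modeled locally with plain dictionaries. This is an illustrative sketch only: `patch` is a hypothetical helper, the field names are borrowed from the sample config output on this page, and the assumption that nested sections merge key by key is ours rather than a statement about how BentoCloud implements updates.

```python
import copy


def patch(existing, changes):
    """Return a copy of existing in which only the keys present in changes differ."""
    result = copy.deepcopy(existing)
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = patch(result[key], value)  # merge nested sections
        else:
            result[key] = copy.deepcopy(value)
    return result


current = {
    "instance_type": "cpu.2",
    "scaling": {"min_replicas": 1, "max_replicas": 2},
    "deployment_strategy": "Recreate",
}

# Only max_replicas is named, so every other field keeps its value.
updated = patch(current, {"scaling": {"max_replicas": 5}})
print(updated)
```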

To update specific parameters of a single-Service Deployment:

# Add the parameter name flag
bentoml deployment update <deployment-name> --scaling-min 1
bentoml deployment update <deployment-name> --scaling-max 5

import bentoml

bentoml.deployment.update(
    name="deployment-1",
    scaling_min=1,
    scaling_max=3,
    # No change to unspecified parameters
)

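You can also update the Deployment configuration from a separate file that contains only the fields you want to change. This is useful when a Deployment has multiple BentoML Services.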

bentoml deployment update <deployment-name> -f patch.yaml

import bentoml

bentoml.deployment.update(name="deployment-1", config_file="patch.yaml")
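
For illustration, the snippet below writes one possible patch.yaml. The layout is an assumption: the field names mirror the sample config output earlier on this page, the Service name Summarization and the value 3 are placeholders, and `write_patch` is a hypothetical helper. Check the file format against the BentoCloud configuration reference before relying on it.

```python
import tempfile
from pathlib import Path

# Assumed layout: only the field to change appears, nested under its Service name.
PATCH_YAML = """\
services:
  Summarization:
    scaling:
      max_replicas: 3
"""


def write_patch(directory, text=PATCH_YAML):
    """Write the patch text to <directory>/patch.yaml and return the file path."""
    path = Path(directory) / "patch.yaml"
    path.write_text(text, encoding="utf-8")
    return path


workdir = tempfile.mkdtemp()  # temporary directory, used here to avoid touching a real project
patch_path = write_patch(workdir)
print(patch_path.read_text(encoding="utf-8"))
```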

To roll out a Deployment:

# Use the Bento name
bentoml deployment update <deployment-name> --bento bento_name:version

# Use the project directory
bentoml deployment update <deployment-name> --bento ./project/directory

import bentoml

# Use the Bento name
bentoml.deployment.update(name="deployment-1", bento="bento_name:version")

# Use the project directory
bentoml.deployment.update(name="deployment-1", bento="./project/directory")

Apply

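The apply operation is a comprehensive way to manage Deployments: it creates or updates a Deployment based on the specification you provide. It works as follows: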

  • If a Deployment with the given name does not exist, apply creates a new Deployment based on the specification you provide.

  • If a Deployment with the given name already exists, apply updates it to match the provided specification exactly.

Differences between apply and update

  • Update (patch only): Makes minimal changes, updating only the fields you specify.

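  • Apply (overwrite): Considers the entire configuration. Fields that you do not specify may be reset to their default values, or removed if they are absent from the applied configuration. If the Deployment does not exist, applying the configuration creates it.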
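
The contrast between the two operations can be sketched with a local model. Everything here is hypothetical and exists only for illustration: the DEFAULTS values, the in-memory `store`, and the helpers `merge`, `update_fields`, and `apply_spec`. None of it is BentoML API.

```python
import copy

# Invented defaults for this sketch; real defaults may differ.
DEFAULTS = {"instance_type": "cpu.1", "scaling": {"min_replicas": 1, "max_replicas": 1}}


def merge(base, changes):
    """Recursively overlay changes on a copy of base."""
    result = copy.deepcopy(base)
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge(result[key], value)
        else:
            result[key] = copy.deepcopy(value)
    return result


def update_fields(store, name, changes):
    """Patch: needs an existing entry and touches only the given fields."""
    store[name] = merge(store[name], changes)


def apply_spec(store, name, spec):
    """Overwrite: creates the entry if missing; unspecified fields fall back to defaults."""
    store[name] = merge(DEFAULTS, spec)


store = {}
apply_spec(store, "deployment-1", {"instance_type": "cpu.2", "scaling": {"max_replicas": 4}})
update_fields(store, "deployment-1", {"scaling": {"min_replicas": 2}})
after_update = copy.deepcopy(store["deployment-1"])

# Re-applying a smaller spec resets instance_type and min_replicas to the defaults.
apply_spec(store, "deployment-1", {"scaling": {"max_replicas": 3}})
after_apply = store["deployment-1"]
print(after_update)
print(after_apply)
```

In this model the update keeps cpu.2 and raises only min_replicas, while the second apply discards both because the new specification does not mention them.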

To apply a new configuration to a Deployment, define it in a separate file and reference that file.

bentoml deployment apply <deployment-name> -f new_deployment.yaml

import bentoml

bentoml.deployment.apply(name="deployment-1", config_file="deployment.yaml")

Terminate

Terminating a Deployment stops it so that it no longer incurs any cost. You can still restore a Deployment after it is terminated.

To terminate a Deployment:

bentoml deployment terminate <deployment-name>

import bentoml

bentoml.deployment.terminate(name="deployment-1")

Delete

You can delete a Deployment if you no longer need it. To delete a Deployment:

bentoml deployment delete <deployment-name>

import bentoml

bentoml.deployment.delete(name="deployment-1")

Warning

Exercise caution when deleting a Deployment. This action is irreversible.