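Manage Deployments¶
After you deploy a Bento on BentoCloud, you can manage it with the BentoML CLI or API. Available operations include viewing, updating, applying, terminating, and deleting Deployments.
View¶
List all Deployments in your BentoCloud account: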
bentoml deployment list
Expected output:
Deployment                                         created_at           Bento                                                   Status      Region
sentence-transformers-f8ng                         2024-02-20 17:11:29  sentence_transformers:zf6jipgbyom3denz                  running     google-cloud-us-central-1
mistralai-mistral-7-b-instruct-v-0-2-service-cld5  2024-02-20 16:40:16  mistralai--mistral-7b-instruct-v0.2-service:2024-02-03  running     google-cloud-us-central-1
summarization                                      2024-02-20 09:27:52  summarization:ghfvclwp2kwm5e56                          running     aws-ca-1
control-net-gtb6                                   2024-02-20 01:53:29  control_net:cpvweqwbsgjswpmu                            terminated  google-cloud-us-central-1
latent-consistency-4hno                            2024-02-19 03:02:34  latent_consistency:p3ltylgo2kxbwv6m                     terminated  google-cloud-us-central-1
Get the details of a specific Deployment
Choose one of the following commands as needed:
bentoml deployment get <deployment-name>
# To output the details in JSON
bentoml deployment get <deployment-name> -o json
# To output the details in YAML (Default)
bentoml deployment get <deployment-name> -o yaml
Expected output in YAML format:
name: summarization
bento: summarization:ghfvclwp2kwm5e56
cluster: aws-ca-1
endpoint_urls:
- https://summarization-test--aws-ca-1.mt1.bentoml.ai
admin_console: https://test.cloud.bentoml.com/deployments/summarization/access?cluster=aws-ca-1&namespace=test--aws-ca-1
created_at: '2024-02-20 09:27:52'
created_by: bentoml-user
config:
  envs: []
  services:
    Summarization:
      instance_type: cpu.2
      scaling:
        min_replicas: 1
        max_replicas: 2
      envs: []
      deployment_strategy: Recreate
      extras: {}
      config_overrides:
        traffic:
          timeout: 10
status:
  status: running
  created_at: '2024-02-20 09:27:52'
  updated_at: '2024-02-21 05:46:18'
Get the details of a Deployment using the Python API:
import bentoml
dep = bentoml.deployment.get(name="deploy-1")
print(dep.to_dict()) # To output the details in JSON
print(dep.to_yaml()) # To output the details in YAML
Expected output in JSON format:
{
    "name": "deploy-1",
    "bento": "summarization:5vsa3ywqsoefgl7l",
    "cluster": "aws-ca-1",
    "endpoint_urls": [
        "https://deploy-1-test--aws-ca-1.mt1.bentoml.ai"
    ],
    "admin_console": "https://test.cloud.bentoml.com/deployments/deploy-1/access?cluster=aws-ca-1&namespace=test--aws-ca-1",
    "created_at": "2024-03-01 05:00:19",
    "created_by": "bentoml-user",
    "config": {
        "envs": [],
        "services": {
            "Summarization": {
                "instance_type": "cpu.2",
                "scaling": {
                    "min_replicas": 1,
                    "max_replicas": 1
                },
                "envs": [],
                "deployment_strategy": "Recreate",
                "extras": {},
                "config_overrides": {
                    "traffic": {
                        "timeout": 10
                    }
                }
            }
        }
    },
    "status": {
        "status": "running",
        "created_at": "2024-03-01 05:00:19",
        "updated_at": "2024-03-06 06:22:53"
    }
}
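Because to_dict() returns a plain dictionary, individual fields can be read with ordinary indexing. The sketch below works on a trimmed copy of the sample output above rather than a live Deployment, and the helper name summarize_deployment is hypothetical, not part of BentoML.

```python
# Trimmed copy of the sample output above. With a live Deployment this would
# come from bentoml.deployment.get(name="deploy-1").to_dict().
details = {
    "name": "deploy-1",
    "endpoint_urls": ["https://deploy-1-test--aws-ca-1.mt1.bentoml.ai"],
    "config": {
        "services": {
            "Summarization": {
                "instance_type": "cpu.2",
                "scaling": {"min_replicas": 1, "max_replicas": 1},
            }
        }
    },
    "status": {"status": "running"},
}


def summarize_deployment(details: dict) -> dict:
    """Hypothetical helper: collect commonly needed fields from the details dict."""
    services = details["config"]["services"]
    urls = details["endpoint_urls"]
    return {
        "name": details["name"],
        "status": details["status"]["status"],
        # Use the first endpoint URL if one is listed.
        "endpoint": urls[0] if urls else None,
        # Map each Service name to its scaling bounds.
        "scaling": {svc: cfg["scaling"] for svc, cfg in services.items()},
    }


summary = summarize_deployment(details)
print(summary["status"], summary["endpoint"])
```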
Check the status of a Deployment:
import bentoml
dep = bentoml.deployment.get(name="deploy-1")
status = dep.get_status()
print(status.to_dict()) # Show the current status of the Deployment
# Output: {'status': 'running', 'created_at': '2024-03-01 05:00:19', 'updated_at': '2024-03-06 03:55:17'}
get_status() has a refetch parameter that automatically refreshes the status. It defaults to True; use dep.get_status(refetch=False) to disable it.
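Since get_status() refetches by default, it can be called in a loop to wait for a Deployment to come up. The following is a minimal sketch, not a BentoML feature: wait_until_running, FakeStatus, and FakeDeployment are hypothetical names, the fake classes only stand in for a real Deployment so the example runs offline, and "not-running" is a placeholder value rather than a real BentoCloud status.

```python
import time


def wait_until_running(dep, timeout_s=300.0, interval_s=5.0):
    """Poll dep.get_status() until the status is 'running' or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        # Each call refetches, so the value reflects the latest known state.
        status = dep.get_status().to_dict()["status"]
        if status == "running":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


class FakeStatus:
    """Stand-in for the object returned by get_status(); only to_dict() is modeled."""

    def __init__(self, status):
        self._status = status

    def to_dict(self):
        return {"status": self._status}


class FakeDeployment:
    """Stand-in for a Deployment that reports a scripted sequence of statuses."""

    def __init__(self, statuses):
        self._statuses = list(statuses)

    def get_status(self, refetch=True):
        # Return the next scripted status; repeat the last one when the script ends.
        if len(self._statuses) > 1:
            return FakeStatus(self._statuses.pop(0))
        return FakeStatus(self._statuses[0])


dep = FakeDeployment(["not-running", "running"])
print(wait_until_running(dep, timeout_s=5.0, interval_s=0.01))
```

With a real Deployment, the object returned by bentoml.deployment.get(name="deploy-1") would be passed in place of the fake one.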
Get the Bento of a Deployment:
import bentoml
dep = bentoml.deployment.get(name="deploy-1")
bento = dep.get_bento()
print(bento) # Show the Bento of the Deployment
# Output: summarization:5vsa3ywqsoefgl7l
get_bento() has a refetch parameter that automatically refreshes the Bento information. It defaults to True; use dep.get_bento(refetch=False) to disable it.
Get the configuration details:
import bentoml
dep = bentoml.deployment.get(name="deploy-1")
config = dep.get_config()
print(config.to_dict()) # Show the Deployment's configuration details in JSON
print(config.to_yaml()) # Show the Deployment's configuration details in YAML
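Note
The output is the same as the config value in the example output above.
get_config() has a refetch parameter that automatically refreshes the configuration data. It defaults to True; use dep.get_config(refetch=False) to disable it.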
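Update¶
Updating a Deployment is essentially a patch operation: the update command modifies only the fields you explicitly include in it, and every other field and setting of the Deployment stays unchanged. This lets you make incremental changes without redefining the entire configuration.
Update specific parameters of a single-Service Deployment: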
# Add the parameter name flag
bentoml deployment update <deployment-name> --scaling-min 1
bentoml deployment update <deployment-name> --scaling-max 5
import bentoml
bentoml.deployment.update(
    name="deployment-1",
    scaling_min=1,
    scaling_max=3,
    # No change to unspecified parameters
)
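You can also update the Deployment configuration with a separate file that contains only the fields you want to change. This is useful when the Deployment has multiple BentoML Services.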
bentoml deployment update <deployment-name> -f patch.yaml
import bentoml
bentoml.deployment.update(name="deployment-1", config_file="patch.yaml")
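A patch file might look like the following sketch, which writes the file from Python so the example stays self-contained. It assumes the file mirrors the config structure shown in the YAML output earlier (services, then the Service name, then scaling); the Service name Summarization and the replica counts are illustrative only.

```python
from pathlib import Path

# Assumed layout: mirrors the `config` section of the `bentoml deployment get`
# output. Only the fields to change are listed; update leaves the rest as-is.
PATCH_YAML = """\
services:
  Summarization:
    scaling:
      min_replicas: 1
      max_replicas: 3
"""

patch_path = Path("patch.yaml")
patch_path.write_text(PATCH_YAML, encoding="utf-8")
print(patch_path.read_text(encoding="utf-8"))
```

The resulting file can then be passed to the update command or to the config_file argument shown above.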
Roll out a Deployment:
# Use the Bento name
bentoml deployment update <deployment-name> --bento bento_name:version
# Use the project directory
bentoml deployment update <deployment-name> --bento ./project/directory
import bentoml
# Use the Bento name
bentoml.deployment.update(name="deployment-1", bento="bento_name:version")
# Use the project directory
bentoml.deployment.update(name="deployment-1", bento="./project/directory")
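Apply¶
The apply operation is a comprehensive way to manage Deployments: it creates or updates a Deployment based on the specification you provide. It works as follows:
If no Deployment with the given name exists, apply creates a new Deployment from the specification.
If a Deployment with the given name already exists, apply updates it to match the specification exactly.
Differences between apply and update
Update (patch only): Makes minimal changes, updating only the fields you specify.
Apply (overwrite): Considers the entire configuration. Fields that are not specified may be reset to their default values, or removed if they are absent from the applied configuration. If the Deployment does not exist, apply creates it.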
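The difference can be pictured with plain dictionaries. The sketch below is a conceptual model only, not BentoCloud's implementation: update_like, apply_like, and deep_merge are hypothetical names, and the DEFAULTS values are invented for illustration.

```python
import copy

# Hypothetical defaults, for illustration only; real BentoCloud defaults may differ.
DEFAULTS = {"scaling": {"min_replicas": 1, "max_replicas": 1}, "envs": []}


def deep_merge(base: dict, changes: dict) -> dict:
    """Return a copy of base with changes merged in, recursing into nested dicts."""
    merged = copy.deepcopy(base)
    for key, value in changes.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


def update_like(current: dict, patch: dict) -> dict:
    """Patch semantics: fields missing from the patch keep their current values."""
    return deep_merge(current, patch)


def apply_like(current: dict, spec: dict) -> dict:
    """Overwrite semantics: the result depends only on the spec and the defaults.

    The current configuration is deliberately ignored.
    """
    return deep_merge(DEFAULTS, spec)


current = {
    "scaling": {"min_replicas": 2, "max_replicas": 5},
    "envs": [{"name": "MODE", "value": "prod"}],
}
change = {"scaling": {"max_replicas": 3}}

print("update:", update_like(current, change))
print("apply: ", apply_like(current, change))
```

With these inputs, the patch keeps min_replicas at 2 and the environment variable, while the overwrite falls back to the assumed default min_replicas of 1 and an empty envs list.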
To apply a new configuration to a Deployment, define it in a separate file and reference that file:
bentoml deployment apply <deployment_name> -f new_deployment.yaml
import bentoml
bentoml.deployment.apply(name="deployment-1", config_file="deployment.yaml")
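Terminate¶
Terminating a Deployment stops it, so it no longer incurs any charges. You can still restore the Deployment after it is terminated.
To terminate a Deployment: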
bentoml deployment terminate <deployment_name>
import bentoml
bentoml.deployment.terminate(name="deployment-1")
Delete¶
If you no longer need a Deployment, you can delete it. To delete a Deployment:
bentoml deployment delete <deployment_name>
import bentoml
bentoml.deployment.delete(name="deployment-1")
Warning
Be careful when deleting a Deployment. This action is irreversible.
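Because deletion cannot be undone, scripts may want a guard in front of it. The following is one possible sketch, assuming a convention where the caller must retype the Deployment name; delete_deployment_guarded and its delete_fn parameter are hypothetical, and the stub stands in for bentoml.deployment.delete so the example runs without touching BentoCloud.

```python
def delete_deployment_guarded(name, confirm_name, delete_fn=None):
    """Delete a Deployment only if confirm_name matches name exactly."""
    if confirm_name != name:
        raise ValueError(
            f"Confirmation {confirm_name!r} does not match {name!r}; nothing deleted."
        )
    if delete_fn is None:
        # Imported lazily so the guard itself can be exercised without BentoML.
        import bentoml

        delete_fn = bentoml.deployment.delete
    delete_fn(name=name)


# Stub used here in place of the real delete call.
deleted = []


def fake_delete(name):
    deleted.append(name)


delete_deployment_guarded("deployment-1", "deployment-1", delete_fn=fake_delete)
print(deleted)
```

Omitting delete_fn makes the helper fall back to bentoml.deployment.delete, the call shown above.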