CVE-2023-43654: TorchServe v0.8.2 Release Notes · pytorch/serve
TorchServe is a tool for serving and scaling PyTorch models in production. TorchServe's default configuration lacks proper input validation, enabling third parties to invoke remote HTTP download requests and write files to disk. This issue could be exploited to compromise system integrity and sensitive data, and is present in versions 0.1.0 to 0.8.1. A user is able to load a model of their choice from any URL. The user of TorchServe is responsible for configuring both the allowed_urls and specifying the model URL to be used. A pull request to warn the user when the default value for allowed_urls is used has been merged in PR #2534. TorchServe release 0.8.2 includes this change. Users are advised to upgrade. There are no known workarounds for this issue.
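As a mitigation, `allowed_urls` can be set in `config.properties` to a comma-separated list of regular expressions that model archive URLs must match; the hostnames below are illustrative placeholders, not recommendations:

```properties
# config.properties: only permit model archives from these sources
allowed_urls=https://s3\.amazonaws\.com/.*,https://models\.example\.com/.*
```

Leaving this key at its default permits downloads from any URL, which is the condition the PR #2534 warning flags.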
This is the release of TorchServe v0.8.2.
Security
- Updated snakeyaml version to v2 #2523 @nskool
- Added warning about model allowed urls when default value is applied #2534 @namannandan
Custom metrics backwards compatibility
- `add_metric` is now backwards compatible with versions < v0.6.1, but the default metric type is inferred to be COUNTER. If the metric is of a different type, it needs to be specified in the call to `add_metric`, for example: `metrics.add_metric(name='GenericMetric', value=10, unit='count', dimensions=[...], metric_type=MetricTypes.GAUGE)`
- When upgrading from versions v0.6.1 - v0.8.1 to v0.8.2, replace calls to `add_metric` with `add_metric_to_cache`.
- All custom metrics updated in the custom handler need to be included in the metrics configuration file for them to be emitted by TorchServe; see the custom metrics example.
- A detailed upgrade guide is included in the metrics documentation.
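The backward-compatibility rule above can be sketched with a stand-in for the handler's `context.metrics` object. `RecordingMetrics` and this local `MetricTypes` enum are hypothetical stand-ins for illustration only; in a real handler the metrics object is supplied by TorchServe and `MetricTypes` ships with it:

```python
from enum import Enum

class MetricTypes(Enum):
    # Mirrors the metric types named in these release notes
    COUNTER = "counter"
    GAUGE = "gauge"
    HISTOGRAM = "histogram"

class RecordingMetrics:
    """Hypothetical stand-in for context.metrics in a custom handler."""
    def __init__(self):
        self.calls = []

    def add_metric(self, name, value, unit, dimensions=None,
                   metric_type=MetricTypes.COUNTER):
        # As of v0.8.2, the type is inferred to be COUNTER when omitted
        self.calls.append((name, value, unit, metric_type))

metrics = RecordingMetrics()
# Pre-v0.6.1-style call: metric_type defaults to COUNTER
metrics.add_metric(name="InferenceCount", value=1, unit="count")
# A non-counter metric must state its type explicitly
metrics.add_metric(name="QueueDepth", value=10, unit="count",
                   metric_type=MetricTypes.GAUGE)
```

The second call shows the shape documented above for non-COUNTER metrics; omitting `metric_type` there would silently record a counter instead of a gauge.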
New Features
- Supported KServe gRPC v2 #2176 @jagadeeshi2i
- Supported K8S session affinity #2519 @jagadeeshi2i
New Examples
- Example: Llama v2 70B chat using HuggingFace Accelerate #2494 @lxning @HamidShojanazeri @agunapal
- Large model example: OPT-6.7B on Inferentia2 #2399 @namannandan
  - This example demonstrates how NeuronX compiles the model, detects Neuron core availability, and runs inference.
- DeepSpeed deferred init with OPT-30B #2419 @agunapal
  - This PR adds deferred model initialization to the OPT-30B example by leveraging the new DeepSpeed version, significantly reducing model loading latency.
- Torch-TensorRT example #2483 @agunapal
  - This PR uses ResNet-50 as an example to demonstrate Torch-TensorRT.
- K8s MNIST example using minikube #2323 @agunapal
  - This example shows how to use a pre-trained custom MNIST model to perform real-time digit recognition via K8s.
- Example for custom metrics #2516 @namannandan
- Example for object detection with the Ultralytics YOLOv8 model #2508 @agunapal
Improvements
- Migrated publishing torchserve-plugins-sdk from Maven JCenter to Maven Central #2429 #2422 @namannandan
- Fixed download model from S3 presigned URL #2416 @namannandan
- Enabled opt-6.7b benchmark on inf2 #2400 @namannandan
- Added job Queue Status in describe API #2464 @namannandan
- Added add_metric API to be backward compatible #2525 @namannandan
- Upgraded nvidia base image version to nvidia/cuda:11.7.1-base-ubuntu20.04 in GPU docker image #2442 @agunapal
- Added Docker regression tests in CI #2403 @agunapal
- Updated release version #2533 @agunapal
- Upgraded default cuda to 11.8 in docker image build #2489 @agunapal
- Updated docker nightly build parameters #2493 @agunapal
- Added path to save ab benchmark profile graph in benchmark report #2451 @agunapal
- Added profile information for benchmark #2470 @agunapal
- Fixed manifest null in base handler #2488 @pedrogengo
- Fixed batching input in DALI example #2455 @jagadeeshi2i
- Fixed metrics for K8s setup #2473 @jagadeeshi2i
- Fixed kserve storage optional package in Dockerfile #2537 @jagadeeshi2i
- Fixed typo in ModelConfig.java comments #2506 @arnavmehta7
- Fixed netty direct buffer issues in torchserve-plugins-sdk #2511 @marrodion
- Fixed typo in ts/context.py comments #2536 @ethankim00
- Fixed Server error when gRPC client close connection unexpectedly #2420 @lxning
Documentation
- Updated large model documentation #2468 @sekyondaMeta
- Updated Sphinx landing page and requirements #2428 #2520 @sekyondaMeta
- Updated G analytics in docs #2449 @sekyondaMeta
- Added performance checklist in docs #2526 @sekyondaMeta
- Added performance guidance in FAQ #2524 @sekyondaMeta
- Added instruction for embedding handler examples #2431 @sidharthrajaram
- Updated PyPi description #2445 @bryanwweber @agunapal
- Updated Better Transformer README #2474 @HamidShojanazeri
- Fixed typo in microbatching README #2484 @InakiRaba91
- Fixed broken link in kubernetes AKS README #2490 @agunapal
- Fixed lint error #2497 @ankithagunapal
- Updated instructions for building GPU docker image for ONNX #2435 @agunapal
Platform Support
Ubuntu 16.04, Ubuntu 18.04, Ubuntu 20.04, macOS 10.14+, Windows 10 Pro, Windows Server 2019, Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04). TorchServe now requires Python 3.8 or later, and JDK 17.
GPU Support
Torch 2.0.1 + CUDA 11.7, 11.8
Torch 2.0.0 + CUDA 11.7, 11.8
Torch 1.13 + CUDA 11.7, 11.8
Torch 1.11 + CUDA 10.2, 11.3, 11.6
Torch 1.9.0 + CUDA 11.1
Torch 1.8.1 + CUDA 9.2
Related news
The PyTorch model server contains multiple vulnerabilities that can be chained together to allow an unauthenticated remote attacker to execute arbitrary Java code. The first vulnerability is that the management interface is bound to all IP addresses and not just the loopback interface, as the documentation suggests. The second vulnerability (CVE-2023-43654) allows attackers with access to the management interface to register MAR model files from arbitrary servers. The third vulnerability is that when a MAR file is loaded, it can contain a YAML configuration file that, when deserialized by snakeyaml, can lead to loading an arbitrary Java class.
Cybersecurity researchers have disclosed multiple critical security flaws in the TorchServe tool for serving and scaling PyTorch models that could be chained to achieve remote code execution on affected systems. Israel-based runtime application security company Oligo, which made the discovery, has coined the vulnerabilities ShellTorch. "These vulnerabilities [...] can lead to a full chain Remote [...]
## Impact

**Remote Server-Side Request Forgery (SSRF)**

**Issue**: TorchServe default configuration lacks proper input validation, enabling third parties to invoke remote HTTP download requests and write files to the disk. This issue could be taken advantage of to compromise the integrity of the system and sensitive data. This issue is present in versions `0.1.0` to `0.8.1`.

**Mitigation**: The user is able to load the model of their choice from any URL that they would like to use. The user of TorchServe is responsible for configuring both the [allowed_urls](https://github.com/pytorch/serve/blob/b3eced56b4d9d5d3b8597aa506a0bcf954d291bc/docs/configuration.md?plain=1#L296) and specifying the model URL to be used. A pull request to warn the user when the default value for `allowed_urls` is used has been merged - https://github.com/pytorch/serve/pull/2534. TorchServe release `0.8.2` includes this change.

## Patches

TorchServe release 0.8.2 includes fixes to address the previou...