OnSpecta has worked with market leaders to deliver performance acceleration solutions for the cloud and the edge. Some sample solutions are described below.
Sample Cloud Solution: Oracle Cloud Inference Service powered by Ampere Altra CPUs
Typical AI workloads, such as object detection, video processing, and recommendation engines, are compute-intensive and can significantly increase cloud operating costs. Customers want to achieve the lowest possible cost for their inference instances without writing complex optimization code.
The Ampere A1 compute platform is a strong fit for AI inference workloads because its Arm-based architecture delivers superior performance per watt, resulting in a meaningfully lower total cost of ownership (TCO) than alternatives. OnSpecta's DLS improves inference performance by up to 10x and provides ready-to-use optimization and acceleration on Arm-based servers deployed on the Oracle Cloud.
Edge Solutions: Seamless deployment of AI models on Arm SoCs
To reduce latency and bandwidth consumption, many AI models need to be deployed on edge devices such as smartphones, cameras, or car sensors. Edge deployments come with constraints such as limited compute power and battery life, as well as the need to support a wide variety of Arm SoCs combining CPUs with other types of processors. These constraints significantly complicate the optimization and deployment of AI models on edge devices.
OnSpecta has worked with several market leaders to deliver performance acceleration, a reduced memory footprint, and seamless deployment of AI models on Arm SoCs.