
Akamai unveils cloud service reducing AI inference costs 86%

Fri, 28th Mar 2025

Akamai has announced the launch of Akamai Cloud Inference, a new service designed to help organisations turn predictive and large language models (LLMs) into actionable outcomes more efficiently and effectively.

The new service runs on Akamai Cloud, the company's highly distributed cloud platform, and aims to overcome the growing constraints of centralised cloud models. Akamai says businesses using Akamai Cloud Inference can cut costs for AI inference and related workloads by up to 86% compared with traditional hyperscaler infrastructure.

The company also emphasised the performance gains the solution provides, stating it offers three times better throughput and up to 60% lower latency than mainstream hyperscaler infrastructure. Adam Karon, Chief Operating Officer and General Manager, Cloud Technology Group at Akamai, highlighted the platform's ability to bring AI data closer to users and devices, a challenge legacy cloud models often struggle with.

"Getting AI data closer to users and devices is hard, and it's where legacy clouds struggle," said Karon. "While the heavy lifting of training LLMs will continue to happen in big hyperscale data centres, the actionable work of inferencing will take place at the edge where the platform Akamai has built over the past two and a half decades becomes vital for the future of AI and sets us apart from every other cloud provider in the market."

Akamai's toolset for platform engineers and developers provides the building blocks for running AI applications and data-intensive workloads closer to end users. The suite spans compute options, data management, and containerisation, all running on a globally distributed infrastructure.

The compute capabilities within Akamai Cloud include diverse resources like traditional CPUs, GPUs, and specialised ASIC VPUs, alongside integration with NVIDIA's AI Enterprise ecosystem. This integration is targeted at optimising AI inference performance on NVIDIA GPUs.
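
The announcement does not detail which NVIDIA components sit behind this integration. Purely as an illustrative sketch, NVIDIA's AI Enterprise ecosystem includes the Triton Inference Server, and a GPU-hosted model served that way can be queried from Python roughly as follows; the server URL, model name, and tensor names here are assumptions, not Akamai-specific values.

```python
import numpy as np
import tritonclient.http as httpclient

# Connect to a Triton Inference Server endpoint (URL is a placeholder).
client = httpclient.InferenceServerClient(url="localhost:8000")

# Build a single request; the model name and tensor names ("INPUT0"/"OUTPUT0")
# are hypothetical and depend entirely on how the model is deployed.
payload = np.random.rand(1, 16).astype(np.float32)
inputs = [httpclient.InferInput("INPUT0", list(payload.shape), "FP32")]
inputs[0].set_data_from_numpy(payload)
outputs = [httpclient.InferRequestedOutput("OUTPUT0")]

# Run inference on the GPU-backed model and read the result back as a NumPy array.
result = client.infer(model_name="example_model", inputs=inputs, outputs=outputs)
print(result.as_numpy("OUTPUT0"))
```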

Akamai's data management capabilities are supported by a partnership with VAST Data, which provides the real-time data access and management vital for AI workloads. Akamai also offers scalable object storage and integrations with vector database providers such as Aiven and Milvus to support AI data retrieval and management.
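
Milvus is an open-source vector database, and retrieval of this kind typically underpins the data-access step of AI inference, for example fetching context to ground an LLM prompt. The sketch below uses the pymilvus client; the endpoint, collection name, and embedding dimension are illustrative assumptions, not details published by Akamai.

```python
from pymilvus import MilvusClient

# Connect to a Milvus instance (the endpoint is a placeholder).
client = MilvusClient(uri="http://localhost:19530")

# Create a small collection of document embeddings (dimension is illustrative).
client.create_collection(collection_name="docs", dimension=768)

# Insert a record: an id, an embedding vector, and optional metadata fields.
client.insert(
    collection_name="docs",
    data=[{"id": 1, "vector": [0.1] * 768, "text": "example passage"}],
)

# Retrieve the nearest neighbours for a query embedding, e.g. to ground an
# LLM prompt with relevant context before inference.
results = client.search(
    collection_name="docs",
    data=[[0.1] * 768],  # query embedding(s)
    limit=3,
    output_fields=["text"],
)
print(results)
```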

Containerisation is another crucial part of the service, delivered through Kubernetes, which enables efficient scaling, improved resilience, and better performance and cost optimisation. The service is underpinned by Linode Kubernetes Engine-Enterprise, which is tailored for large-scale enterprise applications.
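
Akamai has not published manifests or APIs for Linode Kubernetes Engine-Enterprise in this announcement, but the scaling behaviour described is standard Kubernetes. As a minimal sketch of that general pattern, the official Python kubernetes client can resize a hypothetical inference Deployment; the deployment name and namespace below are placeholders.

```python
from kubernetes import client, config

# Load local kubeconfig; inside a cluster, config.load_incluster_config() is used instead.
config.load_kube_config()

apps = client.AppsV1Api()

# Scale a hypothetical inference Deployment to five replicas, e.g. ahead of a
# traffic spike. In practice a HorizontalPodAutoscaler would usually drive this.
apps.patch_namespaced_deployment_scale(
    name="llm-inference",
    namespace="default",
    body={"spec": {"replicas": 5}},
)
```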

Additionally, Akamai Cloud Inference includes edge compute capabilities through WebAssembly (Wasm), allowing developers to execute LLM inferencing directly from serverless applications and supporting latency-sensitive deployments.
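
The announcement does not name a specific Wasm runtime or module format. As a hedged sketch of the general pattern of invoking compiled inference code from a host application, the wasmtime Python bindings can load and call a module; the file name and exported function below are hypothetical.

```python
from wasmtime import Engine, Store, Module, Instance

engine = Engine()
store = Store(engine)

# Load a hypothetical compiled inference module; "classify.wasm" and its
# exported "classify" function are placeholders, not an Akamai artefact.
module = Module.from_file(engine, "classify.wasm")
instance = Instance(store, module, [])

# Call the exported function, e.g. scoring a single pre-tokenised input id.
classify = instance.exports(store)["classify"]
print(classify(store, 42))
```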

The move towards a distributed cloud model reflects growing demand for AI solutions that manage and leverage data closer to where it is generated. This shift is reshaping infrastructure needs as businesses evolve from training LLMs to using AI to make faster, more intelligent decisions and create more personalised experiences.

Karon likened training an LLM to creating a map, and inference to real-time navigation with GPS. "Training an LLM is like creating a map, requiring you to gather data, analyse terrain, and plot routes. It's slow and resource-intensive, but once built, it's highly useful. AI inference is like using a GPS, instantly applying that knowledge, recalculating in real-time, and adapting to changes to get you where you need to go," he explained. "Inference is the next frontier for AI."

This shift underscores the importance of distributed cloud and edge architectures, which provide the real-time, actionable insights operational intelligence use cases need across dispersed environments. Early sample applications include in-car voice assistance, AI-driven agricultural management, and virtual shopping experiences, pointing to the breadth of use cases Akamai Cloud Inference could serve.
