#069 – AI on CPUs with Earl Ruby

In episode 69, Earl Ruby discusses his career highlights and his current role at Broadcom. He explains the Private AI Foundation with Intel and how it enables customers to run AI and ML workloads. The discussion then turns to choosing between CPUs and GPUs for ML workloads, debunking misconceptions about CUDA, and the future of software tools like oneAPI. Earl also provides insights into AMX and its support in vSphere for running ML workloads on CPUs. He explains the concept of quantization and how it is used to run large language models on AMX, the challenges of sizing virtual machines for those models, and the power consumption differences between GPUs and CPUs. The conversation also touches on heterogeneous clusters and workload placement, as well as the future of AMX and Intel GPUs. Finally, Earl mentions his blog articles, where he shares his insights and experiences.

Takeaways

  • The Private AI Foundation with Intel enables customers to run AI and ML workloads using Intel’s AMX instruction set and GPUs.
  • When choosing between CPUs and GPUs for ML workloads, consider factors such as use case, model complexity, and performance requirements.
  • CUDA is not the only option for writing optimized AI workloads, as Intel’s oneAPI provides an open API for working with their hardware.
  • AMX is a set of instructions backed by hardware in Intel CPUs for matrix multiplication and other matrix operations, and it is supported in vSphere for running ML workloads on CPUs.
  • Quantization is a technique used to convert higher-precision numbers into lower-bit equivalents, allowing for a smaller memory footprint and accelerated processing on AMX.
  • Sizing virtual machines for large language models can be challenging, and it is important to consider the memory footprint and CPU cores required.
  • GPUs consume more power than CPUs, and the gap matters most when the GPUs are underutilized. In those cases, CPUs can be power competitive.
  • Heterogeneous clusters can be used to ensure specific workloads land on AMX-enabled CPUs, while Kubernetes provides automatic workload placement based on hardware capabilities.
  • The future of AMX and Intel GPUs involves extensibility and integration with other GPU technologies. oneAPI allows for seamless software compatibility with new hardware.
  • AVX-512 can be used to accelerate ML workloads on older machines without AMX, but the performance boost is not as significant as with AMX or GPUs.
  • Earl Ruby shares his insights and experiences through his blog articles, where he provides solutions to unique challenges and saves others from similar frustrations.
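
To make the quantization and sizing takeaways above a bit more concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the method discussed in the episode or the one any particular framework uses: it applies simple symmetric quantization with a single scale to a handful of made-up weights, and it estimates memory for the model weights only. The function names and the 7-billion-parameter model size are hypothetical.

```python
def quantize_int8(values):
    """Map floats to integers in [-127, 127] using one shared scale.

    Assumption: symmetric, per-tensor quantization. Production toolchains
    typically use calibration data and finer-grained scales.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(v / scale))) for v in values]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float values from the quantized integers."""
    return [q * scale for q in quantized]


def weights_footprint_gib(num_params, bits_per_param):
    """Memory needed for the model weights alone, in GiB.

    This ignores activations, caches, and runtime overhead, so a real
    virtual machine needs more memory than this figure.
    """
    return num_params * bits_per_param / 8 / 2**30


# Made-up weights, for illustration only.
weights = [0.5, -1.0, 0.25, 0.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
print("quantized:", quantized)
print("restored: ", [round(r, 4) for r in restored])

# Hypothetical 7-billion-parameter model at three precisions.
for bits in (32, 16, 8):
    gib = weights_footprint_gib(7_000_000_000, bits)
    print(f"{bits}-bit weights: {gib:.1f} GiB")
```

Halving the bits per parameter halves the memory needed for the weights, which is why quantization matters when sizing a virtual machine for a large language model. The restored values differ slightly from the originals, and that loss of precision is the trade-off for the smaller footprint.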

#068 – Diving into the VMC on AWS announcements with Niels Hagoort

In episode 68, we invited Niels Hagoort back to the show to talk about the latest VMC on AWS announcements.

Topics we discussed:

  • The new M7i instance, including its use cases and specs
  • The Cloud Management Add-on for VMC on AWS and how Aria can add value
  • The differences between the M7i instance and the other instance types, and the deployment considerations

More about these announcements can be found here:
https://vmc.techzone.vmware.com/closer-look-m7i-instance-vmware-cloud-aws

#067 – Securing the world’s data with Rubrik featuring Jerry Rijnbeek

Jerry Rijnbeek, Vice President at Rubrik, discusses the history and capabilities of Rubrik’s data protection platform. He explains how Rubrik started as an appliance model, offering a converged solution for data protection. Jerry highlights Rubrik’s search capabilities, which allow users to easily find and restore files. He also discusses Rubrik’s support for protecting SaaS platforms and its recent expansion into container and Kubernetes protection. Jerry emphasizes the growing threat of ransomware and the need for organizations to focus on data security posture management (DSPM) to limit the chances of successful attacks. He explains how Rubrik’s platform helps organizations monitor and detect anomalies in data access and behavior, and how it supports recovery from ransomware attacks through mass orchestrated recovery.

Takeaways

  • Rubrik offers a converged data protection platform that simplifies backup and recovery processes.
  • Rubrik’s search capabilities make it easy to find and restore files, improving efficiency and reducing downtime.
  • Rubrik provides protection for SaaS platforms and has expanded its support to include containers and Kubernetes.
  • The rise of ransomware has made data security posture management (DSPM) crucial for organizations to limit the chances of successful attacks.
  • Rubrik’s platform includes data threat analytics and detection features to monitor and detect anomalies in data access and behavior.
  • Rubrik enables mass orchestrated recovery, allowing organizations to recover from ransomware attacks quickly and efficiently.
  • Rubrik offers multiple cloud regions and options for data recovery, providing flexibility and choice for customers.

#066 – Data Platforms for AI/ML workloads with Vaughn Stewart

In episode 66, we spoke with Vaughn Stewart on data platforms for AI and ML workloads. Vaughn is the VP of Systems Engineering at VAST Data. He provides us with an overview of the needs and requirements of a data platform for successful deployments of AI-enabled applications.

Some links to topics discussed: