Robotics AI Suite

Description

Robotics AI Suite is a preview collection of robotics applications, libraries, samples, and benchmarking tools to help you build solutions faster. It includes models and pipelines optimized with the OpenVINO™ toolkit for accelerated performance on Intel® CPUs, integrated GPUs, and NPUs. Refer to the detailed user guide and documentation.

Collections

Collections organize workflows and capabilities for three robot categories: Autonomous Mobile Robots (AMRs), Humanoid Imitation Learning, and Stationary Robot Vision & Control. Each collection brings together libraries for core robotics workloads, robotics control recipes, and virtualization or application management, along with Robot Operating System 2 (ROS 2) integration points, supported sensor profiles, and repeatable benchmarking. Each collection also includes OpenVINO™ toolkit–optimized models spanning computer vision, large language models (LLMs), and vision-language-action (VLA), accelerating inference on Intel® CPUs, integrated GPUs, and NPUs and helping teams evaluate, assemble, and scale solutions faster.
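The repeatable benchmarking mentioned above typically reduces to measuring per-inference latency and the throughput derived from it on a given device. A rough, library-agnostic sketch of that pattern (this is an illustration, not the suite's actual benchmarking tool; the callable passed in is a stand-in for one model invocation):

```python
import time
import statistics

def benchmark(fn, warmup=5, iters=50):
    """Time a callable and report latency percentiles and throughput.

    `fn` stands in for a single inference call (for example, one
    invocation of a compiled OpenVINO model); here it is a placeholder.
    """
    for _ in range(warmup):          # warm caches before timing
        fn()
    samples = []
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - t0)
    samples.sort()
    return {
        "mean_ms": statistics.mean(samples) * 1e3,
        "p90_ms": samples[int(0.9 * (len(samples) - 1))] * 1e3,
        "fps": 1.0 / statistics.mean(samples),
    }

# Dummy workload standing in for model inference.
stats = benchmark(lambda: sum(i * i for i in range(10_000)))
```

Reporting a tail percentile alongside the mean matters for robotics, where control loops care about worst-case latency, not just average throughput.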

Humanoid - Imitation Learning:

| Application | Description |
| --- | --- |
| Diffusion Policy (OpenVINO Toolkit) | Diffusion Policy implementation optimized with the OpenVINO toolkit |
| Imitation Learning - ACT | Imitation learning pipeline using the Action Chunking with Transformers (ACT) algorithm to train and evaluate in simulated or real robot environments with Intel® optimizations |
| Improved 3D Diffusion Policy (OpenVINO Toolkit) | Improved 3D Diffusion Policy implementation optimized with the OpenVINO toolkit |
| LLM Robotics Demo | Step-by-step guide to setting up a real-time system that controls a JAKA robot arm with movement commands generated by an LLM |
| Robotics Diffusion Transformer (OpenVINO Toolkit) | Robotics Diffusion Transformer implementation optimized with the OpenVINO toolkit |
| VSLAM: ORB-SLAM3 | A popular real-time feature-based SLAM library that performs Visual, Visual-Inertial, and Multi-Map SLAM with monocular, stereo, and RGB-D cameras, using pin-hole and fish-eye lens models |
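ACT, listed above, predicts a chunk of future actions at every control step, so several overlapping chunks cover the same timestep; ACT blends them with exponentially decaying weights (temporal ensembling). A simplified one-dimensional sketch of that blending, with toy action values and an arbitrarily chosen decay constant `k`:

```python
import math

def temporal_ensemble(chunks, t, k=0.01):
    """Blend all chunk predictions that cover timestep t.

    chunks: list of (start_step, [a_0, a_1, ...]) action chunks, where a
    chunk predicted at start_step covers steps
    start_step .. start_step + len(actions) - 1.
    Predictions made longer ago receive weight exp(-k * age), in the
    spirit of ACT's temporal ensembling (k is a tunable constant).
    """
    weights, actions = [], []
    for start, acts in chunks:
        offset = t - start
        if 0 <= offset < len(acts):
            weights.append(math.exp(-k * offset))  # older prediction -> smaller weight
            actions.append(acts[offset])
    total = sum(weights)
    return sum(w * a for w, a in zip(weights, actions)) / total

# Two overlapping chunks both predict an action for t=2.
chunks = [(0, [0.0, 0.1, 0.2, 0.3]), (2, [0.25, 0.35])]
a = temporal_ensemble(chunks, t=2)  # weighted blend of 0.2 and 0.25
```

With `k=0` this degenerates to a plain average of the overlapping predictions; larger `k` trusts the freshest chunk more.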

Autonomous Mobile Robot:

| Algorithm | Description |
| --- | --- |
| ADBScan | ADBSCAN (Adaptive DBSCAN) is an Intel-patented, highly adaptive and scalable object detection and localization (clustering) algorithm, tested successfully to detect objects at all ranges with 2D Lidar, 3D Lidar, and Intel® RealSense™ depth cameras |
| Collaborative-SLAM | A collaborative visual SLAM example compiled natively for both Intel® Core™ and Intel® Atom® processor-based systems; GPU acceleration can additionally be enabled on selected Intel® Core™ processor-based systems |
| Fastmapping | |
| GroundFloor Segmentation | Showcases an Intel® algorithm for segmenting depth sensor data, compatible with 3D LiDAR or Intel® RealSense™ camera inputs |
| ITS-Planner | The Intelligent Sampling and Two-Way Search (ITS) global path planner is an Intel-patented algorithm based on two-way path planning and intelligent sampling, reducing compute time by roughly 20x-30x on a 1000-node map compared with the A* search algorithm |
| Multi-Camera-Demo | Demonstrates a multi-camera use case using an Axiomtek ROBOX500 ROS 2 AMR controller and four Intel® RealSense™ D457 depth cameras |
| Object Detection | An example of using the ROS 2 node with the OpenVINO toolkit, outlining the steps for installing the node and running the object detection model |
| Simulations | Tutorials on using ROS 2 simulations with the Autonomous Mobile Robot; you can test robot sensing and navigation in these simulated environments |
| Wandering | A ROS 2 sample application that can be combined with different SLAM algorithms and the ROS 2 navigation stack to move a robot around an unknown environment, with the goal of building a navigational map of that environment |
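ADBSCAN builds on DBSCAN's density-based clustering, adapting its `eps`/`min_pts` parameters to sensor range because Lidar point density falls off with distance. The fixed-parameter core it extends can be sketched in a few lines over 2-D points (an intuition aid only, not the patented adaptive variant):

```python
import math

def dbscan(points, eps=0.5, min_pts=3):
    """Minimal DBSCAN over 2-D points; returns one label per point
    (-1 = noise). ADBSCAN's key change, not shown here, is making
    eps and min_pts functions of range so that distant, sparser
    Lidar returns still form clusters."""
    def neighbors(i):
        px, py = points[i]
        return [j for j, (qx, qy) in enumerate(points)
                if math.hypot(px - qx, py - qy) <= eps]

    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1                 # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        seeds = list(nbrs)
        while seeds:                       # expand the cluster from core points
            j = seeds.pop()
            if labels[j] == -1:
                labels[j] = cluster        # noise reachable from a core point becomes border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:     # j is itself a core point
                seeds.extend(j_nbrs)
    return labels

# Two well-separated blobs plus one outlier.
pts = [(0, 0), (0.1, 0), (0, 0.1), (5, 5), (5.1, 5), (5, 5.1), (10, 10)]
labels = dbscan(pts, eps=0.3, min_pts=3)  # two clusters, outlier labeled -1
```

The brute-force neighbor search is O(n²); real implementations use a spatial index, and an adaptive variant would recompute `eps` per point from its distance to the sensor.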

Stationary Robot Vision & Control:

| Application | Description |
| --- | --- |
| Stationary Robot Vision & Control | A robotic software framework aimed at tackling pick-and-place and track-and-place industrial problems; it is under active development and therefore released at pre-release quality |

OpenVINO™ Toolkit-Optimized Model Algorithms:

| Algorithm | Description |
| --- | --- |
| YOLOv8 | CNN-based object detection |
| YOLOv12 | CNN-based object detection |
| MobileNetV2 | CNN-based object detection |
| SAM | Transformer-based segmentation |
| SAM2 | Extends SAM to video segmentation and object tracking with cross-attention to memory |
| FastSAM | Lightweight substitute for SAM |
| MobileSAM | Lightweight substitute for SAM (same model architecture as SAM; refer to the OpenVINO toolkit and Segment Anything Model (SAM) tutorials for model export and application) |
| U-Net | CNN-based segmentation network, also used as a diffusion-model backbone |
| DETR | Transformer-based object detection |
| Grounding DINO | Transformer-based (DETR-style) object detection |
| CLIP | Transformer-based image-text model used for zero-shot image classification |
| Qwen2.5-VL | Multimodal large language model |
| Whisper | Automatic speech recognition |
| FunASR | Automatic speech recognition |
| Action Chunking with Transformers (ACT) | An end-to-end imitation learning model designed for fine manipulation tasks in robotics |
| Visual Servoing (CNS) | A technique that uses feedback extracted from a vision sensor to control robot motion |
| Diffusion Policy | Learns the score (gradient of the log action distribution) and optimizes actions at inference through stochastic Langevin dynamics steps, providing a stable and efficient way to find optimal actions |
| Improved 3D Diffusion Policy (iDP3) | Builds on the original Diffusion Policy framework, enhancing its capabilities for 3D robotic manipulation tasks |
| Robotics Diffusion Transformer (RDT-1B) | Robotics Diffusion Transformer with 1.2B parameters; a diffusion-based foundation model for robotic manipulation |
| Feature Extraction Model: SuperPoint | A self-supervised framework for interest-point detection and description in images, suitable for many multiple-view geometry problems in computer vision |
| Feature Tracking Model: LightGlue | A model designed for efficient and accurate feature matching in computer vision tasks |
| Bird's Eye View Perception: Fast-BEV | Produces a Bird's Eye View (BEV) representation of a scene for a comprehensive understanding of spatial layout and relationships between objects |
| Monocular Depth Estimation: Depth Anything V2 | Leverages deep learning to infer 3D depth information from 2D images |
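For intuition on the Diffusion Policy entry above: Langevin dynamics repeatedly nudges a noisy action sample along the learned score (the gradient of the log action density) while injecting noise, so samples concentrate on high-probability actions. A toy one-dimensional sketch using an analytic score for a standard Gaussian in place of a trained network:

```python
import math
import random

def langevin_sample(score, x0, step=0.01, n_steps=500, rng=None):
    """Unadjusted Langevin dynamics:
        x <- x + step * score(x) + sqrt(2 * step) * noise
    `score` is the gradient of log p(x); in Diffusion Policy it would
    be a trained network over actions, here an analytic stand-in."""
    rng = rng or random.Random(0)
    x = x0
    for _ in range(n_steps):
        x = x + step * score(x) + math.sqrt(2 * step) * rng.gauss(0, 1)
    return x

# Score of a standard normal N(0, 1): d/dx log p(x) = -x.
# Starting far away at x0=5, samples drift toward the distribution.
samples = [langevin_sample(lambda x: -x, x0=5.0, rng=random.Random(i))
           for i in range(200)]
mean = sum(samples) / len(samples)  # close to 0, the target mean
```

The injected noise is what keeps this a sampler rather than plain gradient ascent; dropping it would collapse every run to the density's mode.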