Vision Language Models for Robotics Open Class
What you will learn in this Open Class
Vision language models (VLMs) are AI models that combine computer vision and natural language processing (NLP). They are designed to interpret images or videos and, based on that visual content, generate descriptions, answer questions, or perform other tasks.
In this Open Class, we will review the key aspects of VLMs and how they can be applied to robotics.
You will learn:
- What Vision Language Models are
- How to use them in a robotics application
Robot Used: RT-2