Video showcase of my project:

Video showing the project working with the camera in view:

https://youtu.be/FQVAORo8usQ

Video showing the best run of the project:

https://youtube.com/shorts/0SzdhBfsPbw?feature=share

1. Objective:

The aim of this project was to understand how AI, and in particular vision-language models (VLMs), can be integrated into modern robotics, especially in path planning for manipulators.

The task was to integrate pre-trained AI models such as YOLO-World and use them to identify an object the user specifies, so that the manipulator can move toward that object.
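To make the detection-to-motion step concrete, here is a minimal sketch of how a single bounding box (such as one returned by a YOLO-World detection) could be turned into a 2D steering cue for the manipulator. The `Box` type, function name, and all numbers are illustrative assumptions, not code from the project itself.

```python
from dataclasses import dataclass

@dataclass
class Box:
    # Bounding box in pixel coordinates: top-left corner plus width/height.
    x: float
    y: float
    w: float
    h: float

def target_offset(box: Box, img_w: int, img_h: int) -> tuple[float, float]:
    """Return the object's centre as a normalized offset from the image
    centre, in [-1, 1] on each axis. A controller could use this to
    nudge the end-effector toward the detected object."""
    cx = box.x + box.w / 2
    cy = box.y + box.h / 2
    return ((cx - img_w / 2) / (img_w / 2),
            (cy - img_h / 2) / (img_h / 2))

# Hypothetical detection in a 640x480 frame, centred right of image centre.
dx, dy = target_offset(Box(370, 190, 100, 100), 640, 480)
print(dx, dy)  # → 0.3125 0.0 (object to the right, vertically centred)
```

In a full pipeline the box would come from the detector's output for the user-specified class, and the offset would feed whatever motion planner drives the arm.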

I was inspired to do this project after coming across the VoxPoser project from Stanford University during my undergraduate years.

VoxPoser: https://voxposer.github.io/

2. Equipment Used:

Note: this project was done at home, so I used whatever was available to me at the time. Even so, I only needed two pieces of equipment plus my computer.