Abstract

Spatial applications require accurate geometric measurements of physical spaces. Typically, room geometry can be determined from the positions of the room's corners. A room keypoint model can provide the locations of corners in an image of a room. However, few suitably diverse images with labeled keypoints are available to train such models, and manual labeling does not scale because it is tedious, slow, and expensive. This disclosure describes an LLM-based agent that automates the labeling of keypoints, such as corners, within room images, thus enabling speedy generation of ground truth training data at scale. Given an image and a prompt specifying the task, the agent can be fine-tuned to output the pixel coordinates of the various keypoints within the room. The techniques can be enhanced by appropriately incorporating Reinforcement Learning from Human Feedback (RLHF).
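
As a rough illustration of the image-plus-prompt interface described above, the following minimal sketch builds a task prompt and turns an agent's text reply into pixel-coordinate keypoints. The JSON reply format, the names `Keypoint`, `build_prompt`, and `parse_keypoints`, and the bounds check are all assumptions made for illustration; the disclosure does not specify them.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Keypoint:
    """One labeled keypoint in pixel coordinates (hypothetical schema)."""
    label: str
    x: int
    y: int


def build_prompt(width: int, height: int) -> str:
    """Build a task prompt; the wording is an assumption, not from the disclosure."""
    return (
        f"The image is {width}x{height} pixels. "
        "Return a JSON list of room corners, each as "
        '{"label": str, "x": int, "y": int} in pixel coordinates.'
    )


def parse_keypoints(reply: str, width: int, height: int) -> list[Keypoint]:
    """Parse the agent's JSON reply and drop points outside the image bounds."""
    points = []
    for item in json.loads(reply):
        x, y = int(item["x"]), int(item["y"])
        if 0 <= x < width and 0 <= y < height:
            points.append(Keypoint(str(item["label"]), x, y))
    return points
```

Filtering out-of-bounds points is one simple sanity check that could be applied before a label is accepted as ground truth; a real pipeline would likely add further validation or human review.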

Creative Commons License

This work is licensed under a Creative Commons Attribution 4.0 License.
