ChatRex Demo: Visual Prompt Interaction Guide
November 27, 2024
Contents
- 1. Introduction
- 2. Workflow
- 3. Tips and Support
1. Introduction
Welcome to the ChatRex Demo! This tool demonstrates interactive visual prompt methods for AI-powered image understanding and question answering. This document provides detailed instructions on the workflow, interface components, and how to utilize the visual prompts effectively.
1.1. Video Demo for ChatRex
We also provide a Gradio demo for ChatRex. Before using it, we highly recommend watching the video demo to understand how the demo works.
2. Workflow
- Choose a Visual Prompt Method
  - Select either Interactive Visual Prompt or Proposal Visual Prompt to define your region of interest within the image.
- Provide a Question Input
  - Enter a valid question in the Raw Question Input field or use a Pre-defined Question Template. Ensure input accuracy to achieve relevant results.
- Run the Demo
  - Click on the Run ChatRex button to process the image and display the results, including answers and visualizations.
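The three steps above amount to a simple pre-run checklist. The sketch below is only an illustration of that checklist, not code from the demo: `DemoRequest`, `check_request`, and the method labels `"interactive"` and `"proposal"` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass

# Hypothetical labels for the two visual prompt methods (not taken from the demo code).
VALID_METHODS = {"interactive", "proposal"}


@dataclass
class DemoRequest:
    prompt_method: str  # which visual prompt method was chosen
    question: str       # raw question or a filled-in template


def check_request(request: DemoRequest) -> list[str]:
    """Return a list of problems; an empty list means the request is ready to run."""
    problems = []
    if request.prompt_method not in VALID_METHODS:
        problems.append("choose a visual prompt method")
    if not request.question.strip():
        problems.append("provide a question")
    return problems
```

An empty result corresponds to the point at which clicking Run ChatRex makes sense; any listed problem maps back to one of the earlier steps.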
2.1. Visual Prompt Methods
2.1.1. Interactive Visual Prompt
- Overview: This mode allows you to manually annotate regions of interest by either:
  - Clicking on the image to add a point, or
  - Drawing a bounding box over specific areas.
- Display Visualization: Once the annotations are complete, click on Display Visual Prompt to visualize the selected regions.
- Important Notes:
  - Ensure that neither the Fine Grained Proposal nor the Coarse Grained Proposal checkbox is selected when using this mode.
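To make the two annotation types and the checkbox rule concrete, here is a minimal sketch of how point and box prompts could be represented. It is an assumption-laden illustration rather than the demo's actual data model: `PointPrompt`, `BoxPrompt`, `box_from_drag`, and `interactive_mode_allowed` are hypothetical names, and pixel coordinates with the origin at the top-left corner are assumed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PointPrompt:
    """A single click on the image (hypothetical representation)."""
    x: float
    y: float


@dataclass(frozen=True)
class BoxPrompt:
    """An axis-aligned bounding box (hypothetical representation)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def box_from_drag(x0: float, y0: float, x1: float, y1: float) -> BoxPrompt:
    """Build a box from the two corners of a drag, given in any order."""
    return BoxPrompt(min(x0, x1), min(y0, y1), max(x0, x1), max(y0, y1))


def interactive_mode_allowed(fine_grained: bool, coarse_grained: bool) -> bool:
    """Interactive prompts require both proposal checkboxes to be unchecked."""
    return not (fine_grained or coarse_grained)
```

Normalizing the drag corners means a box drawn from bottom-right to top-left yields the same prompt as one drawn the other way around.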
2.1.2. Proposal Visual Prompt
- Overview: This mode automatically generates bounding boxes based on the granularity of the proposal:
  - Fine Grained Proposal: Produces a detailed set of bounding boxes for smaller components (e.g., noses, eyes, or body parts).
  - Coarse Grained Proposal: Generates fewer bounding boxes for larger objects or overall entities (e.g., a person, a dog, or another whole entity).
- Display Visualization: Click Display UPN Proposal to view the generated bounding boxes.
2.2. Question Input
2.2.1. Raw Question Input
- Enter your question in natural language. For example:
- What objects are present in this image?
- What is the color of the dog's collar?
- Who painted the sculpture?
2.2.2. Pre-defined Question Templates
- Select from a list of predefined templates to simplify the question input process.
- If you need to specify object categories (e.g., dog or cat -> dog,cat), enter their names or IDs in the <Object ids> field, following the provided hints.
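The conversion hinted at above (dog or cat -> dog,cat) can be sketched as a small helper. This is only an illustration: `format_object_ids` is a hypothetical name, not part of ChatRex, and the assumption that the field expects comma-separated values without spaces is inferred solely from the dog,cat example.

```python
import re


def format_object_ids(text: str) -> str:
    """Turn free-form input such as 'dog or cat' into 'dog,cat'."""
    # Split on commas and on the standalone words "or" / "and" (case-insensitive).
    parts = re.split(r"\s*(?:,|\bor\b|\band\b)\s*", text.strip(), flags=re.IGNORECASE)
    # Drop empty pieces left by inputs like "dog, or cat".
    return ",".join(part for part in parts if part)
```

The word boundaries in the pattern keep names such as "doctor" or "orange" intact, since only a standalone "or" or "and" is treated as a separator.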
3. Tips and Support
- If you're unsure how to interact with the application, refer to the tutorial video or browse the solved issues for additional guidance.
- For any further questions or feedback, feel free to contact us through the Issues page.
Enjoy exploring ChatRex's multimodal capabilities for seamless visual and language interaction!
