globot

January 4, 2024 ยท View on GitHub

With GPT-4V, we can finally complete the original vision of natbot.

Help solve general agents by contributing to this repo!

Ideas for Improvement

  • Scrolling (easy to add, but likely to cause divergence)
  • Better context management (learning from mistakes, more descriptive history)
  • Masking the image with node IDs
  • Better DOM parsing (please submit issues/PRs!)
  • More explicit planning
  • Data collection and fine-tuning

NOTE: Remember to use the latest release of the openai API for the vision model:

pip install --upgrade openai

made by Ivan Yevenko