visionscript

A high-level programming language for using computer vision.

MIT License

Downloads: 120

Stars: 342

Ecosystems: Python

Package Rankings

Top 24.36% on PyPI.org

[CVPR2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale

Cambrian-1 is a family of multimodal LLMs with a vision-centric design.

Scenic: A Jax Library for Computer Vision Research and Beyond

The implementation of "Prismer: A Vision-Language Model with Multi-Task Experts".

Syllabus for Scrapism @ SFPC / Fall 2022

👁️ + 💬 + 🎧 = 🤖 Curated list of top foundation and multimodal models! [Paper + Code + Examples + ...

Klipse is a JavaScript plugin for embedding interactive code snippets in tech blogs.

Using LLMs and pre-trained caption models for super-human performance on image captioning.

VILA - a multi-image visual language model with training, inference and evaluation recipe, deploy...

[ICLR2024 Spotlight] Code Release of CLIPSelf: Vision Transformer Distills Itself for Open-Vocabu...

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. Near-GPT-4o-performance open-source multi...