ChatGPT, GenerativeAI and LLMs Timeline
This repository organizes a timeline of key events (products, services, papers, GitHub, blog posts and news) that occurred before and after the ChatGPT announcement.
It curates a variety of information, with a particular focus on LLMs and Generative AI.
This may be one of the most eventful periods in the field's history, so I organized these events to keep a good record of them.
Statistics
These diagrams were generated by ChatGPT's Code Interpreter.
Contributing
Issues and Pull Requests are greatly appreciated. If you've never contributed to an open source project before, I'm more than happy to walk you through how to create a pull request.
You can start by opening an issue describing the problem that you're looking to resolve and we'll go from there.
Emoji
arXiv ❌, PDF 📎, arxiv-vanity 📙, paper page 🏠, papers with code ✳️, GitHub :octocat:
License
This document is licensed under the MIT license © Jonghong Jeon (전종홍)
Timeline V2
2024
- 05/17 - OpenAI strikes Reddit deal to train its AI on your posts (News)
- 05/17 - OpenAI dissolves team focused on long-term AI risks, less than one year after announcing it (News)
- 05/17 - International Scientific Report on the Safety of Advanced AI (Blog)
- 05/16 - TRANSIC: Sim-to-Real Policy Transfer by Learning from Online Correction (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/16 - Toon3D: Seeing Cartoons from a New Perspective (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/16 - Testing the reliability of an AI-based large language model to extract ecological information from the scientific literature (News)
- 05/16 - Many-Shot In-Context Learning in Multimodal Foundation Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/16 - How to Hit Pause on AI Before It's Too Late (News)
- 05/16 - Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/16 - GPT Store Mining and Analysis (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/16 - Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/16 - Chameleon: Mixed-Modal Early-Fusion Foundation Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/16 - CAT3D: Create Anything in 3D with Multi-View Diffusion Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/15 - Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/15 - LoRA Learns Less and Forgets Less (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/15 - Google's invisible AI watermark will help identify generative text and video (News)
- 05/15 - Google I/O 2024: everything announced (Blog)
- 05/15 - BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/15 - ALPINE: Unveiling the Planning Capability of Autoregressive Learning in Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/14 - Understanding the performance gap between online and offline alignment algorithms (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/14 - SpeechVerse: A Large-scale Generalizable Audio Language Model (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/14 - SpeechGuard: Exploring the Adversarial Robustness of Multimodal Large Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/14 - No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/14 - Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/14 - Compositional Text-to-Image Generation with Dense Blob Representations (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/14 - Beyond Scaling Laws: Understanding Transformer Performance with Associative Memory (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/13 - SambaNova SN40L: Scaling the AI Memory Wall with Dataflow and Composition of Experts (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/13 - RLHF Workflow: From Reward Modeling to Online RLHF (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/13 - Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/13 - OpenAI unveils newest AI model, GPT-4o (News)
- 05/13 - MS MARCO Web Search: a Large-scale Information-rich Web Dataset with Millions of Real Click Labels (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/13 - How Much Research Is Being Written by Large Language Models? (Blog)
- 05/13 - Hello GPT-4o (Blog)
- 05/13 - Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/11 - Piccolo2: General Text Embedding with Multi-task Hybrid Loss Training (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/11 - LogoMotion: Visually Grounded Code Generation for Content-Aware Animation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/10 - INSPECT - An open-source framework for large language model evaluations (Blog)
- 05/10 - AI Safety Institute releases new AI safety evaluations platform (News)
- 05/07 - SUTRA: Scalable Multilingual Language Model Architecture (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/07 - Meta Releases Llama 3 Open-Source LLM (News)
- 05/03 - What matters when building vision-language models? (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/02 - WildChat: 1M ChatGPT Interaction Logs in the Wild (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/02 - StoryDiffusion: Consistent Self-Attention for Long-Range Image and Video Generation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/02 - Prometheus 2: An Open Source Language Model Specialized in Evaluating Other Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/02 - NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/02 - LLM-AD: Large Language Model based Audio Description System (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/02 - FLAME: Factuality-Aware Alignment for Large Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/02 - Customizing Text-to-Image Models with a Single Image Pair (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/01 - Spectrally Pruned Gaussian Fields with Neural Compensation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/01 - Self-Play Preference Optimization for Language Model Alignment (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/01 - Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3 (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/01 - Clover: Regressive Lightweight Speculative Decoding with Sequential Knowledge (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 05/01 - A Careful Examination of Large Language Model Performance on Grade School Arithmetic (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/30 - Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/30 - STT: Stateful Tracking with Transformers for Autonomous Driving (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/30 - SemantiCodec: An Ultra Low Bitrate Semantic Audio Codec for General Sound (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/30 - Octopus v4: Graph of language models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/30 - MotionLCM: Real-time Controllable Motion Generation via Latent Consistency Model (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/30 - MicroDreamer: Zero-shot 3D Generation in ~20 Seconds by Score-based Iterative Reconstruction (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/30 - Lightplane: Highly-Scalable Components for Neural 3D Fields (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/30 - KAN: Kolmogorov-Arnold Networks (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/30 - Iterative Reasoning Preference Optimization (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/30 - Invisible Stitch: Generating Smooth 3D Scenes with Depth Inpainting (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/30 - InstantFamily: Masked Attention for Zero-shot Multi-ID Image Generation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/30 - GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/30 - Extending Llama-3's Context Ten-Fold Overnight (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/30 - DOCCI: Descriptions of Connected and Contrasting Images (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/30 - Better & Faster Large Language Models via Multi-token Prediction (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/29 - Stylus: Automatic Adapter Selection for Diffusion Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/29 - SAGS: Structure-Aware 3D Gaussian Splatting (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/29 - Replacing Judges with Juries: Evaluating LLM Generations with a Panel of Diverse Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/29 - NIST AI RMF Generative AI Profile (News)
- 04/29 - LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/29 - Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/29 - Capabilities of Gemini Models in Medicine (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/28 - Paint by Inpaint: Learning to Add Image Objects by Removing Them First (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/28 - LEGENT: Open Platform for Embodied Agents (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/27 - Ag2Manip: Learning Novel Manipulation Skills with Agent-Agnostic Visual and Action Representations (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/26 - MaPa: Text-driven Photorealistic Material Painting for 3D Shapes (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/26 - BlenderAlchemy: Editing 3D Graphics with Vision-Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/25 - Tele-FLM Technical Report (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/25 - SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/25 - Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/25 - PLLaVA : Parameter-free LLaVA Extension from Images to Videos for Video Dense Captioning (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/25 - Make Your LLM Fully Utilize the Context (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/25 - List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/25 - Layer Skip: Enabling Early Exit Inference and Self-Speculative Decoding (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/25 - Interactive3D: Create What You Want by Interactive 3D Generation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/25 - How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/25 - ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/24 - XC-Cache: Cross-Attending to Cached Context for Efficient LLM Inference (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/24 - The Ethics of Advanced AI Assistants (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/24 - PuLID: Pure and Lightning ID Customization via Contrastive Alignment (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/24 - NeRF-XL: Scaling NeRFs with Multiple GPUs (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/24 - MotionMaster: Training-free Camera Motion Transfer For Video Generation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/24 - MoDE: CLIP Data Experts via Clustering (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/24 - MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/24 - MaGGIe: Masked Guided Gradual Human Instance Matting (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/24 - ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/24 - Editable Image Elements for Controllable Synthesis (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/24 - CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/24 - BASS: Batched Attention-optimized Speculative Sampling (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/23 - Transformers Can Represent n-gram Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/23 - Pegasus-v1 Technical Report (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/23 - Multi-Head Mixture-of-Experts (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/23 - FlashSpeech: Efficient Zero-Shot Speech Synthesis (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/22 - SnapKV: LLM Knows What You are Looking for Before Generation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/22 - SEED-X: Multimodal Models with Unified Multi-granularity Comprehension and Generation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/22 - Scene Coordinate Reconstruction: Posing of Image Collections via Incremental Learning of a Relocalizer (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/22 - Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/22 - OpenELM: An Efficient Language Model Family with Open-source Training and Inference Framework (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/22 - MultiBooth: Towards Generating All Your Concepts in an Image from Text (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/22 - Learning H-Infinity Locomotion Control (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/22 - How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/22 - Align Your Steps: Optimizing Sampling Schedules in Diffusion Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/22 - A Multimodal Automated Interpretability Agent (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/21 - Hyper-SD: Trajectory Segmented Consistency Model for Efficient Image Synthesis (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/21 - AdvPrompter: Fast Adaptive Adversarial Prompting for LLMs (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/20 - Music Consistency Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/19 - The Instruction Hierarchy: Training LLMs to Prioritize Privileged Instructions (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/19 - TextSquare: Scaling up Text-Centric Visual Instruction Tuning (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/19 - PhysDreamer: Physics-Based Interaction with 3D Objects via Video Generation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/19 - LLM-R2: A Large Language Model Enhanced Rule-based Rewrite System for Boosting Query Efficiency (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/19 - How Real Is Real? A Human Evaluation Framework for Unrestricted Adversarial Examples (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/19 - How Far Can We Go with Practical Function-Level Program Repair? (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/19 - Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/19 - Does Gaussian Splatting need SFM Initialization? (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/19 - AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/18 - TriForce: Lossless Acceleration of Long Sequence Generation with Hierarchical Speculative Decoding (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/18 - Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/18 - Reuse Your Rewards: Reward Model Transfer for Zero-Shot Cross-Lingual Alignment (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/18 - Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/18 - OpenBezoar: Small, Cost-Effective and Open Models Trained on Mixes of Instruction Data (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/18 - MeshLRM: Large Reconstruction Model for High-Quality Mesh (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/18 - Introducing v0.5 of the AI Safety Benchmark from MLCommons (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/18 - Introducing Meta Llama 3: The most capable openly available LLM to date (Blog)
- 04/18 - EdgeFusion: On-Device Text-to-Image Generation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/18 - BLINK: Multimodal Large Language Models Can See but Not Perceive (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/18 - AniClipart: Clipart Animation with Text-to-Video Priors (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/17 - MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/17 - FlowMind: Automatic Workflow Generation with LLMs (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/17 - Dynamic Typography: Bringing Words to Life (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/17 - Stable Diffusion 3 API Now Available (twitter), (Blog), (Demo)
- 04/16 - VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/16 - U.S. Commerce Secretary Gina Raimondo Announces Expansion of U.S. AI Safety Institute Leadership Team (News)
- 04/16 - Long-form music generation with latent diffusion (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/15 - LLM Evaluators Recognize and Favor Their Own Generations (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/15 - Video2Game: Real-time, Interactive, Realistic and Browser-Compatible Environment from a Single Video (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/15 - Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/15 - Taming Latent Diffusion Model for Neural Radiance Field Inpainting (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/15 - Opus can operate as a Turing machine (twitter)
- 04/15 - MathGPT: Leveraging Llama 2 to create a platform for highly personalized learning
- 04/15 - HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/15 - Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/15 - Compression Represents Intelligence Linearly (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/15 - CompGS: Efficient 3D Scene Representation via Compressed Gaussian Splatting (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/14 - TextHawk: Exploring Efficient Fine-Grained Perception of Multimodal Large Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/13 - Cathie Wood Muscles Into ChatGPT Boom With New OpenAI Stake (News)
- 04/12 - Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/12 - Probing the 3D Awareness of Visual Foundation Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/12 - Pre-training Small Base LMs with Fewer Tokens (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/12 - On the Robustness of Language Guidance for Low-Level Vision Tasks: Findings from Depth Estimation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/12 - MonoPatchNeRF: Improving Neural Radiance Fields with Patch-based Monocular Guidance (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/12 - Megalodon: Efficient LLM Pretraining and Inference with Unlimited Context Length (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/12 - Is ChatGPT Transforming Academics' Writing Style? (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/12 - COCONut: Modernizing COCO Segmentation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/12 - AI Chip Trims Energy Budget Back by 99+ Percent (News)
- 04/12 - AdapterSwap: Continuous Training of LLMs with Data Removal and Access-Control Guarantees (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/12 - Grok-1.5 Vision Preview (Demo)
- 04/12 - The good, the bad, and the Humane Pin (News)
- 04/12 - Paid ChatGPT users can now access GPT-4 Turbo (twitter), (News), (:octocat:)
- 04/11 - The Necessity of AI Audit Standards Boards (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/11 - Remembering Transformer for Continual Learning (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/11 - Amazon adds Andrew Ng, a leading voice in artificial intelligence, to its board of directors (News)
- 04/11 - Adobe Is Buying Videos for $3 Per Minute to Build AI Model (News)
- 04/11 - UltraEval: A Lightweight Platform for Flexible and Comprehensive Evaluation for LLMs (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/11 - Transferable and Principled Efficiency for Open-Vocabulary Segmentation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/11 - SWE-agent (twitter), (Demo), (:octocat:)
- 04/11 - Sparse Laneformer (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/11 - Rho-1: Not All Tokens Are What You Need (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/11 - ResearchAgent: Iterative Research Idea Generation over Scientific Literature with Large Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/11 - RecurrentGemma: Moving Past Transformers for Efficient Open Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/11 - OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/11 - LLoCO: Learning Long Contexts Offline (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/11 - Leveraging Large Language Models (LLMs) to Support Collaborative Human-AI Online Risk Data Annotation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/11 - JetMoE: Reaching Llama2 Performance with 0.1M Dollars (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (Project), (twitter), (✳️), (:octocat:)
- 04/11 - HGRN2: Gated Linear RNNs with State Expansion (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/11 - From Words to Numbers: Your Large Language Model Is Secretly A Capable Regressor When Given In-Context Examples (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/11 - Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/11 - ControlNet++: Improving Conditional Controls with Efficient Consistency Feedback (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/11 - Context-aware Video Anomaly Detection in Long-Term Datasets (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/11 - ChatGPT-3.5, Claude 3 kick pixelated butt in Street Fighter III tournament for LLMs (News)
- 04/11 - ChatGPT Can Predict the Future when it Tells Stories Set in the Future About the Past (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/11 - Best Practices and Lessons Learned on Synthetic Data for Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/11 - Benchmark LLMs by fighting in Street Fighter 3 (Demo), (:octocat:)
- 04/11 - Audio Dialogues: Dialogues dataset for audio and music understanding (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/11 - Applying Guidance in a Limited Interval Improves Sample and Distribution Quality in Diffusion Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/11 - AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/10 - LM Transparency Tool: Interactive Tool for Analyzing Transformer Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/10 - Gemini 1.5 Pro now understands audio (twitter)
- 04/10 - Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/10 - Urban Architect: Steerable 3D Urban Scene Generation with Layout Prior (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/10 - RealmDreamer: Text-Driven 3D Scene Generation with Inpainting and Depth Diffusion (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/10 - OpenAI and Meta are on the verge of releasing AI models capable of reasoning like humans, report says (News)
- 04/10 - MetaCheckGPT -- A Multi-task Hallucination Detector Using LLM Uncertainty and Meta-models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/10 - Meta confirms that its Llama 3 open source LLM is coming in the next month (News)
- 04/10 - Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/10 - Incremental XAI: Memorable Understanding of AI with Incremental Explanations (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/10 - DreamScene360: Unconstrained Text-to-3D Scene Generation with Panoramic Gaussian Splatting (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/10 - Does Mapo Tofu Contain Coffee? Probing LLMs for Food-related Cultural Knowledge (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/10 - BRAVE: Broadening the visual encoding of vision-language models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/10 - AI startup Mistral launches a 281GB AI model to rival OpenAI, Meta, and Google (News)
- 04/10 - Agent-driven Generative Semantic Communication for Remote Surveillance (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/10 - Adapting LLaMA Decoder to Vision Transformer (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/10 - A Survey on the Integration of Generative AI for Critical Thinking in Mobile Networks (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/09 - Take a Look at it! Rethinking How to Evaluate Language Model Jailbreak (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/09 - RULER: What's the Real Context Size of Your Long-Context Language Models? (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/09 - Revising Densification in Gaussian Splatting (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/09 - Reconstructing Hand-Held Objects in 3D (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/09 - RAR-b: Reasoning as Retrieval Benchmark (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/09 - Privacy Preserving Prompt Engineering: A Survey (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/09 - On Evaluating the Efficiency of Source Code Generated by LLMs (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS)
- 04/09 - OmniFusion Technical Report (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/09 - MuPT: A Generative Symbolic Music Pretrained Transformer (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/09 - MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/09 - Magic-Boost: Boost 3D Generation with Mutli-View Conditioned Diffusion (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/09 - LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/09 - InternLM-XComposer2-4KHD: A Pioneering Large Vision-Language Model Handling Resolutions from 336 Pixels to 4K HD (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/09 - Hash3D: Training-free Acceleration for 3D Generation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/09 - Google unveils open source projects for generative AI (News)
- 04/09 - Elephants Never Forget: Memorization and Learning of Tabular Data in Large Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/09 - Apple just unveiled new Ferret-UI LLM - this AI can read your iPhone screen (News)
- 04/09 - AEGIS: Online Adaptive AI Content Safety Moderation with Ensemble of LLM Experts (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/08 - YaART: Yet Another ART Rendering Technology (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/08 - WILBUR: Adaptive In-Context Learning for Robust and Accurate Web Agents (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/08 - UniFL: Improve Stable Diffusion via Unified Feedback Learning (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/08 - Unbridled Icarus: A Survey of the Potential Perils of Image Inputs in Multimodal Large Language Model Security (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/08 - The Hallucinations Leaderboard -- An Open Effort to Measure Hallucinations in Large Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/08 - The Fact Selection Problem in LLM-Based Program Repair (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/08 - SwapAnything: Enabling Arbitrary Object Swapping in Personalized Visual Editing (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/08 - SambaLingo: Teaching Large Language Models New Languages (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/08 - Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/08 - Naver debuts multilingual HyperCLOVA X LLM it will use to build sovereign AI for Asia (News)
- 04/08 - MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/08 - MedExpQA: Multilingual Benchmarking of Large Language Models for Medical Question Answering (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/08 - MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/08 - LayoutLLM: Layout Instruction Tuning with Large Language Models for Document Understanding (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/08 - Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/08 - Evaluating Interventional Reasoning Capabilities of Large Language Models (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️)
- 04/08 - Eagle and Finch: RWKV with Matrix-Valued States and Dynamic Recurrence (❌), (📖), (📎), (📙), (🏠), (HTML), (SL), (SP), (GS), (SS), (✳️), (:octocat:)
- 04/08 - CodecLM: Aligning Language Models with Tailored Synthetic Data (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/08 - AutoCodeRover: Autonomous Program Improvement (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/07 - TimeGPT in Load Forecasting: A Large Time Series Model Perspective (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/07 - OpenAI transcribed over a million hours of YouTube videos to train GPT-4 (News),
- 04/07 - MagicTime: Time-lapse Video Generation Models as Metamorphic Simulators (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/07 - ByteEdit: Boost, Comply and Accelerate Generative Image Editing (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/06 - Majority Voting of Doctors Improves Appropriateness of AI Reliance in Pathology (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 04/06 - Diffusion-RWKV: Scaling RWKV-Like Architectures for Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/06 - DATENeRF: Depth-Aware Text-based Editing of NeRFs (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/06 - BeyondScene: Higher-Resolution Human-Centric Scene Generation With Pretrained Diffusion (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/06 - Aligning Diffusion Models by Optimizing Human Utility (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/06 - The Case for Developing a Foundation Model for Planning-like Tasks from Scratch (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/05 - Increased LLM Vulnerabilities from Fine-tuning and Quantization (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/05 - SpatialTracker: Tracking Any 2D Pixels in 3D Space (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/05 - Social Skill Training with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/05 - Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/05 - Robust Gaussian Splatting (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/05 - PhysAvatar: Learning the Physics of Dressed 3D Avatars from Visual Observations (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/05 - Koala: Key frame-conditioned long video-LLM (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/05 - CLUE: A Clinical Language Understanding Evaluation for LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/05 - Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/05 - Assisting humans in complex comparisons: automated information comparison at scale (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/04 - Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 04/04 - Language Model Evolution: An Iterated Learning Perspective (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/04 - Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (twitter),
- 04/04 - No "Zero-Shot" Without Exponential Data: Pretraining Concept Frequency Determines Multimodal Model Performance (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/04 - Evaluating LLMs at Detecting Errors in LLM Responses (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/04 - Evaluating Generative Language Models in Information Extraction as Subjective Question Correction (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/04 - Direct Nash Optimization: Teaching Language Models to Self-Improve with General Preferences (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/04 - CBR-RAG: Case-Based Reasoning for Retrieval Augmented Generation in LLMs for Legal Question Answering (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/04 - Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/04 - CantTalkAboutThis: Aligning Language Models to Stay on Topic in Dialogues (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/04 - AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/04 - Training LLMs over Neurally Compressed Text (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/04 - ReFT: Representation Finetuning for Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/04 - Red Teaming GPT-4V: Are GPT-4V Safe Against Uni/Multi-Modal Jailbreak Attacks? (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/04 - RALL-E: Robust Codec Language Modeling with Chain-of-Thought Prompting for Text-to-Speech Synthesis (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/04 - PointInfinity: Resolution-Invariant Point Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/04 - MiniGPT4-Video: Advancing Multimodal LLMs for Video Understanding with Interleaved Visual-Textual Tokens (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/04 - CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/04 - CodeEditorBench: Evaluating Code Editing Capability of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/03 - Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/03 - On the Scalability of Diffusion-based Text-to-Image Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/03 - Many-shot jailbreaking (โ)
- 04/03 - LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/03 - Language Models as Compilers: Simulating Pseudocode Execution Improves Algorithmic Reasoning in Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/03 - InstantStyle: Free Lunch towards Style-Preserving in Text-to-Image Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/03 - Freditor: High-Fidelity and Transferable NeRF Editing by Frequency Decomposition (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/03 - Cross-Attention Makes Inference Cumbersome in Text-to-Image Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/03 - ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/02 - UK & United States announce partnership on science of AI safety (News),
- 04/02 - Large Language Models as Planning Domain Generators (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 04/02 - Poro 34B and the Blessing of Multilinguality (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/02 - Octopus v2: On-device language model for super agent (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/02 - Mixture-of-Depths: Dynamically allocating compute in transformer-based language models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/02 - Long-context LLMs Struggle with Long In-context Learning (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/02 - LLM-ABR: Designing Adaptive Bitrate Algorithms via Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/02 - Large language models could change the future of behavioral healthcare: a proposal for responsible development and evaluation (โ)
- 04/02 - HyperCLOVA X Technical Report (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/02 - CameraCtrl: Enabling Camera Control for Text-to-Video Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/02 - Advancing LLM Reasoning Generalists with Preference Trees (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/01 - Stream of Search (SoS): Learning to Search in Language (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/01 - LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/01 - The Rise and Rise of A.I. Large Language Models (LLMs) (Blog),
- 04/01 - Streaming Dense Video Captioning (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/01 - Measuring Style Similarity in Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/01 - Getting it Right: Improving Spatial Consistency in Text-to-Image Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/01 - For Data-Guzzling AI Companies, the Internet Is Too Small (News),
- 04/01 - FlexiDreamer: Single Image-to-3D Generation with FlexiCubes (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/01 - Evalverse: Unified and Accessible Library for Large Language Model Evaluation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/01 - Direct Preference Optimization of Video Large Multimodal Models from Language Model Reward (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 04/01 - DBRX, Continual Pretraining, RewardBench, Faster Inference, and More (Blog),
- 04/01 - CosmicMan: A Text-to-Image Foundation Model for Humans (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/01 - Condition-Aware Neural Network for Controlled Image Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/01 - Bigger is not Always Better: Scaling Properties of Latent Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 04/01 - Are large language models superhuman chemists? (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/31 - WavLLM: Towards Robust and Adaptive Speech Large Language Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/31 - Tired of Plugins? Large Language Models Can Be End-To-End Recommenders (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/30 - Survey on Large Language Model-Enhanced Reinforcement Learning: Concept, Taxonomy, and Methods (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/30 - ST-LLM: Large Language Models Are Effective Temporal Learners (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/30 - Noise-Aware Training of Layout-Aware Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/30 - MaGRITTe: Manipulative and Generative 3D Realization from Image, Topview and Text (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/30 - Aurora-M: The First Open Source Multilingual Language Model Red-teamed according to the U.S. Executive Order (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/29 - Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/29 - Transformer-Lite: High-efficiency Deployment of Large Language Models on Mobile Phone GPUs (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/29 - Snap-it, Tap-it, Splat-it: Tactile-Informed 3D Gaussian Splatting for Reconstructing Challenging Surfaces (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/29 - ReALM: Reference Resolution As Language Modeling (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/29 - NVIDIA H200 GPUs Crush MLPerf's LLM Inferencing Benchmark (News),
- 03/29 - MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/29 - LLaVA-Gemma: Accelerating Multimodal Foundation Models with a Compact Language Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/29 - InstantSplat: Unbounded Sparse-view Pose-free Gaussian Splatting in 40 Seconds (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/29 - Gecko: Versatile Text Embeddings Distilled from Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/29 - DiJiang: Efficient Large Language Models through Compact Kernelization (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/29 - DeepMind develops SAFE, an AI-based app that can fact-check LLMs (News),
- 03/29 - CtRL-Sim: Reactive and Controllable Driving Agents with Offline Reinforcement Learning (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/29 - Are We on the Right Way for Evaluating Large Vision-Language Models? (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/28 - sDPO: Don't Use Your Data All at Once (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/28 - Mesh2NeRF: Direct Mesh Supervision for Neural Radiance Field Representation and Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/28 - Localizing Paragraph Memorization in Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/28 - Jamba: A Hybrid Transformer-Mamba Language Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/28 - GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/28 - Claude 3 overtakes GPT-4 in the duel of the AI bots. Here's how to get in on the action (News),
- 03/28 - Announcing Grok-1.5 (Blog), (Demo),
- 03/27 - A Path Towards Legal Autonomy: An interoperable and explainable approach to extracting, transforming, loading and computing legal information using large language models, expert systems and Bayesian networks (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/27 - ViTAR: Vision Transformer with Any Resolution (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/27 - Towards a World-English Language Model for On-Device Virtual Assistants (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/27 - TextCraftor: Your Text Encoder Can be Image Quality Controller (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/27 - ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and Insertion (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/27 - Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/27 - Long-form factuality in large language models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/27 - LITA: Language Instructed Temporal-Localization Assistant (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/27 - Garment3DGen: 3D Garment Stylization and Texture Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/27 - Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/27 - FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image Editing (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/27 - BioMedLM: A 2.7B Parameter Language Model Trained On Biomedical Text (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/26 - MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/26 - The Unreasonable Ineffectiveness of the Deeper Layers (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/26 - TC4D: Trajectory-Conditioned Text-to-4D Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/26 - Octree-GS: Towards Consistent Real-time Rendering with LOD-Structured 3D Gaussians (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/26 - Introducing DBRX: A New State-of-the-Art Open LLM (Blog),
- 03/26 - InternLM2 Technical Report (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/26 - Improving Text-to-Image Consistency via Automatic Prompt Optimization (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/26 - Fully-fused Multi-Layer Perceptrons on Intel Data Center GPUs (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/26 - EgoLifter: Open-world 3D Segmentation for Egocentric Perception (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/26 - AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/26 - 2D Gaussian Splatting for Geometrically Accurate Radiance Fields (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/25 - Towards Automatic Evaluation for LLMs' Clinical Capabilities: Metric, Data, and Algorithm (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/25 - RepairAgent: An Autonomous, LLM-Based Agent for Program Repair (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/25 - RL for Consistency Models: Faster Reward Guided Text-to-Image Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/25 - VP3D: Unleashing 2D Visual Prompt for Text-to-3D Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/25 - TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/25 - SDXS: Real-Time One-Step Latent Diffusion Models with Image Conditions (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/25 - LLM Agent Operating System (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/25 - FlashFace: Human Image Personalization with High-fidelity Identity Preservation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/25 - DreamPolisher: Towards High-Quality Text-to-3D Generation via Geometric Diffusion (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/25 - Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/23 - When LLM-based Code Generation Meets the Software Development Process (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/22 - ThemeStation: Generating Theme-Aware 3D Assets from Few Exemplars (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/22 - SiMBA: Simplified Mamba-Based Architecture for Vision and Multivariate Time series (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/22 - LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/22 - LATTE3D: Large-scale Amortized Text-To-Enhanced3D Synthesis (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/22 - InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/22 - FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/22 - DragAPart: Learning a Part-Level Motion Prior for Articulated Objects (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/22 - Can large language models explore in-context? (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/22 - AllHands: Ask Me Anything on Large-scale Verbatim Feedback via Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/21 - PeerGPT: Probing the Roles of LLM-based Peer Agents as Team Moderators and Participants in Children's Collaborative Learning (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/21 - StyleCineGAN: Landscape Cinemagraph Generation using a Pre-trained StyleGAN (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/21 - StreamingT2V: Consistent, Dynamic, and Extendable Long Video Generation from Text (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/21 - ReNoise: Real Image Inversion Through Iterative Noising (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/21 - Recourse for reclamation: Chatting with generative language models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/21 - RakutenAI-7B: Extending Large Language Models for Japanese (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/21 - MyVLM: Personalizing VLMs for User-Specific Queries (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/21 - MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/21 - GRM: Large Gaussian Reconstruction Model for Efficient 3D Reconstruction and Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/21 - General Assembly adopts landmark resolution on artificial intelligence (News),
- 03/21 - Gaussian Frosting: Editable Complex Radiance Fields with Real-Time Rendering (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/21 - Explorative Inbetweening of Time and Space (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/21 - Efficient Video Diffusion Models via Content-Frame Motion-Latent Decomposition (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/21 - DreamReward: Text-to-3D Generation with Human Preference (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/21 - Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/21 - Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/21 - AnyV2V: A Plug-and-Play Framework For Any Video-to-Video Editing Tasks (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/20 - Mapping LLM Security Landscapes: A Comprehensive Stakeholder Risk Assessment Proposal (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/20 - ZigMa: Zigzag Mamba Diffusion Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/20 - VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/20 - RewardBench: Evaluating Reward Models for Language Modeling (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/20 - Reverse Training to Nurse the Reversal Curse (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/20 - RadSplat: Radiance Field-Informed Gaussian Splatting for Robust Real-Time Rendering with 900+ FPS (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/20 - Mora: Enabling Generalist Video Generation via A Multi-Agent Framework (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/20 - LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/20 - IDAdapter: Learning Mixed Features for Tuning-Free Personalization of Text-to-Image Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/20 - HyperLLaVA: Dynamic Visual and Language Expert Tuning for Multimodal Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/20 - Evaluating Frontier Models for Dangerous Capabilities (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/20 - DepthFM: Fast Monocular Depth Estimation with Flow Matching (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/20 - Compress3D: a Compressed Latent Space for 3D Generation from a Single Image (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/20 - Be-Your-Outpainter: Mastering Video Outpainting through Input-Specific Adaptation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/19 - When Do We Not Need Larger Vision Models? (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/19 - Vid2Robot: End-to-end Video-conditioned Policy Learning with Cross-Attention Transformers (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/19 - Towards a general-purpose foundation model for computational pathology (โ)
- 03/19 - TexDreamer: Towards Zero-Shot High-Fidelity 3D Human Texture Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/19 - SceneScript: Reconstructing Scenes With An Autoregressive Structured Language Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/19 - mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/19 - Magic Fixup: Streamlining Photo Editing by Watching Dynamic Videos (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/19 - LLMLingua-2: Data Distillation for Efficient and Faithful Task-Agnostic Prompt Compression (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/19 - GVGEN: Text-to-3D Generation with Volumetric Representation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/19 - GaussianFlow: Splatting Gaussian Dynamics for 4D Content Creation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/19 - FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/19 - FouriScale: A Frequency Perspective on Training-Free High-Resolution Image Synthesis (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/19 - Evolutionary Optimization of Model Merging Recipes (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), ([:octocat:](https://github.com/sakanaai/evolutionary-model-merge)![GitHub Repo stars](https://img.shields.io/github/stars/sakanaai/evolutionary-model-merge?style=social))
- 03/19 - ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/19 - Chart-based Reasoning: Transferring Capabilities from LLMs to VLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/19 - Apple's MM1: A multimodal large language model capable of interpreting both images and text data (News),
- 03/19 - AnimateDiff-Lightning: Cross-Model Diffusion Distillation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/19 - Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/19 - A visual-language foundation model for computational pathology (โ), (โณ๏ธ)
- 03/19 - Characteristic AI Agents via Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:![GitHub Repo stars](https://img.shields.io/github/stars/nuaa-nlp/character100?style=social))
- 03/18 - How Far Are We on the Decision-Making of LLMs? Evaluating LLMs' Gaming Ability in Multi-Agent Environments (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/18 - VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/18 - VFusion3D: Learning Scalable 3D Generative Models from Video Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/18 - TnT-LLM: Text Mining at Scale with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/18 - SV3D: Novel Multi-view Synthesis and 3D Generation from a Single Image using Latent Video Diffusion (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/18 - ROUTERBENCH: A Benchmark for Multi-LLM Routing System (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/18 - Meta-Prompting for Automating Zero-shot Visual Recognition with LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/18 - LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/18 - LLaVA-UHD: an LMM Perceiving Any Aspect Ratio and High-Resolution Images (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/18 - Larimar: Large Language Models with Episodic Memory Control (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/18 - Infinite-ID: Identity-preserved Personalization via ID-semantics Decoupling Paradigm (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/18 - GPT-4 as Evaluator: Evaluating Large Language Models on Pest Management in Agriculture (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/18 - Generic 3D Diffusion Adapter Using Controlled Multi-View Editing (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/18 - From Pixels to Insights: A Survey on Automatic Chart Understanding in the Era of Large Foundation Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/18 - Fast High-Resolution Image Synthesis with Latent Adversarial Diffusion Distillation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/18 - Decoding Compressed Trust: Scrutinizing the Trustworthiness of Efficient LLMs Under Compression (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/18 - Compiler generated feedback for Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/17 - PhD: A Prompted Visual Hallucination Evaluation Dataset (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/17 - MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/16 - VisionCLIP: An Med-AIGC based Ethical Language-Image Foundation Model for Generalizable Retina Image Analysis (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/16 - Do Large Language Models understand Medical Codes? (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/15 - VideoAgent: Long-form Video Understanding with Large Language Model as Agent (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/15 - Uni-SMART: Universal Science Multimodal Analysis and Research Transformer (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/15 - Trusting the Search: Unraveling Human Trust in Health Information from Google and ChatGPT (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/15 - RAFT: Adapting Language Model to Domain Specific RAG (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/15 - PERL: Parameter Efficient Reinforcement Learning from Human Feedback (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/15 - NeuFlow: Real-time, High-accuracy Optical Flow Estimation on Robots Using Edge Devices (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/15 - MusicHiFi: Fast High-Fidelity Stereo Vocoding (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/15 - LightIt: Illumination Modeling and Control for Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/15 - Isotropic3D: Image-to-3D Generation Based on a Single CLIP Embedding (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/15 - FDGaussian: Fast Gaussian Splatting from Single Image via Geometric-aware Diffusion Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/15 - Enhancing Human-Centered Dynamic Scene Understanding via Multiple LLMs Collaborated Reasoning (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/15 - EfficientVMamba: Atrous Selective Scan for Light Weight Visual Mamba (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/15 - DiPaCo: Distributed Path Composition (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/15 - Controllable Text-to-3D Generation via Surface-Aligned Gaussian Splatting (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/14 - WavCraft: Audio Editing and Generation with Natural Language Prompts (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/14 - VisionGPT-3D: A Generalized Multimodal Agent for Enhanced 3D Vision Understanding (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/14 - Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/14 - Video Editing via Factorized Diffusion Distillation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/14 - Unlocking the conversion of Web Screenshots into HTML Code with the WebSight Dataset (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/14 - StreamMultiDiffusion: Real-Time Interactive Generation with Region-Based Semantic Control (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/14 - Scaling Instructable Agents Across Many Simulated Worlds (twitter), (Blog),
- 03/14 - Recurrent Drafter for Fast Speculative Decoding in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/14 - Quiet-STaR: Language Models Can Teach Themselves to Think Before Speaking (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/14 - MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/14 - LocalMamba: Visual State Space Model with Windowed Selective Scan (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/14 - Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/14 - Helpful or Harmful? Exploring the Efficacy of Large Language Models for Online Grooming Prevention (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/14 - Griffon v2: Advancing Multimodal Perception with High-Resolution Scaling and Visual-Language Co-Referring (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/14 - GPT on a Quantum Computer (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/14 - Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/14 - GiT: Towards Generalist Vision Transformer through Universal Language Interface (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/14 - Exploring the Capabilities and Limitations of Large Language Models in the Electric Energy Sector (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/14 - BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/14 - 3D-VLA: A 3D Vision-Language-Action Generative World Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/13 - Scaling Instructable Agents Across Many Simulated Worlds (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/13 - VLOGGER: Multimodal Diffusion for Embodied Avatar Synthesis (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/13 - The Human Factor in Detecting Errors of Large Language Models: A Systematic Literature Review and Future Research Directions (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/13 - SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/13 - Simple and Scalable Strategies to Continually Pre-train Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/13 - Scaling Up Dynamic Human-Scene Interaction Modeling (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/13 - Language-based game theory in the age of artificial intelligence (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/13 - Language models scale reliably with over-training and on downstream tasks (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/13 - Knowledge Conflicts for LLMs: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/13 - Gemma: Open Models Based on Gemini Research and Technology (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/13 - GaussianImage: 1000 FPS Image Representation and Compression by 2D Gaussian Splatting (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/13 - Follow-Your-Click: Open-domain Regional Image Animation via Short Prompts (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/13 - Cultural evolution in populations of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/13 - Bugs in Large Language Models Generated Code: An Empirical Study (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/12 - Synth^2: Boosting Visual-Language Models with Synthetic Captions and Image Embeddings (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/12 - Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/12 - MoAI: Mixture of All Intelligence for Large Language and Vision Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/12 - Learning Generalizable Feature Fields for Mobile Manipulation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/12 - DragAnything: Motion Control for Anything using Entity Representation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/12 - Chronos: Learning the Language of Time Series (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/12 - Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/11 - Transparent AI Disclosure Obligations: Who, What, When, Where, Why, How (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/11 - HILL: A Hallucination Identifier for Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/11 - FAX: Scalable and Differentiable Federated Primitives in JAX (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/11 - FashionReGen: LLM-Empowered Fashion Report Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/11 - VideoMamba: State Space Model for Efficient Video Understanding (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/11 - V3D: Video Diffusion Models are Effective 3D Generators (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/11 - Stealing Part of a Production Language Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/11 - Multistep Consistency Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/11 - FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/11 - Chain-of-table: Evolving tables in the reasoning chain for table understanding (Blog),
- 03/11 - An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/11 - Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/10 - VidProM: A Million-scale Real Prompt-Gallery Dataset for Text-to-Video Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/09 - Algorithmic progress in language models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/08 - Sora as an AGI World Model? A Complete Survey on Text-to-Video Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/08 - On Protecting the Data Privacy of Large Language Models (LLMs): A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/08 - VideoElevator: Elevating Video Generation Quality with Versatile Text-to-Image Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/08 - Personalized Audiobook Recommendations at Spotify Through Graph Neural Networks (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/08 - Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/08 - ELLA: Equip Diffusion Models with LLM for Enhanced Semantic Alignment (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/08 - DeepSeek-VL: Towards Real-World Vision-Language Understanding (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/08 - CRM: Single Image to 3D Textured Mesh with Convolutional Reconstruction Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/08 - CogView3: Finer and Faster Text-to-Image Generation via Relay Diffusion (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/08 - Now available on Poe: Claude 3 (Demo),
- 03/08 - Google - Health-specific embedding tools for dermatology and pathology (Blog),
- 03/07 - Yi: Open Foundation Models by 01.AI (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/07 - Teaching Large Language Models to Reason with Reinforcement Learning (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/07 - StableDrag: Stable Dragging for Point-based Image Editing (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/07 - Radiative Gaussian Splatting for Efficient X-ray Novel View Synthesis (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/07 - PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/07 - Pix2Gif: Motion-Guided Diffusion for GIF Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/07 - Meet 'Liberated Qwen', an uncensored LLM that strictly adheres to system prompts (News),
- 03/07 - LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/07 - KAIST develops next-generation ultra-low power LLM accelerator (News),
- 03/07 - Inflection-2.5: meet the world's best personal AI (News),
- 03/07 - How Far Are We from Intelligent Visual Deductive Reasoning? (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/07 - GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/07 - Evaluating LLM models at scale (Blog),
- 03/07 - Common 7B Language Models Already Possess Strong Math Capabilities (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/07 - Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/06 - Stop Regressing: Training Value Functions via Classification for Scalable Deep RL (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/06 - ShortGPT: Layers in Large Language Models are More Redundant Than You Expect (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/06 - SaulLM-7B: A pioneering Large Language Model for Law (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/06 - NY hospital exec: Multimodal LLM assistants will create a 'paradigm shift' in patient care (News),
- 03/06 - Learning to Decode Collaboratively with Multiple Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/06 - Enhancing Vision-Language Pre-training with Rich Supervisions (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/06 - Backtracing: Retrieving the Cause of the Query (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/06 - AI Prompt Engineering Is Dead (News),
- 03/06 - 3D Diffusion Policy (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/05 - OpenAI and Elon Musk (Blog),
- 03/05 - Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - WikiTableEdit: A Benchmark for Table Editing by Natural Language Instruction (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - Updating the Minimum Information about CLinical Artificial Intelligence (MI-CLAIM) checklist for generative modeling research (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/05 - Tuning-Free Noise Rectification for High Fidelity Image-to-Video Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - Scaling Rectified Flow Transformers for High-Resolution Image Synthesis (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - RT-Sketch: Goal-Conditioned Imitation Learning from Hand-Drawn Sketches (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/05 - Revisiting Meta-evaluation for Grammatical Error Correction (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - MathScale: Scaling Instruction Tuning for Mathematical Reasoning (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/05 - KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/05 - Interactive Continual Learning: Fast and Slow Thinking (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - In Search of Truth: An Interrogation Approach to Hallucination Detection (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - ImgTrojan: Jailbreaking Vision-Language Models with ONE Image (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - Generative Software Engineering (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/05 - Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - Exploring the Limitations of Large Language Models in Compositional Relation Reasoning (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - EasyQuant: An Efficient Data-free Quantization Algorithm for LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - Design2Code: How Far Are We From Automating Front-End Engineering? (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - ChatGPT and biometrics: an assessment of face recognition, gender detection, and age estimation capabilities (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - ChatCite: LLM Agent with Human Workflow Guidance for Comparative Literature Summary (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - Benchmarking the Text-to-SQL Capability of Large Language Models: A Comprehensive Evaluation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - An Empirical Study of LLM-as-a-Judge for LLM Evaluation: Fine-tuned Judge Models are Task-specific Classifiers (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/05 - OpenAI - ChatGPT can now read responses to you. (twitter),
- 03/04 - The Claude 3 Model Family: Opus, Sonnet, Haiku (โ), (twitter), (โณ๏ธ)
- 03/04 - Wukong: Towards a Scaling Law for Large-Scale Recommendation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/04 - Large language models surpass human experts in predicting neuroscience results (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/04 - NoteLLM: A Retrievable Large Language Model for Note Recommendation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/04 - MagicClay: Sculpting Meshes With Generative Neural Fields (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 03/04 - Enhancing LLM Safety via Constrained Direct Preference Optimization (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/04 - DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/04 - CODE-ACCORD: A Corpus of Building Regulatory Data for Rule Generation towards Automatic Compliance Checking (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/04 - Balancing Enhancement, Harmlessness, and General Capabilities: Enhancing Conversational LLMs with Direct RLHF (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 03/04 - adaptMLLM: Fine-Tuning Multilingual Language Models on Low-Resource Languages with Integrated LLM Playgrounds (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/04 - ViewDiff: 3D-Consistent Image Generation with Text-to-Image Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/04 - TripoSR: Fast 3D Object Reconstruction from a Single Image (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/04 - RT-H: Action Hierarchies Using Language (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 03/04 - ResAdapter: Domain Consistent Resolution Adapter for Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 03/04 - OOTDiffusion: Outfitting Fusion based Latent Diffusion for Controllable Virtual Try-on (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 03/04 - Build AI for a Better Future (twitter), (News),
- 03/04 - AtomoVideo: High Fidelity Image-to-Video Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 03/03 - Research Papers in February 2024: A LoRA Successor, Small Finetuned LLMs Vs Generalist LLMs, and Transparent LLM Research (Blog),
- 03/03 - Nvidia CEO Jensen Huang says AI could pass most human tests in 5 years (News),
- 03/03 - MovieLLM: Enhancing Long Video Understanding with AI-Generated Movies (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 03/03 - InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 03/03 - Could this be bigger than OpenAI? Microsoft invests billions in French startup - Mistral AI is a multilingual maestro that's almost as good as ChatGPT 4 (News),
- 03/03 - 3DGStream: On-the-Fly Training of 3D Gaussians for Efficient Streaming of Photo-Realistic Free-Viewpoint Videos (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 03/02 - Nvidia CEO says AI could pass human tests in five years (News),
- 03/01 - Elon Musk sues OpenAI and CEO Sam Altman over contract breach (News)
- 03/01 - AtP*: An efficient and scalable method for localizing LLM behaviour to components (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS)
- 03/01 - VisionLLaMA: A Unified LLaMA Interface for Vision Tasks (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS)
- 03/01 - Learning and Leveraging World Models in Visual Representation Learning (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS)
- 03/01 - RealCustom: Narrowing Real Text Word for Real-Time Open-Domain Text-to-Image Customization (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS)
- 03/01 - Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS)
- 03/01 - Resonance RoPE: Improving Context Length Generalization of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 02/29 - OHTA: One-shot Hand Avatar via Data-driven Implicit Priors (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 02/29 - Retrieval-Augmented Generation for AI-Generated Content: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 02/29 - DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/29 - Panda-70M: Captioning 70M Videos with Multiple Cross-Modality Teachers (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/29 - Humanoid Locomotion as Next Token Prediction (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/29 - Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/29 - StarCoder 2 and The Stack v2: The Next Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/29 - Trajectory Consistency Distillation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 02/29 - Beyond Language Models: Byte Models are Digital World Simulators (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/29 - Syntactic Ghost: An Imperceptible General-purpose Backdoor Attacks on Pre-trained Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/29 - ViewFusion: Towards Multi-View Consistency via Interpolated Denoising (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 02/29 - MOSAIC: A Modular System for Assistive and Interactive Cooking (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS)
- 02/28 - Automatic Creative Selection with Cross-Modal Matching (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 02/28 - Priority Sampling of Large Language Models for Compilers (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/28 - Simple linear attention language models balance the recall-throughput tradeoff (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 02/28 - Approaching Human-Level Forecasting with Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/28 - Datasets for Large Language Models: A Comprehensive Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 02/28 - A Survey on Recent Advances in LLM-Based Multi-turn Dialogue Systems (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/27 - A High Level Guide to LLM Evaluation Metrics (Blog),
- 02/27 - Users Say Microsoft's AI Has Alternate Personality as Godlike AGI That Demands to Be Worshipped (News)
- 02/27 - Google DeepMind CEO on AGI, OpenAI and Beyond - MWC 2024 (News)
- 02/27 - Large Language Models(LLMs) on Tabular Data: Prediction, Generation, and Understanding -- A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/27 - The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 02/27 - Towards Optimal Learning of Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/27 - Evaluating Very Long-Term Conversational Memory of LLM Agents (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/27 - Seeing and Hearing: Open-domain Visual-Audio Generation with Diffusion Latent Aligners (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/27 - OmniACT: A Dataset and Benchmark for Enabling Multimodal Generalist Autonomous Agents for Desktop and Web (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/27 - EMO: Emote Portrait Alive -- Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/27 - Training-Free Long-Context Scaling of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 02/27 - VastGaussian: Vast 3D Gaussians for Large Scene Reconstruction (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/27 - DiffuseKronA: A Parameter Efficient Fine-tuning Method for Personalized Diffusion Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/27 - Sora Generates Videos with Stunning Geometrical Consistency (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/27 - Playground v2.5: Three Insights towards Enhancing Aesthetic Quality in Text-to-Image Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/27 - When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/27 - Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 02/27 - Video as the New Language for Real-World Decision Making (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/27 - On the Societal Impact of Open Foundation Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 02/26 - Set the Clock: Temporal Alignment of Pretrained Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 02/26 - DenseMamba: State Space Models with Dense Hidden Connection for Efficient Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 02/26 - Mistral Large is our flagship model, with top-tier reasoning capacities (News)
- 02/26 - Disentangled 3D Scene Generation with Layout Learning (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/26 - Multi-LoRA Composition for Image Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/26 - MobiLlama: Towards Accurate and Lightweight Fully Transparent GPT (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 02/26 - Do Large Language Models Latently Perform Multi-Hop Reasoning? (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/26 - Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/26 - Nemotron-4 15B Technical Report (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/26 - StructLM: Towards Building Generalist Models for Structured Knowledge Grounding (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/26 - Towards Open-ended Visual Quality Comparison (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 02/25 - ChatMusician: Understanding and Generating Music Intrinsically with LLM (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 02/25 - FuseChat: Knowledge Fusion of Chat Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 02/24 - Divide-or-Conquer? Which Part Should You Distill Your LLM? (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 02/24 - Perplexity.ai Revamps Google SEO Model For LLM Era (News)
- 02/24 - Data Interpreter: An LLM Agent For Data Science (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 2.24 - Empowering Large Language Model Agents through Action Learning (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 2.23 - MegaScale: Scaling Large Language Model Training to More Than 10,000 GPUs (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 2.23 - Seamless Human Motion Composition with Blended Positional Encodings (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 2.23 - AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 2.23 - Gen4Gen: Generative Data Pipeline for Generative Multi-Concept Composition (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 2.23 - API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 2.23 - Genie: Generative Interactive Environments (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 2.23 - GPTVQ: The Blessing of Dimensionality for LLM Quantization (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 2.23 - ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ)
- 2.22 - CLoVe: Encoding Compositional Language in Contrastive Vision-Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ), (:octocat:)
- 02/22 - Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 2.22 - Divide-or-Conquer? Which Part Should You Distill Your LLM? (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ)
- 2.22 - MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ)
- 2.22 - Watermarking Makes Language Models Radioactive (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ)
- 2.22 - AutoPrompt - prompt optimization framework (:octocat:)
- 2.22 - Announcing Stable Diffusion 3 (tweet), (blog)
- 2.22 - DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.22 - RoboScript: Code Generation for Free-Form Manipulation Tasks across Real and Simulation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.22 - LLMs with Industrial Lens: Deciphering the Challenges and Prospects -- A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.22 - Vision-Language Navigation with Embodied Intelligence: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.22 - Enhancing Robotic Manipulation with AI Feedback from Multimodal Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.22 - Do Machines and Humans Focus on Similar Code? Exploring Explainability of Large Language Models in Code Summarization (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.22 - PALO: A Polyglot Large Multimodal Model for 5B People (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.22 - GeneOH Diffusion: Towards Generalizable Hand-Object Interaction Denoising via Denoising Diffusion (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.22 - Snap Video: Scaled Spatiotemporal Transformers for Text-to-Video Synthesis (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.22 - CyberDemo: Augmenting Simulated Human Demonstration for Real-World Dexterous Manipulation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.22 - Consolidating Attention Features for Multi-view Image Editing (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.22 - OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.22 - GaussianPro: 3D Gaussian Splatting with Progressive Propagation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.22 - OmniPred: Language Models as Universal Regressors (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.22 - Subobject-level Image Tokenization (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.22 - TinyLLaVA: A Framework of Small-scale Large Multimodal Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.22 - Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.22 - MVD^2: Efficient Multiview 3D Reconstruction for Multiview Diffusion (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.22 - BeTAIL: Behavior Transformer Adversarial Imitation Learning from Human Racing Gameplay (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ)
- 2.21 - Large Language Models for Data Annotation: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ), (:octocat:)
- 2.21 - EyeTrans: Merging Human and Machine Attention for Neural Code Summarization (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - Corrective Machine Unlearning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - Towards Building Multilingual Language Model for Medicine (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.21 - Generative AI for Secure Physical Layer Communications: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - LLMs Meet Long Video: Advancing Long Video Comprehension with An Interactive Visual Adapter in LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - Linear Transformers are Versatile In-Context Learners (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - T-Stitch: Accelerating Sampling in Pre-Trained Diffusion Models with Trajectory Stitching (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.21 - LexC-Gen: Generating Data for Extremely Low-Resource Languages with Large Language Models and Bilingual Lexicons (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.21 - Beyond A*: Better Planning with Transformers via Search Dynamics Bootstrapping (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - AgentScope: A Flexible yet Robust Multi-Agent Platform (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.21 - Coercing LLMs to do and reveal (almost) anything (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - D-Flow: Differentiating through Flows for Controlled Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - SDXL-Lightning: Progressive Adversarial Diffusion Distillation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - Music Style Transfer with Time-Varying Inversion of Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - Ouroboros: Speculative Decoding with Large Model Enhanced Drafting (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - User-LLM: Efficient LLM Contextualization with User Embeddings (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - BBA: Bi-Modal Behavioral Alignment for Reasoning with Large Vision-Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.21 - ToDo: Token Downsampling for Efficient Generation of High-Resolution Images (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.20 - Healthcare Copilot: Eliciting the Power of General LLMs for Medical Consultation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.20 - BiMediX: Bilingual Medical Mixture of Experts LLM (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.20 - Me LLaMA: Foundation Large Language Models for Medical Applications (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.20 - From Cloud to Edge: Rethinking Generative AI for Low-Resource Design Challenges (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.20 - Aria Everyday Activities Dataset (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.20 - Improving Robustness for Joint Optimization of Camera Poses and Decomposed Low-Rank Tensorial Radiance Fields (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.20 - FlashTex: Fast Relightable Mesh Texturing with LightControlNet (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.20 - Video ReCap: Recursive Captioning of Hour-Long Videos (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.20 - TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.20 - A Touch, Vision, and Language Dataset for Multimodal Alignment (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.20 - How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.20 - VideoPrism: A Foundational Visual Encoder for Video Understanding (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.20 - Neural Network Diffusion (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.20 - Synthetic Data (Almost) from Scratch: Generalized Instruction Tuning for Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.20 - RealCompo: Dynamic Equilibrium between Realism and Compositionality Improves Text-to-Image Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.20 - Instruction-tuned Language Models are Better Knowledge Learners (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.20 - MVDiffusion++: A Dense High-resolution Multi-view Diffusion Model for Single or Sparse-view 3D Object Reconstruction (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.20 - The FinBen: An Holistic Financial Benchmark for Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 02/19 - Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 02/19 - ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 2.19 - Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ), (:octocat:)
- 2.19 - Uncertainty quantification in fine-tuned LLMs using LoRA ensembles (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.19 - Automatic Evaluation for Mental Health Counseling using LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.19 - FeB4RAG: Evaluating Federated Search in the Context of Retrieval Augmented Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.19 - In deep reinforcement learning, a pruned network is a good network (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.19 - Binary Opacity Grids: Capturing Fine Geometric Detail for Mesh-Based View Synthesis (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.19 - FiT: Flexible Vision Transformer for Diffusion Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.19 - AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.19 - Pushing Auto-regressive Models for 3D Shape Generation at Capacity and Scalability (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.19 - Reformatted Alignment (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.19 - DiLightNet: Fine-grained Lighting Control for Diffusion-based Image Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.18 - Search Engines Post-ChatGPT: How Generative Artificial Intelligence Could Make Search Less Reliable (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.18 - Vision-Flan: Scaling Human-Labeled Tasks in Visual Instruction Tuning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.18 - LongAgent: Scaling Language Models to 128k Context through Multi-Agent Collaboration (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.18 - Learning to Learn Faster from Human Feedback with Language Model Predictive Control (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.17 - OneBit: Towards Extremely Low-bit Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.17 - CoLLaVO: Crayon Large Language and Vision mOdel (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.16 - Orca-Math: Unlocking the potential of SLMs in Grade School Math (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ)
- 2.16 - Using Left and Right Brains Together: Towards Vision and Language Planning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.16 - Speculative Streaming: Fast LLM Inference without Auxiliary Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.16 - FinTral: A Family of GPT-4 Level Multimodal Financial Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.16 - PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.16 - RLVF: Learning from Verbal Feedback without Overgeneralization (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.16 - In Search of Needles in a 11M Haystack: Recurrent Memory Finds What LLMs Miss (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.16 - Linear Transformers with Learnable Kernel Functions are Better In-Context Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.16 - SPAR: Personalized Content-Based Recommendation via Long Engagement Attention (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.16 - LLM Comparator: Visual Analytics for Side-by-Side Evaluation of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.16 - Make a Cheap Scaling: A Self-Cascade Diffusion Model for Higher-Resolution Adaptation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.16 - Large Language Models as Zero-shot Dialogue State Tracker through Function Calling (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.16 - DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.16 - OpenAI Reaches $80 Billion Valuation In Venture Firm Deal, Report Says (Forbes news)
- 2.16 - Chain-of-Thought Reasoning Without Prompting (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.16 - OpenAI's Sam Altman Seeks US Blessing to Raise Billions for AI Chips (Bloomberg news)
- 2.15 - BioMistral: A Collection of Open-Source Pretrained Large Language Models for Medical Domains (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - Generative AI and Process Systems Engineering: The Next Frontier (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - Generative AI in the Construction Industry: A State-of-the-art Analysis (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - OpenAI Sora - Creating video from text (blog)
- 2.15 - OpenAI - Video generation models as world simulators (TR)
- 2.15 - Our next-generation model: Gemini 1.5 (blog)
- 2.15 - V-JEPA: The next step toward Yann LeCun's vision of advanced machine intelligence (AMI) (blog)
- 2.15 - Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - OpenAI blocks state-sponsored hackers from using ChatGPT (news)
- 2.15 - Chain-of-Thought Reasoning Without Prompting (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - Generative Representational Instruction Tuning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.15 - How to Train Data-Efficient LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - BitDelta: Your Fine-Tune May Only Be Worth One Bit (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.15 - OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.15 - Zero-Shot Unsupervised and Text-Based Audio Editing Using DDPM Inversion (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - Data Engineering for Scaling Language Models to 128K Context (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.15 - DreamMatcher: Appearance Matching Self-Attention for Semantically-Consistent Text-to-Image Personalization (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - GES: Generalized Exponential Splatting for Efficient Radiance Field Rendering (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - AI Hospital: Interactive Evaluation and Collaboration of LLMs as Intern Doctors for Clinical Diagnosis (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - GeoEval: Benchmark for Evaluating LLMs and Multi-Modal Models on Geometry Problem-Solving (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.15 - Rapid Adoption, Hidden Risks: The Dual Impact of Large Language Model Customization (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - Universal Manipulation Interface: In-The-Wild Robot Teaching Without In-The-Wild Robots (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - LAVE: LLM-Powered Agent Assistance and Language Augmentation for Video Editing (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - GaussianObject: Just Taking Four Images to Get A High-Quality 3D Object with Gaussian Splatting (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - Both Matter: Enhancing the Emotional Intelligence of Large Language Models without Compromising the General Intelligence (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.15 - Mind the Modality Gap: Towards a Remote Sensing Vision-Language Model via Cross-modal Alignment (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 02/14 - HaLo-NeRF: Learning Geometry-Guided Semantics for Exploring Unconstrained Photo Collections (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 2.14 - Generalization in Healthcare AI: Evaluation of a Clinical Large Language Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.14 - LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.14 - LLM Agents can Autonomously Hack Websites (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.14 - OpenAI - Disrupting malicious uses of AI by state-affiliated threat actors (blog)
- 2.14 - Attacks, Defenses and Evaluations for LLM Conversation Safety: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.14 - GPT-4's assessment of its performance in a USMLE-based case study (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.14 - Addressing cognitive bias in medical language models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.14 - CodeMind: A Framework to Challenge Large Language Models for Code Reasoning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.14 - AQA-Bench: An Interactive Benchmark for Evaluating LLMs' Sequential Reasoning Ability (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.14 - DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.14 - OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.14 - Magic-Me: Identity-Specific Video Customized Diffusion (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.14 - Premise Order Matters in Reasoning with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.14 - L3GO: Language Agents with Chain-of-3D-Thoughts for Generating Unconventional Objects (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.14 - Computing Power and the Governance of Artificial Intelligence (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.14 - Transformers Can Achieve Length Generalization But Not Robustly (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.14 - MPIrigen: MPI Code Generation through Domain-Specific Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.14 - Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 02/13 - Meta's AI Chief Yann LeCun on AGI, Open-Source, and AI Risk (News)
- 2.13 - A Survey of Generative AI for De Novo Drug Design: New Frontiers in Molecule and Protein Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.13 - A survey of recent methods for addressing AI fairness and bias in biomedicine (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.13 - GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.13 - OpenAI - Memory and new controls for ChatGPT (blog)
- 2.13 - GhostWriter: Augmenting Collaborative Human-AI Writing Experiences Through Personalization and Agency (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.13 - PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.13 - Mixtures of Experts Unlock Parameter Scaling for Deep RL (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.13 - World Model on Million-Length Video And Language With RingAttention (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.13 - Learning Continuous 3D Words for Text-to-Image Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.13 - ChatCell: Facilitating Single-Cell Analysis with Natural Language (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.13 - Graph Mamba: Towards Learning on Graphs with State Space Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.13 - IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.13 - Vision-Based Hand Gesture Customization from a Single Demonstration (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.13 - Tandem Transformers for Inference Efficient LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.13 - NeRF Analogies: Example-Based Visual Attribute Transfer for NeRFs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.13 - A Survey of Table Reasoning with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.13 - Combining Insights From Multiple Large Language Models Improves Diagnostic Accuracy (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.13 - PRompt Optimization in Multi-Step Tasks (PROMST): Integrating Human Feedback and Preference Alignment (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.12 - CyberMetric: A Benchmark Dataset for Evaluating Large Language Models Knowledge in Cybersecurity (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - The Sound of Healthcare: Improving Medical Transcription ASR Accuracy with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - Detecting the Clinical Features of Difficult-to-Treat Depression using Synthetic Data from Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - Health-LLM: Personalized Retrieval-Augmented Disease Prediction Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.12 - LoTa-Bench: Benchmarking Language-oriented Task Planners for Embodied Agents (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.12 - LLaGA: Large Language and Graph Assistant (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.12 - Mercury: An Efficiency Benchmark for LLM Code Synthesis (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - Rolling Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - BASE TTS: Lessons from building a billion-parameter Text-to-Speech model on 100K hours of data (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - OS-Copilot: Towards Generalist Computer Agents with Self-Improvement (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - Lumos : Empowering Multimodal LLMs with Scene Text Recognition (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - Making Flow-Matching-Based Zero-Shot Text-to-Speech Laugh as You Like (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - AutoMathText: Autonomous Data Selection with Language Models for Mathematical Texts (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.12 - Scaling Laws for Fine-Grained Mixture of Experts (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.12 - Policy Improvement using Language Feedback Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - Suppressing Pink Elephants with Direct Principle Feedback (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - Step-On-Feet Tuning: Scaling Self-Alignment of LLMs via Bootstrapping (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - WildfireGPT: Tailored Large Language Model for Wildfire Analysis (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - AI-Augmented Predictions: LLM Assistants Improve Human Forecasting Accuracy (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.12 - Imagining a Future of Designing with AI: Dynamic Grounding, Constructive Negotiation, and Sustainable Motivation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.11 - A Benchmark for Multi-modal Foundation Models on Low-level Vision: from Single Images to Pairs (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ), (:octocat:)
- 2.11 - The Bias of Harmful Label Associations in Vision-Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.11 - GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.11 - ODIN: Disentangled Reward Mitigates Hacking in RLHF (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.10 - Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.10 - ChemLLM: A Chemical Large Language Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.10 - A Tale of Tails: Model Collapse as a Change of Scaling Laws (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.10 - LiRank: Industrial Large Scale Ranking Models at LinkedIn (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.10 - Gemini Goes to Med School: Exploring the Capabilities of Multimodal Large Language Models on Medical Challenge Problems & Hallucinations (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.10 - REALM: RAG-Driven Enhancement of Multimodal Electronic Health Records Analysis via Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.10 - OpenFedLLM: Training Large Language Models on Decentralized Private Data via Federated Learning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.9 - History, Development, and Principles of Large Language Models-An Introductory Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.9 - Large Language Models: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.9 - Factuality of Large Language Models in the Year 2024 (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.9 - RareBench: Can LLMs Serve as Rare Diseases Specialists? (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.9 - Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.9 - HeadStudio: Text to Animatable Head Avatars with 3D Gaussian Splatting (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.9 - InternLM-Math: Open Math Large Language Models Toward Verifiable Reasoning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.9 - Keyframer: Empowering Animation Design using Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.9 - ViGoR: Improving Visual Grounding of Large Vision Language Models with Fine-Grained Reward Modeling (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.9 - MusicMagus: Zero-Shot Text-to-Music Editing via Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.9 - Model Editing with Canonical Examples (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.8 - UFO: A UI-Focused Agent for Windows OS Interaction (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.8 - SubGen: Token Generation in Sublinear Time and Memory (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - Premier-TACO: Pretraining Multitask Representation via Temporal Action-Driven Contrastive Loss (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - Animated Stickers: Bringing Stickers to Life with Video Diffusion (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - Real-World Fluid Directed Rigid Body Control via Deep Reinforcement Learning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - WebLINX: Real-World Website Navigation with Multi-Turn Dialogue (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.8 - An Interactive Agent Foundation Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - Multilingual E5 Text Embeddings: A Technical Report (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.8 - In-Context Principle Learning from Mistakes (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.8 - InstaGen: Enhancing Object Detection by Training on Synthetic Dataset (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - Memory Consolidation Enables Long-Context Video Understanding (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - Implicit Diffusion: Efficient Optimization through Stochastic Sampling (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - SpiRit-LM: Interleaved Spoken and Written Language Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - Offline Actor-Critic Reinforcement Learning Scales to Large Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - Question Aware Vision Transformer for Multimodal Reasoning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - Driving Everywhere with Large Language Model Policy Adaptation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - LLMs Among Us: Generative AI Participating in Digital Discourse (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.8 - Self-Alignment of Large Language Models via Monopolylogue-based Social Scene Simulation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - Comprehensive Assessment of Jailbreak Attacks Against LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.8 - Real-World Robot Applications of Foundation Models: A Review (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 02/08 - Biden-Harris Administration Announces First-Ever Consortium Dedicated to AI Safety (News),
- 02/07 - AISIC Working Groups (Blog),
- 02/07 - AISIC Members (Blog),
- 2.7 - Scaling Up LLM Reviews for Google Ads Content Moderation (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ)
- 2.7 - AlphaFold Meets Flow Matching for Generating Protein Ensembles (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ), (:octocat:)
- 2.7 - Large Language User Interfaces: Voice Interactive User Interfaces powered by LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.7 - Continual Learning for Large Language Models: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.7 - Large Language Models Based Fuzzing Techniques: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.7 - Advancing Explainable AI Toward Human-Like Intelligence: Forging the Path to Artificial Brain (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.7 - MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.7 - SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 2.7 - MEMORYLLM: Towards Self-Updatable Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.7 - Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.7 - λ-ECLIPSE: Multi-Concept Personalized Text-to-Image Diffusion Models by Leveraging CLIP Latent Space (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.7 - Grandmaster-Level Chess Without Search (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.7 - Direct Language Model Alignment from Online AI Feedback (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.7 - ScreenAI: A Vision-Language Model for UI and Infographics Understanding (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.7 - LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.7 - EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.7 - CodeIt: Self-Improving Language Models with Prioritized Hindsight Replay (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.7 - Hydragen: High-Throughput LLM Inference with Shared Prefixes (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.7 - Fast Timing-Conditioned Latent Audio Diffusion (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.15 - TP-Aware Dequantization (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.7 - Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.6 - ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.6 - Tag-LLM: Repurposing General-Purpose LLMs for Specialized Domains (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.6 - BiLLM: Pushing the Limit of Post-Training Quantization for LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.6 - The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.6 - Fine-Tuned Language Models Generate Stable Inorganic Materials as Text (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.6 - Self-Discover: Large Language Models Self-Compose Reasoning Structures (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.6 - Can Mamba Learn How to Learn? A Comparative Study on In-Context Learning Tasks (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.6 - EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.6 - MusicRL: Aligning Music Generation to Human Preferences (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.6 - Scaling Laws for Downstream Task Performance of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.6 - Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.6 - Multi-line AI-assisted Code Authoring (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.6 - MobileVLM V2: Faster and Stronger Baseline for Vision Language Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.6 - EscherNet: A Generative Model for Scalable View Synthesis (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.6 - CogCoM: Train Large Vision-Language Models Diving into Details through Chain of Manipulations (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.6 - Large Language Models for Time Series: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 2.6 - Assured LLM-Based Software Engineering (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.6 - Leak, Cheat, Repeat: Data Contamination and Evaluation Malpractices in Closed-Source LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 02/05 - UK AI Safety Institute: third progress report (Blog),
- 2.5 - A Survey on Effective Invocation Methods of Massive LLM Services (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.5 - Empowering Time Series Analysis with Large Language Models: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.5 - When Large Language Models Meet Vector Databases: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.5 - Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.5 - Large Language Model Distilling Medication Recommendation Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 2.5 - Professional Agents -- Evolving Large Language Models into Autonomous Experts with Human-Level Competencies (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.5 - Minds versus Machines: Rethinking Entailment Verification with Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.5 - DeAL: Decoding-time Alignment for Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.5 - Diffusion World Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.5 - DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.5 - Training-Free Consistent Text-to-Image Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.5 - Direct-a-Video: Customized Video Generation with User-Directed Camera Movement and Object Motion (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.5 - InteractiveVideo: User-Centric Controllable Video Generation with Synergistic Multimodal Instructions (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.5 - V-IRL: Grounding Virtual Intelligence in Real Life (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.5 - Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.5 - Rethinking Optimization and Architecture for Tiny Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.5 - Shortened LLaMA: A Simple Depth Pruning for Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.5 - Progress and Opportunities of Foundation Models in Bioinformatics (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 02/04 - DenseFormer: Enhancing Information Flow in Transformers via Depth Weighted Averaging (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 2.4 - A Survey on Robotics with Foundation Models: toward Embodied AI (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.4 - DiffEditor: Boosting Accuracy and Flexibility on Diffusion-based Image Editing (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.4 - Advancing Graph Representation Learning with Large Language Models: A Comprehensive Survey of Techniques (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.4 - A Survey on Data Selection for LLM Instruction Tuning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.4 - Understanding the planning of LLM agents: A survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.4 - Zero-Shot Clinical Trial Patient Matching with LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.4 - Integration of cognitive tasks into artificial general intelligence test for large models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.4 - Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 02/03 - Research Papers in Jan 2024: Model Merging, Mixtures of Experts, and Towards Smaller LLMs (Blog),
- 2.3 - Large Language Model for Table Processing: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.3 - A Survey of Large Language Models in Finance (FinLLMs) (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 2.3 - Beyond the Limits: A Survey of Techniques to Extend the Context Length in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.3 - IMUSIC: IMU-based Facial Expression Capture (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.3 - More Agents Is All You Need (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - Parametric Feature Transfer: One-shot Federated Learning with Foundation Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - Code Representation Learning At Scale (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - Specialized Language Models with Cheap Inference from Limited Domain Data (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.2 - PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 2.2 - TravelPlanner: A Benchmark for Real-World Planning with Language Agents (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - Boximator: Generating Rich and Controllable Motions for Video Synthesis (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - K-Level Reasoning with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - Nomic Embed: Training a Reproducible Long Context Text Embedder (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.2 - LiPO: Listwise Preference Optimization through Learning-to-Rank (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - A Survey on Large Language Model Hallucination via a Creativity Perspective (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - A Survey on Context-Aware Multi-Agent Systems: Techniques, Challenges and Future Directions (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - Large language models cannot replace human participants because they cannot portray identity groups (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - Faster and Lighter LLMs: A Survey on Current Challenges and Way Forward (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 2.2 - LLM-based NLG Evaluation: Current Status and Challenges (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - Exploring patient trust in clinical advice from AI-driven LLMs like ChatGPT for self-diagnosis (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - How well do LLMs cite relevant medical references? An evaluation framework and analyses (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - Peer-review-in-LLMs: Automatic Evaluation Method for LLMs in Open-environment (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 2.2 - Exploring the Limitations of Graph Reasoning in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.2 - Can MLLMs Perform Text-to-Image In-Context Learning? (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.1 - HR-MultiWOZ: A Task Oriented Dialogue (TOD) Dataset for HR LLM Agent (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.1 - When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.1 - Comparative Study of Large Language Model Architectures on Frontier (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.1 - BlackMamba: Mixture of Experts for State-Space Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.1 - Repeat After Me: Transformers are Better than State Space Models at Copying (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.1 - CroissantLLM: A Truly Bilingual French-English Language Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.1 - OLMo: Accelerating the Science of Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.1 - Can Large Language Models Understand Context? (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.1 - Efficient Exploration for LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.1 - SymbolicAI: A framework for logic-based approaches combining generative models and solvers (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.1 - AnimateLCM: Accelerating the Animation of Personalized Diffusion Models and Adapters with Decoupled Consistency Learning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.1 - Machine Unlearning for Image-to-Image Generative Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.1 - AToM: Amortized Text-to-Mesh using 2D Diffusion (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.1 - Transforming and Combining Rewards for Aligning Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 2.1 - EE-Tuning: An Economical yet Scalable Solution for Tuning Early-Exit Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 2.1 - Recent Advances in Hate Speech Moderation: Multimodality and the Role of Large Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.31 - Towards Efficient and Reliable LLM Serving: A Real-World Workload Study (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.31 - EVA-GAN: Enhanced Various Audio Generation via Scalable Generative Adversarial Networks (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.31 - Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.31 - Agile But Safe: Learning Collision-Free High-Speed Legged Locomotion (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.31 - LongAlign: A Recipe for Long Context Alignment of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.31 - Advances in 3D Generation: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.31 - ReplaceAnything3D:Text-Guided 3D Scene Editing with Compositional Neural Radiance Fields (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.31 - RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.31 - Scavenging Hyena: Distilling Transformers into Long Convolution Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.31 - CARFF: Conditional Auto-encoded Radiance Field for 3D Scene Forecasting (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.31 - Large Language Models for Mathematical Reasoning: Progresses and Challenges (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.30 - Large Language Models in Cybersecurity: State-of-the-Art (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.30 - Rethinking Interpretability in the Era of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.30 - Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.30 - Anything in Any Scene: Photorealistic Video Object Insertion (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.30 - Efficient Tool Use with Chain-of-Abstraction Reasoning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.30 - Weaver: Foundation Models for Creative Writing (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.30 - BlockFusion: Expandable 3D Scene Generation using Latent Tri-plane Extrapolation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.30 - YOLO-World: Real-Time Open-Vocabulary Object Detection (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.30 - StrokeNUWA: Tokenizing Strokes for Vector Graphic Synthesis (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.30 - Proactive Detection of Voice Cloning with Localized Watermarking (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.30 - Transfer Learning for Text Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.30 - Repositioning the Subject within Image (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.30 - OWSM v3.1: Better and Faster Open Whisper-Style Speech Models based on E-Branchformer (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.30 - Weak-to-Strong Jailbreaking on Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.30 - H2O-Danube-1.8B Technical Report (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.30 - MouSi: Poly-Visual-Expert Vision-Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.30 - T3: Transparent Tracking & Triggering for Fine-grained Overlap of Compute & Collectives (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.30 - Towards Generating Executable Metamorphic Relations Using Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.30 - A Preliminary Study on Using Large Language Models in Software Pentesting (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.30 - Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.30 - Can Large Language Models be Trusted for Evaluation? Scalable Meta-Evaluation of LLMs as Evaluators via Agent Debate (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 01/29 - DressCode: Autoregressively Sewing and Generating Garments from Text Guidance (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS)
- 1.29 - Beyond Direct Diagnosis: LLM-based Multi-Specialist Agent Consultation for Automatic Diagnosis (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.29 - MT-Eval: A Multi-Turn Capabilities Evaluation Benchmark for Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.29 - OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.29 - High-Quality Image Restoration Following Human Instructions (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.29 - ReGAL: Refactoring Programs to Discover Generalizable Abstractions (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.29 - InternLM-XComposer2: Mastering Free-form Text-Image Composition and Comprehension in Vision-Language Large Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.29 - MoE-LLaVA: Mixture of Experts for Large Vision-Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.29 - Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.29 - Motion-I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.29 - StableIdentity: Inserting Anybody into Anywhere at First Sight (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.29 - SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.29 - Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.29 - Overcoming the Pitfalls of Vision-Language Model Finetuning for OOD Generalization (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.29 - Security and Privacy Challenges of Large Language Models: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.29 - Development and Testing of a Novel Large Language Model-Based Clinical Decision Support Systems for Medication Safety in 12 Clinical Specialties (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.29 - Development and Testing of Retrieval Augmented Generation in Large Language Models -- A Case Study Report (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.28 - Evaluating LLM -- Generated Multimodal Diagnosis from Medical Images and Symptom Analysis (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.28 - Media2Face: Co-speech Facial Animation Generation With Multi-Modality Guidance (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.28 - Divide and Conquer: Language Models can Plan and Self-Correct for Compositional Text-to-Image Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.28 - Object-Driven One-Shot Fine-tuning of Text-to-Image Diffusion with Prototypical Embedding (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.28 - Comuniqa : Exploring Large Language Models for improving speaking skills (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.28 - Evaluating Gender Bias in Large Language Models via Chain-of-Thought Prompting (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.27 - A Survey on Data Augmentation in Large Model Era (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.26 - SliceGPT: Compress Large Language Models by Deleting Rows and Columns (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.26 - From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.26 - Learning Universal Predictors (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.26 - EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.26 - Taiyi-Diffusion-XL: Advancing Bilingual Text-to-Image Generation with Large Vision-Language Model Support (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.26 - Generative Expressive Robot Behaviors using Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.26 - TIP-Editor: An Accurate 3D Editor Following Both Text-Prompts And Image-Prompts (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.26 - Scientific Large Language Models: A Survey on Biological & Chemical Domains (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.26 - F-Eval: Asssessing Fundamental Abilities with Refined Evaluation Methods (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 01/25 - Towards 3D Molecule-Text Interpretation in Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 01/25 - Unifying Large Language Models and Knowledge Graphs: A Roadmap (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 1.25 - Enhancing Diagnostic Accuracy through Multi-Agent Conversations: Using Large Language Models to Mitigate Cognitive Bias (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.25 - K-QA: A Real-World Medical Q&A Benchmark (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.25 - How Can Large Language Models Understand Spatial-Temporal Data? (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.25 - Leeroo Orchestrator: Elevating LLMs Performance Through Model Integration (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.25 - LLM on FHIR -- Demystifying Health Records (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.25 - DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.25 - Rethinking Patch Dependence for Masked Autoencoders (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.25 - WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.25 - Unitxt: Flexible, Shareable and Reusable Data Preparation and Evaluation for Generative AI (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.25 - Deconstructing Denoising Diffusion Models for Self-Supervised Learning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.25 - FP6-LLM: Efficiently Serving Large Language Models Through FP6-Centric Algorithm-System Co-Design (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.25 - Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.25 - BootPIG: Bootstrapping Zero-shot Personalized Image Generation Capabilities in Pretrained Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.25 - Sketch2NeRF: Multi-view Sketch-guided Text-to-3D Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.25 - Adaptive Mobile Manipulation for Articulated Objects In the Open World (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.25 - pix2gestalt: Amodal Segmentation by Synthesizing Wholes (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.25 - CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.25 - Genie: Achieving Human Parity in Content-Grounded Datasets Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.24 - Diffuse to Choose: Enriching Image Conditioned Inpainting in Latent Diffusion Models for Virtual Try-All (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.24 - Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.24 - MambaByte: Token-free Selective State Space Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.24 - MM-LLMs: Recent Advances in MultiModal Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.24 - MaLA-500: Massive Language Adaptation of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.24 - UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.24 - SpacTor-T5: Pre-training T5 Models with Span Corruption and Replaced Token Detection (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.24 - ConTextual: Evaluating Context-Sensitive Text-Rich Visual Reasoning in Large Multimodal Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.24 - Evaluation of General Large Language Models in Contextually Assessing Semantic Concepts Extracted from Adult Critical Care Electronic Health Record Notes (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 01/23 - Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 1.23 - From Understanding to Utilization: A Survey on Explainability for Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.23 - Unsocial Intelligence: a Pluralistic, Democratic, and Participatory Investigation of AGI Discourse (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.23 - AgentBoard: An Analytical Evaluation Board of Multi-turn LLM Agents (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.23 - HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.23 - Benchmarking LLMs via Uncertainty Quantification (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.23 - Evaluation of large language models for assessing code maintainability (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.23 - AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.23 - Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.23 - GALA: Generating Animatable Layered Assets from a Single Scan (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.23 - Lumiere: A Space-Time Diffusion Model for Video Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.23 - Small Language Model Meets with Reinforced Vision Vocabulary (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.23 - Meta-Prompting: Enhancing Language Models with Task-Agnostic Scaffolding (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.23 - BiTA: Bi-Directional Tuning for Lossless Acceleration in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.23 - Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.23 - Red Teaming Visual Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.23 - CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.23 - How to Fine-Tune LLMs in 2024 with Hugging Face (Blog)
- 1.22 - CMMMU: A Chinese Massive Multi-discipline Multimodal Understanding Benchmark (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.22 - Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.22 - EmerDiff: Emerging Pixel-level Semantic Knowledge in Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.22 - SpatialVLM: Endowing Vision-Language Models with Spatial Reasoning Capabilities (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.22 - DITTO: Diffusion Inference-Time T-Optimization for Music Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.22 - Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.22 - WARM: On the Benefits of Weight Averaged Reward Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.22 - OK-Robot: What Really Matters in Integrating Open-Knowledge Models for Robotics (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.22 - Single-View 3D Human Digitalization with Large Reconstruction Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.22 - Scaling Face Interaction Graph Networks to Real World Scenes (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.22 - AI for social science and social science of AI: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.21 - Revolutionizing Finance with LLMs: An Overview of Applications and Insights (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.21 - Large Language Model based Multi-Agents: A Survey of Progress and Challenges (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.20 - Fast Registration of Photorealistic Avatars for VR Facial Animation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.20 - StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.20 - UltrAvatar: A Realistic Animatable 3D Avatar Diffusion Model with Authenticity Guided Textures (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.20 - Make-A-Shape: a Ten-Million-scale 3D Shape Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.20 - Orion-14B: Open-source Multilingual Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.20 - Large-scale Reinforcement Learning for Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.20 - Papers and resources for LLMs evaluation (:octocat:)
- 1.20 - Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.20 - Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.20 - ActAnywhere: Subject-Aware Video Background Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.20 - Rambler: Supporting Writing With Speech via LLM-Assisted Gist Manipulation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.20 - Synthesizing Moving People with 3D Control (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.20 - Understanding Video Transformers via Universal Concept Discovery (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.19 - Inflation with Diffusion: Efficient Temporal Adaptation for Text-to-Video Super-Resolution (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.19 - Meta is developing open source AGI, says Zuckerberg (News)
- 1.19 - Mark Zuckerberg's new goal is creating artificial general intelligence (News)
- 1.19 - DiffusionGPT: LLM-Driven Text-to-Image Generation System (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.19 - ChatQA: Building GPT-4 Level Conversational QA Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.19 - VMamba: Visual State Space Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.19 - SHINOBI: Shape and Illumination using Neural Object Decomposition via BRDF Optimization In-the-wild (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 01/18 - Veagle: Advancements in Multimodal Representation Learning (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 01/18 - WHO - Ethics and governance of artificial intelligence for health: Guidance on large multi-modal models (News),
- 01/18 - Understanding Liability Risk from Using Health Care Artificial Intelligence Tools (โ)
- 1.18 - R-Judge: Benchmarking Safety Risk Awareness for LLM Agents (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ), (:octocat:)
- 1.18 - A Survey on Hardware Accelerators for Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.18 - Evolutionary Computation in the Era of Large Language Model: Survey and Roadmap (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (SS)
- 1.18 - RAP-SAM: Towards Real-Time All-Purpose Segment Anything (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.18 - Self-Rewarding Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.18 - WorldDreamer: Towards General World Models for Video Generation via Predicting Masked Tokens (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.18 - Improving fine-grained understanding in image-text pre-training (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.18 - FreGrad: Lightweight and Fast Frequency-aware Diffusion Vocoder (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.18 - CustomVideo: Customizing Text-to-Video Generation with Multiple Subjects (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.18 - Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.18 - SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.18 - GARField: Group Anything with Radiance Fields (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.18 - TextureDreamer: Image-guided Texture Synthesis through Geometry-aware Diffusion (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.17 - State-of-the-art Code Generation with AlphaCodium - From Prompt Engineering to Flow Engineering (blog)
- 1.17 - Scalable Pre-training of Large Autoregressive Image Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.17 - Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.17 - Understanding the concerns and choices of public when using large language models for healthcare (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.17 - ClimateGPT: Towards AI Synthesizing Interdisciplinary Research on Climate Change (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.17 - Foundations of Vector Retrieval (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.17 - ReFT: Reasoning with Reinforced Fine-Tuning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.17 - UniVG: Towards UNIfied-modal Video Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.17 - Asynchronous Local-SGD Training for Language Modeling (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.17 - VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.17 - SiT: Exploring Flow and Diffusion-based Generative Models with Scalable Interpolant Transformers (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.17 - Compose and Conquer: Diffusion-Based 3D Depth Aware Composable Image Synthesis (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.16 - Towards A Better Metric for Text-to-Video Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.16 - Segment Anything Model Can Not Segment Anything: Assessing AI Foundation Model's Generalizability in Permafrost Mapping (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.16 - Understanding User Experience in Large Language Model Interactions (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.16 - Tuning Language Models by Proxy (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.16 - A Survey of Resource-efficient LLM and Multimodal Foundation Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.16 - Meta releases MAGNeT for text2music (tweet)
- 1.16 - Here's how OpenAI plans to address election misinformation on ChatGPT and Dall-E (news)
- 1.16 - ChatGPT will have video functionality and more accuracy in future versions - Sam Altman says GPT-5 will be a big improvement (news)
- 1.15 - InstantID: Zero-shot Identity-Preserving Generation in Seconds (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.15 - HexaGen3D: StableDiffusion is just one step away from Fast and Diverse Text-to-3D Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.15 - The What, Why, and How of Context Length Extension Techniques in Large Language Models -- A Detailed Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.15 - GPT Store users breaking rules with 'girlfriend' bots (news)
- 1.15 - ChatGPT for Self-Diagnosis: AI Is Changing the Way We Answer Our Own Health Questions (CNET news)
- 1.14 - Anthropic researchers find that AI models can be trained to deceive (TechCrunch news)
- 1.13 - E^2-LLM: Efficient and Extreme Length Extension of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.13 - Quantum Denoising Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.13 - Extending LLMs' Context Window with 100 Samples (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.13 - Leveraging Large Language Models for NLG Evaluation: A Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.12 - A Survey on the Applications of Frontier AI, Foundation Models, and Large Language Models to Intelligent Transportation Systems (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.12 - Business and ethical concerns in domestic Conversational Generative AI-empowered multi-robot systems (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.12 - AI girlfriend bots are already flooding OpenAI's GPT store (news)
- 1.12 - Intention Analysis Prompting Makes Large Language Models A Good Jailbreak Defender (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.12 - DevEval: Evaluating Code Generation in Practical Software Projects (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.12 - How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.12 - OpenAI Quietly Deletes Ban on Using ChatGPT for "Military and Warfare" (news)
- 1.12 - Google AI has better bedside manner than human doctors - and makes better diagnoses (Nature doi: https://doi.org/10.1038/d41586-024-00099-4)
- 1.12 - OpenChat: Advancing Open-source Language Models with Mixed-Quality Data (tweet), (demo), (:octocat:)
- 1.12 - AMIE: A research AI system for diagnostic medical reasoning and conversations (tweet), (blog)
- 1.12 - PALP: Prompt Aligned Personalization of Text-to-Image Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.12 - Transformers are Multi-State RNNs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.12 - TOFU: A Task of Fictitious Unlearning for LLMs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.12 - Patchscope: A Unifying Framework for Inspecting Hidden Representations of Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.12 - DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.12 - TRIPS: Trilinear Point Splatting for Real-Time Radiance Field Rendering (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.12 - Distilling Vision-Language Models on Millions of Videos (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.12 - Secrets of RLHF in Large Language Models Part II: Reward Modeling (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.12 - LEGO: Language Enhanced Multi-modal Grounding Model (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.11 - Risk Taxonomy, Mitigation, and Assessment Benchmarks of Large Language Model Systems (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS)
- 1.11 - Uncertainty Awareness of Large Language Models Under Code Distribution Shifts: A Benchmark Study (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.11 - A Universal Knowledge Model and Cognitive Architecture for Prototyping AGI (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.11 - OpenAI debuts ChatGPT subscription aimed at small teams (TechCrunch news)
- 1.11 - OpenAI Signs Up 260 Businesses for Corporate Version of ChatGPT (Bloomberg news), (archive)
- 1.11 - The Benefits of a Concise Chain of Thought on Problem-Solving in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.11 - Surgical-DINO: Adapter Learning of Foundation Model for Depth Estimation in Endoscopic Surgery (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.11 - Seven Failure Points When Engineering a Retrieval Augmented Generation System (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.11 - Can large language models identify and correct their mistakes? (blog)
- 1.11 - PIXART-δ: Fast and Controllable Image Generation with Latent Consistency Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.11 - InseRF: Text-Driven Generative Object Insertion in Neural 3D Scenes (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.11 - URHand: Universal Relightable Hands (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.11 - ANIM-400K: A Large-Scale Dataset for Automated End-To-End Dubbing of Video (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.11 - Score Distillation Sampling with Learned Manifold Corrective (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.11 - Parrot: Pareto-optimal Multi-Reward Reinforcement Learning Framework for Text-to-Image Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.11 - Object-Centric Diffusion for Efficient Video Editing (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.11 - Diffusion Priors for Dynamic View Synthesis from Monocular Videos (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.11 - A Shocking Amount of the Web is Machine Translated: Insights from Multi-Way Parallelism (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.11 - Towards Conversational Diagnostic AI (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.11 - Tuning LLMs with Contrastive Alignment Instructions for Machine Translation in Unseen, Low-resource Languages (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.11 - Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.10 - Exploring the Reasoning Abilities of Multimodal Large Language Models (MLLMs): A Comprehensive Survey on Emerging Trends in Multimodal Reasoning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.10 - OpenAI's GPT Store Now Offers a Selection of 3 Million Custom AI Bots (news)
- 1.10 - VLP: Vision Language Planning for Autonomous Driving (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.10 - Personal LLM Agents: Insights and Survey about the Capability, Efficiency and Security (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.10 - Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.10 - Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.10 - The Impact of Reasoning Step Length on Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.10 - Bootstrapping LLM-based Task-Oriented Dialogue Agents via Self-Talk (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.10 - Introducing the GPT Store (blog)
- 1.10 - TrustLLM: Trustworthiness in Large Language Models (project), (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 1.10 - Jump Cut Smoothing for Talking Heads (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.9 - DeepSpeed-FastGen: High-throughput Text Generation for LLMs via MII and DeepSpeed-Inference (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.9 - Masked Audio Generation using a Single Non-Autoregressive Transformer (project), (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.9 - Let's Go Shopping (LGS) -- Web-Scale Image-Text Dataset for Visual Concept Understanding (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.9 - Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.9 - FADI-AEC: Fast Score Based Diffusion Model Guided by Far-end Signal for Acoustic Echo Cancellation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.9 - New York Times-ChatGPT lawsuit poses new legal threats to artificial intelligence (news)
- 1.9 - GPT-Pilot: Dev tool that writes scalable apps from scratch while the developer oversees the implementation (:octocat:)
- 1.9 - MagicVideo-V2: Multi-Stage High-Aesthetic Video Generation (project), (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.9 - AGG: Amortized Generative 3D Gaussians for Single Image to 3D (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.9 - MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.9 - GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 01/08 - PythonSaga: Redefining the Benchmark to Evaluate Code Generating LLM (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 1.8 - Volkswagen brings ChatGPT into compact cars (news)
- 1.8 - OpenAI - OpenAI and journalism (blog)
- 1.8 - FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGAs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.8 - TeleChat Technical Report (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.8 - From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations (:octocat:)
- 1.8 - A Complete List of ArXiv Papers on Alignment, Safety, and Security of Large Language Models (LLMs) (list)
- 1.8 - Mixtral of Experts (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.7 - Agent AI: Surveying the Horizons of Multimodal Interaction (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ) , (:octocat:)
- 1.7 - Exploring Large Language Model based Intelligent Agents: Definitions, Methods, and Prospects (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.7 - DiarizationLM: Speaker Diarization Post-Processing with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 1.7 - Soaring from 4K to 400K: Extending LLM's Context with Activation Beacon (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 1.7 - How To Make Money In 2024 Using ChatGPT's GPT Store (Forbes news)
- 1.6 - CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.6 - Denoising Vision Transformers (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.6 - Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 1.6 - Microsoft Phi-2 model changes licence to MIT (news)
- 1.6 - Microsoft, OpenAI sued for copyright infringement by nonfiction book authors in class action claim (CNBC news)
- 1.5 - Levels of AGI: Operationalizing Progress on the Path to AGI (v2) (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (SS)
- 1.5 - CoCoT: Contrastive Chain-of-Thought Prompting for Large Multimodal Models with Multiple Image Inputs (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.5 - Latte: Latent Diffusion Transformer for Video Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.5 - From LLM to Conversational Agent: A Memory Enhanced Architecture with Fine-Tuning of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.5 - DeepSeek LLM: Scaling Open-Source Language Models with Longtermism (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.5 - AST-T5: Structure-Aware Pretraining for Code Generation and Understanding (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 1.5 - Thousands of AI Authors on the Future of AI (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.5 - MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.5 - DocGraphLM: Documental Graph Language Model for Information Extraction (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.5 - Progressive Knowledge Distillation Of Stable Diffusion XL Using Layer Level Loss (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.5 - Pheme: Efficient and Conversational Speech Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.5 - Infinite-LLM: Efficient LLM Service for Long Context with DistAttention and Distributed KVCache (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.5 - The GPT store will launch next week (blog)
- 1.5 - TinyLlama: An Open-Source Small Language Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 1.5 - LLaMA Pro: Progressive LLaMA with Block Expansion (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 1.5 - What You See is What You GAN: Rendering Every Pixel for High-Fidelity Geometry in 3D GANs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.5 - Learning the 3D Fauna of the Web (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.5 - ODIN: A Single Model for 2D and 3D Perception (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.5 - LLaVA-ϕ: Efficient Multi-Modal Assistant with Small Language Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.4 - Blending Is All You Need: Cheaper, Better Alternative to Trillion-Parameters LLM (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.4 - Microsoft adding new PC button in its first significant keyboard change in decades (CNBC news)
- 1.4 - LLM Augmented LLMs: Expanding Capabilities through Composition (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.4 - Correctness Comparison of ChatGPT-4, Bard, Claude-2, and Copilot for Spatial Tasks (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.4 - Understanding LLMs: A Comprehensive Overview from Training to Inference (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.4 - Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.4 - Instruct-Imagen: Image Generation with Multi-modal Instruction (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.4 - ICE-GRT: Instruction Context Enhancement by Generative Reinforcement based Transformers (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.4 - Improving Diffusion-Based Image Synthesis with Context Prediction (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.4 - Towards Truly Zero-shot Compositional Visual Reasoning with LLMs as Programmers (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.4 - FMGS: Foundation Model Embedded 3D Gaussian Splatting for Holistic 3D Scene Understanding (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.4 - aMUSEd: An Open MUSE Reproduction (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.4 - From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 1.4 - Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 1.4 - CoMoSVC: Consistency Model-based Singing Voice Conversion (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.4 - A Vision Check-up for Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.4 - Multilingual Instruction Tuning With Just a Pinch of Multilinguality (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.3 - Exploring the Frontiers of LLMs in Psychological Applications: A Comprehensive Review (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 1.3 - Has Your Pretrained Model Improved? A Multi-head Posterior Based Approach (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.3 - GPT-4V(ision) is a Generalist Web Agent, if Grounded (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.3 - Image Sculpting: Precise Object Editing with 3D Geometry Control (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.3 - SIGNeRF: Scene Integrated Generation for Neural Radiance Fields (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.3 - Incremental FastPitch: Chunk-based High Quality Text to Speech (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.3 - Efficient Hybrid Zoom using Camera Fusion on Mobile Phones (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.3 - WordArt Designer API: User-Driven Artistic Typography Synthesis with Large Language Models on ModelScope (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.3 - Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.3 - LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.3 - VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.3 - A Comprehensive Study of Knowledge Editing for Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 1.3 - Few-shot Adaptation of Multi-modal Foundation Models: A Survey (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.3 - Can AI Be as Creative as Humans? (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.3 - Enhancing the medical foundation model with multi-scale and cross-modality feature learning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.2 - Diagnostic Accuracy of a Large Language Model in Pediatric Case Studies (JAMA doi:10.1001/jamapediatrics.2023.5750)
- 1.2 - An Overarching Framework for the Ethics of Artificial Intelligence in Pediatrics (JAMA doi:10.1001/jamapediatrics.2023.5761)
- 1.2 - A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.2 - LLaMA Beyond English: An Empirical Study on Language Capability Transfer (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.2 - Boundary Attention: Learning to Find Faint Boundaries at Any Resolution (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.2 - Q-Refine: A Perceptual Quality Refiner for AI-Generated Image (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.2 - En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.2 - Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 1.2 - COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.1 - A Computational Framework for Behavioral Assessment of LLM Therapists (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 1.1 - DocLLM: A layout-aware generative language model for multimodal document understanding (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.1 - Taming Mode Collapse in Score Distillation for Text-to-3D Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.1 - The Earth is Flat? Unveiling Factual Errors in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.1 - General-purpose foundation models for increased autonomy in robot-assisted surgery (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.1 - PROMPT-IML: Image Manipulation Localization with Pre-trained Foundation Models Through Prompt Tuning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.1 - Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 1.1 - SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
2023
- 12.31 - Opening A Pandora's Box: Things You Should Know in the Era of Custom GPTs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.31 - State of Open Source AI Book - 2023 Edition (book)
- 12.31 - TrailBlazer: Trajectory Control for Diffusion-Based Video Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.31 - Improving Text Embeddings with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.31 - Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.31 - GraphGPT: Graph Learning with Generative Pre-trained Transformers (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.31 - Brain-Conditional Multimodal Synthesis: A Survey and Taxonomy (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.31 - GeoGalactica: A Scientific Large Language Model in Geoscience (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12/30 - Ten Noteworthy AI Research Papers of 2023 (Blog),
- 12.30 - Red Teaming for Large Language Models At Scale: Tackling Hallucinations on Mathematics Tasks (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 12.30 - Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.30 - Boosting Large Language Model for Speech Synthesis: An Empirical Study (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.30 - Unicron: Economizing Self-Healing LLM Training at Scale (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.30 - Autonomous Threat Hunting: A Future Paradigm for AI-Driven Threat Intelligence (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.30 - Promoting Segment Anything Model towards Highly Accurate Dichotomous Image Segmentation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.30 - USFM: A Universal Ultrasound Foundation Model Generalized to Tasks and Organs towards Label Efficient Image Analysis (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.30 - Pushing Boundaries: Exploring Zero Shot Object Classification with Large Multimodal Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.30 - FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.29 - DB-GPT: Empowering Database Interactions with Private Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 12.29 - A foundation model for atomistic materials chemistry (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.29 - Self-supervised Pretraining for Decision Foundation Model: Formulation, Pipeline and Challenges (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.29 - EHR Interaction Between Patients and AI: NoteAid EHR Interaction (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.29 - Learning Vision from Models Rivals Learning Vision from Data (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.29 - Generative AI for Math: Part I -- MathPile: A Billion-Token-Scale Pretraining Corpus for Math (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.29 - Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision, Language, Audio, and Action (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.29 - Unsupervised Universal Image Segmentation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.29 - DreamGaussian4D: Generative 4D Gaussian Splatting (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.29 - InsActor: Instruction-driven Physics-based Characters (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.29 - The LLM Surgeon (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.29 - Compact Neural Graphics Primitives with Learned Hash Probing (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.29 - Restoration by Generation with Constrained Priors (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.28 - Fast Inference of Mixture-of-Experts Language Models with Offloading (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.28 - MobileVLM: A Fast, Reproducible and Strong Vision Language Assistant for Mobile Devices (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.28 - I2V-Adapter: A General Image-to-Video Adapter for Video Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.28 - Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.28 - DiffusionGAN3D: Boosting Text-guided 3D Generation and Domain Adaption by Combining 3D GANs and Diffusion Priors (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.28 - Prompt Expansion for Adaptive Text-to-Image Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.28 - TinyGPT-V: Efficient Multimodal Large Language Model via Small Backbones (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.28 - Segment3D: Learning Fine-Grained Class-Agnostic 3D Segmentation without Manual Labels (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.28 - Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.27 - New York Times sues Microsoft and OpenAI for 'billions' (BBC news)
- 12.27 - PanGu-π: Enhancing Language Model Architectures via Nonlinearity Compensation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.27 - City-on-Web: Real-time Neural Rendering of Large-scale Scenes on the Web (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.27 - PanGu-Draw: Advancing Resource-Efficient Text-to-Image Synthesis with Time-Decoupled Training and Reusable Coop-Diffusion (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.27 - Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4 (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.27 - LangSplat: 3D Language Gaussian Splatting (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.27 - One-dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.26 - DL3DV-10K: A Large-Scale Scene Dataset for Deep Learning-based 3D Vision (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.26 - SSR-Encoder: Encoding Selective Subject Representation for Subject-Driven Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.26 - Audiobox: Unified Audio Generation with Natural Language Prompts (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.26 - A Recipe for Scaling up Text-to-Video Generation with Text-free Videos (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.26 - HarmonyView: Harmonizing Consistency and Diversity in One-Image-to-3D (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.26 - Supervised Knowledge Makes Large Language Models Better In-context Learners (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.25 - UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.24 - Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.24 - ChatGPT 3.5 fails to write appropriate multiple choice practice exam questions (paper), (PDF)
- 12.24 - LARP: Language-Agent Role Play for Open-World Games (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.24 - Make-A-Character: High Quality Text-to-3D Character Generation within Minutes (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.23 - SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.23 - Gemini vs GPT-4V: A Preliminary Comparison and Combination of Vision-Language Models Through Qualitative Cases (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.23 - Human101: Training 100+FPS Human Gaussians in 100s from 1 View (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.23 - On the Promises and Challenges of Multimodal Foundation Models for Geographical, Environmental, Agricultural, and Urban Planning Applications (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.22 - Awesome LLM Interpretability - A curated list of Large Language Model (LLM) Interpretability resources (:octocat:)
- 12.22 - HeadCraft: Modeling High-Detail Shape Variations for Animated 3DMMs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.22 - Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.22 - Paint3D: Paint Anything 3D with Lighting-Less Texture Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.22 - HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.22 - PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.22 - VideoPoet: A Large Language Model for Zero-Shot Video Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.21 - Exploring the intersection of Generative AI and Software Development (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 12.21 - Generative Multimodal Models are In-Context Learners (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.21 - Zero-Shot Metric Depth with a Field-of-View Conditioned Diffusion Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.21 - Splatter Image: Ultra-Fast Single-View 3D Reconstruction (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.21 - UniSDF: Unifying Neural Representations for High-Fidelity 3D Reconstruction of Complex Scenes with Reflections (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.21 - SpecNeRF: Gaussian Directional Encoding for Specular Reflections (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.21 - Repaint123: Fast and High-quality One Image to 3D Generation with Progressive Controllable 2D Repainting (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.21 - Neural feels with neural fields: Visuo-tactile perception for in-hand manipulation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.21 - DreamTuner: Single Image is Enough for Subject-Driven Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.21 - TinySAM: Pushing the Envelope for Efficient Segment Anything Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.21 - Align Your Gaussians: Text-to-4D with Dynamic 3D Gaussians and Composed Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.21 - ShowRoom3D: Text to High-Quality 3D Room Generation Using 3D Priors (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.21 - AppAgent: Multimodal Agents as Smartphone Users (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.21 - DREAM-Talk: Diffusion-based Realistic Emotional Audio-driven Method for Single Image Talking Face Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.21 - Time is Encoded in the Weights of Finetuned Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.21 - The alpha version of Midjourney V6 is open for testing (tweet)
- 12.21 - InternVL: Scaling up Vision Foundation Models and Aligning for Generic Visual-Linguistic Tasks (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 12.20 - RadEdit: stress-testing biomedical vision models via diffusion image editing (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.20 - Autonomous chemical research with large language models (Nature https://doi.org/10.1038/s41586-023-06792-0)
- 12.20 - Jack of All Tasks, Master of Many: Designing General-purpose Coarse-to-Fine Vision-Language Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.20 - InstructVideo: Instructing Video Diffusion Models with Human Feedback (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.20 - Cached Transformers: Improving Transformers with Differentiable Memory Cache (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.20 - Adaptive Guidance: Training-free Acceleration of Conditional Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.20 - Mini-GPTs: Efficient Large Language Models through Contextual Pruning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.20 - Model-Based Control with Sparse Neural Dynamics (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.20 - StreamDiffusion: A Pipeline-level Solution for Real-time Interactive Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.20 - Unlocking Pre-trained Image Backbones for Semantic Image Synthesis (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.20 - Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.20 - Stable Video Diffusion Now Available on Stability AI Developer Platform API (blog)
- 12.20 - How to Use ChatGPT to Set Transformative Goals for 2024: Use ChatGPT to simplify -- and enhance -- your goal-setting process (news)
- 12.20 - A Challenger to GPT-4V? Early Explorations of Gemini in Visual Expertise (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.20 - Tracking Any Object Amodally (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.19 - Efficient LLM inference solution on Intel GPU (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 12.19 - 3D-LFM: Lifting Foundation Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.19 - HAAR: Text-Conditioned Generative Model of 3D Strand-based Human Hairstyles (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.19 - MixRT: Mixed Neural Representations For Real-Time NeRF Rendering (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.19 - Text-Conditioned Resampler For Long Form Video Understanding (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.19 - TIP: Text-Driven Image Processing with Semantic and Restoration Instructions (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.19 - SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.19 - G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.19 - GAvatar: Animatable 3D Gaussian Avatars with Implicit Mesh Learning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.19 - MAG-Edit: Localized Image Editing in Complex Scenarios via Mask-Based Attention-Adjusted Guidance (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.19 - Cascade Speculative Drafting for Even Faster LLM Inference (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.19 - VolumeDiffusion: Flexible Text-to-3D Generation with Efficient Volumetric Encoder (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.19 - MaskINT: Video Editing via Interpolative Non-autoregressive Masked Transformers (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.19 - Parameter-Efficient Fine-Tuning Methods for Pretrained Language Models: A Critical Review and Assessment (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.19 - Designing Guiding Principles for NLP for Healthcare: A Case Study of Maternal Health (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.19 - These scientists aren't using ChatGPT – here's why (Nature doi: https://doi.org/10.1038/d41586-023-04071-6)
- 12.19 - Gemini: A Family of Highly Capable Multimodal Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.19 - Urban Generative Intelligence (UGI): A Foundational Platform for Agents in Embodied City Environment (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.18 - A Comprehensive Survey of Attack Techniques, Implementation, and Mitigation Strategies in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.18 - Retrieval-Augmented Generation for Large Language Models: A Survey (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (SS)
- 12.18 - 2023, year of open LLMs (blog)
- 12.18 - From Google Gemini to OpenAI Q* (Q-Star): A Survey of Reshaping the Generative Artificial Intelligence (AI) Research Landscape (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.18 - OpenAI Preparedness (blog) - (framework)
- 12.18 - An In-depth Look at Gemini's Language Abilities (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.18 - M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.18 - MagicScroll: Nontypical Aspect-Ratio Image Generation for Visual Storytelling via Multi-Layered Semantic-Aware Denoising (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.18 - Your Student is Better Than Expected: Adaptive Teacher-Student Collaboration for Text-Conditional Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.17 - VecFusion: Vector Font Generation with Diffusion (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.17 - Silkie: Preference Distillation for Large Visual Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.17 - StarVector: Generating Scalable Vector Graphics Code from Images (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.17 - Paloma: A Benchmark for Evaluating Language Model Fit (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.17 - A Survey of Reasoning with Foundation Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (SS)
- 12.17 - VidToMe: Video Token Merging for Zero-Shot Video Editing (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.16 - Rich Human Feedback for Text-to-Image Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.16 - ProTIP: Progressive Tool Retrieval Improves Planning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.16 - Amphion: An Open-Source Audio, Music and Speech Generation Toolkit (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.16 - ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.16 - Faithful Persona-based Conversational Dataset Generation with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.16 - SlimmeRF: Slimmable Radiance Fields (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.16 - Customize-It-3D: High-Quality 3D Creation from A Single Image Using Subject-Specific Knowledge Prior (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.16 - Topic-VQ-VAE: Leveraging Latent Codebooks for Flexible Topic-Guided Document Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.16 - FastSR-NeRF: Improving NeRF Efficiency on Consumer Devices with A Simple Super-Resolution Pipeline (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.16 - PowerInfer: Fast Large Language Model Serving with a Consumer-grade GPU (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.16 - Challenges with unsupervised LLM knowledge discovery (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.16 - Catwalk: A Unified Language Model Evaluation Framework for Many Datasets (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (SS)
- 12.16 - Point Transformer V3: Simpler, Faster, Stronger (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.15 - Osprey: Pixel Understanding with Visual Instruction Tuning (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 12.15 - Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.15 - Generative Artificial Intelligence and the Creative Economy Staff Report: Perspectives and Takeaways (report), (PDF)
- 12.15 - Weight subcloning: direct initialization of transformers using larger pretrained ones (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.15 - Faster Diffusion: Rethinking the Role of UNet Encoder in Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.15 - Self-Evaluation Improves Selective Generation in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.15 - Extending Context Window of Large Language Models via Semantic Compression (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.15 - Stable Score Distillation for High-Quality 3D Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.15 - Towards the Unification of Generative and Discriminative Visual Foundation Model: A Survey (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.15 - MobileSAMv2: Faster Segment Anything to Everything (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.15 - DreamTalk: When Expressive Talking Head Generation Meets Diffusion Probabilistic Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.15 - LoRAMoE: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.15 - TinyGSM: achieving >80% on GSM8k with small language models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.15 - VideoLCM: Video Latent Consistency Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.15 - LIME: Localized Image Editing via Attention Regularization in Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.15 - Mosaic-SDF for 3D Generative Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.15 - FineControlNet: Fine-level Text Control for Image Generation with Spatially Aligned Text Control Injection (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.15 - VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.15 - Pixel Aligned Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.15 - General Object Foundation Model for Images and Videos at Scale (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.15 - Holodeck: Language Guided Generation of 3D Embodied AI Environments (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.14 - OpenAI - Superalignment Fast Grants (blog)
- 12.14 - CERN for AGI: A Theoretical Framework for Autonomous Simulation-Based Artificial Intelligence Testing and Alignment (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.14 - ChatSOS LLM-based knowledge QA system for safety engineering (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.14 - Influence of Prompting Strategies on Segment Anything Model (SAM) for Short-axis Cardiac MRI segmentation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.14 - ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.14 - Vision-Language Models as a Source of Rewards (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.14 - StemGen: A music generation model that listens (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.14 - CogAgent: A Visual Language Model for GUI Agents (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.14 - A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.14 - Zebra: Extending Context Window with Layerwise Grouped Local-Global Attention (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.14 - Distributed Inference and Fine-tuning of Large Language Models Over The Internet (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.14 - FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.14 - TigerBot: An Open Multilingual Multitask LLM (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.14 - OpenAI - Prompt engineering (guide)
- 12.14 - OpenAI - Weak-to-strong generalization (blog)
- 12.14 - China releases first AI large language model for ancient book research (news)
- 12.14 - Imagen 2 on Vertex AI is now generally available (Google blog)
- 12.13 - JAMA Network Open Publishes Criteria for Manuscripts Reporting Clinical Use of AI (news)
- 12.13 - SEEAvatar: Photorealistic Text-to-3D Avatar Generation with Constrained Geometry and Appearance (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.13 - LLM in a flash: Efficient Large Language Model Inference with Limited Memory (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.13 - PromptBench: A Unified Library for Evaluation of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.13 - SwitchHead: Accelerating Transformers with Mixture-of-Experts Attention (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.13 - CLIP as RNN: Segment Countless Visual Concepts without Training Endeavor (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.13 - Clockwork Diffusion: Efficient Generation With Model-Step Distillation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.13 - Foundation Models in Robotics: Applications, Challenges, and the Future (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.13 - The Rise of "Small Language Models" and Reinforcement Learning (news)
- 12.13 - Mistral AI Picks "Mixture of Experts" Model to Challenge GPT 3.5 (news)
- 12.13 - OpenAI's chief scientist helped to create ChatGPT – while worrying about AI safety (Nature news)
- 12.13 - Nature's Ten people (and one non-human) who helped shape science in 2023 (news)
- 12.13 - OpenAI and Axel Springer strike unprecedented deal to offer news in ChatGPT (news)
- 12.13 - ChatGPT and science: the AI system was a force in 2023 – for good and bad (Nature doi: https://doi.org/10.1038/d41586-023-03930-6)
- 12.13 - PEEKABOO: Interactive Video Generation via Masked-Diffusion (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.13 - How Well Does GPT-4V(ision) Adapt to Distribution Shifts? A Preliminary Investigation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.13 - Interfacing Foundation Models' Embeddings (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.13 - FreeInit: Bridging Initialization Gap in Video Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.13 - FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.13 - DiffMorpher: Unleashing the Capability of Diffusion Models for Image Morphing (project), (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (demo)
- 12.13 - VILA: On Pre-training for Visual Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.13 - A Survey of Text Watermarking in the Era of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.12 - promptbase - All things prompt engineering (:octocat:)
- 12.12 - Steering at the Frontier: Extending the Power of Prompting (Microsoft blog)
- 12.12 - LLMEval: A Preliminary Study on How to Evaluate Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.12 - Domain Prompt Learning with Quaternion Networks (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.12 - SM70: A Large Language Model for Medical Devices (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.12 - Efficient Few-Shot Clinical Task Adaptation with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.12 - Mathematical Language Models: A Survey (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.12 - Phi-2: The surprising power of small language models (Microsoft blog)
- 12.12 - Microsoft debuts 2.7B-parameter Phi-2 model that outperforms many larger language models (news)
- 12.12 - Google's New AI, Gemini, Beats ChatGPT In 30 Of 32 Test Categories (Forbes news)
- 12.12 - From Text to Motion: Grounding GPT-4 in a Humanoid Robot "Alter3" (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.12 - LLM360: Towards Fully Transparent Open-Source LLMs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.12 - Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.12 - Photorealistic Video Generation with Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.12 - "I Want It That Way": Enabling Interactive Decision Support Using Large Language Models and Constraint Programming (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.12 - Fast Training of Diffusion Transformer with Extreme Masking for 3D Point Clouds Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.12 - Rethinking Compression: Reduced Order Modelling of Latent Features in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.12 - Honeybee: Locality-enhanced Projector for Multimodal LLM (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.12 - COLMAP-Free 3D Gaussian Splatting (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.12 - Alignment for Honesty (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.12 - CCM: Adding Conditional Controls to Text-to-Image Consistency Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.11 - Privacy Issues in Large Language Models: A Survey (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (SS)
- 12.11 - Why We Support and Encourage the Use of Large Language Models in NEJM AI Submissions (paper)
- 12.11 - Vary: Scaling up the Vision Vocabulary for Large Vision-Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.11 - Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.11 - A Survey of Large Language Models in Medicine: Principles, Applications, and Challenges (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.11 - Artificial Intelligence vs Clinician Performance in Estimating Probabilities of Diagnoses Before and After Testing (JAMA doi:10.1001/jamanetworkopen.2023.47075)
- 12.11 - Evaluation of Large Language Models for Decision Making in Autonomous Driving (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.11 - Federated Full-Parameter Tuning of Billion-Sized Language Models with Communication Cost under 18 Kilobytes (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.10 - Context Tuning for Retrieval Augmented Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.9 - Research Papers in Nov 2023: Tackling Hallucinations, Boosting Reasoning Abilities, and New Insights into the Transformer Architecture (blog)
- 12.9 - Using Captum to Explain Generative Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.9 - Efficient Quantization Strategies for Latent Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.9 - Steering Llama 2 via Contrastive Activation Addition (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.9 - Artificial intelligence act: Council and Parliament strike a deal on the first rules for AI in the world (press)
- 12.9 - Google's best Gemini AI demo video was fabricated (news)
- 12.9 - DreaMoving: A Human Dance Video Generation Framework based on Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.9 - PathFinder: Guided Search over Multi-Step Reasoning Paths (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - Perspectives on the State and Future of Deep Learning -- 2023 (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - Are We Testing or Being Tested? Exploring the Practical Applications of Large Language Models in Software Testing (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.8 - Assessing LLMs for Moral Value Pluralism (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.8 - Large-scale Training of Foundation Models for Wearable Biosignals (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - Ophtha-LLaMA2: A Large Language Model for Ophthalmology (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - MVDD: Multi-View Depth Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - Purple Llama CyberSecEval: A Secure Coding Benchmark for Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - SparQ Attention: Bandwidth-Efficient LLM Inference (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - GPT4 paper assistant: A daily ArXiv scanner (:octocat:), (demo)
- 12.8 - EE-LLM: Large-Scale Training and Inference of Early-Exit Large Language Models with 3D Parallelism (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - Customizing Motion in Text-to-Video Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - HyperDreamer: Hyper-Realistic 3D Content Generation and Editing from a Single Image (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - Smooth Diffusion: Crafting Smooth Latent Spaces in Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.8 - GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - NeRFiller: Completing Scenes via Generative 3D Inpainting (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - Large Language Models for Mathematicians (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - Gen2Det: Generate to Detect (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - DreamVideo: Composing Your Dream Videos with Customized Subject and Motion (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - Scaling Laws of Synthetic Images for Model Training ... for Now (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.8 - Hierarchical Spatio-temporal Decoupling for Text-to-Video Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - Generating Illustrated Instructions (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - Efficient Monotonic Multihead Attention (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.8 - Seamless: Multilingual Expressive and Streaming Speech Translation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.7 - Performance of Large Language Models on a Neurology Board-Style Examination (JAMA doi:10.1001/jamanetworkopen.2023.46721)
- 12.7 - LEGO: Learning EGOcentric Action Frame Generation via Visual Instruction Tuning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.7 - Beyond Surface: Probing LLaMA Across Scales and Layers (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.7 - Pearl: A Production-ready Reinforcement Learning Agent (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.7 - Controllable Human-Object Interaction Synthesis (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.7 - Alpha-CLIP: A CLIP Model Focusing on Wherever You Want (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.7 - Text-to-3D Generation with Bidirectional Diffusion using both 2D and 3D priors (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.7 - Announcing Purple Llama: Towards open trust and safety in the new world of generative AI (blog)
- 12.7 - Llama Guard: LLM-based Input-Output Safeguard for Human-AI Conversations (paper), (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.7 - Enhancing Medical Task Performance in GPT-4V: A Comprehensive Study on Prompt Engineering Strategies (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.7 - GPT-4V with Emotion: A Zero-shot Benchmark for Multimodal Emotion Understanding (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.7 - Chain of Code: Reasoning with a Language Model-Augmented Code Emulator (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.7 - OneLLM: One Framework to Align All Modalities with Language (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.7 - Relightable Gaussian Codec Avatars (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.7 - MotionCtrl: A Unified and Flexible Motion Controller for Video Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.7 - Context Diffusion: In-Context Aware Image Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.7 - DreamComposer: Controllable 3D Object Generation via Multi-View Conditions (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.7 - Self-conditioned Image Generation via Generating Representations (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.7 - Generative agent-based modeling with actions grounded in physical, social, or digital space using Concordia (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.7 - Multimodal Data and Resource Efficient Device-Directed Speech Detection with Large Foundation Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.7 - Language-Informed Visual Concept Learning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.6 - Assessing the Usability of GutGPT: A Simulation Study of an AI Clinical Decision Support System for Gastrointestinal Bleeding Risk (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.6 - LLaVA-Grounding: Grounded Visual Chat with Large Multimodal Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.6 - Alchemist: Parametric Control of Material Properties with Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.6 - GPT4Point: A Unified Framework for Point-Language Understanding and Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.6 - WhisBERT: Multimodal Text-Audio Language Modeling on 100M Words (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.6 - Cache Me if You Can: Accelerating Diffusion Models through Block Caching (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.6 - HiFi4G: High-Fidelity Human Performance Rendering via Compact Gaussian Splatting (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.6 - LooseControl: Lifting ControlNet for Generalized Depth Conditioning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.6 - MagicStick: Controllable Video Editing via Control Handle Transformations (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.6 - HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.6 - Kandinsky 3.0 Technical Report (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.6 - Schrodinger Bridges Beat Diffusion Models on Text-to-Speech Synthesis (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.6 - AnimateZero: Video Diffusion Models are Zero-Shot Image Animators (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.6 - EU Artificial Intelligence act: potential implications for healthcare AI (blog)
- 12.6 - DiffusionSat: A Generative Foundation Model for Satellite Imagery (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.6 - Open-sourced Data Ecosystem in Autonomous Driving: the Present and Future (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.6 - Gemini: A Family of Highly Capable Multimodal Models (PDF)
- 12.6 - Google - AlphaCode 2 Technical Report (PDF)
- 12.6 - Google - Introducing Gemini: our largest and most capable AI model (blog), (Hands-on with Gemini: Interacting with multimodal AI - youtube)
- 12.6 - Google - Learn more about Gemini, our most capable AI model (blog), (Welcome to the Gemini era - youtube)
- 12.6 - Pixel 8 Pro – the first smartphone with AI built in – is now running Gemini Nano, plus more AI updates coming to the Pixel portfolio (blog)
- 12.6 - Early LLM-based Tools for Enterprise Information Workers Likely Provide Meaningful Boosts to Productivity (paper), (pdf)
- 12.6 - Describing Differences in Image Sets with Natural Language (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.6 - LivePhoto: Real Image Animation with Text-guided Motion Control (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.6 - Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.6 - Fine-grained Controllable Video Generation via Object Appearance and Context (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.6 - MVHumanNet: A Large-scale Dataset of Multi-view Daily Dressing Human Captures (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.6 - ReconFusion: 3D Reconstruction with Diffusion Priors (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.5 - Breast Ultrasound Report Generation using LangChain (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.5 - Generating Fine-Grained Human Motions Using ChatGPT-Refined Descriptions (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.5 - Orthogonal Adaptation for Modular Customization of Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.5 - FaceStudio: Put Your Face Everywhere in Seconds (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.5 - Gaussian Head Avatar: Ultra High-fidelity Head Avatar via Dynamic Gaussians (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.5 - BenchLMM: Benchmarking Cross-style Visual Capability of Large Multimodal Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.5 - Foundation Models for Weather and Climate Data Understanding: A Comprehensive Survey (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.5 - Creative Agents: Empowering Agents with Imagination for Creative Tasks (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.5 - Large Language Models on Graphs: A Comprehensive Survey (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.5 - Magicoder: Source Code Is All You Need (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.5 - Llamafile - Distribute and run LLMs with a single file (:octocat:)
- 12.5 - LLM Visualization (demo)
- 12.5 - Analyzing and Improving the Training Dynamics of Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.4 - Competition-Level Problems are Effective LLM Evaluators (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.4 - A collection of principles for guiding and evaluating large language models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 12.4 - MedXChat: Bridging CXR Modalities with a Unified Multimodal Large Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.4 - Towards General Purpose Vision Foundation Models for Medical Image Analysis: An Experimental Study of DINOv2 on Radiology Benchmarks (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.4 - Aligning and Prompting Everything All at Once for Universal Visual Perception (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.4 - Hulk: A Universal Knowledge Translator for Human-Centric Tasks (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.4 - AI Alliance Launches as an International Community of Leading Technology Developers, Researchers, and Adopters Collaborating Together to Advance Open, Safe, Responsible AI (Meta blog)
- 12.4 - Style Aligned Image Generation via Shared Attention (project), (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.4 - Data Management For Large Language Models: A Survey (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.4 - Merlin:Empowering Multimodal LLMs with Foresight Minds (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.4 - X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.3 - Effectively Fine-tune to Improve Large Multimodal Models for Radiology Report Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.3 - DragVideo: Interactive Drag-style Video Editing (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.3 - ImageDream: Image-Prompt Multi-view Diffusion for 3D Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.3 - US artificial intelligence leader OpenAI applies for GPT-6, GPT-7 trademarks in China (news)
- 12.3 - Axiomatic Preference Modeling for Longform Question Answering (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.2 - StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.2 - Medical AI Tools Can Make Dangerous Mistakes. Can the Government Help Prevent Them? (WSJ news) - (archive)
- 12.2 - Segment and Caption Anything (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.2 - SeaLLMs -- Large Language Models for Southeast Asia (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.2 - VideoBooth: Diffusion-based Video Generation with Image Prompts (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.2 - Beyond ChatBots: ExploreLLM for Structured Thoughts and Personalized Model Responses (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.1 - Mamba: Linear-Time Sequence Modeling with Selective State Spaces (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ), (:octocat:)
- 12.1 - Rethinking FID: Towards a Better Evaluation Metric for Image Generation (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ)
- 12.1 - Grounding Everything: Emerging Localization Properties in Vision-Language Transformers (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.1 - The Efficiency Spectrum of Large Language Models: An Algorithmic Survey (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.1 - An open letter to ChatGPT on its first birthday (CNN news)
- 12.1 - Explanatory Argument Extraction of Correct Answers in Resident Medical Exams (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.1 - GraphDreamer: Compositional 3D Scene Synthesis from Scene Graphs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.1 - StyleCrafter: Enhancing Stylized Text-to-Video Generation with Style Adapter (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.1 - Dolphins: Multimodal Language Model for Driving (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.1 - DREAM: Diffusion Rectification and Estimation-Adaptive Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.1 - Instruction-tuning Aligns LLMs to the Human Brain (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.1 - Text-Guided 3D Face Synthesis -- From Generation to Editing (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.1 - FSGS: Real-Time Few-shot View Synthesis using Gaussian Splatting (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.1 - Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 12.1 - PyNeRF: Pyramidal Neural Radiance Fields (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11/30 - TaskBench: Benchmarking Large Language Models for Task Automation (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 11.30 - Zero Bubble Pipeline Parallelism (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 11.30 - Tech predictions for 2024 and beyond (blog)
- 11.30 - Meta - Audiobox: Generating audio from voice and natural language prompts (blog)
- 11.30 - Will Generative Artificial Intelligence Deliver on Its Promise in Health Care? (JAMA doi:10.1001/jama.2023.25054)
- 11.30 - Generative AI could revolutionize health care – but not if control is ceded to big tech (Nature doi: https://doi.org/10.1038/d41586-023-03803-y)
- 11.30 - ChatGPT one year on: who is using it, how and why? (Nature https://doi.org/10.1038/d41586-023-03798-6)
- 11.30 - RaDialog: A Large Vision-Language Model for Radiology Report Generation and Conversational Assistance (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.30 - X-Dreamer: Creating High-quality 3D Content by Bridging the Domain Gap Between Text-to-2D and Text-to-3D Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.30 - MoMask: Generative Masked Modeling of 3D Human Motions (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.30 - HiFi Tuner: High-Fidelity Subject-Driven Fine-Tuning for Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.30 - Six ways large language models are changing healthcare (Nature Medicine https://doi.org/10.1038/s41591-023-02700-1)
- 11.30 - Towards Accurate Differential Diagnosis with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.30 - Discover, download, and run local LLMs - (LM Studio)
- 11.30 - A timeline of Sam Altman's firing from OpenAI – and the fallout (news)
- 11.30 - Synthetic data: Anthropic's CAI, from fine-tuning to pretraining, OpenAI's Superalignment, tips, types, and open examples (blog)
- 11.29 - Deepfakes, Misinformation, and Disinformation in the Era of Frontier AI, Generative AI, and Large AI Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.29 - Are we going MAD? Benchmarking Multi-Agent Debate between Language Models for Medical Q&A (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.29 - Welcome to a new world of work with Amazon Q - (tweet), (blog)
- 11.29 - Scaling deep learning for materials discovery (Nature https://doi.org/10.1038/s41586-023-06735-9)
- 11.29 - Millions of new materials discovered with deep learning (Google DeepMind blog)
- 11.29 - OpenAI Cookbook
- 11.29 - Announcing ElevenLabs Grants! (tweet), (site)
- 11.29 - SDXL Turbo: A real-time text-to-image generation model (tweet), (news)
- 11.29 - Training Chain-of-Thought via Latent-Variable Inference (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.28 - The Falcon Series of Open Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.28 - AMA issues new principles for AI development, deployment & use (press), (PDF)
- 11.28 - Animate Anyone: Consistent and Controllable Image-to-Video Synthesis for Character Animation (project), (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.28 - Power Hungry Processing: Watts Driving the Cost of AI Deployment? (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.28 - Graph Prompt Learning: A Comprehensive Survey and Beyond (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.28 - ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up? (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.28 - SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models (project), (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.28 - Surf-D: High-Quality Surface Generation for Arbitrary Topologies using Diffusion Models (project), (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.28 - Introducing Pika 1.0, the idea-to-video platform that brings your creativity to life (tweet), (site)
- 11.28 - Adversarial Diffusion Distillation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.28 - MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI (project), (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (Dataset)
- 11.28 - LEDITS++: Limitless Image Editing using Text-to-Image Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.28 - MEDITRON-70B: Scaling Medical Pretraining for Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.28 - Can Generalist Foundation Models Outcompete Special-Purpose Tuning? Case Study in Medicine (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.28 - Effective prompting for Large Multimodal Models like GPT-4 Vision or LLaVA. 🔥 (:octocat:)
- 11.28 - The Power of Prompting (blog)
- 11.27 - Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.27 - WorldSense: A Synthetic Benchmark for Grounded Reasoning in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.27 - MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers (project), (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.27 - RO-LLaMA: Generalist LLM for Radiation Oncology via Noise Augmentation and Consistency Regularization (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.27 - Applications of Large Scale Foundation Models for Autonomous Driving (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.27 - Building the Future of Responsible AI: A Reference Architecture for Designing Large Language Model based Agents (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.27 - ChatGPT's One-Year Anniversary: Generative AI's Breakout Year (blog)
- 11.27 - GPT-4's potential in shaping the future of radiology (tweet), (blog)
- 11.27 - Automatic Hallucination detection with SelfCheckGPT NLI (blog)
- 11.25 - Walking a Tightrope -- Evaluating Large Language Models in High-Risk Domains (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.25 - LLM-Assisted Code Cleaning For Training Accurate Code Generators (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.23 - MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model (project), (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (demo)
- 11.23 - Challenges of Large Language Models for Mental Health Counseling (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.23 - MLLM-Bench, Evaluating Multi-modal LLMs using GPT-4V (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.23 - ZipLoRA: Any Subject in Any Style by Effectively Merging LoRAs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.23 - Visual In-Context Prompting (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.22 - Positional Description Matters for Transformers Arithmetic (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.22 - Enhancing Summarization Performance through Transformer-Based Prompt Engineering in Automated Medical Reporting (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.22 - OpenAI chaos: A timeline of firings, interim CEOs, re-hirings and other twists (blog)
- 11.22 - Here's a timeline of the OpenAI saga with CEO Sam Altman (Mashable news)
- 11.22 - A timeline of Sam Altman's firing and dramatic return to OpenAI (Reuters news)
- 11.22 - Sam Altman to return as CEO of OpenAI (news)
- 11.22 - DiffusionMat: Alpha Matting as Sequential Refinement Learning (project), (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.22 - GAIA: a benchmark for General AI Assistants (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.22 - FusionFrames: Efficient Architectural Aspects for Text-to-Video Generation Pipeline (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.22 - LucidDreamer: Domain-free Generation of 3D Gaussian Splatting Scenes (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.22 - Diffusion Model Alignment Using Direct Preference Optimization (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.22 - Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.22 - PG-Video-LLaVA: Pixel Grounding Large Video-Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.22 - Diffusion360: Seamless 360 Degree Panoramic Image Generation based on Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.22 - SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.22 - ChatGPT generates fake data set to support scientific hypothesis (Nature doi: https://doi.org/10.1038/d41586-023-03635-w), (PDF)
- 11.21 - Advancing Transformer Architecture in Long-Context Large Language Models: A Comprehensive Survey (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ), (:octocat:)
- 11.21 - Prompting Frameworks for Large Language Models: A Survey (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (SS)
- 11.21 - It's Time For 'Nutrition Labels' In Artificial Intelligence (Forbes news)
- 11.21 - From Classification to Clinical Insights: Towards Analyzing and Reasoning About Mobile and Behavioral Health Data With Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.21 - ALPHA: AnomaLous Physiological Health Assessment Using Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.21 - HierSpeech++: Bridging the Gap between Semantic and Acoustic Representation of Speech by Hierarchical Variational Inference for Zero-shot Speech Synthesis (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.21 - PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.21 - NeuroPrompts: An Adaptive Framework to Optimize Prompts for Text-to-Image Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.21 - Concept Sliders: LoRA Adaptors for Precise Control in Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.21 - PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.21 - GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.21 - Accuracy of ChatGPT, Google Bard, and Microsoft Bing for Simplifying Radiology Reports (RSNA https://doi.org/10.1148/radiol.232561), (PDF)
- 11.21 - System 2 Attention (is something you might need too) (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.21 - GPQA: A Graduate-Level Google-Proof Q&A Benchmark (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.21 - GPT-4V(ision) for Robotics: Multimodal Task Planning from Human Demonstration (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.20 - Assessing Prompt Injection Risks in 200+ Custom GPTs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.20 - Sam Altman to Join Microsoft Following OpenAI Ouster (WSJ news)
- 11.20 - MultiLoRA: Democratizing LoRA for Better Multi-Task Learning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.19 - Meta Prompting for AGI Systems (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.19 - M²UGen: Multi-modal Music Understanding and Generation with the Power of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.19 - LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.19 - AutoStory: Generating Diverse Storytelling Images with Minimal Human Effort (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.19 - TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.18 - Designing Interpretable ML System to Enhance Trustworthy AI in Healthcare: A Systematic Review of the Last Decade to A Proposed Robust Framework (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.18 - MagicDance: Realistic Human Dance Video Generation with Motions & Facial Expressions Transfer (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.18 - Make Pixels Dance: High-Dynamic Video Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.18 - Orca 2: Teaching Small Language Models How to Reason (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.18 - Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.18 - Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.18 - Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2 (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.18 - SelfEval: Leveraging the discriminative nature of generative models for evaluation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.18 - Adapters: A Unified Library for Parameter-Efficient and Modular Transfer Learning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.18 - Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.17 - Leveraging Large Language Models for Decision Support in Personalized Oncology (JAMA doi:10.1001/jamanetworkopen.2023.43689)
- 11.17 - PEFT-MedAware: Large Language Model for Medical Awareness (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.17 - OpenAI's Sam Altman exits as CEO because 'board no longer has confidence' in his ability to lead (CNBC news)
- 11.17 - Testing Language Model Agents Safely in the Wild (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.17 - Text-to-Sticker: Style Tailoring Latent Diffusion Models for Human Expression (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.17 - The Chosen One: Consistent Characters in Text-to-Image Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.17 - Adaptive Shells for Efficient Neural Radiance Field Rendering (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.17 - JaxMARL: Multi-Agent RL Environments in JAX (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.16 - By the Numbers: Tracking The AI Executive Order (HAI news)
- 11.16 - Change to policy on the use of generative AI and large language models (Science blog)
- 11.16 - ML-Bench: Large Language Models Leverage Open-source Libraries for Machine Learning Tasks (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.16 - MedAgents: Large Language Models as Collaborators for Zero-shot Medical Reasoning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.16 - HuatuoGPT-II, One-stage Training for Medical Adaption of LLMs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.16 - Do Physicians Know How to Prompt? The Need for Automatic Prompt Optimization Help in Clinical Note Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.16 - AI: The Coming Revolution (report), (presentation)
- 11.16 - VideoCon: Robust Video-Language Alignment via Contrast Captions (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.16 - UnifiedVisionGPT: Streamlining Vision-Oriented AI through Generalized Multimodal Framework (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.16 - Exponentially Faster Language Modelling (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.16 - Memory Augmented Language Models through Mixture of Word Experts (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.16 - ToolTalk: Evaluating Tool-Usage in a Conversational Setting (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.16 - Video-LLaVA: Learning United Visual Representation by Alignment Before Projection (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.16 - MetaDreamer: Efficient Text-to-3D Creation With Disentangling Geometry and Texture (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.16 - I&S-ViT: An Inclusive & Stable Method for Pushing the Limit of Post-Training ViTs Quantization (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.16 - Contrastive Chain-of-Thought Prompting (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.16 - Tied-Lora: Enhancing parameter efficiency of LoRA with weight tying (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.16 - DMV3D: Denoising Multi-View Diffusion using 3D Large Reconstruction Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.16 - Single-Image 3D Human Digitization with Shape-Guided Diffusion (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.16 - GRIM: GRaph-based Interactive narrative visualization for gaMes (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.16 - PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.16 - SiRA: Sparse Mixture of Low Rank Adaptation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.16 - Fusion-Eval: Integrating Evaluators with LLMs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.15 - Towards Publicly Accountable Frontier LLMs: Building an External Scrutiny Ecosystem under the ASPIRE Framework (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 11.15 - How Trustworthy are Open-Source LLMs? An Assessment under Malicious Demonstrations Shows their Vulnerabilities (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.15 - Can AI solve medical mysteries? It's worth finding out (WP news), (archive)
- 11.15 - UFOGen: You Forward Once Large Scale Text-to-Image Generation via Diffusion GANs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.15 - Drivable 3D Gaussian Avatars (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.15 - EDMSound: Spectrogram Based Diffusion Models for Efficient and High-Quality Audio Synthesis (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.15 - UNcommonsense Reasoning: Abductive Reasoning about Uncommon Situations (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.15 - UT5: Pretraining Non autoregressive T5 with unrolled denoising (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.15 - Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.15 - Llamas Know What GPTs Don't Show: Surrogate Models for Confidence Estimation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.15 - Thread of Thought Unraveling Chaotic Contexts (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.15 - Instant3D: Instant Text-to-3D Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.15 - Fine-tuning Language Models for Factuality (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.15 - Fast Chain-of-Thought: A Glance of Future from Parallel Decoding Leads to Answers Faster (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.14 - Extrinsically-Focused Evaluation of Omissions in Medical Summarization (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.14 - Artificial General Intelligence, Existential Risk, and Human Risk Perception (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.14 - Story-to-Motion: Synthesizing Infinite and Controllable Character Animation from Long Text (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.14 - One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.14 - A Survey on Language Models for Code (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.14 - DiLoCo: Distributed Low-Communication Training of Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.14 - Instruction-Following Evaluation for Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.14 - The ART of LLM Refinement: Ask, Refine, and Trust (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.14 - MART: Improving LLM Safety with Multi-round Automatic Red-Teaming (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.14 - Qwen-Audio: Advancing Universal Audio Understanding via Unified Large-Scale Audio-Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.14 - GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.14 - SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.14 - MEGAVERSE: Benchmarking Large Language Models Across Languages, Modalities, Models and Tasks (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.13 - Applying Large Language Models for Causal Structure Learning in Non Small Cell Lung Cancer (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.13 - The Impact of Large Language Models on Scientific Discovery: a Preliminary Study using GPT-4 (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.13 - To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.13 - Music ControlNet: Multiple Time-varying Controls for Music Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.13 - Prompt Engineering a Prompt Engineer (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.12 - Trusted Source Alignment in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.12 - ChatAnything: Facetime Chat with LLM-Enhanced Personas (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.12 - Q-Instruct: Improving Low-level Visual Abilities for Multi-modality Foundation Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.12 - Towards General-Purpose Speech Abilities for Large Language Models Using Unpaired Data (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.12 - Cappy: Outperforming and Boosting Large Multi-Task LMs with a Small Scorer (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.11 - LayoutPrompter: Awaken the Design Ability of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.11 - GOAT: GO to Any Thing (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.11 - Language Models can be Logical Solvers (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.11 - Instant3D: Fast Text-to-3D with Sparse-View Generation and Large Reconstruction Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.11 - Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.11 - A Strengths, Weaknesses, Opportunities, and Threats (SWOT) Analysis of ChatGPT Integration in Nursing Education: A Narrative Review (Cureus DOI: 10.7759/cureus.48643)
- 11.11 - The Impact of Chat Generative Pre-trained Transformer (ChatGPT) on Oncology: Application, Expectations, and Future Prospects (Cureus DOI: 10.7759/cureus.48670)
- 11.10 - Holistic Evaluation of GPT-4V for Biomedical Imaging (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.10 - How to Bridge the Gap between Modalities: A Comprehensive Survey on Multimodal Large Language Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.10 - ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.10 - JARVIS-1: Open-World Multi-task Agents with Memory-Augmented Multimodal Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.10 - China proposes new regulations for generative AI focusing on data security, evaluation (news) - (Basic Security Requirements for Generative Artificial Intelligence Services)
- 11.10 - New international consortium formed to create trustworthy and reliable generative AI models for science (news) - (Trillion Parameter Consortium)
- 11.10 - AI robotics' 'GPT moment' is near (TC news)
- 11.10 - ♥️ ChatGPT in prostate cancer: myth or reality? (Prostate Cancer Prostatic Dis https://doi.org/10.1038/s41391-023-00750-7)
- 11.10 - LCM-LoRA: A Universal Stable-Diffusion Acceleration Module (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.10 - LLaVA-Plus: Learning to Use Tools for Creating Multimodal Agents (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.9 - Frontier Language Models are not Robust to Adversarial Arithmetic, or "What do I need to say so you agree 2+2=5?" (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.9 - Technical Report: Large Language Models can Strategically Deceive their Users when Put Under Pressure (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.9 - Lumos: Learning Agents with Unified Data, Modular Design, and Open-Source LLMs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.9 - ♥️ Accuracy of a Vision-Language Model on Challenging Medical Cases (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.9 - The testing framework for ML models, from tabular to LLMs (:octocat:)
- 11.9 - A Survey on Hallucination in Large Language Models: Principles, Taxonomy, Challenges, and Open Questions (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.9 - ♥️ A Survey of Large Language Models in Medicine: Progress, Application, and Challenge (โ), (๐), (๐), (๐), (๐ ), (SS), (โณ๏ธ), (:octocat:)
- 11.9 - GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.9 - u-LLaVA: Unifying Multi-Modal Tasks via Large Language Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.9 - On the Road with GPT-4V(ision): Early Explorations of Visual-Language Model on Autonomous Driving (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.9 - Humane AI Pin: ChatGPT Wearable to Launch with $699 Price Tag (news)
- 11.9 - Microsoft briefly restricted employee access to OpenAI's ChatGPT, citing security concerns (news)
- 11.8 - Unveiling Safety Vulnerabilities of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.8 - Video Instance Matting (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.8 - I2VGen-XL: High-Quality Image-to-Video Synthesis via Cascaded Diffusion Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.8 - OtterHD: A High-Resolution Multi-modality Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.8 - TEAL: Tokenize and Embed ALL for Multi-modal Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.8 - Holistic Evaluation of Text-To-Image Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.8 - 3DiffTection: 3D Object Detection with Geometry-Aware Diffusion Features (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.8 - NExT-Chat: An LMM for Chat, Detection and Segmentation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.8 - LRM: Large Reconstruction Model for Single Image to 3D (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.8 - Prompt Cache: Modular Attention Reuse for Low-Latency Inference (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.8 - Role play with large language models (Nature https://doi.org/10.1038/s41586-023-06647-8)
- 11.8 - How Accurate was ChatGPT for Common Allergy Myths? Pretty Accurate (news)
- 11.8 - Amazon is reportedly racing to build an AI model called Olympus to take on ChatGPT and Bard (news)
- 11.8 - The AI boom is shaking up the tech industry and moving markets. But is it all a mirage? (news)
- 11.8 - Samsung unveils ChatGPT alternative Samsung Gauss that can generate text, code and images (news)
- 11.7 - Benefits and Harms of Large Language Models in Digital Mental Health (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.7 - Evaluating Large Language Models in Ophthalmology (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.7 - Evaluating multiple large language models in pediatric ophthalmology (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.7 - Leveraging Large Language Models for Automated Proof Synthesis in Rust (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.7 - SoundCam: A Dataset for Finding Humans Using Room Acoustics (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.7 - Neural MMO 2.0: A Massively Multi-task Addition to Massively Multi-agent Learning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.7 - Random Field Augmentations for Self-Supervised Representation Learning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.7 - Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.7 - mPLUG-Owl2: Revolutionizing Multi-modal Large Language Model with Modality Collaboration (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.7 - GPT4All: An Ecosystem of Open Source Compressed Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.7 - GLaMM: Pixel Grounding Large Multimodal Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.7 - S-LoRA: Serving Thousands of Concurrent LoRA Adapters (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.7 - Ziya2: Data-centric Learning is All LLMs Need (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.7 - LDM3D-VR: Latent Diffusion Model for 3D VR (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.6 - A Foundation Model for Music Informatics (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.6 - Can LLMs Follow Simple Rules? (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.6 - 'ChatGPT detector' catches AI-generated papers with unprecedented accuracy (Nature doi: https://doi.org/10.1038/d41586-023-03479-4)
- 11.6 - CogVLM: Visual Expert for Pretrained Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.6 - Introducing GPTs (blog)
- 11.6 - New models and developer products announced at DevDay (blog)
- 11.6 - OpenAI DevDay, Opening Keynote (Youtube), (tweet)
- 11.6 - All the news from OpenAI's first developer conference (news)
- 11.6 - OpenAI Wants Everyone to Build Their Own Version of ChatGPT (Wired news)
- 11.6 - Meet Angry Pumpkins: A game made using ChatGPT, DALL-E 3 and MidJourney (news)
- 11.6 - ChatGPT subscribers may get a 'GPT builder' option soon (news)
- 11.5 - Towards Generic Anomaly Detection and Understanding: Large-scale Visual-linguistic Model (GPT-4V) Takes the Lead (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.5 - Levels of AGI: Operationalizing Progress on the Path to AGI (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 11/04 - Research Papers in Oct 2023: A Potential Successor to RLHF for Efficient LLM Alignment and the Resurgence of CNNs (Blog),
- 11.4 - Evaluating the Potential of Leading Large Language Models in Reasoning Biology Questions (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.4 - MFTCoder: Boosting Code LLMs with Multitask Fine-Tuning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.4 - Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.4 - Ultra-Long Sequence Distributed Transformer (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.4 - Meet Grok – Elon Musk's Answer to ChatGPT (Tweet), (news)
- 11.4 - EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.3 - Don't Make Your LLM an Evaluation Benchmark Cheater (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 11.3 - LLM-driven Multimodal Target Volume Contouring in Radiation Oncology (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.3 - FinGPT: Large Generative Models for a Small Language (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.3 - ♥️ An Introduction to Natural Language Processing Techniques and Framework for Clinical Implementation in Radiation Oncology (โ), (๐), (๐), (๐), (๐ ), (SS), (โณ๏ธ)
- 11.3 - Large Language Models Illuminate a Progressive Pathway to Artificial Healthcare Assistant: A Review (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.3 - FLAP: Fast Language-Audio Pre-training (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.3 - PPTC Benchmark: Evaluating Large Language Models for PowerPoint Task Completion (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.3 - The world's week on AI safety: powerful computing efforts launched to boost research (nature doi: https://doi.org/10.1038/d41586-023-03472-x)
- 11.3 - Forget ChatGPT, why Llama and open source AI win 2023 (news)
- 11.3 - RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.3 - Idempotent Generative Network (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.2 - ProAgent: From Robotic Process Automation to Agentic Process Automation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.2 - A Survey of Large Language Models for Autonomous Driving (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.2 - TopicGPT: A Prompt-based Topic Modeling Framework (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.2 - GOV.UK - Introducing the AI Safety Institute (news)
- 11.2 - US to launch its own AI safety institute (news)
- 11.2 - U.S. ARTIFICIAL INTELLIGENCE SAFETY INSTITUTE (FAQ)
- 11.2 - NIST Seeks Collaborators for Consortium Supporting Artificial Intelligence Safety (news)
- 11.2 - RoboVQA: Multimodal Long-Horizon Reasoning for Robotics (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.2 - E3 TTS: Easy End-to-End Diffusion-based Text to Speech (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.2 - In-Context Prompt Editing For Conditional Audio Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.2 - FlashDecoding++: Faster Large Language Model Inference on GPUs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.2 - The AI Engineer Foundation: Open Source for the Future of AI (news), (:octocat:)
- 11.2 - LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.2 - De-Diffusion Makes Text a Strong Cross-Modal Interface (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.2 - Controllable Music Production with Diffusion Models and Guidance Gradients (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.1 - Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.1 - AMSP: Super-Scaling LLM Training via Advanced Model States Partitioning (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.1 - ChatCoder: Chat-based Refine Requirement Improves LLMs' Code Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.1 - Grounding Visual Illusions in Language: Do Vision-Language Models Perceive Illusions Like Humans? (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.1 - ChipNeMo: Domain-Adapted LLMs for Chip Design (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.1 - Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.1 - The Generative AI Paradox: "What It Can Create, It May Not Understand" (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.1 - Learning From Mistakes Makes LLM Better Reasoner (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 11.1 - Exclusive: Stability AI brings advanced 3D and image fine-tuning to Stable Diffusion (VentureBeat news)
- 11.1 - An Early Look at Stability AI's New Text to 3D Model (news)
- 11.1 - Microsoft 365 Copilot is available for purchase starting today. Here's what to know (ZDNet news)
- 11.1 - GOV.UK - The Bletchley Declaration by Countries Attending the AI Safety Summit, 1-2 November 2023 (Policy paper)
- 11.1 - Generative AI for Beginners - A Course (:octocat:)
- 11.1 - GOV.UK - Countries agree to safe and responsible development of frontier AI in landmark Bletchley Declaration (press)
- 11.1 - JADE: A Linguistics-based Safety Evaluation Platform for LLM (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.31 - Taking control: Policies to address extinction risks from advanced AI (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.31 - Does GPT-4 Pass the Turing Test? (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (test)
- 10.31 - โฅ๏ธ A Comprehensive Study of GPT-4V's Multimodal Capabilities in Medical Imaging (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.31 - Artificial intelligence - UK Regulatory Outlook October 2023 (news)
- 10.31 - MM-VID: Advancing Video Understanding with GPT-4V(ision) (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.30 - EHRTutor: Enhancing Patient Understanding of Discharge Instructions (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.30 - Transformation vs Tradition: Artificial General Intelligence (AGI) for Arts and Humanities (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.30 - Towards A Holistic Landscape of Situated Theory of Mind in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.30 - RedPajama-Data-v2: an Open Dataset with 30 Trillion Tokens for Training Large Language Models (blog), (:octocat:)
- 10.30 - Awesome LLMs Evaluation Papers (:octocat:)
- 10.30 - Evaluating Large Language Models: A Comprehensive Survey (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS), (:octocat:)
- 10.30 - Phishing emails increase over 1,200 percent since ChatGPT launch (news)
- 10.30 - G7 Leaders' Statement on the Hiroshima AI Process (statement), (download), (white house)
- 10.30 - Commission welcomes G7 leaders' agreement on Guiding Principles and a Code of Conduct on Artificial Intelligence (news)
- 10.30 - Hiroshima Process International Guiding Principles for Advanced AI system (news), (download)
- 10.30 - Hiroshima Process International Code of Conduct for Advanced AI Systems (news), (download)
- 10.30 - FACT SHEET: President Biden Issues Executive Order on Safe, Secure, and Trustworthy Artificial Intelligence (news)
- 10.30 - Joe Biden's Sweeping New Executive Order Aims to Drag the US Government Into the Age of ChatGPT (Wired news)
- 10.30 - โฅ๏ธ Multimodal ChatGPT for Medical Applications: an Experimental Study of GPT-4V (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.30 - Atom: Low-bit Quantization for Efficient and Accurate LLM Serving (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.30 - Skywork: A More Open Bilingual Foundation Model (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.30 - VideoCrafter1: Open Diffusion Models for High-Quality Video Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.30 - Text-to-3D with classifier score distillation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.29 - TeacherLM: Teaching to Fish Rather Than Giving the Fish, Language Modeling Likewise (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.28 - Foundational Models in Medical Imaging: A Comprehensive Survey and Future Vision (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (SS)
- 10.28 - Overview of Current Applications of Large Language Models in Various Medical Specialities (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.28 - Punica: Multi-Tenant LoRA Serving (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.28 - Personalised Distillation: Empowering Open-Sourced LLMs with Adaptive Learning for Code Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.27 - โฅ๏ธ Qilin-Med-VL: Towards Chinese Large Vision-Language Model for General Healthcare (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.27 - JudgeLM: Fine-tuned Large Language Models are Scalable Judges (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.27 - A Framework for Automated Measurement of Responsible AI Harms in Generative AI Applications (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.27 - ControlLLM: Augment Language Models with Tools by Searching on Graphs (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.27 - FP8-LM: Training FP8 Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.27 - United Nations creates advisory body to address AI governance (Reuters news, UN AI Advisory Body)
- 10.27 - Guarding the AI frontier: A proposal for federal regulation (news)
- 10.27 - GOV.UK - Emerging processes for frontier AI safety (white paper - HTML, PDF)
- 10.27 - GOV.UK - Leading frontier AI companies publish safety policies (news)
- 10.26 - Style-Aware Radiology Report Generation with RadGraph and Few-Shot Prompting (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.26 - UK Prime Minister announces world's first AI Safety Institute (news)
- 10.26 - Using fine-tuned large language models to parse clinical notes in musculoskeletal pain disorders (Lancet https://doi.org/10.1016/S2589-7500(23)00202-9)
- 10.26 - Large Language Models as Generalizable Policies for Embodied Tasks (project), (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.26 - How the Foundation Model Transparency Index Distorts Transparency (blog)
- 10.26 - Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.26 - Controlled Decoding from Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.26 - HyperFields: Towards Zero-Shot Generation of NeRFs from Text (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.26 - CodeFusion: A Pre-trained Diffusion Model for Code Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.26 - Boston Dynamics - A robot tour guide using Spot integrated with ChatGPT and other AI models as a proof of concept for the robotics applications of foundation models (YouTube)
- 10.26 - A Picture is Worth a Thousand Words: Principled Recaptioning Improves Image Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.26 - DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.25 - GOV.UK - Frontier AI: capabilities and risks – discussion paper (paper)
- 10.25 - Qualcomm Raises Bar for On-Device Generative AI at Snapdragon Summit (news), (Keynote)
- 10.25 - Artificial Intelligence in Health Care: Peter Lee on Empathy, Empowerment, and Equity (blog)
- 10.25 - โฅ๏ธ An Integrative Survey on Mental Health Conversational Agents to Bridge Computer Science and Medical Perspectives (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.25 - OpenAI - Frontier risk and preparedness (Blog)
- 10.25 - Together with Anthropic, Google, and Microsoft, we're announcing the new Executive Director of the Frontier Model Forum and a new $10 million AI Safety Fund (blog)
- 10.25 - An Early Evaluation of GPT-4V(ision) (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.25 - In-Context Learning Creates Task Vectors (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.25 - Woodpecker: Hallucination Correction for Multimodal Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.25 - Dissecting In-Context Learning of Translations in GPTs (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.24 - BLESS: Benchmarking Large Language Models on Sentence Simplification (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.24 - NoteChat: A Dataset of Synthetic Doctor-Patient Conversations Conditioned on Clinical Notes (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.24 - Clinfo.ai: An Open-Source Retrieval-Augmented Large Language Model System for Answering Medical Questions using Scientific Literature (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.24 - LoRAShear: Efficient Large Language Model Structured Pruning and Knowledge Recovery (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.24 - SAM-CLIP: Merging Vision Foundation Models towards Semantic and Spatial Understanding (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.24 - Wonder3D: Single Image to 3D using Cross-Domain Diffusion (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.24 - Matryoshka Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.24 - DEsignBench: Exploring and Benchmarking DALL-E 3 for Imagining Visual Design (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.24 - FreeNoise: Tuning-Free Longer Video Diffusion Via Noise Rescheduling (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.24 - Branch-Solve-Merge Improves Large Language Model Evaluation and Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10/23 - AI and Open Source in 2023 (Blog),
- 10.23 - Systematic AI Approach for AGI: Addressing Alignment, Energy, and AGI Grand Challenges (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.23 - Evaluating Large Language Models on Controlled Generation Tasks (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.23 - AlpaCare: Instruction-tuned Large Language Models for Medical Application (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.23 - Large Search Model: Redefining Search Stack in the Era of LLMs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.23 - InstructExcel: A Benchmark for Natural Language Instruction in Excel (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.23 - HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.23 - Moral Foundations of Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.23 - Exploring the Boundaries of GPT-4 in Radiology (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.22 - An International Consortium for Evaluations of Societal-Scale Risks from Advanced AI (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.22 - Assessing the Utilization of Large Language Models in Medical Education: Insights From Undergraduate Medical Students (Cureus DOI: 10.7759/cureus.47468)
- 10.21 - TexFusion: Synthesizing 3D Textures with Text-Guided Image Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.21 - Ensemble-Instruct: Generating Instruction-Tuning Data with a Heterogeneous Mixture of LMs (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.21 - Specific versus General Principles for Constitutional AI (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.21 - Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.21 - Contrastive Preference Learning: Learning from Human Feedback without RL (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.20 - Democratizing Reasoning Ability: Tailored Learning from Large Language Model (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.20 - DPM-Solver-v3: Improved Diffusion ODE Solver with Empirical Model Statistics (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.20 - Auto-Instruct: Automatic Instruction Generation and Ranking for Black-Box Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.20 - Localizing and Editing Knowledge in Text-to-Image Generative Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.20 - Habitat 3.0: A Co-Habitat for Humans, Avatars and Robots (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.20 - SALMONN: Towards Generic Hearing Abilities for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.20 - Teaching Language Models to Self-Improve through Interactive Demonstrations (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.20 - DreamSpace: Dreaming Your Room Space with Text-Driven Panoramic Texture Propagation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.20 - Creative Robot Tool Use with Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.20 - Tuna: Instruction Tuning using Feedback from Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.20 - ToolChain*: Efficient Action Space Navigation in Large Language Models with A* Search (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.20 - SILC: Improving Vision Language Pretraining with Self-Distillation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.20 - Towards Understanding Sycophancy in Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.20 - ScaleLong: Towards More Stable Training of Diffusion Model via Scaling Network Long Skip Connection (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.20 - 3D-GPT: Procedural 3D Modeling with Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.20 - Eureka: Human-Level Reward Design via Coding Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.20 - AgentTuning: Enabling Generalized Agent Abilities for LLMs (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.20 - Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.20 - AutoMix: Automatically Mixing Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.20 - An Emulator for Fine-Tuning Large Language Models using Small Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.20 - ChatGPT parent OpenAI seeks $86bn valuation (FT news)
- 10.19 - The Foundation Model Transparency Index (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS), (:octocat:)
- 10.19 - An Image is Worth Multiple Words: Learning Object Level Concepts using Multi-Concept Prompt Learning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.19 - Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative Editing (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.19 - Safe RLHF: Safe Reinforcement Learning from Human Feedback (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.19 - Enhancing High-Resolution 3D Generation through Pixel-wise Gradient Clipping (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.18 - DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.18 - Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.18 - MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.18 - Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.18 - BitNet: Scaling 1-bit Transformers for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.18 - 4K4D: Real-Time 4D View Synthesis at 4K Resolution (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.18 - VeRA: Vector-based Random Matrix Adaptation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.18 - Set-of-Mark Prompting Unleashes Extraordinary Visual Grounding in GPT-4V (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.18 - EvalCrafter: Benchmarking and Evaluating Large Video Generation Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.17 - Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.17 - Integrating LLM, EEG, and Eye-Tracking Biomarker Analysis for Word-Level Neural State Classification in Semantic Inference Reading Comprehension (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.17 - Emulating Human Cognitive Processes for Expert-Level Medical Question-Answering with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.17 - TEQ: Trainable Equivalent Transformation for Quantization of LLMs (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.17 - LAMP: Learn A Motion Pattern for Few-Shot-Based Video Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.17 - CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.17 - Context-Aware Meta-Learning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.17 - H2O Open Ecosystem for State-of-the-art Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.17 - In-Context Pretraining: Language Modeling Beyond Document Boundaries (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.17 - Interactive Task Planning with Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.17 - Video Language Planning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.16 - Large Models for Time Series and Spatio-Temporal Data: A Survey and Outlook (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ), (:octocat:)
- 10.16 - OpenAgents: An Open Platform for Language Agents in the Wild (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.16 - How ChatGPT is transforming the postdoc experience (Nature 622, 655-657 (2023) doi: https://doi.org/10.1038/d41586-023-03235-8)
- 10.16 - Llemma: An Open Language Model For Mathematics (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.15 - AutoAgents: A Framework for Automatic Agent Generation (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.15 - Can GPT-4V(ision) Serve Medical Applications? Case Studies on GPT-4V for Multimodal Medical Diagnosis (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.14 - MiniGPT-v2: large language model as a unified interface for vision-language multi-task learning (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.14 - Reward-Augmented Decoding: Efficient Controlled Text Generation With a Unidirectional Reward Model (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.14 - Table-GPT: Table-tuned GPT for Diverse Table Tasks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.14 - PaLI-3 Vision Language Models: Smaller, Faster, Stronger (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.13 - Multinational AGI Consortium (MAGIC): A Proposal for International Coordination on AI (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.13 - A Zero-Shot Language Agent for Computer Control with Structured Reflection (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.13 - The Consensus Game: Language Model Generation via Equilibrium Search (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.13 - LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.13 - CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.13 - Toward Joint Language Modeling for Speech Units and Text (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.13 - Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.13 - HyperHuman: Hyper-Realistic Human Generation with Latent Structural Diffusion (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.13 - GaussianDreamer: Fast Generation from Text to 3D Gaussian Splatting with Point Cloud Priors (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.13 - MotionDirector: Motion Customization of Text-to-Video Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.12 - Organizational preparedness for the use of large language models in pathology informatics (Journal of Pathology Informatics, https://doi.org/10.1016/j.jpi.2023.100338)
- 10.12 - โฅ๏ธ FDA creates new advisory committee for digital health and AI (news)
- 10.12 - EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.12 - LangNav: Language as a Perceptual Representation for Navigation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.12 - Octopus: Embodied Vision-Language Programmer from Environmental Feedback (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.12 - Prometheus: Inducing Fine-grained Evaluation Capability in Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.12 - Can GPT models be Financial Analysts? An Evaluation of ChatGPT and GPT-4 on mock CFA Exams (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.11 - Exploring the Landscape of Large Language Models In Medical Question Answering: Observations and Open Questions (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.11 - Lemur: Harmonizing Natural Language and Code for Language Agents (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.11 - Apple - Ferret: Refer and Ground Anything Anywhere at Any Granularity (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.10 - Open-Sourcing Highly Capable Foundation Models: An Evaluation of Risks, Benefits, and Alternative Methods for Pursuing Open-Source Objectives (paper), (PDF)
- 10.10 - Teaching Language Models to Hallucinate Less with Synthetic Tasks (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.10 - Towards Mitigating Hallucination in Large Language Models via Self-Reflection (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.10 - Multilingual Jailbreak Challenges in Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.10 - Feasibility of Using the Privacy-preserving Large Language Model Vicuna for Labeling Radiology Reports (RSNA Radiology https://doi.org/10.1148/radiol.231147)
- 10.10 - Benchmarking and Explaining Large Language Model-based Code Generation: A Causality-Centric Approach (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.10 - How ChatGPT and other AI tools could disrupt scientific publishing (Nature 622, 234-236 (2023) doi: https://doi.org/10.1038/d41586-023-03144-w)
- 10.9 - GraphLLM: Boosting Graph Reasoning Ability of Large Language Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:![GitHub Repo stars](https://img.shields.io/github/stars/mistyreed63849/graph-llm?style=social))
- 10.9 - HyperAttention: Long-context Attention in Near-Linear Time (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.9 - โฅ๏ธ A Survey of Large Language Models for Healthcare: from Data, Technology, and Applications to Accountability and Ethics (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.8 - โฅ๏ธ ChatRadio-Valuer: A Chat Large Language Model for Generalizable Radiology Report Generation Based on Multi-institution and Multi-system Data (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.7 - Data-Centric Financial Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.6 - Segmented Harmonic Loss: Handling Class-Imbalanced Multi-Label Clinical Data for Medical Coding with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.6 - Governments race to regulate AI tools (Reuters news)
- 10.6 - MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.6 - Improved Baselines with Visual Instruction Tuning (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.6 - Aligning Text-to-Image Diffusion Models with Reward Backpropagation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.6 - DSPy: Compiling Declarative Language Model Calls into Self-Improving Pipelines (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.6 - Leveraging Unpaired Data for Vision-Language Generative Models via Cycle Consistency (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.6 - A Long Way to Go: Investigating Length Correlations in RLHF (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.6 - Drag View: Generalizable Novel View Synthesis with Unposed Imagery (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.6 - HeaP: Hierarchical Policies for Web Actions using LLMs (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.5 - Redefining Digital Health Interfaces with Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.5 - Benchmarking a foundation LLM on its ability to re-label structure names in accordance with the AAPM TG-263 report (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.5 - Human-like intuitive behavior and reasoning biases emerged in large language models but disappeared in ChatGPT (Nat Comput Sci 3, 833–838 (2023). https://doi.org/10.1038/s43588-023-00527-x)
- 10.5 - Large Language Model Cascades with Mixture of Thoughts Representations for Cost-efficient Reasoning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.5 - FreshLLMs: Refreshing Large Language Models with Search Engine Augmentation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.5 - Kandinsky: an Improved Text-to-Image Synthesis with Image Prior and Latent Diffusion (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.4 - Functional trustworthiness of AI systems by statistically valid testing (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.4 - EcoAssistant: Using LLM Assistant More Affordably and Accurately (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.4 - How FaR Are Large Language Models From Agents with Theory-of-Mind? (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10/3 - OceanGPT: A Large Language Model for Ocean Science Tasks (โ), (๐), (๐), (๐), (๐ ), (HTML), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 10.3 - Low-Resource Languages Jailbreak GPT-4 (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.3 - Can large language models provide useful feedback on research papers? A large-scale empirical analysis (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.3 - โฅ๏ธ Conversational Health Agents: A Personalized LLM-Powered Agent Framework (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.3 - Large Language Models Cannot Self-Correct Reasoning Yet (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.3 - ImagenHub: Standardizing the evaluation of conditional image generation models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.3 - Large Language Models as Analogical Reasoners (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.3 - SmartPlay: A Benchmark for LLMs as Intelligent Agents (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.3 - Conditional Diffusion Distillation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.2 - Evaluating the Application of Large Language Models in Clinical Research Contexts (JAMA doi:10.1001/jamanetworkopen.2023.35924)
- 10.2 - Who is ChatGPT? Benchmarking LLMs' Psychological Portrayal Using PsychoBench (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.2 - Investigating the Efficacy of Large Language Models in Reflective Assessment Methods through Chain of Thoughts Prompting (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.2 - Mirror Diffusion Models for Constrained and Watermarked Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10.2 - UniAudio: An Audio Foundation Model Toward Universal Audio Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 10.2 - Enable Language Models to Implicitly Learn Self-Improvement From Data (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 10/01 - RoleLLM: Benchmarking, Eliciting, and Enhancing Role-Playing Abilities of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ), (:octocat:)
- 10.1 - PixArt-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.30 - Coordinated pausing: An evaluation-based coordination scheme for frontier AI developers (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 9.30 - Deployment Corrections: An incident response framework for frontier AI models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 9.30 - Open-Sourcing Highly Capable Foundation Models: An evaluation of risks, benefits, and alternative methods for pursuing open-source objectives (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.29 - An evaluation of GPT models for phenotype concept recognition (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.29 - Vision Transformers Need Registers (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.29 - The Dawn of LMMs: Preliminary Explorations with GPT-4V(ision) (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.29 - DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.29 - Text-to-3D using Gaussian Splatting (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.29 - Qwen Technical Report (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:![GitHub Repo stars](https://img.shields.io/github/stars/qwenlm/qwen?style=social))
- 9.29 - Deep Geometrized Cartoon Line Inbetweening (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.29 - Demystifying CLIP Data (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.29 - MotionLM: Multi-Agent Motion Forecasting as Language Modeling (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.29 - GPT-Fathom: Benchmarking Large Language Models to Decipher the Evolutionary Path towards GPT-4 and Beyond (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.29 - RealFill: Reference-Driven Generation for Authentic Image Completion (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.29 - CCEdit: Creative and Controllable Video Editing via Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.29 - ConceptGraphs: Open-Vocabulary 3D Scene Graphs for Perception and Planning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.28 - Emu: Enhancing Image Generation Models Using Photogenic Needles in a Haystack (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.28 - Language models in molecular discovery (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.28 - Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.28 - AutoCLIP: Auto-tuning Zero-Shot Classifiers for Vision-Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.28 - AnyMAL: An Efficient and Scalable Any-Modality Augmented Language Model (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.28 - Effective Long-Context Scaling of Foundation Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.28 - Show-1: Marrying Pixel and Latent Diffusion Models for Text-to-Video Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.27 - A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.27 - NeuRBF: A Neural Fields Representation with Adaptive Radial Basis Functions (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.27 - Low-rank Adaptation of Large Language Model Rescoring for Parameter-Efficient Speech Recognition (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.27 - Jointly Training Large Autoregressive Multimodal Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.27 - DECO: Dense Estimation of 3D Human-Scene Contact In The Wild (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.27 - Finite Scalar Quantization: VQ-VAE Made Simple (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.27 - VPA: Fully Test-Time Visual Prompt Adaptation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.27 - LAVIE: High-Quality Video Generation with Cascaded Latent Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.27 - VideoDirectorGPT: Consistent Multi-scene Video Generation via LLM-Guided Planning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.27 - Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.26 - โฅ๏ธ Creating Trustworthy LLMs: Dealing with Hallucinations in Healthcare AI (โ), (๐), (๐), (๐), (SS), (๐ ), (โณ๏ธ)
- 9.26 - QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.26 - Aligning Large Multimodal Models with Factually Augmented RLHF (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.26 - DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.26 - Efficient Post-training Quantization with FP8 Formats (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.26 - DeepSpeed-VisualChat: Multi-Round Multi-Image Interleave Chat via Multi-Modal Causal Attention (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.26 - Small-scale proxies for large-scale Transformer training instabilities (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.25 - VidChapters-7M: Video Chapters at Scale (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.25 - Evaluating Cognitive Maps and Planning in Large Language Models with CogEval (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.23 - Research Papers Aug-Sep 2023: From Self-Alignment to LongLoRA (Blog)
- 9.23 - MosaicFusion: Diffusion Models as Data Augmenters for Large Vocabulary Instance Segmentation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.23 - Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.23 - Robotic Offline RL from Internet Videos via Value-Function Pre-Training (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.23 - Exploring Large Language Models' Cognitive Moral Development through Defining Issues Test (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.23 - Calibrating LLM-Based Evaluator (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.22 - Affect Recognition in Conversations Using Large Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.22 - DRG-LLaMA : Tuning LLaMA Model to Predict Diagnosis-related Group for Hospitalized Patients (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.22 - CodePlan: Repository-level Coding using LLMs and Planning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.22 - DualToken-ViT: Position-aware Efficient Vision Transformer with Dual Token Fusion (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.22 - LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.22 - LLM-Grounder: Open-Vocabulary 3D Visual Grounding with Large Language Model as an Agent (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.22 - MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.22 - Game of Thrones author sues ChatGPT owner OpenAI (BBC news)
- 9.22 - Prompt Pre-Training with Twenty-Thousand Classes for Open-Vocabulary Visual Recognition (NeurIPS 2023 abstract), (PDF)
- 9.21 - Foundation Metrics: Quantifying Effectiveness of Healthcare Conversations powered by Generative AI (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.21 - How Robust is Google's Bard to Adversarial Image Attacks? (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.21 - SCREWS: A Modular Framework for Reasoning with Revisions (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.21 - OpenAI releases preview of DALL·E 3 (tweet), (DALL·E 3)
- 9.21 - LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.21 - BTLM-3B-8K: 7B Parameter Performance in a 3B Parameter Model (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.21 - A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.21 - DreamLLM: Synergistic Multimodal Comprehension and Creation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.21 - FreeU: Free Lunch in Diffusion U-Net (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.21 - Kosmos-2.5: A Multimodal Literate Model (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.21 - Chain-of-Verification Reduces Hallucination in Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.20 - OpenChat: Advancing Open-source Language Models with Mixed-Quality Data (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.20 - A Large-scale Dataset for Audio-Language Representation Learning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.20 - LMDX: Language Model-based Document Information Extraction and Localization (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.20 - The Languini Kitchen: Enabling Language Modelling Research at Different Scales of Compute (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.20 - OpenAI's Dall-E 3 Is an Art Generator Powered by ChatGPT (Wired news)
- 9.20 - OpenBA: An Open-sourced 15B Bilingual Asymmetric seq2seq Model Pre-trained from Scratch (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.19 - MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 9.19 - Enhancing Health Data Interoperability with Large Language Models: A FHIR Study (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.19 - OpenCog Hyperon: A Framework for AGI at the Human Level and Beyond (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.19 - SlimPajama-DC: Understanding Data Combinations for LLM Training (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.19 - Baichuan 2: Open Large-scale Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.19 - Stabilizing RLHF through Advantage Model and Selective Rehearsal (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.19 - 360° Reconstruction From a Single Image Using Space Carved Outpainting (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.19 - Language Modeling Is Compression (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.19 - Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.18 - Data Formulator: AI-powered Concept-driven Visualization Authoring (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.18 - MindAgent: Emergent Gaming Interaction (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.18 - An Empirical Study of Scaling Instruct-Tuned Large Multimodal Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.18 - LayoutNUWA: Revealing the Hidden Layout Expertise of Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.18 - Multimodal Foundation Models: From Specialists to General-Purpose Assistants (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.18 - Adapting Large Language Models via Reading Comprehension (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.17 - OWL: A Large Language Model for IT Operations (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.17 - CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.17 - Contrastive Decoding Improves Reasoning in Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.16 - PDFTriage: Question Answering over Long, Structured Documents (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.16 - Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT) (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.16 - Struc-Bench: Are Large Language Models Really Good at Generating Complex Structured Data? (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.15 - Compositional Foundation Models for Hierarchical Planning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.15 - Scaling Laws for Sparsely-Connected Foundation Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.15 - Investigating Answerability of LLMs for Long-Form Question Answering (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.15 - Connecting Large Language Models with Evolutionary Algorithms Yields Powerful Prompt Optimizers (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.15 - TextBind: Multi-turn Interleaved Multimodal Instruction-following (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.15 - LASER: LLM Agent with State-Space Exploration for Web Navigation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.14 - Towards Artificial General Intelligence (AGI) in the Internet of Things (IoT): Opportunities and Challenges (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.14 - The Rise and Potential of Large Language Model Based Agents: A Survey (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.14 - Agents: An Open-source Framework for Autonomous Language Agents (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.14 - Generative Image Dynamics (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.14 - Clinical Text Summarization: Adapting Large Language Models Can Outperform Human Experts (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.14 - AudioSR: Versatile Audio Super-resolution at Scale (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.14 - Are Large Language Model-based Evaluators the Solution to Scaling Up Multilingual Evaluation? (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.14 - Ambiguity-Aware In-Context Learning with Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.13 - RAIN: Your Language Models Can Align Themselves without Finetuning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.13 - Text-Guided Generation and Editing of Compositional 3D Avatars (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.13 - MagiCapture: High-Resolution Multi-Concept Portrait Customization (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.13 - DreamStyler: Paint by Style Inversion with Text-to-Image Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.12 - Re-Reading Improves Reasoning in Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.12 - A Survey of Hallucination in Large Foundation Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.12 - Learning Disentangled Avatars with Hybrid 3D Representations (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.12 - InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.12 - Efficient Memory Management for Large Language Model Serving with PagedAttention (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.12 - Large Language Model for Science: A Study on P vs. NP (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.12 - AstroLLaMA: Towards Specialized Foundation Models in Astronomy (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.12 - Uncovering mesa-optimization algorithms in Transformers (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.11 - MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.11 - PhotoVerse: Tuning-Free Image Customization with Text-to-Image Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.11 - Large Language Models for Compiler Optimization (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.11 - NExT-GPT: Any-to-Any Multimodal LLM (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.11 - Textbooks Are All You Need II: phi-1.5 technical report (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.11 - Optimize Weight Rounding via Signed Gradient Descent for the Quantization of LLMs (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.10 - Transformers in Small Object Detection: A Benchmark and Survey of State-of-the-Art (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.10 - Neurons in Large Language Models: Dead, N-gram, Positional (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.9 - MADLAD-400: A Multilingual And Document-Level Large Audited Dataset (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.9 - When Less is More: Investigating Data Pruning for Pretraining LLMs at Scale (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.9 - FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.8 - From Sparse to Dense: GPT-4 Summarization with Chain of Density Prompting (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.8 - Mobile V-MoEs: Scaling Down Vision Transformers via Sparse Mixture-of-Experts (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.7 - FIND: A Function Description Benchmark for Evaluating Interpretability Methods (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.7 - GOV.UK - Frontier AI Taskforce: first progress report (report)
- 9.7 - InstructDiffusion: A Generalist Modeling Interface for Vision Tasks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.7 - ImageBind-LLM: Multi-modality Instruction Tuning (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.7 - ProPainter: Improving Propagation and Transformer for Video Inpainting (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.7 - Tracking Anything with Decoupled Video Segmentation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.7 - FLM-101B: An Open LLM and How to Train It with $100K Budget (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.7 - Large-Scale Automatic Audiobook Creation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.7 - Large Language Models as Optimizers (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.7 - SyncDreamer: Generating Multiview-consistent Images from a Single-view Image (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.7 - Text2Control3D: Controllable 3D Avatar Generation in Neural Radiance Fields using Geometry-Guided Text-to-Image Diffusion Model (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.7 - DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.7 - XGen-7B Technical Report (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.7 - Reuse and Diffuse: Iterative Denoising for Text-to-Video Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.7 - SLiMe: Segment Like Me (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.6 - GPT Can Solve Mathematical Problems Without a Calculator (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.6 - Physically Grounded Vision-Language Models for Robotic Manipulation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.6 - Doppelgangers: Learning to Disambiguate Images of Similar Structures (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.5 - Artificial General Intelligence for Radiation Oncology (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.5 - Cognitive Architectures for Language Agents (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.5 - Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.5 - One Wide Feedforward is All You Need (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.5 - AniPortraitGAN: Animatable 3D Portrait Generation from 2D Image Collections (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.5 - PromptTTS 2: Describing and Generating Voices with Text Prompt (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.5 - Hierarchical Masked 3D Diffusion Model for Video Outpainting (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.4 - Concepts is All You Need: A More Direct Path to AGI (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.4 - StyleAdapter: A Single-Pass LoRA-Free Model for Stylized Image Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.4 - ControlMat: A Controlled Generative Approach to Material Capture (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.3 - ModelScope-Agent: Building Your Customizable Agent System with Open-source Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.2 - Bias and Fairness in Large Language Models: A Survey (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.2 - Contrastive Feature Masking Open-Vocabulary Vision Transformer (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.2 - Efficient RLHF: Reducing the Memory Usage of PPO (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.2 - MagicProp: Diffusion-based Video Editing via Motion-aware Appearance Propagation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.2 - Google's search for an AI future as it turns 25 (BBC news)
- 9.2 - ChatGPT Glossary: 41 AI Terms that Everyone Should Know (blog)
- 9.2 - CityDreamer: Compositional Generative Model of Unbounded 3D Cities (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.2 - Point-Bind & Point-LLM: Aligning Point Cloud with Multi-modality for 3D Understanding, Generation, and Instruction Following (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.1 - RLAIF: Scaling Reinforcement Learning from Human Feedback with AI Feedback (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.1 - YaRN: Efficient Context Window Extension of Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 9.1 - VideoGen: A Reference-Guided Latent Diffusion Approach for High Definition Text-to-Video Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.1 - Large Content And Behavior Models To Understand, Simulate, And Optimize Content And Behavior (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.1 - FACET: Fairness in Computer Vision Evaluation Benchmark (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 9.1 - UT Researchers Use AI to Translate Thoughts Into Text (blog)
- 9.1 - Baidu launches Ernie chatbot after Chinese government approval (news)
- 9.1 - The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.31 - PointLLM: Empowering Large Language Models to Understand Point Clouds (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.31 - AI Agents – Build and Host LLM Apps At Scale (blog)
- 8.31 - UAE launches Arabic large language model in Gulf push into generative AI (blog)
- 8.31 - UK MPs Propose Allies Form AI Union to Guard Against Adversaries (news)
- 8.31 - OpenAI released a new guide, Teaching with AI (blog)
- 8.31 - BioCoder: A Benchmark for Bioinformatics Code Generation with Contextual Pragmatic Knowledge (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.31 - Any-Size-Diffusion: Toward Efficient Text-Driven Synthesis for Any-Size HD Images (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.31 - LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.31 - MVDream: Multi-view Diffusion for 3D Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.31 - Emergence of Segmentation with Minimalistic White-Box Transformers (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.31 - Learning Vision-based Pursuit-Evasion Robot Policies (project), (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.30 - SAM-Med2D (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.30 - Large language models aren't people. Let's stop testing them as if they were (MIT TR blog)
- 8.30 - Sobering Reports on AI for CPR, Cancer Treatment Advice (blog)
- 8.30 - Why Generative AI Needs Another Breakthrough Moment (blog)
- 8.30 - Chinese ChatGPT alternatives just got approved for the general public (MIT TR news)
- 8.30 - OpenAI Nears $1 Billion of Annual Sales as ChatGPT Takes Off (news), (archive.today)
- 8.30 - International Governance of Civilian AI: A Jurisdictional Certification Approach (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.30 - AnomalyGPT: Detecting Industrial Anomalies using Large Vision-Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.30 - LLaSM: Large Language and Speech Model (project), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.30 - RoboTAP: Tracking Arbitrary Points for Few-Shot Visual Imitation (project), (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.29 - Vector Search with OpenAI Embeddings: Lucene Is All You Need (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.29 - Radiology-Llama2: Best-in-Class Large Language Model for Radiology (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.29 - Inside Google's Plans To Fix Healthcare With Generative AI (Forbes news)
- 8.29 - Google's new Vertex AI features to unlock advanced LLM capabilities (blog)
- 8.29 - The company landscape for artificial intelligence in large-molecule drug discovery (Nature Reviews Drug Discovery, doi: https://doi.org/10.1038/d41573-023-00139-0)
- 8.29 - Full Code Medical Launches Full Code AI, the First Integration of ChatGPT in Software-Based Medical Simulation (blog)
- 8.29 - ChatGPT in Medical Education and Research: A Boon or a Bane? (DOI: 10.7759/cureus.44316)
- 8.29 - OpenAI Unveils ChatGPT for Businesses, Stepping Up Revenue Push (news), (archive.today)
- 8.28 - Graph Meets LLMs: Towards Large Graph Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.28 - AI Deception: A Survey of Examples, Risks, and Potential Solutions (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.28 - Is the AI boom already over? (blog)
- 8.28 - Most Americans haven't used ChatGPT; few think it will have a major impact on their job (Pew Research Center news)
- 8.28 - OpenAI - Introducing ChatGPT Enterprise (blog)
- 8.28 - PointHPS: Cascaded 3D Human Pose and Shape Estimation from Point Clouds (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.27 - MedAlign: A Clinician-Generated Dataset for Instruction Following with Electronic Medical Records (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.26 - ORES: Open-vocabulary Responsible Visual Synthesis (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.25 - The One Generative AI Risk That No One Is Talking About (blog)
- 8.25 - Korea's Naver joins generative AI race with HyperCLOVA X large language model (blog)
- 8.25 - Can ChatGPT Transform Healthcare? (blog)
- 8.25 - OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.25 - Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.25 - Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.25 - SoTaNa: The Open-Source Software Development Assistant (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.24 - Code Llama: Open Foundation Models for Code (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS), (:octocat:)
- 8.24 - Evaluating large language models on medical evidence summarization (npj Digital Medicine volume 6, https://doi.org/10.1038/s41746-023-00896-7)
- 8.24 - Harnessing AI for Psychiatric Use Requires More Nuanced Discussion (blog)
- 8.24 - Use of Artificial Intelligence Chatbots for Cancer Treatment Information (JAMA Oncol. Published online August 24, 2023. doi:10.1001/jamaoncol.2023.2954)
- 8.24 - Prompt2Model: Generating Deployable Models from Natural Language Instructions (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.24 - American Stories: A Large-Scale Structured Text Dataset of Historical U.S. Newspapers (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.24 - Use of LLMs for Illicit Purposes: Threats, Prevention Measures, and Vulnerabilities (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.24 - Problems in using LLMs in commercial products (blog)
- 8.24 - Code LLaMA is now on Perplexity's LLaMa Chat! (tweet), (labs)
- 8.24 - Meta AI released Code Llama, a large language model built on top of Llama 2, fine-tuned for coding & state-of-the-art (tweet), (blog), (paper), (:octocat:), (Model)
- 8.23 - Efficient Benchmarking (of Language Models) (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.23 - New Study Gives ChatGPT High Marks as a CDS Tool (news)
- 8.23 - OpenAI launched fine-tuning for GPT-3.5 Turbo (tweet), (blog)
- 8.23 - SeamlessM4T: Massive Multilingual Multimodal Machine Translation (paper), (code), (blog), (demo), (tweet)
- 8.22 - A Survey on Large Language Model based Autonomous Agents (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.22 - Comparison of Ophthalmologist and Large Language Model Chatbot Responses to Online Patient Eye Care Questions (JAMA Netw Open. 2023;6(8):e2330320. doi:10.1001/jamanetworkopen.2023.30320)
- 8.22 - Assessing the Utility of ChatGPT Throughout the Entire Clinical Workflow: Development and Usability Study (J Med Internet Res 2023;25:e48659 doi: 10.2196/48659)
- 8.22 - Giraffe - 32K Long Context Open-Source LLMs (tweet), (blog), (Model)
- 8.22 - Language to rewards for robotic skill synthesis (Google blog), (tweet)
- 8.22 - Stabilizing Unsupervised Environment Design with a Learned Adversary (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.21 - Large Language Models for Software Engineering: A Systematic Literature Review (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 8.21 - AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.21 - Giraffe: Adventures in Expanding Context Lengths in LLMs (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.21 - TADA! Text to Animatable Digital Avatars (project), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (tweet)
- 8.21 - Large Language Models on Wikipedia-Style Survey Generation: an Evaluation in NLP Concepts (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.21 - Instruction Tuning for Large Language Models: A Survey (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.21 - Large Language Models in Hematology Case Solving: A Comparative Study of ChatGPT-3.5, Google Bard, and Microsoft Bing (DOI: 10.7759/cureus.43861)
- 8.20 - LegalBench: A Collaboratively Built Benchmark for Measuring Legal Reasoning in Large Language Models (project), (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.19 - HumanLiff: Layer-wise 3D Human Generation with Diffusion Model (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.19 - Meet FraudGPT: The Dark Side Twin of ChatGPT (news)
- 8.19 - AI2 Dolma: 3 Trillion Token Open Corpus for Language Model Pretraining (blog)
- 8.19 - AI2 drops biggest open dataset yet for training language models (TechCrunch news)
- 8.18 - Using Large Language Models to Simulate Multiple Humans and Replicate Human Subject Studies (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.18 - Graph of Thoughts: Solving Elaborate Problems with Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.18 - NYU Langone Health Holds First Generative AI "Prompt-a-Thon" (tweet), (news), (Nature paper)
- 8.18 - Autonomous visual information seeking with large language models (Google blog)
- 8.18 - Mind + Machine: ChatGPT as a Basic Clinical Decisions Support Tool (DOI: 10.7759/cureus.43690)
- 8.17 - Reinforced Self-Training for Language Modeling (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.17 - Consciousness in Artificial Intelligence: Insights from the Science of Consciousness (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.17 - OpenAI acquires start-up Global Illumination to work on core products, ChatGPT (Reuters news)
- 8.16 - RAVEN: In-Context Learning with Retrieval Augmented Encoder-Decoder Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.16 - Atom-by-atom protein generation and beyond with language models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.16 - TeCH: Text-guided Reconstruction of Lifelike Clothed Humans (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.16 - Open challenges in LLM research (blog)
- 8.16 - Microsoft Introduces Azure ChatGPT: A Private Version of ChatGPT Tailored for the Enterprise (news)
- 8.15 - GOV.UK - Artificial Intelligence for Decarbonisation innovation programme: Stream 3 (announcement)
- 8.15 - Introducing DeciCoder: The New Gold Standard in Efficient and Accurate Code Generation (blog), (project)
- 8.15 - CALYPSO: LLMs as Dungeon Masters' Assistants (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.15 - CoDeF: Content Deformation Fields for Temporally Consistent Video Processing (project), (Hires Demo), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.15 - Link-Context Learning for Multimodal LLMs (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.15 - Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.15 - Teach LLMs to Personalize -- An Approach inspired by Writing Education (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.14 - LLM Self Defense: By Self Examination, LLMs Know They Are Being Tricked (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.14 - GIT-Mol: A Multi-modal Large Language Model for Molecular Science with Graph, Image, and Text (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.14 - Chatbots in Drug Discovery: A Case Study on Anti-Cocaine Addiction Drug Development with ChatGPT (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.14 - Large Language Models for Information Retrieval: A Survey (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.14 - Bayesian Flow Networks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.14 - Mind your Language (Model): Fact-Checking LLMs and their Role in NLP Research and Practice (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.14 - The Devil is in the Errors: Leveraging Large Language Models for Fine-grained Machine Translation Evaluation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.14 - CausalLM is not optimal for in-context learning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.14 - OctoPack: Instruction Tuning Code Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.14 - SpeechX: Neural Codec Language Model as a Versatile Speech Transformer (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.13 - Research Highlights Jul-Aug 2023: Llama 2, Flash-Attention 2, and More (Blog)
- 8.13 - What if Generative AI turned out to be a Dud? (blog)
- 8.13 - The most powerful open source instructions dataset: Flan (378 Million samples) (tweet), (HF)
- 8.13 - VisIT-Bench: A Benchmark for Vision-Language Instruction Following Inspired by Real-World Use (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.13 - IP-Adapter: Text Compatible Image Prompt Adapter for Text-to-Image Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.12 - A new solution and concrete implementation steps for Artificial General Intelligence (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.12 - AI Town - a virtual town where AI characters live, chat and socialize (:octocat:), (Live Demo)
- 8.12 - GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher (project), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.12 - Release of the Platypus family of fine-tuned LLMs (tweet), (project), (paper), (:octocat:)
- 8.12 - Self-Alignment with Instruction Backtranslation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.11 - Detecting and Preventing Hallucinations in Large Vision Language Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.11 - BOLAA: Benchmarking and Orchestrating LLM-augmented Autonomous Agents (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.11 - AudioLDM 2: Learning Holistic Audio Generation with Self-supervised Pretraining (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.11 - Follow Anything: Open-set detection, tracking, and following in real-time (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.11 - ChatGPT expands its "custom instructions" feature to free users (TechCrunch news)
- 8.10 - The Multi-modality Cell Segmentation Challenge: Towards Universal Solutions (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.10 - DOD Announces Establishment of Generative AI Task Force (U.S. Department of Defense, Release)
- 8.10 - Metacognitive Prompting Improves Understanding in Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.10 - Testing GPT-4 with Wolfram Alpha and Code Interpreter plug-ins on math and science problems (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.10 - Trustworthy LLMs: a Survey and Guideline for Evaluating Large Language Models' Alignment (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.10 - OpenProteinSet: Training data for structural biology at scale (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.9 - Inst-Inpaint: Instructing to Remove Objects with Diffusion Models (project), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (demo)
- 8.9 - A Comparative Study of Open-Source Large Language Models, GPT-4 and Claude 2: Multiple-Choice Test Taking in Nephrology (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.9 - LLMeBench: A Flexible Framework for Accelerating LLMs Benchmarking (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.9 - Extrapolating Large Language Models to Non-English by Aligning Languages (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.9 - ChatGPT answers more than half of software engineering questions incorrectly (ZDNet news)
- 8.9 - Releasing Claude Instant 1.2 (Blog)
- 8.9 - Shepherd: A Critic for Language Model Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.9 - JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.9 - Could a Large Language Model Be Conscious? (news)
- 8.9 - Exciting news! Stability AI has launched StableCode, the revolutionary generative AI LLM for coding! (tweet), (blog)
- 8.9 - New research visualizes the political bias of all major AI language models (tweet)
- 8.8 - Continual Pre-Training of Large Language Models: How to (re)warm your model? (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.8 - Gentopia: A Collaborative Platform for Tool-Augmented LLMs (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.8 - AgentSims: An Open-Source Sandbox for Large Language Model Evaluation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.8 - Empowering Vision-Language Models to Follow Interleaved Vision-Language Instructions (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.8 - MedMine: Examining Pre-trained Language Models on Medication Mining (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.8 - Separate Anything You Describe (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.8 - AI regulation is taking shape, but startups are being left out (Verge news)
- 8.8 - Accelerating LLM Inference with Staged Speculative Decoding (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.8 - 3D Gaussian Splatting for Real-Time Radiance Field Rendering (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.8 - OpenAI launches web crawler GPTBot, and instructions on how to block it (Mashable news)
- 8.8 - FLIRT: Feedback Loop In-context Red Teaming (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.8 - SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.8 - Study Tests Large Language Models' Ability to Answer Clinical Questions (JAMA. 2023;330(6):496. doi:10.1001/jama.2023.12553)
- 8.8 - Why Are So Many Organizations Banning ChatGPT? (BlackBerry Blog)
- 8.7 - Coupling Symbolic Reasoning with Language Modeling for Efficient Longitudinal Understanding of Unstructured Electronic Medical Records (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.7 - Zhongjing: Enhancing the Chinese Medical Capabilities of Large Language Model through Expert Feedback and Real-world Multi-turn Dialogue (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.7 - Extracting detailed oncologic history and treatment plan from medical oncology notes with large language models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.7 - UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition (project), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.7 - Studying Large Language Model Generalization with Influence Functions (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.7 - "Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.7 - RecycleGPT: An Autoregressive Language Model with Recyclable Module (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.7 - AgentBench: Evaluating LLMs as Agents (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.7 - Simple synthetic data reduces sycophancy in large language models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.7 - Creation and Adoption of Large Language Models in Medicine (JAMA doi:10.1001/jama.2023.14217)
- 8.7 - Doctors Vs. ChatGPT: Which Is More Empathetic? (Forbes news)
- 8.7 - Criminals Have Created Their Own ChatGPT Clones (Wired news)
- 8.7 - Large Language Models Answer Medical Questions Accurately, but Can't Match Clinicians' Knowledge (JAMA doi:10.1001/jama.2023.14311)
- 8.6 - Scaling Clinical Trial Matching Using Large Language Models: A Case Study in Oncology (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.6 - Pre-Trained Large Language Models for Industrial Control (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.6 - Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.6 - A Simple AI Governance Framework In The Age Of ChatGPT (Forbes news)
- 8.5 - ReCLIP: Refine Contrastive Language Image Pre-Training with Source Free Domain Adaptation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.4 - Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.4 - Who Answers It Better? An In-Depth Analysis of ChatGPT and Stack Overflow Answers to Software Engineering Questions (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.4 - Towards Generalist Foundation Model for Radiology (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.3 - Emergent Analogical Reasoning in Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.3 - The Capability of Large Language Models to Measure Psychiatric Functioning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.3 - Local Large Language Models for Complex Structured Medical Tasks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.3 - Huge set of ChatGPT updates (tweet)
- 8.3 - Accuracy of Vitreoretinal Disease Information From an Artificial Intelligence Chatbot (JAMA Ophthalmology doi: 10.1001/jamaophthalmol.2023.3314)
- 8.2 - XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.2 - 4 Charts That Show Why AI Progress Is Unlikely to Slow Down (Time news)
- 8.2 - Do Multilingual Language Models Think Better in English? (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.2 - Exploring the psychology of GPT-4's Moral and Legal Reasoning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.2 - DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.2 - Flows: Building Blocks of Reasoning and Collaborating AI (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.1 - MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.1 - Retrieval Augmented Generation and Representative Vector Summarization for large unstructured textual data in Medical Education (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.1 - Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 8.1 - Upstage LLM #1 in Open LLM Leaderboard (Leaderboard)
- 8.1 - ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 8.1 - ChatGPT app for Android is now available in all countries and regions (tweet), (blog)
- 7.31 - LLMs4OL: Large Language Models for Ontology Learning (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.31 - Plotting Progress in AI (blog)
- 7.31 - Getting from Generative AI to Trustworthy AI: What LLMs might learn from Cyc (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.31 - Learning to Model the World with Language (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.30 - Do LLMs Possess a Personality? Making the MBTI Test an Amazing Evaluation for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.30 - Unified Model for Image, Video, Audio and Language Tasks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.29 - The shaky foundations of large language models and foundation models for electronic health records (npj Digital Medicine, https://doi.org/10.1038/s41746-023-00879-8), (PDF)
- 7.29 - Uncertainty in Natural Language Generation: From Theory to Applications (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.28 - Exploring Format Consistency for Instruction Tuning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.28 - โญ Med-HALT: Medical Domain Hallucination Test for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.28 - Med-Flamingo: a Multimodal Medical Few-shot Learner (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.28 - Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.28 - How Good is Google Bard's Visual Understanding? An Empirical Study on Open Challenges (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.27 - Generative AI for Medical Imaging: extending the MONAI Framework (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.27 - Guidance for Authors, Peer Reviewers, and Editors on Use of AI, Language Models, and Chatbots (JAMA doi:10.1001/jama.2023.12500)
- 7.27 - Chatbots, Artificial Intelligence, and the Future of Scientific Reporting (JAMA Ophthalmology doi: 10.1001/jamaophthalmol.2023.3344)
- 7.27 - Matching Patients to Clinical Trials with Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.27 - Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.27 - Scaling Up and Distilling Down: Language-Guided Robot Skill Acquisition (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.27 - NeurIPS 2023 Large Language Model Efficiency Challenge: 1 LLM + 1GPU + 1Day (site)
- 7.27 - Google DeepMind RT-2: Vision-Language-Action Models (tweet), (blog), (project), (PDF)
- 7.27 - Multilingual Code Co-Evolution Using Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.27 - The Guardian's updated editorial code guidance now includes a section on generative AI (PDF)
- 7.27 - Training Data Extraction From Pre-trained Language Models: A Survey (report), (PDF)
- 7.27 - โญ Universal and Transferable Adversarial Attacks on Aligned Language Models (project), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (SS)
- 7.27 - NeRF-Det: Learning Geometry-Aware Volumetric Representation for Multi-View 3D Object Detection (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.27 - PanGu-Coder2: Boosting Large Language Models for Code with Ranking Feedback (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.27 - WavJourney: Compositional Audio Creation with Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.26 - Supporting Open Source and Open Science in the EU AI Act (Blog), (PDF)
- 7.26 - Points-to-3D: Bridging the Gap between Sparse Points and Shape-Controllable Text-to-3D Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.26 - Tracking Anything in High Quality (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.26 - Stability AI Announces Stable Diffusion XL 1.0, Featured on Amazon Bedrock (blog), (SD-XL 1.0-base Model Card), (SD-XL 1.0-refiner Model Card), (:octocat:)
- 7.26 - Towards Generalist Biomedical AI (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.26 - โญ Microsoft, Anthropic, Google, and OpenAI launch Frontier Model Forum (Microsoft), (Google), (OpenAI), (Anthropic)
- 7.26 - Evaluating the Moral Beliefs Encoded in LLMs (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.26 - WebArena: A Realistic Web Environment for Building Autonomous Agents (project), (๐), (:octocat:)
- 7.26 - ARB: Advanced Reasoning Benchmark for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.26 - OpenAI scuttles AI-written text detector over "low rate of accuracy" (news)
- 7.25 - Foundational Models Defining a New Era in Vision: A Survey and Outlook (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.25 - Evaluating Large Language Models for Radiology Natural Language Processing (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.25 - LLM-Rec: Personalized Recommendation via Prompting Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.25 - How Can Large Language Models Help Humans in Design and Manufacturing? (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.25 - UK House of Lords Announces Inquiry into Large Language Models (news)
- 7.25 - FacTool: Factuality Detection in Generative AI -- A Tool Augmented Framework for Multi-Task and Multi-Domain Scenarios (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.25 - LoraHub: Efficient Cross-Task Generalization via Dynamic LoRA Composition (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.25 - ChatGPT is a black box: how AI research can break it open (Nature doi: https://doi.org/10.1038/d41586-023-02366-2)
- 7.25 - ChatGPT broke the Turing test - the race is on for new ways to assess AI (Nature doi: https://doi.org/10.1038/d41586-023-02361-7), (PDF)
- 7.25 - Evaluating the Ripple Effects of Knowledge Editing in Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.25 - 3D-LLM: Injecting the 3D World into Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.25 - RLCD: Reinforcement Learning from Contrast Distillation for Language Model Alignment (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.24 - A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.24 - โญ Aligning Large Language Models with Human: A Survey (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.24 - A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.24 - LLMs get a medical education (Nature DOI: 10.1038/d41591-023-00064-0)
- 7.24 - MC-JEPA: A Joint-Embedding Predictive Architecture for Self-Supervised Learning of Motion and Content Features (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.24 - Interpolating between Images with Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.24 - PUMA: Secure Inference of LLaMA-7B in Five Minutes (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.23 - GitHub repo for Generative Agents: Interactive Simulacra of Human Behavior (:octocat:)
- 7.23 - Optimized Network Architectures for Large Language Model Training with Billions of Parameters (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.22 - A Zero-shot and Few-shot Study of Instruction-Finetuned Large Language Models Applied to Clinical and Biomedical Tasks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.22 - Introducing FreeWilly1 and FreeWilly2 - The latest groundbreaking LLMs from Stability AI's and @carperai lab! โญ (tweet)
- 7.22 - llama2-webui: Run Llama 2 locally with gradio UI on GPU or CPU from anywhere (Linux/Windows/Mac) (:octocat:)
- 7.22 - ChatGPT for Android launches next week (news)
- 7.22 - Expedia launches ChatGPT travel planning tool (news)
- 7.21 - CohortGPT: An Enhanced GPT for Participant Recruitment in Clinical Study (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.21 - FACT SHEET: Biden-Harris Administration Secures Voluntary Commitments from Leading Artificial Intelligence Companies to Manage the Risks Posed by AI (White House news)
- 7.21 - Prompting Large Language Models with Speech Recognition Abilities (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.21 - CopyRNeRF: Protecting the CopyRight of Neural Radiance Fields (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.21 - FaceCLIPNeRF: Text-driven 3D Face Manipulation using Deformable Neural Radiance Fields (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.21 - Subject-Diffusion: Open Domain Personalized Text-to-Image Generation without Test-time Fine-tuning (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.21 - Meet FreeWilly, Our Large And Mighty Instruction Fine-Tuned Models (stability.ai announcement)
- 7.21 - WormGPT: ChatGPT For Cybercriminals (news)
- 7.21 - Brain2Music: Reconstructing Music from Human Brain Activity (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.21 - OpenAI launches customized instructions for ChatGPT (news)
- 7.20 - L-Eval: Instituting Standardized Evaluation for Long Context Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.20 - DNA-Rendering: A Diverse Neural Actor Repository for High-Fidelity Human-centric Rendering (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.20 - LLMs as Workers in Human-Computational Algorithms? Replicating Crowdsourcing Pipelines with LLMs (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.20 - DialogStudio: Towards Richest and Most Diverse Unified Dataset Collection for Conversational AI (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.20 - FABRIC: Personalizing Diffusion Models with Iterative Feedback (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.20 - โญ FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.20 - Instruction-following Evaluation through Verbalizer Manipulation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.20 - PASTA: Pretrained Action-State Transformer Agents (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.20 - TokenFlow: Consistent Diffusion Features for Consistent Video Editing (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.20 - โญ SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.20 - Artificial intelligence is making the union movement's case - and even ChatGPT knows it (news)
- 7.20 - Apple is testing a ChatGPT-like AI chatbot (news)
- 7.20 - Someone Used ChatGPT to Finish the Game of Thrones Book Series (news)
- 7.20 - โญ Meta-Transformer: A Unified Framework for Multimodal Learning (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.19 - PharmacyGPT: The AI Pharmacist (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.19 - IvyGPT: InteractiVe Chinese pathwaY language model in medical domain (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.19 - Study Tests Large Language Models' Ability to Answer Clinical Questions (JAMA doi: 10.1001/jama.2023.12553)
- 7.19 - (Ab)using Images and Sounds for Indirect Instruction Injection in Multi-Modal LLMs (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.19 - Text2Layer: Layered Image Generation using Latent Diffusion Model (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.19 - Towards A Unified Agent with Foundation Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.19 - โญ Challenges and Applications of Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.19 - On the Origin of LLMs: An Evolutionary Tree and Graph for 15,821 Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (Constellation)
- 7.18 - AnyDoor: Zero-shot Object-level Image Customization (project), (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (demo)
- 7.18 - Augmenting CLIP with Improved Visio-Linguistic Reasoning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.18 - How generative AI will reshape the enterprise (report)
- 7.18 - How is ChatGPT's behavior changing over time? (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.18 - NU-MCC: Multiview Compressive Coding with Neighborhood Decoder and Repulsive UDF (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.18 - Measuring Faithfulness in Chain-of-Thought Reasoning (PDF)
- 7.18 - 🦙 Llama 2 and Claude 2 are now live on Chatbot Arena! (arena)
- 7.18 - Statement of Support for Meta's Open Approach to Today's AI (blog)
- 7.18 - Llama 2: Open Foundation and Fine-Tuned Chat Models (paper), (PDF), (:octocat:)
- 7.18 - Meta and Microsoft Introduce the Next Generation of Llama (tweet), (news), (Llama2), (download)
- 7.18 - Retentive Network: A Successor to Transformer for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.18 - Diffusion Models Beat GANs on Image Classification (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.18 - BuboGPT: Enabling Visual Grounding in Multi-Modal LLMs (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.18 - TableGPT: Towards Unifying Tables, Nature Language and Commands into One GPT (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.17 - Abductive Reasoning with the GPT-4 Language Model: Case studies from criminal investigation, medical practice, scientific research (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.17 - Performance of a Large Language Model on Practice Questions for the Neonatal Board Examination (JAMA Pediatrics doi: 10.1001/jamapediatrics.2023.2373)
- 7.17 - Large language models in medicine (Nature Medicine, https://doi.org/10.1038/s41591-023-02448-8)
- 7.17 - Chatbot vs Medical Student Performance on Free-Response Clinical Reasoning Examinations (JAMA, doi:10.1001/jamainternmed.2023.2909)
- 7.17 - AlpaGasus: Training A Better Alpaca with Fewer Data (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.16 - Communicative Agents for Software Development (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.16 - Planting a SEED of Vision in Large Language Model (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.15 - AI Research Highlights June-July 2023: Long Contexts and Scaling Transformers to 1,000,000,000 Tokens (Blog)
- 7.15 - DreamTeacher: Pretraining Image Backbones with Deep Generative Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.15 - INVE: Interactive Neural Video Editing (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.14 - Software Testing with Large Language Model: Survey, Landscape, and Vision (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 7.14 - Are Large Language Models a Threat to Digital Public Goods? Evidence from Activity on Stack Overflow (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.14 - China takes major step in regulating generative AI services like ChatGPT (news), (Interim Measures for the Management of Generative AI Services)
- 7.14 - What Happens When You Ask a Chinese Chatbot About Taiwan? (news)
- 7.14 - In-context Autoencoder for Context Compression in a Large Language Model (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.14 - Mega-TTS 2: Zero-Shot Text-to-Speech with Arbitrary Length Speech Prompts (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.14 - Learning to Retrieve In-Context Examples for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.14 - Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.14 - CoTracker: It is Better to Track Together (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.14 - The Practical Guides for Large Language Models (:octocat:)
- 7.14 - Scaling Autoregressive Multi-Modal Models: Pretraining and Instruction Tuning (paper), (PDF)
- 7.14 - Introducing CM3leon, a more efficient, state-of-the-art generative model for text and images (blog)
- 7.14 - Animate-A-Story: Storytelling with Retrieval-Augmented Video Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.14 - Domain-Agnostic Tuning-Encoder for Fast Personalization of Text-To-Image Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.14 - Generating Benchmarks for Factuality Evaluation of Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.13 - F.T.C. Opens Investigation Into ChatGPT Maker Over Technology's Potential Harms (news)
- 7.13 - Instruction Mining: High-Quality Instruction Data Selection for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.13 - Patch n' Pack: NaViT, a Vision Transformer for any Aspect Ratio and Resolution (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.13 - T2I-CompBench: A Comprehensive Benchmark for Open-world Compositional Text-to-image Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.13 - Distilling Large Language Models for Biomedical Knowledge Extraction: A Case Study on Adverse Drug Events (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.13 - DIALGEN: Collaborative Human-LM Generated Dialogues for Improved Understanding of Human-Human Conversations (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.13 - Copy Is All You Need (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.13 - AniFaceDrawing: Anime Portrait Exploration during Your Sketching (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.13 - HyperDreamBooth: HyperNetworks for Fast Personalization of Text-to-Image Models (project), (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.13 - Stability AI releases Stable Doodle, a sketch-to-image tool (news), (announcement)
- 7.12 - Efficient 3D Articulated Human Generation with Layered Surface Volumes (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.12 - SayPlan: Grounding Large Language Models using 3D Scene Graphs for Scalable Task Planning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.12 - Stack More Layers Differently: High-Rank Training Through Low-Rank Updates (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.12 - PolyLM: An Open Source Polyglot Large Language Model (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.12 - Today we announce the formation of xAI (announcement)
- 7.12 - Large language models encode clinical knowledge (Nature, https://doi.org/10.1038/s41586-023-06291-2), (PDF)
- 7.12 - Google's NotebookLM (waitlist)
- 7.12 - 27% of jobs at high risk from AI revolution, says OECD (news)
- 7.12 - Objaverse-XL: A Universe of 10M+ 3D Objects (PDF)
- 7.12 - EgoVLPv2: Egocentric Video-Language Pre-training with Fusion in the Backbone (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.12 - Differentiable Blocks World: Qualitative 3D Decomposition by Rendering Primitives (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.11 - AmadeusGPT: a natural language interface for interactive animal behavioral analysis (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.11 - 3 principles for regulatory-grade large language model application (CIO news)
- 7.11 - Generative Pretraining in Multimodality (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.11 - DNAGPT: A Generalized Pretrained Tool for Multiple DNA Sequence Analysis Tasks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.11 - VampNet: Music Generation via Masked Acoustic Token Modeling (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.11 - Shelving, Stacking, Hanging: Relational Pose Diffusion for Multi-modal Rearrangement (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.11 - International Institutions for Advanced AI (โ), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 7.11 - Semantic-SAM: Segment and Recognize Anything at Any Granularity (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.11 - AI tools are designing entirely new proteins that could transform medicine (Nature, doi: https://doi.org/10.1038/d41586-023-02227-y), (PDF)
- 7.11 - Shutterstock expands deal with OpenAI to build generative AI tools (news)
- 7.11 - Unleashing Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.11 - Secrets of RLHF in Large Language Models Part I: PPO (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.11 - Anthropic's Claude-2 was just released (blog), (claude)
- 7.11 - Large Language Models as General Pattern Machines (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.10 - Self-Diagnosis and Large Language Models: A New Front for Medical Misinformation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.10 - Google is testing its medical AI chatbot at the Mayo Clinic (news)
- 7.10 - RLTF: Reinforcement Learning from Unit Test Feedback (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.10 - AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.10 - GPT Researcher - GPT based autonomous agent that does online comprehensive research on any given topic (:octocat:)
- 7.9 - Chapyter: ChatGPT Code Interpreter in Jupyter Notebooks (:octocat:)
- 7.9 - DragGAN - Drag Your GAN - Face Inversion: Interactive Point-based Manipulation on the Generative Image Manifold (tweet), (HF demo)
- 7.8 - Opening up ChatGPT: Tracking openness, transparency, and accountability in instruction-tuned text generators (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.8 - Google's medical AI chatbot is already being tested in hospitals (news)
- 7.8 - Large Language Models for Supply Chain Optimization (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.8 - Sketch-A-Shape: Zero-Shot Sketch-to-3D Shape Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.8 - AutoDecoding Latent 3D Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.8 - Awesome Generative AI Techniques: a curated list of Generative AI Techniques (:octocat:)
- 7.8 - Robots say they won't steal jobs, rebel against humans (news)
- 7.7 - CheXmask: a large-scale dataset of anatomical segmentation masks for multi-center chest x-ray images (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.7 - Teaching Arithmetic to Small Transformers (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.7 - GPT4RoI: Instruction Tuning Large Language Model on Region-of-Interest (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.6 - State of Computer Vision 2023: From Vision Transformers to Neural Radiance Fields (Blog)
- 7.6 - What Should Data Science Education Do with Large Language Models? (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.6 - A.I. Will Change Medicine but Not What It Means to Be a Doctor (NYT, news)
- 7.6 - Frontier AI Regulation: Managing Emerging Risks to Public Safety (โ), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 7.6 - The imperative for regulatory oversight of large language models (or generative AI) in healthcare (npj Digital Medicine, https://doi.org/10.1038/s41746-023-00873-0), (PDF)
- 7.6 - OpenAI launches ChatGPT code interpreter for better coding using only natural language (tweet), (blog), (news)
- 7.6 - Jailbroken: How Does LLM Safety Training Fail? (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.6 - Building Cooperative Embodied Agents Modularly with Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.6 - What Matters in Training a GPT4-Style Language Model with Multimodal Inputs? (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.6 - DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.6 - Elastic Decision Transformer (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.6 - Releasing CodeGen2.5, a small but mighty LLM for code (tweet), (blog), (:octocat:)
- 7.6 - Training Models to Generate, Recognize, and Reframe Unhelpful Thoughts (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.6 - Lost in the Middle: How Language Models Use Long Contexts (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.6 - Artificial Intelligence in Clinical Diagnosis: Opportunities, Challenges, and Hype (JAMA, doi:10.1001/jama.2023.11440)
- 7.6 - AI Chatbots, Health Privacy, and Challenges to HIPAA Compliance (JAMA, doi:10.1001/jama.2023.9458)
- 7.6 - Health Care Privacy Risks of AI Chatbots (JAMA, doi:10.1001/jama.2023.9618)
- 7.6 - Generative AI in Health Care and Liability Risks for Physicians and Safety Concerns for Patients (JAMA, doi:10.1001/jama.2023.9630)
- 7.6 - The Challenges for Regulating Medical Use of ChatGPT and Other Large Language Models (JAMA, doi:10.1001/jama.2023.9651)
- 7.6 - A Survey on Evaluation of Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (SS)
- 7.5 - OpenAI - Introducing Superalignment (blog)
- 7.5 - Collaborative Score Distillation for Consistent Visual Synthesis (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.5 - Becoming self-instruct: introducing early stopping criteria for minimal instruct tuning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.5 - Open-Source Large Language Models Outperform Crowd Workers and Approach ChatGPT in Text-Annotation Tasks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.5 - Embodied Task Planning with Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.5 - Flacuna: Unleashing the Problem Solving Power of Vicuna using FLAN Fine-Tuning (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.5 - Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.5 - Physics-based Motion Retargeting from Sparse Inputs (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.5 - MSViT: Dynamic Mixed-Scale Tokenization for Vision Transformers (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.5 - Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.5 - All about the generative tasks in the Generative Medical AI (blog)
- 7.5 - LongNet: Scaling Transformers to 1,000,000,000 Tokens (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.4 - PULSAR at MEDIQA-Sum 2023: Large Language Models Augmented by Synthetic Dialogue Convert Patient Dialogues to Medical Records (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.4 - A ChatGPT Aided Explainable Framework for Zero-Shot Medical Image Diagnosis (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.4 - Segment Anything Meets Point Tracking (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.4 - Career Essentials in Generative AI by Microsoft and LinkedIn (learning)
- 7.4 - SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis (PDF), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.4 - Real-time Monocular Full-body Capture in World Space via Sequential Proxy-to-Motion Learning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.3 - Motion-X: A Large-scale 3D Expressive Whole-body Human Motion Dataset (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.3 - EmoGen: Eliminating Subjective Bias in Emotional Music Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.3 - SketchMetaFace: A Learning-based Sketching Interface for High-fidelity 3D Character Face Modeling (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.2 - LEDITS: Real Image Editing with DDPM Inversion and Semantic Guidance (โ), (๐), (๐), (๐ ), (โณ๏ธ), (demo)
- 7.1 - Global Mental Health Services and the Impact of Artificial Intelligence-Powered Large Language Models (JAMA doi:10.1001/jamapediatrics.2023.2373)
- 7.1 - Personality Traits in Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.1 - DisCo: Disentangled Control for Referring Human Dance Generation in Real World (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 7.1 - BatGPT: A Bidirectional Autoregessive Talker from Generative Pre-trained Transformer (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 7.1 - Improve ChatGPT with Knowledge Graphs (blog)
- 7.1 - The Rise of the AI Engineer (Blog)
- 6.30 - DrugGPT: A GPT-based Strategy for Designing Potential Ligands Targeting Specific Proteins (โ), (๐)
- 6.30 - Doctor Chatbot: The EU's Regulatory Prescription for Generative Medical AI (Oslo Law Review, https://doi.org/10.18261/olr.10.1.1), (PDF)
- 6.30 - Preference Ranking Optimization for Human Alignment (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.30 - Reliability of Medical Information Provided by ChatGPT: Assessment Against Clinical Guidelines and Patient Information Quality Instrument (JMIR, doi: 10.2196/47479), (PDF)
- 6.30 - Large language model AI chatbots require approval as medical devices (Nature Medicine, https://doi.org/10.1038/s41591-023-02412-6)
- 6.30 - LLaVAR: Enhanced Visual Instruction Tuning for Text-Rich Image Understanding (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.30 - Generative AI for Programming Education: Benchmarking ChatGPT, GPT-4, and Human Tutors (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.30 - Michelangelo: Conditional 3D Shape Generation based on Shape-Image-Text Aligned Latent Representation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.30 - Generate Anything Anywhere in Any Scene (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.30 - Benchmarking Large Language Model Capabilities for Conditional Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.29 - End-to-end Autonomous Driving: Challenges and Frontiers (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.29 - UMASS_BioNLP at MEDIQA-Chat 2023: Can LLMs generate high-quality synthetic note-oriented doctor-patient conversations? (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.29 - โญ A Survey of Large Language Models - version 11 (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (SS)
- 6.29 - June 2023, A Stage Review of Instruction Tuning (notion)
- 6.29 - Towards Language Models That Can See: Computer Vision Through the LENS of Natural Language (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.29 - Towards Measuring the Representation of Subjective Global Opinions in Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.29 - REFLECT: Summarizing Robot Experiences for Failure Explanation and Correction (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.29 - One-2-3-45: Any Single Image to 3D Mesh in 45 Seconds without Per-Shape Optimization (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.29 - DreamDiffusion: Generating High-Quality Images from Brain EEG Signals (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.28 - Regulations to govern use of AI in health records could come later this year (news)
- 6.28 - On the Exploitability of Instruction Tuning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.28 - ChatLaw: Open-Source Legal Large Language Model with Integrated External Knowledge Bases (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.28 - RSPrompter: Learning to Prompt for Remote Sensing Instance Segmentation based on Visual Foundation Model (โ), (๐), (๐), (๐ ), (โณ๏ธ), (demo)
- 6.28 - Extending Context Window of Large Language Models via Positional Interpolation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.28 - CLIPA-v2: Scaling CLIP Training with 81.1% Zero-shot ImageNet Accuracy within a $10,000 Budget; An Extra $4,000 Unlocks 81.8% Accuracy (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.28 - Automatic Calibration and Error Correction for Large Language Models via Pareto Optimal Self-Supervision (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.28 - PoseDiffusion: Solving Pose Estimation via Diffusion-aided Bundle Adjustment (project), (๐), (:octocat:)
- 6.28 - BrainGPT - A Large Language Model tool to assist neuroscientific research (home)
- 6.28 - Toward Actionable Generative AI - LAMs: From Large Language Models to Large Action Models (blog)
- 6.28 - The official #DragGAN app and code (tweet), (application), (:octocat:)
- 6.27 - Introducing ERNIE 3.5: Baidu's Knowledge-Enhanced Foundation Model Takes a Giant Leap Forward (blog)
- 6.27 - Beyond the Hype: Assessing the Performance, Trustworthiness, and Clinical Suitability of GPT3.5 (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.27 - Vision Augmented Language Models: Computer vision through the LENS of natural language (blog), (demo), (:octocat:)
- 6.27 - Restart Sampling for Improving Generative Processes (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.27 - 3D-Speaker: A Large-Scale Multi-Device, Multi-Distance, and Multi-Dialect Corpus for Speech Representation Disentanglement (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.27 - MIMIC: Masked Image Modeling with Image Correspondences (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.27 - LeanDojo: Theorem Proving with Retrieval-Augmented Language Models (project), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.27 - Any Image to 3D (blog)
- 6.27 - โญ๏ธLangChain Integrationsโญ๏ธ Hub (link)
- 6.27 - MVDiffusion: Enabling Holistic Multi-view Image Generation with Correspondence-Aware Diffusion (project), (demo)
- 6.27 - Extending Context Window of Large Language Models via Positional Interpolation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.27 - Salesforce open-source LLMs with 8k sequence length - Xgen 7B (tweet), (blog), (:octocat:)
- 6.27 - Embracing change and resetting expectations (blog)
- 6.27 - Baby steps in evaluating the capacities of large language models (Nature Reviews Psychology, https://doi.org/10.1038/s44159-023-00211-x), (preview)
- 6.26 - MedLSAM: Localize and Segment Anything Model for 3D Medical Images (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.26 - MotionGPT: Human Motion as a Foreign Language (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.26 - Faster Segment Anything: Towards Lightweight SAM for Mobile Applications (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.26 - Aligning Large Multi-Modal Model with Robust Instruction Tuning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.26 - InterCode: Standardizing and Benchmarking Interactive Coding with Execution Feedback (project), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.26 - LongCoder: A Long-Range Pre-trained Language Model for Code Completion (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.26 - Kosmos-2: Grounding Multimodal Large Language Models to the World (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.26 - ViNT: A Foundation Model for Visual Navigation (project), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (video)
- 6.26 - DragDiffusion: Harnessing Diffusion Models for Interactive Point-based Image Editing (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.25 - Generative AI - LLMOps Architecture Patterns (blog)
- 6.25 - DomainStudio: Fine-Tuning Diffusion Models for Domain-Driven Image Generation using Limited Data (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.25 - H_2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.25 - Thinking Like an Annotator: Generation of Dataset Labeling Instructions (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.25 - Language models are weak learners (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.25 - Let's Do a Thought Experiment: Using Counterfactuals to Improve Moral Reasoning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.25 - Chat with Hacker News in real-time using natural language (demo)
- 6.24 - Zero-shot spatial layout conditioning for text-to-image diffusion models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.24 - Beyond Scale: the Diversity Coefficient as a Data Quality Metric Demonstrates LLMs are Pre-trained on Formally Diverse Data (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.24 - On the paper "Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models" (MIT)
- 6.24 - A critical analysis of "Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models" (blog)
- 6.24 - System-Level Natural Language Feedback (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.24 - OpenMask3D: Open-Vocabulary 3D Instance Segmentation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.24 - Scaling MLPs: A Tale of Inductive Bias (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.23 - MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.23 - What's going on with the Open LLM Leaderboard? (blog)
- 6.23 - A Survey on Multimodal Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.23 - LLM Powered Autonomous Agents (blog)
- 6.23 - DreamEditor: Text-Driven 3D Scene Editing with Neural Fields (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.23 - Long-range Language Modeling with Self-retrieval (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.23 - Bring Your Own Data! Self-Supervised Evaluation for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.22 - Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 6.22 - reliableGPT: Stop OpenAI Errors in Production (:octocat:)
- 6.22 - Lit-GPT: Implementation of Falcon, StableLM, Pythia, INCITE language models based on nanoGPT (:octocat:)
- 6.22 - Perspective Fields for Single Image Camera Calibration (project page), (video), (demo), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (CVPR 2023)
- 6.22 - Event Stream GPT (ESGPT), for "event stream" datasets, particularly Electronic Health Record (EHR) datasets (tweet), (:octocat:)
- 6.22 - MPT-30B is here (tweet), (blog), (HF), (MosaicML MPT-30B-Chat)
- 6.22 - How continuous batching enables 23x throughput in LLM inference while reducing p50 latency (blog)
- 6.22 - DreamTime: An Improved Optimization Strategy for Text-to-3D Content Creation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.22 - Stability AI launches SDXL 0.9: A Leap Forward in AI Image Generation (news)
- 6.21 - ChatGPT Poses New Regulatory Questions for FDA, Medical Industry (Bloomberg news), (Youtube)
- 6.21 - Understanding Social Reasoning in Language Models with Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.21 - Opportunities and Risks of LLMs for Scalable Deliberation with Polis (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.21 - Training Transformers with 4-bit Integers (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.21 - Fast Segment Anything (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.21 - DecodingTrust: A Comprehensive Assessment of Trustworthiness in GPT Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.21 - LMFlow: An Extensible Toolkit for Finetuning and Inference of Large Foundation Models (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.20 - Visual Foundation Models for Medical Image Analysis (blog)
- 6.20 - Learning to Generate Better Than Your LLM (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.20 - Sound reconstruction from human brain activity via a generative model with brain-like auditory features (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.20 - A Simple and Effective Pruning Approach for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.20 - Radiology Report Expert Evaluation (ReXVal) Dataset (PhysioNet https://doi.org/10.13026/2fp8-qr71)
- 6.20 - RoboCat: A Self-Improving Foundation Agent for Robotic Manipulation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.20 - Segment Anything Model (SAM) for Radiation Oncology (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.20 - RepoFusion: Training Code Models to Understand Your Repository (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.20 - Textbooks Are All You Need (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.19 - Path to Medical AGI: Unify Domain-specific Medical LLMs with the Lowest Cost (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.19 - CounselGPT - Korean psychological counseling dataset (:octocat:)
- 6.19 - MotionGPT: Finetuned LLMs are General-Purpose Motion Generators (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.18 - Point-Cloud Completion with Pretrained Text-to-image Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.18 - Mercedes-Benz Installs ChatGPT Artificial Intelligence in 900,000 Cars (Newsweek), (Mercedes Benz)
- 6.18 - OpenLLaMA-13B released (tweet), (:octocat:)
- 6.17 - Generation of Radiology Findings in Chest X-Ray by Leveraging Collaborative Knowledge (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.17 - Beware of Unreliable Data in Model Evaluation: A LLM Prompt Selection case study with Flan-T5 (blog)
- 6.17 - GPT Engineer - specify what you want it to build, the AI asks for clarification, and then builds it (:octocat:)
- 6.17 - Demystifying GPT Self-Repair for Code Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.17 - Introducing GAIA-1: A Cutting-Edge Generative AI Model for Autonomy (blog)
- 6.17 - Understanding Encoder And Decoder LLMs (blog)
- 6.16 - Evaluating Superhuman Models with Consistency Checks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.16 - AvatarBooth: High-Quality and Customizable 3D Human Avatar Generation (project), (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.16 - Gradient is All You Need? (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.16 - LabelBench: A Comprehensive Framework for Benchmarking Label-Efficient Learning (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.16 - AD-AutoGPT: An Autonomous GPT for Alzheimer's Disease Infodemiology (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.16 - Meta - Introducing Voicebox: The Most Versatile AI for Speech Generation (news)
- 6.16 - Explore, Establish, Exploit: Red Teaming Language Models from Scratch (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.16 - Full Parameter Fine-tuning for Large Language Models with Limited Resources (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.16 - ClinicalGPT: Large Language Models Finetuned with Diverse Medical Data and Comprehensive Evaluation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.16 - CLIPSonic: Text-to-Audio Synthesis with Unlabeled Videos and Pretrained Language-Vision Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.16 - Language-Guided Music Recommendation for Video via Prompt Analogies (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.16 - QR Code AI Art Generator (tweet), (Hugging Face), (SD art)
- 6.16 - Stanford CRFM - Transparency Index for Foundation Model Provider's Compliance measurement with the Draft EU AI Act (tweet), (:octocat:)
- 6.16 - The economic potential of generative AI: The next productivity frontier (McKinsey & Company report)
- 6.15 - Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis (โ), (๐), (๐), (๐), (๐ ), (HTML), (AS), (GS), (โณ๏ธ), (:octocat:)
- 6.15 - Med-MMHL: A Multi-Modal Dataset for Detecting Human- and LLM-Generated Misinformation in the Medical Domain (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.15 - Opportunities and Challenges for ChatGPT and Large Language Models in Biomedicine and Health (โ), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 6.15 - Introducing the ElevenLabs AI Speech Classifier: Elevating Safety Standards for AI-generated Audio Content (news)
- 6.15 - ChatGPT AI Shines in Challenging Medical Cases (news)
- 6.15 - Accuracy of a Generative Artificial Intelligence Model in a Complex Diagnostic Challenge (JAMA doi:10.1001/jama.2023.8288)
- 6.15 - LOVM: Language-Only Vision Model Selection (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.15 - WizardCoder: Empowering Code Large Language Models with Evol-Instruct (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.15 - Segment Any Point Cloud Sequences by Distilling Vision Foundation Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.15 - Seeing the World through Your Eyes (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.15 - Exploring the MIT Mathematics and EECS Curriculum Using Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.15 - Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Theory of Mind (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.15 - ChessGPT: Bridging Policy Learning and Language Modeling (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.15 - Top Use Cases and Cutting-Edge Solutions with Generative AI in Healthcare (blog)
- 6.15 - [SCIENCE] Art and the science of generative AI, Vol 380, Issue 6650, (DOI: 10.1126/science.adh4451)
- 6.14 - Radiology-GPT: A Large Language Model for Radiology (demo), (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.14 - Unifying Large Language Models and Knowledge Graphs: A Roadmap (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.14 - Knowledge Distillation of Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.14 - TAPIR: Tracking Any Point with per-frame Initialization and temporal Refinement (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.14 - EU MEPs ready to negotiate first-ever rules for safe and transparent AI (news)
- 6.14 - TryOnDiffusion: A Tale of Two UNets (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.14 - AssistGPT: A General Multi-modal Assistant that can Plan, Execute, Inspect, and Learn (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.14 - Stable Diffusion with Core ML on Apple Silicon (tweet), (:octocat:)
- 6.13 - How AI Responds to Common Lung Cancer Questions: ChatGPT vs Google Bard (RSNA Radiology, https://doi.org/10.1148/radiol.230922)
- 6.13 - Artificial Artificial Artificial Intelligence: Crowd Workers Widely Use Large Language Models for Text Production Tasks (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.13 - Scalable 3D Captioning with Pretrained Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.13 - Rerender A Video: Zero-Shot Text-Guided Video-to-Video Translation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.13 - arXiVeri: Automatic table verification with GPT (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.13 - AVIS: Autonomous Visual Information Seeking with Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.13 - AniFaceDrawing: Anime Portrait Exploration during Your Sketching (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.13 - h2oGPT: Democratizing Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.13 - 3D molecule generation by denoising voxel grids (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.13 - GeneCIS: A Benchmark for General Conditional Image Similarity (project page), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (CVPR 2023)
- 6.13 - Macaw-LLM: Multi-Modal Language Modeling with Image, Audio, Video, and Text Integration (:octocat:)
- 6.13 - One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.13 - GitHub survey result - 92% of U.S.-based developers are already using AI coding tools both in and outside of work (blog)
- 6.13 - ChatGPT Workspaces - Upcoming ChatGPT features: file uploading, profiles, organizations and workspaces (reddit)
- 6.12 - Data-Copilot: Bridging Billions of Data and Humans with Autonomous Workflow (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.12 - Transformers learn through gradual rank increase (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.12 - Large Language Models as Tax Attorneys: A Case Study in Legal Capabilities Emergence (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.12 - Augmenting Language Models with Long-Term Memory (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.12 - Yann LeCun and Geoffrey Hinton's Consensus on a number of questions about AI and catastrophic risks (tweet)
- 6.12 - Conversation of Andrew Ng and Geoffrey Hinton about AI and catastrophic risks (tweet)
- 6.12 - Benchmarking Neural Network Training Algorithms (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.12 - Lit-llama - Implementation of the LLaMA language model based on nanoGPT (:octocat:)
- 6.12 - OpenAI, DeepMind will open up models to UK government (news)
- 6.12 - WizardLM: An Instruction-following LLM Using Evol-Instruct (:octocat:)
- 6.11 - The Impact of ChatGPT and LLMs on Medical Imaging Stakeholders: Perspectives and Use Cases (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.11 - Face0: Instantaneously Conditioning a Text-to-Image Model on a Face (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.11 - A Comprehensive Survey on Applications of Transformers for Deep Learning Tasks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.10 - AI Research Highlights May-June 2023: Direct-Preference Optimization for Human Feedback and More (blog)
- 6.10 - Medical Data Augmentation via ChatGPT: A Case Study on Medication Identification and Medication Event Classification (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.10 - Large Language Model Evaluation in 2023: 5 Methods (blog)
- 6.9 - How Can Recommender Systems Benefit from Large Language Models: A Survey (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.9 - On the Challenges and Perspectives of Foundation Models for Medical Image Analysis (โ), (๐), (๐), (๐ ), (โณ๏ธ), (๐)
- 6.9 - Chat Generative Pretrained Transformer Fails the Multiple-Choice American College of Gastroenterology Self-Assessment Test (The American Journal of Gastroenterology, DOI: 10.14309/ajg.0000000000002320)
- 6.9 - Aladdin: Zero-Shot Hallucination of Stylized 3D Assets from Abstract Scene Descriptions (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.9 - Judging LLM-as-a-judge with MT-Bench and Chatbot Arena (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.9 - Evaluating the Social Impact of Generative AI Systems in Systems and Society (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.9 - Can Large Language Models Infer Causation from Correlation? (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.9 - FinGPT: Open-Source Financial Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.8 - Interpretable Medical Diagnostics with Structured Data Extraction by Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.8 - Customizing General-Purpose Foundation Models for Medical Report Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.8 - Regulators Face Novel Challenges as Artificial Intelligence Tools Enter Medical Practice (JAMA Health Forum doi: 10.1001/jamahealthforum.2023.2300)
- 6.8 - Artificial General Intelligence for Medical Imaging (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.8 - On the Reliability of Watermarks for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.8 - PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.8 - How Far Can Camels Go? Exploring the State of Instruction Tuning on Open Resources (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.8 - StableDiffusion - Clipdrop Launches Uncrop: The Ultimate Aspect Ratio Editor (blog)
- 6.8 - Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.8 - Simple and Controllable Music Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.8 - Tracking Everything Everywhere All at Once (project page), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.8 - Understanding GPT tokenizers (blog)
- 6.7 - The Two Word Test: A Semantic Benchmark for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.7 - Generative Text-Guided 3D Vision-Language Pretraining for Unified Medical Image Segmentation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.7 - Health system-scale language models are all-purpose prediction engines (Nature https://doi.org/10.1038/s41586-023-06160-y), (๐), (:octocat:)
- 6.7 - Learning to Ground Instructional Articles in Videos through Narrations (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.7 - Emergent Correspondence from Image Diffusion (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.7 - Certified Reasoning with Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.7 - Increasing Diversity While Maintaining Accuracy: Text Data Generation with Large Language Models and Human Interventions (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.7 - Youku-mPLUG: A 10 Million Large-scale Chinese Video-Language Dataset for Pre-training and Benchmarks (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.7 - M$^3$IT: A Large-Scale Dataset towards Multi-Modal Multilingual Instruction Tuning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.7 - PromptBench: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.7 - INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (SS), (:octocat:)
- 6.7 - ChatGPT is fun, but it is not funny! Humor is still challenging Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.7 - Deductive Verification of Chain-of-Thought Reasoning (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.6 - ChatGPT might replace your doctor - and it will actually do a better job of caring for you (news)
- 6.6 - ChatDB: Augmenting LLMs with Databases as Their Symbolic Memory (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.6 - InstructZero: Efficient Instruction Optimization for Black-Box Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.6 - HeadSculpt: Crafting 3D Head Avatars with Text (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.6 - MotionDiffuser: Controllable Multi-Agent Motion Prediction using Diffusion (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.6 - Neuralangelo: High-Fidelity Neural Surface Reconstruction (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.6 - PokemonChat: Auditing ChatGPT for Pokémon Universe Knowledge (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.6 - A Static Evaluation of Code Completion by Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.6 - Large Language Models of Code Fail at Completing Code with Potential Bugs (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.6 - Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.6 - Ada-TTA: Towards Adaptive High-Quality Text-to-Talking Avatar Synthesis (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.6 - Recognize Anything: A Strong Image Tagging Model (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.6 - ATT3D: Amortized Text-to-3D Object Synthesis (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.6 - Falcon-40B-Instruct is a 40B parameters causal decoder-only model built by TII based on Falcon-40B and finetuned on a mixture of Baize (HF)
- 6.5 - PULSAR: Pre-training with Extracted Healthcare Terms for Summarising Patients' Problems and Data Augmentation with Black-box Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.5 - Benchmarking Large Language Models on CMExam -- A Comprehensive Chinese Medical Exam Dataset (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.5 - shs-nlp at RadSum23: Domain-Adaptive Pre-training of Instruction-tuned LLMs for Radiology Report Impression Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.5 - A survey of Generative AI Applications (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.5 - New Artificial Intelligence ChatGPT Performs Poorly on the 2022 Self-assessment Study Program for Urology (AUA Urology practice https://doi.org/10.1097/UPJ.0000000000000406)
- 6.5 - LLM-Blender: Ensembling Large Language Models with Pairwise Ranking and Generative Fusion (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.5 - Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.5 - PLANNER: Generating Diversified Paragraph via Latent Language Diffusion Model (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.5 - Orca: Progressive Learning from Complex Explanation Traces of GPT-4 (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.4 - Fine-Tuning Language Models with Advantage-Induced Policy Alignment (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.4 - SAM3D: Zero-Shot 3D Object Detection via Segment Anything Model (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.4 - A Technical Report for Polyglot-Ko: Open-Source Large-Scale Korean Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.3 - The Role of ChatGPT, Generative Language Models, and Artificial Intelligence in Medical Education: A Conversation With ChatGPT and a Call for Papers (JMIR, doi: 10.2196/46885), (PDF)
- 6.3 - VisualGPTScore: Visio-Linguistic Reasoning with Multimodal Generative Pre-Training Scores (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.2 - Segment Anything in High Quality (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.2 - The RefinedWeb Dataset for Falcon LLM: Outperforming Curated Corpora with Web Data, and Web Data Only (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.2 - StyleDrop: Text-To-Image Generation in Any Style (project page), (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.1 - How Chatbots and Large Language Model Artificial Intelligence Systems Will Reshape Modern Medicine: Fountain of Creativity or Pandora's Box? (JAMA Internal Medicine doi: 10.1001/jamainternmed.2023.1835)
- 6.1 - StableRep: Synthetic Images from Text-to-Image Models Make Strong Visual Representation Learners (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.1 - The TIME - "The End of Humanity" cover (tweet), ("AI Is Not an Arms Race")
- 6.1 - AutoGPTQ - An easy-to-use LLMs quantization package with user-friendly apis, based on GPTQ algorithm (:octocat:)
- 6.1 - Wuerstchen: Efficient Pretraining of Text-to-Image Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.1 - StyleGAN knows Normal, Depth, Albedo, and More (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.1 - Diffusion Self-Guidance for Controllable Image Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.1 - Thought Cloning: Learning to Think while Acting by Imitating Human Thinking (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.1 - Hiera: A Hierarchical Vision Transformer without the Bells-and-Whistles (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.1 - The Hidden Language of Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.1 - Inserting Anybody in Diffusion Models via Celeb Basis (โ), (๐), (๐), (๐ ), (โณ๏ธ), (project page), (:octocat:)
- 6.1 - LLaVA-Med: Training a Large Language-and-Vision Assistant for Biomedicine in One Day (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 6.1 - Birth of a Transformer: A Memory Viewpoint (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.1 - SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.1 - Make-Your-Video: Customized Video Generation Using Textual and Structural Guidance (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 6.1 - ReviewerGPT? An Exploratory Study on Using Large Language Models for Paper Reviewing (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.31 - The Impact of Positional Encoding on Length Generalization in Transformers (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.31 - Artificial Intelligence Can Generate Fraudulent but Authentic-Looking Scientific Medical Articles: Pandora's Box Has Been Opened (J Med Internet Res 2023;25:e46924 doi: 10.2196/46924)
- 5.31 - Tree-Ring Watermarks: Fingerprints for Diffusion Images that are Invisible and Robust (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.31 - Discovering New Interpretable Conservation Laws as Sparse Invariants (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.31 - Control4D: Dynamic Portrait Editing by Learning 4D GAN from 2D Diffusion-based Editor (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.31 - Understanding and Mitigating Copying in Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.31 - PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.31 - Human or Not? A Gamified Approach to the Turing Test (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.31 - OpenAI - Let's Verify Step by Step (paper), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (blog), (GitHub dataset)
- 5.31 - Humans in 4D: Reconstructing and Tracking Humans with Transformers (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.31 - Improving CLIP Training with Language Rewrites (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.31 - MuseCoco: Generating Symbolic Music from Text (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.31 - MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.31 - CodeTF: One-stop Transformer Library for State-of-the-art Code LLM (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.30 - HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face (NeurIPS 2023 โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.30 - Prompt Engineering for Effective Use of Large Language Models in Radiology (RSNA)
- 5.30 - Re-evaluating Word Mover's Distance (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.30 - Bigger, Better, Faster: Human-level Atari with human-level efficiency (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.30 - Japan Goes All In: Copyright Doesn't Apply To AI Training (news)
- 5.30 - A.I. Poses "Risk of Extinction," Industry Leaders Warn (NYT news)
- 5.30 - Statement on AI Risk - AI experts and public figures express their concern about AI risk (statement)
- 5.30 - GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction (โ), (๐), (๐), (๐ ), (โณ๏ธ), (SS), (:octocat:)
- 5.30 - Nested Diffusion Processes for Anytime Image Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.30 - StyleAvatar3D: Leveraging Image-Text Diffusion Models for High-Fidelity 3D Avatar Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (GitHub dataset)
- 5.30 - HiFA: High-fidelity Text-to-3D with Advanced Diffusion Guidance (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.30 - Grammar Prompting for Domain-Specific Language Generation with Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.30 - AlteredAvatar: Stylizing Dynamic 3D Avatars with Fast Style Adaptation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.30 - Ambient Diffusion: Learning Clean Distributions from Corrupted Data (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.30 - ChatGPT and large language models in gastroenterology (Nature Reviews Gastroenterology & Hepatology)
- 5.30 - Blockwise Parallel Transformer for Long Context Large Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.29 - A lawyer used ChatGPT to prepare a court filing. It went horribly awry. (CBS news)
- 5.29 - Reconstructing the Mind's Eye: fMRI-to-Image with Contrastive Learning and Diffusion Priors (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.29 - RAPHAEL: Text-to-Image Generation via Large Mixture of Diffusion Paths (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.29 - Photoswap: Personalized Subject Swapping in Images (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.29 - TaleCrafter: Interactive Story Visualization with Multiple Characters (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.29 - GlyphControl: Glyph Conditional Control for Visual Text Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.29 - Make-An-Audio 2: Temporal-Enhanced Text-to-Audio Generation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.29 - Faith and Fate: Limits of Transformers on Compositionality (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.29 - PaLI-X: On Scaling up a Multilingual Vision and Language Model (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.29 - Controllable Text-to-Image Generation with GPT-4 (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.29 - Brainformers: Trading Simplicity for Efficiency (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.28 - Large Language Models, scientific knowledge and factuality: A systematic analysis in antibiotic discovery (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.28 - Geometric Algebra Transformers (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.28 - Tab-CoT: Zero-shot Tabular Chain of Thought (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.28 - FuseCap: Leveraging Large Language Models to Fuse Visual Data into Enriched Image Captions (โ), (๐), (๐), (๐ ), (โณ๏ธ), (demo)
- 5.28 - Introducing NVIDIA ACE For Games - Spark Life Into Virtual Characters With Generative AI (blog)
- 5.27 - SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.27 - The Curse of Recursion: Training on Generated Data Makes Models Forget (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.27 - DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.27 - What indeed can GPT models do in chemistry? A comprehensive benchmark on eight tasks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.27 - WingmanAI - real-time transcription of audio, integrated with ChatGPT for interactive use (:octocat:)
- 5.27 - ToolBench - Large-scale instruction tuning SFT data to equip LLMs with general tool-use capability (:octocat:)
- 5.27 - G7 officials to hold first meeting on AI regulation next week (news)
- 5.26 - Large language models improve Alzheimer's disease diagnosis using multi-modality data (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.26 - ChatGPT Fails American College of Gastroenterology Assessment Tests (news)
- 5.26 - Chain-of-Thought Hub: A Continuous Effort to Measure Large Language Models' Reasoning Performance (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.26 - A new antibiotic, discovered with artificial intelligence, may defeat a dangerous superbug (CNN news)
- 5.26 - Generating Images with Multimodal Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.26 - Backpack Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.26 - Impossible Distillation: from Low-Quality Model to High-Quality Dataset & Model for Summarization and Paraphrasing (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.26 - Playing repeated games with Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.26 - Training Socially Aligned Language Models in Simulated Human Society (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.26 - BiomedGPT: A Unified and Generalist Biomedical Generative Pre-trained Transformer for Vision, Language, and Multimodal Tasks (โ), (๐), (๐), (๐ )
- 5.26 - Large Language Models as Tool Makers (โ), (๐), (๐), (๐ )
- 5.26 - ProlificDreamer: High-Fidelity and Diverse Text-to-3D Generation with Variational Score Distillation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (project page)
- 5.25 - ChatCAD+: Towards a Universal and Reliable Interactive CAD using LLMs (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.25 - The Current and Future State of AI Interpretation of Medical Images (NEJM, DOI: 10.1056/NEJMra2301725)
- 5.25 - Deep learning-guided discovery of an antibiotic targeting Acinetobacter baumannii (Nature Chemical Biology, https://doi.org/10.1038/s41589-023-01349-8), (:octocat:), (Cloned snapshot)
- 5.25 - Ghost in the Minecraft: Generally Capable Agents for Open-World Environments via Large Language Models with Text-based Knowledge and Memory (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.25 - Role-Play with Large Language Models (โ), (๐), (๐), (๐ )
- 5.25 - Break-A-Scene: Extracting Multiple Concepts from a Single Image (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.25 - Voyager: An Open-Ended Embodied Agent with Large Language Models (Project page), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (MineDojo)
- 5.25 - Efficient Neural Music Generation (โ), (๐), (๐), (๐ )
- 5.25 - Custom-Edit: Text-Guided Image Editing with Customized Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.25 - On Architectural Compression of Text-to-Image Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.25 - Prompt-Free Diffusion: Taking "Text" out of Text-to-Image Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.25 - The False Promise of Imitating Proprietary LLMs (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.25 - the new Stable Diffusion "Reimagine XL" model on @ClipdropApp x @StabilityAI (tweet), (Clipdrop)
- 5.25 - Gorilla: Large Language Model Connected with Massive APIs (tweet), (project page), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (demo video), (discord)
- 5.25 - OpenAI - Democratic Inputs to AI (Tweet), (Blog)
- 5.24 - Reasoning with Language Model is Planning with World Model (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.24 - Large Language Models are Few-Shot Health Learners (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.24 - HuatuoGPT, towards Taming Language Model to Be a Doctor (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.24 - Instructions as Backdoors: Backdoor Vulnerabilities of Instruction Tuning for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.24 - Have LLMs Advanced Enough? A Challenging Problem Solving Benchmark For Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.24 - Clever Hans or Neural Theory of Mind? Stress Testing Social Reasoning in Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.24 - A majority of Americans have heard of ChatGPT, but few have tried it themselves (Pew Research Center news)
- 5.24 - Towards Revealing the Mystery behind Chain of Thought: a Theoretical Perspective (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.24 - Think Before You Act: Decision Transformers with Internal Working Memory (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.24 - PandaGPT: One Model to Instruction-Follow Them All (project page), (๐), (demo), (video), (dataset), (model), (:octocat:), (tweet)
- 5.24 - SPRING: GPT-4 Out-performs RL Algorithms by Studying Papers and Reasoning (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.24 - Manifold Diffusion Fields (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.24 - A Neural Space-Time Representation for Text-to-Image Personalization (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.24 - Can Transformers Learn to Solve Problems Recursively? (โ), (๐), (๐), (๐ )
- 5.24 - This Land is {Your, My} Land: Evaluating Geopolitical Biases in Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.24 - Model evaluation for extreme risks (โ), (๐), (๐), (๐ )
- 5.24 - State of GPT and RLHF LLMs - Andrej Karpathy, OpenAI (session), (video)
- 5.24 - LMs with a Voice: Spoken Language Modeling beyond Speech Tokens (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.24 - BLIP-Diffusion: Pre-trained Subject Representation for Controllable Text-to-Image Generation and Editing (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (Project page)
- 5.23 - Threats by artificial intelligence to human health and human existence (BMJ, http://dx.doi.org/10.1136/bmjgh-2022-010435), (PDF)
- 5.23 - Transformer-based Vulnerability Detection in Code at EditTime: Zero-shot, Few-shot, or Fine-tuning? (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.23 - Unity's Project Barracuda Injects Generative AI Into Games To Kickstart Exponential Growth (Forbes news)
- 5.23 - VisorGPT: Learning Visual Prior via Generative Pre-Training (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.23 - ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.23 - OlaGPT: Empowering LLMs With Human-like Problem-Solving Abilities (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.23 - Goat: Fine-tuned LLaMA Outperforms GPT-4 on Arithmetic Tasks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.23 - Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.23 - Control-A-Video: Controllable Text-to-Video Generation with Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.23 - Aligning Large Language Models through Synthetic Feedback (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.23 - LLMs as Factual Reasoners: Insights from Existing Benchmarks and Beyond (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.23 - Lost in Translation: Large Language Models in Non-English Content Analysis (news)
- 5.23 - Anchor Prediction: Automatic Refinement of Internet Links (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.23 - Coarse-to-Fine Contrastive Learning in Image-Text-Graph Space for Improved Vision-Language Compositionality (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.23 - Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.23 - PEARL: Prompting Large Language Models to Plan and Execute Actions Over Long Documents (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.23 - Bing at Microsoft Build 2023: Continuing the Transformation of Search (blog)
- 5.23 - Bringing the power of AI to Windows 11 - unlocking a new era of productivity for customers and developers with Windows Copilot and Dev Home (blog)
- 5.23 - Adobe Unveils Future of Creative Cloud With Generative AI as a Creative Co-Pilot in Photoshop (news), (blog)
- 5.23 - QLoRA: Efficient Finetuning of Quantized LLMs (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.22 - A Study of Generative Large Language Model for Medical Research and Healthcare (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.22 - Large-language-model-based 10-year risk prediction of cardiovascular disease: insight from the UK biobank data (medRxiv), (SS)
- 5.22 - SEAHORSE: A Multilingual, Multifaceted Dataset for Summarization Evaluation (โ), (๐), (๐), (โณ๏ธ)
- 5.22 - Meta-in-context learning in large language models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.22 - AudioToken: Adaptation of Text-Conditioned Diffusion Models for Audio-to-Image Generation (โ), (๐), (๐), (โณ๏ธ), (:octocat:)
- 5.22 - Iterative Forward Tuning Boosts In-context Learning in Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.22 - How Language Model Hallucinations Can Snowball (โ), (๐), (๐), (๐ ), (โณ๏ธ), (demo)
- 5.22 - Intel Announces Aurora genAI, Generative AI Model With 1 Trillion Parameters (news), (Intel newsroom)
- 5.22 - Introducing Mind-Video (Tweet), (demo), (data)
- 5.22 - Reflective Linguistic Programming (RLP): A Stepping Stone in Socially-Aware AGI (SocialAGI) (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.22 - GQA: Training Generalized Multi-Query Transformer Models from Multi-Head Checkpoints (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.22 - LM vs LM: Detecting Factual Errors via Cross Examination (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.22 - XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages (๐), (:octocat:)
- 5.22 - VideoLLM: Modeling Video Sequence with Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.22 - RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.22 - RWKV: Reinventing RNNs for the Transformer Era (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.22 - Introducing speech-to-text, text-to-speech, and more for 1,100+ languages (Blog), (๐), (:octocat:)
- 5.21 - Embracing Large Language Models for Medical Applications: Opportunities and Challenges (abstract), (SS)
- 5.21 - Augmenting Autotelic Agents with Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.21 - XrayGPT: Chest Radiographs Summarization using Medical Vision-Language Models (:octocat:), (Video)
- 5.20 - G7 Hiroshima Leaders' Communiqué (statement), (html)
- 5.20 - G7 calls for developing global technical standards for AI (news)
- 5.20 - Labour should pledge £11bn to build 'BritGPT' AI, thinktank says (news)
- 5.20 - CodeCompose: A Large-Scale Industrial Deployment of AI-assisted Code Authoring (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.19 - Appraising the Potential Uses and Harms of LLMs for Medical Systematic Reviews (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.19 - A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation (โ), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 5.19 - HaluEval: A Large-Scale Hallucination Evaluation Benchmark for Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.19 - OPT-R: Exploring the Role of Explanations in Finetuning and Prompting for Reasoning Skills of Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.19 - Neural Foundations of Mental Simulation: Future Prediction of Latent Representations on Dynamic Scenes (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.19 - Clinical Camel: An Open-Source Expert-Level Medical Language Model with Dialogue-Based Knowledge Encoding (โ), (๐), (๐), (๐ ), (โณ๏ธ), (huggingface), (:octocat:)
- 5.19 - New York City public schools remove ChatGPT ban (news)
- 5.19 - Graphologue: Exploring Large Language Model Responses with Interactive Diagrams (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.19 - The Inside Story: Towards Better Understanding of Machine Translation Neural Evaluation Metrics (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.19 - HalOmi: A Manually Annotated Benchmark for Multilingual Hallucination and Omission Detection in Machine Translation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.19 - Characterizing tradeoffs between teaching via language and demonstrations in multi-agent systems (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.19 - TELeR: A General Taxonomy of LLM Prompts for Benchmarking Complex Tasks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.19 - Text2NeRF: Text-Driven 3D Scene Generation with Neural Radiance Fields (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.19 - Chupa: Carving 3D Clothed Humans from Skinned Shape Priors using 2D Diffusion Probabilistic Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.19 - Comparing Software Developers with ChatGPT: An Empirical Investigation (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.19 - CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.19 - Multimodal Web Navigation with Instruction-Finetuned Foundation Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.19 - Cinematic Mindscapes: High-quality Video Reconstruction from Brain Activity (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.19 - Scaling laws for language encoding models in fMRI (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.19 - Any-to-Any Generation via Composable Diffusion (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.19 - ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.19 - Apple Bans Employees From Using ChatGPT Amid Its Own AI Efforts (news)
- 5.18 - A Framework for Critically Assessing ChatGPT and Other Large Language Artificial Intelligence Model Applications in Health Care (https://doi.org/10.1016/j.mcpdig.2023.03.006)
- 5.18 - Brain-inspired learning in artificial neural networks: a review (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.18 - ONE-PEACE: Exploring One General Representation Model Toward Unlimited Modalities (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.18 - RoomDreamer: Text-Driven 3D Indoor Scene Synthesis with Coherent Geometry and Texture (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.18 - LIMA: Less Is More for Alignment (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.18 - GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework (โ), (๐), (๐), (๐ ), (project page), (โณ๏ธ), (:octocat:), (Star history)
- 5.18 - SpeechGPT: Empowering Large Language Models with Intrinsic Cross-Modal Conversational Abilities (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.18 - mLongT5: A Multilingual and Efficient Text-To-Text Transformer for Longer Sequences (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.18 - Language Models Meet World Models: Embodied Experiences Enhance Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.18 - Roundhill Investments Launches Generative AI & Technology ETF (NYSE Arca: CHAT) (news), (CHAT ETF)
- 5.18 - VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.18 - Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold (โ), (๐), (๐), (๐ ), (โณ๏ธ), (Huggingface), (Unofficial), (colab), (Official)
- 5.18 - PyLLMs - a minimal Python library to connect to LLMs (OpenAI, Anthropic, Google, AI21, Cohere, Aleph Alpha, HuggingfaceHub) (:octocat:)
- 5.18 - Evidence of Meaning in Language Models Trained on Programs (โ), (๐), (๐), (โณ๏ธ)
- 5.18 - Introducing the ChatGPT app for iOS (blog), (Download on the App Store)
- 5.18 - MTIA v1: Meta's first-generation AI inference accelerator (blog)
- 5.18 - Pursuing groundbreaking scale and accelerating research using Meta's Research SuperCluster (blog)
- 5.18 - Reimagining Meta's infrastructure for the AI age (blog)
- 5.17 - Evaluating Object Hallucination in Large Vision-Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.17 - Chain-of-Symbol Prompting Elicits Planning in Large Langauge Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.17 - DoReMi: Optimizing Data Mixtures Speeds Up Language Model Pretraining (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.17 - Explaining black box text modules in natural language with language models (โ), (๐), (๐), (๐ ), (project page), (โณ๏ธ)
- 5.17 - Tree of Thoughts: Deliberate Problem Solving with Large Language Models (โ), (๐), (๐), (โณ๏ธ)
- 5.17 - Improving Language Model Negotiation with Self-Play and In-Context Learning from AI Feedback (โ), (๐), (๐), (โณ๏ธ)
- 5.17 - PMC-VQA: Visual Instruction Tuning for Medical Visual Question Answering (โ), (๐), (๐), (๐papers with code)
- 5.17 - What You See is What You Read? Improving Text-Image Alignment Evaluation (โ), (๐), (๐), (โณ๏ธ)
- 5.17 - PaLM 2 Technical Report (โ), (๐), (๐), (๐papers with code)
- 5.17 - SoundStorm: Efficient Parallel Audio Generation (โ), (๐), (๐), (Project page), (โณ๏ธ)
- 5.16 - GPT-4 in Radiology: Improvements in Advanced Reasoning (RSNA Radiology, https://doi.org/10.1148/radiol.230987)
- 5.16 - Performance of ChatGPT on a Radiology Board-style Examination: Insights into Current Strengths and Limitations (RSNA Radiology, https://doi.org/10.1148/radiol.230582)
- 5.16 - AR-Diffusion: Auto-Regressive Diffusion Model for Text Generation (โ), (๐), (๐)
- 5.16 - Make-An-Animation: Large-Scale Text-conditional 3D Human Motion Generation (โ), (๐), (๐)
- 5.16 - ChatGPT versus human in generating medical graduate exam questions - An international prospective study (medRxiv), (๐)
- 5.16 - Understanding 3D Object Interaction from a Single Image (โ), (๐), (๐), (project page), (demo), (video), (:octocat:)
- 5.16 - StructGPT: A General Framework for Large Language Model to Reason over Structured Data (โ), (๐), (๐), (โณ๏ธ)
- 5.16 - FitMe: Deep Photorealistic 3D Morphable Model Avatars (โ), (๐), (๐), (project page)
- 5.16 - Pre-Training to Learn in Context (โ), (๐), (๐), (โณ๏ธ)
- 5.16 - Towards Expert-Level Medical Question Answering with Large Language Models (โ), (๐), (๐), (๐papers with code)
- 5.16 - GPTeam: Collaborative AI Agents (:octocat:)
- 5.16 - WATCH LIVE: OpenAI CEO Sam Altman testifies on artificial intelligence before Senate committee (Youtube)
- 5.16 - NYT - Microsoft Says New A.I. Shows Signs of Human Reasoning
- 5.15 - Common Diffusion Noise Schedules and Sample Steps are Flawed (โ), (๐), (๐)
- 5.15 - Symbol tuning improves in-context learning in language models (โ), (๐), (๐)
- 5.15 - Interpretability at Scale: Identifying Causal Mechanisms in Alpaca (โ), (๐), (๐)
- 5.15 - DarkBERT: A Language Model for the Dark Side of the Internet (โ), (๐), (๐)
- 5.15 - AutoRecon: Automated 3D Object Discovery and Reconstruction (โ), (๐), (๐), (Project page)
- 5.15 - RL4F: Generating Natural Language Feedback with Reinforcement Learning for Repairing Model Outputs (โ), (๐), (๐), (โณ๏ธ)
- 5.15 - Small Models are Valuable Plug-ins for Large Language Models (โ), (๐), (๐)
- 5.15 - "ChatGPT can pick stocks better than top fund managers" - The ChatGPT Fund - (tweet), (website)
- 5.15 - officially launching the Poe API - (Tweet), (:octocat:): (poe-protocol), (api-bot-tutorial)
- 5.15 - Guidance - A guidance language for controlling large language models (:octocat:)
- 5.15 - BriefGPT - Locally hosted tool that connects documents to LLMs for summarization and querying, with a simple GUI (:octocat:)
- 5.15 - I'm an ER doctor. Here's how I'm already using ChatGPT to help treat patients (blog)
- 5.14 - A Comprehensive Survey on Segment Anything Model for Vision and Beyond (โ), (๐), (๐), (โณ๏ธ), (:octocat:)
- 5.14 - How to run Llama 13B with a 6GB graphics card (Gist)
- 5.13 - AI Research Highlights April-May 2023: Transformers for Long Inputs and Less Training Data (Blog)
- 5.13 - Leaked Copilot Chat's confidential rules (tweet)
- 5.13 - GPT-Sentinel: Distinguishing Human and ChatGPT Generated Content ([arXiv](https://arxiv.org/abs/2305.07969)), (๐), (๐)
- 5.13 - Everything-LLMs-And-Robotics - The world's largest GitHub Repository for LLMs + Robotics (:octocat:)
- 5.13 - CodeT5+: Open Code Large Language Models for Code Understanding and Generation (โ), (๐), (๐), (:octocat:), (๐papers with code)
- 5.13 - EU AI Act To Target US Open Source Software (Blog)
- 5.13 - PCAST Working Group on Generative AI Invites Public Input (Blog)
- 5.12 - Text2Cohort: Democratizing the NCI Imaging Data Commons with Natural Language Cohort Discovery (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.12 - MedGPTEval: A Dataset and Benchmark to Evaluate Responses of Large Language Models in Medicine (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.12 - A Survey on Segment Anything Model (SAM): Vision Foundation Model Meets Prompt Engineering (โ), (๐), (๐), (โณ๏ธ)
- 5.12 - spacy-llm, an extension for integrating LLMs into structured NLP pipelines! (:octocat:), (tweet)
- 5.12 - TinyStories: How Small Can Language Models Be and Still Speak Coherent English? (โ), (๐), (๐)
- 5.12 - Dr. LLaMA: Improving Small Language Models in Domain-Specific QA via Generative Data Augmentation (โ), (๐), (๐), (model), (:octocat:)
- 5.12 - ArtGPT-4: Artistic Vision-Language Understanding with Adapter-enhanced MiniGPT-4 (โ), (๐), (๐), (model)
- 5.12 - MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers (โ), (๐), (๐)
- 5.12 - AI FILM - The Carnival of the Ages - Runway gen2 (Youtube), (Reddit)
- 5.11 - Towards best practices in AGI safety and governance: A survey of expert opinion (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 5.11 - The ConceptARC Benchmark: Evaluating Understanding and Generalization in the ARC Domain (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 5.11 - Large Language Models Can Be Used To Effectively Scale Spear Phishing Campaigns (โ), (๐), (๐)
- 5.11 - Optimizing Memory Mapping Using Deep Reinforcement Learning (โ), (๐), (๐)
- 5.11 - Universal Source Separation with Weakly Labelled Data (โ), (๐), (๐), (:octocat:)
- 5.11 - Active Retrieval Augmented Generation (โ), (๐), (๐), (:octocat:)
- 5.11 - Anthropic - Introducing 100K Context Windows (Blog)
- 5.11 - CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model (โ), (๐), (๐)
- 5.11 - Exploiting Diffusion Prior for Real-World Image Super-Resolution (โ), (๐), (๐), (Project page)
- 5.11 - Domain Incremental Lifelong Learning in an Open World (โ), (๐), (๐)
- 5.11 - Not All Languages Are Created Equal in LLMs: Improving Multilingual Capability by Cross-Lingual-Thought Prompting (โ), (๐), (๐)
- 5.11 - Region-Aware Pretraining for Open-Vocabulary Object Detection with Vision Transformers (โ), (๐), (๐)
- 5.11 - EfficientViT: Memory Efficient Vision Transformer with Cascaded Group Attention (โ), (๐), (๐), (:octocat:)
- 5.11 - InstructBLIP: Towards General-purpose Vision-Language Models with Instruction Tuning (โ), (๐), (๐), (:octocat:)
- 5.11 - Huggingface Transformers Agent (API)
- 5.11 - Google PaLM 2 Technical Report (๐), (Blog)
- 5.11 - Google MusicLM (Demo), (news)
- 5.10 - LMFlow Benchmark: An Automatic Evaluation Framework for Open-Source LLMs (blog)
- 5.10 - Bridging the Literacy Gap for Surgical Consents: An AI-Human Expert Collaborative Approach (medRxiv paper)
- 5.10 - Are ChatGPT and GPT-4 General-Purpose Solvers for Financial Text Analytics? An Examination on Several Typical Tasks (โ), (๐), (๐), (โณ๏ธ)
- 5.10 - HumanRF: High-Fidelity Neural Radiance Fields for Humans in Motion (โ), (๐), (๐)
- 5.10 - VideoChat: Chat-Centric Video Understanding (โ), (๐), (๐)
- 5.10 - Bot or Human? Detecting ChatGPT Imposters with A Single Question (โ), (๐), (๐)
- 5.10 - Do LLMs Understand User Preferences? Evaluating LLMs On User Rating Prediction (โ), (๐), (๐)
- 5.10 - Relightify: Relightable 3D Faces from a Single Image via Diffusion Models (โ), (๐), (๐)
- 5.10 - Similarity of Neural Network Models: A Survey of Functional and Representational Measures (โ), (๐), (๐)
- 5.10 - Generative AI meets 3D: A Survey on Text-to-3D in AIGC Era (โ), (๐), (๐)
- 5.10 - MPT-7B StoryWriter - a new open-source language model that can handle really long inputs (Replicate)
- 5.10 - Humata.ai - Ask AI anything about your files (Tweet)
- 5.10 - IMAGEBIND: One Embedding Space To Bind Them All (๐), (Blog), (:octocat:), (๐papers with code), (star history)
- 5.9 - Large Language Models Need Holistically Thought in Medical Conversational QA (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.9 - StarCoder: may the source be with you! (โ), (๐), (๐), (๐ )
- 5.9 - Towards Building the Federated GPT: Federated Instruction Tuning (โ), (๐), (๐), (:octocat:)
- 5.9 - Large Language Model Programs (โ), (๐), (๐)
- 5.9 - FrugalGPT: How to Use Large Language Models While Reducing Cost and Improving Performance (โ), (๐), (๐)
- 5.9 - OpenAI - Language models can explain neurons in language models (Blog), (Paper), (:octocat:), (Tweet)
- 5.9 - AvatarReX: Real-time Expressive Full-body Avatars (โ), (๐), (๐)
- 5.8 - Augmented Large Language Models with Parametric Knowledge Guiding (โ), (๐), (๐)
- 5.8 - We had ChatGPT take the CPA exam - and it failed (news)
- 5.8 - Comparison of GPT-3.5, GPT-4, and human user performance on a practice ophthalmology written examination (Nature)
- 5.8 - MultiModal-GPT: A Vision and Language Model for Dialogue with Humans (โ), (๐), (๐), (:octocat:), (๐ ), (Star history)
- 5.7 - SuperAgent - Deploy LLM Agents to production (:octocat:)
- 5.7 - Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models (โ), (๐), (๐), (:octocat:)
- 5.7 - X-LLM: Bootstrapping Advanced Large Language Models by Treating Multi-Modalities as Foreign Languages (โ), (๐), (๐)
- 5.7 - Multi-Space Neural Radiance Fields (โ), (๐), (๐), (Project page), (Dataset)
- 5.7 - Language Models Don't Always Say What They Think: Unfaithful Explanations in Chain-of-Thought Prompting (โ), (๐), (๐)
- 5.7 - Yoshua Bengio - AI Scientists: Safe and Useful AI? (Blog)
- 5.6 - Insights from Large-Scale LLM Training Runs (Blog)
- 5.5 - privateGPT - Interact privately with your documents using the power of GPT, 100% privately, no data leaks (:octocat:), (star history)
- 5.5 - Open LLMs : A list of open LLMs available for commercial use - (:octocat:)
- 5.5 - A Suite of Generative Tasks for Multi-Level Multimodal Webpage Understanding (โ), (๐), (๐), (๐ )
- 5.5 - Otter: A Multi-Modal Model with In-Context Instruction Tuning (โ), (๐), (๐), (:octocat:), (๐ )
- 5.5 - Composite Motion Learning with Task Control (โ), (๐), (๐), (:octocat:), (Paper page)
- 5.5 - StarCoderBase: trained on 1T tokens in 80+ programming languages (Huggingface)
- 5.5 - Dolphin: General video interaction platform based on LLMs (Demo), (:octocat:), (Tweet)
- 5.5 - MPT-7B: A New Standard for Open-Source, Commercially Usable LLMs (Blog), Commercially usable: (MPT-7B), (MPT-7B-Instruct), (MPT-7B-StoryWriter), For non-commercial use: (MPT-7B-Chat)
- 5.5 - StarCoder: A State-of-the-Art LLM for Code (Blog), (:octocat:), (HuggingFace), (Tweet)
- 5.5 - OpenAlpaca, an instruction-following model based on OpenLLaMA (:octocat:), (Huggingface), (Tweet)
- 5.4 - Caption Anything: Interactive Image Description with Diverse Multimodal Controls (โ), (๐), (๐), (๐), (โณ๏ธ), (:octocat:)
- 5.4 - Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability (โ), (๐), (๐), (:octocat:), (demo), (Paper page)
- 5.4 - Evaluating the Performance of ChatGPT in Ophthalmology: An Analysis of its Successes and Shortcomings (Ophthalmology Science)
- 5.4 - Cognitive Reframing of Negative Thoughts through Human-Language Model Interaction (โ), (๐), (๐)
- 5.4 - Governance of the AI, by the AI, and for the AI (โ), (๐), (๐), (Paper page)
- 5.4 - Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs (โ), (๐), (๐)
- 5.4 - Diffusion Explainer: Visual Explanation for Text-to-image Stable Diffusion (โ), (๐), (๐), (Paper page)
- 5.4 - AttentionViz: A Global View of Transformer Attention (โ), (๐), (๐), (Paper page)
- 5.4 - Reddit - OpenAI lost $540M in 2022, will need $100B more to develop AGI, says Altman. My breakdown on why this matters and what it means for other AI startups
- 5.4 - FACT SHEET: Biden-Harris Administration Announces New Actions to Promote Responsible AI Innovation that Protects Americans' Rights and Safety - (White house)
- 5.4 - Google "We Have No Moat, And Neither Does OpenAI" - (Blog)
- 5.4 - CNBC - Britain launches probe into ChatGPT-style A.I. as regulators grow concerned by risks
- 5.4 - Personalize Segment Anything Model with One Shot (โ), (๐), (๐), (:octocat:), (๐ )
- 5.4 - AutoML-GPT: Automatic Machine Learning with GPT (โ), (๐), (๐), (๐ )
- 5.4 - NeRSemble: Multi-view Radiance Field Reconstruction of Human Heads (โ), (๐), (๐), (Project page), (๐ )
- 5.4 - An automatically discovered chain-of-thought prompt generalizes to novel models and datasets (โ), (๐), (๐)
- 5.4 - NYT - White House Pushes Tech C.E.O.s to Limit Risks of A.I.
- 5.4 - Microsoft Bing AI chatbot and Edge browser get massive AI upgrades. See the list. (Blog)
- 5.4 - Introducing Slack GPT (Blog)
- 5.3 - Can Large Language Models Be an Alternative to Human Evaluations? (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.3 - Distinguishing GPT-4-generated Radiology Abstracts from Original Abstracts: Performance of Blinded Human Observers and AI Content Detector (medRxiv), (๐)
- 5.3 - Chatbot Arena: Benchmarking LLMs in the Wild with Elo Ratings - (Blog)
- 5.3 - CodeGen2: Lessons for Training LLMs on Programming and Natural Languages (โ), (๐), (๐), (:octocat:)
- 5.3 - Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes (โ), (๐), (๐)
- 5.3 - Visual Chain of Thought: Bridging Logical Gaps with Multimodal Infillings (โ), (๐), (๐)
- 5.3 - AG3D: Learning to Generate 3D Avatars from 2D Image Collections (โ), (๐), (๐), (Project page)
- 5.3 - Shap-E: Generating Conditional 3D Implicit Functions (โ), (๐), (๐), (:octocat:), (๐ )
- 5.3 - 100 Practical Applications and Use Cases of Generative AI - (๐), (News)
- 5.3 - Comprehensive LLM model zoo - Ecosystem Graphs to track the foundation model ecosystem assets (datasets, models, and applications) and their relationship (Table), (Graph), (:octocat:)
- 5.3 - GPTutor: a ChatGPT-powered programming tool for code explanation (โ), (๐), (๐)
- 5.3 - Midjourney 5.1 Arrives - And It's Another Leap Forward For AI Art - (Forbes)
- 5.3 - Mojo 🔥 - a new programming language for all AI developers (Web), (tweet), (:octocat:)
- 5.3 - #NeurIPS2023 Creative AI Track (Blog), (Call for proposal)
- 5.3 - HeyPi - Personal AI
- 5.2 - RadAdapt: Radiology Report Summarization via Lightweight Domain Adaptation of Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 5.2 - Interpretable Machine Learning for Science with PySR and SymbolicRegression.jl (โ), (๐), (๐)
- 5.2 - Andrew Ng - ChatGPT Prompt Engineering for Developers - (online course), (Tweet)
- 5.2 - DreamPaint: Few-Shot Inpainting of E-Commerce Items for Virtual Try-On without 3D Modeling (โ), (๐), (๐)
- 5.2 - Generalizing Dataset Distillation via Deep Generative Prior (โ), (๐), (๐)
- 5.2 - Multimodal Procedural Planning via Dual Text-Image Prompting (โ), (๐), (๐), (:octocat:)
- 5.2 - WSJ - Google DeepMind CEO Says Some Form of AGI Possible in a Few Years
- 5.2 - Latest NVIDIA Graphics Research Advances Generative AI's Next Frontier (Blog)
- 5.2 - Pick-a-Pic: An Open Dataset of User Preferences for Text-to-Image Generation (โ), (๐), (๐), (:octocat:)
- 5.2 - TMR: Text-to-Motion Retrieval Using Contrastive 3D Human Motion Synthesis (โ), (๐), (๐), (Project page), (Demo)
- 5.2 - Is Your Code Generated by ChatGPT Really Correct? Rigorous Evaluation of Large Language Models for Code Generation (โ), (๐), (๐), (:octocat:)
- 5.2 - Unlimiformer: Long-Range Transformers with Unlimited Length Input (โ), (๐), (๐)
- 5.2 - Bark - Text-Prompted Generative Audio Model (:octocat:)
- 5.2 - Jsonformer: A Bulletproof Way to Generate Structured JSON from Language Models (:octocat:)
- 5.1 - scGPT: Towards Building a Foundation Model for Single-Cell Multi-omics Using Generative AI (bioRxiv), (๐)
- 5.1 - The Guardian - AI makes non-invasive mind-reading possible by turning thoughts into text
- 5.1 - Learning to Reason and Memorize with Self-Notes (โ), (๐), (๐)
- 5.1 - Poisoning Language Models During Instruction Tuning (โ), (๐), (๐)
- 5.1 - What Do Self-Supervised Vision Transformers Learn? (โ), (๐), (๐)
- 5.1 - NYT - "The Godfather of A.I." Leaves Google and Warns of Danger Ahead (Archive)
- 4.30 - ChatGPT: Is this version good for healthcare and research? - (ScienceDirect)
- 4.30 - Understanding Parameter-Efficient LLM Finetuning: Prompt Tuning And Prefix Tuning (Blog)
- 4.30 - A brief history of LLaMA models (Blog)
- 4.30 - BabyBeeAGI: Task Management and Functionality Expansion on top of BabyAGI (blog), (Replit), (:octocat:), (OG BabyAGI)
- 4.30 - Results of G7 Digital and Tech Ministers' Meeting in Takasaki, Gunma - (Summary), (Declaration), (Annex1), (Annex2), (Annex3), (Annex4), (Annex5)
- 4.30 - PandaLM: Reproducible and Automated Language Model Assessment (:octocat:)
- 4.29 - Can ChatGPT Pass An Introductory Level Functional Language Programming Course? (โ), (๐), (๐)
- 4.29 - A Review of ChatGPT Applications in Education, Marketing, Software Engineering, and Healthcare: Benefits, Drawbacks, and Research Directions (โ), (๐), (๐)
- 4.29 - ChatGPT-2D, which can generate mind maps with AI - (Tweet), (ChatGPT-2D)
- 4.29 - MLC LLM - an open framework that brings language models (LLMs) directly into a broad class of platforms (CUDA, Vulkan, Metal) with GPU acceleration (Tweet), (Demo), (:octocat:)
- 4.29 - GenOs Index - The April (aka the Frenetic Pace) Edition - (blog)
- 4.29 - StableVicuna, the AI World's First Open Source RLHF LLM Chatbot! - (Blog), (Tweet)
- 4.29 - DeepFloyd - a state-of-the-art text-to-image model (Web), (:octocat:), (HuggingFace demo), (Tweet)
- 4.29 - When Patient Questions Are Answered With Higher Quality and Empathy by ChatGPT than Physicians - (Blog)
- 4.29 - BMTools - Tool Learning for Big Models, Open-Source Solutions of ChatGPT-Plugins (:octocat:)
- 4.29 - FastChat-T5 (:octocat:), (Tweet)
- 4.29 - Lamini, the LLM Engine for Rapidly Customizing Models - (Blog)
- 4.28 - SAM on Medical Images: A Comprehensive Study on Three Prompt Modes (โ), (๐), (๐), (โณ๏ธ)
- 4.28 - EU proposes new copyright rules for generative AI - (Reuters), (Economic Times)
- 4.28 - Prompt Engineering for ChatGPT: A Quick Guide to Techniques, Tips, and Best Practices - (๐)
- 4.28 - ResiDual: Transformer with Dual Residual Connections (โ), (๐), (๐), (:octocat:)
- 4.28 - Causal Reasoning and Large Language Models: Opening a New Frontier for Causality (โ), (๐), (๐)
- 4.28 - We Interviewed the Engineer Google Fired for Saying Its AI Had Come to Life (Futurism)
- 4.28 - LLaMA-Adapter V2: Parameter-Efficient Visual Instruction Model (โ), (๐), (๐), (:octocat:)
- 4.28 - MLCopilot: Unleashing the Power of Large Language Models in Solving Machine Learning Tasks (โ), (๐), (๐)
- 4.28 - Are Emergent Abilities of Large Language Models a Mirage? (โ), (๐), (๐)
- 4.28 - The Ultimate Battle of Language Models: Lit-LLaMA vs GPT3.5 vs Bloom vs … (Blog)
- 4.28 - Otter, a multi-modal in-context learning model with instruction tuning - (:octocat:), (Demo), (Youtube)
- 4.28 - Economist - Yuval Noah Harari argues that AI has hacked the operating system of human civilisation (Archive)
- 4.28 - Assessing the Potential of USMLE-Like Exam Questions Generated by GPT-4 (medRxiv), (๐)
- 4.28 - JAMA - Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum - (paper)
- 4.27 - Ethics of large language models in medicine and medical research (The Lancet, https://doi.org/10.1016/S2589-7500(23)00083-3), (PDF)
- 4.27 - ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger (โ), (๐), (๐)
- 4.27 - PMC-LLaMA: Further Finetuning LLaMA on Medical Papers (โ), (๐), (๐), (:octocat:)
- 4.27 - "Can ChatGPT Diagnose Me?" How Large Language Models will Transform Clinical Care - (Youtube)
- 4.27 - Large Language Models Are State-of-the-Art Evaluators of Code Generation (โ), (๐), (๐)
- 4.27 - Controlled Text Generation with Natural Language Instructions (โ), (๐), (๐)
- 4.27 - โญ A Survey of Large Language Models - version 8 (โ), (๐), (๐)
- 4.27 - LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions (โ), (๐), (๐), ([:octocat:](https://github.com/m)
- 4.27 - DataComp: In search of the next generation of multimodal datasets (โ), (๐), (๐), (:octocat:), (Project page)
- 4.27 - We're Afraid Language Models Aren't Modeling Ambiguity (โ), (๐), (๐)
- 4.27 - Boston Dynamics robot dog can answer your questions now, thanks to ChatGPT - (ZDNet), (YouTube)
- 4.27 - LlamaIndex & Deep Lake for Financial Statement Analysis (Blog)
- 4.26 - Towards Medical Artificial General Intelligence via Knowledge-Enhanced Multimodal Pretraining (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 4.26 - Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning (โ), (๐), (๐)
- 4.26 - Multidimensional Evaluation for Text Style Transfer Using ChatGPT (โ), (๐), (๐)
- 4.26 - NPJ - Comparing scientific abstracts generated by ChatGPT to real abstracts with detectors and blinded human reviewers (Paper), (๐)
- 4.26 - TopGPT - the world's first Andrew Tate large language model
- 4.26 - Multi-Party Chat: Conversational Agents in Group Settings with Humans and Models (โ), (๐), (๐)
- 4.26 - MOSS, a 16B tool-augmented conversational language model (Tweet), (:octocat:)
- 4.26 - Exploring the Curious Case of Code Prompts (โ), (๐), (๐)
- 4.26 - Controllable Image Generation via Collage Representations (โ), (๐), (๐)
- 4.26 - Unleashing Infinite-Length Input Capacity for Large-scale Language Models with Self-Controlled Memory System (โ), (๐), (๐)
- 4.26 - TextDeformer: Geometry Manipulation using Text Guidance (โ), (๐), (๐)
- 4.26 - Evaluation of GPT-3.5 and GPT-4 for supporting real-world information needs in healthcare delivery (โ), (๐), (๐)
- 4.26 - Ray Conditioning: Trading Photo-consistency for Photo-realism in Multi-view Image Generation (โ), (๐), (๐), (Project page)
- 4.26 - Harnessing the Power of LLMs in Practice: A Survey on ChatGPT and Beyond (โ), (๐), (๐), (:octocat:), (SS)
- 4.26 - HuggingChat - the first open source alternative to ChatGPT
- 4.25 - Time - The 'Don't Look Up' Thinking That Could Doom Us With AI (Archive)
- 4.25 - AI-assisted coding: Experiments with GPT-4 (โ), (๐), (๐)
- 4.25 - NVIDIA NeMo Guardrails helps enterprises keep applications built on large language models aligned with their safety and security requirements (Blog), (:octocat:)
- 4.25 - Stable and low-precision training for large-scale vision-language models (โ), (๐), (๐)
- 4.25 - AudioGPT: Understanding and Generating Speech, Music, Sound, and Talking Head (โ), (๐), (๐)
- 4.25 - Answering Questions by Meta-Reasoning over Multiple Chains of Thought (โ), (๐), (๐)
- 4.25 - Patch-based 3D Natural Scene Generation from a Single Example (โ), (๐), (๐), (Project page)
- 4.25 - Generative AI at Work - (NBER), (๐) -
- 4.25 - Chatbot Arena
- 4.24 - Text-to-Audio Generation using Instruction-Tuned LLM and Latent Diffusion Model (โ), (๐), (๐),(Project page), (:octocat:)
- 4.24 - AI, write an essay for me: A large-scale comparison of human-written versus ChatGPT-generated essays (โ), (๐), (๐)
- 4.24 - Pointersect: Neural Rendering with Cloud-Ray Intersection (โ), (๐), (๐), (web)
- 4.24 - A Cookbook of Self-Supervised Learning (โ), (๐), (๐)
- 4.24 - On the Challenges of Using Black-Box APIs for Toxicity Evaluation in Research (โ), (๐), (๐), (:octocat:)
- 4.24 - Towards Realistic Generative 3D Face Models (โ), (๐), (๐)
- 4.24 - TextMesh: Generation of Realistic 3D Meshes From Text Prompts (โ), (๐), (๐)
- 4.24 - Benchmarking ChatGPT-4 on ACR Radiation Oncology In-Training Exam (TXIT): Potentials and Challenges for AI-Assisted Medical Education and Decision Making in Radiation Oncology (โ), (๐), (๐), (:octocat:)
- 4.24 - Social AGI - SAMANTHA (Self-Reflective Artificial Mind Attuned to Naturalistic Thought and Human Adaptability) (:octocat:)
- 4.24 - Segment Anything in Medical Images (โ), (๐), (๐), (:octocat:)
- 4.24 - Segment Anything in 3D with NeRFs (โ), (๐), (๐), (project page)
- 4.24 - WizardLM: Empowering Large Language Models to Follow Complex Instructions (โ), (๐), (๐)
- 4.24 - Track Anything: Segment Anything Meets Videos (โ), (๐), (๐)
- 4.24 - OpenAI Brand guidelines - (blog)
- 4.24 - GPT4Tools: Teaching LLM to Use Tools via Self-instruction - (Project page), (:octocat:), (Video)
- 4.24 - RAM: Relate-Anything-Model (:octocat:), (Demo)
- 4.24 - Chart-GPT 1.0
- 4.23 - Enhancing Chain-of-Thoughts Prompting with Iterative Bootstrapping in Large Language Models (โ), (๐), (๐), (:octocat:)
- 4.23 - Evaluating ChatGPT's Information Extraction Capabilities: An Assessment of Performance, Explainability, Calibration, and Faithfulness (โ), (๐), (๐)
- 4.22 - Boosting Theory-of-Mind Performance in Large Language Models via Prompting (โ), (๐), (๐)
- 4.22 - LaMP: When Large Language Models Meet Personalization (โ), (๐), (๐), (Project page), (Download), (Leaderboard), (:octocat:)
- 4.22 - Finetuning Large Language Models (Blog)
- 4.21 - Can GPT-4 Perform Neural Architecture Search? (โ), (๐), (๐)
- 4.21 - Evaluating Transformer Language Models on Arithmetic Operations Using Number Decomposition (โ), (๐), (๐)
- 4.21 - Emergent and Predictable Memorization in Large Language Models (โ), (๐), (๐)
- 4.21 - CLaMP: Contrastive Language-Music Pre-training for Cross-Modal Symbolic Music Information Retrieval (โ), (๐), (๐)
- 4.21 - Bard now helps you code with support for 20+ langs (Python, C++, JS, Go, etc.). (Blog)
- 4.21 - Inducing anxiety in large language models increases exploration and bias (โ), (๐), (๐)
- 4.20 - Is ChatGPT a Good Recommender? A Preliminary Study (โ), (๐), (๐), (โณ๏ธ)
- 4.20 - Why Does ChatGPT Fall Short in Answering Questions Faithfully? (โ), (๐), (๐)
- 4.20 - FinChat.io - The Chat GPT for Finance
- 4.20 - LlamaAcademy: Teaching Llamas How to Code (:octocat:)
- 4.20 - Announcing Google DeepMind: DeepMind + Brain = Google DeepMind (Blog)
- 4.20 - "Can ChatGPT Diagnose Me?" How Large Language Models will Transform Clinical Care. Thursday, April 27th, 2023 (RSVP)
- 4.20 - StableLM: Stability AI Language Models (:octocat:), (Blog)
- 4.19 - GeneGPT: Teaching Large Language Models to Use NCBI Web APIs (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 4.19 - Large Language Models in Medical Education: Opportunities, Challenges, and Future Directions (JMIR doi:10.2196/48291), (PDF)
- 4.19 - Fundamental Limitations of Alignment in Large Language Models (โ), (๐), (๐)
- 4.19 - Scaling Transformer to 1M tokens and beyond with RMT (โ), (๐), (๐), (:octocat:)
- 4.19 - Occupational Heterogeneity in Exposure to Generative AI - (paper), (๐)
- 4.19 - The Unintended Consequences of Censoring Digital Technology -- Evidence from Italy's ChatGPT Ban (โ), (๐), (๐)
- 4.19 - CompressGPT: Decrease Token Usage by ~70% (blog)
- 4.19 - Language Models Enable Simple Systems for Generating Structured Views of Heterogeneous Data Lakes (โ), (๐), (๐), (:octocat:)
- 4.19 - LLM as A Robotic Brain: Unifying Egocentric Memory and Control (โ), (๐), (๐)
- 4.19 - Is ChatGPT Good at Search? Investigating Large Language Models as Re-Ranking Agent (โ), (๐), (๐)
- 4.19 - Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models (โ), (๐), (๐), (project page), (:octocat:)
- 4.19 - h2oai's LLM repositories - (h2ogpt), (h2o-llmstudio), (Huggingface) -
- 4.19 - Evaluating Verifiability in Generative Search Engines (โ), (๐), (๐)
- 4.19 - How to train your own Large Language Models (Blog)
- 4.19 - AI Playground from Vercel Labs (tweet)
- 4.19 - StanfordBDHG HealthGPT (tweet), (:octocat:)
- 4.19 - GPT4All-J : the first Apache-2 Licensed Chatbot that runs locally on your machine (:octocat:), (๐) -
- 4.19 - PersonalPrivate.AI - system to advise on new patent ideas (tweet)
- 4.18 - Computer-Vision Benchmark Segment-Anything Model (SAM) in Medical Images: Accuracy in 12 Datasets (โ), (๐), (๐), (โณ๏ธ)
- 4.18 - Exploring the Trade-Offs: Unified Large Language Models vs Local Fine-Tuned Models for Highly-Specific Radiology NLI Task (โ), (๐), (๐), (โณ๏ธ)
- 4.18 - Economist - The world needs an international agency for artificial intelligence, say two AI experts (Archive)
- 4.18 - CancerGPT: Few-shot Drug Pair Synergy Prediction using Large Pre-trained Language Models (โ), (๐), (๐)
- 4.18 - Think Before You Act: Unified Policy for Interleaving Language Reasoning with Actions (โ), (๐), (๐)
- 4.18 - Nature - Why open-source generative AI models are an ethical way forward for science
- 4.18 - Autonomous Agents(BabyAGI, AutoGPT) & Agent Simulations(CAMEL, Generative Agents) (Blog)
- 4.18 - AutoTaskFormer: Searching Vision Transformers for Multi-task Learning (โ), (๐), (๐)
- 4.18 - SAM Fails to Segment Anything? -- SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, and More (โ), (๐), (๐), (Project page)
- 4.18 - Align your Latents: High-Resolution Video Synthesis with Latent Diffusion Models (โ), (๐), (๐), (Project page) -
- 4.18 - Google - Differentially private heatmaps (Blog)
- 4.18 - The Complete Beginners Guide To Autonomous Agents
- 4.18 - Llama Lab - A repo dedicated to building cutting-edge AGI projects: llama_agi (inspired by babyagi) and auto_llama (inspired by autogpt) (:octocat:), (Llama Hub)
- 4.18 - Elon Musk to start ChatGPT rival called "TruthGPT" (tweet)
- 4.17 - InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions (CVPR2023 โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 4.17 - MasaCtrl: Tuning-Free Mutual Self-Attention Control for Consistent Image Synthesis and Editing (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 4.17 - Notice of the Cyberspace Administration of China on Public Comments on the "Administrative Measures for Generative Artificial Intelligence Services (Draft for Comment)" (Announcement)
- 4.17 - Pretrained Language Models as Visual Planners for Human Assistance (โ), (๐), (๐)
- 4.17 - An Evaluation on Large Language Model Outputs: Discourse and Memorization (โ), (๐), (๐)
- 4.17 - Epic, Microsoft bring generative AI to EHRs - (Microsoft announcement: Microsoft and Epic expand strategic collaboration with integration of Azure OpenAI Service)
- 4.17 - BenchMD: A Benchmark for Modality-Agnostic Learning on Medical Images and Sensors (โ), (๐), (๐) -
- 4.17 - Towards Robust Prompts on Vision-Language Models (โ), (๐), (๐) -
- 4.17 - Tool Learning with Foundation Models (โ), (๐), (๐), (:octocat:) -
- 4.17 - Low-code LLM: Visual Programming over LLMs (โ), (๐), (๐), (โณ๏ธ), (:octocat:)
- 4.17 - Wired - OpenAI's CEO Says the Age of Giant AI Models Is Already Over
- 4.17 - Synthetic Data from Diffusion Models Improves ImageNet Classification (โ), (๐), (๐)
- 4.17 - RedPajama-Data: An Open Source Recipe to Reproduce LLaMA training dataset (GitHub)
- 4.17 - Visual Instruction Tuning (โ), (๐), (๐), (:octocat:), (Dataset), (Model), (Project page), (Demo)
- 4.17 - Learning to Compress Prompts with Gist Tokens (โ), (๐), (๐)
- 4.17 - ImpressionGPT: An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT (โ), (๐), (๐)
- 4.17 - Meta - DINOv2: State-of-the-art computer vision models with self-supervised learning (blog), (:octocat:), (Demo), (โ), (๐), (๐)
- 4.17 - TypingMind - A better UI for ChatGPT (tweet)
- 4.16 - Understanding Large Language Models (Blog)
- 4.16 - INSIGHT - an autonomous AI that can do medical research (:octocat:)
- 4.16 - GPT4free - use ChatGPT, for free!! - (:octocat:) -
- 4.16 - Solving Math Word Problems by Combining Language Models With Symbolic Solvers (โ), (๐), (๐)
- 4.16 - ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented Instruction Tuning for Digital Human (โ), (๐), (๐)
- 4.16 - Driving and suppressing the human language network using large language models (bioRxiv), (๐)
- 4.16 - MultiGPT (:octocat:), (tweet)
- 4.16 - OpenAssistant Conversations - Democratizing Large Language Model Alignment (๐), (YouTube), (hacker news)
- 4.16 - Auto-evaluator - lightweight evaluation tool for question-answering using Langchain (:octocat:) -
- 4.16 - NYT - Google Devising Radical Search Changes to Beat Back A.I. Rivals (Archive)
- 4.15 - Brex's Prompt Engineering Guide (:octocat:)
- 4.15 - Graphologue and Sensecape by UCSD Creativity Lab
- 4.15 - Tractable Control for Autoregressive Language Generation (โ), (๐), (๐)
- 4.15 - Web LLM - language model chats directly onto web browsers (Site), (:octocat:)
- 4.15 - MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models (Project page), (Paper), (:octocat:), (YouTube)
- 4.15 - OpenAssistant - The world's largest open-source replication of ChatGPT (site), (:octocat:), (Dataset - OASST1), (Paper), (YouTube), (Reddit)
- 4.14 - MedAlpaca -- An Open-Source Collection of Medical Conversational AI Models and Training Data (โ), (๐), (๐), (๐papers with code)
- 4.14 - HuaTuo: Tuning LLaMA Model with Chinese Medical Knowledge (โ), (๐), (๐), (๐papers with code)
- 4.14 - ChatGPT: Applications, Opportunities, and Threats (โ), (๐), (๐)
- 4.14 - Swin3D: A Pretrained Transformer Backbone for 3D Indoor Scene Understanding (โ), (๐), (๐)
- 4.14 - OpenBB Terminal V3.0.0rc2 - (:octocat:)
- 4.14 - Delta Denoising Score (โ), (๐), (๐), (Project page)
- 4.14 - DINOv2: Learning Robust Visual Features without Supervision (โ), (๐), (๐)
- 4.14 - Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved With Text (โ), (๐), (๐), (:octocat:)
- 4.14 - WSJ - Elon Musk Creates New Artificial Intelligence Company X.AI (archive), (FT)
- 4.14 - Google Med-PaLM 2 - A responsible path to generative AI in healthcare
- 4.14 - Meta's open source Animated Drawings - (Blog)
- 4.14 - ControlNet v1.1 nightly - (:octocat:)
- 4.13 - Teenage-AGI (:octocat:)
- 4.13 - Boosted Prompt Ensembles for Large Language Models (โ), (๐), (๐)
- 4.13 - ChatGPT-4 Outperforms Experts and Crowd Workers in Annotating Political Twitter Messages with Zero-Shot Learning (โ), (๐), (๐)
- 4.13 - Soundini: Sound-Guided Diffusion for Natural Video Editing (โ), (๐), (๐), (Project page)
- 4.13 - Shall We Pretrain Autoregressive Language Models with Retrieval? A Comprehensive Study (โ), (๐), (๐), (:octocat:)
- 4.13 - Inpaint Anything: Segment Anything Meets Image Inpainting (โ), (๐), (๐), (:octocat:)
- 4.13 - GoalGPT by Nando.ai
- 4.13 - Power-seeking can be probable and predictive for trained agents (โ), (๐), (๐)
- 4.13 - Stable Diffusion XL Beta Available for API Customers and DreamStudio Users
- 4.13 - NAB 2023: Introducing Text-Based Editing in Premiere Pro, Properties panel in After Effects, and much more
- 4.13 - Announcing New Tools for Building with Generative AI on AWS - Amazon LLM (Titan), AWS fine-tuning model (Bedrock), Amazon Copilot competitor (CodeWhisperer)
- 4.13 - FT - We must slow down the race to God-like AI (archive)
- 4.13 - Segment Everything Everywhere All at Once (โ), (๐), (๐)
- 4.13 - Expressive Text-to-Image Generation with Rich Text (โ), (๐), (๐), (Project page)
- 4.13 - AGIEval: A Human-Centric Benchmark for Evaluating Foundation Models (โ), (๐), (๐), (โณ๏ธ), (SS), (:octocat:)
- 4.12 - Can Large Language Models Transform Computational Social Science? (โ), (๐), (๐) -
- 4.12 - Galactic ChitChat: Using Large Language Models to Converse with Astronomy Literature (โ), (๐), (๐) -
- 4.12 - Performance of ChatGPT, GPT-4, and Google Bard on a Neurosurgery Oral Boards Preparation Question Bank (medRxiv), (๐)
- 4.12 - ChatGPT Beyond English: Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning (โ), (๐), (๐)
- 4.12 - Foundation models for generalist medical artificial intelligence (Nature https://doi.org/10.1038/s41586-023-05881-4), (๐), (SS)
- 4.12 - Dolly v2 - 12B parameter language model (Model weight), (:octocat:), (Blog) -
- 4.11 - Re-imagine the Negative Prompt Algorithm: Transform 2D Diffusion into 3D, alleviate Janus problem and Beyond (โ), (๐), (๐), (Project page), (:octocat:), (Colab), (Hugging face) -
- 4.11 - Toxicity in ChatGPT: Analyzing Persona-assigned Language Models (โ), (๐), (๐) -
- 4.11 - Multi-step Jailbreaking Privacy Attacks on ChatGPT (โ), (๐), (๐) -
- 4.11 - Building LLM applications for production
- 4.11 - Emergent autonomous scientific research capabilities of large language models (โ), (๐), (๐)
- 4.11 - OpenAI's Bug Bounty Program
- 4.11 - NTIA's "AI Accountability Policy Request for Comment"
- 4.11 - WSJ - Biden Administration Weighs Possible Rules for AI Tools Like ChatGPT (archive)
- 4.11 - ChemCrow: Augmenting large-language models with chemistry tools (โ), (๐), (๐)
- 4.11 - LangChainJS Support for Multiple JS Environments (tweet)
- 4.11 - Teaching Large Language Models to Self-Debug (โ), (๐), (๐)
- 4.10 - Can ChatGPT Forecast Stock Price Movements? Return Predictability and Large Language Models (Paper), (๐)
- 4.10 - On the Possibilities of AI-Generated Text Detection (โ), (๐), (๐)
- 4.10 - OpenAGI: When LLM Meets Domain Experts (NeurIPS2023 โ), (๐), (๐), (:octocat:)
- 4.9 - ChatAll - concurrently chat with ChatGPT, Bing Chat, Bard, Alpaca, Vicuna, Claude, ChatGLM, MOSS, iFlytek Spark, ERNIE and more, discover the best answers (:octocat:)
- 4.9 - BabyAGI JS - (:octocat:)
- 4.9 - AgentGPT - Auto-GPT directly in the browser (tweet), (:octocat:), (demo)
- 4.8 - A Recipe for Training Large Models
- 4.7 - SuperPrompt Engineer Encourages ChatGPT Hallucinations
- 4.7 - Cerebras-GPT: Open Compute-Optimal Language Models Trained on the Cerebras Wafer-Scale Cluster (โ), (๐), (๐)
- 4.7 - Why think step-by-step? Reasoning emerges from the locality of experience (โ), (๐), (๐)
- 4.7 - Generative Agents: Interactive Simulacra of Human Behavior (โ), (๐), (๐), (Project), (:octocat:) -
- 4.7 - Vicuna-7B: small, efficient, yet capable (:octocat:), (Weight)
- 4.7 - StackLlama (Blog), (Demo), (:octocat:)
- 4.7 - SegGPT: Segmenting Everything In Context (โ), (๐), (๐), (:octocat:), (Demo)
- 4.6 - Synthetic Data in Healthcare (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 4.6 - Chrome ships WebGPU (Blog)
- 4.6 - GPT detectors are biased against non-native English writers (โ), (๐), (๐)
- 4.6 - ChaosGPT: Empowering GPT with Internet and Memory to Destroy Humanity (YouTube)
- 4.6 - InstantBooth: Personalized Text-to-Image Generation without Test-Time Finetuning (โ), (๐), (๐), (Project) -
- 4.6 - Wired - AI Desperately Needs Global Oversight
- 4.6 - Instruction Tuning with GPT-4 (โ), (๐), (๐), (:octocat:)
- 4.6 - GeNVS: Generative Novel View Synthesis with 3D-Aware Diffusion Models (โ), (๐), (๐), (:octocat:)
- 4.6 - Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the MACHIAVELLI Benchmark (โ), (๐), (๐)
- 4.5 - Yoshua Bengio - Slowing down development of AI systems passing the Turing test -
- 4.5 - Language models are on Replicate - FLAN-T5, GPT-J, and LLaMA (Blog)
- 4.5 - Meta's Segment Anything Model (SAM) (Paper), (๐), (:octocat:), (Demo), (โ), (๐), (๐) -
- 04/04 - Summary of ChatGPT-Related Research and Perspective Towards the Future of Large Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (SL), (SP), (GS), (SS), (โณ๏ธ)
- 4.4 - Leveraging GPT-4 for Post Hoc Transformation of Free-text Radiology Reports into Structured Reporting: A Multilingual Feasibility Study (RSNA Radiology, https://doi.org/10.1148/radiol.230725)
- 4.4 - Calibrated Chaos: Variance Between Runs of Neural Network Training is Harmless and Inevitable (โ), (๐), (๐)
- 4.4 - One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era (โ), (๐), (๐)
- 4.4 - LangChain raised $10 million in seed funding
- 4.4 - Kandinsky 2.1 (:octocat:), (HuggingFace)
- 4.4 - The weights of Vicuna-13B released (WebUI demo) (:octocat:)
- 4.4 - LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models (โ), (๐), (๐), (:octocat:)
- 4.4 - Summary of ChatGPT/GPT-4 Research and Perspective Towards the Future of Large Language Models (โ), (๐), (๐)
- 4.3 - Pythia: A Suite for Analyzing Large Language Models Across Training and Scaling (โ), (๐), (๐)
- 4.3 - Vicuna-13B: An Open-Source ChatGPT Alternative That Impresses GPT-4 (Blog), (:octocat:)
- 4.3 - Baby AGI (:octocat:)
- 4.3 - Berkeley just released Koala-13B! (Demo)
- 4.3 - 2023 Artificial Intelligence (AI) Index Report Published by Stanford Institute for Human-Centered Artificial Intelligence (HAI)
- 4.3 - The LLM playground - open source (:octocat:)
- 4.3 - Baize: An Open-Source Chat Model with Parameter-Efficient Tuning on Self-Chat Data (โ), (๐), (๐), (:octocat:)
- 4.2 - GPTCache : A Library for Creating Semantic Cache for LLM Queries - (:octocat:)
- 4.2 - Better Language Models of Code through Self-Improvement (โ), (๐), (๐)
- 4.2 - Eight Things to Know about Large Language Models (โ), (๐), (๐)
- 4.2 - LLMMaps -- A Visual Metaphor for Stratified Evaluation of Large Language Models (โ), (๐), (๐)
- 4.1 - Towards General Purpose Vision Systems (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 4.1 - Evaluating Large Language Models on a Highly-specialized Topic, Radiation Oncology Physics (โ), (๐), (๐), (โณ๏ธ)
- 4.1 - Italy curbs ChatGPT, starts probe over privacy concerns
- 3.31 - Evaluating GPT-4 and ChatGPT on Japanese Medical Licensing Examinations (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 3.31 - Choose Your Weapon: Survival Strategies for Depressed AI Academics (โ), (๐), (๐)
- 3.31 - CAMEL: Communicative Agents for "Mind" Exploration of Large Scale Language Model Society (โ), (๐), (๐), (:octocat:)
- 3.31 - โญ A Survey of Large Language Models - Version 1 (โ), (๐), (๐)
- 3.31 - (SCIENTIFIC AMERICAN) AI Chatbots Can Diagnose Medical Conditions at Home. How Good Are They?
- 3.30 - ChatGPT in Healthcare: A Taxonomy and Systematic Review (medRxiv), (๐)
- 3.30 - Launching the Generative AI Open Source (GenOS) Index - (Index), (Tweet)
- 3.30 - Whose Opinions Do Language Models Reflect? (โ), (๐), (๐), (:octocat:)
- 3.30 - Language Models can Solve Computer Tasks (โ), (๐), (๐)
- 3.30 - Self-Refine: Iterative Refinement with Self-Feedback (โ), (๐), (๐)
- 3.30 - Humans in Humans Out: On GPT Converging Toward Common Sense in both Success and Failure (โ), (๐), (๐)
- 3.30 - List of Open Sourced Fine-Tuned Large Language Models (LLM)
- 3.30 - NEJM - Benefits, Limits, and Risks of GPT-4 as an AI Chatbot for Medicine
- 3.30 - BloombergGPT: A Large Language Model for Finance (โ), (๐), (๐)
- 3.30 - Got It AI's ELMAR challenges GPT-4 and LLaMa, scores well on hallucination benchmarks
- 3.30 - HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in HuggingFace (โ), (๐), (๐)
- 3.30 - CAIDP claims "The FTC should investigate OpenAI and block GPT over 'deceptive' behavior"
- 3.30 - Epic to use Microsoft's GPT-4 in EHRs
- 3.30 - Auto-GPT: An Autonomous GPT-4 Experiment (:octocat:)
- 3.29 - HyperDiffusion: Generating Implicit Neural Fields with Weight-Space Diffusion (project), (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 3.29 - AnnoLLM: Making Large Language Models to Be Better Crowdsourced Annotators (โ), (๐), (๐)
- 3.29 - nucleotide transformers - genomics LLM, ranging from 500M to 2.5B parameters - (:octocat:)
- 3.29 - GeoV-9b - 9 billion parameter causal language model (code, weights, colab)
- 3.29 - GPT4All - 7B param language model finetuned from a curated set of 400k GPT-Turbo-3.5
- 3.29 - LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
- 3.29 - MacGPT 3.2
- 3.29 - G-Eval: NLG Evaluation using GPT-4 with Better Human Alignment (โ), (๐), (๐), (โณ๏ธ), (:octocat:)
- 3.29 - TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs (โ), (๐), (๐)
- 3.28 - Training Language Models with Language Feedback at Scale (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 3.28 - Natural Selection Favors AIs over Humans (โ), (๐), (๐)
- 3.28 - ChatGPT Outperforms Crowd-Workers for Text-Annotation Tasks (โ), (๐), (๐)
- 3.28 - LLaMA voice chat + Siri TTS
- 3.28 - Cerebras-GPT - 111M to 13B parameters trained using the Chinchilla formula
- 3.28 - Microsoft Security Copilot: Empowering defenders at the speed of AI
- 3.28 - Google pix2struct launched today, a multimodal model specializing in screenshot data
- 3.28 - OpenFlamingo - a framework that enables training and evaluation of large multimodal models (LMMs)
- 3.27 - Microsoft JARVIS (:octocat:)
- 3.27 - ChatGPT Survey: Performance on NLP datasets
- 3.27 - GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models (โ), (๐), (๐)
- 3.26 - AI-Generated Content (AIGC): A Survey (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 3.26 - Natural Language Reasoning, A Survey (โ), (๐), (๐)
- 3.26 - Sam Altman: OpenAI CEO on GPT-4, ChatGPT, and the Future of AI - Lex Fridman Podcast #367
- 3.26 - LLaMA voice chat
- 3.26 - Japanese Alpaca LoRA
- 3.24 - LLM for Patient-Trial Matching: Privacy-Aware Data Augmentation Towards Better Performance and Generalizability (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 3.24 - Progressively Optimized Local Radiance Fields for Robust View Synthesis (โ), (๐), (๐), (๐ ), (โณ๏ธ), (CVPR 2023)
- 3.24 - Efficient Methods for Natural Language Processing: A Survey (โ), (๐), (๐)
- 3.24 - NYT OPINION - You Can Have the Blue Pill or the Red Pill, and We're Out of Blue Pills (archive)
- 3.24 - Dolly - open source LLM
- 3.24 - Text2Video-Zero: Text-to-Image Diffusion Models are Zero-Shot Video Generators
- 3.24 - ChatDoctor: A Medical Chat Model Fine-tuned on LLaMA Model using Medical Domain Knowledge (โ), (๐), (๐), (:octocat:)
- 3.24 - Do large language models need sensory grounding for meaning and understanding? @YannLeCun
- 3.23 - OpenAI: ChatGPT Plugins
- 3.23 - Opera brings AI ChatGPT bot sidebar to browsers
- 3.22 - The Shaky Foundations of Clinical Foundation Models: A Survey of Large Language Models and Foundation Models for EMRs (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (SS)
- 3.22 - Artificial muses: Generative Artificial Intelligence Chatbots Have Risen to Human-Level Creativity (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 3.22 - GitHub: Copilot X
- 3.22 - Sparks of Artificial General Intelligence: Early experiments with GPT-4 (โ), (๐), (๐), (YouTube)
- 3.22 - Pause Giant AI Experiments: An Open Letter
- 3.21 - WSJ - Generative AI Makes Headway in Healthcare
- 3.21 - NVIDIA Brings Generative AI to World's Enterprises
- 3.21 - Adobe launches Firefly
- 3.21 - Google launches Bard in the US and UK
- 3.21 - Microsoft: Bing Image Creator
- 3.21 - Stability AI Launches Stable Diffusion Reimagine
- 3.20 - Capabilities of GPT-4 on Medical Challenge Problems (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 3.20 - Reflexion: an autonomous agent with dynamic memory and self-reflection (โ), (๐), (๐), (:octocat:)
- 3.20 - March 20 ChatGPT outage: Here's what happened
- 3.20 - Runway Gen-2
- 3.20 - Making Music with GPT 4 by (Wavtool)
- 3.19 - Simple LLM Finetuner (:octocat:)
- 3.18 - Data-centric Artificial Intelligence: A Survey (โ), (๐), (๐), (:octocat:)
- 3.17 - Can AI-Generated Text be Reliably Detected? (โ), (๐), (๐)
- 3.17 - GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models (โ), (๐), (๐), (โณ๏ธ), (SS)
- 3.16 - WebSHAP: Towards Explaining Any Machine Learning Models Anywhere (โ), (๐), (๐), (:octocat:)
- 3.16 - LERF: Language Embedded Radiance Fields (โ), (๐), (๐), (:octocat:)
- 3.16 - Microsoft: Microsoft 365 Copilot
- 3.16 - Alpaca LoRA: instruct tune LLAMA on consumer hardware
- 3.16 - OpenAI CEO Sam Altman says AI will reshape society, acknowledges risks: 'A little bit scared of this'
- 3.15 - A new era for AI and Google Workspace
- 3.15 - PyTorch 2.0: Our next generation release
- 3.15 - Baidu: ERNIE Bot
- 3.15 - Midjourney: Midjourney V5
- 3.15 - arXiv - GPT-4 Technical report
- 3.14 - Text-to-image Diffusion Models in Generative AI: A Survey (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 3.14 - The Lancet - Attention is not all you need: the complicated case of ethically using large language models in healthcare and medicine
- 3.14 - THUDM releases ChatGLM-6B
- 3.14 - Langflow - a UI for LangChain (:octocat:)
- 3.14 - Anthropic: Claude
- 3.14 - Google: PaLM API & Workspace
- 3.14 - OpenAI: GPT-4
- 3.13 - Stanford Alpaca 7B
- 3.13 - Microsoft lays off team that taught employees how to make AI tools responsibly
- 3.13 - MiniLLM: Large Language Models on Consumer GPUs
- 3.13 - Chatbot UI (:octocat:)
- 3.12 - Towards General Purpose Medical AI: Continual Learning Medical Foundation Model (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 3.12 - GM explores using ChatGPT in vehicles
- 3.10 - Google: PaLM-E
- 3.9 - multi-model playground - https://nat.dev
- 3.9 - GPT-4 is coming next week
- 3.8 - Visual ChatGPT: Talking, Drawing and Editing with Visual Foundation Models (โ), (๐), (๐), (โณ๏ธ), (SS), (:octocat:)
- 3.8 - NYT, Opinion - Noam Chomsky: The False Promise of ChatGPT (archive)
- 3.7 - A Comprehensive Survey of AI-Generated Content (AIGC): A History of Generative AI from GAN to ChatGPT (โ), (๐), (๐)
- 3.7 - Radiology - The Role and Limitations of Large Language Models Such as ChatGPT in Clinical Settings and Medical Journalism
- 3.7 - Stability AI Acquires Image Editing App Clipdrop
- 3.6 - Google: Universal Speech Model
- 3.5 - Generative AI: Perspectives from Stanford HAI
- 3.5 - UpStage, ChatGPT bot (Askup) on Line
- 3.5 - UpStage, ChatGPT bot (Askup) on KakaoTalk
- 3.2 - Consistency Models (โ), (๐), (๐), (:octocat:)
- 3.1 - Almanac: Retrieval-Augmented Language Models for Clinical Medicine (โ), (๐), (๐), (โณ๏ธ)
- 3.1 - OpenAI: ChatGPT and Whisper API
- 2.28 - Large Language Models Are State-of-the-Art Evaluators of Translation Quality (โ), (๐), (๐)
- 2.27 - Best Practices for Using AI When Writing Scientific Manuscripts (ACS Nano 2023, 17, 5, 4091–4093)
- 2.27 - Fighting “Woke AI,” Musk Recruits Team to Develop OpenAI Rival
- 2.25 - The Lancet - The promise of large language models in health care
- 2.25 - AugGPT: Leveraging ChatGPT for Text Data Augmentation (โ), (๐), (๐)
- 2.24 - Sam Altman, Planning for AGI and beyond
- 2.24 - Meta: LLaMA
- 2.23 - Language Is Not All You Need: Aligning Perception with Language Models (โ), (๐), (๐), (๐), (๐ ), (HTML), (โณ๏ธ), (:octocat:)
- 2.23 - Radiology - ChatGPT and the Future of Medical Writing
- 2.23 - Instagram co-founders launch AI-powered news app Artifact on Android, iOS
- 2.23 - Notion.AI launch
- 2.22 - The alignment problem from a deep learning perspective (โ), (๐), (๐)
- 2.22 - Microsoft: Bing announcement on mobile and Skype
- 2.22 - Science - As scientists explore AI-written text, journals hammer out policies
- 2.21 - BadGPT: Exploring Security Vulnerabilities of ChatGPT via Backdoor Attacks to InstructGPT (โ), (๐), (๐)
- 2.21 - Hyena Hierarchy: Towards Larger Convolutional Language Models (โ), (๐), (๐)
- 2.21 - The PNAS Journals Outline Their Policies for ChatGPT and Generative AI
- 2.21 - ChatGPT: Jack of all trades, master of none (โ), (๐), (๐)
- 2.20 - ChatGPT for Robotics: Design Principles and Model Abilities (โ), (๐), (๐), (โณ๏ธ), (:octocat:)
- 2.18 - A Comprehensive Survey on Pretrained Foundation Models: A History from BERT to ChatGPT (โ), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 2.17 - Complex QA and language models hybrid architectures, Survey (โ), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 2.17 - Time, ChatGPT cover
- 2.17 - OpenAI, Foundry Product Brief
- 2.17 - Generative AI on Roblox: Our Vision for the Future of Creation
- 2.16 - Auditing large language models: a three-layered approach (โ), (๐), (๐), (๐ ), (โณ๏ธ), (SS)
- 2.16 - Do We Still Need Clinical Language Models? (โ), (๐), (๐)
- 2.16 - Startup Replit launches a ChatGPT-like bot for coders
- 2.15 - A&O announces exclusive launch partnership with Harvey
- 2.14 - ChatCAD: Interactive Computer-Aided Diagnosis on Medical Image using Large Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 2.14 - What Is ChatGPT Doing … and Why Does It Work? (Stephen Wolfram Writings)
- 2.14 - 1M ChatGPT Plus users
- 2.14 - The Gen AI Conference Hosted by Jasper
- 2.13 - Google: Vision Transformer 22B
- 2.12 - Transformer models: an introduction and catalog (โ), (๐), (๐), (Blog)
- 2.10 - arXivGPT launches
- 2.10 - OpenAI announces ChatGPT Plus ($20/month)
- 2.9 - Disastrous Chatbot Demo Costs Google $140 Billion
- 2.9 - Meta: Toolformer
- 2.8 - A Multitask, Multilingual, Multimodal Evaluation of ChatGPT on Reasoning, Hallucination, and Interactivity (โ), (๐), (๐)
- 2.8 - Runway launches ground-breaking Gen-1 video generation AI system
- 2.7 - Microsoft: Bing ChatGPT
- 2.7 - Getty Images sues AI art generator Stable Diffusion in the US for copyright infringement
- 2.6 - A Categorical Archive of ChatGPT Failures (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 2.6 - The Lancet - ChatGPT: friend or foe?
- 2.6 - Google: Bard announcement
- 2.4 - Theory of Mind May Have Spontaneously Emerged in Large Language Models (โ), (๐), (๐)
- 2.4 - POE.com open
- 2.3 - Google invests in Anthropic, maker of ChatGPT rival
- 2.3 - Naver, SearchGPT announcement
- 2.2 - Creating a Large Language Model of a Philosopher (โ), (๐), (๐)
- 2.2 - ChatGPT reaches 100 million users two months after launch
- 2.1 - The Diagnostic and Triage Accuracy of the GPT-3 Artificial Intelligence Model (medRxiv)
- 2.1 - OpenAI releases a software tool to help identify text generated by AI
- 1.31 - The Flan Collection: Designing Data and Methods for Effective Instruction Tuning (blog), (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 1.31 - JAMA Network - Nonhuman “Authors” and Implications for the Integrity of Scientific Publication and Medical Knowledge
- 1.30 - SingSong: Generating musical accompaniments from singing (โ), (๐), (๐), (:octocat:)
- 1.30 - China's biggest search engine is set to launch a ChatGPT rival in March
- 1.26 - Science Journal - ChatGPT is fun, but not an author
- 1.26 - DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature (โ), (๐), (๐)
- 1.26 - ChatGPT Is Coming for Classrooms. Don't Panic
- 1.26 - ChatGPT passes exams from law and business schools
- 1.26 - Google's new AI turns text into music - MusicLM
- 1.24 - Putting ChatGPT's Medical Advice to the (Turing) Test (โ), (๐), (๐)
- 1.24 - Nature policy - Tools such as ChatGPT threaten transparent science; here are our ground rules for their use
- 1.20 - WAME policy - Chatbots, ChatGPT, and Scholarly Manuscripts
- 1.17 - Meet Claude: Anthropic's Rival to ChatGPT
- 1.14 - Microsoft in talks to acquire a 49% stake in ChatGPT owner OpenAI
- 1.12 - Multimodal Deep Learning (โ), (๐), (๐)
- 1.11 - This Voice Doesn't Exist - Generative Voice AI
- 1.9 - Microsoft is looking at OpenAI's GPT for Word, Outlook, and PowerPoint
- 1.5 - Apple launches AI-powered book narrations
- 1.5 - Microsoft, VALL-E
- 1.4 - ICML conference responds to LLM ethics rule
- 1.3 - Enter GPTZero
2022
- 12.29 - GPT Takes the Bar Exam (โ), (๐), (๐), (SS)
- 12.27 - bioRxiv - Comparing scientific abstracts generated by ChatGPT to original abstracts using an artificial intelligence output detector, plagiarism detector, and blinded human reviewers
- 12.26 - A large language model for electronic health records (Nature https://doi.org/10.1038/s41746-022-00742-2), (PDF)
- 12.26 - How Well Does ChatGPT Do When Taking the Medical Licensing Exams? The Implications of Large Language Models for Medical Education and Knowledge Assessment (medRxiv), (PDF)
- 12.20 - Towards Reasoning in Large Language Models: A Survey (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (SS)
- 12.15 - Constitutional AI: Harmlessness from AI Feedback (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:)
- 12.3 - The Role of Generative Adversarial Network in Medical Image Analysis: An In-depth Survey (ACM, https://doi.org/10.1145/3527849), (PDF)
- 11.30 - OpenAI, ChatGPT service
- 11.29 - MegaBlocks: Efficient Sparse Training with Mixture-of-Experts (โ), (๐), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.28 - Fine-tuning language models to find agreement among humans with diverse preferences (โ), (๐), (๐), (๐ ), (โณ๏ธ)
- 11.28 - NeurIPS 2022 conference
- 11.21 - VectorFusion: Text-to-SVG by Abstracting Pixel-Based Diffusion Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (CVPR 2023)
- 11.17 - InstructPix2Pix: Learning to Follow Image Editing Instructions
- 11.16 - Holistic Evaluation of Language Models (โ), (๐), (๐), (๐ ), (โณ๏ธ), (:octocat:), (SS)
- 11.14 - Diffusion Models for Medical Image Analysis: A Comprehensive Survey (โ), (๐), (๐), (โณ๏ธ), (:octocat:)
- 11.3 - Large Language Models Are Human-Level Prompt Engineers (โ), (๐), (๐), (โณ๏ธ), (:octocat:)
- 11.1 - MedSegDiff: Medical Image Segmentation with Diffusion Probabilistic Model (โ), (๐), (๐), (โณ๏ธ), (:octocat:)
- 10.30 - LlamaIndex (GPT Index) GitHub project
- 10.23 - LangChain GitHub project
- 9.27 - What Does DALL-E 2 Know About Radiology? (JMIR), (โ), (๐), (๐), (โณ๏ธ)
- 9.19 - SEQUOIA - Generative AI: A Creative New World
- 9.15 - Brain Imaging Generation with Latent Diffusion Models (โ), (๐), (๐), (โณ๏ธ)
- 9.6 - A Survey on Generative Diffusion Model (โ), (๐), (๐), (โณ๏ธ), (:octocat:)
- 8.25 - Understanding Diffusion Models: A Unified Perspective (โ), (๐), (๐), (Blog)
- 7.4 - Shifting machine learning for healthcare from development to deployment and from models to data (Nature Biomedical Engineering, https://doi.org/10.1038/s41551-022-00898-y), (PDF)
- 3.29 - Training Compute-Optimal Large Language Models (โ), (๐), (๐), (๐ )
- 3.15 - OpenAI, GPT-3.5 announcement
- 2.11 - Compute Trends Across Three Eras of Machine Learning (โ), (๐), (๐)
- 2.8 - โญ Survey of Hallucination in Natural Language Generation (โ), (๐), (๐), (โณ๏ธ), (SS)
- 1.28 - Chain-of-Thought Prompting Elicits Reasoning in Large Language Models (โ), (๐), (๐), (โณ๏ธ), (SS)
2021
- 12.8 - Ethical and social risks of harm from Language Models (โ), (๐), (๐), (โณ๏ธ), (SS)
- 10.19 - Future directions for chatbot research: an interdisciplinary research agenda (paper)
- 8.16 - โญ On the Opportunities and Risks of Foundation Models (โ), (๐), (๐), (โณ๏ธ)
- 6.15 - Synthetic data in machine learning for medicine and healthcare (Nature Biomedical Engineering, https://doi.org/10.1038/s41551-021-00751-8), (PDF)
- 4.18 - The Power of Scale for Parameter-Efficient Prompt Tuning (โ), (๐), (๐)
Additional Links