ollama

Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.

MIT License

Downloads: 219
Stars: 92.5K
Committers: 310

ollama - v0.1.29

Published by jmorganca 7 months ago

AMD Preview

Ollama now supports AMD graphics cards in preview on Windows and Linux. All of Ollama's features can now be accelerated by AMD graphics cards, and support is included by default in Ollama for Linux, Windows and Docker.

Supported cards and accelerators

AMD Radeon RX: 7900 XTX, 7900 XT, 7900 GRE, 7800 XT, 7700 XT, 7600 XT, 7600, 6950 XT, 6900 XTX, 6900 XT, 6800 XT, 6800, Vega 64, Vega 56
AMD Radeon PRO: W7900, W7800, W7700, W7600, W7500, W6900X, W6800X Duo, W6800X, W6800, V620, V420, V340, V320, Vega II Duo, Vega II, VII, SSG
AMD Instinct: MI300X, MI300A, MI300, MI250X, MI250, MI210, MI200, MI100, MI60, MI50

What's Changed

  • ollama <command> -h will now show documentation for supported environment variables
  • Fixed issue where generating embeddings with nomic-embed-text, all-minilm or other embedding models would hang on Linux
  • Experimental support for importing Safetensors models using the FROM <directory with safetensors model> command in the Modelfile
  • Fixed issues where Ollama would hang when using JSON mode.
  • Fixed issue where ollama run would error when piping output to tee and other tools
  • Fixed an issue where memory would not be released when running vision models
  • Ollama will no longer show an error message when piping to stdin on Windows
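
The experimental Safetensors import above can be sketched as a minimal Modelfile. The `./mistral-safetensors` directory and `my-mistral` model name are hypothetical placeholders:

```shell
# Write a Modelfile importing a local Safetensors model directory.
# ./mistral-safetensors is a hypothetical path holding config.json and *.safetensors files.
cat > Modelfile <<'EOF'
FROM ./mistral-safetensors
EOF

# Build the model only if the ollama CLI is present (guarded so the snippet runs standalone)
if command -v ollama >/dev/null 2>&1; then
  ollama create my-mistral -f Modelfile || true
fi
```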

Full Changelog: https://github.com/ollama/ollama/compare/v0.1.28...v0.1.29

ollama - v0.1.28

Published by jmorganca 8 months ago

What's Changed

  • Vision models such as llava should now respond better to text prompts
  • Improved support for llava 1.6 models
  • Fixed issue where switching between models repeatedly would cause Ollama to hang
  • Installing Ollama on Windows no longer requires a minimum of 4GB disk space (but remember: models are big)!
  • Ollama on macOS will now more reliably determine available VRAM
  • Fixed issue where running Ollama in podman would not detect Nvidia GPUs
  • Ollama will correctly return an empty embedding when calling /api/embeddings with an empty prompt instead of hanging

Full Changelog: https://github.com/ollama/ollama/compare/v0.1.27...v0.1.28

ollama - v0.1.27

Published by jmorganca 8 months ago

Gemma

Gemma is a new, top-performing family of lightweight open models built by Google DeepMind. Available in 2b and 7b parameter sizes:

  • ollama run gemma:2b
  • ollama run gemma:7b (default)

What's Changed

  • Fixed performance issues when running the Gemma model
  • Fixed performance issues on Windows CPU. Systems with AVX and AVX2 should be 2-4 times faster.
  • Reduced likelihood of false positive Windows Defender alerts on Windows.

Full Changelog: https://github.com/ollama/ollama/compare/v0.1.26...v0.1.27

ollama - v0.1.26

Published by jmorganca 8 months ago

What's Changed

  • Support for bert and nomic-bert embedding models
  • Fixed issue where system prompt and prompt template would not be updated when loading a new model
  • Quotes will now be trimmed around the value of OLLAMA_HOST on Windows
  • Fixed duplicate button issue on the Windows taskbar menu.
  • Fixed issue where system prompt would be overridden when using the /api/chat endpoint
  • Hardened AMD driver lookup logic
  • Fixed issue where two versions of Ollama on Windows would run at the same time
  • Fixed issue where memory would not be released after a model is unloaded with modern CUDA-enabled GPUs
  • Fixed issue where AVX2 was incorrectly required for GPU acceleration on Windows
  • Fixed issue where /bye or /exit would not work with trailing spaces or characters after them

Full Changelog: https://github.com/ollama/ollama/compare/v0.1.25...v0.1.26

ollama - v0.1.25

Published by jmorganca 8 months ago

Windows Preview

Ollama is now available on Windows in preview, and can be downloaded from the Ollama website. Ollama on Windows makes it possible to pull, run and create large language models in a new native Windows experience. It includes built-in GPU acceleration, access to the full model library, and the Ollama API, including OpenAI compatibility.

What's Changed

  • Ollama on Windows is now available in preview.
  • Fixed an issue where requests would hang after being repeated several times
  • Ollama will now correctly error when provided an unsupported image format
  • Fixed issue where ollama serve wouldn't immediately quit when receiving a termination signal
  • Fixed issues with prompt templating for the /api/chat endpoint, such as where Ollama would omit the second system prompt in a series of messages
  • Fixed issue where providing an empty list of messages would return a non-empty response instead of loading the model
  • Setting a negative keep_alive value (e.g. -1) will now correctly keep the model loaded indefinitely

Full Changelog: https://github.com/ollama/ollama/compare/v0.1.24...v0.1.25

ollama - v0.1.24

Published by jmorganca 8 months ago

OpenAI Compatibility

This release adds initial compatibility support for the OpenAI Chat Completions API.

Usage with cURL

curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "llama2",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Hello!"
            }
        ]
    }'

New Models

  • Qwen 1.5: Qwen 1.5 is a new family of large language models by Alibaba Cloud spanning from 0.5B to 72B.

What's Changed

  • Fixed issue where requests to /api/chat would hang when providing empty user messages repeatedly
  • Fixed issue on macOS where Ollama would return a missing library error after being open for a long period of time

Full Changelog: https://github.com/ollama/ollama/compare/v0.1.23...v0.1.24

ollama - v0.1.23

Published by jmorganca 9 months ago

New vision models

The LLaVA model family on Ollama has been updated to version 1.6, and now includes a new 34b version:

  • ollama run llava – a new 7B LLaVA model based on Mistral
  • ollama run llava:13b – 13B LLaVA model
  • ollama run llava:34b – 34B LLaVA model, one of the most powerful open-source vision models available

These new models share several improvements:

  • More permissive licenses: LLaVA 1.6 models are distributed via the Apache 2.0 license or the LLaMA 2 Community License.
  • Higher image resolution: support for up to 4x more pixels, allowing the model to grasp more details.
  • Improved text recognition and reasoning capabilities: these models are trained on additional document, chart and diagram data sets.

keep_alive parameter: control how long models stay loaded

When making API requests, the new keep_alive parameter can be used to control how long a model stays loaded in memory:

curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Why is the sky blue?",
  "keep_alive": "30s"
}'
  • If set to a positive duration (e.g. 20m, 1h or 30), the model will stay loaded for the provided duration
  • If set to a negative duration (e.g. -1), the model will stay loaded indefinitely
  • If set to 0, the model will be unloaded immediately once finished
  • If not set, the model will stay loaded for 5 minutes by default

Support for more Nvidia GPUs

  • GeForce GTX TITAN X 980 Ti 980 970 960 950 750 Ti 750
  • GeForce GTX 980M 970M 965M 960M 950M 860M 850M
  • GeForce 940M 930M 910M 840M 830M
  • Quadro M6000 M5500M M5000 M2200 M1200 M620 M520
  • Tesla M60 M40
  • NVS 810

What's Changed

  • New keep_alive API parameter to control how long models stay loaded
  • Image paths can now be provided to ollama run when running multimodal models
  • Fixed issue where downloading models via ollama pull would slow down to 99%
  • Fixed error when running Ollama with Nvidia GPUs and CPUs without AVX instructions
  • Support for additional Nvidia GPUs (compute capability 5)
  • Fixed issue where system prompt would be repeated in subsequent messages
  • ollama serve will now print prompt when OLLAMA_DEBUG=1 is set
  • Fixed issue where exceeding context size would cause erroneous responses in ollama run and the /api/chat API
  • ollama run will now allow sending messages without images to multimodal models

Full Changelog: https://github.com/ollama/ollama/compare/v0.1.22...v0.1.23

ollama - v0.1.22

Published by jmorganca 9 months ago

New models

  • Stable LM 2: A state-of-the-art 1.6B small language model.

What's Changed

  • Fixed issue with Nvidia GPU detection that would cause Ollama to error instead of falling back to CPU
  • Fixed issue where AMD integrated GPUs caused an error

Full Changelog: https://github.com/ollama/ollama/compare/v0.1.21...v0.1.22

ollama - v0.1.21

Published by jmorganca 9 months ago

New models

  • Qwen: Qwen is a series of large language models by Alibaba Cloud spanning from 1.8B to 72B parameters.
  • Stable Code: A new code completion model on par with Code Llama 7B and similar models.
  • Nous Hermes 2 Mixtral: The Nous Hermes 2 model from Nous Research, now trained over Mixtral.

CPU improvements

Ollama now supports CPUs without AVX. This means Ollama will now run on older CPUs and in environments (such as virtual machines, Rosetta, and GitHub Actions) that don't support AVX instructions.

For newer CPUs that support AVX2, Ollama will receive a small performance boost, running models about 10% faster.
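
As a quick diagnostic (not part of Ollama itself, which performs this detection automatically), you can check which of these instruction sets a Linux CPU advertises:

```shell
# Check which SIMD level this CPU advertises. Linux exposes this in /proc/cpuinfo;
# on other platforms the read fails and we fall through to the last case.
CPU_FLAGS=$(grep -m1 '^flags' /proc/cpuinfo 2>/dev/null || echo unknown)
case "$CPU_FLAGS" in
  *avx2*) echo "AVX2 supported: fastest CPU path" ;;
  *avx*)  echo "AVX supported" ;;
  *)      echo "No AVX detected: Ollama still runs, using the slower fallback path" ;;
esac
```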

What's Changed

  • Support for a much broader set of CPUs, including CPUs without AVX instruction set support
  • If a GPU error is hit when attempting to run a model, Ollama will fall back to CPU
  • Ollama will now use AVX2 for faster performance if available
  • Improved detection of Nvidia GPUs, especially in WSL
  • Fixed issue where models with LoRA layers may not load
  • Fixed incorrect error that would occur when retrying network connections in ollama pull and ollama push
  • Fixed issue where /show parameter would round decimal numbers
  • Fixed an issue that occurred upon hitting the context window limit
  • Fixed issue where generating responses would hang after around 20 requests

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.20...v0.1.21

ollama - v0.1.20

Published by jmorganca 9 months ago

New models

  • MegaDolphin: A new 120B version of the Dolphin model.
  • OpenChat: Updated to the latest version 3.5-0106.
  • Dolphin Mistral: Updated to the latest DPO Laser version, which achieves higher scores with more robust outputs.

What's Changed

  • Fixed additional cases where Ollama would fail with out of memory CUDA errors
  • Multi-GPU machines will now correctly allocate memory across all GPUs
  • Fixed issue where Nvidia GPUs would not be detected by Ollama

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.19...v0.1.20

ollama - v0.1.19

Published by jmorganca 9 months ago

This release focuses on performance and on fixing a number of issues and crashes relating to memory allocation.

New Models

  • LLaMA Pro: An expansion of Llama by Tencent to an 8B model that specializes in language, programming and mathematics.

What's Changed

  • Fixed "out of memory" errors when running models such as llama2, mixtral or llama2:13b with limited GPU memory
  • Fixed CUDA errors when running on older GPUs that aren't yet supported
  • Increasing the context size with num_ctx now works (up to a model's supported context window).

To use a 32K context window with Mistral:

# ollama run
/set parameter num_ctx 32768

# api
curl http://localhost:11434/api/generate -d '{
  "model": "mistral",
  "prompt": "Why is the sky blue?",
  "options": {"num_ctx": 32768}
}'
  • Larger models such as mixtral can now be run on Macs with less memory
  • Fixed an issue where pressing up or down arrow keys would cause the wrong prompt to show in ollama run
  • Fixed performance issues on Intel Macs
  • Fixed an error that would occur with old Nvidia GPUs
  • OLLAMA_ORIGINS now supports browser extension URLs
  • Ollama will now offload more processing to the GPU where possible
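
The OLLAMA_ORIGINS change above can be sketched like this; the wildcard extension pattern is illustrative, and a specific extension ID can be used instead:

```shell
# Allow a browser extension origin to call the Ollama API.
# The wildcard pattern below is illustrative, not a specific extension.
export OLLAMA_ORIGINS="chrome-extension://*"
echo "OLLAMA_ORIGINS=$OLLAMA_ORIGINS"
# Then start the server with the variable in scope: ollama serve
```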

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.18...v0.1.19

ollama - v0.1.18

Published by jmorganca 10 months ago

New models

  • TinyLlama: A compact 1.1B Llama model trained on 3 trillion tokens
  • OpenHermes 2: A 7B model, fine-tuned on Mistral with strong multi-turn chat skills and system prompt capabilities.
  • WizardCoder 33B: a new 33B state of the art code generation model: ollama run wizardcoder:33b
  • Dolphin Phi: a 2.7B uncensored model, based on the Phi language model by Microsoft Research

What's Changed

  • Added /? shortcuts help command to ollama run to list keyboard shortcuts
  • Improved performance when sending follow up messages in ollama run or via the API.
  • Fixed issues where certain 7B models would error on GPUs with 4GB of memory or less
  • Fixed issue where Llava model prompts couldn't start with a file path
  • Fixed issue where model would not be correctly reloaded if options or parameters changed between requests
  • Ollama will now automatically pull new models when running older ggml format models. If using custom ggml format models in a Modelfile, please import GGUF models instead.

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.17...v0.1.18

ollama - v0.1.17

Published by jmorganca 10 months ago

Phi-2

This release adds support for the Phi-2 model by Microsoft.

ollama run phi

Phi-2 is a new, powerful 2.7B model with strong reasoning and language understanding capabilities comparable to much larger 13B models. Given its small size, it will run effectively on a wide range of hardware configurations.

Example prompt

By default, phi includes a prompt template designed for multi-turn conversations:

% ollama run phi
>>> Hello, can you help me find my way to Toronto?
 Certainly! What is the exact location in Toronto that you are looking for?

>>> Yonge & Bloor
 Sure, Yonge and Bloor is a busy intersection in downtown Toronto. Would you like to take public transportation or drive there?

>>> Public transportation
 Great! The easiest way to get there is by taking the TTC subway. You can take Line 1, which runs along Yonge Street and passes through downtown Toronto.

Using Ollama's API:

curl http://localhost:11434/api/chat -d '{
  "model": "phi",
  "messages": [
    { "role": "user", "content": "why is the sky blue?" }
  ]
}'

Example prompts (raw mode)

Phi also responds well to a wide variety of prompt formats when using raw mode in Ollama's API, which bypasses all default prompt templating:

Instruct

curl http://localhost:11434/api/generate -d '{
  "model": "phi",
  "prompt": "Instruct: Write a detailed analogy between mathematics and a lighthouse.\nOutput:",
  "options": {
    "stop": ["Instruct:", "Output:"]
  },
  "raw": true,
  "stream": false
}'

Code Completion

curl http://localhost:11434/api/generate -d '{
  "model": "phi",
  "prompt": "def print_prime(n):\n  ",
  "raw": true,
  "stream": false
}'

Text completion

curl http://localhost:11434/api/generate -d '{
  "model": "phi",
  "prompt": "There once was a mouse named",
  "raw": true,
  "stream": false
}'

New Models

  • Phi-2: A versatile 2.7B model by Microsoft with outstanding reasoning and language understanding capabilities.
  • Solar: A compact, yet powerful 10.7B large language model designed for single-turn conversation.
  • OpenChat: Updated to OpenChat-3.5-1210, this new version of the 7B model excels at coding tasks and scores very high on many open-source LLM benchmarks.
  • Wizard Math: Updated to WizardMath v1.1, this 7B model excels at math and logical reasoning and is now based on Mistral

What's Changed

  • Fixed issues where message objects in /api/chat would return "images": null in the response
  • /api/chat now always returns a message object, even if content is an empty string

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.16...v0.1.17

ollama - v0.1.16

Published by jmorganca 10 months ago

This release adds support for Mixtral and other models with the mixture of experts architecture:

ollama run jmorgan/mixtral

New models

  • Mixtral: A high-quality mixture of experts model with open weights.
  • Dolphin Mixtral: An uncensored, fine-tuned model based on the Mixtral mixture of experts model that excels at coding tasks.

What's Changed

  • Add support for mixture of experts (MoE) and Mixtral

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.15...v0.1.16

ollama - v0.1.15

Published by jmorganca 10 months ago

Multimodal model support

Ollama now supports multimodal models that can describe what they see. To use a multimodal model with ollama run, include the full path of a png or jpeg image in the prompt:

% ollama run llava
>>> What does the text in this image say? /Users/mchiang/Downloads/image.png 
Added image '/Users/mchiang/Downloads/image.png'

The text in this image says "The Ollamas."

API usage

A new images parameter has been added to the Generate API, which takes a list of base64-encoded png or jpeg images. Images up to 100MB in size are supported.

curl http://localhost:11434/api/generate -d '{
  "model": "llava",
  "prompt":"What is in this picture?",
  "images": ["iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUjVUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS4jDS
sxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF169a2HoHPdurUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrUwKPq
smVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"]
}'
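
In practice the base64 payload can be generated from a local file rather than pasted inline. A minimal sketch (a placeholder file stands in for a real image; substitute your own PNG or JPEG):

```shell
# Encode a file as a single-line base64 string, as the "images" field expects.
# The placeholder content below stands in for real image bytes.
printf 'not-a-real-png' > photo.png
IMG=$(base64 < photo.png | tr -d '\n')
echo "$IMG"
# Then POST it (requires a running Ollama server):
# curl http://localhost:11434/api/generate \
#   -d "{\"model\": \"llava\", \"prompt\": \"What is in this picture?\", \"images\": [\"$IMG\"]}"
```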

With the new Chat API introduced in version 0.1.14, images can also be added to messages from the user role:

curl http://localhost:11434/api/chat -d '{
  "model": "llava",
  "messages": [
    {
      "role": "user",
      "content": "What is in this picture?",
      "images": ["iVBORw0KGgoAAAANSUhEUgAAAG0AAABmCAYAAADBPx+VAAAACXBIWXMAAAsTAAALEwEAmpwYAAAAAXNSR0IArs4c6QAAAARnQU1BAACxjwv8YQUAAA3VSURBVHgB7Z27r0zdG8fX743i1bi1ikMoFMQloXRpKFFIqI7LH4BEQ+NWIkjQuSWCRIEoULk0gsK1kCBI0IhrQVT7tz/7zZo888yz1r7MnDl7z5xvsjkzs2fP3uu71nNfa7lkAsm7d++Sffv2JbNmzUqcc8m0adOSzZs3Z+/XES4ZckAWJEGWPiCxjsQNLWmQsWjRIpMseaxcuTKpG/7HP27I8P79e7dq1ars/yL4/v27S0ejqwv+cUOGEGGpKHR37tzJCEpHV9tnT58+dXXCJDdECBE2Ojrqjh071hpNECjx4cMHVycM1Uhbv359B2F79+51586daxN/+pyRkRFXKyRDAqxEp4yMlDDzXG1NPnnyJKkThoK0VFd1ELZu3TrzXKxKfW7dMBQ6bcuWLW2v0VlHjx41z717927ba22U9APcw7Nnz1oGEPeL3m3p2mTAYYnFmMOMXybPPXv2bNIPpFZr1NHn4HMw0KRBjg9NuRw95s8PEcz/6DZELQd/09C9QGq5RsmSRybqkwHGjh07OsJSsYYm3ijPpyHzoiacg35MLdDSIS/O1yM778jOTwYUkKNHWUzUWaOsylE00MyI0fcnOwIdjvtNdW/HZwNLGg+sR1kMepSNJXmIwxBZiG8tDTpEZzKg0GItNsosY8USkxDhD0Rinuiko2gfL/RbiD2LZAjU9zKQJj8RDR0vJBR1/Phx9+PHj9Z7REF4nTZkxzX4LCXHrV271qXkBAPGfP/atWvu/PnzHe4C97F48eIsRLZ9+3a3f/9+87dwP1JxaF7/3r17ba+5l4EcaVo0lj3SBq5kGTJSQmLWMjgYNei2GPT1MuMqGTDEFHzeQSP2wi/jGnkmPJ/nhccs44jvDAxpVcxnq0F6eT8h4ni/iIWpR5lPyA6ETkNXoSukvpJAD3AsXLiwpZs49+fPn5ke4j10TqYvegSfn0OnafC+Tv9ooA/JPkgQysqQNBzagXY55nO/oa1F7qvIPWkRL12WRpMWUvpVDYmxAPehxWSe8ZEXL20sadYIozfmNch4QJPAfeJgW3rNsnzphBKNJM2KKODo1rVOMRYik5ETy3ix4qWNI81qAAirizgMIc+yhTytx0JWZuNI03qsrgWlGtwjoS9XwgUhWGyhUaRZZQNNIEwCiXD16tXcAHUs79co0vSD8rrJCIW98pzvxpAWyyo3HYwqS0+H0BjStClcZJT5coMm6D2LOF8TolGJtK9fvyZpyiC5ePFi9nc/oJU4eiEP0jVoAnHa9wyJycITMP78+eMeP37sXrx44d6+fdt6f82aNdkx1pg9e3Zb5W+RSRE+n+VjksQWifvVaTKFhn5O8my63K8Qabdv33b379/PiAP//vuvW7BggZszZ072/+TJk91YgkafPn166zXB1rQHFvouAWHq9z3SEevSUerqCn2/dDCeta2jxYbr69evk4MHDyY7d+7MjhMnTiTPnz9Pfv/+nfQT2ggpO2dMF8cghuoM7Ygj5iWCqRlGFml0QC/ftGmTmzt3rmsaKDsgBSPh0/8yPeLLBihLkOKJc0jp8H8vUzcxIA1k6QJ/c78tWEyj5P3o4u9+jywNPdJi5rAH9x0KHcl4Hg570eQp3+vHXGyrmEeigzQsQsjavXt38ujRo44LQuDDhw+TW7duRS1HGgMxhNXHgflaNTOsHyKvHK5Ijo2jbFjJBQK9YwFd6RVMzfgRBmEfP37suBBm/p49e1qjEP2mwTViNRo0VJWH1deMXcNK08uUjVUu7s/zRaL+oLNxz1bpANco4npUgX4G2eFbpDFyQoQxojBCpEGSytmOH8qrH5Q9vuzD6ofQylkCUmh8DBAr+q8JCyVNtWQIidKQE9wNtLSQnS
4jDSsxNHogzFuQBw4cyM61UKVsjfr3ooBkPSqqQHesUPWVtzi9/vQi1T+rJj7WiTz4Pt/l3LxUkr5P2VYZaZ4URpsE+st/dujQoaBBYokbrz/8TJNQYLSonrPS9kUaSkPeZyj1AWSj+d+VBoy1pIWVNed8P0Ll/ee5HdGRhrHhR5GGN0r4LGZBaj8oFDJitBTJzIZgFcmU0Y8ytWMZMzJOaXUSrUs5RxKnrxmbb5YXO9VGUhtpXldhEUogFr3IzIsvlpmdosVcGVGXFWp2oU9kLFL3dEkSz6NHEY1sjSRdIuDFWEhd8KxFqsRi1uM/nz9/zpxnwlESONdg6dKlbsaMGS4EHFHtjFIDHwKOo46l4TxSuxgDzi+rE2jg+BaFruOX4HXa0Nnf1lwAPufZeF8/r6zD97WK2qFnGjBxTw5qNGPxT+5T/r7/7RawFC3j4vTp09koCxkeHjqbHJqArmH5UrFKKksnxrK7FuRIs8STfBZv+luugXZ2pR/pP9Ois4z+TiMzUUkUjD0iEi1fzX8GmXyuxUBRcaUfykV0YZnlJGKQpOiGB76x5GeWkWWJc3mOrK6S7xdND+W5N6XyaRgtWJFe13GkaZnKOsYqGdOVVVbGupsyA/l7emTLHi7vwTdirNEt0qxnzAvBFcnQF16xh/TMpUuXHDowhlA9vQVraQhkudRdzOnK+04ZSP3DUhVSP61YsaLtd/ks7ZgtPcXqPqEafHkdqa84X6aCeL7YWlv6edGFHb+ZFICPlljHhg0bKuk0CSvVznWsotRu433alNdFrqG45ejoaPCaUkWERpLXjzFL2Rpllp7PJU2a/v7Ab8N05/9t27Z16KUqoFGsxnI9EosS2niSYg9SpU6B4JgTrvVW1flt1sT+0ADIJU2maXzcUTraGCRaL1Wp9rUMk16PMom8QhruxzvZIegJjFU7LLCePfS8uaQdPny4jTTL0dbee5mYokQsXTIWNY46kuMbnt8Kmec+LGWtOVIl9cT1rCB0V8WqkjAsRwta93TbwNYoGKsUSChN44lgBNCoHLHzquYKrU6qZ8lolCIN0Rh6cP0Q3U6I6IXILYOQI513hJaSKAorFpuHXJNfVlpRtmYBk1Su1obZr5dnKAO+L10Hrj3WZW+E3qh6IszE37F6EB+68mGpvKm4eb9bFrlzrok7fvr0Kfv727dvWRmdVTJHw0qiiCUSZ6wCK+7XL/AcsgNyL74DQQ730sv78Su7+t/A36MdY0sW5o40ahslXr58aZ5HtZB8GH64m9EmMZ7FpYw4T6QnrZfgenrhFxaSiSGXtPnz57e9TkNZLvTjeqhr734CNtrK41L40sUQckmj1lGKQ0rC37x544r8eNXRpnVE3ZZY7zXo8NomiO0ZUCj2uHz58rbXoZ6gc0uA+F6ZeKS/jhRDUq8MKrTho9fEkihMmhxtBI1DxKFY9XLpVcSkfoi8JGnToZO5sU5aiDQIW716ddt7ZLYtMQlhECdBGXZZMWldY5BHm5xgAroWj4C0hbYkSc/jBmggIrXJWlZM6pSETsEPGqZOndr2uuuR5rF169a2HoHPdurUKZM4CO1WTPqaDaAd+GFGKdIQkxAn9RuEWcTRyN2KSUgiSgF5aWzPTeA/lN5rZubMmR2bE4SIC4nJoltgAV/dVefZm72AtctUCJU2CMJ327hxY9t7EHbkyJFseq+EJSY16RPo3Dkq1kkr7+q0bNmyDuLQcZBEPYmHVdOBiJyIlrRDq41YPWfXOxUysi5fvtyaj+2BpcnsUV/oSoEMOk2CQGlr4ckhBwaetBhjCwH0ZHtJROPJkyc7UjcYLDjmrH7ADTEBXFfOYmB0k9oYBOjJ8b4aOYSe7QkKcYhFlq3QYLQhSidNmtS2RATwy8YOM3EQJsUjKiaWZ+vZToUQgzhkHXudb/PW5YMHD9yZM2faPsMwoc7RciYJXbGuBqJ1UIGKKLv915jsvgtJxCZDubdXr165mzdvtr1Hz5LONA8jrU
wKPqsmVesKa49S3Q4WxmRPUEYdTjgiUcfUwLx589ySJUva3oMkP6IYddq6HMS4o55xBJBUeRjzfa4Zdeg56QZ43LhxoyPo7Lf1kNt7oO8wWAbNwaYjIv5lhyS7kRf96dvm5Jah8vfvX3flyhX35cuX6HfzFHOToS1H4BenCaHvO8pr8iDuwoUL7tevX+b5ZdbBair0xkFIlFDlW4ZknEClsp/TzXyAKVOmmHWFVSbDNw1l1+4f90U6IY/q4V27dpnE9bJ+v87QEydjqx/UamVVPRG+mwkNTYN+9tjkwzEx+atCm/X9WvWtDtAb68Wy9LXa1UmvCDDIpPkyOQ5ZwSzJ4jMrvFcr0rSjOUh+GcT4LSg5ugkW1Io0/SCDQBojh0hPlaJdah+tkVYrnTZowP8iq1F1TgMBBauufyB33x1v+NWFYmT5KmppgHC+NkAgbmRkpD3yn9QIseXymoTQFGQmIOKTxiZIWpvAatenVqRVXf2nTrAWMsPnKrMZHz6bJq5jvce6QK8J1cQNgKxlJapMPdZSR64/UivS9NztpkVEdKcrs5alhhWP9NeqlfWopzhZScI6QxseegZRGeg5a8C3Re1Mfl1ScP36ddcUaMuv24iOJtz7sbUjTS4qBvKmstYJoUauiuD3k5qhyr7QdUHMeCgLa1Ear9NquemdXgmum4fvJ6w1lqsuDhNrg1qSpleJK7K3TF0Q2jSd94uSZ60kK1e3qyVpQK6PVWXp2/FC3mp6jBhKKOiY2h3gtUV64TWM6wDETRPLDfSakXmH3w8g9Jlug8ZtTt4kVF0kLUYYmCCtD/DrQ5YhMGbA9L3ucdjh0y8kOHW5gU/VEEmJTcL4Pz/f7mgoAbYkAAAAAElFTkSuQmCC"]
    }
  ]
}'

New Models

  • LLaVA: A novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding. Available in 7b and 13b parameter sizes.
  • BakLLaVA: A multimodal model consisting of the Mistral 7B base model augmented with the LLaVA architecture.

What's Changed

  • Support for multi-modal models: LLaVA, BakLLaVA and more.
  • Fixed an issue where /set template and /set system wouldn't work correctly in ollama run
  • The show endpoint will now return model details such as parameter size and quantization level:
curl http://localhost:11434/api/show -d '{ "name": "llava" }'
{
  ...
  "details": {
    "format": "gguf",
    "families": [
      "llama",
      "clip"
    ],
    "parameter_size": "7B",
    "quantization_level": "Q4_0"
  }
}
  • Fixed issue where ctrl-z would not work on Windows

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.14...v0.1.15

ollama - v0.1.14

Published by jmorganca 11 months ago

New Models

  • StableLM Zephyr: A lightweight chat model allowing accurate and responsive output without requiring high-end hardware.
  • Magicoder: a family of 7B parameter models trained on 75K synthetic instruction data using OSS-Instruct, a novel approach to enlightening LLMs with open-source code snippets.

What's Changed

  • New Chat API for sending a history of messages
    curl http://localhost:11434/api/chat -d '{
      "model": "mistral",
      "messages": [
        { "role": "system", "content": "You are a helpful assistant that answers concisely." },
        { "role": "user", "content": "why is the sky blue?" }
      ]
    }'
    
  • Linewrap now works when resizing the terminal with ollama run
  • Fixed an issue where ctrl+z would not suspend ollama run as expected
  • Fixed an issue where requests to /api/generate would not work when waiting for another request to finish
  • Fixed an issue where extra whitespace after a FROM command would cause an error
  • Ollama will now warn you if there's a version mismatch when connecting remotely with OLLAMA_HOST
  • New /api/version API for checking Ollama's version
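
As a quick sketch, the new version endpoint can be queried like this (assumes the default port; the fallback string is just for when no server is listening):

```shell
# Ask the local Ollama server for its version; fall back to a
# placeholder JSON document if nothing is listening on the default port.
VERSION_JSON=$(curl -s --max-time 2 http://localhost:11434/api/version || echo '{"version":"unavailable"}')
echo "$VERSION_JSON"
```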

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.13...v0.1.14

ollama - v0.1.13

Published by jmorganca 11 months ago

New models

  • Starling: a large language model trained by reinforcement learning from AI feedback focused on improving chatbot helpfulness.
  • Meditron: Open-source medical large language model adapted from Llama 2 to the medical domain.
  • DeepSeek LLM: An advanced language model crafted with 2 trillion bilingual tokens.

What's Changed

  • Improved progress bar when running ollama pull with a simpler design that displays a more consistent download speed and remaining time
  • The system prompt can now be set in ollama run using /set system <system prompt>.
  • Parameters can now be set in ollama run using /set <parameter> <value>. Examples:
    • Set the context size to 16K: /set parameter num_ctx 16384
    • Set the temperature to 1: /set parameter temperature 1
    • Set the seed: /set parameter seed 1048
  • Fixed issue where Linux installer script would encounter an error when installing on Red Hat Enterprise Linux with an Nvidia GPU

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.12...v0.1.13

ollama - v0.1.12

Published by jmorganca 11 months ago

New Models

  • Yi Chat: ollama run yi:34b – the chat variant of the popular Yi 34B model.

What's Changed

  • Improved multi-line prompts (starting & ending with """) and pasting functionality in ollama run
  • Option (or alt) + backspace will now delete words in ollama run
  • Fixed issue where older Intel Macs would receive an error when trying to run a model
  • Fixed issues with YaRN models output and performance

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.11...v0.1.12

ollama - v0.1.11

Published by jmorganca 11 months ago

New Models

  • Orca 2: A fine-tuned version of Meta's Llama 2 model, designed to excel particularly in reasoning.
  • DeepSeek Coder: A capable coding model trained from scratch. Available in 1.3B, 6.7B and 33B parameter counts.
  • Alfred: A robust conversational model designed to be used for both chat and instruct use cases.

What's Changed

  • Improved progress bar design
  • Fixed issue where ollama create would error with invalid cross-device link
  • Fixed issue where ollama run would exit with an error on macOS Big Sur and Monterey
  • q5_0 and q5_1 models will now use GPU
  • Fixed several max retries exceeded errors when running ollama pull or ollama push
  • Fixed issue where ollama create would result in a "file not found" error when FROM referred to a local file
  • Fixed issue where resizing the terminal while running ollama pull would cause repeated progress bar messages
  • Minor performance improvements on Intel Macs
  • Improved error messages on Linux when using Nvidia GPUs

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.10...v0.1.11

ollama - v0.1.10

Published by jmorganca 11 months ago

New models

  • OpenChat: ollama run openchat – An open-source chat model trained on a wide variety of data, surpassing ChatGPT on various benchmarks.
  • Neural Chat: ollama run neural-chat – A new chat model by Intel.
  • Goliath: ollama run goliath – A large chat model created by combining two fine-tuned versions of Llama 2 70B.

What's Changed

  • JSON mode can now be used with ollama run:
    • Pass --format json flag or
    • Use /set format json to change the current chat session to use JSON mode
  • Prompts can now be passed in via standard input to ollama run. For example: head -30 README.md | ollama run codellama "how do I install Ollama on Linux?"
  • ollama create now works with OLLAMA_HOST to build models using Ollama running on a remote machine
  • Fixed crashes on Intel Macs
  • Fixed issue where ollama pull progress would reverse when re-trying a failed connection
  • Fixed issue where ollama show --modelfile would show an incorrect FROM command
  • Fixed issue where word wrap wouldn't work when piping in data to ollama run via standard input
  • Fix permission denied issues when running ollama create on Linux
  • Added FAQ entry for proxy support on Linux
  • Fixed installer error on Debian 12
  • Fixed issue where ollama push would result in a 405 error
  • ollama push will now return a better error when trying to push to a namespace the current user does not have access to
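
As a sketch of the API side of JSON mode: the same option maps to a "format" field on the generate endpoint (the model name is illustrative, and sending the request requires a running server):

```shell
# Request body asking the model to respond with valid JSON ("format": "json"),
# mirroring the CLI's --format json flag.
BODY='{"model": "llama2", "prompt": "List three primary colors as a JSON array.", "format": "json", "stream": false}'
echo "$BODY"
# Send with: curl http://localhost:11434/api/generate -d "$BODY"
```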

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.9...v0.1.10
