ollama

Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.

MIT License

Downloads: 219
Stars: 92.5K
Committers: 310

ollama - v0.3.11

Published by github-actions[bot] about 1 month ago

New models

  • Solar-Pro-Preview: an advanced large language model (LLM) with 22 billion parameters designed to fit into a single GPU.
  • Qwen 2.5: new multilingual Qwen models pretrained on Alibaba's latest large-scale dataset, encompassing up to 18 trillion tokens with support for a context window of up to 128K tokens.
  • Bespoke-Minicheck: a state-of-the-art fact-checking model developed by Bespoke Labs.
  • Mistral-Small: a lightweight 22B model designed for cost-effective use in tasks like translation and summarization.
  • Reader-LM: A series of models that convert HTML content to Markdown content, which is useful for content conversion tasks.

What's Changed

  • New ollama stop command to unload a running model
  • Ollama will now show an error when importing a model with an invalid number of tokens in the vocabulary
  • The ollama/ollama container image will now start running almost immediately, leading to 5s faster start times
  • Fixed issue where ollama show would show excessive whitespace in the output
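
The new ollama stop command unloads a running model from memory. The same effect can also be requested over the REST API by sending a request with keep_alive set to 0, which tells the server to unload the model immediately. A minimal sketch that only builds the request body (llama3.1 is a placeholder model name; the payload would be POSTed to /api/generate on the server):

```python
import json

def unload_request(model: str) -> str:
    # keep_alive: 0 asks the server to unload the model right after
    # handling this (otherwise empty) request.
    payload = {"model": model, "keep_alive": 0}
    return json.dumps(payload)

# POST this body to http://localhost:11434/api/generate to unload the model.
print(unload_request("llama3.1"))
```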

New Contributors

Full Changelog: https://github.com/ollama/ollama/compare/v0.3.10...v0.3.11

ollama - v0.3.10

Published by github-actions[bot] about 1 month ago

New models

  • Yi-Coder: a series of open-source code language models that delivers state-of-the-art coding performance with fewer than 10 billion parameters.
  • DeepSeek-V2.5: An upgraded version of DeepSeek-V2 that integrates the general and coding abilities of both DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct.
  • Reflection: a high-performing model trained with a new technique called Reflection-tuning, which teaches an LLM to detect mistakes in its reasoning and correct course.

What's Changed

  • Fixed rare error that would occur for certain models when running ollama show
  • CUDA 11 will now be used for older NVIDIA drivers that are not compatible with CUDA 12
  • Fixed error when running ollama create to import Gemma 2 models from safetensors
  • The OpenAI-compatible chat and completions APIs will no longer scale temperature and frequency_penalty

New Contributors

Full Changelog: https://github.com/ollama/ollama/compare/v0.3.9...v0.3.10

ollama - v0.3.9

Published by github-actions[bot] about 2 months ago

What's Changed

  • Fixed error that would occur when running Ollama on Linux machines with the ARM architecture
  • Ollama will now show an improved error message when attempting to run unsupported models
  • Fixed issue where Ollama would not auto-detect the chat template for Llama 3.1 models
  • OLLAMA_HOST will now work with URLs that contain paths
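
OLLAMA_HOST values with a path component are useful when Ollama sits behind a reverse proxy. A small sketch of how a client might normalize such a value with the standard library, assuming bare hosts default to http:// (this is an illustration, not Ollama's internal parsing code):

```python
from urllib.parse import urlparse

def api_base(host: str) -> str:
    """Normalize an OLLAMA_HOST value, preserving any path component."""
    # Assume http:// when no scheme is given.
    if "://" not in host:
        host = "http://" + host
    parsed = urlparse(host)
    return f"{parsed.scheme}://{parsed.netloc}{parsed.path.rstrip('/')}"

print(api_base("example.com/ollama"))          # http://example.com/ollama
print(api_base("https://example.com/proxy/"))  # https://example.com/proxy
```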

New Contributors

Full Changelog: https://github.com/ollama/ollama/compare/v0.3.8...v0.3.9

ollama - v0.3.8

Published by github-actions[bot] about 2 months ago

What's Changed

  • Fixed error where the ollama CLI couldn't be found on the path when upgrading Ollama on Windows

New Contributors

Full Changelog: https://github.com/ollama/ollama/compare/v0.3.7...v0.3.8

ollama - v0.3.7

Published by github-actions[bot] about 2 months ago

What's Changed

  • CUDA 12 support, improving performance by up to 10% on newer NVIDIA GPUs
  • Improved performance of ollama pull and ollama push on slower connections
  • Fixed issue where setting OLLAMA_NUM_PARALLEL would cause models to be reloaded on lower VRAM systems
  • Ollama on Linux is now distributed as a tar.gz file, which contains the ollama binary along with required libraries.

New Contributors

Full Changelog: https://github.com/ollama/ollama/compare/v0.3.6...v0.3.7-rc5

ollama - v0.3.6

Published by github-actions[bot] 2 months ago

What's Changed

  • Fixed issue where /api/embed would return an error instead of loading the model when the input field was not provided.
  • ollama create can now import Phi-3 models from Safetensors
  • Added progress information to ollama create when importing GGUF files
  • Ollama will now import GGUF files faster by minimizing file copies
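
The /api/embed endpoint accepts an "input" field that can be a single string or a list of strings; as of this release, omitting it loads the model rather than returning an error. A sketch that only constructs the request body (all-minilm is a placeholder model name):

```python
import json

def embed_request(model: str, inputs=None) -> str:
    # "input" may be a string or a list of strings; omitting it
    # simply loads the model.
    payload = {"model": model}
    if inputs is not None:
        payload["input"] = inputs
    return json.dumps(payload)

# POST this body to http://localhost:11434/api/embed
print(embed_request("all-minilm", ["why is the sky blue?"]))
```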

Full Changelog: https://github.com/ollama/ollama/compare/v0.3.5...v0.3.6

ollama - v0.3.5

Published by github-actions[bot] 2 months ago

What's Changed

  • Fixed "Incorrect function" error when downloading models on Windows
  • Fixed issue where temporary files would not be cleaned up
  • Fixed rare startup error caused by invalid model data
  • Ollama will now provide an error instead of crashing on Windows when running models that are too large to fit into total memory

New Contributors

Full Changelog: https://github.com/ollama/ollama/compare/v0.3.4...v0.3.5

ollama - v0.3.4

Published by github-actions[bot] 2 months ago

What's Changed

  • NUMA support will now be autodetected by Ollama to improve performance
  • Fixed issue where the /api/embed would sometimes return embedding results out of order

New Contributors

Full Changelog: https://github.com/ollama/ollama/compare/v0.3.3...v0.3.4

ollama - v0.3.3

Published by github-actions[bot] 3 months ago

What's Changed

  • The /api/embed endpoint now returns statistics: total_duration, load_duration, and prompt_eval_count
  • Added usage metrics to the /v1/embeddings OpenAI compatibility API
  • Fixed issue where /api/generate would respond with an empty string if provided a context
  • Fixed issue where /api/generate would return an incorrect value for context
  • /show modelfile will now render MESSAGE commands correctly
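
The duration fields in Ollama API responses are reported in nanoseconds. As a rough illustration of using the new /api/embed statistics, the sketch below derives a prompt-evaluation rate by treating total_duration minus load_duration as evaluation time (a simplification for illustration; the response values shown are hypothetical):

```python
def prompt_eval_rate(resp: dict) -> float:
    """Tokens per second spent evaluating the prompt (durations in ns)."""
    eval_seconds = (resp["total_duration"] - resp["load_duration"]) / 1e9
    return resp["prompt_eval_count"] / eval_seconds

# Hypothetical response values for illustration:
resp = {"total_duration": 2_500_000_000,
        "load_duration": 500_000_000,
        "prompt_eval_count": 100}
print(prompt_eval_rate(resp))  # 50.0
```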

New Contributors

Full Changelog: https://github.com/ollama/ollama/compare/v0.3.2...v0.3.3

ollama - v0.3.2

Published by github-actions[bot] 3 months ago

What's Changed

  • Fixed issue where ollama pull would not resume download progress
  • Fixed issue where phi3 would report an error on older versions

New Contributors

Full Changelog: https://github.com/ollama/ollama/compare/v0.3.1...v0.3.2

ollama - v0.3.1

Published by github-actions[bot] 3 months ago

What's Changed

  • Added support for min_p sampling option
  • Lowered number of requests required when downloading models with ollama pull
  • ollama create will now autodetect required stop parameters when importing certain models
  • Ollama on Windows will now show better error messages if required files are missing
  • Fixed issue where /save would cause parameters to be saved incorrectly.
  • OpenAI-compatible API will now return a finish_reason of tool_calls if a tool call occurred.
  • Ollama's Linux install script will now return a better error on unsupported CUDA versions
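
min_p sampling keeps only tokens whose probability is at least min_p times that of the most likely token, then renormalizes. A toy sketch over a plain list of probabilities (this illustrates the idea, not Ollama's internal implementation):

```python
def min_p_filter(probs, min_p=0.05):
    """Zero out tokens below min_p * max probability, then renormalize."""
    threshold = min_p * max(probs)
    kept = [p if p >= threshold else 0.0 for p in probs]
    total = sum(kept)
    return [p / total for p in kept]

# With min_p=0.1, tokens below 0.1 * 0.5 = 0.05 are dropped.
print(min_p_filter([0.5, 0.3, 0.15, 0.04, 0.01], min_p=0.1))
```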

New Contributors

Full Changelog: https://github.com/ollama/ollama/compare/v0.3.0...v0.3.1

ollama - v0.3.0

Published by github-actions[bot] 3 months ago

(Image: ollama selecting the right tool for the job, holding up a hammer to nail wooden boards)

Tool support

Ollama now supports tool calling with popular models such as Llama 3.1. This enables a model to answer a given prompt using tool(s) it knows about, making it possible for models to perform more complex tasks or interact with the outside world.

Example tools include:

  • Functions and APIs
  • Web browsing
  • Code interpreter
  • much more!

https://github.com/user-attachments/assets/957ef0d6-e7b7-4168-8033-13b0fe5f7029

To use tools, provide the tools field when using Ollama's Chat API:

import ollama

response = ollama.chat(
    model='llama3.1',
    messages=[{'role': 'user', 'content': 'What is the weather in Toronto?'}],

    # provide a weather checking tool to the model
    tools=[{
        'type': 'function',
        'function': {
            'name': 'get_current_weather',
            'description': 'Get the current weather for a city',
            'parameters': {
                'type': 'object',
                'properties': {
                    'city': {
                        'type': 'string',
                        'description': 'The name of the city',
                    },
                },
                'required': ['city'],
            },
        },
    }],
)

print(response['message']['tool_calls'])
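
Once the model responds with tool_calls, the application executes the requested functions itself and returns the results as 'tool' role messages on the next chat turn. A minimal dispatch sketch, where get_current_weather is a hypothetical stand-in returning fake data and the tool_call shape mirrors the message printed above:

```python
def get_current_weather(city: str) -> str:
    return f'22 degrees and sunny in {city}'  # placeholder, not a real lookup

available_tools = {'get_current_weather': get_current_weather}

def run_tool_calls(tool_calls):
    """Execute each requested tool and wrap results as 'tool' messages."""
    messages = []
    for call in tool_calls or []:
        fn = available_tools[call['function']['name']]
        output = fn(**call['function']['arguments'])
        messages.append({'role': 'tool', 'content': output})
    return messages
```

The returned messages would be appended to the conversation and sent back with another ollama.chat call so the model can produce its final answer.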

New models

  • Llama 3.1: a new state-of-the-art model from Meta available in 8B, 70B and 405B parameter sizes with support for tool calling.
  • Mistral Large 2: Mistral's new 123B flagship model that is significantly more capable in code generation, tool calling, mathematics, and reasoning with 128k context window and support for dozens of languages.
  • Firefunction v2: An open weights function calling model based on Llama 3, competitive with GPT-4o function calling capabilities.
  • Llama-3-Groq-Tool-Use: A series of models from Groq that represent a significant advancement in open-source AI capabilities for tool use/function calling.

What's Changed

  • Fixed duplicate error message when running ollama create

New Contributors

Full Changelog: https://github.com/ollama/ollama/compare/v0.2.8...v0.3.0

ollama - v0.2.8

Published by github-actions[bot] 3 months ago

New models

  • Mistral Nemo: A state-of-the-art 12B model with 128k context length, built by Mistral AI in collaboration with NVIDIA.
  • NuExtract: A 3.8B model fine-tuned on a private high-quality synthetic dataset for information extraction, based on Phi-3.

What's Changed

  • Fixed issue where a final assistant message would not be considered for continuing a response
  • Improved OpenAI-compatible chat completions endpoint image handling
  • ollama create will now validate templates
  • Fix error when old versions of ROCm 5 would be detected

Full Changelog: https://github.com/ollama/ollama/compare/v0.2.7...v0.2.8

ollama - v0.2.7

Published by github-actions[bot] 3 months ago

What's Changed

  • Fixed issue where last message when streaming would omit the content response field

Full Changelog: https://github.com/ollama/ollama/compare/v0.2.6...v0.2.7

ollama - v0.2.6

Published by github-actions[bot] 3 months ago

New models

  • Mathstral: MathΣtral is a 7B model designed for math reasoning and scientific discovery by Mistral AI.

What's Changed

  • Fixed issue where uppercase roles such as USER would no longer work in the chat endpoints
  • Fixed issue where empty system message would be included in the prompt

New Contributors

Full Changelog: https://github.com/ollama/ollama/compare/v0.2.5...v0.2.6

ollama - v0.2.5

Published by github-actions[bot] 3 months ago

What's Changed

  • Fixed issue where a model's SYSTEM message would not be applied

Full Changelog: https://github.com/ollama/ollama/compare/v0.2.4...v0.2.5

ollama - v0.2.4

Published by github-actions[bot] 3 months ago

What's Changed

  • Fixed issue where context, load_duration and total_duration fields would not be set in the /api/generate endpoint.
  • Ollama will no longer error when loading models larger than total system memory, as long as disk space is available

New Contributors

Full Changelog: https://github.com/ollama/ollama/compare/v0.2.3...v0.2.4

ollama - v0.2.3

Published by github-actions[bot] 3 months ago

What's Changed

  • Fix issue where system prompt would not be applied

Full Changelog: https://github.com/ollama/ollama/compare/v0.2.2...v0.2.3

ollama - v0.2.2

Published by github-actions[bot] 3 months ago

What's Changed

  • Fixed issues with Nvidia V100 GPUs
  • glm4 models will no longer report out-of-memory issues
  • Fixed error that would occur when running deepseek-v2 models
  • Fixed a series of out of memory issues when using Nvidia GPUs
  • Fixed a series of errors that would occur when using multiple Radeon GPUs
  • Fixed rare missing DLL issue on Windows
  • Fixed rare crash if ROCm 6.1.2 is installed on Windows

Full Changelog: https://github.com/ollama/ollama/compare/v0.2.1...v0.2.2-rc2

ollama - v0.2.1

Published by github-actions[bot] 3 months ago

What's Changed

  • Fixed issue where setting OLLAMA_NUM_PARALLEL would cause models to be reloaded after each request

Full Changelog: https://github.com/ollama/ollama/compare/v0.2.0...v0.2.1
