ollama

Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.

MIT License

Downloads: 219 · Stars: 92.5K · Committers: 310

ollama - v0.1.9

Published by jmorganca 11 months ago

New models

  • Yi: a high-performing, bilingual model supporting both English and Chinese.

What's Changed

  • JSON mode: instruct models to always return valid JSON when calling /api/generate by setting the format parameter to json (see the sketch after this list)
  • Raw mode: bypass any templating done by Ollama by passing {"raw": true} to /api/generate
  • Better error descriptions when downloading and uploading models with ollama pull and ollama push
  • Fixed issue where Linux installer would encounter an error when running as the root user
  • Improved progress bar design when running ollama pull and ollama push
  • Fixed issue where running on a machine with less than 2GB of VRAM would be slow
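
A minimal sketch of these two new /api/generate options (the model name and prompts are placeholders; the [INST] tags assume a Llama-2-style chat template):

    # JSON mode: the model is instructed to respond with valid JSON
    curl -X POST http://localhost:11434/api/generate -d '{
      "model": "llama2",
      "prompt": "List three primary colors as a JSON array.",
      "format": "json"
    }'

    # Raw mode: no templating is applied, so the prompt is passed through verbatim
    curl -X POST http://localhost:11434/api/generate -d '{
      "model": "llama2",
      "prompt": "[INST] Why is the sky blue? [/INST]",
      "raw": true
    }'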

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.8...v0.1.9

ollama - v0.1.8

Published by jmorganca 12 months ago

New Models

  • CodeBooga: A high-performing code instruct model created by merging two existing code models.
  • Dolphin 2.2 Mistral: An instruct-tuned model based on Mistral. Version 2.2 is fine-tuned for improved conversation and empathy.
  • MistralLite: a fine-tuned model based on Mistral with enhanced capabilities for processing long contexts.
  • Yarn Mistral: an extension of Mistral supporting a context window of up to 128K tokens.
  • Yarn Llama 2: an extension of Llama 2 supporting a context window of up to 128K tokens.

What's Changed

  • Ollama will now honor large context sizes on models such as codellama and mistrallite (see the sketch after this list)
  • Fixed issue where repeated characters would be output on long contexts
  • ollama push is now much faster. 7B models will push up to ~100MB/s and large models (70B+) up to 1GB/s if network speeds permit
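
For example, a larger context window can be requested per call through the options field of /api/generate (a sketch; the model name and num_ctx value are illustrative and should match what the model actually supports):

    curl -X POST http://localhost:11434/api/generate -d '{
      "model": "mistrallite",
      "prompt": "Summarize the report pasted below.",
      "options": {"num_ctx": 16384}
    }'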

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.7...v0.1.8

ollama - v0.1.7

Published by jmorganca 12 months ago

What's Changed

  • Fixed an issue when running ollama run where certain key combinations such as Ctrl+Space would lead to an unresponsive prompt
  • Fixed issue in ollama run where retrieving the previous prompt from history would require two up arrow key presses instead of one
  • Exiting ollama run with Ctrl+D will now put cursor on the next line

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.6...v0.1.7

ollama - v0.1.6

Published by jmorganca 12 months ago

New models

  • Dolphin 2.1 Mistral: an instruct-tuned model based on Mistral and trained on a dataset filtered to remove alignment and bias.
  • Zephyr has been updated to Zephyr Beta. Zephyr Alpha is still available as zephyr:7b-alpha

What's Changed

  • Pasting multi-line strings in ollama run is now possible
  • Fixed various issues when writing prompts in ollama run
  • The library models have been refreshed and revamped:
    • All chat or instruct models now support setting the system parameter, or SYSTEM command in the Modelfile
    • Parameters (num_ctx, etc) have been updated for library models
    • Slight performance improvements for all models
  • Model storage can now be configured with OLLAMA_MODELS. See the FAQ for more info on how to configure this, and the sketch after this list.
  • OLLAMA_HOST will now default to port 443 when https:// is specified, and port 80 when http:// is specified
  • Fixed trailing slash causing an error when using OLLAMA_HOST
  • Fixed issue where ollama pull would retry multiple times when out of space
  • Fixed various out of memory issues when using Nvidia GPUs
  • Fixed performance issue previously introduced on AMD CPUs
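
A minimal sketch of the new configuration behavior described above (the directory and hostname are placeholders):

    # store models in a custom directory
    OLLAMA_MODELS=/data/ollama/models ollama serve

    # https:// now implies port 443 (and http:// implies port 80)
    OLLAMA_HOST=https://ollama.example.com ollama run llama2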

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.5...v0.1.6

ollama - v0.1.5

Published by jmorganca 12 months ago

What's Changed

  • Fixed an issue where an error would occur when running falcon or starcoder models

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.4...v0.1.5

ollama - v0.1.4

Published by jmorganca 12 months ago

New models

  • OpenHermes 2 Mistral: a new fine-tuned model based on Mistral, trained on open datasets totalling over 900,000 instructions. This model has strong multi-turn chat skills, surpassing previous Hermes 13B models and even matching 70B models on some benchmarks.

What's Changed

  • Faster model switching: models will now stay loaded between requests when using different parameters (e.g. temperature) or system prompts
  • Improved Linux installer
    • The Linux install script (curl https://ollama.ai/install.sh | sh) has been updated to be more consistent with Ollama on macOS (see the sketch after this list)
    • The installer will now install ollama as the current user
    • Models can be found under ~/.ollama/models (existing models in /usr/share/ollama will be moved to this directory)
    • Logs can be found under ~/.ollama/logs/server.log
  • starcoder, sqlcoder and falcon models now have unicode support. Note: they will need to be re-pulled (e.g. ollama pull starcoder)
  • New doc on importing existing models (GGUF, PyTorch, etc)
  • ollama serve will now print the current version of Ollama on start
  • ollama run will now show more descriptive errors when encountering runtime issues (such as insufficient memory)
  • Fixed a series of permissions issues with ollama create on Linux
  • Fixed an issue where Ollama on Linux would use CPU instead of using both the CPU and GPU for GPUs with less memory
  • Fixed architecture check in Linux install script
  • Fixed issue where leading whitespaces would be returned in responses
  • Fixed issue where ollama show would show an empty SYSTEM prompt (instead of omitting it)
  • Fixed issue where the /api/tags endpoint would return null instead of [] if no models were found
  • Fixed an issue where ollama show wouldn't work when connecting remotely by using OLLAMA_HOST
  • Fixed issue where GPU/Metal would be used on macOS even with num_gpu set to 0
  • Fixed issue where certain characters would be escaped in responses
  • Fixed ollama serve logs to report the proper amount of GPU memory (VRAM) being used
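
A sketch of the updated Linux install flow using the paths named above:

    # install (or reinstall) Ollama as the current user
    curl https://ollama.ai/install.sh | sh

    # models now live under the user's home directory
    ls ~/.ollama/models

    # follow the server log
    tail -f ~/.ollama/logs/server.log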

Note: the EMBED keyword in Modelfile is being revisited and is deferred until a future version of Ollama. Join the discussion on how we can make it better.

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.3...v0.1.4

ollama - v0.1.3

Published by jmorganca about 1 year ago

What's Changed

  • Improved various API error messages to be easier to read
  • Improved GPU allocation for older GPUs to fix "out of memory" errors
  • Fixed issue where setting num_gpu to 0 would result in an error
  • Ollama for macOS will now always update to the latest version, even if earlier updates had also been downloaded beforehand

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.2...v0.1.3

ollama - v0.1.2

Published by jmorganca about 1 year ago

New Models

  • Zephyr: a fine-tuned 7B version of Mistral, trained on a mix of publicly available, synthetic datasets, that performs as well as Llama 2 70B on many benchmarks
  • Mistral OpenOrca: a 7 billion parameter model fine-tuned on top of the Mistral 7B model using the OpenOrca dataset

Examples

Ollama's examples have been updated with some new examples.

What's Changed

  • Download speeds for ollama pull have been significantly improved, from 60MB/s to over 1.5GB/s (25x faster) on fast network connections
  • The API now supports non-streaming responses. Set the stream parameter to false and endpoints will return data in a single response:
    curl -X POST http://localhost:11434/api/generate -d '{
      "model": "llama2",
      "prompt": "Why is the sky blue?",
      "stream": false
    }'
    
  • Ollama can now be used with HTTP proxies (using HTTP_PROXY=http://<proxy>) and HTTPS proxies (using HTTPS_PROXY=https://<proxy>); see the sketch after this list
  • Fixed "token too long" error when generating a response
  • q8_0, q5_0, q5_1, and f32 models will now use GPU on Linux
  • Revised help text in ollama run to be easier to read
  • Renamed the runner subprocess to ollama-runner
  • ollama create will now show feedback when reading model metadata
  • Fixed "not found" error showing when running ollama pull
  • Improved video memory allocation on Linux to fix errors when using Nvidia GPUs
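
A sketch of the proxy support mentioned above (the proxy addresses are placeholders):

    # route Ollama's outbound requests (e.g. model pulls) through an HTTP proxy
    HTTP_PROXY=http://proxy.example.com:8080 ollama serve

    # or through an HTTPS proxy
    HTTPS_PROXY=https://proxy.example.com:8443 ollama serve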

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.1...v0.1.2

ollama - v0.1.1

Published by jmorganca about 1 year ago

What's Changed

  • Cancellable responses: Ctrl+C will now cancel responses when running ollama run
  • Improved error messages for unknown /slash commands when using ollama run
  • Various improvements to the Linux install script for distro compatibility and to fix bugs
  • Fixed install issues on Fedora
  • Fixed issue where specifying the library/ prefix in ollama run would cause an error
  • Fixed highlight color for placeholder text in ollama run
  • Fixed issue where auto updater would not restart when clicking "Restart to Update"
  • Ollama will now clean up subdirectories in ~/.ollama/models
  • Show a default message when license/parameters/etc. are not set for a model

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.0...v0.1.1

ollama - v0.1.0

Published by jmorganca about 1 year ago

Coming soon

ollama - v0.0.21

Published by jmorganca about 1 year ago

  • Fixed an issue where empty responses would be returned if template was provided in the API, but not prompt
  • Fixed an issue where the "Send a message" placeholder would show when writing multi line prompts with ollama run
  • Fixed an issue where multi-line prompts in ollama run wouldn't be submitted when pressing Return

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.20...v0.0.21

ollama - v0.0.20

Published by jmorganca about 1 year ago

What's Changed

  • ollama run has a new & improved experience:
    • Models will now be loaded immediately, making even the first prompt much faster
    • Added hint text
    • Ollama will now fit words in the available width of the terminal for better readability
  • OLLAMA_HOST now supports IPv6 hostnames
  • ollama run will now automatically pull models if they don't exist when using a remote instance of Ollama
  • Sending an empty prompt field to /api/generate will now load the model so the next request is fast (see the sketch after this list)
  • Fixed an issue where ollama create would not correctly detect falcon model sizes
  • Add a simple python client to access Ollama in api/client.py by @pdevine
  • Improvements to showing progress on ollama pull and ollama push
  • Fixed an issue for adding empty layers with ollama create
  • Fixed an issue for running Ollama on Windows (compiled from source)
  • Fixed an error when running ollama push
  • Readable community projects by @jamesbraza
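
A minimal sketch of preloading a model with an empty prompt field (llama2 is a placeholder model name):

    # loads the model so the next real request responds quickly
    curl -X POST http://localhost:11434/api/generate -d '{"model": "llama2", "prompt": ""}'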

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.19...v0.0.20

ollama - v0.0.19

Published by mchiang0610 about 1 year ago

What's Changed

  • Updated Docker image for Ollama: docker pull ollama/ollama (see the sketch after this list)
  • Ability to import and use GGUF file type models
  • Fixed issue where ollama push would error on long-running uploads
  • Ollama will now automatically clean up unused data locally
  • Improve build instructions by @apepper
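
A sketch of running the updated image, assuming the usual volume and port mapping for ollama/ollama:

    docker pull ollama/ollama
    # persist models in a named volume and expose the API on its default port
    docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama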

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.18...v0.0.19

ollama - v0.0.18

Published by mchiang0610 about 1 year ago

What's Changed

  • New ollama show command for viewing details about a model:
    • See a system prompt for a model: ollama show --system orca-mini
    • View a model's parameters: ollama show --parameters codellama
    • View a model's default prompt template: ollama show --template llama2
    • View a Modelfile for a model: ollama show --modelfile llama2
  • Minor improvements to model loading and generation time
  • Fixed an issue where characters would be escaped in prompts causing escaped characters like &amp; in the output
  • Fixed several issues with building from source on Windows and Linux
  • Minor performance improvements to model loading and generation
  • New sentiments example by @technovangelist
  • Fixed num_keep parameter not working properly
  • Fixed issue where Modelfile parameters would not be honored at runtime
  • Added missing options params to the embeddings docs by @herrjemand
  • Fixed issue where ollama list would error when there were no models to show

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.17...v0.0.18

ollama - v0.0.17

Published by mchiang0610 about 1 year ago

What's Changed

  • Multiple models can be removed together: ollama rm mario:latest orca-mini:3b
  • ollama list will now show a unique ID for each model based on its contents
  • Fixed bug where a prompt wasn't set by default causing an error when running a model created with ollama create
  • Fixed crash when running 34B parameter models on hardware with insufficient memory to run them
  • Fixed issue where non-quantized f16 models would not run
  • Improved network performance of ollama push
  • Fixed issue where stop sequences (such as \n) wouldn't be honored

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.16...v0.0.17

ollama - v0.0.16

Published by mchiang0610 about 1 year ago

What's Changed

  • Ollama version can be checked by running ollama -v or ollama --version
  • Support for 34B models such as codellama
  • Model names or paths with https:// in front of them will now work when running ollama run

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.15...v0.0.16

ollama - v0.0.15

Published by mchiang0610 about 1 year ago

📍 Ollama model list is now available

Ollama now supports a list of models published on ollama.ai/library. We are working on ways to allow anyone to push models to Ollama. Expect more news on this in the future.

Please join the community on Discord if you have any questions or concerns, or just want to hang out.

What's Changed

  • Target remote Ollama hosts with OLLAMA_HOST=<host> ollama run llama2
  • Fixed issue where PARAMETER values weren't correctly parsed in Modelfiles
  • Fixed issue where a warning would show when parsing a Modelfile comment
  • Ollama will now parse data from ggml format models and use them to make sure your system has enough memory to run a model with GPU support
  • Experimental support for creating fine-tuned models via ollama create with Modelfiles: use the ADAPTER Modelfile instruction (see the sketch after this list)
  • Added documentation for the num_gqa parameter
  • Added tutorials and examples for using LangChain with Ollama
  • Ollama will now log embedding eval timing
  • Updated llama.cpp to the latest version
  • Added context to the API documentation for /api/generate
  • Fixed issue with resuming downloads via ollama pull
  • Using EMBED in Modelfiles will now skip regenerating embeddings if the input files have not changed
  • Ollama will now use an already loaded model for /api/embeddings if it is available
  • New example: dockerit – a tool to help you build and run your application in a Docker container
  • Retry download on network errors
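
A hypothetical sketch of the ADAPTER instruction (the adapter filename is made up; it should point at an adapter trained against the base model named in FROM):

    # Modelfile applying a fine-tuning adapter on top of a base model
    cat > Modelfile <<'EOF'
    FROM llama2
    ADAPTER ./my-lora-adapter.bin
    EOF
    ollama create my-tuned-llama -f Modelfile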

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.14...v0.0.15

ollama - v0.0.14

Published by mchiang0610 about 1 year ago

What's Changed

  • Ollama 🦜️🔗 LangChain Integration! https://python.langchain.com/docs/integrations/llms/ollama
  • API docs for Ollama: https://github.com/jmorganca/ollama/blob/main/docs/api.md
  • Llama 2 70B model with Metal support (Recommend at least 64GB of memory) ollama run llama2:70b
  • Uncensored Llama 2 70B model with Metal support ollama run llama2-uncensored:70b
  • New models available! For a list of models you can directly pull from Ollama, please see https://gist.github.com/mchiang0610/b959e3c189ec1e948e4f6a1f737a1fc5
  • Embeddings can now be generated for a model with /api/embeddings (see the sketch after this list)
  • Experimental EMBED instruction in the Modelfile
  • Configurable rope frequency parameters
  • OLLAMA_HOST can now specify the entire address to serve on with ollama serve
  • Fixed issue where context was truncated incorrectly leading to poor output
  • ollama pull can now be run in different terminal windows for the same model concurrently
  • Add an example on multiline input
  • Fixed error not being checked on ollama pull
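
A minimal sketch of the new embeddings endpoint (model and prompt are placeholders):

    curl -X POST http://localhost:11434/api/embeddings -d '{
      "model": "llama2",
      "prompt": "Llamas are members of the camelid family."
    }'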

Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.13...0.0.14

ollama - v0.0.13

Published by jmorganca about 1 year ago

New improvements

  • Using Ollama CLI without Ollama running will now start Ollama
  • Changed the buffer limit so that long responses continue until they are complete
  • Models now stay loaded in memory automatically between messages, so a series of prompts is extra fast!
  • The white fluffy Ollama icon is back when using dark mode
  • Ollama will now run on Intel Macs. Compatibility & performance improvements to come
  • When running ollama run, the /show command can be used to inspect the current model
  • ollama run can now take in multi-line strings:
    % ollama run llama2
    >>> """       
      Is this a
      multi-line
      string?
    """
    Thank you for asking! Yes, the input you provided is a multi-line string. It contains multiple lines of text separated by line breaks.
    
  • More seamless updates: Ollama will now show a subtle hint that an update is ready in the tray menu, instead of a dialog window
  • ollama run --verbose will now show load duration times

Bug fixes

  • Fixed crashes on Macs with 8GB of shared memory
  • Fixed issues in scanning multi-line strings in a Modelfile

ollama - v0.0.12

Published by jmorganca about 1 year ago

New improvements

  • You can now rename models you've pulled or created with ollama cp (see the sketch after this list)
  • Added support for running k-quant models
  • Performance improvements from enabling Accelerate
  • Ollama's API can now be accessed by websites hosted on localhost
  • ollama create will now automatically pull models in the FROM instruction you don't have locally
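
Renaming with ollama cp is a copy to the new name followed by removing the old one, for example:

    # give a pulled model a new name, then drop the old one
    ollama cp llama2 my-llama2
    ollama rm llama2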

Bug fixes

  • ollama pull will now show a better error when pulling a model that doesn't exist
  • Fixed an issue where cancelling and resuming downloads with ollama pull would cause an error
  • Fixed formatting of different errors so they are readable when running ollama commands
  • Fixed an issue where prompt templates defined with the TEMPLATE instruction wouldn't be parsed correctly
  • Fixed error when a model isn't found