Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
v0.1.9
Published by jmorganca 11 months ago
- JSON mode: /api/generate can now be instructed to return valid JSON by setting the format parameter to json
- Raw mode: prompt templating can be bypassed by passing {"raw": true} to /api/generate
- Fixed issues with ollama pull and ollama push when running as the root user
- Additional fixes for ollama pull and ollama push
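As a quick sketch of the two new modes (the llama2 model name and prompts are illustrative, not from the release notes):

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Respond with a JSON object describing the sky.",
  "format": "json",
  "stream": false
}'

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "[INST] Why is the sky blue? [/INST]",
  "raw": true,
  "stream": false
}'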
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.8...v0.1.9
v0.1.8
Published by jmorganca 12 months ago
- Fixes for running certain models such as codellama and mistrallite
- ollama push is now much faster: 7B models will push at up to ~100MB/s and large models (70B+) at up to 1GB/s if network speeds permit
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.7...v0.1.8
v0.1.7
Published by jmorganca 12 months ago
- Fixed issue in ollama run where certain key combinations such as Ctrl+Space would lead to an unresponsive prompt
- Fixed issue in ollama run where retrieving the previous prompt from history would require two up arrow key presses instead of one
- Exiting ollama run with Ctrl+D will now put the cursor on the next line
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.6...v0.1.7
v0.1.6
Published by jmorganca 12 months ago
- Running the new zephyr:7b-alpha model with ollama run is now possible
- chat or instruct models now support setting the system parameter, or the SYSTEM command in the Modelfile (see the sketch below)
- Default parameters (num_ctx, etc) have been updated for library models
- OLLAMA_MODELS can now be used to configure where models are stored. See the FAQ for more info on how to configure this.
- OLLAMA_HOST will now default to port 443 when https:// is specified, and port 80 when http:// is specified
- Additional OLLAMA_HOST fixes
- Fixed issue where ollama pull would retry multiple times when out of space
- Fixed out of memory issues when using Nvidia GPUs
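A minimal sketch of the new system prompt support (model name and wording are illustrative). In a Modelfile it can be set with the SYSTEM command:

FROM llama2
SYSTEM You are a helpful assistant that answers concisely.

or it can be passed per request with the system parameter:

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "system": "You are a helpful assistant that answers concisely.",
  "prompt": "Why is the sky blue?"
}'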
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.5...v0.1.6
v0.1.5
Published by jmorganca 12 months ago
- Fixed issues when running falcon or starcoder models
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.4...v0.1.5
v0.1.4
Published by jmorganca almost 1 year ago
- Improvements when changing model parameters (such as temperature) or system prompts
- The Linux install script (curl https://ollama.ai/install.sh | sh) has been updated to be more consistent with Ollama on macOS
- On Linux, ollama will now run as the current user
- Models are now stored in ~/.ollama/models (existing models in /usr/share/ollama will be moved to this directory)
- Server logs are now stored in ~/.ollama/logs/server.log
- starcoder, sqlcoder and falcon models now have unicode support. Note: they will need to be re-pulled (e.g. ollama pull starcoder)
- ollama serve will now print the current version of Ollama on start
- ollama run will now show more descriptive errors when encountering runtime issues (such as insufficient memory)
- Fixed issues with ollama create on Linux
- Fixed issue where ollama show would show an empty SYSTEM prompt (instead of omitting it)
- Fixed issue where the /api/tags endpoint would return null instead of [] if no models were found
- Fixed issue where ollama show wouldn't work when connecting remotely by using OLLAMA_HOST (see the sketch below)
- Fixed issues when running with num_gpu set to 0
- Updated ollama serve logs to report the proper amount of GPU memory (VRAM) being used
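A sketch of using ollama show against a remote instance after this fix (the host address is illustrative):

OLLAMA_HOST=192.168.0.10:11434 ollama show --modelfile llama2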
Note: the EMBED keyword in Modelfile is being revisited until a future version of Ollama. Join the discussion on how we can make it better.
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.3...v0.1.4
v0.1.3
Published by jmorganca about 1 year ago
- Fixed issue where setting num_gpu to 0 would result in an error
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.2...v0.1.3
v0.1.2
Published by jmorganca about 1 year ago
Ollama's examples have been updated with some new examples.
- Download speeds for ollama pull have been significantly improved, from 60MB/s to over 1.5GB/s (25x faster) on fast network connections
- Streaming can now be disabled: set the stream parameter to false and endpoints will return data in one single response:
curl -X POST http://localhost:11434/api/generate -d '{
"model": "llama2",
"prompt": "Why is the sky blue?",
"stream": false
}'
- Ollama now supports http proxies (using HTTP_PROXY=http://<proxy>) and https proxies (using HTTPS_PROXY=https://<proxy>), as sketched below
- Fixed token too long error when generating a response
- q8_0, q5_0, q5_1, and f32 models will now use GPU on Linux
- Improved output of ollama run to be easier to read
- Fixes to ollama-runner
- ollama create will now show feedback when reading model metadata
- Fixed not found error showing when running ollama pull
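A sketch of starting the server behind a proxy (the proxy address is illustrative):

HTTPS_PROXY=https://proxy.example.com:8443 ollama serve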
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.1...v0.1.2
v0.1.1
Published by jmorganca about 1 year ago
- Ctrl+C will now cancel responses when running ollama run
- Fixes to /slash commands when using ollama run
- Fixed issue where the library/ prefix in ollama run would cause an error
- Fixes for ollama run and models stored in ~/.ollama/models
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.0...v0.1.1
v0.1.0
Published by jmorganca about 1 year ago
Coming soon
v0.0.21
Published by jmorganca about 1 year ago
- Fixed error when template was provided in the API, but not prompt
- Fixes to ollama run
- Fixed issue where prompts in ollama run wouldn't be submitted when pressing Return
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.20...v0.0.21
v0.0.20
Published by jmorganca about 1 year ago
- ollama run has a new & improved experience
- OLLAMA_HOST now supports ipv6 hostnames
- ollama run will now automatically pull models if they don't exist when using a remote instance of Ollama
- Sending an empty prompt field to /api/generate will now load the model so the next request is fast (see the sketch below)
- Fixed issue where ollama create would not correctly detect falcon model sizes
- Added api/client.py by @pdevine
- Additional fixes for ollama pull, ollama push, and ollama create
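A sketch of pre-loading a model with an empty prompt (the model name is illustrative):

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2"
}'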
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.19...v0.0.20
v0.0.19
Published by mchiang0610 about 1 year ago
- Ollama is now available as an official Docker image: docker pull ollama/ollama
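A minimal sketch of starting the server from the image and running a model inside the container (volume, port, and model name are illustrative):

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama run llama2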
- Fixed issue where ollama push would error on long-running uploads
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.18...v0.0.19
v0.0.18
Published by mchiang0610 about 1 year ago
- New ollama show command for viewing details about a model:
ollama show --system orca-mini
ollama show --parameters codellama
ollama show --template llama2
ollama show --modelfile llama2
- Fixed issue with & characters appearing in the output
- Fixed num_keep parameter not working properly
- Fixed issue where Modelfile parameters would not be honored at runtime
- Fixed issue where ollama list would error when there were no models to show
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.17...v0.0.18
v0.0.17
Published by mchiang0610 about 1 year ago
- Multiple models can now be removed at once, e.g. ollama rm mario:latest orca-mini:3b
- ollama list will now show a unique ID for each model based on its contents
- Fixes for ollama create and ollama push
- Fixed issue where newline characters (\n) wouldn't be honored
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.16...v0.0.17
v0.0.16
Published by mchiang0610 about 1 year ago
- The Ollama version can now be printed with ollama -v or ollama --version
- New model: codellama
- Model names with https:// in front of them will now work when running ollama run
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.15...v0.0.16
v0.0.15
Published by mchiang0610 about 1 year ago
Ollama now supports a list of models published on ollama.ai/library. We are working on ways to allow anyone to push models to Ollama. Expect more news on this in the future.
Please join the community on Discord if you have any questions or concerns, or just want to hang out.
- A remote Ollama instance can now be used from the CLI: OLLAMA_HOST=<host> ollama run llama2
- Fixed issue where PARAMETER values weren't correctly handled in Modelfiles
- Adapters are now supported in ollama create with Modelfiles: use the ADAPTER Modelfile instruction (see the sketch below)
- Added the num_gqa parameter
- Added context to the API documentation for /api/generate
- Fixes to ollama pull
- EMBED in Modelfiles will now skip regenerating embeddings if the input files have not changed
- Embeddings will be generated via /api/embeddings if it is available
- New example: dockerit – a tool to help you build and run your application in a Docker container
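A sketch of the ADAPTER instruction in a Modelfile (the base model and adapter path are illustrative):

FROM llama2
ADAPTER ./lora-adapter.bin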
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.14...v0.0.15
v0.0.14
Published by mchiang0610 about 1 year ago
- 70B models can now be run: ollama run llama2:70b or ollama run llama2-uncensored:70b
- New /api/embeddings endpoint for generating embeddings (see the sketch below)
- New EMBED instruction in the Modelfile
- OLLAMA_HOST can now specify the entire address to serve on with ollama serve
- ollama pull can now be run in different terminal windows for the same model concurrently
- Additional fixes to ollama pull
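A sketch of calling the new endpoint (the model name and prompt are illustrative):

curl -X POST http://localhost:11434/api/embeddings -d '{
  "model": "llama2",
  "prompt": "Here is an article about llamas..."
}'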
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.13...v0.0.14
v0.0.13
Published by jmorganca about 1 year ago
- When running ollama run, the /show command can be used to inspect the current model
- ollama run can now take in multi-line strings:
% ollama run llama2
>>> """
Is this a
multi-line
string?
"""
Thank you for asking! Yes, the input you provided is a multi-line string. It contains multiple lines of text separated by line breaks.
- ollama run --verbose will now show load duration times
- Modelfile fixes
v0.0.12
Published by jmorganca about 1 year ago
- New ollama cp command
- Fixes related to localhost
- ollama create will now automatically pull models in the FROM instruction you don't have locally
- ollama pull will now show a better error when pulling a model that doesn't exist
- Fixed issue where canceling ollama pull would cause an error
- Improvements to ollama commands
- Fixed issue where the TEMPLATE instruction wouldn't be parsed correctly (see the sketch below)
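A sketch of what a TEMPLATE instruction looks like in a Modelfile (the base model and prompt format are illustrative, not a specific model's):

FROM llama2
TEMPLATE """{{ .System }}
User: {{ .Prompt }}
Assistant: """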