Get up and running with Llama 3.2, Mistral, Gemma 2, and other large language models.
v0.1.9
Published by jmorganca 11 months ago
- JSON mode: /api/generate can now be instructed to return valid JSON by setting the format parameter to json
- Raw mode: prompt templating can be bypassed by passing {"raw": true} to /api/generate
- Fixed issues with ollama pull and ollama push when running as the root user
- Additional fixes for ollama pull and ollama push
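As a quick sketch of the two new modes (the llama2 model name and prompts are illustrative, not from the release notes):

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Respond with a JSON object describing the sky.",
  "format": "json",
  "stream": false
}'

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "[INST] Why is the sky blue? [/INST]",
  "raw": true,
  "stream": false
}'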
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.8...v0.1.9
v0.1.8
Published by jmorganca 12 months ago
- Fixes for running certain models such as codellama and mistrallite
- ollama push is now much faster: 7B models will push at up to ~100MB/s and large models (70B+) at up to 1GB/s if network speeds permit
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.7...v0.1.8
v0.1.7
Published by jmorganca 12 months ago
- Fixed issue in ollama run where certain key combinations such as Ctrl+Space would lead to an unresponsive prompt
- Fixed issue in ollama run where retrieving the previous prompt from history would require two up arrow key presses instead of one
- Exiting ollama run with Ctrl+D will now put the cursor on the next line
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.6...v0.1.7
v0.1.6
Published by jmorganca 12 months ago
- Running the new zephyr:7b-alpha model with ollama run is now possible
- chat or instruct models now support setting the system parameter, or the SYSTEM command in the Modelfile (see the sketch below)
- Default parameters (num_ctx, etc) have been updated for library models
- OLLAMA_MODELS can now be used to configure where models are stored. See the FAQ for more info on how to configure this.
- OLLAMA_HOST will now default to port 443 when https:// is specified, and port 80 when http:// is specified
- Additional OLLAMA_HOST fixes
- Fixed issue where ollama pull would retry multiple times when out of space
- Fixed out of memory issues when using Nvidia GPUs
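A minimal sketch of the new system prompt support (model name and wording are illustrative). In a Modelfile it can be set with the SYSTEM command:

FROM llama2
SYSTEM You are a helpful assistant that answers concisely.

or it can be passed per request with the system parameter:

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "system": "You are a helpful assistant that answers concisely.",
  "prompt": "Why is the sky blue?"
}'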
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.5...v0.1.6
v0.1.5
Published by jmorganca 12 months ago
- Fixed issues when running falcon or starcoder models
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.4...v0.1.5
v0.1.4
Published by jmorganca almost 1 year ago
- Improvements when changing model parameters (such as temperature) or system prompts
- The Linux install script (curl https://ollama.ai/install.sh | sh) has been updated to be more consistent with Ollama on macOS
- On Linux, ollama will now run as the current user
- Models are now stored in ~/.ollama/models (existing models in /usr/share/ollama will be moved to this directory)
- Server logs are now stored in ~/.ollama/logs/server.log
- starcoder, sqlcoder and falcon models now have unicode support. Note: they will need to be re-pulled (e.g. ollama pull starcoder)
- ollama serve will now print the current version of Ollama on start
- ollama run will now show more descriptive errors when encountering runtime issues (such as insufficient memory)
- Fixed issues with ollama create on Linux
- Fixed issue where ollama show would show an empty SYSTEM prompt (instead of omitting it)
- Fixed issue where the /api/tags endpoint would return null instead of [] if no models were found
- Fixed issue where ollama show wouldn't work when connecting remotely by using OLLAMA_HOST (see the sketch below)
- Fixed issues when running with num_gpu set to 0
- Updated ollama serve logs to report the proper amount of GPU memory (VRAM) being used
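A sketch of using ollama show against a remote instance after this fix (the host address is illustrative):

OLLAMA_HOST=192.168.0.10:11434 ollama show --modelfile llama2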
Note: the EMBED keyword in Modelfile is being revisited until a future version of Ollama. Join the discussion on how we can make it better.
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.3...v0.1.4
v0.1.3
Published by jmorganca about 1 year ago
- Fixed issue where setting num_gpu to 0 would result in an error
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.2...v0.1.3
v0.1.2
Published by jmorganca about 1 year ago
Ollama's examples have been updated with some new examples.
- Download speeds for ollama pull have been significantly improved, from 60MB/s to over 1.5GB/s (25x faster) on fast network connections
- Streaming can now be disabled: set the stream parameter to false and endpoints will return data in one single response:
curl -X POST http://localhost:11434/api/generate -d '{
"model": "llama2",
"prompt": "Why is the sky blue?",
"stream": false
}'
- Ollama now supports http proxies (using HTTP_PROXY=http://<proxy>) and https proxies (using HTTPS_PROXY=https://<proxy>), as sketched below
- Fixed token too long error when generating a response
- q8_0, q5_0, q5_1, and f32 models will now use GPU on Linux
- Improved output of ollama run to be easier to read
- Fixes to ollama-runner
- ollama create will now show feedback when reading model metadata
- Fixed not found error showing when running ollama pull
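A sketch of starting the server behind a proxy (the proxy address is illustrative):

HTTPS_PROXY=https://proxy.example.com:8443 ollama serve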
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.1...v0.1.2
v0.1.1
Published by jmorganca about 1 year ago
- Ctrl+C will now cancel responses when running ollama run
- Fixes to /slash commands when using ollama run
- Fixed issue where the library/ prefix in ollama run would cause an error
- Fixes for ollama run and models stored in ~/.ollama/models
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.1.0...v0.1.1
v0.1.0
Published by jmorganca about 1 year ago
Coming soon
v0.0.21
Published by jmorganca about 1 year ago
- Fixed error when template was provided in the API, but not prompt
- Fixes to ollama run
- Fixed issue where prompts in ollama run wouldn't be submitted when pressing Return
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.20...v0.0.21
v0.0.20
Published by jmorganca about 1 year ago
- ollama run has a new & improved experience
- OLLAMA_HOST now supports ipv6 hostnames
- ollama run will now automatically pull models if they don't exist when using a remote instance of Ollama
- Sending an empty prompt field to /api/generate will now load the model so the next request is fast (see the sketch below)
- Fixed issue where ollama create would not correctly detect falcon model sizes
- Added api/client.py by @pdevine
- Additional fixes for ollama pull, ollama push, and ollama create
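A sketch of pre-loading a model with an empty prompt (the model name is illustrative):

curl -X POST http://localhost:11434/api/generate -d '{
  "model": "llama2"
}'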
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.19...v0.0.20
v0.0.19
Published by mchiang0610 about 1 year ago
- Ollama is now available as an official Docker image: docker pull ollama/ollama
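A minimal sketch of starting the server from the image and running a model inside the container (volume, port, and model name are illustrative):

docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
docker exec -it ollama ollama run llama2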
- Fixed issue where ollama push would error on long-running uploads
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.18...v0.0.19
v0.0.18
Published by mchiang0610 about 1 year ago
- New ollama show command for viewing details about a model:
ollama show --system orca-mini
ollama show --parameters codellama
ollama show --template llama2
ollama show --modelfile llama2
- Fixed issue with & characters appearing in the output
- Fixed num_keep parameter not working properly
- Fixed issue where Modelfile parameters would not be honored at runtime
- Fixed issue where ollama list would error when there were no models to show
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.17...v0.0.18
v0.0.17
Published by mchiang0610 about 1 year ago
- Multiple models can now be removed at once, e.g. ollama rm mario:latest orca-mini:3b
- ollama list will now show a unique ID for each model based on its contents
- Fixes for ollama create and ollama push
- Fixed issue where newline characters (\n) wouldn't be honored
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.16...v0.0.17
v0.0.16
Published by mchiang0610 about 1 year ago
- The Ollama version can now be printed with ollama -v or ollama --version
- New model: codellama
- Model names with https:// in front of them will now work when running ollama run
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.15...v0.0.16
v0.0.15
Published by mchiang0610 about 1 year ago
Ollama now supports a list of models published on ollama.ai/library. We are working on ways to allow anyone to push models to Ollama. Expect more news on this in the future.
Please join the community on Discord if you have any questions or concerns, or just want to hang out.
- A remote Ollama instance can now be used from the CLI: OLLAMA_HOST=<host> ollama run llama2
- Fixed issue where PARAMETER values weren't correctly handled in Modelfiles
- Adapters are now supported in ollama create with Modelfiles: use the ADAPTER Modelfile instruction (see the sketch below)
- Added the num_gqa parameter
- Added context to the API documentation for /api/generate
- Fixes to ollama pull
- EMBED in Modelfiles will now skip regenerating embeddings if the input files have not changed
- Embeddings will be generated via /api/embeddings if it is available
- New example: dockerit – a tool to help you build and run your application in a Docker container
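A sketch of the ADAPTER instruction in a Modelfile (the base model and adapter path are illustrative):

FROM llama2
ADAPTER ./lora-adapter.bin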
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.14...v0.0.15
v0.0.14
Published by mchiang0610 about 1 year ago
- 70B models can now be run: ollama run llama2:70b or ollama run llama2-uncensored:70b
- New /api/embeddings endpoint for generating embeddings (see the sketch below)
- New EMBED instruction in the Modelfile
- OLLAMA_HOST can now specify the entire address to serve on with ollama serve
- ollama pull can now be run in different terminal windows for the same model concurrently
- Additional fixes to ollama pull
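A sketch of calling the new endpoint (the model name and prompt are illustrative):

curl -X POST http://localhost:11434/api/embeddings -d '{
  "model": "llama2",
  "prompt": "Here is an article about llamas..."
}'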
Full Changelog: https://github.com/jmorganca/ollama/compare/v0.0.13...v0.0.14
v0.0.13
Published by jmorganca about 1 year ago
- When running ollama run, the /show command can be used to inspect the current model
- ollama run can now take in multi-line strings:
% ollama run llama2
>>> """
Is this a
multi-line
string?
"""
Thank you for asking! Yes, the input you provided is a multi-line string. It contains multiple lines of text separated by line breaks.
- ollama run --verbose will now show load duration times
- Modelfile fixes
v0.0.12
Published by jmorganca about 1 year ago
- New ollama cp command
- Fixes related to localhost
- ollama create will now automatically pull models in the FROM instruction you don't have locally
- ollama pull will now show a better error when pulling a model that doesn't exist
- Fixed issue where canceling ollama pull would cause an error
- Improvements to ollama commands
- Fixed issue where the TEMPLATE instruction wouldn't be parsed correctly (see the sketch below)
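A sketch of what a TEMPLATE instruction looks like in a Modelfile (the base model and prompt format are illustrative, not a specific model's):

FROM llama2
TEMPLATE """{{ .System }}
User: {{ .Prompt }}
Assistant: """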