python-arpa

Python library for n-gram models in ARPA format

MIT License

Python ARPA Package

Python library for reading ARPA n-gram models.

Documentation is available.
Changes between releases are documented.
Bugs can be reported on the issue tracker.
Questions can be asked via e-mail.
Source code is tracked on GitHub.

Setup

Python 3.4+

In order to install the Python 3 version:

$ pip install --user -U arpa

Python 2.7

In order to install the Python 2.7 version:

$ pip install --user -U arpa-backport

Usage

The package may be imported directly:

try:
    import arpa  # Python 3.4+
except ImportError:
    import arpa_backport as arpa  # Python 2.7

models = arpa.loadf("foo.arpa")
lm = models[0]  # ARPA files may contain several models.

# probability p(end|in, the)
lm.p("in the end")
lm.log_p("in the end")

# sentence score w/ sentence markers
lm.s("This is the end .")
lm.log_s("This is the end .")

# sentence score w/o sentence markers
lm.s("This is the end .", sos=False, eos=False)
lm.log_s("This is the end .", sos=False, eos=False)
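
To make the difference between the word probability (`p`, `log_p`) and the sentence score (`s`, `log_s`) concrete, here is a minimal, self-contained sketch of how a backoff bigram model could score a sentence with and without sentence markers. It does not use the package and is not its implementation: the toy tables `UNIGRAMS` and `BIGRAMS`, their values, and the functions `log_p` and `log_s` below are assumptions made up for illustration. They only borrow the ARPA convention of storing log10 probabilities and backoff weights.

```python
import math

# Hypothetical toy model (illustrative values, not from a real ARPA file).
# Unigrams map word -> (log10 probability, log10 backoff weight).
UNIGRAMS = {
    "<s>": (-99.0, -0.30),
    "</s>": (-0.70, 0.0),
    "the": (-0.50, -0.20),
    "end": (-0.90, -0.10),
}
# Bigrams map (history, word) -> log10 probability.
BIGRAMS = {
    ("<s>", "the"): -0.20,
    ("the", "end"): -0.40,
}


def log_p(history, word):
    """Return log10 p(word | history), backing off to the unigram if needed."""
    if (history, word) in BIGRAMS:
        return BIGRAMS[(history, word)]
    # Unseen bigram: backoff weight of the history plus the unigram probability.
    return UNIGRAMS[history][1] + UNIGRAMS[word][0]


def log_s(sentence, sos=True, eos=True):
    """Return the log10 score of a whitespace-tokenized sentence."""
    words = sentence.split()
    if sos:
        words = ["<s>"] + words
    if eos:
        words = words + ["</s>"]
    total = 0.0
    for i, word in enumerate(words):
        if i == 0:
            if sos:
                continue  # the start marker is context only, it is not scored
            total += UNIGRAMS[word][0]  # no history available: plain unigram
        else:
            total += log_p(words[i - 1], word)
    return total


with_markers = log_s("the end")
without_markers = log_s("the end", sos=False, eos=False)
print(with_markers, 10 ** with_markers)
print(without_markers, 10 ** without_markers)
```

In this sketch the score with markers also includes the probability of the sentence starting with "the" and of it ending after "end", which is why it differs from the score without markers.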

Development

Contributions are welcome! Write a bug report or send a pull request. Other contributors have done so before.

License

Copyright (c) 2015-2018 Stefan Fischer

The source code is available under the MIT License. See LICENSE for further details.
