corpusmaker

Clojure utilities for building machine learning / NLP training corpora from public Wikimedia dumps.

Status: stalled. The project will probably be reworked as Cascalog scripts, and the pignlproc project is likely to replace it because of licensing constraints on future integration into Apache projects.

EPL-1.0 License

Stars
14
