| Commit message | Author | Age | Files | Lines |
Contains fix for CVE-2025-14009.
Changelog:
* Fix CVE-2025-14009: secure ZIP extraction in nltk.downloader
* Block path traversal/arbitrary reads in nltk.data for protocol-less refs
* Block path traversal/abs paths in corpus readers and FS pointers
* Validate external StanfordSegmenter JARs using SHA256
* Add optional sandbox enforcement for filestring()
* Maintenance: downloader/zipped models, CI/tooling updates
Signed-off-by: Gyorgy Sarvari <skandigraun@gmail.com>
Signed-off-by: Khem Raj <raj.khem@gmail.com>
(cherry picked from commit 14d464c15094d1758dc14706646a8aa645a3bf34)
Signed-off-by: Gyorgy Sarvari <skandigraun@gmail.com>
Signed-off-by: Anuj Mittal <anuj.mittal@oss.qualcomm.com>
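
For readers unfamiliar with the class of bug behind the ZIP extraction entry above, here is a minimal sketch of a path-traversal check before extraction. It is not the code from nltk.downloader; the function name `safe_extract` and its behaviour (rejecting the whole archive on the first unsafe member) are assumptions made for illustration only.

```python
import os
import zipfile


def safe_extract(zip_path, dest_dir):
    """Extract zip_path into dest_dir, refusing archives with escaping members.

    Hypothetical helper for illustration; not part of NLTK.
    """
    dest_root = os.path.realpath(dest_dir)
    with zipfile.ZipFile(zip_path) as zf:
        # Validate every member before writing anything to disk.
        for member in zf.infolist():
            target = os.path.realpath(os.path.join(dest_root, member.filename))
            # A safe target must stay under dest_root after resolving
            # "..", absolute paths, and symlinks.
            if os.path.commonpath([dest_root, target]) != dest_root:
                raise ValueError(f"unsafe path in archive: {member.filename!r}")
        zf.extractall(dest_root)
```

Checking all members first means a malicious archive leaves the destination untouched, at the cost of walking the member list twice.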
|
Changelog:
=============
* Update download checksums to use SHA256 in built index
* Fix percentage escape in new-style string formatting
* replace shortened URLs using goo.gl
* Make Wordnet interoperable with various taggers and tagged corpora
* Fix saving PerceptronTagger
* Document how to reproduce old Wordnet studies
* properly initialize Portuguese corpus reader
* support for mixed rules conversion into Chomsky Normal Form
* only import tkinter if a GUI is needed
* issue #2112 with Corenlp
* new environment variable NLTK_DOWNLOADER_FORCE_INTERACTIVE_SHELL
* Lesk defaults to most frequent sense in case of ties
Signed-off-by: Wang Mingyu <wangmy@fujitsu.com>
Signed-off-by: Khem Raj <raj.khem@gmail.com>
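
The SHA256 checksum entry above refers to verifying downloaded files against a known digest. A minimal sketch of that general technique using only the standard library follows; the helper names `sha256_of` and `verify_sha256` are hypothetical and do not reflect how the NLTK downloader or its index is actually implemented.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=65536):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sha256(path, expected_hex):
    """Return True if the file's digest matches expected_hex (case-insensitive)."""
    return hmac.compare_digest(sha256_of(path), expected_hex.lower())
```

Reading in chunks keeps memory use flat for large corpus archives.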
|
The Natural Language Toolkit (NLTK) is a Python package for
natural language processing.
Signed-off-by: Thomas Perrot <thomas.perrot@bootlin.com>
Signed-off-by: Khem Raj <raj.khem@gmail.com>
|