
WordTokenizer conflict with nltk >= 3.8.2 #1017

@ansesame

Description


First, thanks for the library. Second, I found a problem with the requirements for WordTokenizer after the release of nltk == 3.8.2.

nltk == 3.8.2 fixes a remote code execution vulnerability. As part of that fix, it disables downloading "punkt" and replaces it with "punkt_tab".
Issues:

This breaks newspaper3k when it downloads "punkt", which is listed in REQUIRED_CORPORA. I suggest either modifying REQUIRED_CORPORA or updating the requirements to exclude nltk >= 3.8.2.
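
To illustrate the first option, here is a minimal sketch of picking the tokenizer resource by nltk version. The helper names `punkt_resource_for` and `download_tokenizer_data` are hypothetical and not part of newspaper3k, and the 3.8.2 cutoff is taken from this report rather than checked against every nltk release.

```python
import re


def punkt_resource_for(nltk_version: str) -> str:
    """Return the tokenizer resource name to request for a given nltk version.

    Hypothetical helper: assumes "punkt_tab" is required from 3.8.2 onward,
    as described in this issue.
    """
    # Keep the first three numeric components, padding with zeros ("3.9" -> (3, 9, 0)).
    parts = tuple(int(p) for p in re.findall(r"\d+", nltk_version)[:3])
    parts += (0,) * (3 - len(parts))
    return "punkt_tab" if parts >= (3, 8, 2) else "punkt"


def download_tokenizer_data() -> bool:
    """Download whichever punkt resource matches the installed nltk."""
    import nltk  # imported lazily so the version helper works without nltk installed

    return nltk.download(punkt_resource_for(nltk.__version__))
```

Something like this could replace the hard-coded "punkt" entry when the corpora are downloaded. The second option would amount to a requirement specifier such as `nltk<3.8.2`, at the cost of staying on a release without the security fix.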
