Feature: More config arguments for CLI version #185

@pfahlstrom

Could more switch arguments be added to the CLI version? For example, I would like to set uncurl_quotes to False and unescape_html to True, even though my file has "<" characters in it.

For what I'm doing I don't need to write a Python script, so being able to set more config options via CLI switches would be very helpful.
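Until such switches exist, a small wrapper could expose them. The following is only a sketch, assuming ftfy 6 or later, where fix_text accepts a TextFixerConfig. The flag names --unescape-html and --keep-curly-quotes are hypothetical and are not part of ftfy's CLI.

```python
import argparse
import sys


def build_parser() -> argparse.ArgumentParser:
    # Flag names are hypothetical; ftfy's own CLI does not define them.
    parser = argparse.ArgumentParser(
        description="ftfy wrapper with extra switches (sketch)"
    )
    parser.add_argument(
        "--unescape-html",
        action="store_true",
        help="always decode HTML entities, even if the text contains '<'",
    )
    parser.add_argument(
        "--keep-curly-quotes",
        action="store_true",
        help="leave curly quotes alone (uncurl_quotes=False)",
    )
    return parser


def config_kwargs(args: argparse.Namespace) -> dict:
    # Translate the switches into TextFixerConfig field values.
    return {
        "unescape_html": True if args.unescape_html else "auto",
        "uncurl_quotes": not args.keep_curly_quotes,
    }


def main(argv=None) -> None:
    # Imported lazily so the argument handling can be tried without ftfy installed.
    from ftfy import TextFixerConfig, fix_text

    args = build_parser().parse_args(argv)
    config = TextFixerConfig(**config_kwargs(args))
    # Read everything from stdin, fix it, and write the result to stdout.
    sys.stdout.write(fix_text(sys.stdin.read(), config))
```

Saved under a name such as ftfy_wrap.py (hypothetical) with a main() call at the bottom, it would be run as: python ftfy_wrap.py --unescape-html --keep-curly-quotes < input.txt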

(Also, on Read the Docs, the top of the "Configuring ftfy" page references fix_entities once, but I see elsewhere that it is deprecated and replaced with unescape_html.)

EDIT: For now, I tried to set these options permanently by going to __init__.py and changing them there, under TextFixerConfig.

unescape_html: Union[str, bool] = True # was = "auto"
uncurl_quotes: bool = False # was = True

Unfortunately, while this worked for the quote marks, for some reason it did NOT work for the HTML entities. So, in addition, I went through and commented out these lines:

# if config.unescape_html == "auto" and "<" in segment:
#     config = config._replace(unescape_html=False)

# if config.unescape_html == "auto" and "<" in text:
#     config = config._replace(unescape_html=False)

# if config.unescape_html == "auto" and "<" in line:
#     config = config._replace(unescape_html=False)

Once I did that, the CLI started unescaping the entities like I wanted.
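For reference, the "auto" check that those lines perform can be reproduced with the standard library alone. This is an illustrative sketch, not ftfy's code: Config and fix_segment are hypothetical stand-ins, and html.unescape stands in for ftfy's entity handling. It shows that an explicit True never triggers the check, so if the check was still firing after the default was changed, the CLI was presumably passing "auto" itself. That is a guess, not something verified against the source.

```python
import html
from typing import NamedTuple, Union


class Config(NamedTuple):
    # Hypothetical stand-in for TextFixerConfig with only the field discussed here.
    unescape_html: Union[str, bool] = "auto"


def fix_segment(segment: str, config: Config) -> str:
    # "auto" means: decode entities unless the text looks like real HTML.
    if config.unescape_html == "auto" and "<" in segment:
        config = config._replace(unescape_html=False)
    if config.unescape_html:
        return html.unescape(segment)
    return segment


text = "<b>AT&amp;T</b>"
auto_result = fix_segment(text, Config())                      # check fires, text unchanged
forced_result = fix_segment(text, Config(unescape_html=True))  # check skipped, entity decoded
```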
