diff --git a/CHANGELOG.md b/CHANGELOG.md
index f523940..24509db 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -9,7 +9,14 @@ You should also add project tags for each release in Github, see [Managing relea
 ## [Unreleased]
 ### Changed
 - GitHub workflow for linting and formatting uses ruff as a separate job
+
+### Fixed
+- Fixed bug where only every other file was read instead of all files
 - Fixed Logging bug in `add_tokenize_docs` in `word_count.py`
+
+### Added
+- Added note for Z shell users to use quotes when running `pip install -e .'[test,dev]'`
+
 ### Removed
 - GitHub action to run flake8 for linting in build
 - Removed wildcard from corpus-counter script dependency
diff --git a/README.md b/README.md
index b131646..cacd064 100644
--- a/README.md
+++ b/README.md
@@ -15,6 +15,7 @@ Use these steps for setting up a development environment to install and work wit
 3) Install the package.
     - If you want to just use the scripts and package features, install the project by running `pip install .` from the root directory.
     - If you will be changing the code and running tests, you can install it by running `pip install -e .[test,dev]`. The `-e/--editable` flag means local changes to the project code will always be available with the package is imported. You wouldn't use this in production, but it's useful for development.
+    - Note for zsh users: use `pip install -e .'[test,dev]'`
 
 For example, if you use Conda, you would run the following to create an environment named `template` with python version 3.10, then activate it and install the package in developer mode:
 
diff --git a/src/cdstemplate/corpus_counter_script.py b/src/cdstemplate/corpus_counter_script.py
index 63f45e1..e6d7e1b 100644
--- a/src/cdstemplate/corpus_counter_script.py
+++ b/src/cdstemplate/corpus_counter_script.py
@@ -56,7 +56,7 @@ def main(csv_out, document_dir, case_insensitive=False):
     for i, doc in enumerate(documents):
         if i % 2 == 0:
             logger.info("Tokenizing document number %s: %s", i, doc)
-            cc.add_doc(Path(doc).read_text())
+        cc.add_doc(Path(doc).read_text())
 
     cc.save_token_counts(csv_out)