Can't process docx files even though docx2txt is installed #521

Hello,

I am trying to use textract to extract text from docx files in an AWS Lambda function written in Python. The textract library is included in the package, as is its dependency, docx2txt. When I try to get the text out of a file, I still get an ExtensionNotSupported error stating that docx is not supported. I also tried putting the docx2txt library in the parsers folder, but that didn't help.


![image](https://github.com/user-attachments/assets/a67319bc-374c-4323-847a-c7babad41d29)

Using:
 - Textract version 1.6.3
 - Python version 3.11
 - AWS Lambda function
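
For anyone trying to reproduce this, here is a minimal diagnostic sketch that could be run inside the Lambda handler. It rests on an assumption, not something confirmed in this report: that textract loads its docx support by importing a parser module, and that a failed import surfaces as ExtensionNotSupported rather than as the underlying error. The helper name `probe_imports` is hypothetical, and the module path `textract.parsers.docx_parser` is assumed from the package layout.

```python
import importlib


def probe_imports(module_names):
    """Try to import each module; map name -> None on success, or the error text."""
    results = {}
    for name in module_names:
        try:
            importlib.import_module(name)
            results[name] = None
        except Exception as exc:  # a broken dependency can raise more than ImportError
            results[name] = f"{type(exc).__name__}: {exc}"
    return results


if __name__ == "__main__":
    # Assumed module names: docx2txt is the dependency named in this issue;
    # textract.parsers.docx_parser is where docx support is assumed to live.
    modules = ["docx2txt", "textract.parsers.docx_parser"]
    for name, error in probe_imports(modules).items():
        print(f"{name}: {'OK' if error is None else error}")
```

If either line prints an error instead of OK, that message is probably closer to the real cause than the "docx is not supported" text, for example a dependency that was not bundled for the Lambda runtime.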

