README.md
March 23, 2026
How to Install
pip install pyzipper textract python-magic PyPDF2 python-docx chardet pytesseract
git clone https://github.com/Michael-Sebero/Document-Tools
python3 /home/$USER/Document-Tools/document-tools.py
Compare Documents
This compares two documents and writes their similarities and differences to an output file.
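As a rough illustration of line-based comparison, the sketch below uses Python's standard `difflib` module. It is not taken from the tool's source; the function name `compare_lines` and the split into "common" and "differences" lists are assumptions made for this example.

```python
import difflib


def compare_lines(left_lines, right_lines):
    """Return (common, differences) for two lists of lines.

    Hypothetical helper: the real tool may compare documents differently.
    """
    common, differences = [], []
    for entry in difflib.ndiff(left_lines, right_lines):
        tag, text = entry[:2], entry[2:]
        if tag == "  ":
            # Line is present in both documents.
            common.append(text)
        elif tag in ("- ", "+ "):
            # "- " marks a line only in the left input, "+ " only in the right.
            differences.append(entry)
        # "? " hint lines produced by ndiff are ignored.
    return common, differences
```

To compare two files, read each one with `splitlines()` and write the two returned lists to an output file of your choice.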
Duplicate Line Remover
This detects duplicate lines in a file, removes them, and saves the result to an output file.
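A minimal sketch of duplicate-line removal follows. It assumes that lines must match exactly to count as duplicates and that the first occurrence is the one kept; the tool itself may use different rules. The name `remove_duplicate_lines` is hypothetical.

```python
def remove_duplicate_lines(lines):
    """Return the lines with later exact duplicates dropped, order preserved."""
    seen = set()
    unique = []
    for line in lines:
        if line not in seen:
            # First time this exact line appears: keep it.
            seen.add(line)
            unique.append(line)
    return unique
```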
Extract Text
This extracts text from an image or a directory full of images.
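The sketch below shows one way the image-or-directory case could be handled with `pytesseract`, which is among the packages installed above. The helper names, the list of image suffixes, and the non-recursive directory scan are assumptions for this example, not the tool's confirmed behavior. `pytesseract` also needs the Tesseract OCR engine installed on the system.

```python
from pathlib import Path

# Assumed set of suffixes to treat as images; adjust as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".tif", ".tiff", ".bmp"}


def iter_image_paths(target):
    """Return image paths for a single file or a directory (not recursive)."""
    target = Path(target)
    if target.is_file():
        return [target] if target.suffix.lower() in IMAGE_SUFFIXES else []
    return sorted(
        p for p in target.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def extract_text(image_path):
    """Run OCR on one image and return the recognized text."""
    # Imported lazily so the path helper works without the OCR dependencies.
    from PIL import Image
    import pytesseract

    with Image.open(image_path) as image:
        return pytesseract.image_to_string(image)
```

Calling `extract_text(path)` for each path returned by `iter_image_paths` covers both a single image and a directory of images.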
Find Word
This recursively searches a given directory for keywords in documents and reports where they are found.
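A recursive keyword search can be sketched with `os.walk`. This example only reads files as plain text and matches case-insensitively; both are assumptions, and formats such as PDF or DOCX would need the parsers installed above. The name `find_keyword` is hypothetical.

```python
import os


def find_keyword(root, keyword):
    """Yield (path, line_number, line) for each line containing keyword."""
    needle = keyword.lower()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                # Undecodable bytes are skipped so binary files do not abort the scan.
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for number, line in enumerate(handle, start=1):
                        if needle in line.lower():
                            yield path, number, line.rstrip("\n")
            except OSError:
                # Unreadable file: move on to the next one.
                continue
```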
Find Word Archive
This searches the .zip or .tar archives in a given directory for keywords in documents and reports where they are found inside each archive.
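Searching inside an archive without unpacking it to disk can be sketched with the standard `zipfile` module. The example covers .zip only and treats every member as plain text, which are simplifying assumptions; .tar archives could follow the same pattern with the standard `tarfile` module. The name `find_keyword_in_zip` is hypothetical.

```python
import zipfile


def find_keyword_in_zip(archive, keyword):
    """Return (member_name, line_number) pairs for lines containing keyword.

    `archive` may be a path or a binary file-like object.
    """
    needle = keyword.lower()
    hits = []
    with zipfile.ZipFile(archive) as bundle:
        for info in bundle.infolist():
            if info.is_dir():
                continue
            # Read the member into memory and decode it leniently.
            text = bundle.read(info).decode("utf-8", errors="ignore")
            for number, line in enumerate(text.splitlines(), start=1):
                if needle in line.lower():
                    hits.append((info.filename, number))
    return hits
```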
Keyword Line Extractor
This searches a file for keywords and writes each line that contains one to an output file.
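The filtering step can be sketched as follows. Case-insensitive substring matching is an assumption of this example, and `extract_keyword_lines` is a hypothetical name.

```python
def extract_keyword_lines(lines, keywords):
    """Return the lines that contain at least one of the keywords."""
    needles = [keyword.lower() for keyword in keywords]
    return [
        line for line in lines
        if any(needle in line.lower() for needle in needles)
    ]
```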
Replace Keyword
This replaces keywords in a file.
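One possible approach to replacing several keywords at once is a single regular-expression pass, so that one replacement is never rewritten by another. The sketch assumes exact, case-sensitive, non-empty keywords; `replace_keywords` is a hypothetical name, not the tool's API.

```python
import re


def replace_keywords(text, replacements):
    """Replace each key of `replacements` found in text with its value."""
    if not replacements:
        return text
    # Longer keywords go first so they win over shorter overlapping ones.
    ordered = sorted(replacements, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(keyword) for keyword in ordered))
    return pattern.sub(lambda match: replacements[match.group(0)], text)
```

Because the text is scanned once, swapping two words (for example `{"cat": "dog", "dog": "cat"}`) works as expected, which chained `str.replace` calls would not do.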