Document-Scanner is an open-source Python package that scans, segments, and transforms images of documents so that they look as if they were produced by a scanner. It includes predefined pipelines for preprocessing, frame detection, transformation, and post-processing to add styles.
-
Convert to HSV color space
The following pipeline is applied first to the intensity slice (the Value channel) of the original image. If no frame is found in the intensity image, exactly the same process is applied to the saturation image.
-
Preprocessing
-
Blur with Median filter
-
Histogram equalization
-
Morphological operation (Opening)
-
(Optional) Threshold based segmentation.
Here we assume that the document of interest is mainly white while the background is darker, so the document can be extracted from the background with a suitable threshold. After histogram equalization, we may be able to simply assume that the document lies in the brighter half of the histogram.
-
Canny edge detector
-
Contour detection
-
Morphological Erosion
-
Morphological Dilation
This step dilates the contour to reduce the impact of non-linear edges when calculating connectivity.
-
-
Hough Transform
-
Intersection
  1. Find the Cartesian coordinates of the intersection points.
  2. Calculate the connectivity of every intersection in four directions: up, right, bottom, left.
  3. Corner: compute a likelihood for every intersection point to decide the orientation of the corner.
-
Frame detection
- Find possible frames
- Select the most likely frame
-
Warp
-
(TODO) Post process
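The optional threshold step in the pipeline above can be illustrated with a small NumPy-only sketch. It assumes an 8-bit grayscale input and that "the brighter half of the histogram" means values of 128 or more after histogram equalization. The function names `equalize_histogram` and `bright_half_mask` are hypothetical and are not part of the doc_scanner API; the package itself may implement this step differently.

```python
import numpy as np


def equalize_histogram(gray: np.ndarray) -> np.ndarray:
    """Histogram-equalize an 8-bit grayscale image via its cumulative histogram."""
    hist = np.bincount(gray.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0][0]  # first non-zero value of the cumulative histogram
    total = cdf[-1]            # total number of pixels
    if total == cdf_min:
        # Constant image: there is nothing to spread out.
        return gray.copy()
    # Map each gray level so that the cumulative histogram becomes roughly linear.
    lut = np.round((cdf - cdf_min) / (total - cdf_min) * 255)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[gray]


def bright_half_mask(gray: np.ndarray, threshold: int = 128) -> np.ndarray:
    """Return a boolean mask of pixels in the brighter half after equalization.

    The threshold of 128 is an assumption for illustration, not a value taken
    from the package.
    """
    return equalize_histogram(gray) >= threshold
```

On a photo where a bright page sits on a darker background, the mask is expected to cover mostly the page, which is the assumption the threshold step relies on.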
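For the intersection step in the pipeline above, the Cartesian coordinates of a crossing point can be computed directly from two lines in Hough normal form. The sketch below assumes each line is given as `(rho, theta)` with `x*cos(theta) + y*sin(theta) = rho`, which is the parameterization returned by `cv2.HoughLines`. The helper name `hough_intersection` is hypothetical and not taken from the package.

```python
import numpy as np


def hough_intersection(line_a, line_b, eps=1e-9):
    """Return the (x, y) intersection of two (rho, theta) lines, or None if parallel."""
    rho_a, theta_a = line_a
    rho_b, theta_b = line_b
    # Each line contributes one row of the linear system A @ [x, y] = [rho_a, rho_b].
    coeffs = np.array(
        [
            [np.cos(theta_a), np.sin(theta_a)],
            [np.cos(theta_b), np.sin(theta_b)],
        ]
    )
    # The determinant equals sin(theta_b - theta_a); near zero means parallel lines.
    if abs(np.linalg.det(coeffs)) < eps:
        return None
    x, y = np.linalg.solve(coeffs, np.array([rho_a, rho_b], dtype=float))
    return float(x), float(y)
```

Applying this helper to every pair of detected lines gives the candidate points on which connectivity and corner orientation can then be evaluated.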
Add directly from GitHub with uv as a dependency:
uv add "doc-scanner @ git+https://github.com/fMeow/document-scanner"
Or using pip:
pip install git+https://github.com/fMeow/document-scanner
The minimum required dependencies to run document-scanner are:
- Python>=3.9
- OpenCV4
- scikit-image
- pandas
- numpy>=2.0
import cv2
from doc_scanner import scanner
# Load an image
image = cv2.imread("document.jpg")
# Convert to HSV color space
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
# Create scanner instance with intensity channel (Value)
intensity_scanner = scanner(hsv[:, :, 2])
# Run the scanning pipeline
intensity_scanner.scan()
# Check if corners were detected
if intensity_scanner.corners is not None:
    # Warp the image to extract the document
    warped = intensity_scanner.warp(image)
    cv2.imwrite("scanned_document.jpg", warped)
else:
    # Try with saturation channel as fallback
    saturation_scanner = scanner(hsv[:, :, 1])
    saturation_scanner.scan()
    if saturation_scanner.corners is not None:
        warped = saturation_scanner.warp(image)
        cv2.imwrite("scanned_document.jpg", warped)
For batch processing, use the provided script:
uv run scripts/scan.py --from_dir ./data/images/segment --to_dir ./output
You can copy it to your project and run it as a standalone script:
uv run --script scripts/scan.py --from_dir ./data/images/segment --to_dir ./output
Contributions are welcome! This project uses uv for dependency management, ruff for linting and formatting. Configuration is in pyproject.toml.
-
Fork the repository and then clone it to your local machine.
-
Install uv (if not already installed):
curl -LsSf https://astral.sh/uv/install.sh | sh
-
Create a feature branch (git checkout -b feature/amazing-feature)
-
Install the project in editable mode with dev dependencies:
uv pip install -e ".[dev]"
-
Make your changes and ensure code is formatted and tests pass:
Run linting:
uv run ruff check . --fix
Format code:
uv run ruff format .
Run tests:
uv run pytest
-
Commit your changes (git commit -m 'Add some amazing feature')
-
Push to the branch (git push origin feature/amazing-feature)
-
Open a Pull Request. Requirements for merging:
- All code should pass ruff linting and be formatted with ruff
- All tests should pass on all supported Python versions
- Follow existing code style and conventions
