fMeow/document-scanner

Document-Scanner

Document-Scanner is an open-source Python package that scans, segments, and transforms images of documents so that they look as if they were produced by a scanner. It includes predefined pipelines for preprocessing, frame detection, transformation, and post-processing to add styles.

Pipeline

  1. Convert to HSV color space

    The pipeline below is applied first to the intensity slice (the Value channel) of the original image. If no frame is found in the intensity image, exactly the same steps are applied to the saturation image.

  2. Preprocessing

    1. Blur with Median filter

    2. Histogram equalization

    3. Morphological operation (Opening)

    4. (Optional) Threshold based segmentation.

      Here we assume that the document of interest is mainly white and the background is darker, so the document can be separated from the background with a suitable threshold. After histogram equalization, we may be able to simply assume that the document lies in the brighter half of the histogram.

    5. Canny edge detector

    6. Contour detection

    7. Morphological Erosion

    8. Morphological Dilation

      This step dilates the contours to reduce the impact of non-linear edges when calculating connectivity.

  3. Hough Transform

  4. Intersection

    1. Find the Cartesian coordinates of the intersection points.

    2. Calculate connectivity at every intersection in four directions: up, right, bottom, left.

    3. Corner: compute a likelihood for every intersection point to decide the orientation of the corner.

  5. Frame detection

    1. Find possible frames
    2. Select the most likely frame
  6. Warp

  7. (TODO) Post process
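
Step 4 needs the Cartesian intersection of pairs of detected lines. As a minimal sketch, assuming the Hough transform reports each line in the usual normal form (rho, theta) with x*cos(theta) + y*sin(theta) = rho, the intersection can be found by solving a 2x2 linear system. The function name `hough_intersection` and the `eps` tolerance are hypothetical and are not taken from the package, which may compute intersections differently.

```python
import math


def hough_intersection(line_a, line_b, eps=1e-9):
    """Intersect two lines given as (rho, theta) pairs.

    Hypothetical helper for illustration only; not part of doc_scanner.
    Returns (x, y), or None when the lines are (nearly) parallel.
    """
    rho1, theta1 = line_a
    rho2, theta2 = line_b

    # Each line satisfies: x * cos(theta) + y * sin(theta) = rho
    a1, b1 = math.cos(theta1), math.sin(theta1)
    a2, b2 = math.cos(theta2), math.sin(theta2)

    # Determinant of the 2x2 system; equals sin(theta2 - theta1)
    det = a1 * b2 - a2 * b1
    if abs(det) < eps:
        return None  # parallel lines never meet

    # Cramer's rule
    x = (rho1 * b2 - rho2 * b1) / det
    y = (a1 * rho2 - a2 * rho1) / det
    return x, y


# A vertical line x = 100 (theta = 0) and a horizontal line y = 50 (theta = pi/2)
corner = hough_intersection((100.0, 0.0), (50.0, math.pi / 2))
print(corner)  # close to (100.0, 50.0), up to floating-point error
```

Returning None for near-parallel pairs keeps opposite edges of the same document from producing far-away, numerically unstable intersection points.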

Installation

Add directly from GitHub with uv as a dependency:

uv add "doc-scanner @ git+https://github.com/fMeow/document-scanner"

Or using pip:

pip install git+https://github.com/fMeow/document-scanner

The minimum required dependencies to run document-scanner are:

  • Python>=3.9
  • OpenCV4
  • scikit-image
  • pandas
  • numpy>=2.0

Usage

Basic Example

import cv2
from doc_scanner import scanner

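# Load an image (cv2.imread returns None if the file cannot be read)
image = cv2.imread("document.jpg")
if image is None:
    raise FileNotFoundError("document.jpg could not be read")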

# Convert to HSV color space
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)

# Create scanner instance with intensity channel (Value)
intensity_scanner = scanner(hsv[:, :, 2])

# Run the scanning pipeline
intensity_scanner.scan()

# Check if corners were detected
if intensity_scanner.corners is not None:
    # Warp the image to extract the document
    warped = intensity_scanner.warp(image)
    cv2.imwrite("scanned_document.jpg", warped)
else:
    # Try with saturation channel as fallback
    saturation_scanner = scanner(hsv[:, :, 1])
    saturation_scanner.scan()
    if saturation_scanner.corners is not None:
        warped = saturation_scanner.warp(image)
        cv2.imwrite("scanned_document.jpg", warped)
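
The intensity-then-saturation fallback above repeats the same three calls for each channel. One possible way to fold that into a loop is sketched below. It relies only on the calls shown in the example (`scanner(channel)`, `.scan()`, `.corners`). The helper name `scan_with_fallback` and its parameters are hypothetical and are not provided by the package.

```python
def scan_with_fallback(channels, scanner_cls):
    """Try each channel in order; return the first scanner that found corners.

    Hypothetical helper for illustration only; not part of doc_scanner.
    `scanner_cls` is expected to behave like `doc_scanner.scanner` in the
    example above: constructed from one channel, with scan() and corners.
    Returns None when no channel yields corners.
    """
    for channel in channels:
        candidate = scanner_cls(channel)
        candidate.scan()
        if candidate.corners is not None:
            return candidate
    return None


# Example wiring, assuming `image`, `hsv`, and `scanner` from the example above:
#
#   found = scan_with_fallback([hsv[:, :, 2], hsv[:, :, 1]], scanner)
#   if found is not None:
#       cv2.imwrite("scanned_document.jpg", found.warp(image))
```

Passing the channels as an ordered list keeps the priority explicit (Value first, Saturation second) and makes the case where neither channel yields a frame visible as a None return.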

Command-Line Script

For batch processing, use the provided script:

uv run scripts/scan.py --from_dir ./data/images/segment --to_dir ./output

You can copy it to your project and run it as a standalone script:

uv run --script scripts/scan.py --from_dir ./data/images/segment --to_dir ./output

Contributing

Contributions are welcome! This project uses uv for dependency management and ruff for linting and formatting. Configuration is in pyproject.toml.

Development Setup

  1. Fork the repository and then clone it to your local machine.

  2. Install uv (if not already installed):

    curl -LsSf https://astral.sh/uv/install.sh | sh
  3. Create a feature branch (git checkout -b feature/amazing-feature)

  4. Install the project in editable mode with dev dependencies:

    uv pip install -e ".[dev]"
  5. Make your changes and ensure code is formatted and tests pass:

    Run linting:

    uv run ruff check . --fix

    Format code:

    uv run ruff format .

    Run tests:

    uv run pytest
  6. Commit your changes (git commit -m 'Add some amazing feature')

  7. Push to the branch (git push origin feature/amazing-feature)

  8. Open a Pull Request. To be merged, it must meet these requirements:

    • All code should pass ruff linting and be formatted with ruff
    • All tests should pass on all supported Python versions
    • Follow existing code style and conventions

About

An OpenCV 4 and scikit-image based Python scanner for digital document images.
