Key Features:
Multi-threaded Processing: Speeds up chunking operations by processing multiple documents simultaneously.
Supports Multiple File Types: Handles PDF, DOCX, and PPTX formats.
Flexible Chunking Strategies: Offers fixed-size and page-based chunking methods.
Zero Dependencies: Lightweight and easy to integrate into your projects.

Installation:
pip install unsiloed-chunker

Usage Example:
from unsiloed_chunker import Chunker
chunker = Chunker(file_path="your_document.pdf")
chunks = chunker.chunk(strategy="fixed_size", chunk_size=500)
for chunk in chunks:
    print(chunk)

For more details, check out the documentation.
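
For anyone curious what fixed-size chunking and multi-threaded processing look like conceptually, here is a minimal sketch in plain Python. It is not the unsiloed-chunker implementation: fixed_size_chunks and chunk_many are hypothetical names made up for illustration, and the sketch assumes chunk_size counts characters of already-extracted text, which may differ from how the library measures it.

```python
from concurrent.futures import ThreadPoolExecutor


def fixed_size_chunks(text: str, chunk_size: int) -> list[str]:
    """Split text into consecutive pieces of at most chunk_size characters.

    Hypothetical helper for illustration; not part of unsiloed-chunker.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]


def chunk_many(texts: list[str], chunk_size: int, max_workers: int = 4) -> list[list[str]]:
    """Chunk several texts on a thread pool, keeping results in input order.

    Hypothetical helper for illustration; assumes text is already extracted.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the inputs.
        return list(pool.map(lambda t: fixed_size_chunks(t, chunk_size), texts))


if __name__ == "__main__":
    for doc_chunks in chunk_many(["first document text", "second one"], chunk_size=8):
        print(doc_chunks)
```

In a real pipeline the threads would mostly help with the I/O-heavy part (reading and parsing files), since slicing strings is cheap and Python threads do not run pure-Python CPU work in parallel.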
I'd love to hear your feedback and suggestions!