Quick Start

Get up and running in under two minutes

1 — Install

pip
pip install textspitter

# With optional loguru logging
pip install "textspitter[logging]"
uv (library dependency)
uv add textspitter
uv add "textspitter[logging]"
uv tool (standalone CLI)

Installs textspitter as an isolated tool — no virtual environment management required.

uv tool install textspitter

2 — Extract your first file

from TextSpitter import TextSpitter

text = TextSpitter(filename="path/to/document.pdf")
print(text[:500])

TextSpitter() auto-detects the format from the file extension, picks the right reader, and returns a plain str.
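To make the idea of extension-based detection concrete, here is a minimal sketch of how a dispatcher of this kind could work. It is not TextSpitter's actual implementation: the reader functions, the `READERS` table, and the `pick_reader` and `extract` helpers are all hypothetical names invented for illustration, and the set of supported extensions is an assumption.

```python
from pathlib import Path


# Hypothetical readers: stand-ins that only label their input.
# Real readers would open and parse the file contents.
def read_pdf(path: str) -> str:
    return f"<pdf text from {path}>"


def read_csv(path: str) -> str:
    return f"<csv text from {path}>"


def read_plain(path: str) -> str:
    return f"<plain text from {path}>"


# Assumed mapping from file extension to reader (illustrative only).
READERS = {".pdf": read_pdf, ".csv": read_csv, ".txt": read_plain}


def pick_reader(filename: str):
    """Return the reader registered for the file's extension (case-insensitive)."""
    suffix = Path(filename).suffix.lower()
    try:
        return READERS[suffix]
    except KeyError:
        raise ValueError(f"unsupported extension: {suffix!r}") from None


def extract(filename: str) -> str:
    """Pick a reader from the extension and return its output as a plain str."""
    return pick_reader(filename)(filename)
```

Calling `extract("data.csv")` in this sketch routes to `read_csv`, while an unregistered extension raises `ValueError` rather than guessing a format.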

3 — Use the CLI

# Single file to stdout
textspitter report.pdf

# Multiple files saved to a combined output
textspitter chapter1.pdf chapter2.pdf -o book.txt
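If you would rather combine several files from Python than from the shell, the multi-file command can be approximated along these lines. This is a sketch under assumptions: the `combine` helper, its `extract` parameter, and the newline separator are hypothetical, and the CLI may join files differently.

```python
from pathlib import Path
from typing import Callable, Iterable


def combine(
    paths: Iterable[str],
    extract: Callable[[str], str],
    output: str,
    separator: str = "\n",  # assumption: the real CLI may use another separator
) -> int:
    """Extract text from each path, join the results, and write them to output.

    Returns the number of characters written.
    """
    texts = [extract(p) for p in paths]
    combined = separator.join(texts)
    Path(output).write_text(combined, encoding="utf-8")
    return len(combined)


# With the library installed, the extractor could be the call shown in step 2:
#   from TextSpitter import TextSpitter
#   combine(["chapter1.pdf", "chapter2.pdf"],
#           lambda p: TextSpitter(filename=p),
#           "book.txt")
```

Passing the extractor in as a parameter keeps the helper independent of any particular reader, which also makes it easy to test with a stand-in function.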

4 — Work with streams

from io import BytesIO
from TextSpitter import TextSpitter

# From BytesIO (e.g. web upload, boto3, httpx)
text = TextSpitter(file_obj=BytesIO(pdf_bytes), filename="report.pdf")

# From raw bytes
text = TextSpitter(file_obj=some_bytes, filename="data.csv")

filename is required for streams: a stream carries no file extension of its own, so TextSpitter uses the name you pass to decide which reader to use.
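The two stream inputs above (a BytesIO object and raw bytes) and the filename hint can be pictured with a short sketch. The `as_stream` and `format_hint` helpers are hypothetical and only illustrate the idea; they are not part of TextSpitter's API, and the library may handle these cases differently internally.

```python
from io import BytesIO
from pathlib import PurePath
from typing import BinaryIO, Union


def as_stream(file_obj: Union[bytes, bytearray, BinaryIO]) -> BinaryIO:
    """Wrap raw bytes in BytesIO; pass file-like objects through unchanged."""
    if isinstance(file_obj, (bytes, bytearray)):
        return BytesIO(bytes(file_obj))
    return file_obj


def format_hint(filename: str) -> str:
    """Return the lowercased extension taken from the filename hint."""
    suffix = PurePath(filename).suffix.lower()
    if not suffix:
        # Without an extension there is nothing to select a reader by.
        raise ValueError("filename needs an extension to select a reader")
    return suffix


# Raw bytes and an existing stream end up with the same file-like interface,
# and the format comes from the name, not from the bytes themselves.
stream = as_stream(b"a,b\n1,2\n")
fmt = format_hint("data.csv")
```

Note that the filename here does not have to point to a real file on disk; in this sketch it serves only as a format hint.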

5 — Next steps