Quick Start - TextSpitter


Get up and running in under two minutes

1 — Install

pip

pip install textspitter

# With optional loguru logging
pip install "textspitter[logging]"

uv (library dependency)

uv add textspitter
uv add "textspitter[logging]"

uv tool (standalone CLI)

Installs textspitter as an isolated tool, so you do not have to manage a virtual environment yourself.

uv tool install textspitter

2 — Extract your first file

from TextSpitter import TextSpitter

text = TextSpitter(filename="path/to/document.pdf")
print(text[:500])

TextSpitter() auto-detects the format from the file extension, picks the right reader, and returns a plain str.
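To make the dispatch idea concrete, here is a minimal sketch of extension-based reader selection. It is not TextSpitter's actual implementation: the `READERS` table, `pick_reader`, `extract`, and the two toy readers are hypothetical names introduced only for illustration.

```python
from pathlib import Path


def read_txt(data: bytes) -> str:
    # Hypothetical reader: decode plain text as UTF-8.
    return data.decode("utf-8")


def read_csv(data: bytes) -> str:
    # Hypothetical toy reader: swap commas for tabs (no quoting rules handled).
    return data.decode("utf-8").replace(",", "\t")


# Hypothetical mapping from lowercase file extension to reader function.
READERS = {".txt": read_txt, ".csv": read_csv}


def pick_reader(filename: str):
    """Return the reader registered for the file's extension."""
    suffix = Path(filename).suffix.lower()
    try:
        return READERS[suffix]
    except KeyError:
        raise ValueError(f"unsupported file type: {suffix!r}") from None


def extract(data: bytes, filename: str) -> str:
    """Pick a reader by extension and return its result as a plain str."""
    return pick_reader(filename)(data)
```

Under this scheme the extension is the only signal used, which is why a renamed or extension-less file would be rejected or routed to the wrong reader.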

3 — Use the CLI

# Single file to stdout
textspitter report.pdf

# Multiple files combined into a single output file
textspitter chapter1.pdf chapter2.pdf -o book.txt
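If you need to drive the CLI from a Python script, the two invocations above can be wrapped with `subprocess`. This is a sketch under assumptions: `build_command` and `run_textspitter` are hypothetical helper names, the `textspitter` executable is assumed to be on `PATH`, and only the `-o` option shown above is used.

```python
import subprocess
from typing import List, Optional, Sequence


def build_command(files: Sequence[str], output: Optional[str] = None) -> List[str]:
    """Build the argument list for the CLI invocations shown above."""
    if not files:
        raise ValueError("at least one input file is required")
    cmd = ["textspitter", *files]
    if output is not None:
        # -o is the only option this quick start documents.
        cmd += ["-o", output]
    return cmd


def run_textspitter(files: Sequence[str], output: Optional[str] = None) -> str:
    """Run the CLI and return whatever it wrote to stdout."""
    result = subprocess.run(
        build_command(files, output),
        capture_output=True,
        text=True,
        check=True,  # assumes the CLI exits non-zero on failure
    )
    return result.stdout
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.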

4 — Work with streams

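from io import BytesIO
from pathlib import Path
from TextSpitter import TextSpitter

# Any source of bytes works (e.g. web upload, boto3, httpx); a local read stands in here
pdf_bytes = Path("report.pdf").read_bytes()

# From BytesIO
text = TextSpitter(file_obj=BytesIO(pdf_bytes), filename="report.pdf")

# From raw bytes
csv_bytes = Path("data.csv").read_bytes()
text = TextSpitter(file_obj=csv_bytes, filename="data.csv")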

filename is required for streams and raw bytes so TextSpitter knows which reader to use.
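The reason is easy to see with the standard library alone: an in-memory `BytesIO` or a `bytes` value carries no file name, so there is no extension to inspect. The sketch below illustrates that constraint; `resolve_suffix` is a hypothetical helper, not part of TextSpitter's API.

```python
from io import BytesIO
from pathlib import Path
from typing import Optional


def resolve_suffix(file_obj, filename: Optional[str] = None) -> str:
    """Return the lowercase extension that would select a reader (hypothetical)."""
    # Prefer the explicit filename; otherwise fall back to a .name attribute,
    # which BytesIO and bytes objects do not have.
    name = filename or getattr(file_obj, "name", None)
    # Open file objects can expose an integer descriptor as .name, so check the type.
    if not isinstance(name, str) or not Path(name).suffix:
        raise ValueError("a filename with an extension is required for in-memory data")
    return Path(name).suffix.lower()
```

With this helper, `resolve_suffix(BytesIO(data), "report.pdf")` succeeds, while `resolve_suffix(BytesIO(data))` raises `ValueError` because nothing identifies the format.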

5 — Next steps

Tutorial — format-by-format walkthrough
Use Cases — FastAPI, S3, LangChain patterns
Recipes — copy-paste snippets
API Reference — full API documentation