Get up and running in under two minutes
1 — Install
pip
pip install textspitter
# With optional loguru logging
pip install "textspitter[logging]"
uv (library dependency)
uv add textspitter
uv add "textspitter[logging]"
uv tool (standalone CLI)
Installs textspitter as an isolated tool — no virtual environment management required.
uv tool install textspitter
2 — Extract your first file
from TextSpitter import TextSpitter
text = TextSpitter(filename="path/to/document.pdf")
print(text[:500])
TextSpitter() auto-detects the format from the file extension, picks the right reader, and returns a plain str.
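Because the result is a plain str, extraction composes with ordinary Python code. The sketch below collects text from several files into a dict. The helper `extract_many` and the stand-in `fake_extract` are hypothetical names introduced here for illustration and are not part of the library; the extractor is passed in as a parameter so the example runs without any documents on disk.

```python
from pathlib import Path
from typing import Callable, Dict, Iterable


def extract_many(paths: Iterable[str], extract: Callable[..., str]) -> Dict[str, str]:
    """Call extract(filename=...) for each path and collect the text by path."""
    results: Dict[str, str] = {}
    for path in paths:
        results[str(path)] = extract(filename=str(path))
    return results


def fake_extract(filename: str) -> str:
    # Hypothetical stand-in for TextSpitter: reports the extension it was given
    # instead of reading a real file.
    return f"text from {Path(filename).suffix.lstrip('.')} file"


texts = extract_many(["report.pdf", "data.csv"], fake_extract)
print(texts["report.pdf"][:500])
```

With the package installed, passing TextSpitter itself as `extract` should behave the same way, assuming it accepts the `filename=` keyword exactly as in the snippet above.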
3 — Use the CLI
# Single file to stdout
textspitter report.pdf
# Multiple files saved to a combined output
textspitter chapter1.pdf chapter2.pdf -o book.txt
4 — Work with streams
from io import BytesIO
from TextSpitter import TextSpitter
# From BytesIO (e.g. web upload, boto3, httpx)
text = TextSpitter(file_obj=BytesIO(pdf_bytes), filename="report.pdf")
# From raw bytes
text = TextSpitter(file_obj=some_bytes, filename="data.csv")
filename is required for streams so TextSpitter knows which reader to use.
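Since the reader is chosen from the filename, it can help to check for an extension before handing a stream over. The following is a minimal sketch of such a guard. `extract_upload` and `fake_extract` are hypothetical names used only for illustration, and the extension check is an assumption about what makes a usable filename, not documented library behavior.

```python
from io import BytesIO
from pathlib import Path
from typing import Callable


def extract_upload(data: bytes, filename: str, extract: Callable[..., str]) -> str:
    """Wrap raw bytes in BytesIO and pass them on with a filename hint."""
    if not Path(filename).suffix:
        # Without an extension there is nothing to select a reader from.
        raise ValueError(f"filename {filename!r} has no extension")
    return extract(file_obj=BytesIO(data), filename=filename)


def fake_extract(file_obj: BytesIO, filename: str) -> str:
    # Hypothetical stand-in for TextSpitter: decodes the bytes as UTF-8
    # instead of parsing a real document format.
    return file_obj.read().decode("utf-8")


print(extract_upload(b"a,b\n1,2\n", "data.csv", fake_extract))
```

In real use, TextSpitter would be passed as `extract`, relying on the `file_obj=` and `filename=` keywords shown above.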
5 — Next steps
- Tutorial — format-by-format walkthrough
- Use Cases — FastAPI, S3, LangChain patterns
- Recipes — copy-paste snippets
- API Reference — full API documentation