For complete documentation, see:
Semantically chunk text based on sentence similarity.
curl -X POST http://localhost:3001/api/chunkit \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN_HERE" \
  -d '{
    "documents": [
      {
        "document_name": "test",
        "document_text": "Your text here..."
      }
    ],
    "options": {
      "maxTokenSize": 500,
      "similarityThreshold": 0.5,
      "onnxEmbeddingModel": "Xenova/all-MiniLM-L6-v2",
      "dtype": "q8"
    }
  }'
Pack sentences into dense chunks up to max token size.
curl -X POST http://localhost:3001/api/cramit \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN_HERE" \
  -d '{
    "documents": [{"document_text": "Your text here..."}],
    "options": {"maxTokenSize": 500}
  }'
Split text into individual sentences.
curl -X POST http://localhost:3001/api/sentenceit \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN_HERE" \
  -d '{
    "documents": [{"document_text": "Your text here..."}],
    "options": {}
  }'
Health check endpoint.
curl http://localhost:3001/api/health
Get API version information.
curl http://localhost:3001/api/version
⚠️ Authentication is ENABLED
All API endpoints require a Bearer token in the Authorization header:
Authorization: Bearer YOUR_TOKEN_HERE
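In a Python client, that same header can be attached with the standard library alone; the URL and token below are placeholders mirroring the curl examples:

```python
import json
import urllib.request

token = "YOUR_TOKEN_HERE"  # placeholder; substitute your API_AUTH_TOKEN value
payload = {"documents": [{"document_text": "Your text here..."}]}

req = urllib.request.Request(
    "http://localhost:3001/api/sentenceit",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    },
    method="POST",
)
# urllib.request.urlopen(req) would send it; omitted here since it
# requires the server to be running.
```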
# Using docker-compose (recommended)
docker-compose up -d semantic-chunking-api

# Or with docker run
docker run -p 3001:3001 \
  -v ./models:/app/models \
  -e API_AUTH_TOKEN=your-token \
  semantic-chunking
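A compose service matching the `docker run` flags above might look like the sketch below; the service name, ports, volume, and environment variable mirror the commands shown, while everything else is an assumption about your setup:

```yaml
services:
  semantic-chunking-api:
    image: semantic-chunking
    ports:
      - "3001:3001"        # expose the API on host port 3001
    volumes:
      - ./models:/app/models   # cache downloaded ONNX models on the host
    environment:
      - API_AUTH_TOKEN=your-token
```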