Partition Document
POST /v1/document/partition
curl --request POST \
  --url https://api.aryn.cloud/v1/document/partition \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: multipart/form-data' \
  --form 'options={
    "selected_pages": [
      123
    ],
    "extract_images": false,
    "extract_image_format": "ppm",
    "extract_table_structure": false,
    "table_extraction_options": {
      "include_additional_text": false,
      "model_selection": "pixels > 500 -> deformable_detr; table_transformer"
    },
    "summarize_images": false,
    "text_mode": "standard",
    "use_ocr": true,
    "text_extraction_options": {
      "ocr_text_mode": "vision",
      "remove_line_breaks": true
    },
    "ocr_language": "english",
    "threshold": "auto",
    "chunking_options": {
      "strategy": "context_rich",
      "tokenizer": "openai_tokenizer",
      "tokenizer_options": {
        "model_name": "text-embedding-3-small"
      },
      "max_tokens": 123,
      "merge_across_pages": true
    },
    "output_format": "json",
    "output_label_options": {
      "title_candidate_elements": [
        "<string>"
      ],
      "promote_title": false,
      "orientation_correction": false
    },
    "markdown_options": {
      "include_pagenum": false,
      "include_headers": false,
      "include_footers": false
    }
  }'
{
  "status": [
    "<string>"
  ],
  "status_code": 123,
  "error": "<string>",
  "elements": [
    {
      "type": "<string>",
      "bbox": [
        123
      ],
      "properties": {},
      "text_representation": "<string>"
    }
  ],
  "markdown": "<string>"
}
This is the Aryn DocParse API for partitioning (and optionally chunking) a document synchronously.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Headers
Body
multipart/form-data
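The authorization header and the options form field from the curl example can be assembled programmatically. The following is a minimal Python sketch, not an official client: the helper name build_partition_request and the placeholder token are hypothetical, and the option values are copied from the example request. It only builds the request parts and does not send anything.

```python
import json

# Endpoint URL as shown in the curl example.
PARTITION_URL = "https://api.aryn.cloud/v1/document/partition"


def build_partition_request(token: str, options: dict) -> dict:
    """Assemble the parts of a partition request without sending it.

    Hypothetical helper for illustration; not part of any Aryn SDK.
    """
    return {
        "url": PARTITION_URL,
        # Bearer authentication header of the form "Bearer <token>".
        "headers": {"Authorization": f"Bearer {token}"},
        # The whole options object travels as one JSON-encoded form field.
        "data": {"options": json.dumps(options)},
    }


request = build_partition_request(
    token="YOUR_TOKEN",  # placeholder, not a real credential
    options={
        "text_mode": "standard",
        "use_ocr": True,
        "output_format": "json",
        "chunking_options": {"strategy": "context_rich", "max_tokens": 123},
    },
)
```

The document to partition is not shown in the curl example, so the sketch leaves it out; the name of that form field is not given on this page. Any HTTP client that encodes multipart/form-data can send the result. With the third-party requests library, for instance, a plain field can be passed through the files= argument as a (None, value) tuple so that the body is multipart-encoded.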
Response
200
application/json
Successful Response
The response is of type object.
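A response object with the fields shown in the example (status, status_code, error, elements, markdown) could be unpacked as in the Python sketch below. The Element class and parse_partition_response function are hypothetical names, the sample payload values are made up for illustration, and the sketch assumes that error is absent or empty on success, which this page does not state.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Element:
    """One entry of the "elements" array in the response."""
    type: str
    bbox: list
    properties: dict = field(default_factory=dict)
    text_representation: str = ""


def parse_partition_response(body: str):
    """Return (elements, markdown) from a JSON response body.

    Assumption: a non-empty "error" field signals failure.
    """
    payload = json.loads(body)
    if payload.get("error"):
        raise RuntimeError(payload["error"])
    elements = [
        Element(
            type=item.get("type", ""),
            bbox=item.get("bbox", []),
            properties=item.get("properties", {}),
            text_representation=item.get("text_representation", ""),
        )
        for item in payload.get("elements", [])
    ]
    return elements, payload.get("markdown")


# Made-up sample payload that follows the field names in the example response.
sample = json.dumps({
    "status": ["done"],
    "status_code": 200,
    "elements": [
        {
            "type": "Text",
            "bbox": [0.1, 0.2, 0.9, 0.3],
            "properties": {},
            "text_representation": "Hello",
        }
    ],
})
elements, markdown = parse_partition_response(sample)
```

Using .get with defaults keeps the parser tolerant of fields that are missing from a given response, such as markdown when the output format is JSON.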