DocSet Functions
Functions for managing document sets (DocSets), which are collections of documents.
Create DocSet
Create a new DocSet to store documents.
Parameters
Example
Return Value
A DocSetMetadata object containing:
Unique identifier for the DocSet
Name of the DocSet
Creation timestamp
Boolean indicating if DocSet is read-only
Dictionary of custom properties
Size of DocSet in bytes
Schema object defining document properties
Dictionary of prompts for the DocSet
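As a rough illustration of the shape described above, the listed fields could be modeled as a small dataclass. This is only a sketch: the class name DocSetMetadataSketch, the helper describe, and every field name below are assumptions chosen to match the descriptions, not the SDK's actual attribute names.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict, Optional


@dataclass
class DocSetMetadataSketch:
    """Illustrative stand-in only; all field names are assumptions."""
    docset_id: str                 # unique identifier for the DocSet
    name: str                      # name of the DocSet
    created_at: datetime           # creation timestamp
    readonly: bool = False         # True if the DocSet is read-only
    properties: Dict[str, Any] = field(default_factory=dict)  # custom properties
    size: int = 0                  # size of the DocSet in bytes
    schema: Optional[Dict[str, Any]] = None  # schema defining document properties
    prompts: Dict[str, str] = field(default_factory=dict)     # prompts for the DocSet


def describe(meta: DocSetMetadataSketch) -> str:
    """Format a one-line, human-readable summary of a DocSet."""
    mode = "read-only" if meta.readonly else "read-write"
    return f"{meta.name} ({meta.docset_id}): {meta.size} bytes, {mode}"


example = DocSetMetadataSketch(
    docset_id="ds-123",
    name="contracts",
    created_at=datetime(2025, 1, 1, tzinfo=timezone.utc),
    size=2048,
)
print(describe(example))
```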
Get DocSet
Retrieve metadata for a DocSet.
Parameters
The unique identifier of the DocSet to retrieve
Example
Return Value
A DocSetMetadata object containing:
Unique identifier for the DocSet
Name of the DocSet
Creation timestamp
Boolean indicating if DocSet is read-only
Dictionary of custom properties
Size of DocSet in bytes
Schema object defining document properties
Dictionary of prompts for the DocSet
Exceptions
HTTPError 403: “No Aryn API Key provided”
HTTPError 403: “Invalid Aryn API key”
HTTPError 403: “Expired Aryn API key”
HTTPError 404: “DocSet not found”
HTTPError 5xx: Internal Server Error
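The status codes listed above fall into a few groups that callers usually handle differently. The sketch below shows one possible grouping; classify_error and should_retry are hypothetical helpers, not part of the SDK, and the choice to retry only 5xx responses is an assumption rather than documented behavior.

```python
def classify_error(status_code: int) -> str:
    """Map an HTTP status code to a coarse category (illustrative only)."""
    if status_code == 403:
        return "auth"        # missing, invalid, or expired API key
    if status_code == 404:
        return "not_found"   # the DocSet or document does not exist
    if 500 <= status_code <= 599:
        return "server"      # internal server error
    return "other"


def should_retry(status_code: int) -> bool:
    """Assumption: only server-side failures are worth retrying unchanged."""
    return classify_error(status_code) == "server"


for code in (403, 404, 503):
    print(code, classify_error(code), should_retry(code))
```

A 403 or 404 will not succeed on retry without changing the API key or the identifier, which is why this sketch separates them from 5xx errors.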
List DocSets
List all DocSets in the account.
Parameters
Example
Return Value
A paginated list of DocSetMetadata objects, each containing:
Unique identifier for the DocSet
Name of the DocSet
Creation timestamp
Boolean indicating if DocSet is read-only
Dictionary of custom properties
Size of DocSet in bytes
Schema object defining document properties
Dictionary of prompts for the DocSet
Exceptions
HTTPError 403: “No Aryn API Key provided”
HTTPError 403: “Invalid Aryn API key”
HTTPError 403: “Expired Aryn API key”
HTTPError 5xx: Internal Server Error
Delete DocSet
Delete a DocSet and all its documents.
Parameters
The unique identifier of the DocSet to delete
Example
Return Value
The metadata of the deleted DocSet
Exceptions
HTTPError 403: “No Aryn API Key provided”
HTTPError 403: “Invalid Aryn API key”
HTTPError 403: “Expired Aryn API key”
HTTPError 404: “DocSet not found”
HTTPError 5xx: Internal Server Error
Document Functions
Functions for managing individual documents within DocSets.
Add Document
Add a single document to the Aryn platform. This API calls DocParse to partition the document, and automatically extracts any properties registered as part of the DocSet schema.
Parameters
A file opened in binary mode, a path specified as either a str or PathLike instance, or an HTTP URL indicating the document to add. The path can be either a local path or an Amazon S3 URL starting with s3://; for S3 paths, you must have boto3 installed and AWS credentials set up in your environment. The URL can start with either https:// or http://.
The id of the DocSet into which to add the document.
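The accepted input forms described above can be told apart before a document is submitted. The following is a minimal sketch, not SDK code: classify_source and needs_boto3 are hypothetical helpers, and it assumes the boto3 requirement applies to s3:// paths only.

```python
import io
import os
import pathlib
from urllib.parse import urlparse


def classify_source(source) -> str:
    """Return which kind of document source was given (illustrative only)."""
    if hasattr(source, "read"):
        return "file_object"           # e.g. a file opened in binary mode
    text = os.fspath(source)           # accepts str and PathLike instances
    scheme = urlparse(text).scheme.lower()
    if scheme == "s3":
        return "s3"
    if scheme in ("http", "https"):
        return "http"
    return "local_path"


def needs_boto3(source) -> bool:
    """Assumption: only S3 sources require boto3 and AWS credentials."""
    return classify_source(source) == "s3"


for candidate in (
    io.BytesIO(b"%PDF-"),
    pathlib.Path("report.pdf"),
    "s3://my-bucket/report.pdf",
    "https://example.com/report.pdf",
):
    print(classify_source(candidate), needs_boto3(candidate))
```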
Example
Return Value
A DocumentMetadata object containing:
Account identifier
Document identifier
Document set identifier
Document name
Creation timestamp
Document size in bytes
MIME type of document
Custom document properties
Exceptions
HTTPError 403: “No Aryn API Key provided”
HTTPError 403: “Invalid Aryn API key”
HTTPError 403: “Expired Aryn API key”
HTTPError 404: “DocSet not found”
HTTPError 5xx: Internal Server Error
List Documents
List all documents in a DocSet.
Parameters
Example
Return Value
A paginated list of DocumentMetadata objects, each containing:
Account identifier
Document identifier
Document set identifier
Document name
Creation timestamp
Document size in bytes
MIME type of document
Custom document properties
Exceptions
HTTPError 403: “No Aryn API Key provided”
HTTPError 403: “Invalid Aryn API key”
HTTPError 403: “Expired Aryn API key”
HTTPError 404: “DocSet not found”
HTTPError 400: “Invalid filter parameters”
HTTPError 5xx: Internal Server Error
Get Document
Get a document by ID.
Parameters
Example
Return Value
Exceptions
HTTPError 403: “No Aryn API Key provided”
HTTPError 403: “Invalid Aryn API key”
HTTPError 403: “Expired Aryn API key”
HTTPError 404: “Document not found”
HTTPError 5xx: Internal Server Error
Delete Document
Delete a document by ID.
Parameters
Example
Return Value
The metadata of the deleted document
Exceptions
HTTPError 403: “No Aryn API Key provided”
HTTPError 403: “Invalid Aryn API key”
HTTPError 403: “Expired Aryn API key”
HTTPError 404: “Document not found”
HTTPError 5xx: Internal Server Error
Get Document Binary
Get the binary content of a document.
Parameters
Example
Return Value
The binary content of the document
Exceptions
HTTPError 403: “No Aryn API Key provided”
HTTPError 403: “Invalid Aryn API key”
HTTPError 403: “Expired Aryn API key”
HTTPError 404: “Document not found”
HTTPError 5xx: Internal Server Error
Asynchronous Document Functions
add_doc_async
Submit a document for asynchronous add_doc processing and get its task_id. The results of the task remain available in the system for 48 hours. Meant to be used with get_async_result.
Note: sending multiple asynchronous add_doc tasks at the same time does not guarantee that they will run simultaneously.
Parameters
A file opened in binary mode, a path specified as either a str or PathLike instance, or an HTTP URL indicating the document to add. The path can be either a local path or an Amazon S3 URL starting with s3://; for S3 paths, you must have boto3 installed and AWS credentials set up in your environment. The URL can start with either https:// or http://.
The id of the DocSet into which to add the document.
Example
Return Value
A dict containing the task_id of the submitted request.
Exceptions
User errors:
HTTPError: Error: status_code 403. Reason: "This async action requires you to upgrade your account plan" - Fix: Please upgrade your account.
HTTPError: Error: status_code 403. Reason: "No Aryn API Key provided" - Fix: Please provide an API key either as a parameter or specify it in the environment variable ARYN_API_KEY.
HTTPError: Error: status_code 403. Reason: "Invalid Aryn API key" - Fix: Please provide a valid API key either as a parameter or specify it in the environment variable ARYN_API_KEY.
HTTPError: Error: status_code 403. Reason: "Expired Aryn API key" - Fix: Please get a new API key.
HTTPError: Error: status_code 429. Reason: "Too many requests" - Fix: Please try again after some time. Each account is allowed 1000 tasks to run at a time.
HTTPError: Error: status_code 5xx. Reason: Internal Server Error
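Because a 429 response asks the caller to try again later, one common pattern is to retry the submission with exponential backoff. The sketch below illustrates that pattern only: TooManyRequests is a hypothetical stand-in for however a 429 surfaces in your code, submit_with_backoff is not an SDK function, and the delays are arbitrary choices.

```python
import time


class TooManyRequests(Exception):
    """Hypothetical stand-in for an HTTP 429 "Too many requests" error."""


def submit_with_backoff(submit, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call submit(); on TooManyRequests, wait and retry with doubling delays."""
    for attempt in range(max_attempts):
        try:
            return submit()
        except TooManyRequests:
            if attempt == max_attempts - 1:
                raise                    # out of attempts, surface the error
            sleep(base_delay * 2 ** attempt)


# Demonstration with a fake submit function that is throttled twice.
calls = {"count": 0}
delays = []


def fake_submit():
    calls["count"] += 1
    if calls["count"] <= 2:
        raise TooManyRequests()
    return {"task_id": "task-1"}


response = submit_with_backoff(fake_submit, sleep=delays.append)
print(response, delays)
```

Passing the sleep function as a parameter keeps the demonstration instant; in real use the default time.sleep applies.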
get_async_result
Gets the results of an asynchronous add_doc task by task_id. Meant to be used with add_doc_async.
Parameters
task_id: Required. A string of the task id to poll and attempt to get the result for.
aryn_api_key: An Aryn API key, provided as a string. You can get one for free at aryn.ai/get-started. Default is None (if not provided, the SDK checks the environment variable ARYN_API_KEY or looks in aryn_config, as described under aryn_config).
region: A string that specifies the region to use for the DocParse server. Valid values are US and None. Default is None, which uses the US region. Via the API, you can specify the region by modifying the base URL of the DocParse server.
aryn_config: An ArynConfig object (defined in aryn_sdk/config.py), used for finding an API key. If aryn_api_key is set, it overrides this. The default ArynConfig looks in the environment variable ARYN_API_KEY and then in the file ~/.aryn/config.yaml. Default is None (aryn-sdk looks in the aryn_api_key parameter, in your environment variables, and then in ~/.aryn/config.yaml).
ssl_verify: A bool that controls whether the client verifies the SSL certificate of the chosen DocParse server. ssl_verify is True by default, enforcing SSL verification.
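The lookup order described for the API key (explicit parameter first, then the ARYN_API_KEY environment variable, then the config file) can be sketched as follows. This is an illustration of the precedence only: resolve_api_key is a hypothetical helper, and reading ~/.aryn/config.yaml is represented by an injected callable because the file's layout is not described here.

```python
import os
from typing import Callable, Mapping, Optional


def resolve_api_key(
    explicit_key: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
    read_config_file: Callable[[], Optional[str]] = lambda: None,
) -> Optional[str]:
    """Return the first key found: parameter, then env var, then config file."""
    if explicit_key:
        return explicit_key                  # the parameter overrides everything
    env_key = env.get("ARYN_API_KEY")
    if env_key:
        return env_key                       # environment variable comes next
    return read_config_file()                # config file is the last resort


# Demonstration with fake sources instead of the real environment.
fake_env = {"ARYN_API_KEY": "key-from-env"}
print(resolve_api_key("key-from-param", env=fake_env))
print(resolve_api_key(env=fake_env))
print(resolve_api_key(env={}, read_config_file=lambda: "key-from-file"))
```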
Example
Return Value
A dict containing “task_status”, which can be “done” or “pending”. When “task_status” is “done”, the returned dict also contains “result”, which holds what would have been returned had add_doc been called directly. If there is an error ingesting the file itself, “task_status” is still “done”, but “result” contains an “error” field indicating what went wrong.
Exceptions
User errors:
HTTPError: Error: status_code 403. Reason: "This async action requires you to upgrade your account plan" - Fix: Please upgrade your account.
HTTPError: Error: status_code 403. Reason: "No Aryn API Key provided" - Fix: Please provide an API key either as a parameter or specify it in the environment variable ARYN_API_KEY.
HTTPError: Error: status_code 403. Reason: "Invalid Aryn API key" - Fix: Please provide a valid API key either as a parameter or specify it in the environment variable ARYN_API_KEY.
HTTPError: Error: status_code 403. Reason: "Expired Aryn API key" - Fix: Please get a new API key.
aryn_sdk.partition.partition.PartitionTaskNotFoundError. Reason: "No such task" - Fix: Check to make sure the task_id specified is correct.
HTTPError: Error: status_code 5xx. Reason: Internal Server Error
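Since a task reports either “pending” or “done”, callers typically poll until the status changes and then check the result for an “error” field. The sketch below shows one way to structure that loop. It does not call the SDK: wait_for_result is a hypothetical helper, fetch stands in for a call such as get_async_result, and the interval and timeout values are arbitrary.

```python
import time


def wait_for_result(fetch, task_id, interval=5.0, timeout=600.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Poll fetch(task_id) until task_status is "done", then return the result."""
    deadline = clock() + timeout
    while True:
        response = fetch(task_id)
        if response["task_status"] == "done":
            result = response["result"]
            # A finished task can still carry an ingestion error in its result.
            if isinstance(result, dict) and "error" in result:
                raise RuntimeError(f"task {task_id} failed: {result['error']}")
            return result
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still pending after {timeout} s")
        sleep(interval)


# Demonstration with canned responses instead of a real server.
responses = iter([
    {"task_status": "pending"},
    {"task_status": "pending"},
    {"task_status": "done", "result": {"doc_id": "doc-1"}},
])
result = wait_for_result(lambda task_id: next(responses), "task-1",
                         sleep=lambda seconds: None)
print(result)
```

Results remain available for a limited time after the task finishes, so the timeout here only bounds how long the caller is willing to wait.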
cancel_async_task
Cancels the task associated with the specified task_id.
Parameters
task_id: Required. A string of the task id to cancel.
aryn_api_key: An Aryn API key, provided as a string. You can get one for free at aryn.ai/get-started. Default is None (if not provided, the SDK checks the environment variable ARYN_API_KEY or looks in aryn_config, as described under aryn_config).
region: A string that specifies the region to use for the DocParse server. Valid values are US and None. Default is None, which uses the US region. Via the API, you can specify the region by modifying the base URL of the DocParse server.
aryn_config: An ArynConfig object (defined in aryn_sdk/config.py), used for finding an API key. If aryn_api_key is set, it overrides this. The default ArynConfig looks in the environment variable ARYN_API_KEY and then in the file ~/.aryn/config.yaml. Default is None (aryn-sdk looks in the aryn_api_key parameter, in your environment variables, and then in ~/.aryn/config.yaml).
ssl_verify: A bool that controls whether the client verifies the SSL certificate of the chosen DocParse server. ssl_verify is True by default, enforcing SSL verification.
Example
Return Value
No return value. An asynchronous task can be successfully cancelled only once. Once a task has been cancelled, any get_async_result call using that task’s id throws an exception.
Exceptions
User errors:
HTTPError: Error: status_code 403. Reason: "This async action requires you to upgrade your account plan" - Fix: Please upgrade your account.
HTTPError: Error: status_code 403. Reason: "No Aryn API Key provided" - Fix: Please provide an API key either as a parameter or specify it in the environment variable ARYN_API_KEY.
HTTPError: Error: status_code 403. Reason: "Invalid Aryn API key" - Fix: Please provide a valid API key either as a parameter or specify it in the environment variable ARYN_API_KEY.
HTTPError: Error: status_code 403. Reason: "Expired Aryn API key" - Fix: Please get a new API key.
aryn_sdk.partition.partition.PartitionTaskNotFoundError. Reason: "No such task" - Fix: Check to make sure the task_id specified is correct.
HTTPError: Error: status_code 5xx. Reason: Internal Server Error
list_async_tasks
Lists all the add_doc tasks still running in your account.
Parameters
aryn_api_key: An Aryn API key, provided as a string. You can get one for free at aryn.ai/get-started. Default is None (if not provided, the SDK checks the environment variable ARYN_API_KEY or looks in aryn_config, as described under aryn_config).
region: A string that specifies the region to use for the DocParse server. Valid values are US and None. Default is None, which uses the US region. Via the API, you can specify the region by modifying the base URL of the DocParse server.
aryn_config: An ArynConfig object (defined in aryn_sdk/config.py), used for finding an API key. If aryn_api_key is set, it overrides this. The default ArynConfig looks in the environment variable ARYN_API_KEY and then in the file ~/.aryn/config.yaml. Default is None (aryn-sdk looks in the aryn_api_key parameter, in your environment variables, and then in ~/.aryn/config.yaml).
ssl_verify: A bool that controls whether the client verifies the SSL certificate of the chosen DocParse server. ssl_verify is True by default, enforcing SSL verification.
Example
Return Value
A dict that maps each task_id to a dict containing details of the respective task.
Exceptions
User errors:
HTTPError: Error: status_code 403. Reason: "This async action requires you to upgrade your account plan" - Fix: Please upgrade your account.
HTTPError: Error: status_code 403. Reason: "No Aryn API Key provided" - Fix: Please provide an API key either as a parameter or specify it in the environment variable ARYN_API_KEY.
HTTPError: Error: status_code 403. Reason: "Invalid Aryn API key" - Fix: Please provide a valid API key either as a parameter or specify it in the environment variable ARYN_API_KEY.
HTTPError: Error: status_code 403. Reason: "Expired Aryn API key" - Fix: Please get a new API key.
HTTPError: Error: status_code 5xx. Reason: Internal Server Error
Properties Functions
Functions for managing document properties.
Update Document Properties
Update properties of a document.
Parameters
Example
Return Value
Exceptions
HTTPError 403: “No Aryn API Key provided”
HTTPError 403: “Invalid Aryn API key”
HTTPError 403: “Expired Aryn API key”
HTTPError 404: “Document not found”
HTTPError 5xx: Internal Server Error
Extract Properties
Extract properties from a document.
Parameters
Example
Return Value
A job status object containing:
exit_status: The exit status of the job
Exceptions
HTTPError 403: “No Aryn API Key provided”
HTTPError 403: “Invalid Aryn API key”
HTTPError 403: “Expired Aryn API key”
HTTPError 404: “DocSet not found”
HTTPError 5xx: Internal Server Error
Delete Properties
Delete properties from a document.
Parameters
Example
Return Value
A job status object
Exceptions
HTTPError 403: “No Aryn API Key provided”
HTTPError 403: “Invalid Aryn API key”
HTTPError 403: “Expired Aryn API key”
HTTPError 404: “DocSet not found”
HTTPError 5xx: Internal Server Error
Client Options
Parameters
region: A string that specifies the region to use for the DocParse server. Valid values are US and None. Default is None.
timeout: A float that specifies the timeout in seconds for the client. Default is 240.0.
Example
