Using the Aryn SDK
Using DocParse with the Aryn SDK
Installation
We recommend installing the Aryn SDK library using pip, for example with pip install aryn-sdk.
Partitioning a Document
Partition a document like so:
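A minimal sketch of such a call. It assumes that partition_file is importable from aryn_sdk.partition, that it accepts an open binary file, and that the returned dict has an "elements" key. The names partition_document and partition_fn are hypothetical, added here so the SDK call can be swapped out in tests.

```python
from typing import Any, Callable, Optional


def partition_document(
    path: str,
    partition_fn: Optional[Callable[..., dict]] = None,
    **options: Any,
) -> list:
    """Partition the file at `path` and return its list of elements."""
    if partition_fn is None:
        # Imported lazily so this module loads even without aryn-sdk installed.
        # Assumption: partition_file lives in aryn_sdk.partition.
        from aryn_sdk.partition import partition_file as partition_fn

    # Options are forwarded as keyword arguments, e.g. use_ocr=True.
    with open(path, "rb") as f:
        data = partition_fn(f, **options)

    # Assumption: the response is a dict whose "elements" key lists the parsed elements.
    return data["elements"]
```

With the SDK installed and a key configured, partition_document("mydocument.pdf", use_ocr=True) would send the file to Aryn; the file name is a placeholder.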
partition_file takes the same options as curl, except as keyword arguments. You can find a list of options here.
Key management
By default, aryn-sdk looks for Aryn API keys first in the environment variable ARYN_API_KEY, and then in ~/.aryn/config.yaml. You can override this behavior by specifying a key directly or a different path to the Aryn config file:
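A sketch of both overrides, built as keyword arguments for partition_file. The keyword names aryn_api_key and aryn_config, and the ArynConfig class with its aryn_config_path argument, are assumptions about the SDK's interface; credential_kwargs is a hypothetical helper.

```python
from typing import Any, Dict, Optional


def credential_kwargs(
    api_key: Optional[str] = None,
    config_path: Optional[str] = None,
) -> Dict[str, Any]:
    """Build keyword arguments that override the default key lookup."""
    if api_key is not None:
        # Assumed keyword: pass the key directly.
        return {"aryn_api_key": api_key}
    if config_path is not None:
        # Assumed API: ArynConfig accepts the path of an alternate config file.
        from aryn_sdk.config import ArynConfig

        return {"aryn_config": ArynConfig(aryn_config_path=config_path)}
    # Nothing to override: the SDK falls back to ARYN_API_KEY,
    # then ~/.aryn/config.yaml.
    return {}
```

The result can be splatted into the call, for example partition_file(f, **credential_kwargs(api_key="YOUR-API-KEY")). An explicit key takes priority over a config path in this sketch, which is a design choice of the helper rather than documented SDK behavior.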
Helper Functions
aryn_sdk provides helper functions that make it easier to work with and visualize the output of partition_file.
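As an illustration, the sketch below picks the table elements out of a partition result and converts the first one to a pandas DataFrame. It assumes that elements carry a "type" field labelled "table" for tables and that aryn_sdk.partition exposes a table_elem_to_dataframe helper; table_elements and first_table_as_dataframe are hypothetical names.

```python
from typing import Any, Dict, List


def table_elements(data: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the elements of a partition_file result that are tables."""
    # Assumption: tables are labelled "table"; the comparison ignores case
    # in case the label's casing differs.
    return [
        e
        for e in data.get("elements", [])
        if str(e.get("type", "")).lower() == "table"
    ]


def first_table_as_dataframe(data: Dict[str, Any]):
    """Convert the first detected table to a DataFrame, or None if there is none."""
    tables = table_elements(data)
    if not tables:
        return None
    # Assumed helper name; it presumably needs a table that was partitioned
    # with table structure extraction enabled.
    from aryn_sdk.partition import table_elem_to_dataframe

    return table_elem_to_dataframe(tables[0])
```

The SDK is also believed to offer a draw_with_boxes helper for rendering detected bounding boxes onto page images; check the SDK reference for its exact signature before relying on it.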
Different File Formats
It is easy to process files with different formats using the aryn-sdk:
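A sketch of the idea: the same call is applied to each file regardless of its extension. Which formats the service accepts is not listed here, so the file names in the usage note are placeholders, and partition_many and partition_fn are hypothetical names.

```python
from typing import Any, Callable, Dict, Iterable, Optional


def partition_many(
    paths: Iterable[str],
    partition_fn: Optional[Callable[..., dict]] = None,
    **options: Any,
) -> Dict[str, dict]:
    """Partition each file in `paths` and map path -> raw result dict."""
    if partition_fn is None:
        # Assumption: partition_file lives in aryn_sdk.partition.
        from aryn_sdk.partition import partition_file as partition_fn

    results: Dict[str, dict] = {}
    for path in paths:
        # The call is identical for every format; only the file differs.
        with open(path, "rb") as f:
            results[path] = partition_fn(f, **options)
    return results
```

For example, partition_many(["report.pdf", "notes.docx", "slides.pptx"]) would process three files in sequence, assuming each of those formats is supported.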
Chunking a document
Chunking support was added in v0.1.9. You can enable the default chunking options by specifying an empty dict.
Here is an example specifying certain chunking options:
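The sketch below covers both cases: an empty dict for the defaults and a dict with explicit settings. The keyword name chunking_options and every key and value in CUSTOM_CHUNKING are assumptions to be checked against the chunking documentation; chunking_kwargs is a hypothetical helper.

```python
from typing import Any, Dict, Optional

# Assumed option names and values, shown for illustration only.
CUSTOM_CHUNKING: Dict[str, Any] = {
    "strategy": "context_rich",
    "tokenizer": "openai_tokenizer",
    "tokenizer_options": {"model_name": "text-embedding-3-small"},
    "merge_across_pages": True,
    "max_tokens": 512,
}


def chunking_kwargs(options: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build the keyword arguments that control chunking.

    None means "do not chunk" and adds nothing; an empty dict asks for the
    default chunking options; a non-empty dict requests specific settings.
    """
    if options is None:
        return {}
    # Copy so later changes to the caller's dict do not leak into the request.
    return {"chunking_options": dict(options)}
```

Usage would look like partition_file(f, **chunking_kwargs({})) for the defaults, or partition_file(f, **chunking_kwargs(CUSTOM_CHUNKING)) for explicit settings. The helper keeps the distinction between omitting the argument and passing an empty dict visible, since only the latter turns chunking on.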
The full chunking options are documented here.
Partitioning asynchronously
partition_file_async_submit submits a file partitioning task to Aryn and returns its task_id.
You can use the returned task_id to keep track of your request to partition the file. Poll for the result using another function, partition_file_async_result, which returns a dict with a “status” key that indicates whether the task is “done”, “pending”, or in an “error” state. Once the “status” is “done”, the dict returned by partition_file_async_result will also have a “result” key with the actual results. The following code snippet is an example of how you can use these functions:
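A minimal polling sketch based on the status values described above. The helper wait_for_result and its fetch, poll_seconds, timeout_seconds, and sleep parameters are hypothetical; fetch stands in for partition_file_async_result, and the commented usage assumes the submit call returns a dict with a "task_id" key.

```python
import time
from typing import Any, Callable, Dict


def wait_for_result(
    task_id: str,
    fetch: Callable[[str], Dict[str, Any]],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll `fetch(task_id)` until the task is done, fails, or times out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = fetch(task_id)
        status = response["status"]
        if status == "done":
            # The actual partitioning output sits under "result".
            return response["result"]
        if status == "error":
            raise RuntimeError(f"task {task_id} failed: {response}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still pending after {timeout_seconds}s")
        # Still "pending": wait before asking again.
        sleep(poll_seconds)


# Usage with the SDK (not executed here):
#   from aryn_sdk.partition import partition_file_async_submit, partition_file_async_result
#   with open("mydocument.pdf", "rb") as f:
#       task_id = partition_file_async_submit(f)["task_id"]
#   result = wait_for_result(task_id, partition_file_async_result)
```

The timeout is a choice of this sketch so that a stuck task does not block the caller forever.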
One way to do this with multiple requests at a time is shown below:
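A sketch of that pattern: submit every file first, then poll the outstanding tasks in rounds until none are pending. The function partition_all_async is hypothetical, submit and fetch stand in for partition_file_async_submit and partition_file_async_result, and the submit call is assumed to return a dict with a "task_id" key.

```python
import time
from typing import Any, Callable, Dict, Iterable, Optional


def partition_all_async(
    paths: Iterable[str],
    submit: Callable[..., Dict[str, Any]],
    fetch: Callable[[str], Dict[str, Any]],
    poll_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    **options: Any,
) -> Dict[str, Optional[Any]]:
    """Submit all files, then poll until each task is done or has failed.

    Returns a mapping of path -> result, with None for tasks that errored.
    """
    # Phase 1: submit everything so the tasks run concurrently on the server.
    pending: Dict[str, str] = {}
    for path in paths:
        with open(path, "rb") as f:
            pending[path] = submit(f, **options)["task_id"]

    # Phase 2: poll the outstanding tasks in rounds.
    results: Dict[str, Optional[Any]] = {}
    while pending:
        for path, task_id in list(pending.items()):
            response = fetch(task_id)
            if response["status"] == "done":
                results[path] = response["result"]
                del pending[path]
            elif response["status"] == "error":
                results[path] = None
                del pending[path]
        if pending:
            sleep(poll_seconds)
    return results
```

Submitting before polling means the total wait is roughly that of the slowest task rather than the sum of all tasks.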
Optionally, you can also set a webhook for Aryn’s services to call when your task is completed:
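A sketch of submitting with a webhook. The keyword name webhook_url is an assumption about the SDK's interface, as is the "task_id" key in the response; submit_with_webhook is a hypothetical wrapper and the URL in the usage note is a placeholder.

```python
from typing import Any, Callable, Dict, Optional


def submit_with_webhook(
    path: str,
    webhook_url: str,
    submit: Optional[Callable[..., Dict[str, Any]]] = None,
    **options: Any,
) -> str:
    """Submit `path` for asynchronous partitioning and return the task id."""
    if submit is None:
        # Assumption: the async submit function lives in aryn_sdk.partition.
        from aryn_sdk.partition import partition_file_async_submit as submit

    with open(path, "rb") as f:
        # Assumed keyword: the URL Aryn should call when the task completes.
        response = submit(f, webhook_url=webhook_url, **options)
    return response["task_id"]
```

For example, submit_with_webhook("mydocument.pdf", "https://example.com/aryn-webhook") would return a task id to store until the notification arrives.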
Aryn will POST a request with a body like the following:
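The exact payload is not reproduced here. The sketch below assumes a JSON body with a "done" list whose entries each carry a "task_id", and shows how a webhook handler might extract the finished task ids. Both the body shape and the completed_task_ids helper are assumptions to verify against the webhook documentation.

```python
import json
from typing import List, Union


def completed_task_ids(body: Union[str, bytes]) -> List[str]:
    """Extract task ids from a webhook request body.

    Assumed body shape: {"done": [{"task_id": "..."}, ...]}
    """
    payload = json.loads(body)
    # Skip entries without a task id instead of failing the whole request.
    return [item["task_id"] for item in payload.get("done", []) if "task_id" in item]
```

Each returned id can then be passed to partition_file_async_result to fetch the finished result.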
For more information, see the Aryn SDK documentation.