This page documents the Aryn SDK Partition module. All parameters are optional unless specified otherwise.

Synchronous Partitioning Functions

partition_file

Sends a file to Aryn DocParse and returns a Python dictionary with elements containing its document structure and text.
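As a rough illustration of working with the returned dictionary, the sketch below walks a hand-written response and collects text by element type. The `elements`, `type`, and `text_representation` keys and all sample values are assumptions made for illustration, and `text_by_type` is a hypothetical helper, not part of the SDK. Check an actual response for the exact schema.

```python
from collections import defaultdict

# Hand-written stand-in for a partition_file response (the shape is an assumption).
sample_response = {
    "elements": [
        {"type": "Title", "text_representation": "Quarterly Report"},
        {"type": "Text", "text_representation": "Revenue grew in Q3."},
        {"type": "Text", "text_representation": "Costs were flat."},
        {"type": "Image"},  # elements without text are skipped
    ]
}


def text_by_type(response):
    """Group the text of each element under its element type (hypothetical helper)."""
    grouped = defaultdict(list)
    for element in response.get("elements", []):
        text = element.get("text_representation")
        if text:
            grouped[element.get("type", "unknown")].append(text.strip())
    return dict(grouped)


grouped_text = text_by_type(sample_response)
```

Using `.get` with defaults keeps the loop from failing on elements that lack a text field.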

Asynchronous Partitioning Functions

partition_file_async_submit

Submits a document for asynchronous partitioning and returns its task_id. The results of the task remain available in the system for 48 hours. Meant to be used with partition_file_async_result. Note: submitting multiple asynchronous partitioning tasks at the same time does not guarantee that they will run concurrently.

partition_file_async_result

Gets the results of an asynchronous partitioning task by task_id. Meant to be used with partition_file_async_submit.

partition_file_async_cancel

Cancels the task associated with the task_id specified.

partition_file_async_list

Lists all the partition_file tasks still running in your account.

Helper Functions

convert_image_element

Convert an image element to a more usable format. If no format is specified, create a PIL Image object. If a format is specified, output the bytes of the image in that format. If b64encode is set to True, base64-encode the bytes and return them as a string.
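When the base64 string form is requested, the caller typically needs to turn it back into bytes or embed it somewhere. The sketch below uses only the standard library and a few sample bytes in place of real image data. It does not call convert_image_element, and the `image/png` MIME type in the data URI is an assumption that should match whatever format was actually requested.

```python
import base64

# Stand-in for the string returned when b64encode is True.
# The PNG signature bytes are used only as recognizable sample data.
sample_bytes = b"\x89PNG\r\n\x1a\n"
encoded = base64.b64encode(sample_bytes).decode("ascii")

# Recover the original bytes, for example before writing them to a file.
decoded = base64.b64decode(encoded)

# A base64 string can also be embedded in HTML as a data URI.
data_uri = f"data:image/png;base64,{encoded}"
```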

draw_with_boxes

Create a list of images from the provided PDF, one for each page, with bounding boxes detected by the partitioner drawn on.

table_elem_to_dataframe

Create a pandas DataFrame representing the tabular data inside the provided table element. If the element is not of type table or doesn’t contain any table data, return None instead.

tables_to_pandas

For every table element in the provided partitioning response, create a pandas DataFrame representing the tabular data. Return a list containing all the elements, with tables paired with their corresponding DataFrames.
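A common next step is to keep only the DataFrames and drop the non-table elements. The sketch below assumes the returned list consists of `(element, dataframe_or_None)` pairs, which is one reading of the description above and should be verified against the SDK. Strings stand in for DataFrames so the example runs without pandas, and `only_tables` is a hypothetical helper.

```python
# Stand-in for the returned list (the pair shape is an assumption).
# Strings take the place of DataFrames so the sketch runs without pandas.
sample_pairs = [
    ({"type": "Title"}, None),
    ({"type": "table"}, "<DataFrame A>"),
    ({"type": "Text"}, None),
    ({"type": "table"}, "<DataFrame B>"),
]


def only_tables(pairs):
    """Return the second item of each pair, skipping non-table elements."""
    return [frame for _element, frame in pairs if frame is not None]


frames = only_tables(sample_pairs)
```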
