This page documents the Aryn SDK Partition module. All parameters are optional unless specified otherwise.

Synchronous Partitioning Functions

partition_file

Sends a file to Aryn DocParse and returns a Python dictionary with elements containing its document structure and text.
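
As a rough illustration of working with the returned dictionary, the sketch below walks a hand-built stand-in response rather than calling the service. The `elements` key follows the description above; the `type` and `text_representation` field names and all sample values are assumptions made for this example and are not confirmed by this page.

```python
from collections import Counter

# Hand-built stand-in for a partition_file response (hypothetical values).
sample_response = {
    "elements": [
        {"type": "Title", "text_representation": "Quarterly Report"},
        {"type": "Text", "text_representation": "Revenue grew in Q3."},
        {"type": "table", "text_representation": "Region Sales"},
        {"type": "Text", "text_representation": "See the appendix."},
    ]
}


def count_element_types(response: dict) -> Counter:
    """Count how many elements of each type the response contains."""
    return Counter(el.get("type", "unknown") for el in response.get("elements", []))


def collect_text(response: dict, wanted_type: str) -> list[str]:
    """Return the text of every element whose type matches wanted_type."""
    return [
        el.get("text_representation", "")  # assumed field name
        for el in response.get("elements", [])
        if el.get("type") == wanted_type
    ]


type_counts = count_element_types(sample_response)
body_text = collect_text(sample_response, "Text")
```

Using `.get` with defaults keeps the helpers tolerant of elements that lack a field, which seems prudent when the exact element schema is not known in advance.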

Asynchronous Partitioning Functions

partition_file_async_submit

Submits a document for asynchronous partitioning and returns its task_id. The results of the task remain available in the system for 48 hours. Meant to be used with partition_file_async_result. Note: submitting multiple asynchronous partitioning tasks at the same time does not guarantee that they will run simultaneously.

partition_file_async_result

Gets the results of an asynchronous partitioning task by task_id. Meant to be used with partition_file_async_submit.

partition_file_async_cancel

Cancels the task associated with the specified task_id.

partition_file_async_list

Lists all the partition_file tasks still running in your account.

Helper Functions

convert_image_element

Convert an image element to a more usable format. If no format is specified, create a PIL Image object. If a format is specified, output the bytes of the image in that format. If b64encode is set to True, base64-encode the bytes and return them as a string.
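
The bytes and base64 output modes described above can be illustrated with the standard library alone. The sketch below is not the SDK's implementation: `encode_image_bytes` is a hypothetical helper that covers only the step after the image has been serialized to bytes, and it leaves out the PIL Image branch entirely.

```python
import base64
from typing import Union


def encode_image_bytes(image_bytes: bytes, b64encode: bool = False) -> Union[bytes, str]:
    """Return raw bytes, or a base64 ASCII string when b64encode is True."""
    if b64encode:
        return base64.b64encode(image_bytes).decode("ascii")
    return image_bytes


# The first eight bytes of any PNG file, used here as tiny sample data.
png_signature = b"\x89PNG\r\n\x1a\n"

as_text = encode_image_bytes(png_signature, b64encode=True)
round_trip = base64.b64decode(as_text)  # decoding recovers the original bytes
```

A base64 string is convenient when the image has to travel inside JSON or HTML, where raw bytes cannot be embedded directly.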

draw_with_boxes

Create a list of images from the provided PDF, one per page, with the bounding boxes detected by the partitioner drawn on each page.

table_elem_to_dataframe

Create a pandas DataFrame representing the tabular data inside the provided table element. If the element is not of type table or doesn’t contain any table data, return None instead.

table_elem_to_html

Convert the tabular data inside the provided table element into an HTML string. If the element is not of type table or doesn’t contain any table data, return None instead.
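
To make the "return None unless it is a table with data" behavior concrete, here is a minimal sketch of a table-to-HTML conversion. It is an illustration only: the `cells` field and its `row`, `col`, and `content` keys are a hypothetical layout chosen for this example, and `cells_to_html` is not the SDK function.

```python
from html import escape
from typing import Optional


def cells_to_html(element: dict) -> Optional[str]:
    """Render a table element's cells as an HTML table, or None if not applicable."""
    if element.get("type") != "table":
        return None
    cells = element.get("cells")  # hypothetical field name
    if not cells:
        return None
    n_rows = max(c["row"] for c in cells) + 1
    n_cols = max(c["col"] for c in cells) + 1
    # Start from an empty grid so missing cells render as empty <td> tags.
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for c in cells:
        grid[c["row"]][c["col"]] = escape(c["content"])
    rows = ["<tr>" + "".join(f"<td>{v}</td>" for v in row) + "</tr>" for row in grid]
    return "<table>" + "".join(rows) + "</table>"


table_element = {
    "type": "table",
    "cells": [
        {"row": 0, "col": 0, "content": "Region"},
        {"row": 0, "col": 1, "content": "Sales"},
        {"row": 1, "col": 0, "content": "EMEA"},
        {"row": 1, "col": 1, "content": "< 100"},
    ],
}

html_out = cells_to_html(table_element)
not_a_table = cells_to_html({"type": "Text"})
empty_table = cells_to_html({"type": "table", "cells": []})
```

Escaping cell content before insertion keeps characters such as `<` from being interpreted as markup in the generated HTML.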

tables_to_pandas

For every table element in the provided partitioning response, create a pandas DataFrame representing the tabular data. Return a list containing all the elements, with tables paired with their corresponding DataFrames.

tables_to_html

For every table element in the provided partitioning response, create an HTML string representing the tabular data. Return a list containing all the elements, with tables paired with their corresponding HTML.
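
One reading of the pairing described above is a list of (element, converted) tuples in which non-table elements carry None. The sketch below follows that reading with a hypothetical `pair_with_conversion` helper and a stand-in converter; the actual return shape of tables_to_pandas and tables_to_html should be checked against the SDK itself.

```python
from typing import Any, Callable, Optional


def pair_with_conversion(
    response: dict,
    convert: Callable[[dict], Optional[Any]],
) -> list[tuple[dict, Optional[Any]]]:
    """Pair every element with its converted table, or None for non-tables."""
    pairs = []
    for element in response.get("elements", []):
        converted = convert(element) if element.get("type") == "table" else None
        pairs.append((element, converted))
    return pairs


def fake_convert(element: dict) -> str:
    # Stand-in for a real table converter; "text" is a made-up field.
    return f"<table>{element['text']}</table>"


demo_response = {
    "elements": [
        {"type": "Text", "text": "Intro"},
        {"type": "table", "text": "A B"},
    ]
}

paired = pair_with_conversion(demo_response, fake_convert)
only_tables = [converted for _, converted in paired if converted is not None]
```

Keeping every element in the output preserves document order, so a caller can still tell where each table sat relative to the surrounding text.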