Extract structured table data from PDFs using Aryn DocParse
The output includes a `table` element that contains the information about the table on the page.
The `table` element has a `cells` field, which is an array of cell objects, one for each cell in the table. Let’s focus on the first element of that list.
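To make the shape of that list concrete, here is a minimal sketch of how an array of cell objects could be laid out as a grid. The key names `content`, `rows`, and `cols`, the sample values, and the helper `cells_to_grid` are assumptions made for illustration; check them against the JSON you actually receive.

```python
# Hypothetical cell objects. The keys "content", "rows", and "cols" are
# assumptions for illustration, not a confirmed DocParse schema.
cells = [
    {"content": "Years ended December 31 (Millions)", "rows": [0], "cols": [0]},
    {"content": "2018", "rows": [0], "cols": [1]},
    {"content": "Free cash flow", "rows": [1], "cols": [0]},
    {"content": "$ 4,862", "rows": [1], "cols": [1]},
]


def cells_to_grid(cells):
    """Place each cell's text at every (row, col) position it covers."""
    n_rows = max(r for cell in cells for r in cell["rows"]) + 1
    n_cols = max(c for cell in cells for c in cell["cols"]) + 1
    grid = [["" for _ in range(n_cols)] for _ in range(n_rows)]
    for cell in cells:
        for r in cell["rows"]:
            for c in cell["cols"]:
                grid[r][c] = cell["content"]
    return grid


first_cell = cells[0]          # the first element of the list
grid = cells_to_grid(cells)    # list of rows, each a list of strings
```

Storing row and column indices as lists lets a single cell span several rows or columns, which is why the helper writes the text into every position the cell covers.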
We can use the `tables_to_pandas` function to turn the JSON into a pandas DataFrame and then perform some analysis on it:
| | Years ended December 31 (Millions) | 2018 | 2017 | 2016 |
|---|---|---|---|---|
| 0 | Major GAAP Cash Flow Categories | | | |
| 1 | Net cash provided by operating activities | $ 6,439 | 6,240 | $ 6,662 |
| 2 | Net cash provided by (used in) investing activities | 222 | (3,086) | (1,403) |
| 3 | Net cash used in financing activities. | (6,701) | (2,655) | (4,626) |
| 4 | Free Cash Flow (non-GAAP measure) | | | |
| 5 | Net cash provided by operating activities | $ 6,439 | $ 6,240 | 6,662 |
| 6 | Purchases of property, plant and equipment (PP&E | (1,577) | (1,373) | (1,420) |
| 7 | Free cash flow | $ 4,862 | $ 4,867 | $ 5,242 |
| 8 | Net income attributable to 3M | $ 5,349 | $ 4,858 | $ 5,050 |
| 9 | Free cash flow conversion | 91 % | 100 % | 104 % |
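The extracted values are strings with currency symbols, thousands separators, and parentheses for negatives, so any analysis has to normalize them first. The sketch below is one way to do that in plain Python and then recompute rows 7 and 9 from rows 5, 6, and 8 as a consistency check. The helper `to_number` and the hand-copied dictionaries are illustrative assumptions, not part of the Aryn SDK.

```python
import re


def to_number(text):
    """Parse strings like '$ 6,439', '(3,086)', or '91 %' into a float."""
    cleaned = re.sub(r"[$,%\s]", "", text)
    negative = cleaned.startswith("(")  # accounting style: (x) means -x
    value = float(cleaned.strip("()"))
    return -value if negative else value


# Values copied by hand from rows 5, 6, and 8 of the table above.
operating = {"2018": "$ 6,439", "2017": "$ 6,240", "2016": "6,662"}
ppe = {"2018": "(1,577)", "2017": "(1,373)", "2016": "(1,420)"}
net_income = {"2018": "$ 5,349", "2017": "$ 4,858", "2016": "$ 5,050"}

results = {}
for year in operating:
    # PP&E purchases are already negative, so adding them subtracts the outflow.
    free_cash_flow = to_number(operating[year]) + to_number(ppe[year])
    conversion_pct = round(100 * free_cash_flow / to_number(net_income[year]))
    results[year] = (free_cash_flow, conversion_pct)

for year, (fcf, pct) in results.items():
    print(f"{year}: free cash flow = {fcf:,.0f}, conversion = {pct} %")
```

The recomputed figures match the "Free cash flow" and "Free cash flow conversion" rows of the extracted table, which is a quick way to confirm the cells were parsed into the right columns.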