Show HN: Extract Tables from Any Website – Images to JSON via OCR

1 points by valliappanr 13 hours ago

I built a two-step open-source tool that extracts tables from any website, including the hard ones that rely on dynamic rendering or heavy CSS styling.

Step 1: Capture tables as images using a headless browser.

Step 2: Run OCR to convert them into structured JSON.
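
To give a rough idea of what Step 2 involves, here is a minimal sketch of turning OCR word boxes into JSON rows. It is not the code from the repo. The input format (a list of dicts with `text`, `x`, `y`), the function names `words_to_rows` and `rows_to_json`, and the `row_tol` parameter are all assumptions made for illustration. It also assumes one word per cell and a header in the first row; real tables with multi-word cells would need column clustering on top of this.

```python
import json

def words_to_rows(words, row_tol=10):
    """Group OCR words into table rows.

    `words` is a hypothetical OCR result: dicts with "text", "x", "y"
    (top-left pixel position). Words whose y is within `row_tol` pixels
    of the first word in the current row are treated as the same row.
    """
    rows = []
    for word in sorted(words, key=lambda w: (w["y"], w["x"])):
        if rows and abs(word["y"] - rows[-1][0]["y"]) <= row_tol:
            rows[-1].append(word)
        else:
            rows.append([word])
    # Within each row, order cells left to right.
    return [[w["text"] for w in sorted(row, key=lambda w: w["x"])] for row in rows]

def rows_to_json(rows):
    """Use the first row as the header and emit one JSON object per data row."""
    if not rows:
        return "[]"
    header, *body = rows
    return json.dumps([dict(zip(header, row)) for row in body])

# Example input, as an OCR engine might report it for a 2x2 table.
sample_words = [
    {"text": "Name", "x": 10, "y": 5},
    {"text": "Price", "x": 120, "y": 7},
    {"text": "Widget", "x": 12, "y": 40},
    {"text": "9.99", "x": 118, "y": 38},
]
table_json = rows_to_json(words_to_rows(sample_words))
print(table_json)
```

Grouping by position rather than by HTML structure is what lets this approach ignore how the page was built, at the cost of depending on OCR accuracy and a sensible row tolerance.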

This works well when traditional HTML parsers fail, for example on tables with complex styling, merged cells, or JS-rendered content.

GitHub: https://github.com/enterpriseqa/extract_tables_from_websites

Examples are included. Feedback and contributions are welcome!