TabPedia

This repository is the codebase for TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy

News

  • [2024/08/27] 🔥 The training code is coming soon.
  • [2024/08/27] 🔥 Released the inference code for visual table understanding tasks. Due to company copyright restrictions, we use InternLM-7B-chat as the LLM.

Installation

  • This codebase has been tested with CUDA 11.8 on an A100-SXM 80 GB GPU. Install the dependencies as follows; a quick sanity check is shown after the commands.
    conda create -n TabPedia python=3.10 -y && conda activate TabPedia
    pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
    pip install packaging && pip install ninja && pip install flash-attn==2.3.6 --no-build-isolation --no-cache-dir
    pip install -r requirements.txt
    git clone https://github.com/InternLM/xtuner.git
    cd xtuner
    git checkout 9bce7b
    pip install -e '.[all]'
    
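After installation, you can run a quick sanity check (a minimal sketch, assuming the steps above completed without errors) to confirm that the CUDA build of PyTorch sees the GPU and that flash-attn imports cleanly:

    # sanity_check.py: verify the CUDA PyTorch build and the flash-attn install
    import torch
    import flash_attn  # raises ImportError if the flash-attn build failed

    print(torch.__version__)          # expected: 2.0.1+cu118
    print(torch.cuda.is_available())  # expected: True on an A100 with CUDA 11.8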

Quick Start

  • Download the official CLIP ViT-L/224 checkpoint from 🤗 Hugging Face and save it to ./pretrained_pth/CLIP-ViT-Large.
  • Download our pretrained model from 🤗 TabPedia_v1.0 and save it to ./pretrained_pth.
  • Set CLIP_L_224px_pretrained_pth and llm_name_or_path in tools/configs/Internlm2_7b_chat_TabPedia.py to the paths above (see the sketch after this list).
  • Finally, run the evaluation script to generate predictions; the results are written to ./results.
    bash eval_TabPedia.sh
    
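For reference, after editing, the two entries in tools/configs/Internlm2_7b_chat_TabPedia.py might look like the sketch below. The exact layout of the config file and the TabPedia checkpoint directory name are assumptions based on the download steps above; adjust them to your local paths.

    # Excerpt of tools/configs/Internlm2_7b_chat_TabPedia.py (sketch, not verbatim).
    # Point both entries at the checkpoints downloaded in Quick Start.
    CLIP_L_224px_pretrained_pth = './pretrained_pth/CLIP-ViT-Large'  # CLIP ViT-L/224 weights
    llm_name_or_path = './pretrained_pth/TabPedia_v1.0'              # assumed TabPedia_v1.0 directory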

Citation

If you find this work useful, please consider citing our paper:

@article{zhao2024tabpedia,
  title={TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy},
  author={Zhao, Weichao and Feng, Hao and Liu, Qi and Tang, Jingqun and Wei, Shu and Wu, Binghong and Liao, Lei and Ye, Yongjie and Liu, Hao and Li, Houqiang and others},
  journal={arXiv preprint arXiv:2406.01326},
  year={2024}
}

Acknowledgement

  • XTuner: the codebase we built upon.