TabPedia

This repository is the codebase for TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy

News

  • [2024/08/27] 🔥 The training code is coming soon.
  • [2024/08/27] 🔥 Released the inference code for visual table understanding tasks. Due to company copyright restrictions, we use InternLM-7B-chat as the LLM.

Installation

  • This codebase has been tested with CUDA 11.8 on an A100-SXM 80 GB GPU. Install the dependencies as follows; a quick sanity check is shown after the commands.
    conda create -n TabPedia python=3.10 -y && conda activate TabPedia
    pip install torch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 --index-url https://download.pytorch.org/whl/cu118
    pip install packaging && pip install ninja && pip install flash-attn==2.3.6 --no-build-isolation --no-cache-dir
    pip install -r requirements.txt
    git clone https://github.com/InternLM/xtuner.git
    cd xtuner
    git checkout 9bce7b
    pip install -e '.[all]'
    
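After installation, you can run a quick sanity check (a minimal sketch, assuming the steps above completed without errors) to confirm that the CUDA build of PyTorch sees the GPU and that flash-attn imports cleanly:

    # sanity_check.py: verify the CUDA PyTorch build and the flash-attn install
    import torch
    import flash_attn  # raises ImportError if the flash-attn build failed

    print(torch.__version__)          # expected: 2.0.1+cu118
    print(torch.cuda.is_available())  # expected: True on an A100 with CUDA 11.8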

Quick Start

  • Download the official CLIP ViT-L/224 checkpoint from 🤗 Hugging Face and save it to ./pretrained_pth/CLIP-ViT-Large.
  • Download our pretrained model from 🤗 TabPedia_v1.0 and save it to ./pretrained_pth.
  • Set CLIP_L_224px_pretrained_pth and llm_name_or_path in tools/configs/Internlm2_7b_chat_TabPedia.py to the paths above (see the sketch after this list).
  • Finally, run the evaluation script to generate predictions; the results are written to ./results.
    bash eval_TabPedia.sh
    
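For reference, after editing, the two entries in tools/configs/Internlm2_7b_chat_TabPedia.py might look like the sketch below. The exact layout of the config file and the TabPedia checkpoint directory name are assumptions based on the download steps above; adjust them to your local paths.

    # Excerpt of tools/configs/Internlm2_7b_chat_TabPedia.py (sketch, not verbatim).
    # Point both entries at the checkpoints downloaded in Quick Start.
    CLIP_L_224px_pretrained_pth = './pretrained_pth/CLIP-ViT-Large'  # CLIP ViT-L/224 weights
    llm_name_or_path = './pretrained_pth/TabPedia_v1.0'              # assumed TabPedia_v1.0 directory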

Citation

If you find this work useful, please consider citing our paper:

@article{zhao2024tabpedia,
  title={TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy},
  author={Zhao, Weichao and Feng, Hao and Liu, Qi and Tang, Jingqun and Wei, Shu and Wu, Binghong and Liao, Lei and Ye, Yongjie and Liu, Hao and Li, Houqiang and others},
  journal={arXiv preprint arXiv:2406.01326},
  year={2024}
}

Acknowledgement

  • XTuner: the codebase we built upon.