Website: https://rohetoric.github.io/text-vector-visualisation/
APACHE-2.0 License
To run the code, the following libraries must be installed:
Serial No | Libraries to Install |
---|---|
1. | FastText |
2. | TensorFlow |
3. | spaCy |
Download the `bbc-text.csv` dataset from here. Alternatively, if gcloud is already set up, download it from the terminal with the command: `gsutil cp gs://dataset-uploader/bbc/bbc-text.csv [path to notebook directory]`
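Once the file is in place, a quick sanity check can confirm that it downloaded correctly. The sketch below is only an illustration: it assumes the CSV has a header row with `category` and `text` columns, and the helper name `count_categories` is hypothetical rather than part of this repository.

```python
import csv
from collections import Counter


def count_categories(csv_path):
    """Return a Counter mapping each category label to its number of rows."""
    # Assumption: the file has a header row with "category" and "text" columns.
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        return Counter(row["category"] for row in reader)


# Example usage, once bbc-text.csv sits next to the notebooks:
# for label, rows in count_categories("bbc-text.csv").most_common():
#     print(label, rows)
```

If the column names differ in your copy of the dataset, adjust the `"category"` key accordingly.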
Make sure all the libraries are installed and up to date according to the requirements and dependencies mentioned above.
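One way to check that the libraries are present is to ask Python whether each one can be found, without actually importing it. This is a minimal sketch. The import names `fasttext`, `tensorflow` and `spacy` are assumed to match the libraries listed above, and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Assumption: these are the import names of the three libraries listed above.
REQUIRED_MODULES = ["fasttext", "tensorflow", "spacy"]


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Missing libraries:", ", ".join(missing))
else:
    print("All required libraries are importable.")
```

This only checks that each module can be located. It does not verify versions.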
To train the model on the complete dataset using FastText, run the notebook `fasttextmodeltrain.ipynb` in the `_notebooks` folder. A pre-trained model (2.4 GB) based on the dataset can be downloaded from here.
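For readers who want a rough idea of what such a training step can look like outside the notebook, here is a hedged sketch. It is not the notebook's actual code: it assumes the CSV has a `text` column, that unsupervised skip-gram training is wanted, and the helper names and file paths are hypothetical. `fasttext.train_unsupervised` and `save_model` come from the fastText Python package.

```python
import csv


def write_corpus(csv_path, corpus_path):
    """Write the "text" column to a plain-text file, one document per line."""
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as src, \
            open(corpus_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            # fastText reads its training text line by line, so collapse
            # any internal whitespace and newlines into single spaces.
            dst.write(" ".join(row["text"].split()) + "\n")
            count += 1
    return count


def train_vectors(corpus_path, model_path):
    """Train unsupervised skip-gram vectors and save them as a binary model."""
    import fasttext  # imported lazily so write_corpus works without it

    # Assumption: skip-gram with default hyperparameters; the notebook may differ.
    model = fasttext.train_unsupervised(corpus_path, model="skipgram")
    model.save_model(model_path)
    return model


# Example usage (hypothetical file names):
# write_corpus("bbc-text.csv", "bbc-corpus.txt")
# model = train_vectors("bbc-corpus.txt", "bbc-model.bin")
```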
According to the FastText documentation:
Steps 4, 5 and 6 differ for TF 1 and TF 2. After that, the steps are the same.
For TF 1, create an empty folder called `tb1files` in the same directory as the notebooks. It will store all the TensorFlow log files once step 5 has been run.
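The folder can also be created from Python. The sketch below creates it if needed and reports whether it is empty, so stale log files from an earlier run are noticed. The helper name `prepare_log_dir` is hypothetical.

```python
from pathlib import Path


def prepare_log_dir(path):
    """Create the log folder if needed and return True when it is empty."""
    log_dir = Path(path)
    log_dir.mkdir(parents=True, exist_ok=True)
    # any() stops at the first entry, so this stays cheap for large folders.
    return not any(log_dir.iterdir())


# Example usage:
# if not prepare_log_dir("tb1files"):
#     print("tb1files is not empty; remove old log files before continuing.")
```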
Run the notebook `tb1vis.ipynb` in the `_notebooks` folder.
In the terminal, change to the directory where the files are stored and run the command: `tensorboard --logdir tb1files/`
The above command would yield a result:
For TF 2, create an empty folder called `tb2files` in the same directory as the notebooks. It will store all the TensorFlow log files once step 5 has been run.
Run the notebook `tb2vis.ipynb` in the `_notebooks` folder.
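As a rough illustration of the kind of files TensorBoard's Projector expects under TF 2, the sketch below follows the pattern in the TensorFlow embedding projector tutorial. It is not taken from `tb2vis.ipynb`. The helper names are hypothetical, and `vectors` is assumed to be a 2-D array with one row per entry in `words`.

```python
import os


def write_metadata(words, log_dir):
    """Write one label per line; line i labels row i of the embedding."""
    os.makedirs(log_dir, exist_ok=True)
    path = os.path.join(log_dir, "metadata.tsv")
    with open(path, "w", encoding="utf-8") as handle:
        for word in words:
            # A single-column metadata file must not have a header row.
            handle.write(word + "\n")
    return path


def write_projector_files(vectors, words, log_dir):
    """Save the vectors as a checkpoint plus the projector config file."""
    import tensorflow as tf  # imported lazily; requires TF 2
    from tensorboard.plugins import projector

    write_metadata(words, log_dir)

    # Store the embedding matrix in a checkpoint that TensorBoard can read.
    weights = tf.Variable(vectors)
    checkpoint = tf.train.Checkpoint(embedding=weights)
    checkpoint.save(os.path.join(log_dir, "embedding.ckpt"))

    # Point the projector at the saved tensor and its labels.
    config = projector.ProjectorConfig()
    embedding = config.embeddings.add()
    embedding.tensor_name = "embedding/.ATTRIBUTES/VARIABLE_VALUE"
    embedding.metadata_path = "metadata.tsv"
    projector.visualize_embeddings(log_dir, config)


# Example usage (hypothetical inputs):
# write_projector_files(vectors, words, "tb2files")
```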
In the terminal, change to the directory where the files are stored and run the command: `tensorboard --logdir tb2files/`
The above command would yield a result:
Open the localhost URL shown in the last line of the output, for example http://localhost:6008/ (as in the TB1 command image).
The localhost site shown below will open. From the drop-down that reads Inactive, select Projector, as indicated by the arrow in the image below.
That's it, folks!