Create and **automatically** update a list of all videos on a YouTube channel (in txt/csv/md form) via YouTube bot with end-to-end web scraping - no API tokens required. Multi-threaded support for YouTube videos list updates.
APACHE-2.0 License
Published by shailshouryya 11 months ago
BUGFIX
pip installation problem due to incorrectly formatted
FEATURE IMPROVEMENTS
time.time() and time.perf_counter() when logging the time taken to perform
PERFORMANCE IMPROVEMENTS
INTERNAL IMPROVEMENT
Published by shailshouryya almost 2 years ago
Published by shailshouryya almost 2 years ago
call command to properly run helper batch script (commit d519edfc83f2a06eb5ca507a7cd2f485ffc68b63)
Published by shailshouryya about 2 years ago
BUGFIXES
FEATURE IMPROVEMENTS
log_time_taken
PERFORMANCE IMPROVEMENTS
create_thread_from mentioned in this commit message was a typo and should be create_list_from
INTERNAL IMPROVEMENTS
Published by shailshouryya almost 3 years ago
create_list_from() method (commit aa4ff3de648f84891a96a5e7a33a8efb00fc0b19)
Published by shailshouryya about 3 years ago
Published by shailshouryya about 3 years ago
BREAKING CHANGE
create_list_for() returned a str containing the name of the file the program wrote to
create_list_for() returns a tuple containing
  list of lists containing the video information found by the program for the current run
    video_data_returned ListCreator attribute to True
    [[0, '', '', '']]
  tuple containing a str with the name of the channel (taken from the channel's heading) and a str with the name of the file written to
    ('The Channel Name', 'the_name_of_the_file')
    ('The Channel Name', '') if the ListCreator attributes are txt=False, csv=False, md=False, AND video_data_returned=True
video_data_returned
create_list_for method with help(ListCreator.create_list_for) in the python interpreter
BUGFIX
cookie_consent blocking logic for new HTML in GDPR regions
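The return shape described under BREAKING CHANGE can be unpacked as in the minimal sketch below. It does not call the library: summarize_result is a hypothetical helper, the sample values are the placeholders quoted in these notes, and treating [[0, '', '', '']] as "no video data returned" is an assumption.

```python
def summarize_result(result):
    """Unpack the tuple shape described for create_list_for().

    summarize_result is a hypothetical helper for illustration only;
    it is not part of yt_videos_list.
    """
    video_data, (channel_name, file_name) = result
    # [[0, '', '', '']] is the placeholder quoted in the notes; reading it
    # as "no video data returned" is an assumption made for this sketch.
    has_video_data = video_data != [[0, '', '', '']]
    # An empty file name is the case the notes give for txt=False,
    # csv=False, md=False with video_data_returned=True.
    wrote_file = file_name != ''
    return channel_name, has_video_data, wrote_file


# Sample values copied from the notes, not real scraper output.
placeholder_result = ([[0, '', '', '']], ('The Channel Name', 'the_name_of_the_file'))
print(summarize_result(placeholder_result))
```

Code written against the older str return value would need to read the file name from the second element of the inner tuple instead.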
NEW FEATURES
help(ListCreator) in the python interpreter or read the "More API information" section in the python README to see the full documentation:
file_suffix allows more control over the file naming (True by default)
all_video_data_in_memory scrapes the ENTIRE YouTube channel's videos page, EVEN if files exist for the channel already (False by default)
  video_data_returned attribute to True to actually get this information
video_data_returned returns the video data for all videos the program scraped (False by default)
video_id_only saves only the video ID instead of the entire URL (False by default)
file_name argument options in the create_list_for method given here, but run help(ListCreator.create_list_for) in the python interpreter to see the full documentation:
file_name='auto' names the output file(s) using the name that shows up under the banner when you navigate to the channel's homepage (with spaces removed)
file_name='id' names the output file(s) using the identifier from the URL provided to the url argument
help(ListCreator.create_list_for) for a comprehensive list of examples
file_name='id' is very useful when multiple channels have the SAME channel name
PERFORMANCE IMPROVEMENTS
It took 9.240757292005583 seconds to find 230 videos from https://www.youtube.com/user/schafer5/videos
It took 4.265756259999762 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.csv
This program took 19.537945401003526 seconds to complete.
It took 0.8453300589972059 seconds to find 60 videos from https://www.youtube.com/user/schafer5/videos
It took 0.6392399440010195 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.csv
This program took 7.754261410002073 seconds to complete.
It took 9.163404727999989 seconds to find 230 videos from https://www.youtube.com/user/schafer5/videos
It took 4.260267737000007 seconds to load information for 230 videos into memory
It took 0.002389371999996115 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.csv
This program took 19.483281371000004 seconds to complete.
It took 0.8521808300000089 seconds to find 60 videos from https://www.youtube.com/user/schafer5/videos
It took 1.0964175420000117 seconds to load information for 60 videos into memory
It took 0.0015745449999826633 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.csv
This program took 7.985743492000012 seconds to complete.
It took 9.166668037003546 seconds to find 230 videos from https://www.youtube.com/user/schafer5/videos
It took 10.160974278995127 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.txt
It took 10.164936708999448 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.csv
It took 10.168633003995637 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.md
This program took 25.594990328005224 seconds to complete.
It took 0.8503098270011833 seconds to find 60 videos from https://www.youtube.com/user/schafer5/videos
It took 1.5225159670007997 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.csv
It took 1.5322243859991431 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.txt
It took 1.5359413480036892 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.md
This program took 8.472728426997492 seconds to complete.
It took 9.367390958000005 seconds to find 230 videos from https://www.youtube.com/user/schafer5/videos
It took 4.218187391999997 seconds to load information for 230 videos into memory
It took 0.003894963000000473 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.md
It took 0.005060710999998719 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.csv
It took 0.006283445999997639 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.txt
This program took 18.754924324 seconds to complete.
It took 0.8672965029999986 seconds to find 60 videos from https://www.youtube.com/user/schafer5/videos
It took 1.0901944209999996 seconds to load information for 60 videos into memory
It took 0.005667658999996661 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.csv
It took 0.008393589000000645 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.txt
It took 0.008197031000001687 seconds to write the 0 ***NEW*** videos to the pre-existing CoreySchafer_reverse_chronological_videos_list.md
This program took 8.090583961999997 seconds to complete.
It took 322.72226654399856 seconds to find 8095 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 256.63442500399833 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.csv
This program took 585.4076739919983 seconds to complete.
It took 0.8482559289986966 seconds to find 60 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 0.5600300389996846 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.csv
This program took 7.653723870003887 seconds to complete.
It took 316.9717323640002 seconds to find 8095 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 248.92245618300012 seconds to load information for 8095 videos into memory
It took 0.07691853599999376 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.csv
This program took 572.114162118 seconds to complete.
It took 0.8459371520000332 seconds to find 60 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 0.9670944140000302 seconds to load information for 60 videos into memory
It took 0.02941359300007207 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.csv
This program took 8.209143252000104 seconds to complete.
It took 314.01985485899786 seconds to find 8095 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 519.1903085960002 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.txt
It took 519.1941804189992 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.csv
It took 519.197644068001 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.md
This program took 839.4073893879977 seconds to complete.
It took 0.8488957250010571 seconds to find 60 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 1.580211615000735 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.csv
It took 1.681963879003888 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.txt
It took 1.6842712280049454 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.md
This program took 8.823843261001457 seconds to complete.
It took 316.342601403 seconds to find 8095 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 261.87072707100003 seconds to load information for 8095 videos into memory
It took 0.1363127509999913 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.csv
It took 0.1775351439999895 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.md
It took 0.18588107000005039 seconds to write all 8095 videos to KhanAcademy_reverse_chronological_videos_list.txt
This program took 584.703847726 seconds to complete.
It took 0.8483775499998956 seconds to find 60 videos from https://www.youtube.com/c/KhanAcademy/videos
It took 1.0671216570001434 seconds to load information for 60 videos into memory
It took 0.17331316700006028 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.csv
It took 0.22995445900005507 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.txt
It took 0.23345572800008085 seconds to write the 0 ***NEW*** videos to the pre-existing KhanAcademy_reverse_chronological_videos_list.md
This program took 8.503321469999833 seconds to complete.
It took 3420.0639533489993 seconds to find 32347 videos from https://www.youtube.com/user/NBCNews/videos
It took 4988.648231769999 seconds to write all 32347 videos to NBCNews_reverse_chronological_videos_list.csv
This program took 8414.909623333002 seconds to complete.
# forgot to run this test :D
It took 3367.386001154002 seconds to find 32357 videos from https://www.youtube.com/user/NBCNews/videos
It took 4880.191474030002 seconds to load information for 32357 videos into memory
It took 0.24478799300050014 seconds to write all 32357 videos to NBCNews_reverse_chronological_videos_list.csv
This program took 8253.73690525 seconds to complete.
It took 0.8474488579995523 seconds to find 60 videos from https://www.youtube.com/user/NBCNews/videos
It took 1.1012943870009622 seconds to load information for 60 videos into memory
It took 0.11654774600174278 seconds to write the 5 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.csv
This program took 8.668505469999218 seconds to complete.
It took 3396.025502143 seconds to find 32347 videos from https://www.youtube.com/user/NBCNews/videos
It took 7683.585577874001 seconds to write all 32347 videos to NBCNews_reverse_chronological_videos_list.txt
It took 7683.592947972 seconds to write all 32347 videos to NBCNews_reverse_chronological_videos_list.md
It took 7684.030176524999 seconds to write all 32347 videos to NBCNews_reverse_chronological_videos_list.csv
This program took 11086.336240618999 seconds to complete.
It took 0.8738655359993572 seconds to find 60 videos from https://www.youtube.com/user/NBCNews/videos
It took 1.8775347520004289 seconds to write the 0 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.csv
It took 2.120259861001614 seconds to write the 0 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.txt
It took 2.132926509999379 seconds to write the 0 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.md
This program took 9.435579917999348 seconds to complete.
It took 3478.1540728540003 seconds to find 32353 videos from https://www.youtube.com/user/NBCNews/videos
It took 5022.493407319 seconds to load information for 32353 videos into memory
It took 0.5065521739998076 seconds to write the 6 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.csv
It took 0.587243801997829 seconds to write all 32353 videos to NBCNews_reverse_chronological_videos_list.txt
It took 0.6058889249979984 seconds to write all 32353 videos to NBCNews_reverse_chronological_videos_list.md
This program took 8507.703900004002 seconds to complete.
It took 0.8569685050024418 seconds to find 60 videos from https://www.youtube.com/user/NBCNews/videos
It took 1.1060196290018212 seconds to load information for 60 videos into memory
It took 0.5880495099991094 seconds to write the 4 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.csv
It took 0.8386826800015115 seconds to write the 4 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.txt
It took 0.8496009250011411 seconds to write the 4 ***NEW*** videos to the pre-existing NBCNews_reverse_chronological_videos_list.md
This program took 9.45503293100046 seconds to complete.
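The per-step log lines above share a fixed shape, so they can be compared programmatically. The following is a minimal sketch, assuming only the line format visible in these logs; parse_step and STEP_PATTERN are hypothetical names, not part of the package.

```python
import re

# Matches the per-step lines shown in the logs, e.g.
# "It took 4.26 seconds to write all 230 videos to <file>".
STEP_PATTERN = re.compile(r"^It took ([\d.]+) seconds to (find|load|write)\b")


def parse_step(line):
    """Return (action, seconds) for a per-step log line, or None otherwise."""
    match = STEP_PATTERN.match(line)
    if match is None:
        return None
    return match.group(2), float(match.group(1))


# Two csv write lines copied from the runs above (one run without a
# "load information ... into memory" step, one run with it).
first_line = "It took 4.265756259999762 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.csv"
second_line = "It took 0.002389371999996115 seconds to write all 230 videos to CoreySchafer_reverse_chronological_videos_list.csv"

print(parse_step(first_line))
print(parse_step(second_line))
```

Lines of the form "This program took ... seconds to complete." do not match the pattern and come back as None, so total run time would need a separate pattern.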
Published by shailshouryya about 3 years ago
csv files, since csv file renderers expect consistent column formatting throughout the file
Video Number,Video Title,Video URL,Watched,Watch again later,Notes columns
csv file will result in newly extracted videos having the Video Number,Video Title,Video Duration,Video URL,Watched,Watch again later,Notes columns while the already extracted videos will only have the Video Number,Video Title,Video URL,Watched,Watch again later,Notes columns (no Video Duration column)
Video Duration column between the Video Title and Video URL columns
Video Duration column
Video Duration column
Video Duration column between the Video Title and Video URL columns
Video Duration column
Video Duration column
Find and Replace operation:
,https://
,,https://
Video URL column!
([^:][^\d]{2}),https://
$1,,https://
(depending on your editor, you may need to substitute $1 with \1 or something else)
,https:// where it is NOT preceded by :\d\d
Video URL column!
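The regex variant of the Find and Replace operation can also be run from Python instead of an editor. This is a minimal sketch using the pattern quoted above with the $1 backreference written as \1; add_empty_duration_column is a hypothetical helper name, and the sample rows are made up for illustration.

```python
import re

# Pattern from the Find and Replace steps: a ",https://" that is not
# preceded by ":" followed by two digits (the tail of a duration).
MISSING_DURATION = re.compile(r"([^:][^\d]{2}),https://")


def add_empty_duration_column(row):
    """Insert an empty Video Duration field before the Video URL field.

    Rows whose URL is already preceded by a duration ending in ":dd"
    are returned unchanged.
    """
    return MISSING_DURATION.sub(r"\1,,https://", row)


# Hypothetical rows for illustration; the titles and URLs are made up.
old_row = "1,Some Title,https://www.youtube.com/watch?v=abc,,,"
new_row = "2,Other Title,12:34,https://www.youtube.com/watch?v=def,,,"

print(add_empty_duration_column(old_row))
print(add_empty_duration_column(new_row))
```

Because the pattern requires two non-digit characters before the comma, a row whose title ends in a digit (for example "Part 2") is also left unchanged, so such rows would still need a manual check.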
chronological_videos_list file (as opposed to a reverse_chronological_videos_list file):
Video Duration column between the Video Title and Video URL columns in the csv header
chronological_videos_list files use the csv header from the pre-existing csv file
reverse_chronological_videos_list csv header every time the program looks for new videos when rerun on a previously scraped channel
chronological_videos_list files, however, the program never updates the csv header
0.6.0+
txt and md files now also include the video duration information
txt and md files do not depend on consistent formatting the way csv files do
txt and md files now use slightly different formatting such as
md files using h3 headings for video information instead of bullet points (the bullet points were also improperly formatted previously, but since they are no longer used, this is not an issue)
verify_page_bottom_n_times attribute
file_buffering attribute
Published by shailshouryya over 3 years ago
__init__.py file for code changes
Published by shailshouryya over 3 years ago
Published by shailshouryya over 3 years ago
cookie_consent parameter for ListCreator:
from yt_videos_list import ListCreator
lc = ListCreator() # cookie_consent=False by default
# OR
lc = ListCreator(cookie_consent=True) # if you want to accept cookies, or if blocking cookies doesn't work properly
# rest of API unchanged
# use other code as you normally would
# .....
Published by shailshouryya over 3 years ago
Published by shailshouryya over 3 years ago
file subpackage in create_file.py and update_file.py into scroller.py and write.py to ↑ DRY and make things Easier to Change (ETC))
yt-videos-list (shipped) package with minifier.py updates
Published by shailshouryya over 3 years ago
Published by shailshouryya over 3 years ago
Published by shailshouryya almost 4 years ago
Published by shailshouryya almost 4 years ago
Published by shailshouryya almost 4 years ago
_reverse_chronological or _chronological
MyChannel_reverse_chronological_videos_list.txt
MyChannel_chronological_videos_list.txt
MyChannel_videos_list.txt - REGARDLESS of whether the file was in reverse chronological order or chronological order
True on one thread and False on the other thread
# without yt_videos_list submodule function
for i in {1..10}; do (time (for i in {1..100}; do python3 minifier.py; done)); done
real 0m8.261s
user 0m5.433s
sys 0m2.259s
real 0m8.288s
user 0m5.429s
sys 0m2.247s
real 0m8.022s
user 0m5.272s
sys 0m2.164s
real 0m7.989s
user 0m5.266s
sys 0m2.165s
real 0m7.984s
user 0m5.253s
sys 0m2.163s
real 0m8.009s
user 0m5.268s
sys 0m2.164s
real 0m8.047s
user 0m5.269s
sys 0m2.175s
real 0m8.068s
user 0m5.242s
sys 0m2.182s
real 0m8.030s
user 0m5.289s
sys 0m2.164s
real 0m8.046s
user 0m5.284s
sys 0m2.176s
# with yt_videos_list submodule function
for i in {1..10}; do (time (for i in {1..100}; do python3 minifier.py; done)); done
real 1m28.987s
user 0m42.470s
sys 0m41.508s
real 1m28.921s
user 0m42.508s
sys 0m41.411s
real 1m28.753s
user 0m42.436s
sys 0m41.378s
real 1m29.467s
user 0m42.700s
sys 0m41.732s
real 1m28.672s
user 0m42.286s
sys 0m41.406s
real 1m28.415s
user 0m42.297s
sys 0m41.202s
real 1m28.629s
user 0m42.360s
sys 0m41.244s
real 1m29.088s
user 0m42.587s
sys 0m41.527s
real 1m29.392s
user 0m42.644s
sys 0m41.637s
real 1m29.345s
user 0m42.657s
sys 0m41.643s
# without yt_videos_list submodule function again
for i in {1..10}; do (time (for i in {1..100}; do python3 minifier.py; done)); done
real 0m8.488s
user 0m5.585s
sys 0m2.308s
real 0m8.293s
user 0m5.497s
sys 0m2.251s
real 0m8.115s
user 0m5.396s
sys 0m2.188s
real 0m8.116s
user 0m5.396s
sys 0m2.179s
real 0m8.145s
user 0m5.395s
sys 0m2.198s
real 0m8.066s
user 0m5.367s
sys 0m2.170s
real 0m8.042s
user 0m5.340s
sys 0m2.162s
real 0m8.029s
user 0m5.329s
sys 0m2.159s
real 0m8.170s
user 0m5.420s
sys 0m2.195s
real 0m8.154s
user 0m5.426s
sys 0m2.190s
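The three batches of time output above can be summarized by averaging the real values. The following is a minimal sketch, assuming the "real XmY.YYYs" layout shown in these runs; real_seconds and mean are hypothetical helpers, and only the first three samples of the first two batches are used.

```python
import re

# Matches lines such as "real 0m8.261s" or "real 1m28.987s".
REAL_TIME = re.compile(r"^real\s+(\d+)m([\d.]+)s$")


def real_seconds(lines):
    """Convert each 'real' line from the shell's time output to seconds."""
    seconds = []
    for line in lines:
        match = REAL_TIME.match(line.strip())
        if match:
            seconds.append(int(match.group(1)) * 60 + float(match.group(2)))
    return seconds


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


# First three samples from the first and second batches above;
# the user and sys lines are ignored by real_seconds.
without_submodule = ["real 0m8.261s", "user 0m5.433s", "sys 0m2.259s",
                     "real 0m8.288s", "real 0m8.022s"]
with_submodule = ["real 1m28.987s", "real 1m28.921s", "real 1m28.753s"]

slowdown = mean(real_seconds(with_submodule)) / mean(real_seconds(without_submodule))
print(f"approximate slowdown: {slowdown:.1f}x")
```

Each real value covers 100 runs of minifier.py, so dividing by 100 gives a rough per-run figure.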
rm /usr/local/bin/sha512_sum command (bravedriver)
Published by shailshouryya almost 4 years ago
Published by shailshouryya about 4 years ago