CLI tool to check whether dead urls are included in files
MIT License
Suppose you develop open source software and write documentation for it. You would like many people to use it, but if the documentation includes dead urls, users may be disappointed and give up on it even if the software itself is good.
How sad that would be!
So we developed this tool. It is well suited to running in CI.
Of course, you can use durl for more than oss documentation. For example, you can also check your blog posts with durl.
durl accepts file paths via stdin, extracts the urls in those files, and checks whether they are dead.
durl sends an http request to each url and checks the http status code.
If the status code isn't 2xx, durl treats the url as dead and outputs the file path, the url, and the http status code.
Note that durl can't detect dead anchors such as https://github.com/suzuki-shunsuke/durl#hoge .
durl is written in Go and binaries are distributed on the release page, so installation is easy.
We provide a busybox based docker image with durl installed.
https://quay.io/repository/suzuki_shunsuke/durl
You can try durl without installing it, which is also useful for CI.
$ docker run -ti --rm -v $PWD:/workspace -w /workspace quay.io/suzuki_shunsuke/durl sh
# echo foo.txt | durl check
First, generate the configuration file.
# Generate .durl.yml
$ durl init
Create a file that includes a dead url.
$ cat << EOF > bar.txt
https://github.com/suzuki-shunsuke/durl
Please see https://github.com/suzuki-shunsuke/dead-repository .
EOF
Then check the file with durl check.
durl check accepts file paths via stdin.
$ echo bar.txt | durl check
[bar.txt] https://github.com/suzuki-shunsuke/dead-repository is dead (404)
It is convenient to combine durl with the find command.
find . \
-type d -name node_modules -prune -o \
-type d -name .git -prune -o \
-type d -name vendor -prune -o \
-type f -print | \
durl check || exit 1
---
ignore_urls:
- https://github.com/suzuki-shunsuke/ignore-repository
ignore_hosts:
- localhost.com
http_method: head,get
# max parallel http request count.
# the default is 10
max_request_count: 10
# when the number of failed http requests exceeds `max_failed_request_count`, durl exits.
# if max_failed_request_count is -1, durl never exits no matter how many errors occur.
# the default is 0
max_failed_request_count: 5
# the default is 10 seconds
http_request_timeout: 10
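As a sketch of how the ignore settings above might be applied, here is an illustrative Go snippet. The field names mirror the config keys, but the matching rules (exact match for ignore_urls, hostname match for ignore_hosts) are assumptions, not durl's documented semantics.

```go
package main

import (
	"fmt"
	"net/url"
)

// config mirrors the ignore settings in .durl.yml.
type config struct {
	IgnoreURLs  []string
	IgnoreHosts []string
}

// ignored reports whether a url should be skipped: either it matches an
// entry of ignore_urls exactly, or its host is listed in ignore_hosts.
func ignored(cfg config, rawURL string) bool {
	for _, u := range cfg.IgnoreURLs {
		if rawURL == u {
			return true
		}
	}
	parsed, err := url.Parse(rawURL)
	if err != nil {
		return false
	}
	for _, h := range cfg.IgnoreHosts {
		if parsed.Hostname() == h {
			return true
		}
	}
	return false
}

func main() {
	cfg := config{
		IgnoreURLs:  []string{"https://github.com/suzuki-shunsuke/ignore-repository"},
		IgnoreHosts: []string{"localhost.com"},
	}
	fmt.Println(ignored(cfg, "http://localhost.com/foo"))                              // true: host is ignored
	fmt.Println(ignored(cfg, "https://github.com/suzuki-shunsuke/ignore-repository")) // true: exact url match
	fmt.Println(ignored(cfg, "https://github.com/suzuki-shunsuke/durl"))              // false
}
```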
http_method is the HTTP method used to check urls.
Please see Releases.
Please see CONTRIBUTING.md.