Determine the relation between two URLs.
Node.js >= 14 is required. To install, type this at the command line:
npm install url-relation
URLRelation.match(url1, url2[, options])
const URLRelation = require('url-relation');
const url1 = new URL('http://domain.com/');
const url2 = new URL('http://domain.com/#hash');
const options = {
  components: [URLRelation.HASH],
  ignoreComponents: true
};
if (URLRelation.match(url1, url2, options)) {
  // considered the same
}
URLRelation::upTo(component[, ignoredComponents])
component is the same as targetComponent.
ignoredComponents is the same as components. However, if its value is a non-empty array, it will also set ignoreComponents to true.
const URLRelation = require('url-relation');
const url1 = new URL('http://domain.com/');
const url2 = new URL('http://domain.com/#hash');
const options = {}; // any of the options described below
const relation = new URLRelation(url1, url2, options);
if (relation.upTo(URLRelation.HASH, [URLRelation.HASH])) {
  // considered the same
}
if (relation.upTo(URLRelation.PATH)) {
  // considered the same
}
It is simplest to use an option profile, but custom configurations are still possible.
components
Type: Array<Symbol>
Default value: []
A list of URL components for ignoreComponents. See URL Components for possible values.
defaultPorts
Type: Object
Default value: {}
A map of protocol default ports for ignoreDefaultPort. Be sure to include the trailing ":" in each key. Common protocols already have their default ports removed.
ignoreComponents
Type: Boolean or Function
Default value: true
When set to true or a function that returns true, a URL's components specified in components will be ignored during comparison.
ignoreDefaultPort
Type: Boolean or Function
Default value: true
When set to true or a function that returns true, a URL's port that matches any found in defaultPorts will be ignored during comparison.
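As a rough illustration of default-port handling, the sketch below uses only the WHATWG URL class built into Node.js. The stripDefaultPort helper and the map passed to it are hypothetical and are not url-relation's implementation; they only show why keys in such a map carry the trailing ":".

```javascript
// The WHATWG URL parser already drops default ports for common protocols.
const a = new URL('http://domain.com:80/');
const b = new URL('http://domain.com/');
const sameCommonPort = a.port === b.port; // both ports are empty strings

// Hypothetical map for a protocol the parser has no default port for.
// Keys include the trailing ":" so they can be matched against url.protocol.
const defaultPorts = { 'git:': 9418 };

// Hypothetical helper: returns the port, or '' when it is the protocol's default.
function stripDefaultPort(url, ports) {
  const defaultPort = ports[url.protocol];
  return defaultPort !== undefined && url.port === String(defaultPort) ? '' : url.port;
}

const c = new URL('git://domain.com:9418/');
const d = new URL('git://domain.com/');
const sameCustomPort =
  stripDefaultPort(c, defaultPorts) === stripDefaultPort(d, defaultPorts);
```

Without an entry in the map, the two git: URLs would differ in their port component.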
ignoreIndexFilename
Type: Boolean or Function
Default value: Function
When set to true or a function that returns true, a URL's file name that matches any found in indexFilenames will be ignored during comparison.
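The effect of ignoring an index file name can be pictured with a small stand-alone sketch. The stripIndexFilename helper below is hypothetical and only approximates the idea; it accepts both strings and regular expressions, mirroring the Array<RegExp|string> type of indexFilenames.

```javascript
// Hypothetical helper, not url-relation's implementation: removes a trailing
// file name from a pathname when it matches one of the index file names.
function stripIndexFilename(pathname, indexFilenames) {
  const lastSlash = pathname.lastIndexOf('/');
  const filename = pathname.slice(lastSlash + 1);
  const isIndex = indexFilenames.some(candidate =>
    candidate instanceof RegExp ? candidate.test(filename) : candidate === filename
  );
  return isIndex ? pathname.slice(0, lastSlash + 1) : pathname;
}

const indexFilenames = ['index.html', /^default\.aspx?$/];
const withIndex = new URL('http://domain.com/dir/index.html');
const withoutIndex = new URL('http://domain.com/dir/');

// Both pathnames reduce to "/dir/" once the index file name is dropped.
const samePath =
  stripIndexFilename(withIndex.pathname, indexFilenames) ===
  stripIndexFilename(withoutIndex.pathname, indexFilenames);
```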
ignoreEmptyQueries
Type: Boolean or Function
Default value: Function
When set to true or a function that returns true, a URL's empty query parameters (such as "?=") will be ignored during comparison. This option will be silently skipped if the input URLs do not support URLSearchParams.
ignoreQueryNames
Type: Boolean or Function
Default value: false
When set to true or a function that returns true, a URL's query parameters matching queryNames will be ignored during comparison. This option will be silently skipped if the input URLs do not support URLSearchParams.
ignoreQueryOrder
Type: Boolean or Function
Default value: Function
When set to true or a function that returns true, the order of unique query parameters will not distinguish one URL from another. This option will be silently skipped if the input URLs do not support URLSearchParams.
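To make the query options above more concrete, here is a minimal sketch built on URLSearchParams. The normalizeQuery helper is hypothetical and is not how url-relation is implemented; it simply removes named parameters and sorts the rest so that order no longer matters.

```javascript
// Hypothetical helper, not url-relation's implementation: builds a comparable
// query string that ignores parameter order and, optionally, some names.
function normalizeQuery(url, ignoredNames = []) {
  const params = new URLSearchParams(url.search);
  ignoredNames.forEach(name => params.delete(name));
  params.sort(); // stable sort by name; repeated names keep their relative order
  return params.toString();
}

const url1 = new URL('http://domain.com/?b=2&a=1&utm_source=feed');
const url2 = new URL('http://domain.com/?a=1&b=2');

// Order alone does not distinguish the URLs once "utm_source" is ignored.
const sameWhenIgnored =
  normalizeQuery(url1, ['utm_source']) === normalizeQuery(url2, ['utm_source']);

// With no names ignored, the extra parameter keeps them apart.
const sameWhenKept = normalizeQuery(url1) === normalizeQuery(url2);
```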
ignoreEmptySegmentNames
Type: Boolean or Function
Default value: false
When set to true or a function that returns true, empty segment names within a URL's path (such as the "//" in "/path//to/") will be ignored during comparison.
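A short sketch of what ignoring empty segment names amounts to, assuming a hypothetical collapseEmptySegments helper (not part of url-relation). Note that the WHATWG URL parser itself preserves the empty segment.

```javascript
// Hypothetical helper, not url-relation's implementation: collapses runs of
// slashes so that "/path//to/" compares equal to "/path/to/".
function collapseEmptySegments(pathname) {
  return pathname.replace(/\/{2,}/g, '/');
}

const url1 = new URL('http://domain.com/path//to/');
const url2 = new URL('http://domain.com/path/to/');

const sameRaw = url1.pathname === url2.pathname;
const sameCollapsed =
  collapseEmptySegments(url1.pathname) === collapseEmptySegments(url2.pathname);
```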
ignoreWWW
Type: Boolean or Function
Default value: Function
When set to true or a function that returns true, a URL's "www" subdomain will be ignored during comparison.
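As a simple illustration of the idea, the hypothetical stripWWW helper below drops a leading "www." label before hostnames are compared. It is a sketch under that assumption and not url-relation's own logic.

```javascript
// Hypothetical helper, not url-relation's implementation: removes a leading
// "www." label from a hostname.
function stripWWW(hostname) {
  return hostname.replace(/^www\./, '');
}

const url1 = new URL('http://www.domain.com/');
const url2 = new URL('http://domain.com/');

const sameHostname = stripWWW(url1.hostname) === stripWWW(url2.hostname);
```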
indexFilenames
Type: Array<RegExp|string>
Default value: ['index.html']
A list of file names for ignoreIndexFilename.
queryNames
Type: Array<RegExp|string>
Default value: []
A list of query parameters for ignoreQueryNames.
targetComponent
Type: Symbol
Default value: URLRelation.HASH
The URL component at which to limit—and include in—the relation from left to right. See URL Components for more info and for possible values.
When an option is defined as a Function, it must return true to be included in the custom filter:
const options = {
  ignoreIndexFilename: (url1, url2) => {
    // Only URLs with these protocols will have their index filename ignored
    return url1.protocol === 'http:' || url1.protocol === 'https:';
  }
};
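To see what such a predicate returns, it can be called directly with two URL instances. The variant below checks both URLs rather than only the first; that choice, and the name webProtocols, are assumptions made for illustration. How and when url-relation invokes the function is not shown here.

```javascript
// A stand-alone predicate of the same shape as an option function.
// Checking both URLs is an illustrative choice, not a library requirement.
const ignoreIndexFilename = (url1, url2) => {
  const webProtocols = ['http:', 'https:'];
  return webProtocols.includes(url1.protocol) && webProtocols.includes(url2.protocol);
};

const forWeb = ignoreIndexFilename(
  new URL('http://domain.com/'),
  new URL('https://domain.com/')
);
const forFile = ignoreIndexFilename(
  new URL('file:///index.html'),
  new URL('file:///')
);
```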
CAREFUL_PROFILE is useful for a URL to an unknown or third-party server that may not be configured according to specifications and common best practices.
COMMON_PROFILE, the default profile, is useful for a URL to a known server that you trust and expect to be correctly configured according to specifications and common best practices.
An example of checking for a trusted hostname:
const dynamicProfile = (url1, url2) => {
  const trustedHostnames = ['domain.com'];
  const isTrusted = trustedHostnames
    .reduce((results, trustedHostname) => {
      results[0] = results[0] || url1.hostname.endsWith(trustedHostname);
      results[1] = results[1] || url2.hostname.endsWith(trustedHostname);
      return results;
    }, [false, false])
    .every(result => result);
  return URLRelation[`${isTrusted ? 'COMMON' : 'CAREFUL'}_PROFILE`];
};
const url1 = new URL('http://domain.com/');
const url2 = new URL('http://domain.com/#hash');
const profile = dynamicProfile(url1, url2);
A profile can be customized by copying it and overriding individual options:
const custom = {
  ...URLRelation.COMMON_PROFILE,
  indexFilenames: ['index.html', 'index.php']
};
Or:
const extend = require('extend');
// A deep extend merges arrays by index, so list every file name that should remain
const custom = extend(true, {}, URLRelation.COMMON_PROFILE, { indexFilenames: ['index.html', 'index.php'] });
                AUTH                  HOST                       PATH
                __|__                ___|___                 ______|______
               /     \              /       \               /             \
          USERNAME PASSWORD     HOSTNAME    PORT       PATHNAME         SEARCH  HASH
           ___|__   __|___   ______|______   |   __________|_________   ___|___   |
          /      \ /      \ /             \ / \ /                    \ /       \ / \
    foo://username:password@www.example.com:123/hello/world/there.html?var=value#foo
    \_/                     \_/ \_____/ \_/     \_________/ \________/
     |                       |     |     |           |          |
 PROTOCOL                SUBDOMAIN |    TLD      SEGMENTS   FILENAME
                                   |
                                DOMAIN
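For orientation, several of these components correspond to properties of the WHATWG URL class. The sketch below parses a URL of the same shape as the one in the diagram; the derivation of FILENAME and SEGMENTS from the pathname is an assumption for illustration, and the remaining components (AUTH, PATH, TLD, DOMAIN, SUBDOMAIN) have no direct property.

```javascript
// Parse a URL of the same shape as the one in the diagram.
const url = new URL(
  'foo://username:password@www.example.com:123/hello/world/there.html?var=value#foo'
);

// Components that map directly onto WHATWG URL properties.
// These properties include delimiters such as ":", "?" and "#".
const direct = {
  PROTOCOL: url.protocol,
  USERNAME: url.username,
  PASSWORD: url.password,
  HOSTNAME: url.hostname,
  PORT: url.port,
  HOST: url.host,
  PATHNAME: url.pathname,
  SEARCH: url.search,
  HASH: url.hash
};

// FILENAME and SEGMENTS have no dedicated property; here they are derived
// from the pathname (an assumption about how they could be obtained).
const lastSlash = url.pathname.lastIndexOf('/');
const FILENAME = url.pathname.slice(lastSlash + 1);
const SEGMENTS = url.pathname.slice(1, lastSlash).split('/');
```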
The components of URLs are compared in the following order:
PROTOCOL
USERNAME
PASSWORD
AUTH
TLD
DOMAIN
SUBDOMAIN
HOSTNAME
PORT
HOST
SEGMENTS
FILENAME
PATHNAME
SEARCH
PATH
HASH
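The idea of comparing in order and stopping at a target component can be modeled roughly as follows. This is a simplified sketch, not url-relation's implementation: it only covers components that map directly onto WHATWG URL properties, and the sameUpTo helper and ORDER list are hypothetical.

```javascript
// Simplified model: a subset of the components, in left-to-right order,
// named after the WHATWG URL properties they are read from.
const ORDER = [
  'protocol', 'username', 'password', 'hostname',
  'port', 'pathname', 'search', 'hash'
];

// Hypothetical helper: true when every component up to and including
// `target` is identical in both URLs.
function sameUpTo(url1, url2, target) {
  const last = ORDER.indexOf(target);
  if (last === -1) {
    throw new RangeError(`Unknown component: ${target}`);
  }
  return ORDER.slice(0, last + 1).every(name => url1[name] === url2[name]);
}

const url1 = new URL('http://domain.com/');
const url2 = new URL('http://domain.com/#hash');

const sameToSearch = sameUpTo(url1, url2, 'search'); // the hash is never reached
const sameToHash = sameUpTo(url1, url2, 'hash');     // the hash is compared and differs
```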
As you may have noticed, there are a few breaks in linearity:
TLD is prioritized before DOMAIN because matching a domain on a different top-level domain is very uncommon (but still possible via ignoreComponents).
SUBDOMAIN is prioritized after DOMAIN.
Other considerations:
HOSTNAME components will also have related TLD, DOMAIN and SUBDOMAIN components due to the above-mentioned comparison order only, not because they actually have those components.