A helper for combating incorrect Content-Type headers, aka a MIME sniffing module for Node.js
MIT License
So you have made an HTTP request and got back some headers and a response body, but you just don't know whether that innocent Content-Type header tells you what is really going on in the body.
Enter doc-sniff, a much simpler implementation of the WHATWG MIME sniffing algorithm. It is aimed specifically at responses that can't be easily distinguished via file extensions or magic numbers, e.g. HTML and XML documents.
npm install doc-sniff --save
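var docsniff = require('doc-sniff');
// Example 1: no usable Content-Type, so the type is sniffed from the body
var mime1 = docsniff(false, '<html></html>');
console.log(mime1); // text/html
// Example 2: the header says HTML but the body is an Atom feed, so the type is corrected
var mime2 = docsniff('text/html', '<?xml version="1.0" encoding="UTF-8" ?><feed></feed>');
console.log(mime2); // application/atom+xml
// Example 3: the header is already an acceptable XML type, so it is not replaced
var mime3 = docsniff('application/xml; charset=UTF-8', '<?xml version="1.0" encoding="UTF-8" ?><feed></feed>');
console.log(mime3); // application/xml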
Currently this module will correct the following MIME types:
It does not attempt to be overzealous at correcting subtypes. See example 3 above: if the original MIME type is acceptable, it will not be replaced.
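To make the "not overzealous" rule concrete, here is a minimal sketch of how such a decision could be written. It is not doc-sniff's actual source: the function name sniffSketch, the detection patterns, the set of recognised types, and the fallback value are all assumptions chosen only so that the three examples above come out the same way.

```javascript
// Hypothetical illustration only; this is NOT doc-sniff's implementation.
// The recognised types and the detection rules are assumptions.
function sniffSketch(type, body) {
  // Reduce "application/xml; charset=UTF-8" to "application/xml".
  var essence = typeof type === 'string'
    ? type.split(';')[0].trim().toLowerCase()
    : '';

  // Drop a leading byte order mark, whitespace, and XML declaration.
  var text = String(body).replace(/^\uFEFF/, '').trim();
  var isXml = /^<\?xml\b/i.test(text);
  text = text.replace(/^<\?xml[^>]*\?>\s*/i, '');

  // Look only at the first element of the document.
  var sniffed = '';
  if (/^<feed[\s>]/i.test(text)) sniffed = 'application/atom+xml';
  else if (/^<rss[\s>]/i.test(text)) sniffed = 'application/rss+xml';
  else if (/^(<!doctype html|<html[\s>])/i.test(text)) sniffed = 'text/html';
  else if (isXml) sniffed = 'application/xml';

  // Nothing recognised: fall back to the declared type (assumed default).
  if (!sniffed) return essence || 'application/octet-stream';

  // Do not be overzealous: a generic XML type is an acceptable label for
  // any XML document, so keep it rather than narrowing the subtype.
  var genericXml = essence === 'application/xml' || essence === 'text/xml';
  if (genericXml && /xml$/.test(sniffed)) return essence;

  return sniffed;
}
```

With this sketch, a body that starts with a feed element is reported as application/atom+xml when the header claims text/html, but an application/xml header is kept, because it already describes the document correctly.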
type is the Content-Type header of the response (example 1 above passes false instead of a header value)
body is the response body string
The WHATWG spec has a much more thorough algorithm and MIME list for browser vendors, but on the server side we are more interested in parsable documents and information extraction. If you encounter a use case not covered by this algorithm, please let us know via GitHub issues.
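The two arguments, type and body, usually come from an HTTP response. The following is a rough sketch of how they might be gathered from a Node.js response stream. The helper name sniffResponse is hypothetical, the sniffer is passed in as a parameter so that any function with the (type, body) shape can be used, and passing false for a missing header simply mirrors example 1 above.

```javascript
// Sketch only: `sniffResponse` is a hypothetical helper, not part of doc-sniff.
// `sniff` is any function taking (type, body), e.g. require('doc-sniff').
function sniffResponse(res, sniff, callback) {
  var chunks = [];
  res.on('data', function (chunk) {
    // Chunks may arrive as Buffers or strings; normalise to Buffers.
    chunks.push(Buffer.from(chunk));
  });
  res.on('error', callback);
  res.on('end', function () {
    // Assumes the body is UTF-8 text.
    var body = Buffer.concat(chunks).toString('utf8');
    // Node lower-cases header names; pass false when the header is absent.
    var type = (res.headers && res.headers['content-type']) || false;
    callback(null, sniff(type, body), body);
  });
}

// Possible usage (the URL is a placeholder):
// var http = require('http');
// http.get('http://example.com/feed', function (res) {
//   sniffResponse(res, require('doc-sniff'), function (err, mime, body) {
//     if (err) throw err;
//     console.log(mime);
//   });
// });
```

Buffering the whole body keeps the sketch short. For large responses it may be enough to hand over only the first part of the body, since the examples above suggest the decision is made from the start of the document.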
Like any simple algorithm, this can easily be spoofed, so don't use it for validation; use it only for MIME sniffing of incoming documents.
(For better security: mime and mmmagic can handle most file types, but you still need XSS protection and a content whitelist to safely serve content to users.)
MIT