htmlq – Command-line HTML Processor

Similar to how jq lets you extract content from JSON files on the CLI, htmlq lets you extract content from HTML files.

Like jq, but for HTML. Uses CSS selectors to extract bits of content from HTML files.

You can pass in a local HTML file, or pipe the output of a cURL request into it.
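
As a minimal sketch of the local-file case: the file name sample.html and its contents are made up for illustration, and the --filename option is the one listed in htmlq's help output, so check it against your installed version.

```shell
# Create a small sample page (hypothetical content, for illustration only).
cat > sample.html <<'EOF'
<main>
  <h2><a href="/first-post/">First post</a></h2>
  <h2><a href="/second-post/">Second post</a></h2>
</main>
EOF

# Read the HTML from a local file instead of stdin.
htmlq --filename sample.html --attribute href "main h2 a"

# Equivalent: feed the same file in over stdin, as you would with cURL output.
htmlq --attribute href "main h2 a" < sample.html
```

Both invocations should print the href of each matching link, one per line.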

For example, to get the links to the latest articles on the homepage of bram.us, you can run this command:

$ curl --silent https://www.bram.us/ | htmlq "main h2 a" -a href

https://www.bram.us/2021/09/03/next-js-apollo-server-side-rendering-ssr/
https://www.bram.us/2021/09/03/multiple-accounts-and-git/
https://www.bram.us/2021/09/03/random-paint-effects-with-houdini/
https://www.bram.us/2021/09/02/crafting-organic-patterns-with-voronoi-tessellations/
https://www.bram.us/2021/09/01/pick-colors-from-websites-with-the-eyedropper-api/
https://www.bram.us/2021/09/01/the-universe-is-hostile-to-computers/
https://www.bram.us/2021/08/27/morse-code-translator-html-css/
https://www.bram.us/2021/08/27/vector-raster-why-not-both/
https://www.bram.us/2021/08/27/key-data-structures-and-their-roles-in-renderingng/
https://www.bram.us/2021/08/27/css-shapes-editor-extension-for-chrome-devtools/

Using the selector main h2 a we select the link elements that we need, and with the -a flag we return only the href attribute of those elements.
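
To see what the flag changes, here is a small sketch that runs the same selector three ways against an inline snippet. The HTML in the html variable is a made-up stand-in for a real page, and the --text option is taken from htmlq's help output, so treat it as an assumption to verify on your version.

```shell
# Hypothetical HTML snippet standing in for a real page.
html='<main><h2><a href="/first-post/">First post</a></h2><h2><a href="/second-post/">Second post</a></h2></main>'

# No flags: print the matching elements themselves.
printf '%s\n' "$html" | htmlq "main h2 a"

# --attribute (-a): print only the named attribute of each match.
printf '%s\n' "$html" | htmlq --attribute href "main h2 a"

# --text (-t): print only the text content of each match.
printf '%s\n' "$html" | htmlq --text "main h2 a"
```

The selector decides which elements are matched; the flags only decide which part of each match is printed.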

You can install it via Homebrew:

brew install htmlq

htmlq — Like jq, but for HTML →

About the author

Bramus is a Freelance Web Developer from Belgium. From the moment he discovered view-source at the age of 14 (way back in 1997), he fell in love with the web and has been tinkering with it ever since.
