htmlq – Command-line HTML Processor

Similar to how jq allows you to extract content from JSON files on the CLI, htmlq allows you to extract content from HTML files.

Like jq, but for HTML. Uses CSS selectors to extract bits of content from HTML files.

You can pass in a local HTML file, or pipe the output of a cURL request into it.

For example, to get the links to the latest articles on the homepage of bram.us, you can run this command:

$ curl -X GET https://www.bram.us/ | htmlq "main h2 a" -a href

https://www.bram.us/2021/09/03/next-js-apollo-server-side-rendering-ssr/
https://www.bram.us/2021/09/03/multiple-accounts-and-git/
https://www.bram.us/2021/09/03/random-paint-effects-with-houdini/
https://www.bram.us/2021/09/02/crafting-organic-patterns-with-voronoi-tessellations/
https://www.bram.us/2021/09/01/pick-colors-from-websites-with-the-eyedropper-api/
https://www.bram.us/2021/09/01/the-universe-is-hostile-to-computers/
https://www.bram.us/2021/08/27/morse-code-translator-html-css/
https://www.bram.us/2021/08/27/vector-raster-why-not-both/
https://www.bram.us/2021/08/27/key-data-structures-and-their-roles-in-renderingng/
https://www.bram.us/2021/08/27/css-shapes-editor-extension-for-chrome-devtools/

Using the selector main h2 a we select the link elements that we need, and with the -a flag we tell htmlq to return only the specified attribute (here href) from those selected elements.
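The same selector also works on a local file. Here is a minimal sketch, assuming a hypothetical sample.html (the file name and its contents are made up for illustration) that is fed to htmlq on stdin. The htmlq call is guarded so the snippet still runs on a machine where htmlq is not installed:

```shell
# Write a small sample page to a hypothetical file, sample.html.
cat > sample.html <<'EOF'
<main>
  <h2><a href="/first-post/">First post</a></h2>
  <h2><a href="/second-post/">Second post</a></h2>
</main>
EOF

# Feed the file to htmlq on stdin and ask only for the href attributes.
# Guarded so the script is a no-op here when htmlq is not installed.
if command -v htmlq >/dev/null 2>&1; then
  htmlq "main h2 a" -a href < sample.html
fi
```

Reading from stdin keeps the command identical to the cURL version; only the source of the HTML changes.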

You can install it via Homebrew:

brew install htmlq

htmlq — Like jq, but for HTML →

Published by Bramus!