Similar to how jq lets you extract content from JSON files on the CLI, htmlq lets you extract content from HTML files.
Like jq, but for HTML. Uses CSS selectors to extract bits of content from HTML files.
You can pass in a local HTML file, or pipe the output of a cURL request into it.
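As a minimal sketch of the local-file case: the snippet below writes a small sample page and feeds it to htmlq, first over standard input and then by name. The file name sample.html and its markup are made up for illustration, and the guard simply skips the htmlq calls when the tool is not installed.

```shell
# Create a small sample page (hypothetical content, for illustration only).
cat > sample.html <<'EOF'
<main>
  <h2><a href="/first-post/">First post</a></h2>
  <h2><a href="/second-post/">Second post</a></h2>
</main>
EOF

# Only run htmlq when it is available on this machine.
if command -v htmlq >/dev/null 2>&1; then
  # Feed the local file to htmlq over standard input.
  htmlq "main h2 a" < sample.html

  # Or name the file directly with the --filename option.
  htmlq --filename sample.html "main h2 a"
fi
```

Both invocations should select the same two link elements; which form you use is mostly a matter of taste.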
For example, to get the links to the latest articles on the homepage of bram.us, you can run this command:
$ curl -X GET https://www.bram.us/ | htmlq "main h2 a" -a href
https://www.bram.us/2021/09/03/next-js-apollo-server-side-rendering-ssr/
https://www.bram.us/2021/09/03/multiple-accounts-and-git/
https://www.bram.us/2021/09/03/random-paint-effects-with-houdini/
https://www.bram.us/2021/09/02/crafting-organic-patterns-with-voronoi-tessellations/
https://www.bram.us/2021/09/01/pick-colors-from-websites-with-the-eyedropper-api/
https://www.bram.us/2021/09/01/the-universe-is-hostile-to-computers/
https://www.bram.us/2021/08/27/morse-code-translator-html-css/
https://www.bram.us/2021/08/27/vector-raster-why-not-both/
https://www.bram.us/2021/08/27/key-data-structures-and-their-roles-in-renderingng/
https://www.bram.us/2021/08/27/css-shapes-editor-extension-for-chrome-devtools/
Using the selector main h2 a, we select the link elements that we need, and with the -a flag we return only the href attribute of those selected elements.
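To make the effect of the -a flag concrete, here is a small sketch that runs the same selector in three output modes against an inline snippet. The markup in the html variable is a made-up stand-in for a blog homepage, and the guard skips the commands when htmlq is not installed.

```shell
# Hypothetical markup standing in for a blog homepage.
html='<main><h2><a href="/first-post/">First post</a></h2></main>'

# Only run htmlq when it is available on this machine.
if command -v htmlq >/dev/null 2>&1; then
  # Without -a, htmlq prints the matched elements themselves.
  printf '%s\n' "$html" | htmlq "main h2 a"

  # With -a href, only the value of each href attribute is printed.
  printf '%s\n' "$html" | htmlq "main h2 a" -a href

  # With --text, the text content of the matched elements is printed instead.
  printf '%s\n' "$html" | htmlq "main h2 a" --text
fi
```

The attribute mode is the one to reach for when the output feeds another command, for example a loop that downloads each URL.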
You can install it via Homebrew:
brew install htmlq