Similar to how jq allows you to extract content from JSON files on the CLI,
htmlq allows you to extract content from HTML files.
Like jq, but for HTML. Uses CSS selectors to extract bits of content from HTML files.
You can pass in a local HTML file, or pipe the output of a cURL request into it.
For example, to get the links to the latest articles on the homepage of bram.us, you can run this command:
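As a minimal sketch of the local-file case: the snippet below writes a small sample page and feeds it to htmlq over stdin. The file name sample.html and its contents are made up for illustration, and the htmlq call is skipped when the tool is not installed.

```shell
# Create a small sample page (hypothetical content, for illustration only).
cat > sample.html <<'EOF'
<main>
  <h2><a href="/first-post/">First post</a></h2>
  <h2><a href="/second-post/">Second post</a></h2>
</main>
EOF

# htmlq reads HTML from stdin, so a plain redirect is enough for a local file.
# The guard skips the call on machines where htmlq is not available.
if command -v htmlq >/dev/null 2>&1; then
  htmlq "main h2 a" < sample.html
fi
```

Redirecting the file into stdin keeps the invocation the same as in the piped cURL case; only the source of the HTML changes.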
$ curl -X GET https://www.bram.us/ | htmlq "main h2 a" -a href
https://www.bram.us/2021/09/03/next-js-apollo-server-side-rendering-ssr/
https://www.bram.us/2021/09/03/multiple-accounts-and-git/
https://www.bram.us/2021/09/03/random-paint-effects-with-houdini/
https://www.bram.us/2021/09/02/crafting-organic-patterns-with-voronoi-tessellations/
https://www.bram.us/2021/09/01/pick-colors-from-websites-with-the-eyedropper-api/
https://www.bram.us/2021/09/01/the-universe-is-hostile-to-computers/
https://www.bram.us/2021/08/27/morse-code-translator-html-css/
https://www.bram.us/2021/08/27/vector-raster-why-not-both/
https://www.bram.us/2021/08/27/key-data-structures-and-their-roles-in-renderingng/
https://www.bram.us/2021/08/27/css-shapes-editor-extension-for-chrome-devtools/
With the selector main h2 a we extract the link elements that we need, and with the
-a flag we opt to return only the
href attribute of those selected elements.
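To make the effect of the attribute flag a bit more tangible, here is a small sketch that runs the same selector twice against a one-line sample: once with --attribute (the long form of -a) and once with --text, which outputs the text content of the matched elements instead. The file name links.html and its markup are assumptions for illustration, and the htmlq calls are skipped when the tool is not installed.

```shell
# Write a tiny sample document (hypothetical markup, for illustration only).
printf '%s\n' '<main><h2><a href="/first-post/">First post</a></h2></main>' > links.html

if command -v htmlq >/dev/null 2>&1; then
  # Only the value of the href attribute of each matched link.
  htmlq "main h2 a" --attribute href < links.html

  # Only the text nodes inside each matched link.
  htmlq "main h2 a" --text < links.html
fi
```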
You can install it via Homebrew:
brew install htmlq