Transform HTML Content to Apple News / Google AMP / Facebook Instant Articles with Distro

Automatically transform article HTML for third-party platforms such as Facebook Instant Articles, Apple News, Google AMP

“Just” make a curl request …

curl --data "
<h1>Distro.Mic example</h1>
<p>Text and media embed</p>
<iframe src=\"https://www.youtube.com/embed/M7lc1UVf-VE\"></iframe>
" https://distro.mic.com/1.0/format?output=apple-news

… and the content is “translated” to the specified format:

{
	"article": [{
		"text": "Distro.Mic example",
		"additions": [],
		"inlineTextStyles": [],
		"role": "heading1",
		"layout": "bodyLayout"
	}, {
		"text": "Text and media embed\n",
		"additions": [],
		"inlineTextStyles": [],
		"role": "body",
		"layout": "bodyLayout"
	}, {
		"role": "container",
		"components": [{
			"role": "embedwebvideo",
			"URL": "https://www.youtube.com/embed/M7lc1UVf-VE",
			"style": "embedMediaStyle",
			"layout": "embedMediaLayout"
		}],
		"layout": "embedLayout",
		"style": "embedStyle"
	}],
	"bundlesToUrls": {}
}
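
To get a feel for the structure of that response, here is a minimal sketch that walks the components and prints their roles. The helper name listRoles is my own (hypothetical) and not part of Distro. The sample object is a trimmed copy of the output above.

```javascript
// Hypothetical helper (not part of Distro): list the role of every
// component in an apple-news response, indenting nested components.
function listRoles(components, depth = 0) {
  const lines = [];
  for (const component of components) {
    // Use the text (or the embed URL) as a short label when present.
    const label = component.text ? component.text.trim() : component.URL || '';
    lines.push('  '.repeat(depth) + component.role + (label ? ': ' + label : ''));
    if (Array.isArray(component.components)) {
      lines.push(...listRoles(component.components, depth + 1));
    }
  }
  return lines;
}

// Trimmed copy of the response shown above.
const response = {
  article: [
    { text: 'Distro.Mic example', role: 'heading1' },
    { text: 'Text and media embed\n', role: 'body' },
    {
      role: 'container',
      components: [
        { role: 'embedwebvideo', URL: 'https://www.youtube.com/embed/M7lc1UVf-VE' }
      ]
    }
  ],
  bundlesToUrls: {}
};

console.log(listRoles(response.article).join('\n'));
```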

You can also install it as a dependency:

npm install distro-mic

import { format } from 'distro-mic';

const html = '<p>Article HTML</p>';
const output = format(html);

Distro →

Getting the share count of a URL

At work I’m on a project in which we need to track the share count of a URL. Below are my quick notes. In all examples I’m getting the share count of https://www.facebook.com/. Note that I use jq notation to point at the actual result in each response.

Just GET it

Facebook

Request:

curl --silent -X GET 'http://graph.facebook.com/?id=https://www.facebook.com/'

UPDATE August 23, 2016

As of August 7, 2016, version 2.0 of the Graph API has been disabled and version 2.1 is now the default. The response format changed between 2.0 and 2.1. Thanks go out to Nabil Souk for pointing this out. This post has been updated to reflect the change. For archival purposes both the old and the new response are still included on this page.

Response (Graph API v2.0, pre August 7, 2016):

{
  "id": "https://www.facebook.com/",
  "shares": 42944662,
  "comments": 1086
}

The count can be found in .shares

Response (Graph API v2.1 and higher, post August 7, 2016):

{
    "og_object": {
        "id": "10151063484068358",
        "title": "Welcome to Facebook - Log In, Sign Up or Learn More",
        "type": "website",
        "updated_time": "2016-08-23T08:50:45+0000"
    },
    "share": {
        "comment_count": 1335,
        "share_count": 113774962
    },
    "id": "https://www.facebook.com/"
}

The count can be found in .share.share_count
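
Since both response shapes may still be floating around in logs or caches, a small sketch that reads the count from either one could look like this. The function name facebookShareCount is my own (hypothetical). It only inspects already-parsed JSON and does not perform the request.

```javascript
// Hypothetical helper: read the share count from a parsed Graph API
// response, whichever of the two formats shown above it uses.
function facebookShareCount(response) {
  // v2.1 and higher: the count lives in .share.share_count
  if (response && response.share && typeof response.share.share_count === 'number') {
    return response.share.share_count;
  }
  // v2.0: the count lives in .shares
  if (response && typeof response.shares === 'number') {
    return response.shares;
  }
  // Neither field is present (e.g. an error response).
  return null;
}

const oldFormat = { id: 'https://www.facebook.com/', shares: 42944662, comments: 1086 };
const newFormat = {
  share: { comment_count: 1335, share_count: 113774962 },
  id: 'https://www.facebook.com/'
};

console.log(facebookShareCount(oldFormat)); // 42944662
console.log(facebookShareCount(newFormat)); // 113774962
```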

Twitter

Unfortunately it’s not that easy (anymore). See below for a possible solution.

LinkedIn

Request:

curl --silent -X GET 'https://www.linkedin.com/countserv/count/share?url=https%3A%2F%2Fwww.facebook.com&format=json'

Response:

{
  "count": 1672,
  "fCnt": "1,672",
  "fCntPlusOne": "1,673",
  "url": "https://www.facebook.com"
}

The count can be found in .count
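
Note that the URL needs to be percent-encoded before it goes into the query string. A minimal sketch of building the request URL and reading the count, with hypothetical function names of my own (linkedInCountUrl, linkedInShareCount):

```javascript
// Hypothetical helper: build the count URL used in the curl request above.
function linkedInCountUrl(url) {
  return 'https://www.linkedin.com/countserv/count/share?url=' +
    encodeURIComponent(url) + '&format=json';
}

// Hypothetical helper: read .count from a parsed response.
function linkedInShareCount(response) {
  return response && typeof response.count === 'number' ? response.count : null;
}

console.log(linkedInCountUrl('https://www.facebook.com'));

// Parsed copy of the response shown above.
const linkedInResponse = {
  count: 1672,
  fCnt: '1,672',
  fCntPlusOne: '1,673',
  url: 'https://www.facebook.com'
};
console.log(linkedInShareCount(linkedInResponse)); // 1672
```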

Google Plus

Request:

curl --silent -H 'Content-type: application/json' -X POST -d '[{"method":"pos.plusones.get","id":"p","params":{"nolog":true,"id":"https://www.facebook.com/","source":"widget","userId":"@viewer","groupId":"@self"},"jsonrpc":"2.0","key":"p","apiVersion":"v1"}]' https://clients6.google.com/rpc

Response:

[
 {
  "id": "p",
  "result": {
   "kind": "pos#plusones",
   "id": "https://www.facebook.com/",
   "isSetByViewer": false,
   "metadata": {
    "type": "URL",
    "globalCounts": {
     "count": 430133.0
    }
   },
   "abtk": "AEIZW7RT8UqRam+DqDUFw3fY9+GpV+OwizHp+BDdEKudWO37f2nItnJhQKKGNT3nMuz5XXwcRe8qSDQMNkHTiOOeHtxJCcvJdg=="
  }
 }
]

The count can be found in .[0].result.metadata.globalCounts.count
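
The request body is a JSON-RPC batch, so it is less error-prone to build it from an object than to type it by hand. A minimal sketch, again with hypothetical function names of my own (googlePlusPayload, googlePlusCount). It only builds the payload and reads a parsed response.

```javascript
// Hypothetical helper: build the JSON-RPC batch sent in the curl request above.
function googlePlusPayload(url) {
  return [{
    method: 'pos.plusones.get',
    id: 'p',
    params: {
      nolog: true,
      id: url,
      source: 'widget',
      userId: '@viewer',
      groupId: '@self'
    },
    jsonrpc: '2.0',
    key: 'p',
    apiVersion: 'v1'
  }];
}

// Hypothetical helper: read .[0].result.metadata.globalCounts.count,
// returning null when any step along that path is missing.
function googlePlusCount(response) {
  const first = Array.isArray(response) ? response[0] : undefined;
  const counts = first && first.result && first.result.metadata &&
    first.result.metadata.globalCounts;
  return counts && typeof counts.count === 'number' ? counts.count : null;
}

console.log(JSON.stringify(googlePlusPayload('https://www.facebook.com/')));

// Trimmed copy of the response shown above.
const plusResponse = [{
  id: 'p',
  result: { metadata: { type: 'URL', globalCounts: { count: 430133.0 } } }
}];
console.log(googlePlusCount(plusResponse)); // 430133
```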

Other Services

More services (Buffer, Pinterest, etc.) needed? Check https://www.madebymagnitude.com/blog/getting-your-social-share-counts-with-php/ for inspiration.

Scripts / Inspiration

Some scripts that might help you out:

So, what about Twitter?

The problem

In the past it was possible to send a GET request to https://cdn.syndication.twitter.com/widgets/tweetbutton/count.json?url={URL}. However, the “Count Endpoint” has been shut down and now returns a 410 Gone.

A possible solution

Use Twitter’s Search API:

The Twitter Search API is part of Twitter’s v1.1 REST API. It allows queries against the indices of recent or popular Tweets

Endpoint URLs that might be of interest:

But:

Before getting involved, it’s important to know that the Search API is focused on relevance and not completeness. This means that some Tweets and users may be missing from search results.

In detail: “Not all Tweets will be indexed or made available via the search interface.” and “Search API usually only serves tweets from the past week.”

Oh.

A proper solution then?

As Twitter’s Search API mentions:

If you want to match for completeness you should consider using a Streaming API instead.

The Streaming API allows you to track Public Streams. Track that, and then count yourself.

This code snippet using the twitter-stream-api NPM module should get you started:

var TwitterStream = require('twitter-stream-api'),
    fs = require('fs');

// Create your keys at https://apps.twitter.com/
var keys = {
    consumer_key : "consumer-key",
    consumer_secret : "consumer-secret",
    token : "your-token",
    token_secret : "your-token-secret"
};

var Twitter = new TwitterStream(keys, false);

Twitter.stream('statuses/filter', {
    track: 'npmjs.com/package/defer'
});

Twitter.on('connection success', function (uri) {
    console.log('connection success', uri);
});

Twitter.on('connection aborted', function () {
    console.log('connection aborted');
});

Twitter.on('connection error network', function (error) {
    console.log('connection error network', error);
});

Twitter.on('connection error stall', function () {
    console.log('connection error stall');
});

Twitter.on('connection error http', function (httpStatusCode) {
    console.log('connection error http', httpStatusCode);
});

Twitter.on('connection rate limit', function (httpStatusCode) {
    console.log('connection rate limit', httpStatusCode);
});

Twitter.on('connection error unknown', function (error) {
    console.log('connection error unknown', error);
    Twitter.close();
});

Twitter.on('data', function (obj) {
    console.log('data', obj.toString());
});

Twitter.on('data keep-alive', function () {
    console.log('data keep-alive');
});

Twitter.on('data error', function (error) {
    console.log('data error', error);
});

Twitter.pipe(fs.createWriteStream('tweets.json'));

Note: Protected Streams won’t show up in here, of course.

Note: As this is a “live” API, you’ll miss counts whenever the script is not running! Filling the gaps with a normal REST API request won’t be 100% accurate (see above).
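
As for the “count yourself” part: a minimal sketch of how the counting could work on already-parsed Tweet objects. The function names (stripProtocol, mentionsUrl, countShares) are my own and hypothetical. I’m assuming each Tweet carries its links in entities.urls[].expanded_url, and I’m doing a simple prefix match, so treat the result as an approximation.

```javascript
// Drop the protocol so http:// and https:// links compare equal.
function stripProtocol(url) {
  return url.replace(/^https?:\/\//i, '');
}

// Hypothetical helper: does this Tweet link to the target URL?
// Assumes links are listed in tweet.entities.urls[].expanded_url.
function mentionsUrl(tweet, targetUrl) {
  const urls = (tweet.entities && tweet.entities.urls) || [];
  const target = stripProtocol(targetUrl).toLowerCase();
  return urls.some(function (entry) {
    // Prefix match: longer URLs that start with the target also count.
    return typeof entry.expanded_url === 'string' &&
      stripProtocol(entry.expanded_url).toLowerCase().startsWith(target);
  });
}

// Hypothetical helper: count the Tweets that link to the target URL.
function countShares(tweets, targetUrl) {
  return tweets.filter(function (tweet) {
    return mentionsUrl(tweet, targetUrl);
  }).length;
}

// Made-up sample data, only here to show the shape.
const sampleTweets = [
  { entities: { urls: [{ expanded_url: 'https://npmjs.com/package/defer' }] } },
  { entities: { urls: [{ expanded_url: 'https://example.com/' }] } },
  { entities: { urls: [] } },
  {}
];

console.log(countShares(sampleTweets, 'https://npmjs.com/package/defer')); // 1
```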

Good to know: “Note that display_url does not contain a protocol, so this is not required to perform a match.”

Need to know: “URLs are considered words for the purposes of matches which means that the entire domain and path must be included in the track query for a Tweet containing an URL to match.”
→ Translated: no partial URL matches!

Beware though: “Each phrase must be between 1 and 60 bytes, inclusive.”
→ Translated: no long URLs!
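
The three notes above can be checked up front, before opening the stream. A minimal sketch with a hypothetical helper name of my own (toTrackPhrase): it strips the protocol and enforces the 1 to 60 byte limit quoted above.

```javascript
// Hypothetical helper: turn a URL into a phrase for the track parameter.
// - drops the protocol (display_url has none, so it is not needed to match)
// - keeps the entire domain and path (no partial URL matches)
// - rejects phrases outside the quoted 1..60 byte range
function toTrackPhrase(url) {
  const phrase = url.replace(/^https?:\/\//i, '');
  const bytes = Buffer.byteLength(phrase, 'utf8');
  if (bytes < 1 || bytes > 60) {
    throw new RangeError('Track phrase must be 1 to 60 bytes, got ' + bytes);
  }
  return phrase;
}

console.log(toTrackPhrase('https://npmjs.com/package/defer')); // npmjs.com/package/defer
```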

Why make it so difficult, Twitter?

To stay in the loop you can follow @bramus or follow @bramusblog on Twitter.

Responsible Social Share Links

Social share scripts are convenient and easy to copy & paste but rely on JavaScript and add additional overhead to your site, which means more HTTP requests and slower load times. Instead, use share links that don’t require you to load scripts for each social site.

Back to the basics: URLs and plain HTML (with a tad of VanillaJS on top to enhance the experience).
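
To illustrate the idea: a minimal sketch that builds plain share URLs you can drop into regular links. The function name shareLinks is hypothetical, and the two endpoints are the commonly used ones as far as I know (they are my assumption, not taken from the linked project), so check them against the project before relying on them.

```javascript
// Hypothetical helper: build script-free share URLs for a page.
// The endpoints below are assumptions, not taken from the linked project.
function shareLinks(url, title) {
  const encodedUrl = encodeURIComponent(url);
  const encodedTitle = encodeURIComponent(title);
  return {
    twitter: 'https://twitter.com/intent/tweet?url=' + encodedUrl + '&text=' + encodedTitle,
    facebook: 'https://www.facebook.com/sharer/sharer.php?u=' + encodedUrl
  };
}

const links = shareLinks('https://www.bram.us/', 'Bram.us');
console.log(links.twitter);
console.log(links.facebook);
```

Each value can go straight into the href of an ordinary link, which keeps working without any JavaScript.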

Responsible Social Share Links →

Look Up

‘Look Up’ is a lesson taught to us through a love story, in a world where we continue to find ways to make it easier for us to connect with one another, but always results in us spending more time alone.

Or why I find it funny when I see people “checking in” at a location, something I used to do myself a few years ago.

Noah

Noah, a short film that debuted at the Toronto International Film Festival, illustrates the flitting attention span and lack of true connection in digital culture more clearly than anything else in recent memory.

(Chatroulette appears in the short, so #NSFW)

You Need To See This 17-Minute Film Set Entirely On A Teen’s Computer Screen →

(via @themaninblue)

Facebook Is a Fundamentally Broken Product That Is Collapsing Under Its Own Weight

Every time someone visits News Feed there are on average 1,500 potential stories from friends, people they follow and Pages for them to see, and most people don’t have enough time to see them all. These stories include everything from wedding photos posted by a best friend, to an acquaintance checking in to a restaurant.

Let’s say the average Facebook user is awake for 17 hours a day. To consume all that stuff, they would take in 88 new items per hour, or 1.5 things per minute. That’s just not possible.

Facebook Is a Fundamentally Broken Product That Is Collapsing Under Its Own Weight →

(via @OrT)

Facebook Demetricator

The Facebook interface is filled with numbers. These numbers, or metrics, measure and present our social value and activity, enumerating friends, likes, comments, and more. Facebook Demetricator is a web browser addon that hides these metrics. No longer is the focus on how many friends you have or on how much they like your status, but on who they are and what they said.

Facebook Demetricator →