Google PhotoScan

Don’t just take a picture of a picture. Create enhanced digital scans, with automatic edge detection, perspective correction, and smart rotation.

PhotoScan stitches multiple images together to remove glare and improve the quality of your scans.

PhotoScan by Google (iOS) →
PhotoScan by Google (Android) →

Mixed Content Scan: Scan your HTTPS-enabled website for Mixed Content


After my recent move to HTTPS, I wasn’t sure whether any pages on my site still contained Mixed Content.

If an HTTPS page includes content retrieved through regular, cleartext HTTP, then the connection is only partially encrypted. […] When a webpage exhibits this behavior, it is called a mixed content page. (src)

Because modern browsers block most Mixed Content from loading, this can leave your HTTPS-enabled website broken.

To check this, I wrote a little PHP CLI app that scans an HTTPS website for Mixed Content. The script starts crawling at a given URL and processes each page:

  • All contained img[src], iframe[src], script[src], and link[href][rel="stylesheet"] elements are checked to see whether they are Mixed Content.
  • All contained a[href] elements linking to the same or a deeper level are crawled in turn and scanned for Mixed Content.
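
The two steps above can be sketched roughly as follows. This is an illustration in Python, not the PHP code from Mixed Content Scan: the names MixedContentParser, scan_html, and is_same_or_deeper are hypothetical, and treating "same or a deeper level" as a simple path-prefix comparison is an assumption about how the crawler scopes its links.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class MixedContentParser(HTMLParser):
    """Collects plain-HTTP subresources and a[href] targets from one page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.mixed = []  # absolute URLs of subresources requested over http://
        self.links = []  # absolute URLs of a[href] targets (crawl candidates)

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a":
            href = attrs.get("href")
            if href:
                self.links.append(urljoin(self.page_url, href))
            return
        if tag in ("img", "iframe", "script"):
            target = attrs.get("src")
        elif tag == "link" and "stylesheet" in (attrs.get("rel") or "").lower().split():
            target = attrs.get("href")
        else:
            return
        if target:
            # Resolve relative and protocol-relative URLs against the page URL.
            absolute = urljoin(self.page_url, target)
            if urlparse(absolute).scheme == "http":
                self.mixed.append(absolute)


def scan_html(page_url, html):
    """Return (mixed_content_urls, link_urls) for one HTTPS page's markup."""
    parser = MixedContentParser(page_url)
    parser.feed(html)
    parser.close()
    return parser.mixed, parser.links


def is_same_or_deeper(root_url, candidate_url):
    """Assumed scoping rule: same scheme and host, path at or below the root's directory."""
    root, cand = urlparse(root_url), urlparse(candidate_url)
    base_path = root.path.rsplit("/", 1)[0] + "/"
    cand_path = cand.path or "/"
    same_origin = (cand.scheme, cand.netloc) == (root.scheme, root.netloc)
    return same_origin and cand_path.startswith(base_path)
```

A crawler built on this would call scan_html for each fetched page, report the first list, and queue only those links for which is_same_or_deeper returns True.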

The script gives feedback whilst it runs. When Mixed Content is found, the offending URLs are shown on screen beneath the page that includes them:

Scanning https://www.bram.us/
[2014-12-10 15:38:31] 00000 - https://www.bram.us/
[2014-12-10 15:38:32] 00001 - https://www.bram.us/projects/
[2014-12-10 15:38:33] 00002 - https://www.bram.us/projects/mint-custom-title/
[2014-12-10 15:38:33] 00003 - https://www.bram.us/projects/bramusicq/
[2014-12-10 15:38:33] 00004 - https://www.bram.us/projects/gm_bramus/
[2014-12-10 15:38:34] 00005 - https://www.bram.us/projects/js_bramus/
[2014-12-10 15:38:34] 00006 - https://www.bram.us/projects/js_bramus/jsprogressbarhandler/
[2014-12-10 15:38:36] 00007 - https://www.bram.us/projects/js_bramus/lazierload/
[2014-12-10 15:38:37] 00008 - https://www.bram.us/projects/the-box-office/
[2014-12-10 15:38:37] 00009 - https://www.bram.us/projects/tinymce-plugins/
[2014-12-10 15:38:38] 00010 - https://www.bram.us/projects/tinymce-plugins/tinymce-classes-and-ids-plugin-bramus_cssextras/
[2014-12-10 15:38:38] 00011 - https://www.bram.us/projects/flashlightboxinjector/
[2014-12-10 15:38:40] 00012 - https://www.bram.us/contact/
[2014-12-10 15:38:40] 00013 - https://www.bram.us/2014/12/09/youtube-rewind-2014/
[2014-12-10 15:38:41] 00014 - https://www.bram.us/2014/12/09/6-billion-tweets/
[2014-12-10 15:38:41] 00015 - https://www.bram.us/2014/12/09/little-dragon-underbart/
[2014-12-10 15:38:41] 00016 - https://www.bram.us/2014/12/09/yik-yak-messaging-app-vulnerability/
[2014-12-10 15:38:42] 00017 - https://www.bram.us/2014/11/13/https-everywhere/
[2014-12-10 15:38:42] 00018 - https://www.bram.us/2014/12/09/the-state-of-javascript-in-2015/
[2014-12-10 15:38:43] 00019 - https://www.bram.us/2013/06/27/the-franticness-of-working-in-the-web-business/
[2014-12-10 15:38:43] 00020 - https://www.bram.us/2014/12/09/crossbeat-uprising/
[2014-12-10 15:38:44] 00021 - https://www.bram.us/2014/12/09/its-all-about-time-timing-attacks-in-php/

...

[2014-12-10 15:38:56] 00050 - https://www.bram.us/2008/11/10/jsprogressbarhandler-033/
[2014-12-10 15:38:56] 00051 - https://www.bram.us/demo/projects/lazierload/
  - http://farm2.static.flickr.com/1212/1285026452_0aeb38b6e6.jpg
  - http://farm2.static.flickr.com/1074/1273115418_a77357040a.jpg
  - http://farm2.static.flickr.com/1096/1273106588_91f7a736c6.jpg
  - http://farm2.static.flickr.com/1324/1216309045_31ca82f9d9.jpg
  - http://farm2.static.flickr.com/1262/1217169586_e4b2bfa7df.jpg
  - http://farm2.static.flickr.com/1149/1216304291_63fd48d9c4.jpg
  - http://farm2.static.flickr.com/1366/1216301505_51b3c590ff.jpg
  - http://farm2.static.flickr.com/1184/1216299847_c57975bed2.jpg
  - http://farm2.static.flickr.com/1085/1217158084_a9b059d25b.jpg
  - http://farm2.static.flickr.com/1040/1216293529_3b7c044815.jpg
  - http://farm2.static.flickr.com/1029/1084232736_5b8c023f46.jpg
  - http://farm2.static.flickr.com/1318/1043062251_17071a8cc7.jpg
  - http://farm2.static.flickr.com/1221/1043059543_05713e6156.jpg
  - http://www.google-analytics.com/urchin.js
[2014-12-10 15:38:57] 00052 - https://www.bram.us/wordpress/wp-content/uploads/2008/02/lazierload_04.zip
[2014-12-10 15:38:57] 00053 - https://www.bram.us/wordpress/wp-content/uploads/2008/02/lazierload_03.zip
[2014-12-10 15:38:57] 00054 - https://www.bram.us/wordpress/wp-content/uploads/2007/09/lazierload_02.zip
[2014-12-10 15:38:57] 00055 - https://www.bram.us/2011/09/30/css-regions-and-css-exclusions/
[2014-12-10 15:38:57] 00056 - https://www.bram.us/2014/06/04/good-looking-shapes-gallery/

...

Invoke the script like this:

$ php bin/scanner.php https://www.bram.us/

To speed things up, it’s also possible to define a set of ignore patterns. The default ignore patterns are those for a WordPress installation:

return [
    '^{$rootUrl}/page/(\d+)/$', // Paginated Overview Links
    // '^{$rootUrl}/(\d+)/(\d+)/', // Single Post Links
    '^{$rootUrl}/tag/', // Tag Overview Links
    '^{$rootUrl}/author/', // Author Overview Links
    '^{$rootUrl}/category/', // Category Overview Links
    '^{$rootUrl}/(\d+)/(\d+)/$', // Monthly Overview Links
    '^{$rootUrl}/(\d+)/$',  // Year Overview Links
    '^{$rootUrl}/comment-subscriptions', // Comment Subscription Link
    '^{$rootUrl}/(.*)?wp\-(.*)\.php', // WordPress Core File Links
    '^{$rootUrl}/archive/', // Archive Links
    '\?replytocom\=', // Replyto Links
];

The {$rootUrl} token in each pattern will be replaced with the (root) URL passed into the script.
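
As a rough illustration of that substitution, here is a minimal Python sketch (the tool itself is PHP, and this is not its code). The helper names compile_ignore_patterns and is_ignored are hypothetical, and two details are assumptions: that the trailing slash is stripped from the root URL before substitution, and that the root URL is regex-escaped.

```python
import re

# A subset of the default patterns quoted above; {$rootUrl} is a literal token.
IGNORE_PATTERNS = [
    r"^{$rootUrl}/page/(\d+)/$",  # Paginated Overview Links
    r"^{$rootUrl}/tag/",          # Tag Overview Links
    r"\?replytocom\=",            # Replyto Links
]


def compile_ignore_patterns(patterns, root_url):
    """Swap the {$rootUrl} token for the escaped root URL and compile each pattern."""
    # Assumption: the root is used without its trailing slash, since the
    # patterns themselves start with "/" right after the token.
    root = re.escape(root_url.rstrip("/"))
    return [re.compile(p.replace("{$rootUrl}", root)) for p in patterns]


def is_ignored(url, compiled):
    """True when any ignore pattern matches the URL, so the crawler can skip it."""
    return any(rx.search(url) for rx in compiled)


if __name__ == "__main__":
    compiled = compile_ignore_patterns(IGNORE_PATTERNS, "https://www.bram.us/")
    for url in ("https://www.bram.us/tag/php/", "https://www.bram.us/projects/"):
        print(url, "ignored" if is_ignored(url, compiled) else "crawled")
```

Escaping the root URL keeps the dots in the hostname from acting as regex wildcards.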

Mixed Content Scan →

Special thanks go out to Mathias Bynens for making a few suggestions and additions to Mixed Content Scan.


BFS-Auto: High Speed Book Scanner at over 250 pages/min

BFS-Auto can achieve high-speed and high-definition book digitization at over 250 pages/min using the original media format. This performance is realized by three key points: high-speed fully-automated page flipping, real-time 3D recognition of the flipped pages, and high-accuracy restoration to a flat document image.