Link Checking using awk and cURL

If you are looking to check many thousands (or tens of thousands) of links and/or you’re interested in status codes, you’ll be better served by a scripted approach.

The script is based on this Bash one-liner (to run it on Windows, you’ll need to install Cygwin and the cURL package):

awk -F $'\t' 'BEGIN {OFS = FS} {print $4}' avdb_export_0423.txt | xargs -n1 curl -o /dev/null --silent --head --connect-timeout 1 --write-out '%{http_code} %{url_effective}\n' > records.txt
  • The first step is to save your spreadsheet as a tab-separated txt file. In this case, I am using “avdb_export_0423.txt”.
  • “$4” means the URLs are in the 4th column of the spreadsheet.
  • “--connect-timeout 1” means curl will allow 1 second to connect to each server before giving up. If you get lots of “000” codes returned by the script (meaning no response was received, usually a timeout), you might want to increase the timeout.
  • ‘%{http_code} %{url_effective}\n’ means the script will print the response code, a space, and then the link it checked, one link per line. In general, 404 means dead, 200 means alive, and 000 means timeout.
  • “records.txt” is the output file.
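
Once the output file exists, a quick tally by status code helps to see how many links are dead. The following is a minimal sketch, assuming each line has the form "code URL" as produced by the command above. The file names sample_records.txt, status_summary.txt and not_ok.txt, and the example.com URLs, are made up for illustration.

```shell
# Hypothetical sample in the same "code URL" format that the curl command writes
printf '%s\n' \
  '200 http://example.com/a' \
  '404 http://example.com/b' \
  '000 http://example.com/c' \
  '200 http://example.com/d' > sample_records.txt

# Count how many URLs returned each status code
awk '{ count[$1]++ } END { for (code in count) print code, count[code] }' \
  sample_records.txt | sort > status_summary.txt
cat status_summary.txt

# List only the URLs that did not come back with 200
awk '$1 != "200" { print $2 }' sample_records.txt > not_ok.txt
cat not_ok.txt
```

Pointing the two awk commands at records.txt instead of the sample file gives the same summary for a real run.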

If you want the script to run faster, you can check multiple links at the same time (the above command checks one at a time). Something like:

awk -F $'\t' 'BEGIN {OFS = FS} {print $4}' Italian_avdb_export_0423.txt | xargs -n1 -P 10 curl -o /dev/null --silent --head --connect-timeout 10 --write-out '%{http_code} %{url_effective}\n' > records.txt

“-P 10” means xargs runs up to 10 curl processes at once, so 10 links are checked in parallel. This will speed up the script run time, but the output will likely contain the links in a slightly different order than the spreadsheet, as each response comes back asynchronously.
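
If the spreadsheet order matters, one option is to map the status codes back onto the original URL list after the parallel run. This is only a sketch under a few assumptions: the file names urls_in_order.txt, parallel_records.txt and ordered_records.txt are hypothetical, and the URL that curl reports is assumed to match the URL in the spreadsheet exactly. If curl rewrites a URL (for example by adding a trailing slash), that row will show up as "missing".

```shell
# Hypothetical URL list in spreadsheet order (what the awk '{print $4}' step emits)
printf '%s\n' \
  'http://example.com/a' \
  'http://example.com/b' \
  'http://example.com/c' > urls_in_order.txt

# Hypothetical parallel output: same URLs, but in arrival order
printf '%s\n' \
  '404 http://example.com/c' \
  '200 http://example.com/a' \
  '200 http://example.com/b' > parallel_records.txt

# First file (records): remember the status code for each URL.
# Second file (URL list): print code and URL in the original order.
awk 'NR == FNR { code[$2] = $1; next }
     { status = ($0 in code) ? code[$0] : "missing"; print status, $0 }' \
  parallel_records.txt urls_in_order.txt > ordered_records.txt

cat ordered_records.txt
```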

Ignoring Search field in WordPress

At times you may have a search page containing select list filters in addition to a general text field search box. If you hook these filters up to custom query vars, but use a normal WordPress search box (hooked up to the “s” query var) you can be left high and dry when the search box is left blank. You’ll likely either get the message that your search terms can’t be found or be redirected home.

This behavior is intentional on the WordPress team's part: their view is that at least some results should show up in this case. WordPress hits the home page using HTTP GET and passes your search query in an “s” parameter. When WordPress sees an empty search string it doesn’t even use the “is_search” page handler (it actually uses the “is_home” handler instead), and that’s why this happens. However, in this case we don’t want this behavior to “overpower” the results we’re requesting with our filters.

The solution is simply to unset the “s” query var in cases where the Search field is left blank.

function search_unset( $vars ) {
	// Drop the "s" query var when the search box was submitted empty
	if ( isset( $_GET['s'] ) && empty( $_GET['s'] ) ) {
		unset( $vars['s'] );
	}
	return $vars;
}
add_filter( 'request', 'search_unset' );

Remove duplicate lines with uniq

sort myfile.txt | uniq

List only the unique lines: sort myfile.txt | uniq -u

List only the duplicate lines: sort myfile.txt | uniq -d

Get a count of the number of lines by adding the -c option.

sort myfile.txt | uniq -uc

sort myfile.txt | uniq -dc

Skip fields: uniq -f 3 mylogfile ignores the first 3 fields of each line when comparing. This could be useful with log files to skip the time stamp data.

Skip characters: uniq -s 30 myfile.txt ignores the first 30 characters of each line when comparing.

Compare characters: uniq -w 30 myfile.txt compares only the first 30 characters of each line.
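
The reason for piping through sort first is that uniq only compares adjacent lines. Here is a small sketch that shows the difference. The files fruits.txt and app.log and their contents are made up for illustration.

```shell
# Hypothetical sample data: "apple" appears twice, but not on adjacent lines
printf '%s\n' apple banana apple cherry > fruits.txt

# Without sorting, the repeats are not adjacent, so nothing is removed
unsorted=$(uniq fruits.txt)

# After sorting, the repeats are adjacent and each line appears once
deduped=$(sort fruits.txt | uniq)

# Lines that occur exactly once
only_unique=$(sort fruits.txt | uniq -u)

# Lines that occur more than once (each printed once)
only_dupes=$(sort fruits.txt | uniq -d)

# Hypothetical log: skip the first field (the time stamp) when comparing
printf '%s\n' '10:01 disk full' '10:02 disk full' '10:03 all clear' > app.log
collapsed=$(uniq -f 1 app.log)

printf 'unsorted:\n%s\n\ndeduped:\n%s\n\n' "$unsorted" "$deduped"
printf 'unique only:\n%s\n\nduplicates only:\n%s\n\n' "$only_unique" "$only_dupes"
printf 'log with time stamps skipped:\n%s\n' "$collapsed"
```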

Advanced ffmpeg recipes

Output a single frame from the video into an image file:

ffmpeg -i input.flv -ss 00:00:14.435 -vframes 1 out.png

Output one image every second, named out1.png, out2.png, out3.png, etc.

ffmpeg -i input.flv -vf fps=1 out%d.png

Output one image every minute, named img001.jpg, img002.jpg, img003.jpg, etc. The %03d dictates that the ordinal number of each output image will be formatted using 3 digits.

ffmpeg -i myvideo.avi -vf fps=1/60 img%03d.jpg

Extracting X images from a video of variable length

ffmpeg -i <input_file> -vsync 0 -vf "select='not(mod(n,100))'" <output_file>

(where 100 is the frame-frequency you’d like to use, for example, every 100th frame)

The thumbnail filter tries to find the most representative frames in the video:

ffmpeg -i input.mp4 -vf  "thumbnail,scale=640:360" -frames:v 1 thumb.png

The following command selects only frames that differ by more than 40% from the previous frame (and so are probably scene changes) and generates a sequence of 5 PNGs.

ffmpeg -i input.mp4 -vf  "select=gt(scene\,0.4),scale=640:360" -frames:v 5 thumb%03d.png

The following is meant to look for the first >40%-change frame within each of 5 time spans, where the time spans are the 1st, 2nd, 3rd, 4th, and 5th 20% of the video.

ffmpeg -ss 3 -i input.mp4 -vf "select=gt(scene\,0.4),fps=fps=1/600" -frames:v 5 -vsync vfr out%02d.jpg

Missing cygwin1.dll Simple Fix: Put Cygwin in PATH

If you try running Cygwin from another program, or are running an installer (say for yasm), and you get a “missing cygwin1.dll” error, you should check that you have put Cygwin into your Windows PATH environment variable. The file is part of Cygwin, so most likely it’s located in C:\cygwin\bin. To fix the problem, all you have to do is add C:\cygwin\bin (or the location where cygwin1.dll can be found) to your system path. Alternatively, you can copy cygwin1.dll into your Windows directory.

To add to your user profile path you can do the following from the command line using the setx command which is built into Windows Vista and above. In earlier versions of Windows you can use the Windows Resource Kit to get it.

Say Cygwin is installed in c:\cygwin, do:

SETX path "c:\cygwin;c:\cygwin\bin;%path%"

Or for all users on the machine (this needs an administrator command prompt):

SETX /M path "c:\cygwin;c:\cygwin\bin;%path%"

Or via the GUI, in the Environment Variables dialog under the advanced system settings.

Position element at the bottom of parent

Assign position:relative to the parent element, and then position:absolute; bottom:0; to the element.

So for:

<div id="container">
  <footer id="copyright">
    Copyright James Fishwick
  </footer>
</div>

Assign position:relative to #container, and then position:absolute; bottom:0; to #copyright.

Bonus flexbox method:

#container {
    display: flex;
}
#copyright {
    align-self: flex-end;
}

JavaScript: Return an array of objects according to key, value, or key and value matching

function getObjects(obj, key, val) {
    var objects = [];
    for (var i in obj) {
        if (!obj.hasOwnProperty(i)) continue;
        if (typeof obj[i] == 'object') {
            // recurse into nested objects and arrays
            objects = objects.concat(getObjects(obj[i], key, val));
        } else
        //if key matches and value matches or if key matches and value is not passed (eliminating the case where key matches but passed value does not)
        if (i == key && obj[i] == val || i == key && val == '') {
            objects.push(obj);
        } else if (obj[i] == val && key == '') {
            //only add if the object is not already in the array
            if (objects.lastIndexOf(obj) == -1) {
                objects.push(obj);
            }
        }
    }
    return objects;
}