You know it before you even try. You try anyway. Something like trying to bzip2 316,387 CSV files representing ~10TB of data. You know the result will be at least 1/10 the size, and R can handle bzip2 files directly, so you call bzip2 and get the "argument list too long" error.

Find and xargs to the rescue! Also, lbzip2 because you really don’t want to wait on bzip2 any more than you have to.

find path_to_files/ -name "*.csv" | xargs -P 5 lbzip2

Note: you’ll want to balance the number of parallel lbzip2 instances (or whatever you are running), set with -P 5 above, against the fact that lbzip2 itself also runs in parallel.
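
If the CSV filenames could contain spaces or other odd characters, a slightly more defensive variant (a sketch, not the exact command I ran) uses null-delimited names and caps how many files go to each lbzip2 invocation so xargs can actually spread work across the parallel slots:

find path_to_files/ -name "*.csv" -print0 | xargs -0 -n 100 -P 5 lbzip2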

It still took a long time to compress 316,387 files, but it went a whole lot faster with parallel instances of a parallel bzip2 (lbzip2) via the xargs -P flag.

I needed to mount an NTFS share for an application that was connecting to a SQL Server database and required a mapped share. While testing from my CentOS 7 desktop, mounting the share was trivial. Not so much once I transitioned over to a CoreOS machine where I was to deploy for user testing.

Note: I map /opt/bin from the host to a folder in the container. I store custom binaries and scripts in /opt/bin on CoreOS since it is in the path and persists even after CoreOS updates to the latest version. I’ve also changed the example from a Fedora container to CentOS, since CentOS already ships with packages that the example I link to had to install in addition to the dev tools and libraries.

This is how I got around that problem:

# Start a CentOS container with the host's /opt/bin mounted at /host_tmp
docker run -t -i -v /opt/bin:/host_tmp centos /bin/bash

# Inside the container: install the build toolchain
yum groupinstall -y "Development Tools" "Development Libraries"

# Fetch, unpack, and build cifs-utils, then copy mount.cifs out to the host's /opt/bin
curl https://download.samba.org/pub/linux-cifs/cifs-utils/cifs-utils-6.3.tar.bz2 | bunzip2 -c - | tar -xvf -
cd cifs-utils-6.3/
./configure && make
cp mount.cifs /host_tmp/
exit
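
Back on the CoreOS host, an optional sanity check is to confirm that the binary you just copied actually runs before attempting the mount; mount.cifs prints its version with -V:

/opt/bin/mount.cifs -V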

sudo mkdir /media/foo
sudo mount.cifs "//1.1.1.1/ntfs_share" -o username=winuser,domain=mydomain.com,rw,dir_mode=0775,noperm /media/foo/ 

Originally found here.
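
One follow-up of my own: mounting this way prompts for the Windows password. For anything unattended, a common approach (sketched here; /etc/cifs-creds and the password are just placeholders) is to keep the credentials in a root-only file and point the mount options at it:

sudo tee /etc/cifs-creds > /dev/null <<'EOF'
username=winuser
password=changeme
domain=mydomain.com
EOF
sudo chmod 600 /etc/cifs-creds
sudo mount.cifs "//1.1.1.1/ntfs_share" -o credentials=/etc/cifs-creds,rw,dir_mode=0775,noperm /media/foo/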

I recently wrote a Docker event monitor in Go as an exercise to demonstrate some proficiency in Go and Docker. Before getting started I was thinking about how it could be done with piping, bash, and awk. It was actually really easy to do. Some of the exercise requirements were:

  • The service should monitor the Docker API for restart events.
  • The service should run an arbitrary command in response to that event.
  • The arbitrary command should be supplied via a config file.

The pipe, bash, awk solution:

echo ' pwd' > cmd.txt && docker events | awk '/container restart/{system("echo docker exec " $4 " $(cat cmd.txt) | bash -")}'

This writes the pwd command into a file and then pipes the stream from docker events into awk, which watches for the restart event. When a restart event is encountered, awk executes the arbitrary command from the text file against the restarted container. The command in the text file could be replaced with any desired command.
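
A variant that drops awk entirely (a sketch; it assumes a Docker version where docker events supports --filter and --format) lets Docker do the matching and hands the loop just the container ID:

docker events --filter 'event=restart' --format '{{.ID}}' | while read -r id; do
  docker exec "$id" $(cat cmd.txt)
done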

There were other requirements that made it interesting to think through in Go. I’ll be posting the entire exercise and my code soon.

ctrl+r...

Press control plus r and start typing part of a command you want to recall; it searches your shell history right from the terminal. This is an old but good one. I still use it fondly: it took several years of command line work before I stumbled upon this trick!

ssh -NL 3389:internal.network.ip.address:3389 user@public.ip.address

In this example I’m forwarding an RDP connection to a machine on an internal network IP address through a public-facing SSH server. I’ve found this sort of tunnelling through SSH to be invaluable; I’ve used it for everything from accessing Windows servers over RDP, where the only external access was a key-protected SSH server, to forwarding http/https traffic to remote APIs for testing and debugging.
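
For the API case the pattern is identical; only the ports and target change. A sketch (the hostnames and ports here are placeholders) that exposes an internal HTTPS API on a local port:

ssh -NL 8443:internal.api.address:443 user@public.ip.address

Point your client at https://localhost:8443 and the traffic rides the tunnel to the internal host.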