Bringing DevOps Home

Hmmm… almost two years since a post. I've been pretty busy.

Anyway, I've been making some major changes to how I manage my home server, so I thought a write-up would be interesting. I've had some form of home server for over 10 years. It's always been Linux, but it has morphed in many ways over the years. It began as a frontend for XBMC (before it was Kodi), but after migrating to Plex, it became completely headless. Over the years, I stacked software on it, installing OS-level packages and letting them run behind the scenes or through Apache. Quite honestly, it became a mess. I took inventory of it one day and realized that should this thing ever die a horrible fiery death, I'd have weeks of rebuilding ahead of me. Plus, I frequently found myself in dependency hell, trying to track down what was needed to do the thing I wanted to do.

So, I took a step back, remembered some of those buzzwords that I’d heard around work and the internet, studied them, compared them, decided on a toolset, and put them into practice.

One thing was clear from my analysis: I am not running the data center for a Fortune 500 company. I only manage three small Linux servers for personal use, so lightweight, simple options often work great for me. That said, many of these solutions can scale quite large with some additional thought toward configuration or with add-on services.

Ansible – Configuration Management

First of all, I'm tired of manually editing configuration files, documenting my changes, and hoping I can remember what I did the next time. That's silly. I needed a configuration management system. I considered a few alternatives (Chef, Salt, and Puppet were the main competitors), but I chose Ansible. It's a simple push-based architecture that relies on ssh and Python, two technologies I already know very well. It also does not require any special infrastructure; I could run it off my server, a Raspberry Pi, or a laptop.

I'd actually done some work on my home server with Ansible before, but I'd automated individual tasks, not the state. I started over with a blank slate, following the Ansible best-practices layout to define the state I wanted rather than the tasks I needed to run on my server. Then I stuck it all in a Git repo. I'm currently using GitLab, because their free accounts offer the most flexibility for a hobbyist.
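As a sketch of what that layout looks like, here is a minimal top-level playbook in the best-practices style; the host group and role names are hypothetical, not my actual configuration:

```yaml
# site.yml - declare the state the server should be in
- hosts: homeserver
  become: true
  roles:
    - common      # base packages, users, sshd settings
    - docker      # container runtime plus service definitions
    - monitoring  # exporters and dashboards
```

Because each role describes state rather than steps, re-running the playbook is safe: Ansible only changes what has drifted.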

Semaphore – Ansible Frontend

Semaphore Dashboard

Ansible's full capabilities are available from the command line alone, but sometimes it's easier to just open a web page and click a button. The enterprise solution is Ansible Tower, and its open source upstream is AWX. I did play with AWX and found a lot of good features, but it was very heavy for my needs: it requires four Docker containers (web UI, worker(s), PostgreSQL, and RabbitMQ). I found Semaphore to be simple and lightweight, and it did everything I needed. It can manage SSH keys, the Git repos holding your playbooks, users, and projects. On any playbook execution, it updates from git, then performs the requested action. There is currently no internal scheduling mechanism, but there is a REST API available for externally triggered jobs.

At the end of the day, it accomplishes my goal rather well. I can edit, commit, merge, and run, all in a handful of minutes (more if I actually test first).

Docker – Application Management

One of my bigger frustrations was managing software dependencies. I often found myself troubleshooting dependency chains, manually editing configuration files, and configuring Linux users and groups to allow shared file access.

Why keep doing this? Most mainstream Linux services have an image on Docker Hub, and the Dockerfile behind it is usually easy to read, so even if you don't like some of the practices in an image, you can create your own. Additionally, Ansible has great Docker modules, so these services can be configured with the same configuration management system as the rest of the box. Some of the services I'm running in Docker are:

  • Airsonic – Free, web-based media streamer.  Fork of Subsonic.
  • Grafana – Analytics and Monitoring
  • Plex – Media Server
  • Portainer – Management UI for Docker – Useful for inspecting and viewing logs
  • Prometheus (and add-ons) – Monitoring system and time series database
  • RabbitMQ – Message broker – Used in a Django/Celery project I’m working on.  Prime candidate for Docker due to Erlang requirements.
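As an example of what those Ansible Docker modules look like, here is a hedged sketch of a task using the docker_container module; the image tag and volume paths are illustrative, not my actual configuration:

```yaml
- name: Run Plex media server
  docker_container:
    name: plex
    image: plexinc/pms-docker:latest
    state: started
    restart_policy: unless-stopped
    network_mode: host
    volumes:
      - /opt/plex/config:/config   # persistent configuration
      - /data/media:/data:ro       # media library, read-only
```

A task like this is idempotent: if the container is already running with the requested settings, Ansible leaves it alone.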

Prometheus – Monitoring

Previously, I had been relying on Icinga2. While stable, configuration was a pain, and it relied on OK/Warn/Critical limits that had to be configured remotely on each node. I felt like I needed to re-learn the configuration schema every time I added a new custom alert. Additionally, Icinga2 had limited out-of-the-box options for reporting history and graphing. It was also dependent on Apache and MySQL, so what would alert me if those went down?

After analyzing my options, I gravitated toward Prometheus. It doesn't come pre-configured with a bunch of fancy dashboards and alerts like some other offerings, but it is easy to manage, and there are many add-ons to enrich the experience. Data is gathered through exporters, which Prometheus scrapes with HTTP requests; it can even scrape HTTPS URLs with authentication. I'm currently using a few exporters to gather information on my systems:

  • Prometheus node_exporter – Runs as a service on all nodes to collect system metrics.  This exporter can even scrape text files, which I've configured to check for available apt packages on my Ubuntu systems.
  • cAdvisor – Analyzes resource usage and performance characteristics of running containers (offered by Google).
  • Blackbox exporter – Allows blackbox probing of endpoints over HTTP, HTTPS, DNS, TCP and ICMP.
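For reference, a minimal prometheus.yml scrape block for the exporters above might look like the following; the hostnames are placeholders (node_exporter listens on 9100 and cAdvisor on 8080 by default):

```yaml
scrape_configs:
  - job_name: node
    static_configs:
      - targets:
          - server1.example.lan:9100
          - server2.example.lan:9100
  - job_name: cadvisor
    static_configs:
      - targets: ['server1.example.lan:8080']
```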

I plan to retire Icinga2 soon, after I have been able to improve my alerting thresholds and gain a little more confidence in the system.

Grafana – Analytics and Alerting for Prometheus

Prometheus is great for storing and querying data. It can graph data, but its interface is best used for developing new queries and graphs. I found Grafana to be the best package deal to pair with Prometheus, as it can generate graphs and send alerts to multiple channels. I've tried my hand at building my own dashboards, but the shared dashboards available are much better than anything I've been able to create quickly. So far, I've set up whole-system dashboards to help me monitor and alert on various metrics; the big ones for me are filesystem space, backup status, and pending security patches. As a bonus, I've also been able to create dashboards for others that show only the metrics they care about (and automate a nagging email when disk space runs low).

I’ll end this post with some of the graphs I have configured in my Grafana Dashboards:

WordPress performance with caching

In my last entry, I detailed the performance gains to be had from switching host providers. That's pretty cool, but a lot can still be done within WordPress to improve performance with caching. Here, I'm going to take the URL from my previous blog post and run it through similar benchmark tests to see what kind of difference caching makes.

During these tests, nothing changed except the caching plugin. All server variables remained constant, and no other plugins were touched. The plugin allows WordPress to generate a static HTML page to take the place of the PHP/MySQL code path. A page request then simply reads a flat file that is ready to go, rather than executing PHP and pulling data from the database, which cuts processing time.

Note that this test does not download images, JavaScript, or any other static content that would normally accompany a webpage. I'm purposely leaving that out to test the webserver's ability to process the WordPress PHP code only.

Test #1: 1000 requests, single threaded

Example command: ab -n 1000 -e post_280_ssl_std.csv -g post_280_ssl_std_gnuplot.tsv

General Numbers:

                          Uncached               Cached
 Document Length          35,424 bytes           35,568 bytes
 Concurrency Level        1                      1
 Time taken for tests     280.391 seconds        171.673 seconds
 Complete requests        1000                   1000
 Failed requests          389 (length)           0
 Total transferred        35,789,569 bytes       35,873,068 bytes
 HTML transferred         35,423,569 bytes       35,568,000 bytes
 Requests per second      3.57 [#/sec]           5.83 [#/sec]
 Mean time per request    280.391 [ms]           171.673 [ms]
 Transfer rate            124.65 [Kbytes/sec]    204.06 [Kbytes/sec]

For this test, there were 389 failed requests based on length.  Researching this error indicates it could be caused by dynamic content, and does not necessarily indicate a problem.  Therefore, I’m going to ignore this figure, and assume all connections were successful.
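The throughput and latency figures in the table follow directly from the totals, which is a quick way to sanity-check any ab run:

```python
# Sanity-check the ab report: requests/sec and mean latency follow
# directly from the request count and total elapsed time above.
requests = 1000

for label, seconds in (("uncached", 280.391), ("cached", 171.673)):
    rps = requests / float(seconds)          # e.g. 1000 / 280.391 = 3.57
    mean_ms = seconds / requests * 1000.0    # single-threaded, so mean ms = total s
    print("%s: %.2f req/sec, %.3f ms mean per request" % (label, rps, mean_ms))
```

With concurrency 1, the mean time per request in milliseconds equals the total test time in seconds, which matches the table.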


I switched my host provider!

…and you should too!
(provided you know a thing or two about system management and online security)

I found myself in a place where the Basic and Plus hosting accounts were providing extremely sub-par service, with no SSL. I had two options: move up to the $15-a-month (on sale) "Pro" hosting account, or jump ship. I jumped to a $10/month Digital Ocean Droplet, and I couldn't be happier!

  • root access
  • Faster performance
  • SSL for free, thanks to Let’s Encrypt
  • Free rein to monitor and tune the system
  • Complete control over security policies and patching

Note, all of these things come with varying levels of responsibility, which should not be taken lightly. There are plenty of tutorials out there on how to harden a server and configure web services. If you go down this road, I highly suggest you do your research first.

While that short bit on the "why" is important, I really wrote this to share some performance data! I used Apache Benchmark against my old hosting account and my new VM. Honestly, I don't get that many hits, so load on my own server is negligible. To give both hosts a fair shot, I performed these tests between 12:00 am and 2:00 am CST. I used the same theme and the same config options. I'm unable to modify the my.cnf file on my shared hosting provider, so I left the defaults in place on my new host. I did create an Apache virtual host, but otherwise left the Apache configs alone for similar reasons. My site runs WordPress, and I made sure both sites were running the same plugins with the same options and the same theme; at the time, those were LightWord, Akismet, Jetpack, SyntaxHighlighter Evolved, and Ultimate Google Analytics.

Test #1 – 1,000 gets against the main page, single thread:

There is a slight difference in the total bytes and the file size transferred. I've identified this as the difference between a custom footer and the standard one. It was a negligible difference, and the tests took a while to complete, so I left it alone. Also, the hostnames differ because I ran the tests at the same time, using a sub-domain pointed at the new host.

Example command: ab -n 1000 -e digitalocean.csv -g digitalocean_gnuplot.tsv

General Numbers:

 Host Type                   Shared               Virtual Machine      Virtual Machine
 Host Provider               Bluehost             Digital Ocean        Digital Ocean
 Monthly Cost                $6.95                $10                  $10
 Server Software             Apache               Apache/2.4.7         Apache/2.4.7
 Server Hostname
 Server Port                 80                   80                   443
 SSL/TLS Protocol            n/a                  n/a                  TLSv1.2,ECDHE-RSA-AES128-GCM-SHA256,2048,128
 Document Path               /                    /                    /
 Document Length             88,453 bytes         84,502 bytes         84,502 bytes
 Concurrency Level           1                    1                    1
 Time taken for tests        1380.997 seconds     265.789 seconds      399.775 seconds
 Complete requests           1000                 1000                 1000
 Failed requests             0                    0                    0
 Total transferred           88,714,000 bytes     84,824,000 bytes     84,524,000 bytes
 HTML transferred            88,453,000 bytes     84,502,000 bytes     84,203,000 bytes
 Requests per second (mean)  0.72 [#/sec]         3.76 [#/sec]         2.50 [#/sec]
 Time per request (mean)     1380.997 [ms]        265.789 [ms]         399.775 [ms]
 Transfer rate               62.73 [Kbytes/sec]   311.66 [Kbytes/sec]  206.47 [Kbytes/sec]


MariaDB master-master configuration over ssh tunnel

A master-master database cluster is often referred to as an active-active database cluster. Some other database systems, including later MariaDB releases, have built-in cluster configuration options. This guide sticks to MariaDB 5.5, as it is still completely compatible with MySQL 5.5. In a MySQL master-master configuration, both nodes have a reciprocal master-slave relationship: Node A is a master for Node B, and Node B is a master for Node A. If a change is made on either instance, it is reflected on the other.

This is configured for a number of reasons. If the database is extremely active, it can be load balanced across two physical database nodes. It can also be set up for redundancy, so if one node fails, all traffic can be instantly redirected to the other. In my case, I just want the data available on two nodes in separate networks, and I want it to stay in sync. While there are many VPN projects available, I know and trust ssh, and it does not require any additional software on the nodes.

Chances are, if you’ve landed here, you have some knowledge on databases and Linux, but I’ll still warn you.

Warning:  Be very careful writing to two databases in this configuration!  There is no conflict resolution system!

I use this system in two different ways.  First, it is an always-on backup system for some databases that are only needed on one server.  Second, I have a script that writes to the two different instances, but one instance only receives inserts, and the other only receives updates.  Please don’t bug me with data integrity issues if you don’t heed my advice.  Also keep in mind that since these nodes are not on the same network, there may be some additional latency.


First, you will need two Linux nodes.  I recommend using the same distribution so your packages will remain in sync.  Make sure you are able to ssh between the two, and install MariaDB server.  Please refer to standard documentation in setting up these services, as I’m considering that out of scope, but I have provided some links below.

It is best to access your nodes by domain name, as IPs change, and hard-coded values can be a royal pain to hunt down and fix. I personally use No-IP for my dynamic DNS needs.

For this scenario, I am using Ubuntu 14.04 Server Edition with MariaDB 5.5. This should also work with MySQL 5.5, but I did not test it. The guide should work with any other Linux distribution as well, though there may be slight differences in file names or log locations. I've personally gone back to Ubuntu because I don't have time to download and compile software or track down additional repositories; it seems like every Linux software developer makes releasing on Ubuntu a priority, and that far outweighs any other factor for me. Since MariaDB is a drop-in replacement for MySQL, many commands still reference MySQL, so no, I am not confused.

It’s best to start with a database that is either already manually in sync, or start with no database in place, then create them.  In my situation, I did not have the database created when I set up replication.  I used an SQL file to create the database, schema, and add data after replication was set up.

Step 1: Configure MariaDB Ports

For the purpose of this guide, I will refer to the nodes as localnode and remotenode. When possible, use the fully qualified domain name (FQDN). Go ahead and su to root, as you will need it for most of the operations here.

If your MariaDB server is currently running, shut it down on both nodes:

root@localnode:~# service mysql stop
root@remotenode:~# service mysql stop

To keep things sane and easy for me, I just switched the MariaDB listening port on remotenode to 3305.  This can be changed in the my.cnf file:

root@remotenode:~# vi /etc/mysql/my.cnf

The port is defined twice, once under the [client] header, and again under the [mysqld] header.  Find both, and change them to 3305:

port            = 3305
port            = 3305

You can start the service up to verify that it is running on the port, but let’s go ahead and shut it down immediately after.

root@remotenode:~# service mysql start
 * Starting MariaDB database server mysqld                                            [ OK ]
 * Checking for corrupt, not cleanly closed and upgrade needing tables.

root@remotenode:~# netstat -anp | grep :3305
tcp        0      0*               LISTEN      31586/mysqld
tcp        0      0         ESTABLISHED 31586/mysqld

root@remotenode:~# service mysql stop
 * Stopping MariaDB database server mysqld                                            [ OK ]

This shows that you’ve changed the port on the server, but before we move on, let’s try a test connection with the client:

root@remotenode:~# mysql -u root -p
Enter password:
Welcome to the MariaDB monitor.  Commands end with ; or g.
Your MariaDB connection id is 58
Server version: 5.5.37-MariaDB-0ubuntu0.14.04.1-log (Ubuntu)
MariaDB [(none)]> status;
mysql  Ver 15.1 Distrib 5.5.37-MariaDB, for debian-linux-gnu (x86_64) using readline 5.1
MariaDB [(none)]> exit

Let’s move on!

Note: this configuration is my personal preference. You may instead leave both instances listening on 3306, but you will then need to set up the ssh tunnel slightly differently; I will note this in the next section. This setup is ideal for me because the port tells me which instance I'm connecting to.

Step 2: Configure SSH

For this step, you will need to create a public/private key pair. Here are the commands to generate and copy the keys between the nodes. Do not set a password on the key. Follow the prompts and enter passwords when necessary. If you are unfamiliar with this process, it would be best to read up on it using the link below.

root@localnode:~# ssh-keygen -b 4096
root@localnode:~# ssh-copy-id root@remotenode

Now, repeat the process from the other node.

root@remotenode:~# ssh-keygen -b 4096
root@remotenode:~# ssh-copy-id root@localnode

You should now be able to ssh back and forth without being prompted for a password.

root@localnode:~$ ssh
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-30-generic x86_64)
root@remotenode:~$ ssh
Welcome to Ubuntu 14.04 LTS (GNU/Linux 3.13.0-30-generic x86_64)

Step 3: Set Up SSH Tunnel

Hat tip to the comments on this blog post!

Edit the crontab on your localnode.

root@localnode:~# crontab -e

Add the following:

* * * * * nc -z localhost 3305 || ssh -f -L 3305: -N

Edit the crontab on remotenode.

root@remotenode:~# crontab -e

Add the following:

* * * * * nc -z localhost 3306 || ssh -f -L 3306: -N

Now, once a minute, crontab will check for a connection, and if it does not exist, it will open an ssh tunnel.  Once this tunnel is established, 3305 on localnode will connect directly to 3305 on the remotenode.  Also 3306 on remotenode will connect to 3306 on the localnode.

root@remotenode:~# netstat -anp | grep :3306
tcp        0      0*               LISTEN      24931/ssh
tcp        0      0         ESTABLISHED 24931/ssh
tcp6       0      0 ::1:3306                :::*                    LISTEN      24931/ssh
root@localnode:~# netstat -anp | grep :3305
tcp        0      0*               LISTEN      2104/ssh
tcp        0      0         ESTABLISHED 2104/ssh
tcp6       0      0 ::1:3305                :::*                    LISTEN      2104/ssh

If you chose to leave the database listening port as 3306 on both nodes, you would instead use crontab entries similar to this:

* * * * * nc -z localhost 3305 || ssh -f -L 3305: -N
* * * * * nc -z localhost 3305 || ssh -f -L 3305: -N

Step 4: Configure Master-Master Replication for MariaDB

Another hat tip to this tutorial.

Time to edit the my.cnf again, but we will be making the same changes to both nodes.

root@bothnodes:~# vi /etc/mysql/my.cnf

We will be making several changes. First up is server-id; make sure the two nodes use different values. I set localnode to 1 and remotenode to 2.

server-id               = 1 # localnode
server-id               = 2 # remotenode

Next, make sure log_bin and log_bin_index are uncommented.

log_bin                 = /var/log/mysql/mariadb-bin.log
log_bin_index           = /var/log/mysql/mariadb-bin.index

Now, we need to add the databases to my.cnf. To replicate multiple databases in a master-master (or any master-slave) configuration, just add a separate binlog_do_db line for each.

binlog_do_db            = database_number_1
binlog_do_db            = database_number_2

There are other tuning parameters that can be changed should you experience performance issues. The notes in my default my.cnf say these defaults are geared toward safety, not performance.

expire_logs_days        = 10
max_binlog_size         = 100M

Now, we can bring up the database on both nodes.

root@remotenode:~# service mysql start
 * Starting MariaDB database server mysqld                                            [ OK ]
 * Checking for corrupt, not cleanly closed and upgrade needing tables.
root@localnode:~# service mysql start
 * Starting MariaDB database server mysqld                                            [ OK ]
 * Checking for corrupt, not cleanly closed and upgrade needing tables.

Now, we need to log into the database on both nodes.

root@bothnodes:~# mysql -u root -p
Enter password:
MariaDB [(none)]>

First, we need to create the replication user on both nodes. Of course, feel free to use a different username or password. Normally this user would not be restricted to localhost, but since the ssh tunnel makes the connection appear local, we can lock it down. Run this on both nodes.

# both nodes
MariaDB [(none)]> create user 'replicator'@'localhost' identified by 'password'; 
MariaDB [(none)]> grant replication slave on *.* to 'replicator'@'localhost';

Now, we are ready to start the replication. On localnode, run the show master status command in the mysql client; note the File and Position values.

MariaDB [(none)]> show master status;
| File               | Position | Binlog_Do_DB                          | Binlog_Ignore_DB |
| mariadb-bin.000019 |   780421 | database_number_1,database_number_2   |                  |

Now that we have this information, we go over to the remotenode to start the slave replication, using the File and Position output from the master status.  We also use the port definition of 3306.

MariaDB [(none)]> STOP SLAVE;

MariaDB [(none)]> CHANGE MASTER TO MASTER_HOST = '', MASTER_PORT = 3306, MASTER_USER = 'replicator', MASTER_PASSWORD = 'password', MASTER_LOG_FILE = 'mariadb-bin.000019', MASTER_LOG_POS = 780421;

MariaDB [(none)]> START SLAVE;

Now, we do the same in reverse.

MariaDB [(none)]> show master status;
| File               | Position | Binlog_Do_DB                          | Binlog_Ignore_DB |
| mariadb-bin.000025 |      245 | database_number_1,database_number_2   |                  |

Then we set the localnode as a slave to the remotenode.

MariaDB [(none)]> STOP SLAVE;

MariaDB [(none)]> CHANGE MASTER TO MASTER_HOST = '', MASTER_PORT = 3305, MASTER_USER = 'replicator', MASTER_PASSWORD = 'password', MASTER_LOG_FILE = 'mariadb-bin.000025', MASTER_LOG_POS = 245;

MariaDB [(none)]> START SLAVE;

Of course, in the situation where both databases listen on 3306 and the ssh tunnel listens on 3305, you would need to define the port as 3305 in both CHANGE MASTER commands. Note that MASTER_PORT takes a number, not a quoted string.

Now we can test!  From the localnode, create a database.

MariaDB [(none)]> create database database_number_1;

If you go over to the remotenode, you will see that the db has been created.  Similarly, when you make a change to the secondary, it will be reflected in the primary.

Have fun!  My experience has been that the two instances sync instantaneously.  I have a standard Time Warner internet connection, and I was unable to detect a lag in the remote node, even when scripting thousands of inserts and updates on the local node.  Your results may vary, depending on connection reliability and connection speed.



MariaDB Installation Guide:

Ubuntu Documentation: SSH/OpenSSH/Keys –

How to Set Up MySQL Master-Master Replication –

Calling rsync with Python’s Subprocess module

I was recently trying to script multiple file transfers via rsync, but unfortunately, I was unable to control the file names. I chose to use Python and issue commands to the OS to initiate the transfers. Initially, everything was working great, but as soon as I encountered a space or a parenthesis, the script blew up!

In this tutorial, I'm showing how to transfer a single file, but rsync is a very powerful tool capable of much more. The principles discussed in this post can be adapted to other uses of rsync, but I'm considering general rsync usage out of scope here; there is a very good online man page. I've chosen to initiate transfers one file at a time so I can run multiple connection streams rather than a single stream that transmits all files in sequence. It is much faster this way with my current ISP, as I suspect they shape my traffic. Also note, these methods can be applied to scp file transfers as well.

We will start with a very basic rsync command to copy /tmp/test.txt from a remote location to my local pc.  Before starting, I’ve set up public key authentication between my home pc and the remote server.  I initiate the connection from my home pc, as I don’t care to put private keys in remote locations that could provide access to my home network.

/usr/bin/rsync -va

This works very well, but what happens when the file has a space? With most commands, we can just wrap quotes around it, and it works.

rsync'/tmp/with space.txt' '/tmp/with space.txt'
rsync: link_stat "/tmp/with" failed: No such file or directory (2)
rsync: link_stat "/home/myusername/space.txt" failed: No such file or directory (2)
rsync error: some files/attrs were not transferred (see previous errors) (code 23) at main.c(1637) [Receiver=3.1.0]

Unfortunately, in this case, the remote system sees "/tmp/with space.txt" as two separate files, /tmp/with and $HOME/space.txt. For the remote location, we need to both wrap it in quotes and escape it. We could also double-escape the filename, but I chose to keep things looking somewhat sane.

/usr/bin/rsync'/tmp/with space.txt' '/tmp/with space.txt'

That is fine, but we need a good way to do this on the fly when we are given file names in bulk.  There are three key libraries I like to use when doing this:

  • subprocess – This is an extremely powerful library for spawning processes on the OS.
  • os.path – This submodule of os contains very useful tools for manipulating filesystem object strings.
  • re – Regular expression operations provides an easy to use escape function.

In a nutshell, here is the operation that needs to happen to create the command and execute it:

import re
import subprocess

full_remote_path = "/tmp/filename with space.txt"
full_local_path = "/tmp/filename with space.txt"
remote_username = "myusername"
remote_hostname = ""

# Here we use re.escape to escape the paths.
escaped_remote = re.escape(full_remote_path)
escaped_local = re.escape(full_local_path)

# I've chosen to just escape the local path and leave off the quotes.
cmd = "/usr/bin/rsync -va %s@%s:'%s' %s" % (remote_username, remote_hostname, escaped_remote, escaped_local)
print(cmd)

p = subprocess.Popen(cmd, shell=True).wait()

Here is the rsync command that is sent to the os:

/usr/bin/rsync -va'/tmp/filename with space.txt' /tmp/filename with space.txt

Now that we have this working, I get to explain how os.path fits in. Should you be copying /tmp/mydirectory/afile.txt from the remote system to /tmp on your local system when /tmp/mydirectory does not exist, you will receive an error:

rsync -qv /tmp/mydirectory/test.txt
rsync: change_dir#3 "/tmp/mydirectory" failed: No such file or directory (2)
rsync error: errors selecting input/output files, dirs (code 3) at main.c(694) [Receiver=3.1.0]

The easiest way to handle this is to run a simple mkdir -p on /tmp/mydirectory before beginning. If the directory exists, the command does nothing; if it is missing, it is created along with any necessary parent directories. In the case where you are copying a file to a remote machine, you can pass this command to the remote machine via ssh.

To do this in Python, I like to take the full filename and split it to get the containing directory path.

import os
import re
import subprocess

local = "/tmp/mydirectory/test.txt"

localdir = os.path.split(local)[0]
localdir = "%s/" % localdir
localdir = re.escape(localdir)

mkdir_cmd = '/bin/mkdir -p %s' % localdir
p = subprocess.Popen(mkdir_cmd, shell=True).wait()
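For the remote-copy direction mentioned above, the same mkdir trick can be wrapped in an ssh invocation. Here is a minimal sketch of building that command; the username and hostname are hypothetical, and the command is only printed, not executed:

```python
import re

def remote_mkdir_cmd(user, host, directory):
    # mkdir -p is safe to run unconditionally: it creates any missing
    # parent directories and does nothing if the path already exists.
    escaped = re.escape(directory)
    return "ssh %s@%s '/bin/mkdir -p %s'" % (user, host, escaped)

# Hypothetical user/host for illustration only.
cmd = remote_mkdir_cmd("myusername", "example.com", "/tmp/my directory")
print(cmd)
```

Running this command via subprocess before the rsync call ensures the remote target directory exists.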

Here is my full example code that I created to test and demo this technique:

#! /usr/bin/python

import subprocess
import os
import re

def do_rsync(rh, ru, rd, rf, ld):

 # The full file path is the directory plus file.
 remote = os.path.join(rd, rf)

 # escape all characters in the full file path.
 remote = re.escape(remote)

 # here we format the remote location as 'username@hostname:'location'
 remote = "%s@%s:'%s'" % (ru, rh, remote)

 # here we define the desired full path of the new file.
 local = os.path.join(ld, rf)

 # This statement will provide the containing directory of the file
 # this is useful in case the file passed as rf contains a directory
 localdir = os.path.split(local)[0]

 # os.path.split always returns a directory without the trailing /
 # We add it back here
 localdir = "%s/" % localdir

 # escape all characters in the local filename/directory
 local = re.escape(local)
 localdir = re.escape(localdir)

 # before issuing the rsync command, I've been running a mkdir command
 # Without this, if the directory did not exist, rsync would fail.
 # If the directory exists, then the mkdir command does nothing.
 # If you are copying the file to a remote directory, the mkdir command can be passed via ssh
 mkdir_cmd = '/bin/mkdir -p %s' % localdir

 # create the rsync command
 rsync_cmd = '/usr/bin/rsync -va %s %s' % (remote, local)

 # Now we run the commands.
 # shell=True is used as the escaped characters would cause failures otherwise.
 p1 = subprocess.Popen(mkdir_cmd, shell=True).wait()
 p2 = subprocess.Popen(rsync_cmd, shell=True).wait()
 print("")
 return 0

rh = ""
ru = "myusername"
rd = "/tmp"
rf = "test.txt"
ld = "/tmp"

print("Here we do a simple test with test.txt")
do_rsync(rh, ru, rd, rf, ld)

rf = "this is a filename - with (stuff) in it.dat"

print("Here is a filename with a bit more character.")
do_rsync(rh, ru, rd, rf, ld)


A function like this could be put into place very easily, but a few changes would be necessary to make it production ready. The rsync output can be minimized by changing the -v to a -q, but then you will want to check the exit status from subprocess to determine whether the transfer succeeded. In my case, I chose to use the Process and Queue classes from the multiprocessing module to manage multiple streams.
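To illustrate that exit-status point, here is a minimal sketch; the rsync command in the comment is hypothetical, and this only demonstrates the pattern of checking the return code when -q suppresses the output:

```python
import subprocess

def run_checked(cmd):
    # Run a shell command and surface its exit status; with rsync -q
    # there is no output to parse, so the return code is the only
    # signal of success or failure (rsync returns 0 on success).
    rc = subprocess.Popen(cmd, shell=True).wait()
    if rc != 0:
        print("command failed with exit status %d: %s" % (rc, cmd))
    return rc

# Hypothetical quiet transfer:
# status = run_checked("/usr/bin/rsync -q myusername@example.com:'/tmp/a.txt' /tmp/a.txt")
```

A nonzero status can then be used to re-queue the file for another attempt.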