Clipboards Have Multiple Personalities

Clipboards are more than a text buffer; they're almost full databases

One day, I copied a paragraph from a Word document into another Word document, and also into notepad.exe. When I pasted into the second Word document, all my font sizes, colors, spacing, etc. remained in place, but when I pasted into notepad.exe I just got the raw text. Clearly the clipboard is storing styling information, but is it additionally storing the raw version? I had to find out more...

It turns out that, yes, when you copy data, multiple formats are stored on the clipboard, and each application then decides which of those formats it can accept. Let's have a quick look at that in practice.

Cracking open the clipboard

In this demo, I am using Linux with the Xorg clipboard, although similar principles apply on Windows and, I'm sure, macOS.

The Windows InsideClipboard application

Using the tool xclip, we can examine the clipboard in detail. To get a list of the formats currently stored on the clipboard, use the command: xclip -selection clipboard -o -t TARGETS[1]. For the first demo, let's copy some raw text and see what's on the clipboard.

xclip showing text clipboard formats

It seems we get a MIME type for plain text, a UTF-8 formatted string, and a timestamp for when the copy event happened. Nothing too surprising other than the UTF-8 string. Does the system automatically convert plain text to UTF-8 in case an application doesn't accept plain text? And what if we were to copy an image?

xclip showing image clipboard formats

Again, no surprises there. Now for a file from my Nautilus file manager.

xclip showing file clipboard formats

We have a lot of objects, but many of them contain the same data, which is simply a URI to the file, e.g. file:///home/harrison/example.json. Finally, what kicked this all off, some formatted text from Google Docs:

xclip showing Google Docs clipboard formats

Looks like it stores both HTML and a custom Google Docs schema (in Word, this would be XML). We can also see that web apps can register custom schemas. Are there security implications to this?

One little quirk I found while looking into this is that when you copy an image from Firefox, it converts the image on the fly to a whole bunch of image formats!

xclip showing clipboard formats after copying an image in Firefox
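If you'd rather script the poking around from this section than run xclip by hand, here is a rough Node.js sketch. It assumes xclip is installed and an X session is running. The helper names (parseTargets, parseUriList, uriListToPaths, readTarget, describeClipboard) are my own inventions for illustration, and the only xclip invocation used is the one shown earlier with a different target name swapped in.

```javascript
// Node.js sketch: inspect the X11 clipboard by shelling out to xclip.
// All function names here are hypothetical helpers, not part of xclip.
const { execFileSync } = require('child_process');
const { fileURLToPath } = require('url');

// xclip prints the TARGETS list as one format name per line.
function parseTargets(output) {
  return output
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line !== '');
}

// text/uri-list payloads hold one URI per line; lines starting with '#'
// are comments and blank lines are ignored.
function parseUriList(payload) {
  return payload
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line !== '' && !line.startsWith('#'));
}

// Convert the file: URIs of a uri-list payload into local paths.
function uriListToPaths(payload) {
  return parseUriList(payload)
    .filter((uri) => uri.startsWith('file:'))
    .map((uri) => fileURLToPath(uri));
}

// Read a single format from the clipboard as raw bytes.
// Throws if xclip is missing or the target is not available.
function readTarget(target) {
  return execFileSync('xclip', ['-selection', 'clipboard', '-o', '-t', target], {
    stdio: ['ignore', 'pipe', 'ignore'],
  });
}

// List every format on the clipboard with the size of its payload.
function describeClipboard() {
  const targets = parseTargets(readTarget('TARGETS').toString('utf8'));
  return targets.map((target) => {
    try {
      return { target, bytes: readTarget(target).length };
    } catch (err) {
      // Some targets cannot be read as data; record them with a null size.
      return { target, bytes: null };
    }
  });
}
```

On a machine with xclip, calling console.table(describeClipboard()) after copying something should print one row per format, which makes it easy to spot the formats that carry the same payload.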

A view from the other side

So we've seen what the clipboard looks like, but we're yet to see how applications copy various formats, and how they register which formats they can understand. Each OS and application framework does this differently, so let's use the great equalizer and look at this in the browser with JavaScript. First: setting custom clipboard data[2].

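document.addEventListener('copy', function(e) {
    // Store the same content under a standard type and a custom type.
    e.clipboardData.setData('text/plain', 'Howdy');
    e.clipboardData.setData('application/harrisonm', 'Howdy');
    // Without this, the browser overwrites our data with the current selection.
    e.preventDefault();
});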
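To mirror what Word and Google Docs appear to be doing, the same handler can offer one piece of content in several formats at once. This is only a sketch: makeCopyHandler is a hypothetical helper of mine, and the text/html and JSON payloads are made-up examples rather than anything a real application writes.

```javascript
// Build a 'copy' listener that offers the same content in several formats.
// `formats` maps a MIME type to the string stored under that type.
function makeCopyHandler(formats) {
  return function handleCopy(e) {
    for (const [type, value] of Object.entries(formats)) {
      e.clipboardData.setData(type, value);
    }
    // Keep the browser from replacing our data with the current selection.
    e.preventDefault();
  };
}

// Illustrative payloads: plain text, styled HTML, and a custom format.
const onCopy = makeCopyHandler({
  'text/plain': 'Howdy',
  'text/html': '<b>Howdy</b>',
  'application/harrisonm': JSON.stringify({ greeting: 'Howdy' }),
});

// Only attach the listener when running in a browser.
if (typeof document !== 'undefined') {
  document.addEventListener('copy', onCopy);
}
```

A plain-text editor pasting this would be expected to pick up 'Howdy', while something that understands HTML could take the bold version instead.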

Easy enough. What about listing and reacting to different formats?[3]

document.addEventListener('paste', function(e) {
    // List every format the clipboard is offering.
    console.log(e.clipboardData.types);
    if (e.clipboardData.types.indexOf('text/html') > -1) {
        // Grab the HTML version and handle the paste ourselves.
        var HTMLdata = e.clipboardData.getData('text/html');
        e.preventDefault();
    }
});
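The Word versus Notepad behaviour from the start of this post boils down to each application having its own preference order. Here is a small sketch of that idea for a web page, assuming a made-up preference list. PREFERRED, pickFormat and onPaste are hypothetical names, not browser APIs.

```javascript
// Formats this (hypothetical) page understands, richest first.
const PREFERRED = ['application/harrisonm', 'text/html', 'text/plain'];

// Return the first preferred format the clipboard offers, or null if none match.
function pickFormat(offered, preferred = PREFERRED) {
  const available = Array.from(offered); // works for arrays and array-like lists
  return preferred.find((type) => available.includes(type)) || null;
}

// Take over the paste only when the clipboard holds a format we understand.
function onPaste(e) {
  const type = pickFormat(e.clipboardData.types);
  if (type === null) {
    return null; // nothing we understand: leave the default paste alone
  }
  e.preventDefault();
  return { type: type, data: e.clipboardData.getData(type) };
}

// Only attach the listener when running in a browser.
if (typeof document !== 'undefined') {
  document.addEventListener('paste', (e) => console.log(onPaste(e)));
}
```

Swapping PREFERRED for ['text/plain'] would give you a Notepad-style page that ignores everything except the raw text.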

That's all. I hope you enjoyed my little foray into clipboard management and, more importantly, that you learnt something!