Clipboards Have Multiple Personalities
Clipboards are more than a text buffer; they're almost full databases
One day, I copied a paragraph from a Word document into another Word document, and also into notepad.exe. When I pasted into the second Word document, all my font sizes, colors, spacing, etc. remained in place, but when I pasted into notepad.exe I just got the raw text. Clearly the clipboard is storing styling information, but is it additionally storing the raw version? I had to find out more...
It turns out that, yes, when you copy data, multiple formats are stored on the clipboard, and each application then decides which of those formats it can accept. Let's have a quick look at that in practice.
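A toy model may make the idea concrete first. The sketch below is not how any real clipboard is implemented; it simply assumes a clipboard is a map from format name to data, and that each application has an ordered list of formats it is willing to accept. The names `copy`, `paste`, and the two accept lists are hypothetical.

```javascript
// Toy model only: a clipboard holding several representations of one copy.
const clipboard = new Map();

// "Copying" stores every representation the source application can offer.
function copy(representations) {
  clipboard.clear();
  for (const [format, data] of Object.entries(representations)) {
    clipboard.set(format, data);
  }
}

// "Pasting" returns the first format the target application accepts,
// or null when nothing on the clipboard is usable.
function paste(acceptedFormats) {
  for (const format of acceptedFormats) {
    if (clipboard.has(format)) {
      return { format, data: clipboard.get(format) };
    }
  }
  return null;
}

copy({
  'text/html': '<b>Howdy</b>',
  'text/plain': 'Howdy',
});

// A rich-text editor prefers HTML; a plain-text editor only takes text.
const richEditorPaste = paste(['text/html', 'text/plain']);
const plainEditorPaste = paste(['text/plain']);
console.log(richEditorPaste.format, plainEditorPaste.format);
```

In this model the rich editor ends up with the styled version and the plain editor with the raw text, which matches what happened with Word and notepad.exe.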
Cracking open the clipboard
In this demo, I am using Linux with the Xorg clipboard, although similar principles apply on Windows and, I'm sure, macOS.
Using the tool xclip, we can examine the clipboard in detail. To get a list of formats currently stored on the clipboard, use the command: xclip -selection clipboard -o -t TARGETS [1]. For the first demo, let's copy some raw text and see what's on the clipboard.
Seems we get a MIME type for plain text, a UTF-8 formatted string, and a timestamp for when the copy event happened. Nothing too surprising other than the UTF-8 string. Does the system auto-convert plain text to UTF-8 in case an application doesn't accept plain text? What about if we were to copy an image?
Again, no surprises there. Now for a file from my nautilus file manager.
We have a lot of objects, but many of them contain the same data, which is simply a URI to the file, e.g. file:///home/harrison/example.json. Finally, what kicked this all off, some formatted text from Google Docs:
Looks like it stores both HTML and a custom Google Docs schema (in Word, this would be XML). We can also see that web apps can register custom schemas. Are there security implications to this?
One little quirk I found while looking into this is that when you copy an image from Firefox, it converts the image on the fly to a whole bunch of image formats!
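The xclip queries used throughout this section can also be scripted. Here is a rough sketch for Node.js, assuming xclip is installed and an X session is running. The function names `parseTargets`, `listClipboardTargets`, and `readClipboardAs` are my own invention; the xclip flags are the same ones used in the command above.

```javascript
// Split the newline-separated output of a TARGETS query into format names.
function parseTargets(output) {
  return output
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}

// Ask xclip which formats are currently on the clipboard.
// Assumes xclip is installed and an X session is running.
function listClipboardTargets() {
  // Required lazily so parseTargets stays usable without child_process.
  const { execFileSync } = require('child_process');
  const output = execFileSync(
    'xclip',
    ['-selection', 'clipboard', '-o', '-t', 'TARGETS'],
    { encoding: 'utf8' }
  );
  return parseTargets(output);
}

// Fetch the clipboard contents in one specific format, e.g. 'text/html'.
// Text formats only: decoding as UTF-8 would mangle binary data like images.
function readClipboardAs(target) {
  const { execFileSync } = require('child_process');
  return execFileSync(
    'xclip',
    ['-selection', 'clipboard', '-o', '-t', target],
    { encoding: 'utf8' }
  );
}
```

Calling `listClipboardTargets()` and then `readClipboardAs()` on each text-like name it returns is a quick way to check which targets really do hold the same data.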
A view from the other side
So we've seen what the clipboard looks like, but we're yet to see how applications copy various formats, and how they register which formats they can understand. Each OS and application framework does this differently, so let's use the great equalizer and look at this in the browser with JavaScript. Firstly: setting custom clipboard data[2].
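document.addEventListener('copy', function(e) {
  // Standard plain-text representation
  e.clipboardData.setData('text/plain', 'Howdy');
  // Custom, application-specific format
  e.clipboardData.setData('application/harrisonm', 'Howdy');
  // Keep the browser from replacing our data with the current selection
  e.preventDefault();
});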
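Nothing stops a page from offering several standard representations at once, which is what Word appeared to be doing in the opening example. A minimal sketch, assuming a hypothetical `handleCopy` function and made-up content; `setData` and `preventDefault` are the same calls used above.

```javascript
// Hypothetical handler: offer the same content as plain text and as HTML.
function handleCopy(e) {
  // Plain-text fallback for targets such as a basic text editor.
  e.clipboardData.setData('text/plain', 'Howdy');
  // Styled version for targets that understand HTML.
  e.clipboardData.setData('text/html', '<b>Howdy</b>');
  // Stop the browser from overwriting our data with the current selection.
  e.preventDefault();
}

// Only register in a browser, so the function can be exercised elsewhere too.
if (typeof document !== 'undefined') {
  document.addEventListener('copy', handleCopy);
}
```

Keeping the handler as a named function is purely a design choice for this sketch: it can be called with a stand-in event object, without needing a real copy event.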
Easy enough, and what about listing / reacting to different formats?[3]
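document.addEventListener('paste', function(e) {
  // List every format available on the clipboard
  console.log(e.clipboardData.types);
  // React only when an HTML representation is present
  if (e.clipboardData.types.indexOf('text/html') > -1) {
    var HTMLdata = e.clipboardData.getData('text/html');
    // Suppress the default paste so HTMLdata can be handled manually
    e.preventDefault();
  }
});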
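For a browser-side counterpart to the xclip listing, the paste handler can dump the contents of every format rather than just their names. This is a sketch only: `readAllFormats` is a hypothetical helper, and it relies on nothing beyond the `types` and `getData` members already used above.

```javascript
// Hypothetical helper: collect every format on a paste event's clipboardData.
function readAllFormats(clipboardData) {
  const formats = {};
  // Array.from copes with both a plain array and an array-like list of types.
  for (const type of Array.from(clipboardData.types)) {
    // getData returns a string; non-text entries such as files come back empty.
    formats[type] = clipboardData.getData(type);
  }
  return formats;
}

// Only register in a browser, so the helper can be exercised elsewhere too.
if (typeof document !== 'undefined') {
  document.addEventListener('paste', function (e) {
    console.log(readAllFormats(e.clipboardData));
  });
}
```

Pasting into a page that runs this logs one object per paste, keyed by format name.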
That's all, hope you enjoyed my little foray into clipboard management, and more importantly, that you learnt something!