Clipboards Have Multiple Personalities

Clipboards are more than a text buffer; they're almost full databases

One day, I copied a paragraph from a Word document into another Word document, and also into notepad.exe. When I pasted into the second Word document, all my font sizes, colors, spacing, etc. remained in place, but when I pasted into notepad.exe I just got the raw text. Clearly the clipboard is storing styling information, but is it additionally storing the raw version? I had to find out more... It turns out that, yes, when you copy data, multiple formats are stored on the clipboard, and each application then decides which of those formats it can accept. Let's have a quick look at that in practice.

Cracking open the clipboard

In this demo, I am using Linux with the Xorg clipboard, although similar principles apply on Windows and, I'm sure, macOS.
[Figure: The Windows InsideClipboard application]
Using the tool xclip, we can examine the clipboard in detail. To get a list of formats currently stored on the clipboard, use the command:
xclip -selection clipboard -o -t TARGETS
The -selection clipboard option is necessary because Linux generally has two clipboards: the one assigned to middle-click copying/pasting, and the Ctrl+C / Ctrl+V clipboard. xclip defaults to the middle-click one. For the first demo, let's copy some raw text and see what's on the clipboard.
[Figure: xclip showing text clipboard formats]
It seems we get a MIME type for plain text, a UTF-8 formatted string, and a timestamp for when the copy event happened. Nothing too surprising other than the UTF-8 string. Does the system auto-convert plain text to UTF-8 in case an application doesn't accept plain text? What if we were to copy an image?
[Figure: xclip showing image clipboard formats]
Again, no surprises there. Now for a file from my Nautilus file manager.
[Figure: xclip showing file clipboard formats]
We have a lot of objects, but many of them contain the same data, which is simply a URI to the file, e.g. file:///home/harrison/example.json. Finally, the thing that kicked this all off, some formatted text from Google Docs:
[Figure: xclip showing Google Docs clipboard formats]
Looks like it stores both HTML and a custom Google Docs schema (in Word, this would be XML). What we can also see is that web apps can register custom schemas. Are there security implications to this? One little quirk I found while looking into this is that when you copy an image from Firefox, it converts the image on the fly to a whole bunch of image formats!
[Figure: xclip showing clipboard formats after copying an image in Firefox]
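
Looking back over these listings, the names come in two styles: MIME types such as text/plain or text/html, and older X11 atom names such as UTF8_STRING, TIMESTAMP and TARGETS itself. As a rough sketch, the JavaScript below (runnable in Node.js) splits a TARGETS listing into those two groups. The function name groupTargets and the sample listing are made up for illustration; the sample is modelled on the plain-text demo described earlier, not copied from a real session.

```javascript
// Hypothetical helper: split the output of
//   xclip -selection clipboard -o -t TARGETS
// into MIME-style targets and plain X11 atom names.
function groupTargets(output) {
  const names = output
    .split('\n')
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  return {
    // MIME types always have the form type/subtype.
    mime: names.filter((name) => name.includes('/')),
    // Everything else is treated as an X11 atom (TIMESTAMP, UTF8_STRING, ...).
    atoms: names.filter((name) => !name.includes('/')),
  };
}

// Illustrative listing only, based on the plain-text demo above.
const sample = 'TIMESTAMP\nTARGETS\nUTF8_STRING\ntext/plain\n';
const grouped = groupTargets(sample);
console.log(grouped.mime);  // MIME-style targets
console.log(grouped.atoms); // X11 atom names
```

Any single target can then be dumped with the same command by swapping TARGETS for its name, for example xclip -selection clipboard -o -t text/html.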


A view from the other side

So we've seen what the clipboard looks like, but we're yet to see how applications copy various formats, and how they register what formats they can understand. Each OS and application framework does this differently, so let's use the great equalizer and look at this in the browser with JavaScript. Firstly: setting custom clipboard data.
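// See https://www.w3.org/TR/clipboard-apis/#override-copy
document.addEventListener('copy', function(e) {
	// Store the same text under a standard type and a custom one.
	e.clipboardData.setData('text/plain', 'Howdy');
	e.clipboardData.setData('application/harrisonm', 'Howdy');
	// Stop the browser from replacing our data with the current selection.
	e.preventDefault();
});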
Easy enough. What about listing and reacting to different formats?
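// See https://www.w3.org/TR/clipboard-apis/#override-paste
document.addEventListener('paste', function(e) {
	// List every format the source application put on the clipboard.
	console.log(e.clipboardData.types);
	// Take over the paste only when rich (HTML) content is available.
	if (e.clipboardData.types.includes('text/html')) {
		const html = e.clipboardData.getData('text/html');
		console.log(html);
		e.preventDefault();
	}
});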
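
The Word versus Notepad behaviour from the start of this post comes down to each application walking its own preference list against whatever types are on offer. Here is a minimal sketch of that negotiation. The pickFormat helper and both preference lists are hypothetical, invented for illustration rather than taken from any real application.

```javascript
// Hypothetical helper: return the first type the application prefers
// that is actually on the clipboard, or null if none match.
function pickFormat(availableTypes, preferredTypes) {
  const available = Array.from(availableTypes);
  return preferredTypes.find((type) => available.includes(type)) || null;
}

// Illustrative preference lists, not taken from any real application.
const richEditor = ['text/html', 'text/plain'];
const plainEditor = ['text/plain'];

// The same clipboard contents give different results per application.
const onClipboard = ['text/plain', 'text/html'];
console.log(pickFormat(onClipboard, richEditor));  // text/html
console.log(pickFormat(onClipboard, plainEditor)); // text/plain

// Only attach the listener when running in a browser.
if (typeof document !== 'undefined') {
  document.addEventListener('paste', function (e) {
    const type = pickFormat(e.clipboardData.types, richEditor);
    if (type !== null) {
      console.log('Using ' + type + ':', e.clipboardData.getData(type));
      e.preventDefault();
    }
  });
}
```

Keeping the choice in a plain function means it can be tried out in Node.js without a browser, while the guarded listener shows where it would plug into a real paste event.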
That's all! I hope you enjoyed my little foray into clipboard management and, more importantly, that you learnt something!