Adding Attachments to a Document in CouchDB
To start off, here are the links to my previous posts about CouchDB:
- Relaxing on the Couch(DB)
- Installing the Couch(DB)
- PUTting the Couch(DB) in Your Living Room
- GETting Documents From CouchDB
- DELETE Documents From CouchDB
Today, I want to talk about how to create attachments for a document. Documents in CouchDB can have attachments, just like an email. CouchDB has two ways of dealing with attachments:
- Inline Attachments
- Standalone Attachments
Inline Attachments
Inline attachments can be added to a document by using the dedicated _attachments attribute while PUTting the document into CouchDB.
PUT /documentstore/4f754d4b-540d-4b77-8507-6b7243ef8325 HTTP/1.1

{
  "Author": "Stephen Hawking",
  "Title": "The Universe in a Nutshell",
  "Tags": [{"Name": "Physics"}, {"Name": "Universe"}],
  "_attachments": {
    "The Universe in a Nutshell.pdf": {
      "content_type": "application/pdf",
      "data": "JVBERi0xLjUNCiW1tbW1DQox ... "
    }
  }
}
To create an attachment, we need to provide a file name, the MIME type and the base64-encoded binary data. It's even possible to have multiple attachments for a single document.
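Here is a minimal sketch, in Python, of how a client could assemble such a document body before PUTting it. The helper name build_inline_attachment_doc and the placeholder bytes are made up for illustration; only the _attachments, content_type and data keys come from the request above.

```python
import base64
import json


def build_inline_attachment_doc(fields, filename, content_type, raw_bytes):
    """Return a copy of `fields` carrying one more inline attachment.

    Hypothetical helper: it only builds the JSON structure and does
    not talk to CouchDB.
    """
    doc = dict(fields)
    # Keep attachments that are already present so several can be added.
    attachments = dict(doc.get("_attachments", {}))
    attachments[filename] = {
        "content_type": content_type,
        # Inline attachments carry their binary data as base64 text.
        "data": base64.b64encode(raw_bytes).decode("ascii"),
    }
    doc["_attachments"] = attachments
    return doc


doc = build_inline_attachment_doc(
    {"Author": "Stephen Hawking", "Title": "The Universe in a Nutshell"},
    "The Universe in a Nutshell.pdf",
    "application/pdf",
    b"%PDF-1.5\r\n",  # placeholder bytes standing in for the real file
)
body = json.dumps(doc)  # this string would be the body of the PUT request
```

Calling the helper again on the returned document with a different file name adds a second attachment next to the first one.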
Standalone Attachments
Standalone attachments are a fairly recent feature of CouchDB that was added in version 0.9. As the name implies, they let you add, update and remove attachments without involving the document itself.
PUT /documentstore/4f754d4b-540d-4b77-8507-6b7243ef8325/The%20Universe%20in%20a%20Nutshell.pdf?rev=1-1437623276 HTTP/1.0
Content-Length: 911240
Content-Type: application/pdf

JVBERi0xLjUNCiW1tbW1DQox ...
The major difference, and the advantage, of this approach is that the binary data is sent to CouchDB as-is, with no base64 encoding on the client and no decoding on the server. Since base64 inflates the payload by roughly a third, this can be a significant performance improvement when storing attachments in CouchDB. Notice that you still need to provide a MIME type using the Content-Type header.
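As a rough sketch of the same request from Python's standard library, the snippet below builds the PUT without sending it. The server address in COUCH_URL and the helper name build_attachment_put are assumptions for illustration; the path layout, the rev parameter and the Content-Type header follow the request shown above.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: a CouchDB instance listening on its default local port.
COUCH_URL = "http://localhost:5984"


def build_attachment_put(db, doc_id, filename, rev, content_type, raw_bytes):
    """Build (but do not send) a PUT request for a standalone attachment.

    Hypothetical helper. The raw bytes go into the body untouched,
    so no base64 step is involved.
    """
    # Percent-encode every path segment, e.g. spaces become %20.
    path = "/".join(quote(part, safe="") for part in (db, doc_id, filename))
    url = f"{COUCH_URL}/{path}?rev={quote(rev, safe='')}"
    return Request(
        url,
        data=raw_bytes,
        headers={"Content-Type": content_type},
        method="PUT",
    )


req = build_attachment_put(
    "documentstore",
    "4f754d4b-540d-4b77-8507-6b7243ef8325",
    "The Universe in a Nutshell.pdf",
    "1-1437623276",
    "application/pdf",
    b"%PDF-1.5\r\n",  # placeholder bytes standing in for the real file
)
# To actually send it: urllib.request.urlopen(req)
```

The Content-Length header is left out on purpose, because urllib fills it in from the body when the request is sent.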
GETting Documents with Attachments
When retrieving a document with either an inline or standalone attachment, the actual binary data is not returned. Instead, CouchDB returns a stub indicating that the requested document has an attachment associated with it.
GET /documentstore/4f754d4b-540d-4b77-8507-6b7243ef8325 HTTP/1.1

{
  "_id": "4f754d4b-540d-4b77-8507-6b7243ef8325",
  "_rev": "1-1969924333",
  "Author": "Stephen Hawking",
  "Title": "The Universe in a Nutshell",
  "Tags": [{"Name": "Physics"}, {"Name": "Universe"}],
  "_attachments": {
    "The Universe in a Nutshell.pdf": {
      "stub": true,
      "content_type": "application/pdf",
      "length": 911240
    }
  }
}
In order to retrieve the binary data of the attachment itself (yes please!), we have to issue a second GET but now using both the document identifier and the file name of the attachment.
GET /documentstore/4f754d4b-540d-4b77-8507-6b7243ef8325/The%20Universe%20in%20a%20Nutshell.pdf HTTP/1.1
CouchDB responds to this request by returning the binary data of the attachment. If the attachment was added inline, its base64 data has already been decoded, so the response contains the raw bytes either way.
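To tie the two GETs together, here is a small Python sketch that reads the stubs from a fetched document and works out the path for the second request. The helper name attachment_paths is hypothetical, and the sample response is a shortened copy of the one shown earlier.

```python
import json
from urllib.parse import quote


def attachment_paths(db, doc):
    """Map each stubbed attachment name to the path of its own GET request.

    Hypothetical helper: it only inspects the parsed document and
    builds paths, it does not fetch anything.
    """
    paths = {}
    for name, info in doc.get("_attachments", {}).items():
        if info.get("stub"):
            # Path is /<database>/<document id>/<file name>, percent-encoded.
            parts = (db, doc["_id"], name)
            paths[name] = "/" + "/".join(quote(p, safe="") for p in parts)
    return paths


# Shortened copy of the document returned by the first GET.
response_body = """{
  "_id": "4f754d4b-540d-4b77-8507-6b7243ef8325",
  "_rev": "1-1969924333",
  "Title": "The Universe in a Nutshell",
  "_attachments": {
    "The Universe in a Nutshell.pdf": {
      "stub": true, "content_type": "application/pdf", "length": 911240
    }
  }
}"""

doc = json.loads(response_body)
paths = attachment_paths("documentstore", doc)
```

Each value in paths can then be requested with a plain GET, and the content_type from the stub tells the client how to treat the bytes that come back.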
In my next post, I will talk about MapReduce functions, with a simple example of a Map function in particular.
Till next time.