To start off, here are the links to my previous posts about CouchDB:
- Relaxing on the Couch(DB)
- Installing the Couch(DB)
- PUTting the Couch(DB) in Your Living Room
- GETting Documents From CouchDB
- DELETE Documents From CouchDB
- Adding Attachments to a Document in CouchDB
So far, we’ve mostly talked about managing documents in CouchDB. Now I want to discuss another important concept of CouchDB, namely views.
Views are the primary means for querying and searching documents that are stored by CouchDB. As mentioned in one of my previous posts, CouchDB doesn’t support SQL for querying documents. In that sense, views are to CouchDB what SQL is to an RDBMS. They are defined as MapReduce functions using JavaScript. If you’ve never heard of MapReduce, take a look at the Introduction to MapReduce for .NET Developers for a quick dip into the concepts behind it.
CouchDB supports two kinds of views: permanent views and temporary views. A permanent view is stored in a special kind of document that lives alongside the regular documents. These special documents are called ‘design documents’. A permanent view can be executed by performing a GET operation with the name of the view. A temporary view, as its name implies, is not stored by CouchDB. Instead, the code for the view is posted to CouchDB where it is executed once.
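To make the idea of a design document a bit more concrete, here is a rough sketch that builds one as a plain JavaScript object. The `_design/` prefix on `_id` and the `views` member that holds the map function as a string follow CouchDB’s design document layout. The name `documents_by_tag` is the one used in the requests further down in this post, and `buildDesignDocument` is a hypothetical helper of my own, not part of any CouchDB API.

```javascript
// Hypothetical helper: wraps a map function in a design document body.
// CouchDB stores view code as strings, so the function is serialized.
function buildDesignDocument(name, viewName, mapFunction) {
  const views = {};
  views[viewName] = { map: mapFunction.toString() };
  return {
    _id: "_design/" + name, // design documents live under the _design/ prefix
    language: "javascript",
    views: views
  };
}

// The default map function that Futon offers for a new view.
// `emit` is supplied by CouchDB when the view runs; it is never called here.
const defaultMap = function (doc) { emit(null, doc); };

const designDoc = buildDesignDocument("documents_by_tag", "documents_by_tag", defaultMap);
console.log(JSON.stringify(designDoc, null, 2));
```

Storing a body like this with a PUT to `/documentstore/_design/documents_by_tag` should give you the same kind of permanent view that Futon creates for you.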
For me, permanent views are the most interesting so we will use this option for the example in this post. Recall that in the examples used in previous posts we’ve had documents like the following:
{
  "_id": "96f49e5a-6b5b-47ed-9234-9a98d600013e",
  "_rev": "2-1534297415",
  "Author": "Stephen Hawking",
  "Title": "The Universe in a Nutshell",
  "Tags": [{"Name": "Physics"}, {"Name": "Universe"}]
}
Let’s create a view that we can use for retrieving the titles of all documents for a particular tag. For creating a permanent view, there are again two options: using Futon (the web-based user interface of CouchDB) or the HTTP View API. To keep things simple, let’s use Futon for creating our view in CouchDB.
When creating a new view, CouchDB provides a map function with a default implementation and an optional reduce function. The purpose of a map function is to perform a number of computations using arbitrary JavaScript and to emit key/value pairs into the view. If the view also has a reduce function, then it’s used for aggregating the results. We’ll ignore the reduce function for now and focus our attention on the map function.
The basic anatomy of the map function as provided by Futon looks like this:
function(doc) { emit(null, doc); }
As already mentioned, the emit function takes care of inserting a key/value pair of your own choosing into the view. For our example, we’ll emit the name of a tag as the key and the title of a document as the value.
function(doc) {
  var tags = doc.Tags || [];  // tolerate documents without tags
  for (var i = 0; i < tags.length; i++) {
    emit(tags[i].Name, doc.Title);
  }
}
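If you want to see what this map function produces without a running CouchDB server, the sketch below simulates it locally. Everything here is an assumption made for illustration: `runMap` is a hypothetical stand-in for the view engine, `emit` is passed in as a parameter instead of being provided by CouchDB, the document ids are made up, and the sorting uses plain string comparison rather than CouchDB’s own collation rules.

```javascript
// Local stand-in for CouchDB's view engine: runs a map function over
// documents and collects the emitted rows. This is a simplified model,
// not CouchDB's actual implementation.
function runMap(mapFunction, docs) {
  const rows = [];
  for (const doc of docs) {
    // CouchDB provides `emit` to the map function; here it is passed explicitly.
    const emit = (key, value) => rows.push({ id: doc._id, key: key, value: value });
    mapFunction(doc, emit);
  }
  // Rows in a view are ordered by key; string comparison is enough for this sample.
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));
  return { total_rows: rows.length, offset: 0, rows: rows };
}

// Same logic as the map function above, with `emit` as a parameter
// so that it can run outside CouchDB.
function mapByTag(doc, emit) {
  for (const tag of doc.Tags || []) {
    emit(tag.Name, doc.Title);
  }
}

// Made-up sample documents in the shape used throughout this series.
const sampleDocs = [
  { _id: "doc-1", Title: "The Universe in a Nutshell",
    Tags: [{ Name: "Physics" }, { Name: "Universe" }] },
  { _id: "doc-2", Title: "The Theory of Everything",
    Tags: [{ Name: "Universe" }] }
];

const simulatedView = runMap(mapByTag, sampleDocs);
console.log(JSON.stringify(simulatedView, null, 2));
```

Each tag on each document becomes one row, which is why a document with two tags shows up twice in the view.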
Using Futon it’s possible to code up a map and reduce function and try them out on the documents that are stored in CouchDB. Querying the view as is with an HTTP GET operation yields the following results:
GET /documentstore/_design/documents_by_tag/_view/documents_by_tag HTTP/1.1

{"total_rows":5,"offset":0,"rows":[
  {"id":"0afc1fc2-7b39-461f-87cf-1ed1e21d2f34","key":"Universe","value":"The Universe in a Nutshell"},
  {"id":"7b287e5d-c467-46ad-a10e-66d3c0696743","key":"Universe","value":"The Theory of Everything"},
  {"id":"0afc1fc2-7b39-461f-87cf-1ed1e21d2f34","key":"Physics","value":"The Universe in a Nutshell"},
  {"id":"7b287e5d-c467-46ad-a10e-66d3c0696743","key":"Physics","value":"The Theory of Everything"},
  {"id":"0afc1fc2-7b39-461f-87cf-1ed1e21d2f34","key":"Space","value":"The Universe in a Nutshell"}
]}
We call the view by using its name in the URL. The result contains one row for every tag we encounter, with the tag name as key and the title of the document as value. CouchDB also provides the document identifier for each key/value pair. This way we can see that the five rows come from just two documents.
Now, to get the document titles for one particular tag only, we add the tag name as an extra query parameter called ‘key’. The value has to be valid JSON, so a string is wrapped in double quotes, which show up URL-encoded as %22:
GET /documentstore/_design/documents_by_tag/_view/documents_by_tag?key=%22Universe%22 HTTP/1.1
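Building that URL by hand gets error-prone once keys contain spaces or other special characters. As a small sketch, the hypothetical helper `viewPath` below (my own name, not a CouchDB function) serializes the key as JSON and then percent-encodes it using standard JavaScript functions.

```javascript
// Hypothetical helper: builds the request path for querying a view by key.
function viewPath(database, designDoc, viewName, key) {
  let path = "/" + encodeURIComponent(database) +
             "/_design/" + encodeURIComponent(designDoc) +
             "/_view/" + encodeURIComponent(viewName);
  if (key !== undefined) {
    // The key must be valid JSON, so a string keeps its double quotes
    // and is then percent-encoded for use in the URL.
    path += "?key=" + encodeURIComponent(JSON.stringify(key));
  }
  return path;
}

const universePath = viewPath("documentstore", "documents_by_tag", "documents_by_tag", "Universe");
console.log(universePath);
```

Leaving out the last argument gives the path for the unfiltered view.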
This yields the following result from our view:
{"total_rows":5,"offset":0,"rows":[
  {"id":"0afc1fc2-7b39-461f-87cf-1ed1e21d2f34","key":"Universe","value":"The Universe in a Nutshell"},
  {"id":"7b287e5d-c467-46ad-a10e-66d3c0696743","key":"Universe","value":"The Theory of Everything"}
]}
This way we can reuse our view for different tag names. Notice that the total_rows attribute still reports five: it counts all rows in the view, not just the rows that match the key.
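The relationship between the filtered rows and total_rows can be mimicked with a few lines of JavaScript. This is only a simplified model of a key lookup: `queryByKey` is a hypothetical function, the ids are shortened placeholders, and the offset is copied through unchanged to mirror the response shown above rather than to describe how CouchDB computes it.

```javascript
// Simplified model of a `key` query: keep only the rows whose key matches,
// while total_rows still counts every row in the view.
function queryByKey(viewResult, key) {
  return {
    total_rows: viewResult.total_rows,
    offset: viewResult.offset,
    rows: viewResult.rows.filter((row) => row.key === key)
  };
}

// The five rows from the unfiltered view, with placeholder ids.
const fullView = {
  total_rows: 5,
  offset: 0,
  rows: [
    { id: "doc-A", key: "Universe", value: "The Universe in a Nutshell" },
    { id: "doc-B", key: "Universe", value: "The Theory of Everything" },
    { id: "doc-A", key: "Physics", value: "The Universe in a Nutshell" },
    { id: "doc-B", key: "Physics", value: "The Theory of Everything" },
    { id: "doc-A", key: "Space", value: "The Universe in a Nutshell" }
  ]
};

const universeOnly = queryByKey(fullView, "Universe");
console.log(universeOnly.rows.length, "of", universeOnly.total_rows, "rows match");
```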
That doesn’t seem very hard now, does it? Please do mind that I’ve barely scratched the surface here. For more information, you can take a look at the CouchDB wiki or check out the forthcoming books on CouchDB.
If you’re interested in distributed, non-relational databases in general, you might want to check out the recordings from the NOSQL meetup, which also hosted a talk on CouchDB.
Till next time
I’m really enjoying your CouchDB series – keep it up!
Excellent posts. Will use the info for CERN lxbatch.