MongoDB and E-commerce
April 30, 2010
There are still a number of misconceptions regarding MongoDB’s
suitability for
The hesitant reaction there expressed is understandable, as most of the databases
falling under the NoSQL umbrella would fare poorly in an
Catalog Management
If you need some insight into the way in which catalog management is handled
with relational databases, have a quick look at the schemas of the popular
Magento
What this means is that the data for any given product is spread out across dozens of tables. This increases the complexity of the code required to persist and query individual products and makes a shell-based inqury all but impossible. Ask yourself if you’d rather write the SQL JOIN to pull together a product modeled like this:
Or issue a simple find from the MongoDB JavaScript shell to pull back a JSON object like this:
// Find a product object db.products.find({'_id': ObjectID("4bd87bd8277d094c458d2fa5")}); {_id: ObjectID("4bd87bd8277d094c458d2a43"), title: "A Love Supreme [Original Recording Reissued]" author: "John Coltrane", author_id: ObjectID("4bd87bd8277d094c458d2fa5"), details: { number_of_discs: 1, label: "Impulse Records", issue_date: "December 9, 1964", average_customer_review: 4.95, asin: "B0000A118M" }, pricing: { list: 1198, retail: 1099, savings: 99, pct_savings: 8 }, categories: [ ObjectID("4bd87bd8277d094c458d2a43"), ObjectID("4bd87bd8277d094c458d2b44"), ObjectID("4bd87bd8277d094c458d29a1") ] }
Of course, this is not a complete representation of a product, but it does demonstrate how many of the more trivial tables existing in a relational representation can be dispensed with in a document representation.
For object-oriented data, documents simply make more sense, both conceptually and performance-wise. A document-oriented representation of product data means fewer entities (a handful of collections vs. dozens of tables), better query performance (no server-side joins), and structures that fit the product precisely. There’s no longer any need to design some master schema that can account for every single conceviable product.
Catalog management is essentially content management, a domain MongoDB excels at. To read more about this, see the Drupal presentation on their move to MongoDB.
Shopping Carts and Orders
Allowing that a shopping cart is merely an order in a ‘cart’ state, the modeling of shopping carts and orders in MongoDB becomes quite straightforward:
{'_id': objectid('4b980a6dea2c3f4579da141e'),
'user_id': objectid('4b980a6dea2c3f4579a4f54'),
'state': 'cart',
'line_items': [
{'sku': 'jc-432',
'name': 'John Coltrane: A Love Supreme',
'retail_price': 1099
},
{'sku': 'ly-211',
'name': 'Larry Young: Unity',
'retail_price': 1199
},
],
'shipping_address': {
'street': '3333 Greene Ave.',
'city': 'Brooklyn',
'state': 'NY',
'zip': '11216'
},
'subtotal': 2199
}
Notice that we can represent the order line items as an array of products. As is usual with documents, this makes displaying the shopping cart more straighforward, since no joins are involved. But it also solves the problem of product versioning. It’s often necessary to take a snapshot of a product when a user purchases it. This can be accomplished in an RDBMS by linking to a particular version of a product. Here, however, we simply store a snapshot of the product in the order itself.
Querying Orders
Since MongoDB supports dynamic queries and secondary indexes, querying for orders is a snap. It’s possible, for example, to define an index on product sku, which allows for efficient queries on all orders of a given product:
db.orders.ensureIndex({'line_items.sku': 1});
db.orders.find({'line_items.sku' => 'jc-431'});
With MongoDB, we can query on arbitrary attributes, so any query on the orders collection is possible. And for common queries, we can define indexes for greater efficiency.
Aggregation
Of course, aggregation is also necessary. We’ll want to report on orders in different ways, and for that, map-reduce is available. As an example, here’s how to write a map-reduce command that aggregates order totals per zip code:
map = " function() { emit(this['shipping_address']['zip'], {total: this.total}) }" reduce = " function(key, values) { var sum = 0; values.forEach(function(doc) { sum += doc.total; } return {total: sum}; }" db.orders.mapReduce(map, reduce, {out: 'order_totals_by_zip'});
For more on MongoDB’s map-reduce, see Map-Reduce Basics.
Updating Orders
Incrementing Quantity
One way to go about adjusting quantity is to use the positional operator, which lets you apply atomic operations to individual objects within an array. Here’s how we change the number of Coltrane albums we’re ordering:
db.orders.update({'_id': order_id, 'line_items.sku':'jc-431'},
{'$set': {'line_items.$.quantity': 2}});
Adding and Removing Items
Likewise, atomic operators solve the problem of adding and removing products from the cart. For instance, we can use the $push atomic operator to add an item to our cart:
db.orders.update({'_id': order_id},
{'$push': {'line_items':
{'sku': 'md-12', 'price': 2500, 'title': 'Basketball'}}
'$inc': {'subtotal': 2500}});
When adjusting quantity and changing the cart items themselves, it’s necessary to update the order totals. Notice that we use the $inc operator to handle this.
Inventory
Not all
One way to model inventory is to store a separate document per physical item in our warehouse. So, for example, if our warehouse has twenty copies of the Coltrane album, that translates to twenty distinct documents in our inventory collection. Each document looks something like this:
{'_id': objectid('4b980a6dea2c3f4579da432a'),
'sku': 'jc-431',
'state': 'available',
'expires': null,
'order_id': null
}
When a user tries to add an item to the cart, a findAndModify command can be issued to atomically mark the item as in-cart, associate the item with a given order, and set an expiration time:
query = {'sku': 'jc-431', 'state': 'available'}; update = {'$set': {'state': 'cart', 'order_id': order_id, 'expires': Date.now() + 15 * 60}}; item = db.inventory.findAndModify(query: query, update: update);
If we get an item back from the findAndModify operation, we know we have a unique lock on that item, and we can store it in our cart. When the user goes to check out, the item’s state can be advance to ‘purchased,’ or whatever the case calls for.
Meanwhile, we can run a script in the background that frees cart inventory not purchased in the fifteen-minute window. The update is trivial:
db.inventory.update({'state': 'cart', 'expires': {'$lt': Date.now()}},
{'$set' {'state': 'available', 'expires': null, 'order_id': null}},
{multi: true});
Transactions, Consistency, and Durability
A lot of the arguments levied against NoSQL in
Concerning transactions, it is true that MongoDB doesn’t support the multi-object
kind; however, it does support atomic operations on individual documents. And
this, combined with the document-oriented modeling just described and a little
creativity, is adequate to many
If you’re concerned about consistency and durability, know that write operations in MongoDB can be made consistent across connections. Furthermore, MongoDB 1.5 support near-real-time replication, so that it’s now possible to assure that an operation has been replicated before returning. See the documentation on replication acknowldgment for details.
Conclusion
Certainly most NoSQL databases weren’t built with
But for the parts of an