A MongoDB White Paper
MongoDB on Windows Azure
October 2012
Table of Contents
INTRODUCTION
ABOUT MONGODB
    MongoDB Architecture
ABOUT WINDOWS AZURE SERVICES FOR MONGODB
UNDERSTANDING THE DEPLOYMENT OPTIONS
    Azure Virtual Machines
        Basic Setup
        Pros and Cons of Azure Virtual Machines
    Azure Cloud Services
        Basic Setup
        Pros and Cons of Azure Cloud Services
SUMMARY
ABOUT MONGODB
RESOURCES
Introduction
MongoDB on Windows Azure brings the power of the leading NoSQL database to Microsoft’s flexible, open, and scalable cloud.
MongoDB is an open source, document-oriented database designed with scalability and developer agility in mind. Windows Azure is the cloud services operating system that provides the development, service hosting, and service management environment for the Azure Services Platform. Together, MongoDB and Windows Azure give customers the tools to build limitlessly scalable applications in the cloud.
This paper begins with an overview of MongoDB. Next, we describe the two primary deployment options available on Microsoft’s cloud platform: Windows Azure Virtual Machines and Windows Azure Cloud Services. Finally, to help those evaluating MongoDB on Windows Azure, we outline the pros and cons of each option.
{
  "_id": ObjectId("504e4dd43796b3da50183991"),
  "text": "Study Implicates Immune System in Parkinson’s Disease Pathogenesis http://bit.ly/duhe4P",
  "source": "twitterfeed",
  "coordinates": null,
  "truncated": false,
  "entities": {
    "urls": [{ "indices": [67, 87], "url": "http://bit.ly/duhe4P", "expanded_url": null }],
    "hashtags": []
  },
  "retweeted": false,
  "place": null,
  "user": {
    "friends_count": 780,
    "created_at": "Fri Jan 08 17:40:11 +0000 2010",
    "description": "Latest medical news, articles, and features from Medscape Pathology.",
    "time_zone": "Eastern Time (US & Canada)",
    "url": "http://www.medscape.com/pathology",
    "screen_name": "MedscapePath",
    "utc_offset": -18000
  },
  "favorited": false,
  "in_reply_to_user_id": null,
  "id": NumberLong("22819397000")
}
FIGURE 1 // Sample JSON Document.
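To give a feel for what a JSON-based query language means in practice, the following sketch matches a query document against a trimmed-down copy of the Figure 1 document. It is an illustration only, written in plain Python: the helper names get_path and matches are hypothetical, only equality on (possibly nested) fields is handled, and none of this is MongoDB's own implementation or driver API.

```python
from typing import Any, Dict


def get_path(document: Dict[str, Any], dotted_path: str) -> Any:
    """Follow a dotted path such as "user.screen_name" into nested documents."""
    current: Any = document
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # simplification: treat a missing field like null
        current = current[key]
    return current


def matches(document: Dict[str, Any], query: Dict[str, Any]) -> bool:
    """Return True if every field named in the query equals the document's value."""
    return all(get_path(document, path) == expected
               for path, expected in query.items())


# A trimmed-down version of the Figure 1 document, as a Python dict.
tweet = {
    "truncated": False,
    "retweeted": False,
    "place": None,
    "user": {
        "screen_name": "MedscapePath",
        "friends_count": 780,
        "time_zone": "Eastern Time (US & Canada)",
    },
}

# Query documents are themselves JSON-like: field names mapped to wanted values.
by_author = {"user.screen_name": "MedscapePath", "retweeted": False}
by_other_author = {"user.screen_name": "someone_else"}  # hypothetical value

print(matches(tweet, by_author))        # True
print(matches(tweet, by_other_author))  # False
```

The point of the sketch is that the query has the same shape as the data it describes, so no separate query syntax needs to be learned or assembled from strings.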
About MongoDB
MongoDB is an open source, document-oriented database. It bridges the gap between key-value stores, which are fast and scalable, and relational databases, which have rich functionality. Instead of storing data in tables and rows as a relational database does, MongoDB stores documents in BSON, a binary form of JSON. An example of a document is shown in Figure 1.
The document is the fundamental unit within MongoDB (like a row in an RDBMS). One can add fields (like columns in an RDBMS), as well as nested fields and embedded documents. Rather than imposing a flat, rigid schema across an entire table, MongoDB leaves the schema implicit in the fields that the documents use. Developers can therefore vary schemas across documents and adapt them as their applications evolve.
Unlike relational databases, MongoDB does not use SQL syntax. Instead, it has a query language based on JSON. It also has drivers for most modern languages, including C#, Java, Ruby, and Python. MongoDB’s flexible data model and support for modern programming languages simplify development and administration significantly.
MONGODB ARCHITECTURE
MongoDB’s core capabilities deliver reliability, high availability, high performance, and scalability.
Replication through replica sets provides high availability and data safety. A replica set comprises one primary node and a user-determined number of secondary nodes. Figure 2 shows an example replica set with one primary and two secondaries (a common deployment model). By default, the primary node takes all reads and writes from the application; the secondaries replicate asynchronously in the background. If the primary node goes down for any reason, one of the secondaries is automatically promoted to primary and begins to take all reads and writes. Replica sets help protect applications from hardware and data center-related downtime. Moreover, they make it easy for DBAs to conduct operational tasks, including software upgrades and hardware changes.
FIGURE 2 // Replica Sets with MongoDB. [Diagram: one primary replicating asynchronously to two secondaries.]
Sharding enables users to scale horizontally as their data volumes or the demands on their data stores grow. A shard is a subset of the database, similar to a partition of the data. For example, Shard A might contain documents 1-30, Shard B documents 31-60, and so on (Figure 3). One can choose any key on which to shard the collection (e.g., user name), and MongoDB will automatically shard the data store based on this key. A database can keep scaling out as new nodes are added to the cluster. When a new node is added, MongoDB recognizes it and redistributes the data across the cluster. Because sharding distributes the data, and therefore the load (i.e., traffic), across nodes, it enables horizontal scalability as well as high performance.
FIGURE 3 // Sharding with MongoDB. [Diagram: Shard 1, Shard 2, Shard 3 through Shard N; horizontally scalable.]
An overview of the MongoDB architecture is shown in Figure 4. In a multi-shard environment, the application communicates with mongos, an intermediary router that directs reads and writes to the appropriate shard. Each shard is a replica set, providing scalability, availability, and performance to developers.
MongoDB was built for the cloud. Cloud services like Windows Azure are therefore a natural fit for MongoDB. By coupling MongoDB’s easy-to-scale architecture with Azure’s elastic cloud capacity, users can quickly and easily build, scale, and manage their applications.
About Windows Azure Services for MongoDB
Windows Azure is Microsoft’s suite of cloud services, providing developers on-demand compute and storage to create, host, and manage scalable and available web applications through Microsoft data centers. When deploying MongoDB to Windows Azure, users can choose from two deployment options:
• Windows Azure Virtual Machines. Windows Azure Virtual Machines (VMs) is Microsoft’s Infrastructure-as-a-Service (IaaS) offering. Similar to Amazon Web Services EC2, Azure VMs give users access to elastic, on-demand virtual servers. Users can install Windows or Linux on a VM and configure it based on their own preferences or their apps’ specific needs. Users manage the VMs themselves, including scaling, installing security patches, and ongoing performance monitoring and management. Azure VMs give users a significant degree of control over their environments, but by the same token require users to take on the VM management. Note: This service is currently in preview (beta).
• Windows Azure Cloud Services. Windows Azure Cloud Services (Worker Roles and Web Roles) is Microsoft’s Platform-as-a-Service (PaaS) offering. Similar to Heroku, Worker Roles provide users with prebuilt, preconfigured instances of compute power. In contrast with Azure VMs, users do not have to configure or manage Azure Worker Roles. Windows Azure handles the deployment details, from provisioning and load balancing to health monitoring for continuous availability. This can be helpful to users who prefer not to manage their applications at the infrastructure level, though it restricts the level of control users have over their environments.
FIGURE 4 // MongoDB Architecture. [Diagram: Shard 1, Shard 2, Shard 3 through Shard N, each a replica set with one primary and two secondaries.]
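The architecture in Figure 4 can be made concrete with a small simulation. The sketch below is a toy model in plain Python, not MongoDB code: the Router class stands in for mongos and routes by fixed key ranges (1-30, 31-60, and everything above), and each shard is a ReplicaSet whose failover simply promotes the first secondary. The class names, node names, range boundaries, and the promotion rule are all assumptions made for illustration; a real replica set chooses its new primary through an election.

```python
from bisect import bisect_left
from dataclasses import dataclass, field
from typing import List


@dataclass
class ReplicaSet:
    """One shard: a primary plus secondaries that hold copies of its data."""
    name: str
    primary: str
    secondaries: List[str] = field(default_factory=list)

    def fail_primary(self) -> str:
        """Simulate losing the primary; promote the first secondary in the list."""
        if not self.secondaries:
            raise RuntimeError(f"{self.name}: no secondary left to promote")
        self.primary = self.secondaries.pop(0)
        return self.primary


class Router:
    """Toy stand-in for mongos: picks the shard whose key range covers a key."""

    def __init__(self, upper_bounds: List[int], shards: List[ReplicaSet]):
        # upper_bounds[i] is the largest shard-key value kept on shards[i];
        # the final shard takes every key above the last bound.
        if len(shards) != len(upper_bounds) + 1:
            raise ValueError("need exactly one more shard than bounds")
        if upper_bounds != sorted(upper_bounds):
            raise ValueError("bounds must be sorted")
        self.upper_bounds = upper_bounds
        self.shards = shards

    def shard_for(self, key: int) -> ReplicaSet:
        return self.shards[bisect_left(self.upper_bounds, key)]

    def write_target(self, key: int) -> str:
        """By default, reads and writes go to the primary of the owning shard."""
        return self.shard_for(key).primary


# Hypothetical three-shard cluster; every name below is a placeholder.
shards = [
    ReplicaSet("shardA", "a-node-0", ["a-node-1", "a-node-2"]),
    ReplicaSet("shardB", "b-node-0", ["b-node-1", "b-node-2"]),
    ReplicaSet("shardC", "c-node-0", ["c-node-1", "c-node-2"]),
]
router = Router([30, 60], shards)

print(router.shard_for(12).name)  # shardA
print(router.write_target(45))    # b-node-0
shards[1].fail_primary()          # the primary of shardB goes down
print(router.write_target(45))    # b-node-1
```

The application only ever talks to the router, so neither the failover inside shardB nor the number of shards is visible to it; that separation is what lets the cluster grow or recover without application changes.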
Understanding the Deployment Options
Given that MongoDB can be deployed on either Windows Azure Virtual Machines (IaaS) or Windows Azure Cloud Services (PaaS), it is important for users to consider the different capabilities and implementation details of each service to determine which deployment model makes the most sense for their applications.
AZURE VIRTUAL MACHINES
Basic Setup
After being granted access to the preview functionality for Azure Virtual Machines, users can launch an instance and install and configure MongoDB on it manually. Alternatively, users can use the recently released Windows Azure installer for MongoDB to set up a MongoDB replica set quickly on Windows Azure VMs. The installer is built on top of Windows PowerShell and the Windows Azure command line tool and contains a number of deployment scripts. It is designed to help users get single-node or multi-node MongoDB configurations up and running quickly.
Installing and configuring a MongoDB replica set on Azure VMs takes only two steps. First, download the publish settings file. Next, run the installer from the command prompt. Note: the installer is designed to run on a user’s local machine (i.e., not directly on an Azure VM) and then to deploy to Windows Azure VMs.
With Azure Virtual Machines, users can create their own VMs or create a VM instance from one of several pre-installed operating system configurations. Both Windows and Linux are supported. To deploy MongoDB on Linux, visit the MongoDB wiki (wiki.mongodb.org) for step-by-step instructions.
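Once a replica set is running on Azure VMs, an application typically reaches it through a standard MongoDB connection string that lists the members and names the set. The sketch below builds such a string in plain Python. The helper replica_set_uri, the host names, the port, and the set name rs0 are placeholders chosen for illustration; this paper does not describe what the installer actually names its hosts or its replica set.

```python
from typing import List, Tuple
from urllib.parse import urlencode


def replica_set_uri(hosts: List[Tuple[str, int]], replica_set: str,
                    database: str = "") -> str:
    """Build a mongodb:// connection string listing every replica set member."""
    if not hosts:
        raise ValueError("at least one host is required")
    seed_list = ",".join(f"{host}:{port}" for host, port in hosts)
    options = urlencode({"replicaSet": replica_set})
    return f"mongodb://{seed_list}/{database}?{options}"


# Hypothetical members of a three-node replica set.
members = [
    ("mongo-0.example.net", 27017),
    ("mongo-1.example.net", 27017),
    ("mongo-2.example.net", 27017),
]
uri = replica_set_uri(members, "rs0")
print(uri)
```

Listing all members, rather than only the current primary, lets a driver find the replica set even if the node that was primary at deployment time is no longer available.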
Pros and Cons of Azure Virtual Machines
The pros and cons of deploying MongoDB on Azure Virtual Machines are generally consistent with the considerations around using IaaS more broadly. Overall, Azure Virtual Machines allow users to fine-tune their deployments but by the same token require increased operational effort.
The advantages of using MongoDB on Azure Virtual Machines are as follows:
• Increased Control. Users have more control over their infrastructure configuration than with Azure Cloud Services. For instance, they can install and configure services on the OS, define policies, etc. This consideration may be important for enterprises that have regimented policies and processes for IT security and compliance.
• OS Choice. Users can use Windows or Linux.
Azure Virtual Machines may not always be the right fit for the following reasons:
• Increased Operational Effort. The increased control that Azure Virtual Machines provide comes with increased effort as well. Users must define and implement their own security measures, apply patches, and place instances for fault tolerance. This consideration may be important for developers who lack experience managing their own infrastructure or for companies that don’t have the operational bandwidth to devote to managing this component of the stack.
• Beta. The Azure Virtual Machines service is still in Preview (beta).
AZURE CLOUD SERVICES
Basic Setup
Users can also deploy MongoDB on Azure Cloud Services. To do so, download the MongoDB Azure Worker Role package, which is a preconfigured Worker Role with MongoDB. When deployed, each replica set member runs as a separate Worker Role instance; MongoDB data files are stored in Azure Cloud Drives. For detailed instructions, visit the MongoDB wiki (wiki.mongodb.org).
Pros and Cons of Azure Cloud Services
The pros and cons of running MongoDB on Azure Cloud Services are generally consistent with those of using PaaS in general, though there are some Azure-specific considerations. Overall, Windows Azure Cloud Services decreases the operational burden on users but affords them less control from an infrastructure configuration standpoint. The advantages of using Azure Cloud Services are as follows:
• Lower Operational Effort. Microsoft manages OS updates and security, decreasing the operational burden on the users.
• Built-in Fault Tolerance. When deploying multiple MongoDB worker role instances, Windows Azure automatically deploys the instances across multiple fault and update domains to provide better uptime.
• Secure by Default. Microsoft takes measures to ensure that worker and web roles are secure. Endpoints on instances can be enabled for instance-to-instance communication without making them public. Thus, one can configure MongoDB to be secure by enabling it only for other roles in the same deployment.
By the same token, there are some aspects of Azure Cloud Services that may be considered drawbacks:
• Windows Only. Worker Roles can only be deployed with Windows; Linux is not an option.
• Fixed OS Configuration. Users cannot configure the OS and must therefore develop applications that run on the pre-defined machine configurations available.
Table 1 summarizes the pros and cons of using Windows Azure Worker Roles and Windows Azure Virtual Machines.
TABLE 1 // Pros and Cons Summary - Windows Azure Virtual Machines and Windows Azure Cloud Services
        IaaS – Windows Azure Virtual Machines    PaaS – Windows Azure Cloud Services
Pros    Increased control                        Lower operational effort
        OS choice                                Built-in fault tolerance
                                                 Secure by default
Cons    Initial administrative effort            Windows only
        Increased operational effort             Fixed OS configuration
Summary
MongoDB was built for ease of use, scalability, availability, and performance, and it’s quickly becoming an attractive alternative to relational databases. Windows Azure provides a flexible cloud platform for hosting MongoDB, with two deployment models to choose from. Developers and enterprises looking at deploying MongoDB on Windows Azure should consider the pros and cons discussed here when evaluating which option is most appropriate for them. We hope that this paper helps customers better understand these solutions, how they work, and how to assess them.
About MongoDB
MongoDB (from humongous) is reinventing data management and powering big data as the leading NoSQL database. Designed for how we build and run applications today, it empowers organizations to be more agile and scalable. MongoDB enables new types of applications, better customer experience, faster time to market and lower costs. It has a thriving global community with over 4 million downloads, 100,000 online education registrations, 20,000 user group members and 20,000 MongoDB Days attendees. The company has more than 600 customers, including many of the world’s largest organizations.
Resources
For more information, please visit mongodb.com or contact us at [email protected].
Resource                       Website URL
MongoDB Enterprise Download    mongodb.com/download
Free Online Training           education.mongodb.com
Webinars and Events            mongodb.com/events
White Papers                   mongodb.com/white-papers
Case Studies                   mongodb.com/customers
Presentations                  mongodb.com/presentations
Documentation                  docs.mongodb.org
New York • Palo Alto • Washington, D.C. • London • Dublin • Barcelona • Sydney US 866.237.8815 • INTL +1 650 440 4474 •
[email protected] Copyright 2013 MongoDB, Inc. All Rights Reserved.