Address: 15640 NE Fourth Plain Blvd, Ste 106 #826, Vancouver, WA 98682, USA
Database Options In The Cloud – Part 1
The rush to the cloud is often measured in pure business terms—and its growing popularity is widely recognized. For instance, a recent Gartner report (requiring the submission of business information to view) finds that databases are growing at a rate of 68% in the cloud, whereas there is little on-premises growth outside of price increases and what they call “forced upgrades.”
But this dramatic shift in computing and database access also shows up in the types of services offered and the evolution of professional jobs in computing. Managed databases of many sorts are now part of every major cloud vendor’s offerings. The use of these databases will remove many tasks performed by a database administrator (DBA) in traditional environments where an organization owns its own hardware, which we’ll call on-premises environments. Moving to the cloud will add new tasks, change some existing tasks, and provide a subtly different context for understanding many of the tasks.
Cloud vendors doggedly keep up with changes in the database space and vary their database offerings to meet the diverse needs of their clients. Relational databases (in both transactional and data warehouse form) appear along with nonrelational databases, such as key/value and document stores.
This report helps the DBA and related staff—such as data scientists, data architects, and application developers—choose among cloud offerings. It explains on a general level what responsibilities the DBA should expect to perform in the cloud environment. Finally, it offers guidelines for migration.
The report does not cover arguments for or against moving to the cloud, because there are other resources to support this decision and because the decision is tied in so tightly with the particular traits of your databases and how you use them. Furthermore, we don’t review or recommend particular cloud offerings, although we refer to the offerings of the currently dominant cloud vendors: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform.
This part 1 includes the following:
Cloud offerings often mirror the software available for traditional, on-premises deployment. This means that you can move data from your own equipment to a comparable database offered by a cloud vendor. For instance, you can run MySQL or Oracle in the cloud instead of in your own datacenter, or you can use a database native to the cloud such as Amazon Aurora.
You should also be able to keep a foot in each territory (on-premises and cloud), moving data into the cloud at a pace that’s comfortable for your organization. You can also maintain two deployments forever, one on-premises and the other in the cloud. The cloud is often a good place to try something totally new, such as a big data project that you’ve never created before.
Although we expect most readers to be familiar with the cloud, its benefits, and various reasons to use or avoid it, the next section summarizes traits of the cloud that underlie the ideas in this report. After that, we can focus on databases.
High-Level Effects of Moving to the Cloud
As you move data to the cloud, you will encounter differences from on-premises deployments that affect your decisions. Following are the key changes of interest to a DBA:
Many criteria for choosing a provider are general, not database specific: pricing, the ecosystem of third-party developers around each offering, stability, support for legal compliance regimes, added value in high-end tools for artificial intelligence, and more.
As you evaluate cloud offerings, think about future directions for your organization. For instance, will streaming and real-time data processing be a bigger part of your future? Which cloud offerings will support you a couple of years from now?
Self-Managed Versus Managed Databases
We now look specifically at databases. Your options in the cloud divide into two major categories:Self-managed databases
In this cloud offering, the vendor provides just the hardware, the hypervisor to run your VM, and the API or other tools to manage deployment. You need to create a VM running an operating system and your application. Obviously, you can run anything you want in the VM, including a database engine of your choice. You need to perform most of the administrative tasks yourself, such as installing updates and configuring all of the networking options.Managed databases
In this cloud offering, the vendor provides not only hardware but also the server software itself. Most vendors offer both traditional databases (such as Oracle and MySQL) and cloud native databases that are specific to that vendor.
Self-managed databases are much more like running your own database on-premises; managed databases change the way you work a lot more. Here are the key differences:
Managed databases therefore offer many advantages over on-premises deployments and self-managed databases, but you might have unique reasons to run your own database. The rest of this report covers managed databases because they offer a unique opportunity for a DBA to focus on activities that have more long-term impact on the business.
Cloud Native Databases
Managed databases can also be divided into two categories: traditional and cloud native. Traditional databases such as Oracle, SQL Server, MySQL, and PostgreSQL are frequently offered as managed databases by cloud providers. If you have built your organization around one of these databases on-premises, moving to the same database in the cloud simplifies migration. You are less likely to need to alter your applications and can use familiar tools to manage the databases. You can also mix offerings from different vendors or maintain your on-premises version of the database.
But the vendors have invested great effort into their own new offerings, sometimes called cloud native databases. Vendors provide evidence that the cloud native databases perform better, scale more easily, and are cheaper in the long run. Thus, testimonials from Autodesk and InfoScout suggest that AWS engineers have solved many of the scaling and efficiency problems of managing relational databases in the cloud with their own database, Amazon Aurora. Cloud native databases are also designed to scale enormously, a task that has historically been difficult for relational databases.
Types of Managed Databases
Most types of databases you find out in the field are also offered as managed databases. Additionally, cloud vendors have also developed their own cloud native databases, following common industry trends and offering performance benefits. The major types of supported databases include the following:Relational databases
As mentioned earlier, some cloud databases are managed versions of popular databases in widespread use. For instance, Azure offers Microsoft’s traditional SQL Server. Amazon supports MariaDB, MySQL, Oracle, PostgreSQL, and SQL Server through its Amazon Relational Database Service (RDS). Such offerings help you to move databases more easily from on-premises installations.
In addition, vendors have created databases of their own. For instance, Azure provides Azure Cosmos DB, Google Cloud offers Cloud Spanner, and AWS offers Amazon Aurora.Data warehouses
Although these are typically relational databases, they differ from the transaction-oriented databases internally and in their offerings. For instance, transaction-oriented databases usually store all the columns of a single row together so that you can quickly retrieve multiple columns about a customer or product. In contrast, data warehouses in the cloud tend to be columnar, meaning that they store data by column instead of by row. This greatly speeds up common warehouse queries like, “give me the ages of all customers who live in California.” The tools offered by cloud vendors with data warehouses focus on fast ingestion and extraction, facilitating their use in big data applications. Amazon Redshift, Google’s BigQuery, and Azure’s SQL Data Warehouse are examples of these offerings.Nonrelational databases
This term is commonly used to cover a variety of different data stores that, unlike traditional relational databases, are built for special-application use cases. Cloud vendors offer a variety of these for different purposes:
Other options, including in-memory caching databases, might also be available. Cloud vendors offer services for less structured storage (i.e., BLOBs or files): examples include Amazon Simple Storage Service (Amazon S3), Google Cloud Storage, and Azure Blob Storage. The services offer even cheaper options for archival storage, along with tools to move data in and out of storage easily.
This post, necessarily, has stated database features in general terms, with lots of hedging words such as “some” or “might.” This reluctance to be pinned down stems from the subtle differences that can be seen after you examine vendor offerings carefully. Some offerings have evolved further than others. Some databases offer easier administration or migration than others. You must take all of these factors into account when you choose a cloud offering
When a Managed Database Might Not Be a Good Fit
A managed database service isn’t appropriate in a couple scenarios.Your database size and IOPS are greater than the database limits.
Cloud native databases are designed to scale to meet extreme demands. But managed databases based on more traditional database engines often come with some upper limits on their size and the input/output operations per second (IOPS) they can support. Many organizations can fit within these limits. But if your database is truly enormous, and you have greater IOPS needs than what the provider can support, you might find your chosen managed database services to be infeasible. You should run a cloud native database or run your own instances in the cloud by renting the VMs and storage that you need.You need full control over your databases.
If it’s important for you to have maximum control over your databases, you must manage your own databases in the cloud. If you need a specific database version that’s not supported by the cloud vendor, or you need to use database features or options that aren’t supported by the provider, you’re back on your own.
Role of the DBA in a Managed Database
As mentioned at the start of this report, moving to the cloud will change your job, sometimes in subtle ways. Part 2 covers this in detail, but we’ll just say here that after you relinquish hardware maintenance, system configuration, and other things a managed database takes care of, there is plenty of work left for you. Here are some tasks that the DBA often performs on a managed database:
Most exciting, you’ll have more time to look at the business context for the database and think of ways to improve the value of the data you manage for your organization. We go into more detail about these topics in Part 2.
Recent Comments