In this article, I will highlight key announcements and sessions that my Dev9 colleagues and I attended. The sessions were very well done and brought a lot of depth for people implementing on the GCP platform.
On Tuesday, February 28, 2017, many people found that their smartphone applications were no longer working properly, many websites were down, and the Internet in general just seemed broken. This is what happens when AWS, the largest Cloud provider, experiences a “service disruption.”
Dev9's Director of Cloud Services, Brian Guy, wrote an in-depth analysis of the event for Geekwire. In the article, Brian explains what happened and offers possible solutions to help organizations minimize risks.
How do you enable applications to best use the Cloud? It is true that nearly any application can be moved to the Cloud without much change. This is the premise behind the concept of the “lift & shift.” Moving to the cloud can be the same as moving to any virtualized environment. If the Cloud is just another virtualized environment, there is not much interesting about moving to it.
But...of course, the Cloud is much more than simply another virtual environment.
Dev9, a custom software development firm focused on Cloud services, announced today that it has named Gabe Hicks as Chief Technology Officer (CTO). Gabe has been promoted from his previous position at Dev9 as Vice President of Software Delivery. He has been appointed to the new role to lead Dev9’s services to help enterprises with their Cloud strategy, development, migration and implementation.
The software delivery industry has developed into a highly pragmatic practice forged from relentless questioning and refactoring of our collective delivery processes. Over the last decade, the industry has moved from software builds on specific, blessed individual developer machines to an automated and sophisticated build pipeline that allows developers to increase feedback at all stages of the software development lifecycle. As the build pipeline has evolved, so have our methods for delivery. We have progressed significantly from bare-metal, single-installation monolithic applications and introduced an abstraction of the physical hardware. This is accomplished by utilizing virtualization and adding a further layer of abstraction in containers, decoupling the lightweight applications from the server hosting the application.
This article will show how containerization has provided developers with the last abstraction required to decouple a development team from the traditional Software Development Lifecycle (SDLC) established over a decade ago.
The Software Development Renaissance - Uncovering an Efficient Path
The software development lifecycle (SDLC) has been evolving for over five decades, but was widely adopted and refined by a wide audience beginning in the early 1980’s. We can call this the SDLC renaissance. Let’s start our journey there.
In the 80s, software development had moved off of mainframes and onto individual PCs, requiring individuals to share source code between machines both locally and remotely. Concurrent Versions System, or CVS, was introduced in 1986 and was widely adopted by the late 80s and early 90s. Many problems surfaced with the first implementation of remote source control. Chief among them was the difficulty in adhering to a process for building software. At the time, there were several SDLC models, some of which had existed since the 1940’s. These models were unable to meet the growing needs of large groups of distributed contributors. On top of that, many of the contributors began as hobbyists and did not have the same fundamental knowledge as the previous, highly-educated generation. This resulted in shortcuts being taken that often led to inefficient and damaging practices such as binary code being sent by FTP, email, network file system folders or physical media via “sneaker net.” To make matters worse, most builds were conducted on an individual developer’s machine, adding yet another variable to the build process. All these things together greatly reduced the repeatability of builds should the team need to re-construct or investigate a previous version of the application.
Once a build produced an artifact, the software then faced the challenges of installation and deployment. Services typically required long downtimes combined with very complicated, highly-manual procedures to deploy new versions of the codebase. As a result, the time-to-market increased and companies made use of this “extra” time to pack more features (and thus more risk) into each new version. Additionally, companies increased their dependence on their IT teams, many of which had little or no software development skills. This created a knowledge gap and frequent, contentious meetings between developers and IT.
The deployed solutions relied heavily on the embedded dependencies and libraries of the host, tying each application to a specific operating system. Operating systems provided a few technologies to aid the IT teams with deployment, but the automation tools were vastly inadequate by today's standards.
In order to track the progress of the SDLC, we will score each major milestone on five main variables. A one (1) indicates that the capability is very low or almost absent, while a five (5) indicates high capability.
Time to Market
This is the measure of how fast a single change can be introduced and deployed to a production environment.
Repeatability
The ability for a build to be repeated and produce the same binary. Repeatability also pertains to the ability to deploy the same binary artifact without any modification.
Automation Capability
The measure of how much automation is possible and how useful that automation would be. Additionally, automation capability is a measure of how manual a process is.
OS Independence
The measure of coupling between the end application and its environment and/or Operating System (OS). Greater OS independence allows applications to be run in a wider range of server and delivery locations.
IT Dependency
How tightly the delivery of software is coupled to the Information Technology (IT) team(s).
Virtualization: The First Mile
Provisioning new environments would often take months and require the developers to have intimate knowledge of hypothetical server capabilities. Furthermore, IT personnel were required to understand the basics of running the application to ensure the correct servers were ordered to achieve the desired scale, business objectives and cost efficiency.
The first major change to the deployment process happened when virtualization became widely adopted. Two key features provided by virtualization improved the delivery pipeline. The first was the abstraction between physical hardware and the servers that were created virtually on top of the hypervisor. This abstraction allowed teams to create servers with the operating system most appropriate for their work. This helped decouple the software and the operating system, reducing the OS dependency.
The second major benefit was the decreased time-to-market. Development teams could request provisioned servers much more rapidly. Decreasing the time required to provision new environments allowed developers to try new processes of development without tying up precious resources.
While virtualization provided more OS independence and faster time-to-market, significant human intervention was still required for provisioning infrastructure and deploying new changes. Additionally, virtualization required a specialized skill set to configure, manage, and maintain the virtualization software, resulting in an even wider knowledge gap between development and IT. This was only made worse by the high cost of training personnel to operate the virtualization tools. Though it added efficiency, virtualization did not bring any significant improvements to software delivery processes.
Building the Development Highway with Continuous Delivery
With the increased efficiency provided by virtualization, additional servers could be requisitioned to help build software. The most important tool developed to accomplish this came from the Continuous Delivery (CD) movement, largely born out of the Agile SDLC process.
Continuous Delivery utilizes servers to compile, test and package the project using machines specifically designed to build software. This technique provides many benefits, but the two most important to software delivery were the introduction of standardized builds and immutable binary artifacts. Standardized builds removed the developer's "blessed" machine from the build cycle and allowed all builds to be consistently done on a machine that was not used for daily development. This practice increased the security and repeatability of all builds while providing tangential benefits like transparency, fast feedback through automated testing suites and accountability for broken builds.
The second major benefit gained when implementing CD is the production of versioned, immutable binary artifacts. Builds were no longer built on a production server or a reconstituted environment through crazy rsync or file copy scripts. Binaries were placed on a separate server that managed dependencies, allowing the project to pull versioned artifacts in as dependencies as necessary.
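One way to picture what "immutable" means in practice: the bytes deployed to any environment can be checked against a digest recorded when the artifact was published. The following is a minimal sketch of that idea, not a description of any particular artifact repository; the class and method names are hypothetical.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper: checks that an artifact's bytes match a recorded digest. */
final class ArtifactChecksum {

    private ArtifactChecksum() {}

    /** Returns the lowercase hex SHA-256 digest of the given bytes. */
    static String sha256Hex(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is a required algorithm on the Java platform.
            throw new IllegalStateException(e);
        }
    }

    /** True when the artifact is byte-for-byte the one whose digest was recorded. */
    static boolean matches(byte[] artifact, String expectedSha256Hex) {
        return sha256Hex(artifact).equalsIgnoreCase(expectedSha256Hex);
    }
}
```

A deployment script could call `ArtifactChecksum.matches(...)` on the downloaded file and refuse to continue if the digest recorded at build time no longer matches.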
The introduction of CD practices was revolutionary for the SDLC, but there were still several key areas that required attention. While the binary repository offered on-demand retrieval of binary artifacts, the deployment process was still highly dependent on IT. In a Java-based web application, the CD pipeline would generate and upload a versioned WAR file to the artifact repository, but the deployment would require that the artifacts be pulled down manually or with a single-purpose script requiring manual execution. Additionally, deployments were still being pushed to largely static environments, despite the use of virtualization.
A Break in the Clouds
Continuous Delivery provided a model to automate builds, increase quality and improve repeatability, but it was initially used only for software development. The next major evolution in the maturity of the delivery pipeline came with Configuration Management (CM) tools. These tools made it possible to write software that automates the creation and management of infrastructure. Utilizing enhancements in virtualization tools, CM tools allowed developers and IT teams to start working together to engineer repeatable, automated infrastructure. The concept of using code to create and manage infrastructure is known as Infrastructure as Code or IaC. It was initially designed to manage server resources and their associated configurations, but was often extended to include an automated deployment process. These tools combined with deployment automation created an abstraction between the deployment process and the servers they ran on. By simply changing a variable and properly running an engineered script, an IT team member could deploy a particular version of an artifact into a variable-controlled environment. These automation and process changes helped decrease time-to-market while providing better repeatability and reliability. The dependence on IT staff was reduced due to API capabilities introduced in the second generation of CM tools. Now developers could utilize automation tools to replicate clustered environments on their local machines or rapidly provision entirely new environments.
While the new tools and more sophisticated automation dramatically reduced manual intervention and decoupled build pipelines, the artifacts were still being deployed to static environments simply because of the cost limitations of expanding data centers. Those static environments still contained dependencies that were not always captured by CM tools and thus they remained “snowflake” servers. IT and development teams were closer together than ever, but their responsibilities were still separated.
The Possibility of the Cloud
Public cloud technology began to rise in popularity as companies started to realize that their data centers were large cost centers that were not flexible enough to allow developers the needed capacity when working on large or complex solutions.
Public cloud offerings have been around for over a decade and have gained wider adoption in the last three to five years. Public cloud services combine automation, configuration management, virtualization, provisioning, reduced time-to-market and all of the benefits that those features bring. Additionally, utilizing a public cloud closes the gap between the delivery team and the IT team it depends on by bringing the two teams together, allowing for better organizational alignment.
Using a public cloud as the delivery end of a software build pipeline contains one last coupling. With the exception of the Netflix Spinnaker tool (which replaced Asgard), applications are still being deployed using a process of combining newly created environments with versioned binary artifacts. This process is not easily portable and requires a great deal of effort should the underlying machines change.
Creating a new infrastructure every time a deployment is done increases the time-to-market of software. The development process needs an abstraction between the machine and the artifacts. Containers serve as a perfect abstraction allowing a build pipeline to construct a mini-server as the binary artifact, streamlining the deployment process. Container technology providers have worked on establishing a standard called Open Container Initiative (OCI). This community initiative allows the versioned container artifacts to be highly portable across multiple cloud or container services. Out of the container-based development pattern, a new architectural pattern emerged where applications inside containers carry minimum configuration and reach out to external configuration servers during bootup. Configuration servers allow the container artifacts to act more like business compute machines rather than static servers providing higher degrees of portability and flexibility. With a simple change of an environment variable, a container can run as a production or lower level environment service. External configuration allows development pipelines to continue to test the same immutable artifact through a progression of environments which follows one of the core principles defined in Continuous Delivery.
Supporting the container delivery pipeline architecture continues to push the definition of application and software delivery. Container-based solutions that utilize external configuration allow each container-based artifact to effectively become a business logic compute engine. Container artifacts reach out and pull in configuration upon startup, allowing the application to boot up in a logical environment. This further abstracts the concept of an environment and, at the same time, adds greater support for the core continuous delivery concepts of immutability.
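The external-configuration pattern described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the variable names `APP_ENVIRONMENT` and `CONFIG_SERVER_URL`, the default values, and the URL layout are all hypothetical and do not come from any specific configuration server.

```java
import java.util.Locale;
import java.util.Map;

/** Hypothetical sketch: decides where a container fetches its configuration at startup. */
final class ExternalConfig {

    // Assumed variable names; a real deployment would define its own.
    static final String ENV_VAR = "APP_ENVIRONMENT";
    static final String SERVER_VAR = "CONFIG_SERVER_URL";

    private ExternalConfig() {}

    /** Logical environment the container should boot into; falls back to "dev". */
    static String environment(Map<String, String> env) {
        String value = env.get(ENV_VAR);
        if (value == null || value.trim().isEmpty()) {
            return "dev";
        }
        return value.trim().toLowerCase(Locale.ROOT);
    }

    /** URL the application would call on startup to pull its configuration. */
    static String configUrl(Map<String, String> env, String appName) {
        String server = env.getOrDefault(SERVER_VAR, "http://localhost:8888");
        return server + "/" + appName + "/" + environment(env);
    }
}
```

At startup the application would call `ExternalConfig.configUrl(System.getenv(), "my-service")`. The same container image then boots as a production or lower-level service depending only on the variables it is started with, so the artifact itself never changes between environments.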
Is This the End of the Journey?
“The only constant is change” (Heraclitus) is a great way to describe the industry’s pragmatic approach to generating higher quality, more transparent and predictable software. After the explosion of new developers in the 80’s and 90’s, the industry has improved software delivery through a series of iterations. Virtualization provided an abstraction from physical machines allowing faster time-to-market. Continuous Delivery added a consistent and predictable model for delivering high-quality, versioned artifacts. Configuration management added a degree of repeatable automation via code, for both the infrastructure and software. Public clouds provide an environment fostering innovative software build tools that take advantage of an almost inexhaustible set of resources. Containers provide the last major innovation decoupling build artifacts from their accompanying servers.
Is this the end of innovation? If history is any indication of innovation, we will continue to see future advances in software delivery methodology. Perhaps the next journey will come from AI-based software revolutionizing software development as we know it.
Starting The Journey?
Dev9’s seasoned team members have been down the development path many times. Many of our developers were part of the software renaissance, and we continue as a team to seek out the next frontier in development. We are committed to helping our clients clear the path to develop the best software possible. Software that will stand the test of time and maximize business operations while meeting organizational needs. Contact us for help on your journey.
As an industry, we are adopting more transparent and more predictable build processes in order to reduce the risks in building software. One of the core principles of Continuous Delivery is to gather feedback via Feedback Loops. At Dev9, we have adopted a "first to know" principle that aligns with this CD principle: we (the dev team) want to be the first to know when there is a failure, a degradation of performance or any result not consistent with the business objectives.
Maven and other build tools have provided developers a standardized tool and ecosystem in which to establish and communicate feedback. While unit tests, functional, build acceptance, database migration, performance testing and code analysis tools have become a mainstay in a development pipeline, benchmarking has largely remained outside of the process. This could be due to the lack of open-source, low-cost tooling or lightweight libraries that add minimal complexity.
The existing tools often compound complexity by requiring an outside tool to be integrated with the runtime artifact, and the tests are often not saved in the same source repository as the code, or not stored in a source repository at all. Local developers are unable to run the benchmarks without effort and therefore the tests lose their value quickly. On top of these tooling problems, benchmarking is not typically taught in classes and is often implemented without the isolation required to gather credible results. This makes all blogs or posts about benchmark results a ripe target for trolls.
With all that said, it is still very important to put some sort of benchmark coverage around critical areas of your codebase. Building up historical knowledge about critical sections of code can help influence optimization efforts, inform the team about technical debt, alert when a performance threshold change has been committed and compare previous or new versions of algorithms. The question should now be: how do I find and easily add benchmarking to my new or existing project? In this blog, we will focus on Java projects (1.7+). The sample code will utilize Maven, though Gradle works very similarly. I make a few recommendations throughout the blog and they are based on experience from past projects.
There are many strong choices when looking to benchmark Java-based code, but most of them have drawbacks that include license fees, additional tooling, byte code manipulation and/or Java agents, tests outlined using non-Java based code and highly complex configuration settings. I like to have tests as close to the code under test as possible to reduce brittleness, increase cohesion and reduce coupling. I consider most of the benchmarking solutions I have previously used to be too cumbersome to work with; the code that runs the tests is either not isolated enough (literally integrated in the code) or contained in a secondary solution far from the source.
The purpose of this blog is to demonstrate how to add a lightweight benchmarking tool to your build pipeline, so I will not go into detail about how to use JMH. The following blogs are excellent sources to learn:
There are a small number of items I want to point out with respect to the modes and scoring as they play an important role in how the base configuration is setup. At a basic level, JMH has two main types of measure: throughput and time-based.
Throughput is the number of operations that can be completed per unit of time. JMH calls the benchmark method repeatedly for a fixed period and counts the operations that complete. Note: ensure the method or test is well isolated and that dependencies, such as test object creation, are handled outside of the method or beforehand in a setup method. With Throughput, the higher the value the better, as it indicates that more operations can be run per unit of time.
Time-based measuring is the counterpart to throughput. The goal of time-based measuring is to identify how long a particular operation takes to run.
The most common time-based measurement is "AverageTime", which calculates the average time of the operation. JMH will also produce a "Score Error" to help determine confidence in the produced score. The "Score Error" is typically 1/2 of the confidence interval and indicates how far the results deviated from the average time. The lower the score the better, as it indicates a lower average time per operation.
SampleTime is similar to AverageTime, but JMH samples the time taken by individual calls and reports the distribution as a table of percentiles. As with AverageTime, lower numbers are better, and the percentiles are useful for deciding how much slow-tail behavior you are comfortable with.
The last and least commonly used mode is SingleShotTime. This mode is literally a single run and can be useful for cold testing a method or testing your tests. SingleShotTime could also be passed in as a parameter when running benchmarking tests in order to reduce the time required to run them (though this diminishes the value of the tests and may make them deadweight). As with the rest of the time-based measurements, the lower the value the better.
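To make the scoring concrete, the arithmetic behind throughput, average time and an error margin can be sketched in plain Java. This is an illustration only and is not JMH code: the class name is hypothetical, and the `criticalValue` argument (the multiplier for the chosen confidence level) is supplied by the caller rather than derived the way JMH derives it.

```java
/** Hypothetical helper showing the arithmetic behind benchmark scores; not part of JMH. */
final class ScoreMath {

    private ScoreMath() {}

    /** Throughput: operations completed per second (higher is better). */
    static double throughput(long operations, double elapsedSeconds) {
        return operations / elapsedSeconds;
    }

    /** Average time: nanoseconds spent per operation (lower is better). */
    static double averageTime(double elapsedNanos, long operations) {
        return elapsedNanos / operations;
    }

    /**
     * Half-width of a confidence interval around the mean of per-iteration scores.
     * criticalValue is an assumption passed in by the caller for the desired
     * confidence level; returns NaN when there are fewer than two scores.
     */
    static double scoreError(double[] scores, double criticalValue) {
        int n = scores.length;
        if (n < 2) {
            return Double.NaN;
        }
        double mean = 0.0;
        for (double s : scores) {
            mean += s;
        }
        mean /= n;
        double sumOfSquares = 0.0;
        for (double s : scores) {
            sumOfSquares += (s - mean) * (s - mean);
        }
        double stdDev = Math.sqrt(sumOfSquares / (n - 1)); // sample standard deviation
        return criticalValue * stdDev / Math.sqrt(n);
    }
}
```

Throughput and average time are two views of the same measurement: 1,000 operations in 2 seconds is 500 ops/s, or 2 ms per operation. The error margin shrinks as the iterations agree more closely or as more iterations are collected.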
Adding JMH to a Java Project
Goal: This section will show how to create a repeatable harness that allows new tests to be added with minimal overhead or duplication of code. Note that the dependencies are in the "test" scope to avoid JMH being added to the final artifact. I have created a GitHub repository that uses JMH while working on a Protobuf alternative to REST for Microservices. The code can be found here: https://github.com/mike-ensor/protobuf-serialization
1) Start by adding the dependencies to the project:
2) JMH recommends that benchmark tests and the artifact be packaged in the same uber jar. There are several ways to implement an uber jar, explicitly using the "shade" plugin for maven or implicitly using Spring Boot, Dropwizard or some framework with similar results. For the purposes of this blog post, I have used a Spring Boot application.
3) Add a test harness with a main entry class and global configuration. In this step, create an entry point in the test area of your project (indicated with #1). The intention is to avoid having benchmarking code being packaged with the main artifact.
3.1) Add the BenchmarkBase file (indicated above #2). This file will serve as the entry point for the benchmark tests and contain all of the global configuration for the tests. The class I have written looks for a "benchmark.properties" file containing configuration properties (indicated above in #3). JMH has an option to output file results and this configuration is setup for JSON. The results are used in conjunction with your continuous integration tool and can (should) be stored for historical usage.
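As a rough idea of the configuration-loading half of such a harness, the sketch below reads settings from a properties source and falls back to defaults. It is not the BenchmarkBase class from the repository: the class name, the property keys and the default values are all hypothetical, and the JMH runner wiring is deliberately left out.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical holder for global benchmark settings read from a properties source. */
final class BenchmarkSettings {
    final int warmupIterations;
    final int measurementIterations;
    final int forks;
    final String resultFile;

    private BenchmarkSettings(int warmupIterations, int measurementIterations,
                              int forks, String resultFile) {
        this.warmupIterations = warmupIterations;
        this.measurementIterations = measurementIterations;
        this.forks = forks;
        this.resultFile = resultFile;
    }

    /** Loads settings; any key that is absent falls back to an assumed default. */
    static BenchmarkSettings load(Reader source) {
        Properties props = new Properties();
        try {
            props.load(source);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new BenchmarkSettings(
                intValue(props, "benchmark.warmup.iterations", 5),
                intValue(props, "benchmark.measurement.iterations", 10),
                intValue(props, "benchmark.forks", 1),
                props.getProperty("benchmark.result.file", "target/benchmark-results.json").trim());
    }

    private static int intValue(Properties props, String key, int fallback) {
        String raw = props.getProperty(key);
        return raw == null ? fallback : Integer.parseInt(raw.trim());
    }
}
```

Keeping these values in one place means every benchmark class runs with the same warmup, measurement and output settings, and a CI job can override them by supplying a different properties file.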
This code segment is the base harness and entry point into the Benchmark process run by Maven (set up in step #5 below). At this point, the project should be able to run a benchmark test, so let's add a test case.
4) Create a Class to benchmark an operation. Keep in mind, benchmark tests will run against the entirety of the method body, this includes logging, file reading, external resources, etc. Be aware of what you want to benchmark and reduce or remove dependencies in order to isolate your subject code to ensure higher confidence in results. In this example, the configuration setup during
Caption: This gist is a sample benchmark test case extracted from Protobuf Serialization
All of your *Benchmark*.java test classes will now run when you execute the test jar, but this is often not ideal as the process is not segregated and having some control over when and how the benchmarks are run is important to keeping build times down. Let's build a Maven profile to control when the benchmarks are run and potentially start the application. Note, for the purposes of showing that maven integration tests start/stop the server, I have included this in the blog post. I would caution the need to start or stop the application server as you might be incurring the costs of resource fetching (REST calls) which would not be very isolated.
5) The concept is to create a Maven profile to run all of the benchmark tests in isolation (i.e., no unit or functional tests). This will allow the benchmark tests to be run in parallel with the rest of the build pipeline. Note that the code uses the "exec" plugin and runs the uber jar, referencing the main class by its fully qualified name. Additionally, the executable scope is limited to the "test" sources to avoid putting benchmark code into final artifacts.
This code segment shows an example maven profile to run just the Benchmark tests
6) Last, optional item is to create a runnable build step in your Continuous Integration build pipeline. In order to run your benchmark tests in isolation, you or your CI can run:
If you are using a Java-based project, JMH is relatively easy to add to your project and pipeline. The benefits of a historical ledger relating to critical areas of your project can be very useful in keeping the quality bar high. Adding JMH to your pipeline also adheres to the Continuous Delivery principles, including feedback loops, automation, repeatability, and continuous improvement. Consider adding a JMH harness and a few tests to the critical areas of your solution.
A few months ago a colleague and long-time friend of mine published an intriguing blog on a few of the less discussed costs associated with implementing microservices. The blog post made several important points on performance when designing and consuming microservices. There is an overhead to using a remote service beyond the obvious network latency due to routing and distance. The blog describes how there is a cost attributed to serialization of JSON and therefore a microservice should do meaningful work to overcome the costs of serialization.
While this is a generally accepted guideline for microservices, it is often overlooked and thus a concrete reminder helps to illustrate the point. The second point of interest is the cost associated with the bandwidth of JSON-based RESTful API responses. One potential pitfall of having a more substantive endpoint is that the payload of a response can degrade performance, quickly consume thread pools and overload the network.
These two main points made me think about alternatives and I decided to create an experiment to see if there were benefits from using Google Protocol Buffers ("Protobuf" for short) over JSON in RESTful API calls. I set out to show this by first highlighting the performance differences between converting JSON into and out of POJOs using Jackson and converting Protobuf messages into and out of a data model.
I decided to create a sufficiently complex data model that utilized nested objects, lists and primitives while trying to keep the model simple to understand. I ended up with a Recipe domain model that I would probably not use in a serious cooking application, but it serves the purpose for this experiment.
TEST #1: MEASURE COSTS OF SERIALIZATION AND DESERIALIZATION
The first challenge I encountered was how to work effectively with Protobuf messages. After spending some time reading through sparse documentation that focused on an elementary demonstration of Protobuf messages, I finally decided on a method for converting Messages in and out of my domain model. The preceding statements about using Protobufs are opinionated and someone who uses them often may disagree, but my experience was not smooth and I found messages to be rigid and more difficult than I expected.
The second challenge I encountered came when I wanted to measure the "performance" of both marshaling JSON and Serializing Protobufs. I spent some time learning JMH and designed my plan on how to test both methods. Using JMH, I designed a series of tests that allowed me to populate my POJO model, then construct a method that converted into and out of each of the technologies. I isolated the conversion of the objects in order to capture just the costs associated with conversion.
TEST #1: RESULTS
My results were not surprising, as I expected Protobuf to be more efficient. I measured the average time to marshal an object into JSON at 876.754 ns/operation (±43.222 ns) versus 148.160 ns/operation (±6.922 ns) for Protobuf, showing that converting equivalent objects into Protobuf was nearly 6 times faster than converting them into JSON.
Converting a JSON or Protobuf message back into a POJO was slower and the results were closer together, but Protobuf still outperformed JSON un-marshaling. Converting a JSON string into the domain object took on average 2037.075 ns/operation (±121.997 ns) and converting a Protobuf message into the object took on average 844.382 ns/operation (±41.852 ns), roughly 2.4 times faster than JSON.
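The "times faster" figures are simply the ratio of the two average times, and the reported error margins give a quick sanity check on whether the gap is larger than the measurement noise. A minimal sketch of both calculations follows; the class and method names are hypothetical.

```java
/** Hypothetical helper for comparing two average-time benchmark scores. */
final class ScoreComparison {

    private ScoreComparison() {}

    /** How many times faster the quicker operation is, as a ratio of average times. */
    static double speedup(double slowerNanos, double fasterNanos) {
        return slowerNanos / fasterNanos;
    }

    /**
     * True when the two score ranges (score plus or minus error) overlap, in which
     * case the difference may not be meaningful.
     */
    static boolean rangesOverlap(double scoreA, double errorA, double scoreB, double errorB) {
        double lowA = scoreA - errorA;
        double highA = scoreA + errorA;
        double lowB = scoreB - errorB;
        double highB = scoreB + errorB;
        return lowA <= highB && lowB <= highA;
    }
}
```

For the serialization numbers above, `ScoreComparison.speedup(876.754, 148.160)` gives the serialization ratio and `ScoreComparison.speedup(2037.075, 844.382)` the deserialization ratio. In both cases the ranges do not overlap, so the gap is larger than the reported error.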
Run the samples yourself using the github project created for this project: https://github.com/mike-ensor/protobuf-serialization
Test #2: Bandwidth differences
I did not find a straightforward way to capture bandwidth using traditional Java-based tools, so I decided to set up a service on AWS and communicate with the API using JSON and Protobuf requests. I then captured the traffic using Wireshark and calculated the total number of bytes sent for these requests. I included the headers and payload in the calculation since both JSON and Protobuf requests require Accept and Content-Type mime-type headers.
Test #2: Results
The total size of the JSON request was 789 bytes versus 518 bytes for the Protobuf request. While the JSON request was roughly 52% larger than the Protobuf request, there was no optimization applied to either request. The JSON was minified but not compressed. Using compression can be detrimental to the overall performance of the solution, depending on the payload size. If the payload is too small, the cost of compressing and decompressing will outweigh the benefits of a smaller payload. This is a very similar problem to the costs associated with marshaling JSON with small payloads, as described in Jeremy's blog.
After completing a project to help determine the overall benefits of using Protobuf over JSON, I have come to the conclusion that Protobuf is a legitimate option for improving message-passing performance only when performance is absolutely critical and the development team is mature enough to understand the high costs of using Protobufs.
That being said, the costs of working with Protobufs are very high. Developers lose access to human-readable messages, which are often useful during debugging. Additionally, Protobufs are messages, not objects, and therefore come with more structure and rigor. I found this complicated because of the inflexibility of using only primitives and enums, and because updating messages requires the developer to mark new fields as "optional" for backwards compatibility. Lastly, there is limited documentation on Protocol Buffers beyond the basic "hello world" applications.
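Both points in this paragraph, the relative size difference and the fixed cost of compression on small payloads, can be explored with a few lines of standard-library Java. This is a minimal sketch with hypothetical class and method names; it measures size only and says nothing about the CPU time spent compressing.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.zip.GZIPOutputStream;

/** Hypothetical helper for comparing payload sizes. */
final class PayloadSize {

    private PayloadSize() {}

    /** Percentage by which the larger payload exceeds the smaller one. */
    static double percentLarger(long largerBytes, long smallerBytes) {
        return 100.0 * (largerBytes - smallerBytes) / smallerBytes;
    }

    /** Number of bytes the payload occupies after GZIP compression. */
    static int gzipSize(byte[] payload) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(payload);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }
}
```

Calling `PayloadSize.percentLarger(789, 518)` reproduces the comparison of the two captured requests. Passing a very small body to `gzipSize` shows the other effect: the GZIP header and trailer alone add a fixed number of bytes, so a tiny payload can come out larger than it went in, while a long, repetitive one shrinks considerably.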
At the beginning of December, we sent several of our Amazon Web Services (AWS) certified developers and solutions architects to AWS’s annual conference, re:Invent. During the five-day event, AWS introduced a long list of new features, new services and enhancements to existing services.
The key themes we saw throughout the conference surrounded serverless architecture, voice enablement and a new focus on "compute at the edge." Database solutions continue to be improved with new offerings like Amazon Athena and enhancements to Amazon Aurora and Redshift.
Below are some of the announcements that our developers consider to be the highlights of this year’s show:
Amazon Lightsail is the AWS answer to new user complaints that Amazon Elastic Compute Cloud (EC2) is still far too complex, with too steep a learning curve. Additionally, some customers discovered they could run Google Compute Engine (GCE) at a lower cost than EC2. Lightsail addresses both the complexity and the pricing issues.
Anyone who has used EC2 for more than a few years knows that it has come a long way in reliability and in ease of use. In the early days, there was a steep learning curve to understanding how to define, deploy, and manage EC2 instances, and it was not uncommon for them to crash, freeze up or otherwise need to be urgently relaunched or redeployed. Prior to Amazon Elastic Block Store (EBS) volumes, you even lost all your data on the ephemeral virtual hard drives anytime an EC2 instance was no longer running. These challenges forced good planning and good habits, but it nevertheless made it harder for a new user to efficiently get up and running with EC2 instances.
EC2 has become significantly more reliable and easier to get up and running, to the detriment of some of the good habits and best practices that users were previously forced to implement, but it still requires a fair amount of knowledge. For example, users need to understand a Virtual Private Cloud (VPC), public vs. private subnets, routing tables, security groups, and DNS resolution. A software development team just trying to get a prototype up and running might not fully understand these infrastructure concepts that perhaps their IT colleagues took care of in their on-premises paradigm.
Meanwhile, Google and Microsoft made it fairly simple to get up and running quickly with a Virtual Private Server (VPS) in Google Cloud or Microsoft Azure.
Enter Amazon Lightsail. Lightsail is a preconfigured, easy-to-deploy VPS that starts at just $5 USD per month in the US East (Northern Virginia) Region (the only supported region for Lightsail as of this writing). Now anyone can get up and running quickly without having to first learn about VPCs, subnets, and other networking topics. Lightsail is expected to become a popular offering because of its pricing and ease of use. More information about Lightsail is available at https://amazonlightsail.com/
One year ago at re:Invent 2015, AWS QuickSight was announced. QuickSight makes it easy for business users to analyze and summarize data. Initially the offering was restricted to a small audience with interested customers finding themselves on a waiting list. It was finally made generally available (GA) shortly before re:Invent 2016.
Over the past year, there has been significant interest and buzz around QuickSight. While AWS historically focuses on the needs of developers, QuickSight allows less technical end users, for example line of business (LOB) managers, to use the tool – even from a smartphone – to analyze and visualize everything from Excel spreadsheets to complex relational databases.
Below is a sample screenshot from QuickSight running on an iPhone 7 Plus:
Notably, QuickSight can work with on-premises databases as well as with data in Amazon Simple Storage Service (S3). It also supports Amazon Redshift, Amazon RDS, and Amazon Aurora.
QuickSight is expected to rapidly become popular now that it is GA. Full details are available at https://quicksight.aws/
The most significant announcement regarding Amazon Aurora is that it is now also PostgreSQL compatible. Additionally, both Amazon Aurora and RDS for PostgreSQL are now included in the AWS HIPAA compliance program, allowing customers needing HIPAA compliance to take advantage of the managed service of either Aurora or RDS.
A key argument the Aurora team makes when encouraging customers to choose Aurora over a NoSQL solution is the available talent of MySQL experts. MySQL certified developers and DBAs can be up and running quickly with Aurora when taking advantage of its MySQL compatibility, whereas there is a smaller talent pool of resources knowledgeable in NoSQL solutions, according to AWS. Additionally, some Aurora customers have found a 40% cost savings when migrating away from a NoSQL solution to Amazon Aurora, according to the Aurora team.
Migrating databases is of course not trivial, so careful consideration should be given up front to whether an application should use an ACID-compliant relational database or a NoSQL database solution. Pricing, available talent, and vendor lock-in are also important considerations.
An interesting scenario mentioned by AWS was existing MySQL customers converting their multiple MySQL shards into a single Amazon Aurora database. Application-level sharding is a popular scalability strategy in MySQL when needing to scale out and reduce individual table sizes. For example, some MySQL tables will have performance difficulties beyond hundreds of millions of records in a single table, and sharding is a popular strategy for reducing this table size. A telephone directory of the United States, for instance, could be broken down into geographic shards, with each shard containing a subset of the data (e.g., Northern California shard, Midwest shard, etc.). Some customers are finding that with Amazon Aurora, they can consolidate their shards, which reduces cost and complexity while making large queries much simpler.
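To make the sharding trade-off concrete, here is a minimal Python sketch of application-level sharding for the telephone directory example. Everything in it is hypothetical: the shard names, the state-to-shard mapping, and the in-memory lists standing in for separate MySQL databases are assumptions for illustration, not a description of any customer's setup.

```python
from collections import defaultdict

# Hypothetical mapping from US state to geographic shard.
SHARD_FOR_STATE = {
    "CA": "west",
    "WA": "west",
    "IL": "midwest",
    "OH": "midwest",
    "NY": "east",
}


class ShardedDirectory:
    """Application-level sharding: the application picks the shard for each row."""

    def __init__(self):
        # Each list stands in for a separate MySQL database.
        self.shards = defaultdict(list)

    def insert(self, name, state, phone):
        shard = SHARD_FOR_STATE[state]
        self.shards[shard].append({"name": name, "state": state, "phone": phone})

    def find_in_state(self, name, state):
        """Single-shard lookup: only the shard that owns `state` is queried."""
        shard = SHARD_FOR_STATE[state]
        return [
            row for row in self.shards[shard]
            if row["name"] == name and row["state"] == state
        ]

    def find_everywhere(self, name):
        """Cross-shard query: the application must fan out and merge results."""
        results = []
        for rows in self.shards.values():
            results.extend(row for row in rows if row["name"] == name)
        return results


if __name__ == "__main__":
    directory = ShardedDirectory()
    directory.insert("Ada", "CA", "555-0100")
    directory.insert("Ada", "NY", "555-0101")
    print(directory.find_in_state("Ada", "CA"))
    print(directory.find_everywhere("Ada"))
```

The routing table and the fan-out loop are the complexity that sharding pushes into the application. With a single consolidated database, both would collapse into one ordinary query.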
Compute at the Edge
Another key theme at re:Invent was Compute at the Edge. AWS has quietly been hinting about this for the past year, and re:Invent saw three important announcements in this space: AWS Greengrass, Lambda@Edge, and Snowball Edge.
AWS Greengrass puts compute, and specifically AWS Lambda, on local devices so they can function independently of the Cloud. A popular use case is a home automation solution, where perhaps your car pulling in the driveway causes your house doors to unlock, your indoor holiday lights to turn on, your kitchen lights to turn on, music to start playing, and indoor surveillance cameras to turn off. The challenge prior to Greengrass is that often this logic requires Internet connectivity and fast responsiveness from code running in the Cloud. A proximity sensor or door sensor would communicate to a local hub which would in turn find out what to do from code running in the Cloud. The code in the Cloud would then send a push command back down to the hub and/or to additional devices within the home. If there were an Internet outage, Cloud provider incident, or other connectivity problem – perhaps even just extended latency – desired actions might be delayed minutes or hours. We want our lights turned on immediately, not hours later. Bringing the code and compute down to the local devices eliminates the Cloud and the Internet as potential single points of failure (SPOF).
Greengrass allows developers to use their existing AWS Lambda skills to also implement code locally. More information about Greengrass is available at https://aws.amazon.com/greengrass/
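As a rough illustration of the home automation scenario, the sketch below shows the kind of decision logic that could run on a local hub instead of in the Cloud. It deliberately uses no Greengrass SDK calls. Only the handler(event, context) signature follows the usual AWS Lambda convention for Python; the event shape, device names, and commands are all assumptions invented for this example.

```python
# Hypothetical device actions to perform when the car arrives.
ARRIVAL_ACTIONS = [
    ("front_door", "unlock"),
    ("kitchen_lights", "on"),
    ("holiday_lights", "on"),
    ("music", "play"),
    ("indoor_cameras", "off"),
]


def handler(event, context=None):
    """Lambda-style entry point that decides locally how to react to a sensor event.

    The event shape ({"sensor": ..., "state": ...}) is an assumption for this sketch.
    Returns a list of commands for the hub to dispatch to local devices.
    """
    if event.get("sensor") == "driveway" and event.get("state") == "car_detected":
        return [{"device": device, "command": command}
                for device, command in ARRIVAL_ACTIONS]
    return []


if __name__ == "__main__":
    for command in handler({"sensor": "driveway", "state": "car_detected"}):
        print(command)
```

Because the decision is made in a function that needs no network call, the lights can come on even during an Internet outage. How the resulting commands reach the devices is left out here and would depend on the hub.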
A popular announcement at re:Invent 2015 was Snowball, an AWS appliance that allows for simple transfer of terabytes of data from an on-premises datacenter to Amazon S3. Snowball Edge, announced last week, not only doubles the capacity of a single device from 50 TB to 100 TB, but it also adds local clustering ability and local compute power in order to more intelligently process and transfer data. Details on Snowball Edge are here: https://aws.amazon.com/snowball-edge/
It is also worth noting that in October 2016, AWS added the ability to transfer directly from HDFS to Snowball, a feature that was missing at the original launch of Snowball. Prior to October, customers had to do an intermediary step of getting the data out of HDFS first and then to Snowball second. This commonly requested feature makes migrating HDFS data to AWS significantly faster and easier. Details on this announcement are here: https://aws.amazon.com/blogs/aws/snowball-hdfs-import/
Amazon Athena is one of the more interesting serverless solutions, since it introduces a “pay per query” model. Athena uses standard SQL to query data directly from S3. Anyone with SQL skills can immediately start analyzing data files with no infrastructure required. It is not expected to be as fast as server-based solutions, but it provides an interesting alternative when ad-hoc questions need to be answered without time or budget to set up an infrastructure first. More details are here: https://aws.amazon.com/athena/
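A "pay per query" model makes the cost of an ad-hoc question easy to estimate ahead of time. The sketch below assumes the charge is proportional to the amount of data a query scans. The default price per terabyte is a placeholder and the function name is hypothetical, so check the current Athena pricing page before relying on any figure.

```python
TB = 1024 ** 4  # bytes per terabyte (binary definition; an assumption here)


def query_cost(bytes_scanned: int, price_per_tb: float = 5.0) -> float:
    """Estimate the charge for one query under a pay-per-query model.

    price_per_tb is a placeholder value, not a quoted price.
    """
    return bytes_scanned / TB * price_per_tb


if __name__ == "__main__":
    scanned = 200 * 1024 ** 3  # a hypothetical query scanning 200 GB
    print(f"estimated cost: ${query_cost(scanned):.2f}")
```

Under this assumption, cost scales with the data scanned rather than with time or provisioned servers, so anything that reduces the bytes a query reads would also reduce the bill.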
Full List of Announcements
The full list of announcements from AWS re:Invent 2016 is available at this web site: https://aws.amazon.com/new/reinvent/
Videos of re:Invent sessions are available here: https://www.youtube.com/user/AmazonWebServices
The Cloud is a strategic part of everything we do. Every year it becomes easier to use and configure, easier to access and more powerful. This year’s re:Invent announcements represent the continuation of this trend. At Dev9, we believe that every industry can benefit in some way by using Cloud services. We are dedicated Cloud experts supporting our current and future clients on their Cloud journey.
Dev9 is proud to be a member of the Amazon Web Services Partner Network. We work with organizations to develop custom software solutions, migrate from legacy systems to the Cloud, and modernize existing applications. Find out more about our Cloud services here, or contact us if you’d like to talk about your specific Cloud needs.
KIRKLAND, WA--(Marketwired - September 15, 2016) - Dev9, a Kirkland, Washington-based custom software development company, is pleased to announce a technology partnership with Broadleaf Commerce, a provider of B2B and B2C e-commerce platform solutions for complex, multi-channel commerce and digital experience management.
"We've worked with many different organizations at Dev9 and have seen a common problem: businesses are often forced to tailor their e-commerce processes around their software. It shouldn't have to be that way," Mike Ensor, Practice Director of Digital Transformation Services at Dev9 said. "By leveraging Broadleaf's platform, we're able to work with our clients to create an e-commerce solution that's just right for their business -- and one that will scale easily as they continue to evolve."
After an extensive analysis of e-commerce platform options, Dev9 has selected Broadleaf as a technology partner. Dev9 found that Broadleaf's feature-rich platform best fits the company's core principles of predictive, transparent and lean development, enabling flexibility and scalability. Broadleaf's solutions allow Dev9 engineers to maintain Continuous Delivery best practices while designing an end-to-end, custom e-commerce system that exactly fits the client's needs.
"Broadleaf's core philosophies are closely aligned with Dev9's. We're committed to providing tailored, lightweight solutions designed for continuous innovation," stated Brad Buhl, COO at Broadleaf Commerce. "For complex enterprise commerce systems, Dev9's focus on Continuous Delivery is a perfect fit. Iterative implementations, automated testing, continuous integration, and automated deployments provide businesses with platform stability, while lowering the cost and risk associated with monolithic projects."
Dev9 has deep experience architecting and developing e-commerce systems for enterprises. This partnership with Broadleaf will expedite development and support Dev9 in its promise to deliver superior custom software solutions for clients.
Broadleaf Commerce provides B2B and B2C e-commerce platform solutions to simplify the complexities of multi-channel commerce and digital experience management. As the market-leading choice for enterprise organizations requiring tailored, highly scalable commerce systems, Broadleaf is fully customizable and extensible. Trusted by Fortune 500 corporations, yet priced for the mid-market, Broadleaf provides the framework for leading brands, including Google, The Container Store, O'Reilly Auto Parts, and Vology.
For more information, visit www.broadleafcommerce.com.