[1]
• [2]API
• [3]Careers
• [4]@artsyopensource
• [5]Artsy.net
[6]Open Source
[7]Artsy
[8]Engineering Blog
[10]
Introduction to AWS OpsWorks
Aug 27, 2013
[11]Joey Aghion
[12]@joeyAghion
OpsWorks is a new service from Amazon that promises to provide high-level tools
to manage your EC2-based deployment. From [13]the announcement:
AWS OpsWorks features an integrated management experience for the entire
application lifecycle including resource provisioning, configuration
management, application deployment, monitoring, and access control. It will
work with applications of any level of complexity and is independent of any
particular architectural pattern.
After scratching our heads about exactly what that meant, we tried it anyway.
If you've been straining at the limits of your Platform as a Service (PaaS)
provider, or just wishing for more automation for your EC2 deployment, you may
want to try it out too.
Artsy has been experimenting with OpsWorks for a few months now and recently
adopted it for the production [14]artsy.net site. We're excited to share what
we've learned in the process.
[15] OpsWorks overview
Why OpsWorks?
If you've worked with the confusing array of AWS services in the past, you're
already wondering how OpsWorks fits in. Amazon's own [16]Elastic Beanstalk or
PaaS providers such as [17]Heroku typically focus on making your application as
simple as possible to deploy. You don't have to worry about the underlying
hardware or virtual resources; the platform manages that transparently.
Dependencies (such as a data-store, cache, or email server) often take the form
of external services.
But this simplicity comes at a cost. Your application's architecture is
constrained to a few common patterns. Your functionality may be limited by the
system packages available in the standardized environment, or your performance
may be limited by the available resources. OpsWorks offers more flexibility and
control, allowing you to customize the types of servers you employ and the
layers or services that make up your application. It's a lower-level tool than
those PaaS providers.
Conversely, OpsWorks offers higher-level control than [18]CloudFormation or
than managing EC2 instances and related services directly. By focusing on the
most commonly used AWS services, instance types, and architectures, it can
provide greater automation and more robust tools for configuration,
authorization, scaling, and monitoring. Amazon CTO [19]Werner Vogels rendered
it thus:
[20] How OpsWorks fits in AWS offerings
Historically, Artsy delegated dev-ops concerns to Heroku. They worried about
infrastructure, freeing us to focus on our application's higher-level goals.
Increasingly though, we were forced to work around limitations of the
platform's performance, architecture, and customizability. (We even blogged
about it [21]here, [22]here, [23]here, [24]here, and [25]here.) Rather than
continue to work against the platform, we turned to OpsWorks for greater
flexibility while keeping administrative burden low.
OpsWorks Overview
OpsWorks comes with a new vocabulary. Let's look at the major concepts:
• A Stack is the highest-level container. It groups custom configuration and
houses one or more applications. To manage a simple to-do list site, you'd
create a todo stack, although you might choose to have separate
todo-production and todo-staging stacks.
• Each stack has one or more Layers. Think of these as definitions for
different server roles. A simple static web site might have a single Nginx
layer. A typical web application might instead have a load-balancer layer,
a Rails layer, and a MySQL layer. OpsWorks defines plenty of [26]built-in
layers (for Rails, HAProxy, PHP, Node, Memcached, MySQL, etc.), but you can
also define your own.
• Applications are your code, sourced from a Git or Subversion repository, an
S3 bucket, or even an external web site. A typical Rails site might have a
single application defined, but you can define multiple applications if
you'd like to configure, scale, and monitor them together.
• Finally, we define Instances and assign each to one or more layers. These
are the EC2 servers themselves. You can start instances manually, or
configure them to start and stop on a schedule or in response to load
patterns.
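To make the vocabulary concrete, the relationships between these four concepts
can be sketched as plain data types. This is only an illustrative model in
Python, not the OpsWorks API: the class names, fields, hostnames, and sample
instance types are all assumptions chosen for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Layer:
    """A server role, e.g. a load balancer or a Rails app server."""
    name: str


@dataclass
class App:
    """Application code and where it is sourced from."""
    name: str
    source: str  # e.g. "git", "subversion", "s3", or "http"


@dataclass
class Instance:
    """An EC2 server assigned to one or more layers."""
    hostname: str
    instance_type: str
    layers: list[Layer]


@dataclass
class Stack:
    """Top-level container for layers, apps, and instances."""
    name: str
    layers: list[Layer] = field(default_factory=list)
    apps: list[App] = field(default_factory=list)
    instances: list[Instance] = field(default_factory=list)

    def instances_in(self, layer_name: str) -> list[Instance]:
        """Return the instances assigned to the named layer."""
        return [
            inst for inst in self.instances
            if any(layer.name == layer_name for layer in inst.layers)
        ]


# Hypothetical "todo-production" stack with a typical web-app layout.
lb, rails, db = Layer("haproxy"), Layer("rails-app"), Layer("mysql")
stack = Stack(
    name="todo-production",
    layers=[lb, rails, db],
    apps=[App("todo", source="git")],
    instances=[
        Instance("lb1", "m1.small", [lb]),
        Instance("app1", "m1.large", [rails]),
        Instance("app2", "m1.large", [rails]),
        Instance("db1", "m1.large", [db]),
    ],
)
rails_hosts = [inst.hostname for inst in stack.instances_in("rails-app")]
```

Note that an instance holds a list of layers rather than a single one, which
mirrors the point above that each instance can be assigned to one or more
layers.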
Configuring your stack
If your app employs a common architecture, you can probably use the OpsWorks
dashboard to define layers, add a few instances, link your Git repo, and be up
and running. Examples:
• A static web site hosted on Nginx
• A single-server PHP app
• A Rails app with an [27]HAProxy load-balancer, Unicorn app servers, and
MySQL database
• A Node.js app using [28]Elastic Load Balancer and a Memcached cache
You can find [29]detailed walk-throughs of a few such common use cases in the
OpsWorks docs.
[30] PHP app instances (image from AWS blog)
If the built-in layers don't quite satisfy your needs, there are several
facilities for customization. But first, it's useful to understand how OpsWorks
manages your instances.
Chef cookbooks
OpsWorks uses [31]Chef to configure EC2 instances. If you're unfamiliar, Chef
is a popular tool for making server configuration more automated and
repeatable—like code. The Chef “recipes” that configure each layer are
open-source and available in the [32]opsworks-cookbooks GitHub repo. (Cookbooks
contain one or more “recipes”—get it?) There, you can see precisely what
commands are run in response to server lifecycle events (i.e., as servers are
started, configured, deployed to, and stopped). These recipes write out
configuration files, restart services, authorize users for SSH access, ensure
logs are rotated, etc.—everything typical deployments might need.
For example, the recipes that set up an HAProxy instance look like this:
[33] Built-in recipes for the HAProxy layer
Overriding configuration “attributes”
Chef cookbooks accept parameters in the form of “node attributes.” The default
attributes will serve you well in most cases. To override them, edit the
stack's [34]custom Chef JSON. For example, to configure Unicorn to run 8
workers instead of 16 and Memcached to bind to port 11212 instead of 11211,
you'd enter the following for your stack's custom JSON:
[35] Custom JSON: {"rails": {"max_pool_size": 8}, "memcached": {"port": 11212}}
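The effect of custom JSON can be pictured as a recursive merge of your
overrides onto the cookbooks' default attributes. The sketch below is an
illustration in Python, not how Chef itself is implemented (Chef applies its
own attribute precedence rules). The default values mirror the numbers quoted
above, and the "user" key is a hypothetical extra attribute included only to
show that untouched settings survive.

```python
import copy
import json


def deep_merge(defaults: dict, overrides: dict) -> dict:
    """Return a copy of defaults with overrides applied recursively."""
    merged = copy.deepcopy(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars (or mismatched types): the override wins outright.
            merged[key] = copy.deepcopy(value)
    return merged


# Illustrative defaults based on the prose above (16 workers, port 11211).
default_attributes = {
    "rails": {"max_pool_size": 16},
    "memcached": {"port": 11211, "user": "nobody"},  # "user" is hypothetical
}

# The stack's custom JSON, as it would be entered in the stack settings.
custom_json = json.loads(
    '{"rails": {"max_pool_size": 8}, "memcached": {"port": 11212}}'
)

effective = deep_merge(default_attributes, custom_json)
print(json.dumps(effective, sort_keys=True, indent=2))
```

Only the two overridden values change; everything else keeps its default,
which is why the custom JSON can stay as small as the example above.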
Custom cookbooks
If setting node attributes isn't sufficient, you can go further and override
the files written out by your layers' recipes. Simply toggle the “Use custom
Chef cookbooks” option in your stack settings and provide a link to a Git,
Subversion, S3, or HTTP location for your [36]custom cookbooks.
[37] Enabling custom cookbooks
Your custom cookbooks bundle can also contain original or [38]borrowed recipes
that perform any other custom configuration. Tell OpsWorks when to run your
recipes by associating them with the desired events in your layer settings. For
example, we use custom recipes at our Rails layer's setup stage to perform
additional Nginx configuration, install a JavaScript runtime, and send logs to
[39]Papertrail.
[40] custom Chef recipes
OpsWorks shares details about the entire stack with recipes via node
attributes, allowing custom recipes to connect to other instances as required.
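As a rough illustration of what “connecting to other instances” involves, the
sketch below looks up the private IPs of every instance in a layer. Real
recipes are written in Ruby and read these values from Chef's node object; this
Python version only mimics the lookup over a plain dictionary. The nested key
layout (opsworks, layers, instances, private_ip) is an assumption modeled on
the stack configuration JSON in the OpsWorks docs, and the hostnames and
addresses are made up, so verify the names against your own stack before
relying on them.

```python
def private_ips(node: dict, layer: str) -> list[str]:
    """Return the private IPs of a layer's instances, ordered by hostname."""
    instances = (
        node.get("opsworks", {})
        .get("layers", {})
        .get(layer, {})
        .get("instances", {})
    )
    return [
        attrs["private_ip"]
        for _, attrs in sorted(instances.items())
        if "private_ip" in attrs
    ]


# Hypothetical node attributes; the structure is assumed, the values invented.
node = {
    "opsworks": {
        "layers": {
            "rails-app": {
                "instances": {
                    "rails-app2": {"private_ip": "10.0.0.12"},
                    "rails-app1": {"private_ip": "10.0.0.11"},
                }
            },
            "memcached": {
                "instances": {"memcached1": {"private_ip": "10.0.0.21"}}
            },
        }
    }
}

# e.g. build the list of cache servers a Rails config file would need.
cache_servers = [f"{ip}:11211" for ip in private_ips(node, "memcached")]
app_ips = private_ips(node, "rails-app")
```

A custom recipe could use a list like cache_servers to template out a
configuration file each time instances come and go.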
Custom layers
If the built-in layers don't satisfy your needs even after customization, you
can create custom layers. The base OpsWorks configuration is provided (for SSH
authorization, monitoring, etc.) and your custom recipes do the rest. For
example, we created a custom layer to process background jobs:
[41] custom background jobs layer
Down the road, we might introduce additional layers for Redis, Solr, or
MongoDB. (Even better, AWS may introduce built-in support for these.)
Performance
OpsWorks makes most [42]EC2 instance types available, so we can choose an
appropriate balance of CPU power, memory, disk space, network performance, and
architecture for each instance. This can be a huge boon to the performance of
resource-constrained applications. It probably still pales in comparison to
running directly on physical hardware, but this benefit alone could make
OpsWorks a worthwhile choice over providers of “standard” computing resources.
While not a rigorous comparison, the experience of one of our particularly
memory-constrained applications illustrates this. The application's responses
took an average of 638 milliseconds when running on Heroku's [43]“2x” (1 GB)
dynos. The same application responded in only 134 milliseconds on
OpsWorks-managed m1.large instances (with 7.5 GB). That's a ~80% reduction in
response time (nearly 5x faster)!
[44] OpsWorks performance superimposed on Heroku performance (chart: New Relic)
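For readers who want to check the arithmetic, here is the calculation behind
those rounded figures, using only the two averages quoted above. The variable
names are just for illustration.

```python
heroku_ms = 638    # average response time on Heroku "2x" dynos
opsworks_ms = 134  # average response time on OpsWorks-managed m1.large

# Fraction of response time eliminated by the move.
reduction = (heroku_ms - opsworks_ms) / heroku_ms

# How many times faster the average response became.
speedup = heroku_ms / opsworks_ms

print(f"{reduction:.0%} lower response time, {speedup:.1f}x faster")
# prints "79% lower response time, 4.8x faster"
```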
Troubleshooting
That's all well and good, but what about when things aren't working?
We've experienced our fair share of failures with both OpsWorks and Heroku.
PaaS providers like Heroku offer a pleasant abstraction, but in doing so reduce
our visibility into the systems running our application. (Want to know why a
dyno seems to be performing poorly? Good luck diagnosing resource contention,
disk space problems, or network latency.) Instead, we're reduced to repeatedly
issuing restart commands.
In contrast, I can easily SSH into an OpsWorks instance and notice that a
runaway process has pegged the CPU or that a chatty log has filled the disk.
(Of course, the additional control afforded by OpsWorks increases the chance
that I've caused the problem myself.)
Which do we prefer? We'd probably be safer with Heroku's experts in charge, but
I'll happily accept light sysadmin duties in exchange for the flexibility
OpsWorks affords. And by sticking with the OpsWorks default recipes as much as
possible, we benefit from the platform's combined experience.
Scaling and recovery
Scalability and recovery are critical, so how does OpsWorks compare to
full-featured PaaS providers? Pretty well, actually.
OpsWorks instances can be launched in multiple AWS availability zones for
greater redundancy. And if an instance fails for any reason, OpsWorks will stop
it and start a new one in its place.
Especially useful is the automatic scaling, which can be time-based or
load-based. This nicely matches the horizontal scaling needs of our app: we've
chosen to run additional Rails app servers during peak business hours, and
additional background workers when load on existing servers exceeds a certain
threshold.
[45] time-based scaling
[46] load-based scaling
When background workers are busy, new instances spin up automatically to tackle
the growing queue. That is dev-ops gold.
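The two scaling modes boil down to simple rules. The sketch below shows one
plausible shape for each in Python. It is not OpsWorks' actual algorithm: the
peak hours, CPU thresholds, and one-instance step size are all hypothetical
values picked for the example.

```python
from datetime import time


def time_based_count(now: time, base: int, extra: int,
                     peak_start: time = time(9, 0),
                     peak_end: time = time(18, 0)) -> int:
    """App servers to run: extra instances are added during peak hours."""
    return base + extra if peak_start <= now < peak_end else base


def load_based_count(current: int, avg_cpu: float,
                     minimum: int, maximum: int,
                     up_threshold: float = 80.0,
                     down_threshold: float = 30.0) -> int:
    """Workers to run next, stepping by one when a CPU threshold is crossed."""
    if avg_cpu > up_threshold:
        return min(current + 1, maximum)   # scale up, capped at maximum
    if avg_cpu < down_threshold:
        return max(current - 1, minimum)   # scale down, floored at minimum
    return current                         # within bounds: no change


# Two always-on Rails servers plus two more during business hours.
midday_servers = time_based_count(time(10, 0), base=2, extra=2)
night_servers = time_based_count(time(22, 0), base=2, extra=2)

# Background workers grow while existing ones are overloaded.
busy_workers = load_based_count(current=2, avg_cpu=95.0, minimum=1, maximum=4)
idle_workers = load_based_count(current=2, avg_cpu=10.0, minimum=1, maximum=4)
```

Keeping separate up and down thresholds leaves a band in which nothing
changes, so instances are not started and stopped on every small fluctuation
in load.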
Monitoring
OpsWorks provides a monitoring view of each stack, with CPU, memory, load, and
process statistics aggregated by layer. You can drill down to individual
instances and review periods anywhere from 1 hour to 2 weeks long.
[47] OpsWorks monitoring view
We haven't tried it, but OpsWorks also offers a built-in [48]Ganglia layer that
automatically collects metrics from each of your stack's instances.
Conveniently, AWS also sends these metrics to its own [49]CloudWatch monitoring
service, where you can configure custom alerts.
Integration with other AWS services
You might be noticing a theme here: OpsWorks leverages AWS's other tools and
services quite a bit.
[50]Identity and Access Management (IAM) allows you to define individual user
accounts within an umbrella account for your organization. These users can be
authorized for varying levels of access to your OpsWorks stacks. From the
Permissions view of each stack, you can then grant them SSH and sudo rights on
an individual basis.
[51] OpsWorks permissions view
Other tools such as the [52]EC2 Dashboard and [53]AWS API work as you'd hope,
with all of the usual functions being applicable to your OpsWorks-managed
instances and other services like elastic IPs and EBS volumes.
Cost
Pricing is simple and enticing. There's no charge for using OpsWorks; you pay
only for your underlying usage of other AWS resources like EC2 instances, S3
storage, bandwidth, elastic IPs, etc. If you've purchased [54]reserved
instances, those savings will apply as usual.
Unfortunately, OpsWorks doesn't yet support [55]spot instances (but I imagine
that's in the works).
Roadmap
In the few months since its launch, OpsWorks has added support for [56]ELB,
monitoring, custom AMIs, and more recent versions of Chef and Ruby. There's
also an [57]active discussion forum where developers and Amazon employees
circulate issues and request features. It's a relatively new service and can
occasionally be rough around the edges, but, knowing AWS, we expect the current
pace of enhancements to continue.
We've already launched one major app on OpsWorks and will be looking for more
opportunities as it gains a following and grows in sophistication.
Look for a follow-up post where we document our experience transitioning an app
from Heroku to OpsWorks!
Posted by
[58] Joey Aghion
[59]Site [60]GitHub [61]@joeyAghion
Categories: [62]AWS, [63]Heroku, [64]OpsWorks, [65]dev-ops
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
References:
[1] https://www.artsy.net/
[2] https://developers.artsy.net/
[3] https://www.artsy.net/jobs
[4] http://twitter.com/artsyopensource
[5] http://www.artsy.net/
[6] https://artsy.github.io/open-source
[7] https://artsy.github.io/
[8] https://artsy.github.io/
[10] https://artsy.github.io/blog/2013/08/27/introduction-to-aws-opsworks/
[11] https://artsy.github.io/author/joey
[12] https://twitter.com/joeyAghion
[13] http://aws.typepad.com/aws/2013/02/aws-opsworks-flexible-application-management-in-the-cloud.html
[14] http://artsy.net/
[15] https://artsy.github.io/images/2013-08-27-introduction-to-aws-opsworks/opsworks.png
[16] http://aws.amazon.com/elasticbeanstalk/
[17] http://heroku.com/
[18] https://aws.amazon.com/cloudformation/
[19] http://www.allthingsdistributed.com/2013/02/aws-opsworks.html
[20] https://artsy.github.io/images/2013-08-27-introduction-to-aws-opsworks/aws_control.png
[21] http://artsy.github.io/blog/2012/01/31/beyond-heroku-satellite-delayed-job-workers-on-ec2/
[22] http://artsy.github.io/blog/2012/11/15/how-to-monitor-503s-and-timeout-on-heroku/
[23] http://artsy.github.io/blog/2012/12/13/beat-heroku-60-seconds-application-boot-timeout-with-a-proxy/
[24] http://artsy.github.io/blog/2013/02/01/master-heroku-command-line-with-heroku-commander/
[25] http://artsy.github.io/blog/2013/02/17/impact-of-heroku-routing-mesh-and-random-routing/
[26] http://docs.aws.amazon.com/opsworks/latest/userguide/workinglayers.html
[27] http://haproxy.1wt.eu/
[28] http://aws.amazon.com/elasticloadbalancing/
[29] http://docs.aws.amazon.com/opsworks/latest/userguide/walkthroughs.html
[30] https://artsy.github.io/images/2013-08-27-introduction-to-aws-opsworks/standard_instances.png
[31] http://www.opscode.com/chef/
[32] http://github.com/aws/opsworks-cookbooks
[33] https://artsy.github.io/images/2013-08-27-introduction-to-aws-opsworks/haproxy_recipes.png
[34] http://docs.aws.amazon.com/opsworks/latest/userguide/workingstacks-json.html
[35] https://artsy.github.io/images/2013-08-27-introduction-to-aws-opsworks/custom_json.png
[36] http://docs.aws.amazon.com/opsworks/latest/userguide/workingcookbook-installingcustom-enable.html
[37] https://artsy.github.io/images/2013-08-27-introduction-to-aws-opsworks/custom_cookbooks.png
[38] http://docs.opscode.com/essentials_cookbooks.html
[39] https://papertrailapp.com/
[40] https://artsy.github.io/images/2013-08-27-introduction-to-aws-opsworks/custom_recipes.png
[41] https://artsy.github.io/images/2013-08-27-introduction-to-aws-opsworks/custom_layer.png
[42] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/instance-types.html
[43] https://devcenter.heroku.com/articles/dyno-size
[44] https://artsy.github.io/images/2013-08-27-introduction-to-aws-opsworks/new_relic_comparison.png
[45] https://artsy.github.io/images/2013-08-27-introduction-to-aws-opsworks/time-based_scaling.png
[46] https://artsy.github.io/images/2013-08-27-introduction-to-aws-opsworks/load-based_scaling.png
[47] https://artsy.github.io/images/2013-08-27-introduction-to-aws-opsworks/monitoring.png
[48] http://docs.aws.amazon.com/opsworks/latest/userguide/workinglayers-ganglia.html
[49] http://aws.amazon.com/cloudwatch/
[50] http://aws.amazon.com/iam/
[51] https://artsy.github.io/images/2013-08-27-introduction-to-aws-opsworks/permissions.png
[52] https://console.aws.amazon.com/ec2
[53] http://docs.aws.amazon.com/AWSRubySDK/latest/frames.html
[54] http://aws.amazon.com/ec2/reserved-instances/
[55] http://aws.amazon.com/ec2/spot-instances/
[56] http://aws.amazon.com/elasticloadbalancing/
[57] https://forums.aws.amazon.com/forum.jspa?forumID=153
[58] https://artsy.github.io/author/joey
[59] http://joey.aghion.com/
[60] https://github.com/joeyAghion
[61] https://twitter.com/joeyAghion
[62] https://artsy.github.io/blog/categories/aws/
[63] https://artsy.github.io/blog/categories/heroku/
[64] https://artsy.github.io/blog/categories/opsworks/
[65] https://artsy.github.io/blog/categories/dev-ops/
[66] http://disqus.com/?ref_noscript
[67] https://artsy.github.io/blog/2013/06/23/normalizing-gmail-email-addresses-with-canonical-emails/
[68] https://artsy.github.io/blog/2013/11/07/upgrading-to-mongoid4/
[69] https://developers.artsy.net/
[70] https://www.artsy.net/jobs
[71] http://twitter.com/artsyopensource
[72] http://www.artsy.net/