From 0438a6d828e93ae2b34f6f1d9e33b86f3b59bc8e Mon Sep 17 00:00:00 2001 From: David Eisinger Date: Sun, 22 Oct 2023 23:52:56 -0400 Subject: [PATCH] Pull in Viget posts --- .../index.md | 182 ++++++++ .../around-hello-world-in-30-days/index.md | 93 ++++ .../aws-opsworks-lessons-learned/index.md | 113 +++++ .../backup-your-database-in-git/index.md | 51 +++ .../coffeescript-for-ruby-bros/index.md | 71 +++ .../convert-ruby-method-to-lambda/index.md | 63 +++ .../curl-and-your-rails-2-app/index.md | 52 +++ .../index.md | 23 + .../diving-into-go-a-five-week-intro/index.md | 232 ++++++++++ .../index.md | 231 ++++++++++ .../index.md | 86 ++++ .../elsewhere/first-class-failure/index.md | 131 ++++++ .../index.md | 122 +++++ .../index.md | 151 +++++++ .../index.md | 121 +++++ .../get-lazy-with-custom-enumerators/index.md | 78 ++++ .../getting-into-open-source/index.md | 51 +++ .../elsewhere/gifts-for-your-nerd/index.md | 97 ++++ .../index.md | 75 +++ .../index.md | 57 +++ .../introducing-email-labs-client/index.md | 30 ++ .../elsewhere/json-feed-validator/index.md | 80 ++++ .../elsewhere/large-images-in-rails/index.md | 86 ++++ .../lets-make-a-hash-chain-in-sqlite/index.md | 233 ++++++++++ .../index.md | 427 ++++++++++++++++++ .../level-up-your-shell-game/index.md | 301 ++++++++++++ .../local-docker-best-practices/index.md | 345 ++++++++++++++ .../index.md | 122 +++++ .../index.md | 190 ++++++++ .../manual-cropping-with-paperclip/index.md | 81 ++++ content/elsewhere/motivated-to-code/index.md | 79 ++++ .../elsewhere/multi-line-memoization/index.md | 46 ++ .../index.md | 38 ++ .../index.md | 55 +++ .../index.md | 61 +++ .../index.md | 107 +++++ .../otp-ocaml-haskell-elixir/index.md | 192 ++++++++ content/elsewhere/out-damned-tabs/index.md | 65 +++ .../pandoc-a-tool-i-use-and-like/index.md | 214 +++++++++ .../index.md | 183 ++++++++ .../practical-uses-of-ruby-blocks/index.md | 86 ++++ .../protip-timewithzone-all-the-time/index.md | 49 ++ content/elsewhere/puma-on-redis/index.md | 
78 ++++ .../rails-admin-interface-generators/index.md | 96 ++++ .../elsewhere/refresh-006-dr-jquery/index.md | 35 ++ .../refresh-recap-the-future-of-data/index.md | 36 ++ .../regular-expressions-in-mysql/index.md | 75 +++ .../index.md | 87 ++++ .../index.md | 57 +++ .../index.md | 50 ++ .../sessions-on-pcs-and-macs/index.md | 46 ++ .../shoulda-macros-with-blocks/index.md | 46 ++ .../index.md | 135 ++++++ .../simple-app-stats-with-statboard/index.md | 28 ++ .../index.md | 207 +++++++++ .../simple-secure-file-transmission/index.md | 80 ++++ .../single-use-jquery-plugins/index.md | 86 ++++ .../social-media-api-gotchas/index.md | 71 +++ .../index.md | 67 +++ .../stop-pissing-off-your-designers/index.md | 85 ++++ .../index.md | 136 ++++++ .../testing-your-codes-text/index.md | 54 +++ .../elsewhere/the-balanced-developer/index.md | 49 ++ .../index.md | 85 ++++ .../index.md | 138 ++++++ .../index.md | 219 +++++++++ .../three-magical-git-aliases/index.md | 63 +++ .../elsewhere/unfuddle-user-feedback/index.md | 46 ++ .../index.md | 260 +++++++++++ .../viget-devs-storm-chicago/index.md | 29 ++ .../index.md | 153 +++++++ .../whats-new-since-the-last-deploy/index.md | 69 +++ .../index.md | 161 +++++++ .../index.md | 190 ++++++++ themes/v2/assets/css/style.scss | 26 +- themes/v2/layouts/elsewhere/list.html | 17 + themes/v2/layouts/elsewhere/single.html | 14 + 77 files changed, 8219 insertions(+), 5 deletions(-) create mode 100644 content/elsewhere/adding-a-not-null-column-to-an-existing-table/index.md create mode 100644 content/elsewhere/around-hello-world-in-30-days/index.md create mode 100644 content/elsewhere/aws-opsworks-lessons-learned/index.md create mode 100644 content/elsewhere/backup-your-database-in-git/index.md create mode 100644 content/elsewhere/coffeescript-for-ruby-bros/index.md create mode 100644 content/elsewhere/convert-ruby-method-to-lambda/index.md create mode 100644 content/elsewhere/curl-and-your-rails-2-app/index.md create mode 100644 
content/elsewhere/devnation-coming-to-san-francisco/index.md create mode 100644 content/elsewhere/diving-into-go-a-five-week-intro/index.md create mode 100644 content/elsewhere/email-photos-to-an-s3-bucket-with-aws-lambda-with-cropping-in-ruby/index.md create mode 100644 content/elsewhere/extract-embedded-text-from-pdfs-with-poppler-in-ruby/index.md create mode 100644 content/elsewhere/first-class-failure/index.md create mode 100644 content/elsewhere/five-turbo-lessons-i-learned-the-hard-way/index.md create mode 100644 content/elsewhere/friends-undirected-graph-connections-in-rails/index.md create mode 100644 content/elsewhere/functional-programming-in-ruby-with-contracts/index.md create mode 100644 content/elsewhere/get-lazy-with-custom-enumerators/index.md create mode 100644 content/elsewhere/getting-into-open-source/index.md create mode 100644 content/elsewhere/gifts-for-your-nerd/index.md create mode 100644 content/elsewhere/how-why-to-run-autotest-on-your-mac/index.md create mode 100644 content/elsewhere/html-sanitization-in-rails-that-actually-works/index.md create mode 100644 content/elsewhere/introducing-email-labs-client/index.md create mode 100644 content/elsewhere/json-feed-validator/index.md create mode 100644 content/elsewhere/large-images-in-rails/index.md create mode 100644 content/elsewhere/lets-make-a-hash-chain-in-sqlite/index.md create mode 100644 content/elsewhere/lets-write-a-dang-elasticsearch-plugin/index.md create mode 100644 content/elsewhere/level-up-your-shell-game/index.md create mode 100644 content/elsewhere/local-docker-best-practices/index.md create mode 100644 content/elsewhere/maintenance-matters-continuous-integration/index.md create mode 100644 content/elsewhere/making-an-email-powered-e-paper-picture-frame/index.md create mode 100644 content/elsewhere/manual-cropping-with-paperclip/index.md create mode 100644 content/elsewhere/motivated-to-code/index.md create mode 100644 content/elsewhere/multi-line-memoization/index.md create 
mode 100644 content/elsewhere/new-pointless-project-i-dig-durham/index.md create mode 100644 content/elsewhere/new-pointless-project-officegames/index.md create mode 100644 content/elsewhere/on-confidence-and-real-time-strategy-games/index.md create mode 100644 content/elsewhere/otp-a-language-agnostic-programming-challenge/index.md create mode 100644 content/elsewhere/otp-ocaml-haskell-elixir/index.md create mode 100644 content/elsewhere/out-damned-tabs/index.md create mode 100644 content/elsewhere/pandoc-a-tool-i-use-and-like/index.md create mode 100644 content/elsewhere/pluck-subset-rails-activerecord-model-attributes/index.md create mode 100644 content/elsewhere/practical-uses-of-ruby-blocks/index.md create mode 100644 content/elsewhere/protip-timewithzone-all-the-time/index.md create mode 100644 content/elsewhere/puma-on-redis/index.md create mode 100644 content/elsewhere/rails-admin-interface-generators/index.md create mode 100644 content/elsewhere/refresh-006-dr-jquery/index.md create mode 100644 content/elsewhere/refresh-recap-the-future-of-data/index.md create mode 100644 content/elsewhere/regular-expressions-in-mysql/index.md create mode 100644 content/elsewhere/required-fields-should-be-marked-not-null/index.md create mode 100644 content/elsewhere/romanize-another-programming-puzzle/index.md create mode 100644 content/elsewhere/rubyinline-in-shared-rails-environments/index.md create mode 100644 content/elsewhere/sessions-on-pcs-and-macs/index.md create mode 100644 content/elsewhere/shoulda-macros-with-blocks/index.md create mode 100644 content/elsewhere/simple-apis-using-serializewithoptions/index.md create mode 100644 content/elsewhere/simple-app-stats-with-statboard/index.md create mode 100644 content/elsewhere/simple-commit-linting-for-issue-number-in-github-actions/index.md create mode 100644 content/elsewhere/simple-secure-file-transmission/index.md create mode 100644 content/elsewhere/single-use-jquery-plugins/index.md create mode 100644 
content/elsewhere/social-media-api-gotchas/index.md create mode 100644 content/elsewhere/static-asset-packaging-rails-3-heroku/index.md create mode 100644 content/elsewhere/stop-pissing-off-your-designers/index.md create mode 100644 content/elsewhere/testing-solr-and-sunspot-locally-and-on-circleci/index.md create mode 100644 content/elsewhere/testing-your-codes-text/index.md create mode 100644 content/elsewhere/the-balanced-developer/index.md create mode 100644 content/elsewhere/the-little-schemer-will-expand-blow-your-mind/index.md create mode 100644 content/elsewhere/the-right-way-to-store-and-serve-dragonfly-thumbnails/index.md create mode 100644 content/elsewhere/things-about-which-the-viget-devs-are-excited-may-2020-edition/index.md create mode 100644 content/elsewhere/three-magical-git-aliases/index.md create mode 100644 content/elsewhere/unfuddle-user-feedback/index.md create mode 100644 content/elsewhere/using-microcosm-presenters-to-manage-complex-features/index.md create mode 100644 content/elsewhere/viget-devs-storm-chicago/index.md create mode 100644 content/elsewhere/whats-in-a-word-building-a-verbose-party-game/index.md create mode 100644 content/elsewhere/whats-new-since-the-last-deploy/index.md create mode 100644 content/elsewhere/why-i-still-like-ruby-and-a-few-things-i-dont-like/index.md create mode 100644 content/elsewhere/write-you-a-parser-for-fun-and-win/index.md create mode 100644 themes/v2/layouts/elsewhere/list.html create mode 100644 themes/v2/layouts/elsewhere/single.html diff --git a/content/elsewhere/adding-a-not-null-column-to-an-existing-table/index.md b/content/elsewhere/adding-a-not-null-column-to-an-existing-table/index.md new file mode 100644 index 0000000..47d3365 --- /dev/null +++ b/content/elsewhere/adding-a-not-null-column-to-an-existing-table/index.md @@ -0,0 +1,182 @@ +--- +title: "Adding a NOT NULL Column to an Existing Table" +date: 2014-09-30T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: 
https://www.viget.com/articles/adding-a-not-null-column-to-an-existing-table/ +--- + +*Despite some exciting advances in the field, like +[Node](http://nodejs.org/), [Redis](http://redis.io/), and +[Go](https://golang.org/), a well-structured relational database fronted +by a Rails or Sinatra (or Django, etc.) app is still one of the most +effective toolsets for building things for the web. In the coming weeks, +I'll be publishing a series of posts about how to be sure that you're +taking advantage of all your RDBMS has to offer.* + +ASSUMING MY [LAST +POST](https://viget.com/extend/required-fields-should-be-marked-not-null) +CONVINCED YOU of the *why* of marking required fields `NOT NULL`, the +next question is *how*. When creating a brand new table, it's +straightforward enough: + + CREATE TABLE employees ( + id integer NOT NULL, + name character varying(255) NOT NULL, + created_at timestamp without time zone, + ... + ); + +When adding a column to an existing table, things get dicier. If there +are already rows in the table, what should the database do when +confronted with a new column that 1) cannot be null and 2) has no +default value? Ideally, the database would allow you to add the column +if there is no existing data, and throw an error if there is. As we'll +see, depending on your choice of database platform, this isn't always +the case. + +## A Naïve Approach {#anaïveapproach} + +Let's go ahead and add a required `age` column to our employees table, +and let's assume I've laid my case out well enough that you're going to +require it to be non-null. To add our column, we create a migration like +so: + + class AddAgeToEmployees < ActiveRecord::Migration + def change + add_column :employees, :age, :integer, null: false + end + end + +The desired behavior on running this migration would be for it to run +cleanly if there are no employees in the system, and to fail if there +are any. 
Let's try it out, first in Postgres, with no employees: + + == AddAgeToEmployees: migrating ============================================== + -- add_column(:employees, :age, :integer, {:null=>false}) + -> 0.0006s + == AddAgeToEmployees: migrated (0.0007s) ===================================== + +Bingo. Now, with employees: + + == AddAgeToEmployees: migrating ============================================== + -- add_column(:employees, :age, :integer, {:null=>false}) + rake aborted! + StandardError: An error has occurred, this and all later migrations canceled: + + PG::NotNullViolation: ERROR: column "age" contains null values + +Exactly as we'd expect. Now let's try SQLite, without data: + + == AddAgeToEmployees: migrating ============================================== + -- add_column(:employees, :age, :integer, {:null=>false}) + rake aborted! + StandardError: An error has occurred, this and all later migrations canceled: + + SQLite3::SQLException: Cannot add a NOT NULL column with default value NULL: ALTER TABLE "employees" ADD "age" integer NOT NULL + +Regardless of whether or not there are existing rows in the table, +SQLite won't let you add `NOT NULL` columns without default values. +Super strange. More information on this ... *quirk* ... is available on +this [StackOverflow +thread](http://stackoverflow.com/questions/3170634/how-to-solve-cannot-add-a-not-null-column-with-default-value-null-in-sqlite3). + +Finally, our old friend MySQL. Without data: + + == AddAgeToEmployees: migrating ============================================== + -- add_column(:employees, :age, :integer, {:null=>false}) + -> 0.0217s + == AddAgeToEmployees: migrated (0.0217s) ===================================== + +Looks good. Now, with data: + + == AddAgeToEmployees: migrating ============================================== + -- add_column(:employees, :age, :integer, {:null=>false}) + -> 0.0190s + == AddAgeToEmployees: migrated (0.0191s) ===================================== + +It ... worked? 
Can you guess what our existing user's age is? + + > be rails runner "p Employee.first" + # + +Zero. Turns out that MySQL has a concept of an [*implicit +default*](http://stackoverflow.com/questions/22868345/mysql-add-a-not-null-column/22868473#22868473), +which is used to populate existing rows when a default is not supplied. +Neat, but exactly the opposite of what we want in this instance. + +### A Better Approach {#abetterapproach} + +What's the solution to this problem? Should we just always use Postgres? + +[Yes.](https://www.youtube.com/watch?v=bXpsFGflT7U) + +But if that's not an option (say your client's support contract only +covers MySQL), there's still a way to write your migrations such that +Postgres, SQLite, and MySQL all behave in the same correct way when +adding `NOT NULL` columns to existing tables: add the column first, then +add the constraint. Your migration would become: + + class AddAgeToEmployees < ActiveRecord::Migration + def up + add_column :employees, :age, :integer + change_column_null :employees, :age, false + end + + def down + remove_column :employees, :age, :integer + end + end + +Postgres behaves exactly the same as before. SQLite, on the other hand, +shows remarkable improvement. Without data: + + == AddAgeToEmployees: migrating ============================================== + -- add_column(:employees, :age, :integer) + -> 0.0024s + -- change_column_null(:employees, :age, false) + -> 0.0032s + == AddAgeToEmployees: migrated (0.0057s) ===================================== + +Success -- the new column is added with the null constraint. And with +data: + + == AddAgeToEmployees: migrating ============================================== + -- add_column(:employees, :age, :integer) + -> 0.0024s + -- change_column_null(:employees, :age, false) + rake aborted! + StandardError: An error has occurred, this and all later migrations canceled: + + SQLite3::ConstraintException: employees.age may not be NULL + +Perfect! And how about MySQL? 
Without data: + + == AddAgeToEmployees: migrating ============================================== + -- add_column(:employees, :age, :integer) + -> 0.0145s + -- change_column_null(:employees, :age, false) + -> 0.0176s + == AddAgeToEmployees: migrated (0.0323s) ===================================== + +And with: + + == AddAgeToEmployees: migrating ============================================== + -- add_column(:employees, :age, :integer) + -> 0.0142s + -- change_column_null(:employees, :age, false) + rake aborted! + StandardError: An error has occurred, all later migrations canceled: + + Mysql2::Error: Invalid use of NULL value: ALTER TABLE `employees` CHANGE `age` `age` int(11) NOT NULL + +BOOM. [Flawless victory.](https://www.youtube.com/watch?v=kXuCvIbY1v4) + +\* \* \* + +To summarize: never use `add_column` with `null: false`. Instead, add +the column and then use `change_column_null` to set the constraint for +correct behavior regardless of database platform. In a follow-up post, +I'll focus on what to do when you don't want to simply error out if +there is existing data, but rather migrate it into a good state before +setting `NOT NULL`. diff --git a/content/elsewhere/around-hello-world-in-30-days/index.md b/content/elsewhere/around-hello-world-in-30-days/index.md new file mode 100644 index 0000000..6a8a264 --- /dev/null +++ b/content/elsewhere/around-hello-world-in-30-days/index.md @@ -0,0 +1,93 @@ +--- +title: "Around \"Hello World\" in 30 Days" +date: 2010-06-02T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/around-hello-world-in-30-days/ +--- + +I'll say this up front: I love my job. I love the web. I love Rails. And +I love working here at Viget. But lately, I've gone through periods +where web development feels a bit stale. 
Hugh Macleod has a great post +called [Beware of Turning Hobbies into +Jobs](http://gapingvoid.com/2008/01/10/beware-of-turning-hobbies-into-jobs/) +that sheds a bit of light on this problem: once you make a career out of +doing what you love, it's not solely yours anymore. There are clearly +[bigger problems one could +have](http://news.nationalgeographic.com/news/2010/06/100601-sinkhole-in-guatemala-2010-world-science/), +but I think this is something all developers struggle with at some point +in their careers. + +This problem was weighing on my mind one morning, combined with a +looming speaking engagement I'd committed to for [DevNation +Chicago](http://devnation.us/events/8), when it hit me: I would spend a +month trying a new technology every day, and then share my experiences +in Chicago. Learning is a core value here at Viget, and my coworkers +were incredibly supportive, adding to the list of technologies and +asking to join me in learning several of them. With their help, coming +up with the list was no problem --- it was actually harder to get the +list *down* to 30. Here's what I finally committed to: + +1. [Cassandra](http://cassandra.apache.org/) +2. [Chrome Extensions](https://code.google.com/chrome/extensions/) +3. [Clojure](http://clojure.org/) +4. [CoffeeScript](https://jashkenas.github.com/coffee-script/) +5. [CouchDB](http://couchdb.apache.org/) +6. [CSS3](http://www.css3.info/) +7. [Django](https://www.djangoproject.com/) +8. [Erlang](http://www.erlang.org/) +9. [Go](https://golang.org/) +10. [Haskell](http://www.haskell.org/) +11. [HTML5](https://en.wikipedia.org/wiki/HTML5) +12. [Io](http://www.iolanguage.com/) +13. [Jekyll](https://github.com/mojombo/jekyll) +14. [jQTouch](http://www.jqtouch.com/) +15. [Lua](http://www.lua.org/) +16. [MacRuby](http://www.macruby.org/) +17. [Mercurial](http://mercurial.selenic.com/) +18. [MongoDB](http://www.mongodb.org/) +19. [Node.js](http://nodejs.org/) +20. [OCaml](http://caml.inria.fr/) +21. 
[ooc](http://ooc-lang.org/) +22. [Redis](https://code.google.com/p/redis/) +23. [Riak](http://riak.basho.com/) +24. [Scala](http://www.scala-lang.org/) +25. [Scheme](https://en.wikipedia.org/wiki/Scheme_(programming_language)) +26. [Sinatra](http://www.sinatrarb.com/) +27. [Squeak](http://www.squeak.org/) +28. [Treetop](http://treetop.rubyforge.org/) +29. [VIM](http://www.vim.org/) +30. [ZSH](http://www.zsh.org/) + +Thirteen languages, most of them functional. Five datastores of various +[NoSQL](https://en.wikipedia.org/wiki/NoSQL) flavors. Five web +frameworks, and seven "others," including a new version control system, +text editor, and shell. + +Once I'd committed myself to this project, an hour a day for 30 days, it +was surprisingly easy to stick with it. The hour time slot was critical, +both as a minimum (no giving up when things get too hard or too easy) +and as a maximum (it's easier to sit down with an intimidating piece of +technology at 7 p.m. when you know you'll be done by 8). I did have some +ups and downs, though. High points included Redis, Scheme, Erlang, and +CoffeeScript. Lows included Cassandra and CouchDB, which I couldn't even +get running in the allotted hour. + +I created a simple [Tumblr blog](https://techmonth.tumblr.com) + +and posted to it after every new tech, which kept me accountable and +spurred discussion on Twitter and at the office. My talk went over +surprisingly well at DevNation ([here are my +slides](http://www.slideshare.net/deisinger/techmonth)), and I hope to +give it again at future events. + +All in all, it was a great experience and proved that projects that are +intimidating when considered all at once are easily manageable when +broken down into small pieces. The biggest lesson I took away from the +whole thing was that it's fundamental to find a way to make programming +fun. 
Working my way through [The Little +Schemer](https://www.amazon.com/Little-Schemer-Daniel-P-Friedman/dp/0262560992) +or building a simple webapp with [Node.js](http://nodejs.org/), I felt +like a kid again, pecking out my first QBasic programs. Learning how to +keep programming exciting is far more beneficial than any concrete +technical knowhow I gained. diff --git a/content/elsewhere/aws-opsworks-lessons-learned/index.md b/content/elsewhere/aws-opsworks-lessons-learned/index.md new file mode 100644 index 0000000..0222476 --- /dev/null +++ b/content/elsewhere/aws-opsworks-lessons-learned/index.md @@ -0,0 +1,113 @@ +--- +title: "AWS OpsWorks: Lessons Learned" +date: 2013-10-04T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/aws-opsworks-lessons-learned/ +--- + +We've been using Amazon's [AWS +OpsWorks](http://aws.amazon.com/opsworks/) to manage our infrastructure +on a recent client project. The website describes OpsWorks as + +> a DevOps solution for managing applications of any scale or complexity +> on the AWS cloud. + +You can think of it as a middleground between something like Heroku and +a manually configured server environment. You can also think of it as +[Chef](http://www.opscode.com/chef/)-as-a-service. Before reading on, +I'd recommend reading this [Introduction to AWS +OpsWorks](http://artsy.github.io/blog/2013/08/27/introduction-to-aws-opsworks/), +a post I wish had existed when I was first diving into this stuff. With +that out of the way, here are a few lessons I had to learn the hard way +so hopefully you won't have to. + +### You'll need to learn Chef {#youllneedtolearnchef} + +The basis of OpsWorks is [Chef](http://www.opscode.com/chef/), and if +you want to do anything interesting with your instances, you're going to +have to dive in, fork the [OpsWorks +cookbooks](https://github.com/aws/opsworks-cookbooks), and start adding +your own recipes. 
Suppose, like we did, you want to add +[PDFtk](http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/) to your +servers to merge some documents: + +1. Check the [OpsCode + Community](http://community.opscode.com/cookbooks) site for a + recipe. +2. [A recipe exists.](http://community.opscode.com/cookbooks/pdftk) You + lucky dog. +3. Add the recipe to your fork, push it up, and run it. +4. It fails. Turns out they renamed the `gcj` package to `gcj-jdk`. + Fix. +5. It fails again. The recipe is referencing an old version of PDFtk. + Fix. +6. [Great sexy success.](http://cdn.meme.li/i/d1v84.jpg) + +A little bit tedious compared with `wget/tar/make`, for sure, but once +you get it configured properly, you can spin up new servers at will and +be confident that they include all the necessary software. + +### Deploy hooks: learn them, love them {#deployhooks:learnthemlovethem} + +Chef offers a number of [deploy +callbacks](http://docs.opscode.com/resource_deploy.html#callbacks) you +can use as a stand-in for Capistrano's `before`/`after` hooks. To use +them, create a directory in your app called `deploy` and add files named +for the appropriate callbacks (e.g. `deploy/before_migrate.rb`). For +example, here's how we precompile assets before migration: + + rails_env = new_resource.environment["RAILS_ENV"] + + Chef::Log.info("Precompiling assets for RAILS_ENV=#{rails_env}...") + + execute "rake assets:precompile" do + cwd release_path + command "bundle exec rake assets:precompile" + environment "RAILS_ENV" => rails_env + end + +### Layers: roles, but not *dedicated* roles {#layers:rolesbutnotdedicatedroles} + +AWS documentation describes +[layers](http://docs.aws.amazon.com/opsworks/latest/userguide/workinglayers.html) +as + +> how to set up and configure a set of instances and related resources +> such as volumes and Elastic IP addresses. + +The default layer types ("PHP App Server", "MySQL") imply that layers +distinguish separate components of your infrastructure. 
While that's +partially true, it's better to think about layers as the *roles* your +EC2 instances fill. For example, you might have two instances in your +"Rails App Server" role, a single, separate instance for your "Resque" +role, and one of the two app servers in the "Cron" role, responsible for +sending nightly emails. + +### Altering the Rails environment {#alteringtherailsenvironment} + +If you need to manually execute a custom recipe against your existing +instances, the Rails environment is going to be set to "production" no +matter what you've defined in the application configuration. In order to +change this value, add the following to the "Custom Chef JSON" field: + + { + "deploy": { + "app_name": { + "rails_env": "staging" + } + } + } + +(Substituting in your own application and environment names.) + +------------------------------------------------------------------------ + +We've found OpsWorks to be a solid choice for repeatable, maintainable +server infrastructure that still offers the root access we all crave. +Certainly, it's slower out of the gate than spinning up a new Heroku app +or logging into a VPS and `apt-get`ting it up, but the investment up +front leads to a more sustainable system over time. If this sounds at +all interesting to you, seriously go check out that [introduction +post](http://artsy.github.io/blog/2013/08/27/introduction-to-aws-opsworks/). +It's the post this post wishes it was. 
diff --git a/content/elsewhere/backup-your-database-in-git/index.md b/content/elsewhere/backup-your-database-in-git/index.md new file mode 100644 index 0000000..f6e7fb9 --- /dev/null +++ b/content/elsewhere/backup-your-database-in-git/index.md @@ -0,0 +1,51 @@ +--- +title: "Backup your Database in Git" +date: 2009-05-08T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/backup-your-database-in-git/ +--- + +**Short version**: dump your production database into a git repository +for an instant backup solution. + +**Long version**: keeping backups of production data is fundamental for +a well-run web application, but it's tricky to maintain history while +keeping disk usage at a reasonable level. You could continually +overwrite the backup with the latest data, but you risk automatically +replacing good data with bad. You could save each version in a separate, +timestamped file, but since most of the data is static, you would end up +wasting a lot of disk space. + +When you think about it, a database dump is just SQL code, so why not +manage it the same way you manage the rest of your code --- in a source +code manager? Setting such a scheme up is dead simple. On your +production server, with git installed: + + mkdir -p /path/to/backup && cd /path/to/backup && mysqldump -u [user] -p[pass] --skip-extended-insert [database] > [database].sql && git init && git add [database].sql && git commit -m "Initial commit" + +The `--skip-extended-insert` option tells mysqldump to give each table +row its own `insert` statement. This creates a larger initial commit +than the default bulk insert, but makes future commits much easier to +read and (I suspect) keeps the overall repository size smaller, since +each patch only includes the individual records added/updated/deleted.
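To see why one `INSERT` per row makes for tidier commits, here's a toy sketch in Ruby. It is not how mysqldump or git work internally: the `users` table, the `DumpDiff` module, and its `extended_dump`, `per_row_dump`, and `changed_bytes` helpers are all made up for illustration, and counting changed lines is only a rough stand-in for what a line-based diff shows.

```ruby
# Toy model: how much of a dump "changes" when a single row is edited.
# All names here are hypothetical; real dumps contain much more than INSERTs.
module DumpDiff
  # Default mysqldump style: every row of a table packed into one statement.
  def self.extended_dump(rows)
    values = rows.map { |id, name| "(#{id},'#{name}')" }.join(",")
    ["INSERT INTO users VALUES #{values};"]
  end

  # --skip-extended-insert style: one statement (one line) per row.
  def self.per_row_dump(rows)
    rows.map { |id, name| "INSERT INTO users VALUES (#{id},'#{name}');" }
  end

  # Total bytes on lines that appear in the new dump but not in the old one.
  def self.changed_bytes(before, after)
    (after - before).sum(&:bytesize)
  end
end

before = [[1, "alice"], [2, "bob"], [3, "carol"]]
after  = [[1, "alice"], [2, "robert"], [3, "carol"]]

extended = DumpDiff.changed_bytes(DumpDiff.extended_dump(before),
                                  DumpDiff.extended_dump(after))
per_row  = DumpDiff.changed_bytes(DumpDiff.per_row_dump(before),
                                  DumpDiff.per_row_dump(after))

puts "extended insert: #{extended} changed bytes"
puts "one row per insert: #{per_row} changed bytes"
```

With the bulk format, editing one row rewrites the line holding the whole table; with one row per line, only the edited row's line differs, which is the part a line-based `git diff` would show you.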
+ +From here, all we have to do is set up a cron job to update the backup: + + 0 * * * * cd /path/to/backup && mysqldump -u [user] -p[pass] --skip-extended-insert [database] > [database].sql && git commit -am "Updating DB backup" + +You may want to add another entry to run +[`git gc`](http://www.kernel.org/pub/software/scm/git/docs/git-gc.html) +every day or so in order to keep disk space down and performance up. + +Now that you have all of your data in a git repo, you've got a lot of +options. Easily view activity on your site with `git whatchanged -p`. +Update your staging server to the latest data with +`git clone ssh://[hostname]/path/to/backup`. Add a remote on +[GitHub](https://github.com/) and get offsite backups with a simple +`git push`. + +This technique might fall down if your app approaches +[Craigslist](http://craigslist.org/)-level traffic, but it's working +flawlessly for us on [SpeakerRate](http://speakerrate.com), and should +work well for your small- to medium-sized web application. diff --git a/content/elsewhere/coffeescript-for-ruby-bros/index.md b/content/elsewhere/coffeescript-for-ruby-bros/index.md new file mode 100644 index 0000000..35bb6e0 --- /dev/null +++ b/content/elsewhere/coffeescript-for-ruby-bros/index.md @@ -0,0 +1,71 @@ +--- +title: "CoffeeScript for Ruby Bros" +date: 2010-08-06T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/coffeescript-for-ruby-bros/ +--- + +Hello there, Ruby friend. You've perhaps heard of +[CoffeeScript](https://jashkenas.github.com/coffee-script/), +"JavaScript's less ostentatious kid brother," but it might *as yet* be +unclear why you'd want to stray from Ruby's loving embrace.
Well, +friend, I've been playing with it off-and-on for the past few months, +and I've come to the following conclusion: **CoffeeScript combines the +simplicity of Javascript with the elegance of Ruby.** + +## Syntax + +Despite its compactness as a language, Javascript has always felt a bit +noisy to me. Its excessive punctuation is pretty much the only thing it +has in common with its namesake. CoffeeScript borrows from the syntaxes +of Ruby and Python to create a sort of minimalist Javascript. From +Python, we get significant whitespace and list comprehensions. + +Otherwise, it's all Ruby: semicolons and parentheses around function +arguments are entirely optional. Like Ruby's `||=`, conditional +assignment is handled with `?=`. Conditionals can be inlined +(`something if something_else`). And every statement has an implicit +value, so `return` is unnecessary. + +## Functions + +Both Javascript and Ruby support functional programming. Ruby offers +numerous language features to make functional programming as concise as +possible, the drawback being the sheer number of ways to define a +function: at least six, by my count (`def`, `do/end`, `{ }`, `lambda`, +`Proc.new`, `proc`). + +At the other extreme, Javascript offers but one way to define a +function: the `function` keyword. It's certainly simple, but especially +in callback-oriented code, you wind up writing `function` one hell of a +lot. CoffeeScript gives us the `->` operator, combining the brevity of +Ruby with the simplicity of Javascript: + + thrice: (f) -> f() f() f() thrice -> puts "OHAI" + +Which translates to: + + (function(){ var thrice; thrice = function(f) { f(); f(); return f(); }; thrice(function() { return puts("OHAI"); }); })(); + +I'll tell you what that is: MONEY. Money in the BANK. + +## It's Node + +Though not dependent upon it, CoffeeScript is built to run on top of +[Node.js](http://nodejs.org/). 
This means you can take advantage of all +the incredible work people are doing with Node, including the +[Express](http://expressjs.com/) web framework, the [Redis Node +Client](https://github.com/fictorial/redis-node-client), and +[Connect](https://github.com/senchalabs/connect), a middleware framework +along the lines of [Rack](http://rack.rubyforge.org/). What's more, its +integration with Node allows you to run CoffeeScript programs from the +command line just like you would Ruby code. + +CoffeeScript is an exciting technology, as both a standalone language +and as a piece of a larger Node.js toolkit. Take a look at +[Defer](http://gfxmonk.net/2010/07/04/defer-taming-asynchronous-javascript-with-coffeescript.html) +to see what the language might soon be capable of, and if you're +participating in this year's [Node.js +Knockout](http://nodeknockout.com/), watch out for the +[Rocketpants](http://nodeknockout.com/teams/2eb41a4c31f50c044a280000). diff --git a/content/elsewhere/convert-ruby-method-to-lambda/index.md b/content/elsewhere/convert-ruby-method-to-lambda/index.md new file mode 100644 index 0000000..01a5907 --- /dev/null +++ b/content/elsewhere/convert-ruby-method-to-lambda/index.md @@ -0,0 +1,63 @@ +--- +title: "Convert a Ruby Method to a Lambda" +date: 2011-04-26T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/convert-ruby-method-to-lambda/ +--- + +Last week I +[tweeted](https://twitter.com/#!/deisinger/status/60706017037660160): + +> Convert a method to a lambda in Ruby: lambda(&method(:events_path)). +> OR JUST USE JAVASCRIPT. + +It might not be clear what I was talking about or why it would be +useful, so allow me to elaborate. Say you've got the following bit of +Javascript: + + var ytmnd = function() { alert("you're the man now " + (arguments[0] || "dog")); }; + +Calling `ytmnd()` gets us `you're the man now dog`, while +`ytmnd("david")` yields `you're the man now david`. 
Calling simply +`ytmnd` gives us a reference to the function that we're free to pass +around and call at a later time. Consider now the following Ruby code: + + def ytmnd(name = "dog") puts "you're the man now #{name}" end + +First, aren't default argument values and string interpolation awesome? +Love you, Ruby. Just as with our Javascript function, calling `ytmnd()` +prints "you're the man now dog", and `ytmnd("david")` also works as +you'd expect. But. BUT. Running `ytmnd` returns *not* a reference to the +method, but rather calls it outright, leaving you with nothing but Sean +Connery's timeless words. + +To duplicate Javascript's behavior, you can convert the method to a +lambda with `sean = lambda(&method(:ytmnd))`. Now you've got something +you can call with `sean.call` or `sean.call("david")` and pass around +with `sean`. + +BUT WAIT. Everything in Ruby is an object, even methods. And as it turns +out, a method object behaves very much like a lambda. So rather than +saying `sean = lambda(&method(:ytmnd))`, you can simply say +`sean = method(:ytmnd)`, and then call it as if it were a lambda with +`.call` or `[]`. Big ups to +[Justin](https://www.viget.com/about/team/jmarney/) for that knowledge +bomb. + +### WHOOOO CARES + +All contrivances aside, there are real-life instances where you'd want +to take advantage of this language feature. Imagine a Rails partial that +renders a list of filtered links for a given model. How would you tell +the partial where to send the links? You could pass in a string and use +old-school `:action` and `:controller` params or use `eval` (yuck). You +could create the lambda the long way with something like +`:base_url => lambda { |*args| articles_path(*args) }`, but using +`method(:articles_path)` accomplishes the same thing with much less line +noise. + +I'm not sure it would have ever occurred to me to do something like this +before I got into Javascript. 
Just goes to show that if you want to get +better as a Rubyist, a great place to start is with a different language +entirely. diff --git a/content/elsewhere/curl-and-your-rails-2-app/index.md b/content/elsewhere/curl-and-your-rails-2-app/index.md new file mode 100644 index 0000000..5d11379 --- /dev/null +++ b/content/elsewhere/curl-and-your-rails-2-app/index.md @@ -0,0 +1,52 @@ +--- +title: "cURL and Your Rails 2 App" +date: 2008-03-28T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/curl-and-your-rails-2-app/ +--- + +If you're anything like me, you've used +[cURL](https://en.wikipedia.org/wiki/CURL) to download a batch of MP3 +files from the web, or to move a TAR file from one remote server to +another. It might come as a surprise, then, that cURL is a full-featured +HTTP client, which makes it perfect for interacting with RESTful web +services like the ones encouraged by Rails 2. To illustrate, let's +create a small Rails app called 'tv_show': + + rails tv_show cd tv_show script/generate scaffold character name:string action:string rake db:migrate script/server + +Fire up your web browser and create a few characters. Once you've done +that, open a new terminal window and try the following: + + curl http://localhost:3000/characters.xml + +You'll get a nice XML representation of your characters: + + 1 George Sr. goes to jail 2008-03-28T11:01:57-04:00 2008-03-28T11:01:57-04:00 2 Gob rides a Segway 2008-03-28T11:02:07-04:00 2008-03-28T11:02:12-04:00 3 Tobias wears cutoffs 2008-03-28T11:02:20-04:00 2008-03-28T11:02:20-04:00 + +You can retrieve the representation of a specific character by +specifying his ID in the URL: + + dce@roflcopter ~ > curl http://localhost:3000/characters/1.xml 1 George Sr. 
goes to jail 2008-03-28T11:01:57-04:00 2008-03-28T11:01:57-04:00 + +To create a new character, issue a POST request, use the -X flag to +specify the action, and the -d flag to define the request body: + + curl -X POST -d "character[name]=Lindsay&character[action]=does+nothing" http://localhost:3000/characters.xml + +Here's where things get interesting: unlike most web browsers, which +only support GET and POST, cURL supports the complete set of HTTP +actions. If we want to update one of our existing characters, we can +issue a PUT request to the URL of that character's representation, like +so: + + curl -X PUT -d "character[action]=works+at+clothing+store" http://localhost:3000/characters/4.xml + +If we want to delete a character, issue a DELETE request: + + curl -X DELETE http://localhost:3000/characters/1.xml + +For some more sophisticated uses of REST and Rails, check out +[rest-client](https://rest-client.heroku.com/rdoc/) and +[ActiveResource](http://ryandaigle.com/articles/2006/06/30/whats-new-in-edge-rails-activeresource-is-here). diff --git a/content/elsewhere/devnation-coming-to-san-francisco/index.md b/content/elsewhere/devnation-coming-to-san-francisco/index.md new file mode 100644 index 0000000..f3f0beb --- /dev/null +++ b/content/elsewhere/devnation-coming-to-san-francisco/index.md @@ -0,0 +1,23 @@ +--- +title: "DevNation Coming to San Francisco" +date: 2010-07-29T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/devnation-coming-to-san-francisco/ +--- + +On Saturday, August 14th, we're taking the +[DevNation](http://devnation.us/) tour across the country for our first +ever stop in the Bay Area. Our friends at [Engine +Yard](http://www.engineyard.com/) will be hosting us for a day of talks, +hacking, and discussion. 
The lineup is our finest to date, featuring, +among others, speakers from [Pivotal Labs](http://pivotallabs.com/), +[LinkedIn](http://linkedin.com/), [Basho](http://basho.com/), and +[Yahoo!](http://yahoo.com/) and capped off by a keynote from [Chris +Wanstrath](http://chriswanstrath.com/) +([defunkt](https://twitter.com/defunkt) of [GitHub](https://github.com/) +fame). As always, breakfast and lunch will be provided. + +If you're in the Bay Area, we'd love to meet you. Registration is only +\$50 if you sign up by this Saturday, so save your money for the happy hour +and [sign up now](http://devnation.us/events/9). diff --git a/content/elsewhere/diving-into-go-a-five-week-intro/index.md b/content/elsewhere/diving-into-go-a-five-week-intro/index.md new file mode 100644 index 0000000..0269a6c --- /dev/null +++ b/content/elsewhere/diving-into-go-a-five-week-intro/index.md @@ -0,0 +1,232 @@ +--- +title: "Diving into Go: A Five-Week Intro" +date: 2014-04-25T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/diving-into-go-a-five-week-intro/ +--- + +One of my favorite parts of being a developer here at Viget is our +[developer book club](https://viget.com/extend/confident-ruby-a-review). +We've read [some](http://www.confidentruby.com/) +[fantastic](http://www.poodr.com/) +[books](http://martinfowler.com/books/nosql.html), but for our most +recent go-round, we decided to try something different. A few of us have +been interested in the [Go programming language](https://golang.org/) +for some time, so we decided to combine two free online texts, [*An +Introduction to Programming in Go*](http://www.golang-book.com/) and +[*Go By Example*](https://gobyexample.com/), plus a few other resources, +into a short introduction to the language. +[Chris](https://viget.com/about/team/cjones) and +[Ryan](https://viget.com/about/team/rfoster) put together a curriculum +that I thought was too good not to share with the internet at large. 
+ +## Week 1 {#week1} + +Chapter 1: [Getting Started](http://www.golang-book.com/1) + +- Files and Folders +- The Terminal +- Text Editors +- Go Tools +- **Go By Example** + - [Hello World](https://gobyexample.com/hello-world) + +Chapter 2: [Your First Program](http://www.golang-book.com/2) + +- How to Read a Go Program + +Chapter 3: [Types](http://www.golang-book.com/3) + +- Numbers +- Strings +- Booleans +- **Go By Example** + - [Values](https://gobyexample.com/values) + - [Random Numbers](https://gobyexample.com/random-numbers) + - [String Functions](https://gobyexample.com/string-functions) + - [String Formatting](https://gobyexample.com/string-formatting) + - [Regular + Expressions](https://gobyexample.com/regular-expressions) + +Chapter 4: [Variables](http://www.golang-book.com/4) + +- How to Name a Variable +- Scope +- Constants +- Defining Multiple Variables +- An Example Program +- **Go By Example** + - [Variables](https://gobyexample.com/variables) + - [Constants](https://gobyexample.com/constants) + - [Number Parsing](https://gobyexample.com/number-parsing) + - [Time](https://gobyexample.com/time) + - [Epoch](https://gobyexample.com/epoch) + - [Time Formatting / + Parsing](https://gobyexample.com/time-formatting-parsing) + +Chapter 5: [Control Structures](http://www.golang-book.com/5) + +- For +- If +- Switch +- **Go By Example** + - [For](https://gobyexample.com/for) + - [If/Else](https://gobyexample.com/if-else) + - [Switch](https://gobyexample.com/switch) + - [Line Filters](https://gobyexample.com/line-filters) + +Chapter 6: [Arrays, Slices and Maps](http://www.golang-book.com/6) + +- Arrays +- Slices +- Maps +- **Go By Example** + - [Arrays](https://gobyexample.com/arrays) + - [Slices](https://gobyexample.com/slices) + - [Maps](https://gobyexample.com/maps) + - [Range](https://gobyexample.com/range) +- **Blog Posts** + - [Go Slices: usage and + internals](https://blog.golang.org/go-slices-usage-and-internals) + - [Arrays, Slices (and strings): The 
mechanics of + 'append'](https://blog.golang.org/slices) + +## Week 2 {#week2} + +Chapter 7: [Functions](http://www.golang-book.com/7) + +- Your Second Function +- Returning Multiple Values +- Variadic Functions +- Closure +- Recursion +- Defer, Panic & Recover +- **Go By Example** + - [Functions](https://gobyexample.com/functions) + - [Multiple Return + Values](https://gobyexample.com/multiple-return-values) + - [Variadic Functions](https://gobyexample.com/variadic-functions) + - [Closures](https://gobyexample.com/closures) + - [Recursion](https://gobyexample.com/recursion) + - [Panic](https://gobyexample.com/panic) + - [Defer](https://gobyexample.com/defer) + - [Collection + Functions](https://gobyexample.com/collection-functions) + +Chapter 8: [Pointers](http://www.golang-book.com/8) + +- The \* and & operators + - new +- **Go By Example** + - [Pointers](https://gobyexample.com/pointers) + - [Reading Files](https://gobyexample.com/reading-files) + - [Writing Files](https://gobyexample.com/writing-files) + +## Week 3 {#week3} + +Chapter 9: [Structs and Interfaces](http://www.golang-book.com/9) + +- Structs +- Methods +- Interfaces +- **Go By Example** + - [Structs](https://gobyexample.com/structs) + - [Methods](https://gobyexample.com/methods) + - [Interfaces](https://gobyexample.com/interfaces) + - [Errors](https://gobyexample.com/errors) + - [JSON](https://gobyexample.com/json) + +Chapter 10: [Concurrency](http://www.golang-book.com/10) + +- Goroutines +- Channels +- **Go By Example** + - [Goroutines](https://gobyexample.com/goroutines) + - [Channels](https://gobyexample.com/channels) + - [Channel Buffering](https://gobyexample.com/channel-buffering) + - [Channel + Synchronization](https://gobyexample.com/channel-synchronization) + - [Channel Directions](https://gobyexample.com/channel-directions) + - [Select](https://gobyexample.com/select) + - [Timeouts](https://gobyexample.com/timeouts) + - [Non-Blocking Channel + 
Operations](https://gobyexample.com/non-blocking-channel-operations) + - [Closing Channels](https://gobyexample.com/closing-channels) + - [Range over + Channels](https://gobyexample.com/range-over-channels) + - [Timers](https://gobyexample.com/timers) + - [Tickers](https://gobyexample.com/tickers) + - [Worker Pools](https://gobyexample.com/worker-pools) + - [Rate Limiting](https://gobyexample.com/rate-limiting) + +## Week 4 {#week4} + +- **Videos** + - [Lexical Scanning in + Go](https://www.youtube.com/watch?v=HxaD_trXwRE) + - [Concurrency is not + parallelism](https://blog.golang.org/concurrency-is-not-parallelism) +- Blog Posts + - [Share Memory By + Communicating](https://blog.golang.org/share-memory-by-communicating) + - [A GIF decoder: an exercise in Go + interfaces](https://blog.golang.org/gif-decoder-exercise-in-go-interfaces) + - [Error handling and + Go](https://blog.golang.org/error-handling-and-go) + - [Defer, Panic, and + Recover](https://blog.golang.org/defer-panic-and-recover) + +## Week 5 {#week5} + +Chapter 11: [Packages](http://www.golang-book.com/11) + +- Creating Packages +- Documentation + +Chapter 12: [Testing](http://www.golang-book.com/12) + +Chapter 13: [The Core Packages](http://www.golang-book.com/13) + +- Strings +- Input / Output +- Files & Folders +- Errors +- Containers & Sort +- Hashes & Cryptography +- Servers +- Parsing Command Line Arguments +- Synchronization Primitives +- **Go By Example** + - [Sorting](https://gobyexample.com/sorting) + - [Sorting by + Functions](https://gobyexample.com/sorting-by-functions) + - [URL Parsing](https://gobyexample.com/url-parsing) + - [SHA1 Hashes](https://gobyexample.com/sha1-hashes) + - [Base64 Encoding](https://gobyexample.com/base64-encoding) + - [Atomic Counters](https://gobyexample.com/atomic-counters) + - [Mutexes](https://gobyexample.com/mutexes) + - [Stateful + Goroutines](https://gobyexample.com/stateful-goroutines) + - [Command-Line + 
Arguments](https://gobyexample.com/command-line-arguments) + - [Command-Line Flags](https://gobyexample.com/command-line-flags) + - [Environment + Variables](https://gobyexample.com/environment-variables) + - [Spawning Processes](https://gobyexample.com/spawning-processes) + - [Exec'ing Processes](https://gobyexample.com/execing-processes) + - [Signals](https://gobyexample.com/signals) + - [Exit](https://gobyexample.com/exit) + +Chapter 14: [Next Steps](http://www.golang-book.com/14) + +- Study the Masters +- Make Something +- Team Up + +\* \* \* + +Go is an exciting language, and a great complement to the Ruby work we +do. Working through this program was a fantastic intro to the language +and prepared us to create our own Go programs for great justice. Give it +a shot and let us know how it goes. diff --git a/content/elsewhere/email-photos-to-an-s3-bucket-with-aws-lambda-with-cropping-in-ruby/index.md b/content/elsewhere/email-photos-to-an-s3-bucket-with-aws-lambda-with-cropping-in-ruby/index.md new file mode 100644 index 0000000..fb40119 --- /dev/null +++ b/content/elsewhere/email-photos-to-an-s3-bucket-with-aws-lambda-with-cropping-in-ruby/index.md @@ -0,0 +1,231 @@ +--- +title: "Email Photos to an S3 Bucket with AWS Lambda (with Cropping, in Ruby)" +date: 2021-04-07T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/email-photos-to-an-s3-bucket-with-aws-lambda-with-cropping-in-ruby/ +--- + +In my annual search for holiday gifts, I came across this [digital photo +frame](https://auraframes.com/digital-frames/color/graphite) that lets +you load photos via email. Pretty neat, but I ultimately didn\'t buy it +for a few reasons: 1) it\'s pretty expensive, 2) I\'d be trusting my +family\'s data to an unknown entity, and 3) if the company ever goes +under or just decides to stop supporting the product, it might stop +working or at least stop updating. But I got to thinking, could I build +something like this myself? 
I\'ll save the full details for a later +article, but the first thing I needed to figure out was how to get +photos from an email into an S3 bucket that could be synced onto a +device. + +I try to keep up with the various AWS offerings, and Lambda has been on +my radar for a few years, but I haven\'t had the opportunity to use it +in anger. Services like this really excel at the extremes of web +software --- at the low end, where you don\'t want to incur the costs of +an always-on server, and at the high end, where you don\'t want to pay +for a whole fleet of them. Most of our work falls in the middle, where +developer time is way more costly than hosting infrastructure and so +using a more full-featured stack running on a handful of conventional +servers is usually the best option. But an email-to-S3 gateway is a +perfect use case for on-demand computing. + +[]{#the-services} + +## The Services + +To make this work, we need to connect several AWS services: + +- [Route 53](https://aws.amazon.com/route53/) (for domain registration + and DNS configuration) +- [SES](https://aws.amazon.com/ses/) (for setting up the email address + and \"rule set\" that triggers the Lambda function) +- [S3](https://aws.amazon.com/s3/) (for storing the contents of the + incoming emails as well as the resulting photos) +- [SNS](https://aws.amazon.com/sns/) (for notifying the Lambda + function of an incoming email) +- [Lambda](https://aws.amazon.com/lambda) (to process the incoming + email, extract the photos, crop them, and store the results) +- [CloudWatch](https://aws.amazon.com/cloudwatch) (for debugging + issues with the code) +- [IAM](https://aws.amazon.com/iam) (for setting the appropriate + permissions) + +It\'s a lot, to be sure, but it comes together pretty easily: + +1. Create a couple buckets in S3, one to hold emails, the other to hold + photos. +2. 
Register a domain (\"hosted zone\") in Route 53. +3. Go to Simple Email Service \> Domains and verify a new domain, + selecting the domain you just registered in Route 53. +4. Go to the SES \"rule sets\" interface and click \"Create Rule.\" + Give it a name and an email address you want to send your photos to. +5. For the rule action, pick \"S3\" and then the email bucket you + created in step 1 (we have to use S3 rather than just calling the + Lambda function directly because our emails exceed the maximum + payload size). Make sure to add an SNS (Simple Notification Service) + topic to go along with your S3 action, which is how we\'ll trigger + our Lambda function. +6. Go to the Lambda interface and create a new function. Give it a name + that makes sense for you and pick Ruby 2.7 as the language. +7. With your skeleton function created, click \"Add Trigger\" and + select the SNS topic you created in step 5. You\'ll need to add + ImageMagick as a layer[^1^](#fn1){#fnref1 .footnote-ref + role="doc-noteref"} and bump the memory and timeout (I used 512 MB + and 30 seconds, respectively, but you should use whatever makes you + feel good in your heart). +8. Create a couple environment variables: `BUCKET` should be the name of + the S3 bucket you want to upload photos to, and `AUTHORIZED_EMAILS` + should hold all the valid email addresses separated by semicolons. +9. Give your function permissions to read and write to/from the two + buckets. +10. And finally, the code. We\'ll manage that locally rather than using + the web-based interface since we need to include a couple gems. 
+ +[]{#the-code} + +## The Code + +So as I said literally one sentence ago, we manage the code for this +Lambda function locally since we need to include a couple gems: +[`mail`](https://github.com/mikel/mail) to parse the emails stored in S3 +and [`mini_magick`](https://github.com/minimagick/minimagick) to do the +cropping. If you don\'t need cropping, feel free to leave that one out +and update the code accordingly. Without further ado: + +``` {.code-block .line-numbers} +require 'json' +require 'aws-sdk-s3' +require 'mail' +require 'mini_magick' + +BUCKET = ENV["BUCKET"] +AUTHORIZED_EMAILS = ENV["AUTHORIZED_EMAILS"].split(";") + +def lambda_handler(event:, context:) + message = JSON.parse(event["Records"][0]["Sns"]["Message"]) + s3_info = message["receipt"]["action"] + client = Aws::S3::Client.new(region: "us-east-1") + + # Get the incoming email from S3 + object = client.get_object( + bucket: s3_info["bucketName"], + key: s3_info["objectKey"] + ) + + email = Mail.new(object.body.read) + sender = email.from.first + + # Confirm that the sender is in the list, otherwise abort + unless AUTHORIZED_EMAILS.include?(sender) + puts "Unauthorized email: #{sender}" + exit + end + + # Get all the images out of the email + attachments = email.parts.filter { |p| p.content_type =~ /^image/ } + + attachments.each do |attachment| + # First, just put the original photo in the `photos` subdirectory + client.put_object( + body: attachment.body.to_s, + bucket: BUCKET, + key: "photos/#{attachment.filename}" + ) + + thumb = MiniMagick::Image.read(attachment.body.to_s) + + # Crop the photo down for displaying on a webpage + thumb.combine_options do |i| + i.auto_orient + i.resize "440x264^" + i.gravity "center" + i.extent "440x264" + end + + client.put_object( + body: thumb.to_blob, + bucket: BUCKET, + key: "thumbs/#{attachment.filename}" + ) + + dithered = 
MiniMagick::Image.read(attachment.body.to_s) + + # Crop and dither the photo for displaying on an e-ink screen + dithered.combine_options do |i| + i.auto_orient + i.resize "880x528^" + i.gravity "center" + i.extent "880x528" + i.ordered_dither "o8x8" + i.monochrome + end + + client.put_object( + body: dithered.to_blob, + bucket: BUCKET, + key: "dithered/#{attachment.filename}" + ) + + puts "Photo '#{attachment.filename}' uploaded" + end + + { + statusCode: 200, + body: JSON.generate("#{attachments.size} photo(s) uploaded.") + } +end +``` + +If you\'re unfamiliar with dithering, [here\'s a great +post](https://surma.dev/things/ditherpunk/) with more info, but in +short, it\'s a way to simulate grayscale with only black and white +pixels like what you find on an e-ink/e-paper display. + +[]{#deploying} + +## Deploying + +To deploy your code, you\'ll use the [AWS +CLI](https://aws.amazon.com/cli/). [Here\'s a pretty good +walkthrough](https://docs.aws.amazon.com/lambda/latest/dg/ruby-package.html) +of how to do it, but I\'ll summarize: + +1. Install your gems locally with + `bundle install --path vendor/bundle`. +2. Edit your code (in our case, it lives in `lambda_function.rb`). +3. Make a simple shell script that zips up your function and gems and + sends it up to AWS: + +``` {.code-block .line-numbers} +#!/bin/sh + +zip -r function.zip lambda_function.rb vendor \ + && aws lambda update-function-code \ + --function-name [lambda-function-name] \ + --zip-file fileb://function.zip +``` + +And that\'s it! A simple, resilient, cheap way to email photos into an +S3 bucket with no servers in sight (at least none you care about or have +to manage). + +------------------------------------------------------------------------ + +In closing, this project was a great way to get familiar with Lambda and +the wider AWS ecosystem. 
It came together in just a few hours and is +still going strong several months later. My typical bill is something on +the order of \$0.50 per month. If anything goes wrong, I can pop into +CloudWatch to view the result of the function, but so far, [so +smooth](https://static.viget.com/DP823L7XkAIJ_xK.jpg). + +I\'ll be back in a few weeks detailing the rest of the project. Stay +tuned! + + +------------------------------------------------------------------------ + +1. ::: {#fn1} + I used the ARN + `arn:aws:lambda:us-east-1:182378087270:layer:image-magick:1`[↩︎](#fnref1){.footnote-back + role="doc-backlink"} + ::: diff --git a/content/elsewhere/extract-embedded-text-from-pdfs-with-poppler-in-ruby/index.md b/content/elsewhere/extract-embedded-text-from-pdfs-with-poppler-in-ruby/index.md new file mode 100644 index 0000000..4313227 --- /dev/null +++ b/content/elsewhere/extract-embedded-text-from-pdfs-with-poppler-in-ruby/index.md @@ -0,0 +1,86 @@ +--- +title: "Extract Embedded Text from PDFs with Poppler in Ruby" +date: 2022-02-10T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/extract-embedded-text-from-pdfs-with-poppler-in-ruby/ +--- + +A recent client request had us adding an archive of magazine issues +dating back to the 1980s. Pretty straightforward stuff, with the hiccup +that they wanted the magazine content to be searchable. Fortunately, the +example PDFs they provided us had embedded text +content[^1^](#fn1){#fnref1 .footnote-ref role="doc-noteref"}, i.e. the +text was selectable. The trick was to figure out how to programmatically +extract that content. + +Our first attempt involved the [`pdf-reader` +gem](https://rubygems.org/gems/pdf-reader/versions/2.2.1), which worked +admirably with the caveat that it had a little bit of trouble with +multi-column / art-directed layouts[^2^](#fn2){#fnref2 .footnote-ref +role="doc-noteref"}, which was a lot of the content we were dealing +with. 
+ +A bit of research uncovered [Poppler](https://poppler.freedesktop.org/), +"a free software utility library for rendering Portable Document Format +(PDF) documents," which includes text extraction functionality and has a +corresponding [Ruby +library](https://rubygems.org/gems/poppler/versions/3.4.9). This worked +great, and here's how to do it. + +## Install Poppler + +Poppler installs as a standalone library. On Mac: + + brew install poppler + +On (Debian-based) Linux: + + apt-get install libgirepository1.0-dev libpoppler-glib-dev + +In a (Debian-based) Dockerfile: + + RUN apt-get update && \ + apt-get install -y libgirepository1.0-dev libpoppler-glib-dev && \ + rm -rf /var/lib/apt/lists/* + +Then, in your `Gemfile`: + + gem "poppler" + +## Use it in your application + +Extracting text from a PDF document is super straightforward: + + document = Poppler::Document.new(path_to_pdf) + document.map { |page| page.get_text }.join + +The results are really good, and Poppler understands complex page +layouts to an impressive degree. Additionally, the library seems to +support a lot more [advanced +functionality](https://www.rubydoc.info/gems/poppler/3.4.9). If you ever +need to extract text from a PDF, Poppler is a good choice. + +[*John Popper photo by Gage Skidmore, CC BY-SA +3.0*](https://commons.wikimedia.org/w/index.php?curid=39946499) + + +------------------------------------------------------------------------ + +1. [Note that we're not talking about extracting text from images/OCR; + if you need to take an image-based PDF and add a selectable text + layer to it, I recommend + [OCRmyPDF](https://pypi.org/project/ocrmypdf/). + [↩︎](#fnref1){.footnote-back role="doc-backlink"}]{#fn1} + +2. 
[So for a page like this:]{#fn2} + + +-----------------+---------------------+ + | This is a story | my life got flipped | + | all about how | turned upside-down | + +-----------------+---------------------+ + + `pdf-reader` would parse this into "This is a story my life got + flipped all about how turned upside-down," which led to issues when + searching for multi-word phrases. [↩︎](#fnref2){.footnote-back + role="doc-backlink"} diff --git a/content/elsewhere/first-class-failure/index.md b/content/elsewhere/first-class-failure/index.md new file mode 100644 index 0000000..9e3df8e --- /dev/null +++ b/content/elsewhere/first-class-failure/index.md @@ -0,0 +1,131 @@ +--- +title: "First-Class Failure" +date: 2014-07-22T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/first-class-failure/ +--- + +As a developer, nothing makes me more nervous than third-party +dependencies and things that can fail in unpredictable +ways^[1](#fn:1 "see footnote"){#fnref:1 .footnote}^. More often +than not, these two go hand-in-hand, taking our elegant, robust +applications and dragging them down to the lowest common denominator of +the services they depend upon. A recent internal project called for +slurping in and then reporting against data from +[Harvest](http://www.getharvest.com/), our time tracking service of +choice and a fickle beast on its very best days. + +I knew that both components (`/(im|re)porting/`) were prone to failure. +How to handle that failure in a graceful way, so that our users see +something more meaningful than a 500 page, and our developers have a +fighting chance at tracking and fixing the problem? Here's the approach +we took. + +## Step 1: Model the processes {#step1:modeltheprocesses} + +Rather than importing the data or generating the report with procedural +code, create ActiveRecord models for them. In our case, the models are +`HarvestImport` and `Report`. 
When a user initiates a data import or a +report generation, save a new record to the database *immediately*, +before doing any work. + +## Step 2: Give 'em status {#step2:giveemstatus} + +These models have a `status` column. We default it to "queued," since we +offload most of the work to a series of [Resque](http://resquework.org/) +tasks, but you can use "pending" or somesuch if that's more your speed. +They also have an `error` field for reasons that will become apparent +shortly. + +## Step 3: Define an interface {#step3:defineaninterface} + +Into both of these models, we include the following module: + + module ProcessingStatus + def mark_processing + update_attributes(status: "processing") + end + + def mark_successful + update_attributes(status: "success", error: nil) + end + + def mark_failure(error) + update_attributes(status: "failed", error: error.to_s) + end + + def process(cleanup = nil) + mark_processing + yield + mark_successful + rescue => ex + mark_failure(ex) + ensure + cleanup.try(:call) + end + end + +Lines 2--12 should be self-explanatory: methods for setting the object's +status. The `mark_failure` method takes an exception object, which it +stores in the model's `error` field, and `mark_successful` clears said +error. + +Line 14 (the `process` method) is where things get interesting. Calling +this method immediately marks the object "processing," and then yields +to the provided block. If the block executes without error, the object +is marked "success." If any^[2](#fn:2 "see footnote"){#fnref:2 +.footnote}^ exception is thrown, the object is marked "failed" and the +error message is stored. Either way, if a `cleanup` lambda is provided, +we call it (courtesy of Ruby's +[`ensure`](http://ruby.activeventure.com/usersguide/rg/ensure.html) +keyword). + +## Step 4: Wrap it up {#step4:wrapitup} + +Now we can wrap our nasty, fail-prone reporting code in a `process` call +for great justice. 
+ + class ReportGenerator + attr_accessor :report + + def generate_report + report.process -> { File.delete(file_path) } do + # do some fail-prone work + end + end + + # ... + end + +The benefits are almost too numerous to count: 1) no 500 pages, 2) +meaningful feedback for users, and 3) super detailed diagnostic info for +developers -- better than something like +[Honeybadger](https://www.honeybadger.io/), which doesn't provide nearly +the same level of context. (`-> { File.delete(file_path) }` is just a +little bit of file cleanup that should happen regardless of outcome.) + +\* \* \* + +I've always found it an exercise in futility to try to predict all the +ways a system can fail when integrating with an external dependency. +Being able to blanket rescue any exception and store it in a way that's +meaningful to users *and* developers has been hugely liberating and has +contributed to a seriously robust platform. This technique may not be +applicable in every case, but when it fits, [it's +good](https://www.youtube.com/watch?v=HNfciDzZTNM&t=1m40s). + + +------------------------------------------------------------------------ + +1. ::: {#fn:1} + Well, [almost + nothing](https://github.com/github/hubot/blob/master/src/scripts/google-images.coffee#L5). + [ ↩](#fnref:1 "return to article"){.reversefootnote} + ::: + +2. ::: {#fn:2} + [Any descendent of + `StandardError`](http://stackoverflow.com/a/10048406), in any event. 
+ [ ↩](#fnref:2 "return to article"){.reversefootnote} + ::: diff --git a/content/elsewhere/five-turbo-lessons-i-learned-the-hard-way/index.md b/content/elsewhere/five-turbo-lessons-i-learned-the-hard-way/index.md new file mode 100644 index 0000000..2fecf17 --- /dev/null +++ b/content/elsewhere/five-turbo-lessons-i-learned-the-hard-way/index.md @@ -0,0 +1,122 @@ +--- +title: "Five Turbo Lessons I Learned the Hard Way" +date: 2021-08-02T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/five-turbo-lessons-i-learned-the-hard-way/ +--- + +We\'ve been using [Turbo](https://turbo.hotwired.dev/) on our latest +client project (a Ruby on Rails web application), and after a slight +learning curve, we\'ve been super impressed by how much dynamic behavior +it\'s allowed us to add while writing very little code. We have hit some +gotchas (or at least some undocumented behavior), often with solutions +that lie deep in GitHub issue threads. Here are a few of the things +we\'ve discovered along our Turbo journey. + +[]{#turbo-stream-fragments-are-server-responses} + +### Turbo Stream fragments are server responses (and you don\'t have to write them by hand) [\#](#turbo-stream-fragments-are-server-responses "Direct link to Turbo Stream fragments are server responses (and you don't have to write them by hand)"){.anchor aria-label="Direct link to Turbo Stream fragments are server responses (and you don't have to write them by hand)"} + +[The docs on Turbo Streams](https://turbo.hotwired.dev/handbook/streams) +kind of bury the lede. They start out with the markup to update the +client, and only [further +down](https://turbo.hotwired.dev/handbook/streams#streaming-from-http-responses) +illustrate how to use them in a Rails app. Here\'s the thing: you don\'t +really need to write any stream markup at all. It\'s (IMHO) cleaner to +just use the built-in Rails methods, i.e. 
+ + render turbo_stream: turbo_stream.update("flash", partial: "shared/flash") + +And though [DHH would +disagree](https://github.com/hotwired/turbo-rails/issues/77#issuecomment-757349251), +you can use an array to make multiple updates to the page. + +[]{#send-unprocessable-entity-to-re-render-a-form-with-errors} + +### Send `:unprocessable_entity` to re-render a form with errors [\#](#send-unprocessable-entity-to-re-render-a-form-with-errors "Direct link to Send :unprocessable_entity to re-render a form with errors"){.anchor aria-label="Direct link to Send :unprocessable_entity to re-render a form with errors"} + +For create/update actions, we follow the usual pattern of redirect on +success, re-render the form on error. Once you enable Turbo, however, +that direct rendering stops working. The solution is to [return a 422 +status](https://github.com/hotwired/turbo-rails/issues/12), though we +prefer the `:unprocessable_entity` alias (so like +`render :new, status: :unprocessable_entity`). This seems to work well +with and without JavaScript and inside or outside of a Turbo frame. + +[]{#use-data-turbo-false-to-break-out-of-a-frame} + +### Use `data-turbo="false"` to break out of a frame [\#](#use-data-turbo-false-to-break-out-of-a-frame "Direct link to Use data-turbo="false" to break out of a frame"){.anchor aria-label="Direct link to Use data-turbo=\"false\" to break out of a frame"} + +If you have a link inside of a frame that you want to bypass the default +Turbo behavior and trigger a full page reload, [include the +`data-turbo="false"` +attribute](https://github.com/hotwired/turbo/issues/45#issuecomment-753444256) +(or use `data: { turbo: false }` in your helper). 
+ +*Update from good guy [Leo](https://www.viget.com/about/team/lbauza/): +you can also use +[`target="_top"`](https://turbo.hotwired.dev/handbook/frames#targeting-navigation-into-or-out-of-a-frame) +to load all the content from the response without doing a full page +reload, which seems (to me, David) what you typically want except under +specific circumstances.* + +[]{#use-requestSubmit-to-trigger-a-turbo-form-submission-via-javaScript} + +### Use `requestSubmit()` to trigger a Turbo form submission via JavaScript [\#](#use-requestSubmit-to-trigger-a-turbo-form-submission-via-javaScript "Direct link to Use requestSubmit() to trigger a Turbo form submission via JavaScript"){.anchor aria-label="Direct link to Use requestSubmit() to trigger a Turbo form submission via JavaScript"} + +If you have some JavaScript (say in a Stimulus controller) that you want +to trigger a form submission with a Turbo response, you can\'t use the +usual `submit()` method. [This discussion +thread](https://discuss.hotwired.dev/t/triggering-turbo-frame-with-js/1622/15) +sums it up well: + +> It turns out that the turbo-stream mechanism listens for form +> submission events, and for some reason the submit() function does not +> emit a form submission event. That means that it'll bring back a +> normal HTML response. That said, it looks like there's another method, +> requestSubmit() which does issue a submit event. Weird stuff from +> JavaScript land. + +So, yeah, use `requestSubmit()` (i.e. `this.formTarget.requestSubmit()`) +and you\'re golden (except in Safari, where you might need [this +polyfill](https://github.com/javan/form-request-submit-polyfill)). 
+ +[]{#loading-the-same-url-multiple-times-in-a-turbo-frame} + +### Loading the same URL multiple times in a Turbo Frame [\#](#loading-the-same-url-multiple-times-in-a-turbo-frame "Direct link to Loading the same URL multiple times in a Turbo Frame"){.anchor aria-label="Direct link to Loading the same URL multiple times in a Turbo Frame"} + +I hit an interesting issue with a form inside a frame: in a listing of +comments, I set it up where you could click an edit link, and the +content would be swapped out for an edit form using a Turbo Frame. +Update and save your comment, and the new content would render. Issue +was, if you hit the edit link *again*, nothing would happen. Turns out, +a Turbo frame won't reload a URL if it thinks it already has the +contents of that URL (which it tracks in a `src` attribute). + +The [solution I +found](https://github.com/hotwired/turbo/issues/245#issuecomment-847711320) +was to append a timestamp to the URL to ensure it\'s always unique. +Works like a charm. + +*Update from good guy +[Joshua](https://www.viget.com/about/team/jpease/): this has been fixed +in a [recent +update](https://github.com/hotwired/turbo/releases/tag/v7.0.0-beta.7).*
+ +These small issues aside, Turbo has been a BLAST to work with and has +allowed us to easily build a highly dynamic app that works surprisingly +well even with JavaScript disabled. We\'re excited to see how this +technology develops.
diff --git a/content/elsewhere/friends-undirected-graph-connections-in-rails/index.md b/content/elsewhere/friends-undirected-graph-connections-in-rails/index.md new file mode 100644 index 0000000..16e24f1 --- /dev/null +++ b/content/elsewhere/friends-undirected-graph-connections-in-rails/index.md @@ -0,0 +1,151 @@ +--- +title: "“Friends” (Undirected Graph Connections) in Rails" +date: 2021-06-09T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/friends-undirected-graph-connections-in-rails/ +--- + +No, sorry, not THOSE friends. But if you\'re interested in how to do +some graph stuff in a relational database, SMASH that play button and +read on. + +My current project is a social network of sorts, and includes the +ability for users to connect with one another. I\'ve built this +functionality once or twice before, but I\'ve never come up with a +database implementation I was perfectly happy with. This type of +relationship is perfect for a [graph +database](https://en.wikipedia.org/wiki/Graph_database), but we\'re +using a relational database and introducing a second data store +wouldn\'t be worth the overhead. + +The most straightforward implementation would involve a join model +(`Connection` or somesuch) with two foreign key columns pointed at the +same table (`users` in our case). When you want to pull back a user\'s +contacts, you\'d have to query against both foreign keys, and then pull +back the opposite key to retrieve the list. Alternately, you could store +connections in both directions and hope that your application code +always inserts the connections in pairs (spoiler: at some point, it +won\'t). + +But what if there was a better way? 
I stumbled on [this article that +talks through the problem in +depth](https://inviqa.com/blog/storing-graphs-database-sql-meets-social-network), +and it led me down the path of using an SQL view and the +[`UNION`](https://www.postgresqltutorial.com/postgresql-union/) +operator, and the result came together really nicely. Let\'s walk +through it step-by-step. + +First, we\'ll model the connection between two users: + +``` {.code-block .line-numbers} +class CreateConnections < ActiveRecord::Migration[6.1] + def change + create_table :connections do |t| + t.references :sender, null: false + t.references :receiver, null: false + + t.timestamps + end + + add_foreign_key :connections, :users, column: :sender_id, on_delete: :cascade + add_foreign_key :connections, :users, column: :receiver_id, on_delete: :cascade + + add_index :connections, + "(ARRAY[least(sender_id, receiver_id), greatest(sender_id, receiver_id)])", + unique: true, + name: :connection_pair_uniq + end +end +``` + +I chose to call the foreign keys `sender` and `receiver`, not that I +particularly care who initiated the connection, but it seemed better +than `user_1` and `user_2`. Notice the index, which ensures that a +sender/receiver pair is unique *in both directions* (so if a connection +already exists where Alice is the sender and Bob is the receiver, we +can\'t insert a connection where the roles are reversed). Apparently +Rails has supported [expression-based +indices](https://bigbinary.com/blog/rails-5-adds-support-for-expression-indexes-for-postgresql) +since version 5. Who knew! + +With connections modeled in our database, let\'s set up the +relationships between user and connection. 
In `connection.rb`: + + belongs_to :sender, class_name: "User" + belongs_to :receiver, class_name: "User" + +In `user.rb`: + + has_many :sent_connections, + class_name: "Connection", + foreign_key: :sender_id + has_many :received_connections, + class_name: "Connection", + foreign_key: :receiver_id + +Next, we\'ll turn to the +[Scenic](https://github.com/scenic-views/scenic) gem to create a +database view that normalizes sender/receiver into user/contact. Install +the gem, then run `rails generate scenic:model user_contacts`. That\'ll +create a file called `db/views/user_contacts_v01.sql`, where we\'ll put +the following: + + SELECT sender_id AS user_id, receiver_id AS contact_id + FROM connections + UNION + SELECT receiver_id AS user_id, sender_id AS contact_id + FROM connections; + +Basically, we\'re using the `UNION` operator to merge two queries +together (reversing sender and receiver), then making the result +queryable via a virtual table called `user_contacts`. + +Finally, we\'ll add the contact relationships. In `user_contact.rb`: + + belongs_to :user + belongs_to :contact, class_name: "User" + +And in `user.rb`, right below the +`sent_connections`/`received_connections` stuff: + + has_many :user_contacts + has_many :contacts, through: :user_contacts + +And that\'s it! You\'ll probably want to write some validations and unit +tests but I can\'t give away all my tricks (or all of my client\'s +code). + +Here\'s our friendship system in action: + +``` {.code-block .line-numbers} +[1] pry(main)> u1, u2 = User.first, User.last +=> [#, #] +[2] pry(main)> u1.sent_connections.create(receiver: u2) +=> # +[3] pry(main)> UserContact.all +=> [#, + #] +[4] pry(main)> u1.contacts +=> [#] +[5] pry(main)> u2.contacts +=> [#] +[6] pry(main)> # they're lobsters +[7] pry(main)> +``` + +So there it is, a simple, easily queryable vertex/edge implementation in +a vanilla Rails app. I hope you have a great day, week, month, and even +year. 
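If you want to poke at the idea without a database handy, here is a rough plain-Ruby sketch of what the unique index and the `user_contacts` view are doing. None of this comes from the project: the `Connection` struct and the `pair_key`, `user_contacts`, and `contact_ids` helpers are hypothetical stand-ins, and the real work is still done by Postgres.

```ruby
# Plain-Ruby sketch of the index and view logic; no database involved.
# The Connection struct and all three helpers are hypothetical stand-ins.
Connection = Struct.new(:sender_id, :receiver_id)

# Mirrors the expression index: order the pair so that (1, 2) and (2, 1)
# collapse to the same key.
def pair_key(connection)
  [connection.sender_id, connection.receiver_id].minmax
end

# Mirrors the UNION view: every connection yields a row in both
# directions, and duplicate rows are dropped, as UNION does.
def user_contacts(connections)
  forward = connections.map { |c| [c.sender_id, c.receiver_id] }
  reverse = connections.map { |c| [c.receiver_id, c.sender_id] }
  (forward + reverse).uniq
end

# The contact ids for one user, read from the (user_id, contact_id) rows.
def contact_ids(connections, user_id)
  user_contacts(connections)
    .select { |row_user_id, _contact_id| row_user_id == user_id }
    .map(&:last)
end
```

With `Connection.new(1, 2)` and `Connection.new(3, 1)` in the list, user 1 ends up with contacts 2 and 3 no matter who sent which request, which is the whole point of the view.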
+ +------------------------------------------------------------------------ + +[Network Diagram Vectors by +Vecteezy](https://www.vecteezy.com/free-vector/network-diagram) + +[*\"I\'ll Be There for You\" (Theme from +Friends)*](https://archive.org/details/tvtunes_31736) © 1995 The +Rembrandts diff --git a/content/elsewhere/functional-programming-in-ruby-with-contracts/index.md b/content/elsewhere/functional-programming-in-ruby-with-contracts/index.md new file mode 100644 index 0000000..bbb555c --- /dev/null +++ b/content/elsewhere/functional-programming-in-ruby-with-contracts/index.md @@ -0,0 +1,121 @@ +--- +title: "Functional Programming in Ruby with Contracts" +date: 2015-03-31T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/functional-programming-in-ruby-with-contracts/ +--- + +I read Thomas Reynolds' [*My Weird +Ruby*](http://awardwinningfjords.com/2015/03/03/my-weird-ruby.html) a +week or two ago, and I **loved** it. I'd never heard of the +[Contracts](https://github.com/egonSchiele/contracts.ruby) gem, but +after reading the post and the [well-written +docs](http://egonschiele.github.io/contracts.ruby/), I couldn't wait to +try it out. I'd been doing some functional programming as part of our +ongoing programming challenge series, and saw an opportunity to use +Contracts to rewrite my Ruby solution to the [One-Time +Pad](https://viget.com/extend/otp-a-language-agnostic-programming-challenge) +problem. 
Check out my [rewritten `encrypt` +program](https://github.com/vigetlabs/otp/blob/master/languages/Ruby/encrypt): + + #!/usr/bin/env ruby + + require "contracts" + include Contracts + + Char = -> (c) { c.is_a?(String) && c.length == 1 } + Cycle = Enumerator::Lazy + + Contract [Char, Char] => Num + def int_of_hex_chars(chars) + chars.join.to_i(16) + end + + Contract ArrayOf[Num] => String + def hex_string_of_ints(nums) + nums.map { |n| n.to_s(16) }.join + end + + Contract Cycle => Num + def get_mask(key) + int_of_hex_chars key.first(2) + end + + Contract [], Cycle => [] + def encrypt(plaintext, key) + [] + end + + Contract ArrayOf[Char], Cycle => ArrayOf[Num] + def encrypt(plaintext, key) + char = plaintext.first.ord ^ get_mask(key) + [char] + encrypt(plaintext.drop(1), key.drop(2)) + end + + plaintext = STDIN.read.chars + key = ARGV.last.chars.cycle.lazy + + print hex_string_of_ints(encrypt(plaintext, key)) + +Pretty cool, yeah? Compare with this [Haskell +solution](https://github.com/vigetlabs/otp/blob/master/languages/Haskell/encrypt.hs). +Some highlights: + +### Typechecking + +At its most basic, Contracts offers typechecking on function input and +output. Give it the expected classes of the arguments and the return +value, and you'll get a nicely formatted error message if the function +is called with something else, or returns something else. + +### Custom types with lambdas {#customtypeswithlambdas} + +Ruby has no concept of a single character data type -- running +`"string".chars` returns an array of single-character strings. We can +simulate a native char type using a lambda, as seen on line #6, which +says that the argument must be a string and must have a length of one. + +### Tuples + +If you're expecting an array of a specific length and type, you can +specify it, as I've done on line #9. 
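To make the last two sections concrete, here is a plain-Ruby sketch of roughly what the `Char` lambda and the `[Char, Char]` tuple contract check at call time. It deliberately does not use the Contracts gem or claim to mirror its internals; `tuple_of?` is a hypothetical helper made up for illustration, and the hand-written guard stands in for the `Contract` line.

```ruby
# Plain-Ruby sketch of the checks described above. It does NOT use the
# Contracts gem; `tuple_of?` is a hypothetical helper for illustration.

# A "char" is a String of length one, as in the article's `Char` lambda.
Char = ->(c) { c.is_a?(String) && c.length == 1 }

# True when `value` is an Array with exactly one element per check and
# every element satisfies its check. `===` calls a lambda and tests
# class membership for a Class, so both kinds of check work.
def tuple_of?(value, *checks)
  value.is_a?(Array) &&
    value.length == checks.length &&
    checks.zip(value).all? { |check, item| check === item }
end

# Same body as the article's method, guarded by hand instead of by a contract.
def int_of_hex_chars(chars)
  unless tuple_of?(chars, Char, Char)
    raise ArgumentError, "expected [Char, Char], got #{chars.inspect}"
  end

  chars.join.to_i(16)
end
```

Calling `int_of_hex_chars(["f", "f"])` returns `255`, while `int_of_hex_chars(["f"])` or `int_of_hex_chars(["ff", "0"])` raises an `ArgumentError`. The gem gives you the same guarantee declaratively, plus a check on the return value.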
+ +### Pattern matching {#patternmatching} + +Rather than one `encrypt` method with a conditional to see if the list +is empty, we define the method twice: once for the base case (line #24) +and once for the recursive case (line #29). This keeps our functions +concise and allows us to do case-specific typechecking on the output. + +### No unexpected `nil` {#nounexpectednil} + +There's nothing worse than `undefined method 'foo' for nil:NilClass`, +except maybe littering your methods with presence checks. Using +Contracts, you can be sure that your functions aren't being called with +`nil`. If it happens that `nil` is an acceptable input to your function, +use `Maybe[Type]` à la Haskell. + +### Lazy, circular lists {#lazycircularlists} + +Unrelated to Contracts, but similarly inspired by *My Weird Ruby*, check +out the rotating encryption key made with +[`cycle`](http://ruby-doc.org/core-2.1.0/Enumerable.html#method-i-cycle) +and +[`lazy`](http://ruby-doc.org/core-2.1.0/Enumerable.html#method-i-lazy) +on line #36. + +\* \* \* + +As a professional Ruby developer with an interest in strongly typed +functional languages, I'm totally psyched to start using Contracts on my +projects. While you don't get the benefits of compile-time checking, you +do get cleaner functions, better implicit documentation, and more +overall confidence about your code. + +And even if Contracts or FP aren't your thing, from a broader +perspective, this demonstrates that **experimenting with other +programming paradigms makes you a better programmer in your primary +language.** It was so easy to see the utility and application of +Contracts while reading *My Weird Ruby*, which would not have been the +case had I not spent time with Haskell, OCaml, and Elixir. 
diff --git a/content/elsewhere/get-lazy-with-custom-enumerators/index.md b/content/elsewhere/get-lazy-with-custom-enumerators/index.md new file mode 100644 index 0000000..f721d3f --- /dev/null +++ b/content/elsewhere/get-lazy-with-custom-enumerators/index.md @@ -0,0 +1,78 @@ +--- +title: "Get Lazy with Custom Enumerators" +date: 2015-09-28T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/get-lazy-with-custom-enumerators/ +--- + +Ruby 2.0 added the ability to create [custom +enumerators](http://ruby-doc.org/core-2.2.0/Enumerator.html#method-c-new) +and they are +[bad](https://themoviegourmet.files.wordpress.com/2010/07/machete1.jpg) +[ass](https://lifevsfilm.files.wordpress.com/2013/11/grindhouse.jpg). I +tend to group [lazy +evaluation](https://en.wikipedia.org/wiki/Lazy_evaluation) with things +like [pattern matching](https://en.wikipedia.org/wiki/Pattern_matching) +and [currying](https://en.wikipedia.org/wiki/Currying) -- super cool but +not directly applicable to our day-to-day work. I recently had the +chance to use a custom enumerator to clean up some hairy business logic, +though, and I thought I'd share. + +**Some background:** our client had originally requested the ability to +select two related places to display at the bottom of a given place +detail page, one of the primary pages in our app. Over time, they found +that content editors were not always diligent about selecting these +related places, often choosing only one or none. They requested that two +related places always display, using the following logic: + +1. If the place has published, associated places, use those; +2. Otherwise, if there are nearby places, use those; +3. Otherwise, use the most recently updated places. + +Straightforward enough. 
An early, naïve approach: + + def associated_places + [ + (associated_place_1 if associated_place_1.try(:published?)), + (associated_place_2 if associated_place_2.try(:published?)), + *nearby_places, + *recently_updated_places + ].compact.first(2) + end + +But if a place *does* have two associated places, we don't want to +perform the expensive call to `nearby_places`, and similarly, if it has +nearby places, we'd like to avoid calling `recently_updated_places`. We +also don't want to litter the method with conditional logic. This is a +perfect opportunity to build a custom enumerator: + + def associated_places + Enumerator.new do |y| + y << associated_place_1 if associated_place_1.try(:published?) + y << associated_place_2 if associated_place_2.try(:published?) + nearby_places.each { |place| y << place } + recently_updated_places.each { |place| y << place } + end + end + +`Enumerator.new` takes a block with a "yielder" argument. We call the +yielder's `yield` method[^1^](#fn:1 "see footnote"){#fnref:1 .footnote}, +aliased as `<<`, to return the next enumerable value. Now, we can just +say `@place.associated_places.take(2)` and we'll always get back two +places with minimum effort. + +This code ticks all the boxes: fast, clean, and nerdy as hell. If you're +interested in learning more about Ruby's lazy enumerators, I recommend +[*Ruby 2.0 Works Hard So You Can Be +Lazy*](http://patshaughnessy.net/2013/4/3/ruby-2-0-works-hard-so-you-can-be-lazy) +by Pat Shaughnessy and [*Lazy +Refactoring*](https://robots.thoughtbot.com/lazy-refactoring) on the +Thoughtbot blog. + +\* \* \* + +1. ::: {#fn:1} + Confusing name -- not the same as the `yield` keyword.
+ [ ↩](#fnref:1 "return to article"){.reversefootnote} + ::: diff --git a/content/elsewhere/getting-into-open-source/index.md b/content/elsewhere/getting-into-open-source/index.md new file mode 100644 index 0000000..a1966af --- /dev/null +++ b/content/elsewhere/getting-into-open-source/index.md @@ -0,0 +1,51 @@ +--- +title: "Getting into Open Source" +date: 2010-12-01T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/getting-into-open-source/ +--- + +When evaluating a potential developer hire, one of the first things we +look for is a profile on [GitHub](https://github.com), and I'm always +surprised when someone doesn't have one. When asked, the most frequent +response is that people don't know where to begin contributing to open +source. This response might've had some validity in the +[SourceForge](http://sourceforge.net) days, but with the rise of GitHub, +it\'s become a lot easier to get involved. Here are four easy ways to +get started. + +## 1. Documentation {#1_documentation} + +There's a lot of great open source code out there that goes unused +simply because people can't figure out how to use it. A great way to get +your foot in the door is to improve documentation, whether by updating +the primary README, including examples in the source code, or simply +fixing typos and grammatical errors. + +## 2. Something You Use {#2_something_you_use} + +The vast majority of the plugins and gems that you use every day are +one-person operations. It is a bit intimidating to attempt to improve +code that someone else has spent so much time on, but if you see +something wrong, fork the project and fix it. You'll be amazed how easy +it is and how grateful the original authors will be. + +## 3. 
Your Blog {#3_your_blog} + +I don't necessarily recommend reinventing the wheel when it comes to +blogging platforms, but if you're looking for something small to code up +using your web framework of choice, writing the software that powers +your personal website is a good option. [The +Setup](http://usesthis.com/), one of my favorite sites, includes a link +to the project source in its footer. + +## 4. Any Dumb Crap {#4_any_dumb_crap} + +One of my favorite talks from RailsConf a few years back was Nathaniel +Talbott's [23 +Hacks](http://en.oreilly.com/rails2008/public/schedule/detail/1980), +which encouraged developers to "enjoy tinkering, puttering, and +generally hacking around." Don't worry that your code isn't perfect and +might never light the world on fire; put it out there and keep improving +it. Simply put, there's almost no code worse than *no code*. diff --git a/content/elsewhere/gifts-for-your-nerd/index.md b/content/elsewhere/gifts-for-your-nerd/index.md new file mode 100644 index 0000000..29fb2c4 --- /dev/null +++ b/content/elsewhere/gifts-for-your-nerd/index.md @@ -0,0 +1,97 @@ +--- +title: "Gifts For Your Nerd" +date: 2009-12-16T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/gifts-for-your-nerd/ +--- + +Shopping for a nerd this holiday season? A difficult proposition, to be +sure. We are, after all, complicated creatures. Fortunately, Viget +Extend is here to help. Here are some gifts your nerd is sure to love. + +[![](https://www.viget.com/uploads/image/dce_iamakey.jpg){.left} **LaCie +iamaKey Flash +Drive**](https://www.amazon.com/LaCie-iamaKey-Flash-Drive-130870/dp/B001V7XPSA) +**(\$30)** + +If your nerd goes to tech conferences with any regularity, your +residence is already littered with these things. USB flash drives are a +dime a dozen, but this one's different: stylish and rugged, and since +it's designed to be carried on a keychain, it'll always be around when +your nerd needs it.
+ +[![](https://www.viget.com/uploads/image/dce_aeropress.jpg){.left} +**AeroPress**](https://www.amazon.com/AeroPress-Coffee-and-Espresso-Maker/dp/B000GXZ2GS) +**(\$25)** + +A simple device that makes a cup of espresso better than machines +costing twenty times as much. Buy this one for your nerd and wake up to +delicious, homemade espresso every morning. In other words, it\'s the +gift that keeps on giving. If espresso gives your nerd the jitters, you +can't go wrong with a [french +press](https://www.amazon.com/Bodum-Chambord-4-Cup-Coffee-Press/dp/B00012D0R2/). + +[![](https://www.viget.com/uploads/image/dce_charge_tee.jpg){.left} +**SimpleBits Charge +Tee**](http://shop.simplebits.com/product/charge-tee-tri-blend) +**(\$22)** + +Simple, vaguely Mac-ish graphic printed on an American Apparel Tri-Blend +tee, no lie the greatest and best t-shirt ever created. + +[![](https://www.viget.com/uploads/image/dce_hard_graft.jpg){.left} +**Hard Graft iPhone +Case**](http://shop.hardgraft.com/product/base-phone-case) **(\$60)** + +Your nerd probably already has a case for her iPhone, but it's made of +rubber or plastic. Class it up with this handmade leather-and-wool case. +Doubles as a slim wallet if your nerd is of the minimalist mindset, and +here's a hint: we all are. + +[![](https://www.viget.com/uploads/image/dce_ignore.jpg){.left} **Ignore +Everybody**](https://www.amazon.com/Ignore-Everybody-Other-Keys-Creativity/dp/159184259X) +**by Hugh MacLeod (\$16)** + +Give your nerd the motivation to finish that web application he's been +talking about for the last two years so you can retire. + +[![](https://www.viget.com/uploads/image/dce_moleskine.jpg){.left} +**Moleskine +Notebook**](https://www.amazon.com/Moleskine-Squared-Notebook-Cover-Pocket/dp/8883707125) +**(\$10)** + +What nerd doesn't love a new notebook? Just make sure it's graph paper; +unlined paper was not created for mathematical formulae and drawings of +robots. 
Alternatively, take a look at [Field +Notes](http://fieldnotesbrand.com). As for pens, I highly, *highly* +recommend the [Uni-ball +Signo](http://www.jetpens.com/product_info.php/cPath/239_90/products_id/466). + +[![](https://www.viget.com/uploads/image/dce_canon.jpg){.left} **Canon +PowerShot S90**](https://www.amazon.com/dp/B002LITT42/) **(\$400)** + +Packs the low-light photographic abilities of your nerd's DSLR into a +compact form factor that fits in his shirt pocket, right next to his +slide rule. + +[![](https://www.viget.com/uploads/image/dce_newegg.png){.left} **Newegg +Gift +Card**](https://secure.newegg.com/GiftCertificate/GiftCardStep1.aspx) + +If all else fails, a gift card from [Newegg](http://newegg.com) shows +you know your nerd a little better than the usual from Amazon. + +[![](https://www.viget.com/uploads/image/dce_moto_guzzi.jpg){.left} +**Moto Guzzi V7 +Classic**](http://www.autoblog.com/2009/09/30/review-moto-guzzi-v7-classic-is-an-italian-beauty-you-can-live/) +**(\$8500)** + +Actually, this one's probably just me. + +If your nerd is a little more design-oriented, check out Viget Inspire +for ideas from [Owen](https://www.viget.com/inspire/the-winter-scrooge/) +and +[Rob](https://www.viget.com/inspire/10-t-shirts-you-want-to-buy-a-designer/). +Got any other gift suggestions for the nerd in your life, or ARE YOU +YOURSELF a nerd? Link it up in the comments. 
diff --git a/content/elsewhere/how-why-to-run-autotest-on-your-mac/index.md b/content/elsewhere/how-why-to-run-autotest-on-your-mac/index.md new file mode 100644 index 0000000..5526446 --- /dev/null +++ b/content/elsewhere/how-why-to-run-autotest-on-your-mac/index.md @@ -0,0 +1,75 @@ +--- +title: "How (& Why) to Run Autotest on your Mac" +date: 2009-06-19T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/how-why-to-run-autotest-on-your-mac/ +--- + +If you aren't using Autotest to develop your Ruby application, you're +missing out on effortless continuous testing. If you'd *like* to be +using Autotest, but can't get it running properly, I'll show you how to +set it up. + +Autotest is a fantastic way to do TDD/BDD. Here's a rundown of the +benefits from the [project +homepage](http://www.zenspider.com/ZSS/Products/ZenTest/): + +- Improves feedback by running tests continuously. +- Continually runs tests based on files you've changed. +- Get feedback as soon as you save. Keeps you in your editor allowing + you to get stuff done faster. +- Focuses on running previous failures until you've fixed them. + +Like any responsible Ruby citizen, Autotest changes radically every +month or so. A few weeks ago, some enterprising developers released +autotest-mac (now +[autotest-fsevent](http://www.bitcetera.com/en/techblog/2009/05/27/mac-friendly-autotest/)), +which monitors code changes via native OS X system events rather than by +polling the hard drive, increasing battery and disk life and improving +performance. Here's how to get Autotest running on your Mac, current as +of this morning: + +1. Install autotest: + + ``` {#code} + gem install ZenTest + ``` + +2. Or, if you've already got an older version installed: + + ``` {#code} + gem update ZenTest + gem cleanup ZenTest + ``` + +3. Install autotest-rails: + + ``` {#code} + gem install autotest-rails + ``` + +4.
Install autotest-fsevent: + + ``` {#code} + gem install autotest-fsevent + ``` + +5. Install autotest-growl: + + ``` {#code} + gem install autotest-growl + ``` + +6. Make a `~/.autotest` file, with the following: + + ``` {#code} + require "autotest/growl" + require "autotest/fsevent" + ``` + +7. Run `autotest` in your app root. + +Autotest is a fundamental part of my development workflow, and well +worth the occasional setup headache; give it a shot and I think you'll +agree. These instructions should be enough to get you up and running, +unless you're reading this more than three weeks after it was published, +in which case all. bets. are. off. diff --git a/content/elsewhere/html-sanitization-in-rails-that-actually-works/index.md b/content/elsewhere/html-sanitization-in-rails-that-actually-works/index.md new file mode 100644 index 0000000..d6c9963 --- /dev/null +++ b/content/elsewhere/html-sanitization-in-rails-that-actually-works/index.md @@ -0,0 +1,57 @@ +--- +title: "HTML Sanitization In Rails That Actually Works" +date: 2009-11-23T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/html-sanitization-in-rails-that-actually-works/ +--- + +Assuming you don't want to simply escape everything, sanitizing user +input is one of the relative weak points of the Rails framework. On +[SpeakerRate](http://speakerrate.com/), where users can use +[Markdown](http://daringfireball.net/projects/markdown/) to format +comments and descriptions, we've run up against some of the limitations +of Rails' built-in sanitization features, so we decided to dig in and +fix it ourselves. + +In creating our own sanitizer, our goals were threefold: we want to +**let a subset of HTML in**. As the [Markdown +documentation](http://daringfireball.net/projects/markdown/syntax#html) +clearly states, "for any markup that is not covered by Markdown's +syntax, you simply use HTML itself." 
In keeping with the Markdown +philosophy, we can't simply strip all HTML from incoming comments, so +the included +[HTML::WhiteListSanitizer](https://github.com/rails/rails/blob/master/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb#LID60) +is the obvious starting point. + +Additionally, we want to **escape, rather than remove, non-approved +tags**, since some commenters want to discuss the merits of, say, +[`

`](http://speakerrate.com/talks/1698-object-oriented-css#c797). +Contrary to its documentation, WhiteListSanitizer simply removes all +non-whitelisted tags. Someone opened a +[ticket](https://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/916) +about this issue in August of 2008 with an included patch, but the +ticket was marked as resolved without ever applying it. Probably for the +best, as the patch introduces a new bug. + +Finally, we want to **escape unclosed tags even if they belong to the +whitelist**. An unclosed `` tag can wreak havoc on the rest of a +page, not to mention what a `
` can do. Self-closing tags are okay. + +With these requirements in mind, we subclassed HTML::WhiteListSanitizer +and fixed it up. Introducing, then: + +![Jason +Statham](http://goremasternews.files.wordpress.com/2009/10/jason_statham.jpg "Jason Statham") + +[**HTML::StathamSanitizer**](https://gist.github.com/241114). +User-generated markup, you're on notice: this sanitizer will take its +shirt off and use it to kick your ass. At this point, I've written more +about the code than code itself, so without further ado: + +``` {#code .ruby} +module HTML class StathamSanitizer < WhiteListSanitizer protected def tokenize(text, options) super.map do |token| if token.is_a?(HTML::Tag) && options[:parent].include?(token.name) token.to_s.gsub(/ (input = '') xml.instruct! xml.DATASET do xml.SITE_ID SITE_ID yield xml end Net::HTTP.post_form(URI.parse(ENDPOINT), :type => request_type, :activity => activity, :input => input) end +``` + +Then you can make API requests like this: + +``` {#code .ruby} +def self.subscribe_user(mailing_list, email_address) send_request('record', 'add') do |body| body.MLID mailing_list body.DATA email_address, :type => 'email' end end +``` + +If you find yourself needing to work with an EmailLabs mailing list, +check it out. At the very least, you should get a decent idea of how to +interact with their API. It's up on +[GitHub](https://github.com/vigetlabs/email_labs_client/tree/master), so +if you add any functionality, send those patches our way. 
diff --git a/content/elsewhere/json-feed-validator/index.md b/content/elsewhere/json-feed-validator/index.md new file mode 100644 index 0000000..29cf6e2 --- /dev/null +++ b/content/elsewhere/json-feed-validator/index.md @@ -0,0 +1,80 @@ +--- +title: "JSON Feed Is Cool (+ a Simple Tool to Create Your Own)" +date: 2017-08-02T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/json-feed-validator/ +--- + +A few months ago, Manton Reece and Brent Simmons [announced the creation +of JSON Feed](https://jsonfeed.org/2017/05/17/announcing_json_feed), a +new JSON-based syndication format similar to (but so much better than) +[RSS](https://en.wikipedia.org/wiki/RSS) and +[Atom](https://en.wikipedia.org/wiki/Atom_(standard)). One might +reasonably contend that Google killed feed-based content aggregation in +2013 when they end-of-lifed™ Google Reader, but RSS continues to enjoy +[underground +popularity](http://www.makeuseof.com/tag/rss-dead-look-numbers/) and +JSON Feed has the potential to make feed creation and consumption even +more widespread. So why are we^[1](#fn:1 "see footnote"){#fnref:1 +.footnote}^ so excited about it? + +## JSON \> XML {#jsonxml} + +RSS and Atom are both XML-based formats, and as someone who's written +code to both produce and ingest these feeds, it's not how I'd choose to +spend a Saturday. Or even a Tuesday. Elements in XML have both +attributes and children, which is a mismatch for most modern languages' +native data structures. You end up having to use libraries like +[Nokogiri](http://www.nokogiri.org/) to write code like +`item.attributes["name"]` and `item.children[0]`. And producing a feed +usually involves a full-blown templating solution like ERB. Contrast +that with JSON, which maps perfectly to JavaScript objects (-\_-), Ruby +hashes/arrays, Elixir maps, etc., etc. Producing a feed becomes a call +to `.to_json`, and consuming one, `JSON.parse`. 
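To make the `.to_json` / `JSON.parse` point concrete, here is a minimal Ruby sketch of producing and then consuming a feed. The field names follow JSON Feed version 1, but the post data and `example.org` URLs are made up for illustration, and this is only one way you might structure it.

```ruby
require "json"

# Hypothetical post data; titles and URLs are invented for this example.
posts = [
  { id: "https://example.org/posts/2", url: "https://example.org/posts/2",
    title: "Second post", content_text: "Hello again." },
  { id: "https://example.org/posts/1", url: "https://example.org/posts/1",
    content_text: "A title-less post." }
]

# Build the feed as a plain hash; keys follow JSON Feed version 1.
feed = {
  version: "https://jsonfeed.org/version/1",
  title: "Example Feed",
  home_page_url: "https://example.org/",
  feed_url: "https://example.org/feed.json",
  items: posts
}

# Producing the feed is a single call.
json = feed.to_json

# Consuming it is a single call too; keys come back as strings.
parsed = JSON.parse(json)
titles = parsed["items"].map { |item| item["title"] || "(untitled)" }
puts titles.inspect
```

No templating layer and no attribute-versus-child distinction: the feed is just hashes and arrays the whole way through.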
+ +## Flexibility + +While still largely focused on content syndication, [the +spec](https://jsonfeed.org/version/1) includes support for plaintext and +title-less posts and custom extensions, meaning its potential uses are +myriad. Imagine a new generation of microblogs, Slack bots, and IoT +devices consuming and/or producing JSON feeds. + +## Feeds Are (Still) Cool {#feedsarestillcool} + +Not to get too high up on my horse or whatever, but as a longtime web +nerd, I'm dismayed by how much content creation has migrated to walled +gardens like Facebook/Instagram/Twitter/Medium that make it super easy +to get content *in*, but very difficult to get it back *out*. [Twitter +killed RSS in 2012](http://mashable.com/2012/09/05/twitter-api-rss), and +have you ever tried to get a list of your most recent Instagram photos +programmatically? I wouldn't. Owning your own content and sharing it +liberally is what the web was made for, and JSON Feed has the potential +to make it easy and fun to do. [It's how things should be. It's how they +could be.](https://www.youtube.com/watch?v=TgqiSBxvdws) + +------------------------------------------------------------------------ + +## Your Turn + +If this sounds at all interesting to you, read the +[announcement](https://jsonfeed.org/2017/05/17/announcing_json_feed) and +the [spec](https://jsonfeed.org/version/1), listen to this [interview +with the +creators](https://daringfireball.net/thetalkshow/2017/05/31/ep-192), and +**try out this [JSON Feed +Validator](https://json-feed-validator.herokuapp.com/) I put up this +week**. You can use the [Daring Fireball +feed](https://daringfireball.net/feeds/json) or create your own. It's +pretty simple right now, running your input against a schema I +downloaded from [JSON Schema Store](http://schemastore.org/json/), but +[suggestions and pull requests are +welcome](https://github.com/vigetlabs/json-feed-validator). 
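For a rough idea of what validating a feed involves, here is a hand-rolled Ruby sketch that checks only the fields the version 1 spec marks as required (`version`, `title`, `items`, and an `id` on each item). This is not how the validator above works (it runs input against a JSON Schema); the `feed_errors` helper is hypothetical and deliberately incomplete.

```ruby
require "json"

# Hypothetical helper: returns a list of problems found in a feed.
# An empty list means the feed passed these basic checks only.
def feed_errors(json)
  feed = JSON.parse(json)
  return ["top level must be an object"] unless feed.is_a?(Hash)

  errors = []

  # Required top-level string fields.
  %w[version title].each do |key|
    errors << "missing #{key}" unless feed[key].is_a?(String)
  end

  # items must be an array, and each item needs an id.
  items = feed["items"]
  if items.is_a?(Array)
    items.each_with_index do |item, i|
      errors << "item #{i} missing id" unless item.is_a?(Hash) && item.key?("id")
    end
  else
    errors << "missing items"
  end

  errors
rescue JSON::ParserError => e
  ["invalid JSON: #{e.message}"]
end

sample = { title: "No version here", items: [{}] }.to_json
puts feed_errors(sample).inspect
```

A schema-based approach covers far more (types, formats, nested objects), which is why the real tool leans on one.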
+ + +------------------------------------------------------------------------ + +1. [The royal we, you + know?](https://www.youtube.com/watch?v=VLR_TDO0FTg#t=45s) + [ ↩](#fnref:1 "return to article"){.reversefootnote} diff --git a/content/elsewhere/large-images-in-rails/index.md b/content/elsewhere/large-images-in-rails/index.md new file mode 100644 index 0000000..40ca8ad --- /dev/null +++ b/content/elsewhere/large-images-in-rails/index.md @@ -0,0 +1,86 @@ +--- +title: "Large Images in Rails" +date: 2012-09-18T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/large-images-in-rails/ +--- + +The most visually striking feature on the new +[WWF](http://worldwildlife.org/) site, as well as the source of the +largest technical challenges, is the photography. The client team is +working with gorgeous, high-fidelity photographs loaded with metadata, +and it was up to us to make them work in a web context. Here are a few +things we did to make the site look and perform like a veritable [snow +leopard](http://worldwildlife.org/species/snow-leopard). + +## Optimize Images + +The average uploaded photo into this system is around five megabytes, so +the first order of business was to find ways to get filesize down. Two +techniques turned out to be very effective: +[jpegtran](http://jpegclub.org/jpegtran/) and +[ImageMagick](http://www.imagemagick.org/script/index.php)'s `quality` +option. We run all photos through a custom +[Paperclip](https://github.com/thoughtbot/paperclip) processor that +calls out to jpegtran to losslessly optimize image compression and strip +out metadata. In some cases, we were seeing thumbnailed images go from +60k to 15k by removing unused color profile data. We save the resulting +images out at 75% quality with the following Paperclip directive: + + has_attached_file :image, + :convert_options => { :all => "-quality 75" }, + :styles => { # ... 
+ +Enabling this option has a huge impact on filesize (about a 90% +reduction) with no visible loss of quality. Be aware that we're working +with giant, unoptimized images; if you're going to be uploading images +that have already been saved out for the web, this level of compression +is probably too aggressive. + +## Process in Background + +Basic maths: large images × lots of crop styles = long processing time. +As the site grew, the delay after uploading a new photo increased until +it became unacceptable. It was time to implement background processing. +[Resque](https://github.com/defunkt/resque) and +[delayed_paperclip](https://github.com/jstorimer/delayed_paperclip) to +the ... rescue (derp). These two gems make it super simple to process +images outside of the request/response flow with a simple +`process_in_background :image` in your model. + +A few notes: as of this writing, delayed_paperclip hasn't been updated +recently. [Here's a fork that +works](https://github.com/tommeier/delayed_paperclip) from tommeier. I +recommend using the +[resque-ensure-connected](https://github.com/socialcast/resque-ensure-connected) +gem if you're going to run Resque in production to keep your +long-running processes from losing their DB connections. + +## Server Configuration + +You'll want to put [far-future expires +headers](http://developer.yahoo.com/performance/rules.html#expires) on +these photos so that browsers know not to redownload them. If you +control the servers from which they'll be served, you can configure +Apache to send these headers with the following bit of configuration: + + ExpiresActive On + ExpiresByType image/png "access plus 1 year" + ExpiresByType image/gif "access plus 1 year" + ExpiresByType image/jpeg "access plus 1 year" + +([Similarly, for +nginx](http://www.agileweboperations.com/far-future-expires-headers-for-ruby-on-rails-with-nginx).) 
+When working with a bunch of large files, though, you're probably better +served by uploading them to S3 or RackSpace Cloud Files and serving them +from there. + +------------------------------------------------------------------------ + +Another option to look at might be +[Dragonfly](https://github.com/markevans/dragonfly), which takes a +different approach to photo processing than does Paperclip, resizing on +the fly rather than on upload. This might obviate the need for Resque +but at unknown (by me) cost. We hope that some of this will be helpful +in your next photo-intensive project. diff --git a/content/elsewhere/lets-make-a-hash-chain-in-sqlite/index.md b/content/elsewhere/lets-make-a-hash-chain-in-sqlite/index.md new file mode 100644 index 0000000..238c7ba --- /dev/null +++ b/content/elsewhere/lets-make-a-hash-chain-in-sqlite/index.md @@ -0,0 +1,233 @@ +--- +title: "Let’s Make a Hash Chain in SQLite" +date: 2021-06-30T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/lets-make-a-hash-chain-in-sqlite/ +--- + +I\'m not much of a cryptocurrency enthusiast, but there are some neat +ideas in these protocols that I wanted to explore further. Based on my +absolute layperson\'s understanding, the \"crypto\" in +\"cryptocurrency\" describes three things: + +1. Some public key/private key stuff to grant access to funds at an + address; +2. For certain protocols (e.g. Bitcoin), the cryptographic + puzzles[^1^](#fn:1 "see footnote"){#fnref:1 .footnote} that miners + have to solve in order to add new blocks to the ledger; and +3. The use of hashed signatures to ensure data integrity. + +Of those three uses, the first two (asymmetric cryptography and +proof-of-work) aren\'t that interesting to me, at least from a technical +perspective. The third concept, though --- using cryptography to make +data verifiable and tamper-resistant --- that\'s pretty cool, and +something I wanted to dig into. 
I decided to build a little +proof-of-concept using [SQLite](https://www.sqlite.org/index.html), a +\"small, fast, self-contained, high-reliability, full-featured, SQL +database engine.\" + +A couple notes before we dive in: these concepts aren\'t unique to the +blockchain; Wikipedia has good explanations of [cryptographic hash +functions](https://en.wikipedia.org/wiki/Cryptographic_hash_function), +[Merkle trees](https://en.wikipedia.org/wiki/Merkle_tree), and [hash +chains](https://en.wikipedia.org/wiki/Hash_chain) if any of this piques +your curiosity. This stuff is also [at the core of +git](https://initialcommit.com/blog/git-bitcoin-merkle-tree), which is +really pretty neat. + +[]{#onto-the-code} + +## Onto the code [\#](#onto-the-code "Direct link to Onto the code"){.anchor aria-label="Direct link to Onto the code"} + +Implementing a rudimentary hash chain in SQL is pretty simple. Here\'s +my approach, which uses \"bookmarks\" as an arbitrary record type. + +``` {.code-block .line-numbers} +PRAGMA foreign_keys = ON; +SELECT load_extension("./sha1"); + +CREATE TABLE bookmarks ( + id INTEGER PRIMARY KEY, + signature TEXT NOT NULL UNIQUE + CHECK(signature = sha1(url || COALESCE(parent, ""))), + parent TEXT, + url TEXT NOT NULL UNIQUE, + FOREIGN KEY(parent) REFERENCES bookmarks(signature) +); + +CREATE UNIQUE INDEX parent_unique ON bookmarks ( + ifnull(parent, "") +); +``` + +This code is available on +[GitHub](https://github.com/dce/sqlite-hash-chain) in case you want to +try this out on your own. Let\'s break it down a little bit. 
+ +- First, we enable foreign key constraints, which aren\'t on by + default +- Then we pull in SQLite\'s [`sha1` + function](https://www.i-programmer.info/news/84-database/10527-sqlite-317-adds-sha1-extension.html), + which implements a common hashing algorithm +- Then we define our table + - `id` isn\'t mandatory but makes it easier to grab the last entry + - `signature` is the SHA1 hash of the bookmark URL and parent + entry\'s signature; it uses a `CHECK` constraint to ensure this + is guaranteed to be true + - `parent` is the `signature` of the previous entry in the chain + (notice that it\'s allowed to be null) + - `url` is the data we want to ensure is immutable (though as + we\'ll see later, it\'s not truly immutable since we can still + do cascading updates) +- We set a foreign key constraint that `parent` refers to another + row\'s `signature` unless it\'s null +- Then we create a unique index on `parent` that covers the `NULL` + case, since our very first bookmark won\'t have a parent, but no + other row should be allowed to have a null parent, and no two rows + should be able to have the same parent + +Next, let\'s insert some data: + +``` {.code-block .line-numbers} +INSERT INTO bookmarks (url, signature) VALUES ("google", sha1("google")); + +WITH parent AS (SELECT signature FROM bookmarks ORDER BY id DESC LIMIT 1) +INSERT INTO bookmarks (url, parent, signature) VALUES ( + "yahoo", (SELECT signature FROM parent), sha1("yahoo" || (SELECT signature FROM parent)) +); + +WITH parent AS (SELECT signature FROM bookmarks ORDER BY id DESC LIMIT 1) +INSERT INTO bookmarks (url, parent, signature) VALUES ( + "bing", (SELECT signature FROM parent), sha1("bing" || (SELECT signature FROM parent)) +); + +WITH parent AS (SELECT signature FROM bookmarks ORDER BY id DESC LIMIT 1) +INSERT INTO bookmarks (url, parent, signature) VALUES ( + "duckduckgo", (SELECT signature FROM parent), sha1("duckduckgo" || (SELECT signature FROM parent)) +); +``` + +OK! 
Let\'s fire up `sqlite3` and then `.read` this file. Here\'s the +result: + +``` {.code-block .line-numbers} +sqlite> SELECT * FROM bookmarks; ++----+------------------------------------------+------------------------------------------+------------+ +| id | signature | parent | url | ++----+------------------------------------------+------------------------------------------+------------+ +| 1 | 759730a97e4373f3a0ee12805db065e3a4a649a5 | | google | +| 2 | 64633167b8e44cb833fbfa349731d8a68e942ebc | 759730a97e4373f3a0ee12805db065e3a4a649a5 | yahoo | +| 3 | ce3df1337879e85bc488d4cae129719cc46cad04 | 64633167b8e44cb833fbfa349731d8a68e942ebc | bing | +| 4 | 675570ac126d492e449ebaede091e2b7dad7d515 | ce3df1337879e85bc488d4cae129719cc46cad04 | duckduckgo | ++----+------------------------------------------+------------------------------------------+------------+ +``` + +This has some cool properties. I can\'t delete an entry in the chain: + +`sqlite> DELETE FROM bookmarks WHERE id = 3;` +`Error: FOREIGN KEY constraint failed` + +I can\'t change a URL: + +`sqlite> UPDATE bookmarks SET url = "altavista" WHERE id = 3;` +`Error: CHECK constraint failed: signature = sha1(url || parent)` + +I can\'t re-sign an entry: + +`sqlite> UPDATE bookmarks SET url = "altavista", signature = sha1("altavista" || parent) WHERE id = 3;` +`Error: FOREIGN KEY constraint failed` + +I **can**, however, update the last entry in the chain: + +``` {.code-block .line-numbers} +sqlite> UPDATE bookmarks SET url = "altavista", signature = sha1("altavista" || parent) WHERE id = 4; +sqlite> SELECT * FROM bookmarks; ++----+------------------------------------------+------------------------------------------+-----------+ +| id | signature | parent | url | ++----+------------------------------------------+------------------------------------------+-----------+ +| 1 | 759730a97e4373f3a0ee12805db065e3a4a649a5 | | google | +| 2 | 64633167b8e44cb833fbfa349731d8a68e942ebc | 759730a97e4373f3a0ee12805db065e3a4a649a5 
| yahoo | +| 3 | ce3df1337879e85bc488d4cae129719cc46cad04 | 64633167b8e44cb833fbfa349731d8a68e942ebc | bing | +| 4 | b583a025b5a43727978c169fe99f5422039194ea | ce3df1337879e85bc488d4cae129719cc46cad04 | altavista | ++----+------------------------------------------+------------------------------------------+-----------+ +``` + +This is because a row isn\'t really \"locked in\" until it\'s pointed to +by another row. It\'s worth pointing out that an actual blockchain would +use a [consensus +mechanism](https://www.investopedia.com/terms/c/consensus-mechanism-cryptocurrency.asp) +to prevent any updates like this, but that\'s way beyond the scope of +what we\'re doing here. + +[]{#cascading-updates} + +## Cascading updates [\#](#cascading-updates "Direct link to Cascading updates"){.anchor aria-label="Direct link to Cascading updates"} + +Given that we can change the last row, it\'s possible to update any row +in the ledger provided you 1) also re-sign all of its children and 2) do +it all in a single pass. 
Here\'s how you\'d update row 2 to +\"askjeeves\" with a [`RECURSIVE` +query](https://www.sqlite.org/lang_with.html#recursive_common_table_expressions) +(and sorry I know this is a little hairy): + +``` {.code-block .line-numbers} +WITH RECURSIVE + t1(url, parent, old_signature, signature) AS ( + SELECT "askjeeves", parent, signature, sha1("askjeeves" || COALESCE(parent, "")) + FROM bookmarks WHERE id = 2 + UNION + SELECT t2.url, t1.signature, t2.signature, sha1(t2.url || t1.signature) + FROM bookmarks AS t2, t1 WHERE t2.parent = t1.old_signature + ) +UPDATE bookmarks +SET url = (SELECT url FROM t1 WHERE t1.old_signature = bookmarks.signature), + parent = (SELECT parent FROM t1 WHERE t1.old_signature = bookmarks.signature), + signature = (SELECT signature FROM t1 WHERE t1.old_signature = bookmarks.signature) +WHERE signature IN (SELECT old_signature FROM t1); +``` + +Here\'s the result of running this update: + +``` {.code-block .line-numbers} ++----+------------------------------------------+------------------------------------------+-----------+ +| id | signature | parent | url | ++----+------------------------------------------+------------------------------------------+-----------+ +| 1 | 759730a97e4373f3a0ee12805db065e3a4a649a5 | | google | +| 2 | de357e976171e528088843dfa35c1097017b1009 | 759730a97e4373f3a0ee12805db065e3a4a649a5 | askjeeves | +| 3 | 1b69dff11f3e8ffeade0f42521f9e1bd1bd78539 | de357e976171e528088843dfa35c1097017b1009 | bing | +| 4 | 924660e4f25e2ac8c38ca25bae201ad3a5b6e545 | 1b69dff11f3e8ffeade0f42521f9e1bd1bd78539 | altavista | ++----+------------------------------------------+------------------------------------------+-----------+ +``` + +As you can see, row 2\'s `url` is updated, and rows 3 and 4 have updated +signatures and parents. Pretty cool, and pretty much the same thing as +what happens when you change a git commit via `rebase` --- all the +successive commits get new SHAs. 
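If the SQL is hard to follow, the same idea can be sketched in plain Ruby. This is an illustration under my own assumptions rather than a port of the repo: `Entry`, `sign`, `build_chain`, and `valid_chain?` are hypothetical names, and the only thing borrowed from the schema is the signing rule, `sha1(url || COALESCE(parent, ""))`.

```ruby
require "digest/sha1"

# One row of the chain: the data, the parent's signature, and our own signature.
Entry = Struct.new(:url, :parent, :signature)

# Same rule as the CHECK constraint: hash the URL plus the parent signature
# (or an empty string when there is no parent).
def sign(url, parent)
  Digest::SHA1.hexdigest(url + parent.to_s)
end

# Build a chain where each entry's parent is the previous entry's signature.
def build_chain(urls)
  urls.each_with_object([]) do |url, chain|
    parent = chain.last&.signature
    chain << Entry.new(url, parent, sign(url, parent))
  end
end

# Valid when every signature matches its data and every parent
# matches the signature of the entry before it.
def valid_chain?(chain)
  chain.each_with_index.all? do |entry, i|
    expected_parent = i.zero? ? nil : chain[i - 1].signature
    entry.parent == expected_parent &&
      entry.signature == sign(entry.url, entry.parent)
  end
end

chain = build_chain(%w[google yahoo bing altavista])

# Changing a URL in place breaks verification...
tampered = chain.map(&:dup)
tampered[1].url = "askjeeves"
puts valid_chain?(tampered)

# ...but re-signing that entry and everything after it yields a valid chain
# again, with new signatures from the changed entry onward.
rewritten = build_chain(%w[google askjeeves bing altavista])
puts valid_chain?(rewritten)
```

Rebuilding from the changed entry forward is the in-memory analogue of the recursive update: the first signature is untouched, and every signature after the edit changes.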
+ +I\'ll be honest that I don\'t have any immediately practical uses for a +cryptographically-signed database table, but I thought it was cool and +helped me understand these concepts a little bit better. 
Hopefully it +gets your mental wheels spinning a little bit, too. Thanks for reading! + +------------------------------------------------------------------------ + +1. ::: {#fn:1} + [Here\'s a pretty good explanation of what mining really + is](https://asthasr.github.io/posts/how-blockchains-work/), but, in + a nutshell, it\'s running a hashing algorithm over and over again + with a random salt until a hash is found that begins with a required + number of zeroes. [ ↩︎](#fnref:1 "return to body"){.reversefootnote} + ::: diff --git a/content/elsewhere/lets-write-a-dang-elasticsearch-plugin/index.md b/content/elsewhere/lets-write-a-dang-elasticsearch-plugin/index.md new file mode 100644 index 0000000..37b9659 --- /dev/null +++ b/content/elsewhere/lets-write-a-dang-elasticsearch-plugin/index.md @@ -0,0 +1,427 @@ +--- +title: "Let’s Write a Dang ElasticSearch Plugin" +date: 2021-03-15T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/lets-write-a-dang-elasticsearch-plugin/ +--- + +One of our current projects involves a complex interactive query builder +to search a large collection of news items. Some of the conditionals +fall outside of the sweet spot of Postgres (e.g. word X must appear +within Y words of word Z), and so we opted to pull in +[ElasticSearch](https://www.elastic.co/elasticsearch/) alongside it. +It\'s worked perfectly, hitting all of our condition and grouping needs +with one exception: we need to be able to filter for articles that +contain a term a minimum number of times (so \"Apple\" must appear in +the article 3 times, for example). Frustratingly, Elastic *totally* has +this information via its +[`term_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/term-vector.html) +feature, but you can\'t use that data inside a query, at least as far as +I can tell. + +The solution, it seems, is to write a custom plugin. 
I figured it out, +eventually, but it was a lot of trial-and-error as the documentation I +was able to find is largely outdated or incomplete. So I figured I\'d +take what I learned while it\'s still fresh in my mind in the hopes that +someone else might have an easier time of it. That\'s what internet +friends are for, after all. + +Quick note before we start: all the version numbers you see are current +and working as of February 25, 2021. Hopefully this post ages well, but +if you try this out and hit issues, bumping the versions of Elastic, +Gradle, and maybe even Java is probably a good place to start. Also, I +use `projectname` a lot in the code examples --- that\'s not a special +word and you should change it to something that makes sense for you. + +[]{#1-set-up-a-java-development-environment} + +## 1. Set up a Java development environment [\#](#1-set-up-a-java-development-environment "Direct link to 1. Set up a Java development environment"){.anchor aria-label="Direct link to 1. Set up a Java development environment"} + +First off, you\'re gonna be writing some Java. That\'s not my usual +thing, so the first step was to get a working environment to compile my +code. To do that, we\'ll use [Docker](https://www.docker.com/). Here\'s +a `Dockerfile`: + +``` {.code-block .line-numbers} +FROM adoptopenjdk/openjdk12:jdk-12.0.2_10-ubuntu + +RUN apt-get update && \ + apt-get install -y zip unzip && \ + rm -rf /var/lib/apt/lists/* + +SHELL ["/bin/bash", "-c"] + +RUN curl -s "https://get.sdkman.io" | bash && \ + source "/root/.sdkman/bin/sdkman-init.sh" && \ + sdk install gradle 6.8.2 + +WORKDIR /plugin +``` + +We use a base image with all the Java stuff but also a working Ubuntu +install so that we can do normal Linux-y things inside our container. +From your terminal, build the image: + +`> docker build . 
-t projectname-java` + +Then, spin up the container and start an interactive shell, mounting +your local working directory into `/plugin`: + +`> docker run --rm -it -v ${PWD}:/plugin projectname-java bash` + +[]{#2-configure-gradle} + +## 2. Configure Gradle [\#](#2-configure-gradle "Direct link to 2. Configure Gradle"){.anchor aria-label="Direct link to 2. Configure Gradle"} + +[Gradle](https://gradle.org/) is a \"build automation tool for +multi-language software development,\" and what Elastic recommends for +plugin development. Configuring Gradle to build the plugin properly was +the hardest part of this whole endeavor. Throw this into `build.gradle` +in your project root: + +``` {.code-block .line-numbers} +buildscript { + repositories { + mavenLocal() + mavenCentral() + jcenter() + } + + dependencies { + classpath "org.elasticsearch.gradle:build-tools:7.11.1" + } +} + +apply plugin: 'java' + +compileJava { + sourceCompatibility = JavaVersion.VERSION_12 + targetCompatibility = JavaVersion.VERSION_12 +} + +apply plugin: 'elasticsearch.esplugin' + +group = "com.projectname" +version = "0.0.1" + +esplugin { + name 'contains-multiple' + description 'A script for finding documents that match a term a certain number of times' + classname 'com.projectname.containsmultiple.ContainsMultiplePlugin' + licenseFile rootProject.file('LICENSE.txt') + noticeFile rootProject.file('NOTICE.txt') +} + +validateNebulaPom.enabled = false +``` + +You\'ll also need files named `LICENSE.txt` and `NOTICE.txt` --- mine +are empty, since the plugin is for internal use only. If you\'re going +to be releasing your plugin in some public way, maybe talk to a lawyer +about what to put in those files. + +[]{#3-write-the-dang-plugin} + +## 3. Write the dang plugin [\#](#3-write-the-dang-plugin "Direct link to 3. Write the dang plugin"){.anchor aria-label="Direct link to 3. 
Write the dang plugin"} + +To write the actual plugin, I started with [this example +plugin](https://github.com/elastic/elasticsearch/blob/master/plugins/examples/script-expert-scoring/src/main/java/org/elasticsearch/example/expertscript/ExpertScriptPlugin.java) +which scores a document based on the frequency of a given term. My use +case was fortunately quite similar, though I\'m using a `filter` query, +meaning I just want a boolean, i.e. does this document contain this term +the requisite number of times? As such, I implemented a +[`FilterScript`](https://www.javadoc.io/doc/org.elasticsearch/elasticsearch/latest/org/elasticsearch/script/FilterScript.html) +rather than the `ScoreScript` implemented in the example code. + +This file lives in (deep breath) +`src/main/java/com/projectname/containsmultiple/ContainsMultiplePlugin.java`: + +``` {.code-block .line-numbers} +package com.projectname.containsmultiple; + +import org.apache.lucene.index.LeafReaderContext; +import org.apache.lucene.index.PostingsEnum; +import org.apache.lucene.index.Term; +import org.elasticsearch.common.settings.Settings; +import org.elasticsearch.plugins.Plugin; +import org.elasticsearch.plugins.ScriptPlugin; +import org.elasticsearch.script.FilterScript; +import org.elasticsearch.script.FilterScript.LeafFactory; +import org.elasticsearch.script.ScriptContext; +import org.elasticsearch.script.ScriptEngine; +import org.elasticsearch.script.ScriptFactory; +import org.elasticsearch.search.lookup.SearchLookup; + +import java.io.IOException; +import java.io.UncheckedIOException; +import java.util.Collection; +import java.util.Map; +import java.util.Set; + +/** + * A script for finding documents that match a term a certain number of times + */ +public class ContainsMultiplePlugin extends Plugin implements ScriptPlugin { + + @Override + public ScriptEngine getScriptEngine( + Settings settings, + Collection<ScriptContext<?>> contexts + ) { + return new ContainsMultipleEngine(); + } + + // tag::contains_multiple + 
private static class ContainsMultipleEngine implements ScriptEngine { + @Override + public String getType() { + return "expert_scripts"; + } + + @Override + public T compile( + String scriptName, + String scriptSource, + ScriptContext context, + Map params + ) { + if (context.equals(FilterScript.CONTEXT) == false) { + throw new IllegalArgumentException(getType() + + " scripts cannot be used for context [" + + context.name + "]"); + } + // we use the script "source" as the script identifier + if ("contains_multiple".equals(scriptSource)) { + FilterScript.Factory factory = new ContainsMultipleFactory(); + return context.factoryClazz.cast(factory); + } + throw new IllegalArgumentException("Unknown script name " + + scriptSource); + } + + @Override + public void close() { + // optionally close resources + } + + @Override + public Set> getSupportedContexts() { + return Set.of(FilterScript.CONTEXT); + } + + private static class ContainsMultipleFactory implements FilterScript.Factory, + ScriptFactory { + @Override + public boolean isResultDeterministic() { + return true; + } + + @Override + public LeafFactory newFactory( + Map params, + SearchLookup lookup + ) { + return new ContainsMultipleLeafFactory(params, lookup); + } + } + + private static class ContainsMultipleLeafFactory implements LeafFactory { + private final Map params; + private final SearchLookup lookup; + private final String field; + private final String term; + private final int count; + + private ContainsMultipleLeafFactory( + Map params, SearchLookup lookup) { + if (params.containsKey("field") == false) { + throw new IllegalArgumentException( + "Missing parameter [field]"); + } + if (params.containsKey("term") == false) { + throw new IllegalArgumentException( + "Missing parameter [term]"); + } + if (params.containsKey("count") == false) { + throw new IllegalArgumentException( + "Missing parameter [count]"); + } + this.params = params; + this.lookup = lookup; + field = params.get("field").toString(); + 
term = params.get("term").toString(); + count = Integer.parseInt(params.get("count").toString()); + } + + @Override + public FilterScript newInstance(LeafReaderContext context) + throws IOException { + PostingsEnum postings = context.reader().postings( + new Term(field, term)); + if (postings == null) { + /* + * the field and/or term don't exist in this segment, + * so always return false + */ + return new FilterScript(params, lookup, context) { + @Override + public boolean execute() { + return false; + } + }; + } + return new FilterScript(params, lookup, context) { + int currentDocid = -1; + @Override + public void setDocument(int docid) { + /* + * advance has undefined behavior calling with + * a docid <= its current docid + */ + if (postings.docID() < docid) { + try { + postings.advance(docid); + } catch (IOException e) { + throw new UncheckedIOException(e); + } + } + currentDocid = docid; + } + @Override + public boolean execute() { + if (postings.docID() != currentDocid) { + /* + * advance moved past the current doc, so this + * doc has no occurrences of the term + */ + return false; + } + try { + return postings.freq() >= count; + } catch (IOException e) { + throw new UncheckedIOException(e); + } + } + }; + } + } + } + // end::contains_multiple +} +``` + +[]{#4-add-it-to-elasticSearch} + +## 4. Add it to ElasticSearch [\#](#4-add-it-to-elasticSearch "Direct link to 4. Add it to ElasticSearch"){.anchor aria-label="Direct link to 4. Add it to ElasticSearch"} + +With our code in place (and synced into our Docker container with a +mounted volume), it\'s time to compile it. In the Docker shell you +started up in step #1, build your plugin: + +`> gradle build` + +Assuming that works, you should now see a `build` directory with a bunch +of stuff in it. The file you care about is +`build/distributions/contains-multiple-0.0.1.zip` (though that\'ll +obviously change if you call your plugin something different or give it +a different version number).
Grab that file and copy it to where you +plan to actually run ElasticSearch. I placed it in a folder +called `.docker/elastic` in the main project repo. In that same +directory, create a new `Dockerfile` that\'ll actually run Elastic: + +``` {.code-block .line-numbers} +FROM docker.elastic.co/elasticsearch/elasticsearch:7.11.1 + +COPY .docker/elastic/contains-multiple-0.0.1.zip /plugins/contains-multiple-0.0.1.zip + +RUN elasticsearch-plugin install \ + file:///plugins/contains-multiple-0.0.1.zip +``` + +Then, in your project root, create the following `docker-compose.yml`: + +``` {.code-block .line-numbers} +version: '3.2' + +services: + elasticsearch: + image: projectname_elasticsearch + build: + context: . + dockerfile: ./.docker/elastic/Dockerfile + ports: + - 9200:9200 + environment: + - discovery.type=single-node + - script.allowed_types=inline + - script.allowed_contexts=filter +``` + +Those last couple lines are pretty important and your script won\'t work +without them. Build your image with `docker-compose build` and then +start Elastic with `docker-compose up`. + +[]{#5-use-your-plugin} + +## 5. Use your plugin [\#](#5-use-your-plugin "Direct link to 5. Use your plugin"){.anchor aria-label="Direct link to 5. Use your plugin"} + +To actually see the plugin in action, first create an index and add some +documents (I\'ll assume you\'re able to do this if you\'ve read this far +into this post).
Then, make a query with `curl` (or your Elastic wrapper +of choice), substituting `full_text`, `yabba` and `index_name` with +whatever makes sense for you: + +``` {.code-block .line-numbers} +> curl -H "content-type: application/json" \ +-d ' +{ + "query": { + "bool": { + "filter": { + "script": { + "script": { + "source": "contains_multiple", + "lang": "expert_scripts", + "params": { + "field": "full_text", + "term": "yabba", + "count": 3 + } + } + } + } + } + } +}' \ +"localhost:9200/index_name/_search?pretty" +``` + +The result should be something like: + +``` {.code-block .line-numbers} +{ + "took" : 6, + "timed_out" : false, + "_shards" : { + "total" : 1, + "successful" : 1, + "skipped" : 0, + "failed" : 0 + }, + "hits" : { + "total" : { + "value" : 1, + "relation" : "eq" + }, + "max_score" : 0.0, + "hits" : [ + { + "_index" : "index_name", + "_type" : "_doc", + "_id" : "10", + ... +``` + +So that\'s that, an ElasticSearch plugin from start-to-finish. I\'m sure +there are better ways to do some of this stuff, and if you\'re aware of +any, let us know in the comments or write your own dang blog. diff --git a/content/elsewhere/level-up-your-shell-game/index.md b/content/elsewhere/level-up-your-shell-game/index.md new file mode 100644 index 0000000..9030339 --- /dev/null +++ b/content/elsewhere/level-up-your-shell-game/index.md @@ -0,0 +1,301 @@ +--- +title: "Level Up Your Shell Game" +date: 2013-10-24T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/level-up-your-shell-game/ +--- + +The Viget dev team was recently relaxing by the fireplace, sipping a +fine cognac out of those fancy little glasses, when the conversation +turned (as it often does) to the Unix command line.
We have good systems +in place for sharing Ruby techniques ([pull request code +reviews](https://viget.com/extend/developer-ramp-up-with-pull-requests)) +and [Git tips](https://viget.com/extend/a-gaggle-of-git-tips), but +everyone seemed to have a simple, useful command-line trick or two that +the rest of the team had never encountered. Here are a few of our +favorites: + +- [Keyboard + Shortcuts](https://viget.com/extend/level-up-your-shell-game#keyboard-shortcuts) +- [Aliases](https://viget.com/extend/level-up-your-shell-game#aliases) +- [History + Expansions](https://viget.com/extend/level-up-your-shell-game#history-expansions) +- [Argument + Expansion](https://viget.com/extend/level-up-your-shell-game#argument-expansion) +- [Customizing + `.inputrc`](https://viget.com/extend/level-up-your-shell-game#customizing-inputrc) +- [Viewing Processes on a Given Port with + `lsof`](https://viget.com/extend/level-up-your-shell-game#viewing-processes-on-a-given-port-with-lsof) +- [SSH + Configuration](https://viget.com/extend/level-up-your-shell-game#ssh-configuration) +- [Invoking Remote Commands with + SSH](https://viget.com/extend/level-up-your-shell-game#invoking-remote-commands-with-ssh) + +Ready to get your +![](https://github.global.ssl.fastly.net/images/icons/emoji/neckbeard.png){.no-border +align="top" height="24" +style="display: inline; vertical-align: top; width: 24px !important; height: 24px !important;"} +on? Good. Let's go. 
+ +## Keyboard Shortcuts + +[**Mike:**](https://viget.com/about/team/mackerman) I recently +discovered a few simple Unix keyboard shortcuts that save me some time: + + Shortcut Result + ---------------------- ---------------------------------------------------------------------------- + `ctrl + u` Deletes the portion of your command **before** the current cursor position + `ctrl + w` Deletes the **word** preceding the current cursor position + `ctrl + left arrow` Moves the cursor to the **left by one word** + `ctrl + right arrow` Moves the cursor to the **right by one word** + `ctrl + a` Moves the cursor to the **beginning** of your command + `ctrl + e` Moves the cursor to the **end** of your command + +Thanks to [Lawson Kurtz](https://viget.com/about/team/lkurtz) for +pointing out the beginning and end shortcuts + +## Aliases + +[**Eli:**](https://viget.com/about/team/efatsi) Sick of typing +`bundle exec rake db:test:prepare` or other long, exhausting lines of +terminal commands? Me too. Aliases can be a big help in alleviating the +pain of typing common commands over and over again. + +They can be easily created in your `~/.bash_profile` file, and have the +following syntax: + + alias gb="git branch" + +I've got a whole slew of git and rails related ones that are fairly +straight-forward: + + alias ga="git add .; git add -u ." + alias glo='git log --pretty=format:"%h%x09%an%x09%s"' + alias gpro="git pull --rebase origin" + ... + alias rs="rails server" + +And a few others I find useful: + + alias editcommit="git commit --amend -m" + alias pro="cd ~/Desktop/Projects/" + alias s.="subl ." + alias psgrep="ps aux | grep" + alias cov='/usr/bin/open -a "/Applications/Google Chrome.app" coverage/index.html' + +If you ever notice yourself typing these things out over and over, pop +into your `.bash_profile` and whip up some of your own! 
If +`~/.bash_profile` is hard for you to remember like it is for me, nothing +an alias can't fix: `alias editbash="open ~/.bash_profile"`. + +**Note**: you'll need to open a new Terminal window for changes in +`~/.bash_profile` to take effect. + +## History Expansions + +[**Chris:**](https://viget.com/about/team/cjones) Here are some of my +favorite tricks for working with your history. + +**`!!` - previous command** + +How many times have you run a command and then immediately re-run it +with `sudo`? The answer is all the time. You could use the up arrow and +then [Mike](https://viget.com/about/team/mackerman)'s `ctrl-a` shortcut +to insert at the beginning of the line. But there's a better way: `!!` +expands to the entire previous command. Observe: + + $ rm path/to/thing + Permission denied + $ sudo !! + sudo rm path/to/thing + +**`!$` - last argument of the previous command** + +How many times have you run a command and then run a different command +with the same argument? The answer is all the time. Don't retype it, use +`!$`: + + $ mkdir path/to/thing + $ cd !$ + cd path/to/thing + +**`!<string>` - most recent command starting with `<string>`** + +Here's a quick shortcut for running the most recent command that *starts +with* the provided string: + + $ rake db:migrate:reset db:seed + $ rails s + $ !rake # re-runs that first command + +**`!<n>` - numbered command** + +All of your commands are stored in `~/.bash_history`, which you can view +with the `history` command. Each entry has a number, and you can use +`!<n>` to run that specific command. Try it with `grep` to filter +for specific commands: + + $ history | grep heroku + 492 heroku run rake search:reindex -r production + 495 heroku maintenance:off -r production + 496 heroku run rails c -r production + $ !495 + +This technique is perfect for an alias: + + $ alias h?="history | grep" + $ h?
heroku + 492 heroku run rake search:reindex -r production + 495 heroku maintenance:off -r production + 496 heroku run rails c -r production + $ !495 + +Sweet. + +## Argument Expansion + +[**Ryan:**](https://viget.com/about/team/rfoster) For commands that take +multiple, similar arguments, you can use `{old,new}` to expand one +argument into two or more. For example: + + mv app/models/foo.rb app/models/foobar.rb + +can be + + mv app/models/{foo,foobar}.rb + +or even + + mv app/models/foo{,bar}.rb + +## Customizing .inputrc {#customizing-inputrc} + +[**Brian:**](https://viget.com/about/team/blandau) One of the things I +have found to be a big time saver when using my terminal is configuring +keyboard shortcuts. Luckily if you're still using bash (which I am), you +can configure shortcuts and use them in a number of other REPLs that all +use readline. You can [configure readline keyboard shortcuts by editing +your `~/.inputrc` +file](http://cnswww.cns.cwru.edu/php/chet/readline/readline.html#SEC9). +Each line in the file defines a shortcut. It's made up of two parts, the +key sequence, and the command or macro. Here are three of my favorites: + +1. `"\ep": history-search-backward`: This will map to escape-p and will + allow you to search for completions to the current line from your + history. For instance, it will allow you to type "`git`" into your + shell and then hit escape-p to cycle through all the git commands + you have used recently looking for the correct completion. +2. `"\t": menu-complete`: I always hated that when I tried to tab + complete something and then I'd get a giant list of possible + completions. By adding this line you can instead use tab to cycle + through all the possible completions stopping on which ever one is + the correct one. +3. `"\C-d": kill-whole-line`: There's a built-in key command for + killing a line after the cursor (control-k), but no way to kill the + whole line. This solves that. 
After adding this to your `.inputrc` + just type control-d from anywhere on the line and the whole line is + gone and you're ready to start fresh. + +Don't like what I mapped these commands to? Feel free to use different +keyboard shortcuts by changing that first part in quotes. There's a lot +more you can do, just check out [all the commands you can +assign](http://cnswww.cns.cwru.edu/php/chet/readline/readline.html#SEC13) +or create your own macros. + +## Viewing Processes on a Given Port with lsof + +[**Zachary:**](https://viget.com/about/team/zporter) When working on +projects, I occasionally need to run the application on port 80. While +I could use a tool like [Pow](http://pow.cx/) to accomplish this, I +choose to use [Passenger +Standalone](http://www.modrails.com/documentation/Users%20guide%20Standalone.html). +However, when trying to start Passenger on port 80, I will get a +response that looks something like "The address 0.0.0.0:80 is already in +use by another process". To easily view all processes communicating over +port 80, I use [`lsof`](http://linux.die.net/man/8/lsof) like so: + + sudo lsof -i :80 + +From here, I can pinpoint the culprit and kill it. + +## SSH Configuration + +[**Patrick:**](https://viget.com/about/team/preagan) SSH is a simple +tool to use when you need shell access to a remote server. Everyone is +familiar with the most basic usage: + + $ ssh production.host + +Command-line options give you control over settings such as the user +and private key file that you use to authenticate: + + $ ssh -l www-data -i /Users/preagan/.ssh/viget production.host + +However, managing these options with the command-line is tedious if you +use different private keys for work-related and personal servers.
This +is where your local `.ssh/config` file can help -- by specifying the +host that you connect to, you can set specific options for that +connection: + + # ~/.ssh/config + Host production.host + User www-data + IdentityFile /Users/preagan/.ssh/viget + +Now, simply running `ssh production.host` will use the correct username +and private key when authenticating. Additionally, services that use SSH +as the underlying transport mechanism will honor these settings -- you +can use this with Github to send an alternate private key just as +easily: + + Host github.com + IdentityFile /Users/preagan/.ssh/github + +**Bonus Tip** + +This isn't limited to just setting host-specific options, you can also +use this configuration file to create quick aliases for hosts that +aren't addressable by DNS: + + Host prod + Hostname 192.168.1.1 + Port 6000 + User www-data + IdentityFile /Users/preagan/.ssh/production-key + +All you need to do is run `ssh prod` and you're good to go. For more +information on what settings are available, check out the manual +([`man ssh_config`](http://linux.die.net/man/5/ssh_config)). + +## Invoking Remote Commands with SSH + +[**David**:](https://viget.com/about/team/deisinger) You're already +using SSH to launch interactive sessions on your remote servers, but DID +YOU KNOW you can also pass the commands you want to run to the `ssh` +program and use the output just like you would a local operation? For +example, if you want to pull down a production database dump, you could: + +1. `ssh` into your production server +2. Run `mysqldump` to generate the data dump +3. Run `gzip` to create a compressed file +4. Log out +5. Use `scp` to grab the file off the remote server + +Or! 
You could use this here one-liner: + + ssh user@host.com "mysqldump -u db_user -h db_host -pdb_password db_name | gzip" > production.sql.gz + +Rather than starting an interactive shell, you're logging in, running +the `mysqldump` command, piping the result into `gzip`, and then taking +the result and writing it to a local file. From there, you could chain +on decompressing the file, importing it into your local database, etc. + +**Bonus tip:** store long commands like this in +[boom](https://github.com/holman/boom) for easy recall. + +------------------------------------------------------------------------ + +Well, that's all we've got for you. Hope you picked up something useful +along the way. What are your go-to command line tricks? Let us know in +the comments. diff --git a/content/elsewhere/local-docker-best-practices/index.md b/content/elsewhere/local-docker-best-practices/index.md new file mode 100644 index 0000000..f426c09 --- /dev/null +++ b/content/elsewhere/local-docker-best-practices/index.md @@ -0,0 +1,345 @@ +--- +title: "Local Docker Best Practices" +date: 2022-05-05T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/local-docker-best-practices/ +--- + +Here at Viget, Docker has become an indispensable tool for local +development. We build and maintain a ton of apps across the team, +running different stacks and versions, and being able to package up a +working dev environment makes it much, much easier to switch between +apps and ramp up new devs onto projects. That's not to say that +developing with Docker locally isn't without its +drawbacks[^1^](#fn1){#fnref1 .footnote-ref role="doc-noteref"}, but +they're massively outweighed by the ease and convenience it unlocks. + +Over time, we've developed our own set of best practices for effectively +setting Docker up for local development. 
Please note that last bit ("for +local development") -- if you're creating images for deployment +purposes, most of these principles don't apply. Our typical setup +involves the following containers, orchestrated with Docker Compose: + +1. The application (e.g. Rails, Django, or Phoenix) +2. A JavaScript watcher/compiler (e.g. `webpack-dev-server`) +3. A database (typically PostgreSQL) +4. Additional necessary infrastructure (e.g. Redis, ElasticSearch, + Mailhog) +5. Occasionally, additional instances of the app doing things other + than running the development server (think background jobs) + +So with that architecture in mind, here are the best practices we've +tried to standardize on: + +1. [Don\'t put code or app-level dependencies into the + image](#1-dont-put-code-or-app-level-dependencies-into-the-image) +2. [Don\'t use a Dockerfile if you don\'t have + to](#2-dont-use-a-dockerfile-if-you-dont-have-to) +3. [Only reference a Dockerfile once in + `docker-compose.yml`](#3-only-reference-a-dockerfile-once-in-docker-compose-yml) +4. [Cache dependencies in named + volumes](#4-cache-dependencies-in-named-volumes) +5. [Put ephemeral stuff in named + volumes](#5-put-ephemeral-stuff-in-named-volumes) +6. [Clean up after `apt-get update`](#6-clean-up-after-apt-get-update) +7. [Prefer `exec` to `run`](#7-prefer-exec-to-run) +8. [Coordinate services with + `wait-for-it`](#8-coordinate-services-with-wait-for-it) +9. [Start entrypoint scripts with `set -e` and end with + `exec "$@"`](#9-start-entrypoint-scripts-with-set-e-and-end-with-exec) +10. [Target different CPU architectures with + `BUILDARCH`](#10-target-different-cpu-architectures-with-buildarch) +11. [Prefer `docker compose` to + `docker-compose`](#11-prefer-docker-compose-to-docker-compose) + +------------------------------------------------------------------------ + +### 1. Don't put code or app-level dependencies into the image [\#](#1-dont-put-code-or-app-level-dependencies-into-the-image "Direct link to 1. 
Don't put code or app-level dependencies into the image"){.anchor} {#1-dont-put-code-or-app-level-dependencies-into-the-image} + +Your primary Dockerfile, the one the application runs in, should include +all the necessary software to run the app, but shouldn't include the +actual application code itself -- that'll be mounted into the container +when `docker-compose run` starts and synced between the container and +the local machine. + +Additionally, it's important to distinguish between system-level +dependencies (like ImageMagick) and application-level ones (like +Rubygems and NPM packages) -- the former should be included in the +Dockerfile; the latter should not. Baking application-level dependencies +into the image means that it'll have to be rebuilt every time someone +adds a new one, which is both time-consuming and error-prone. Instead, +we install those dependencies as part of a startup script. + +### 2. Don't use a Dockerfile if you don't have to [\#](#2-dont-use-a-dockerfile-if-you-dont-have-to "Direct link to 2. Don't use a Dockerfile if you don't have to"){.anchor} {#2-dont-use-a-dockerfile-if-you-dont-have-to} + +With point #1 in mind, you might find you don't need to write a +Dockerfile at all. If your app doesn't have any special dependencies, +you might be able to point your `docker-compose.yml` entry right at the +official Docker repository (i.e. just reference `ruby:2.7.6`). This +isn't very common -- most apps and frameworks require some amount of +infrastructure (e.g. Rails needs a working version of Node), but if you +find yourself with a Dockerfile that contains just a single `FROM` line, +you can just cut it. + +### 3. Only reference a Dockerfile once in `docker-compose.yml` [\#](#3-only-reference-a-dockerfile-once-in-docker-compose-yml "Direct link to 3. 
Only reference a Dockerfile once in docker-compose.yml"){.anchor} {#3-only-reference-a-dockerfile-once-in-docker-compose-yml} + +If you're using the same image for multiple services (which you +should!), only provide the build instructions in the definition of a +single service, assign a name to it, and then reference that name for +the additional services. So as an example, imagine a Rails app that uses +a shared image for running the development server and +`webpack-dev-server`. An example configuration might look like this: + + services: + rails: + image: appname_rails + build: + context: . + dockerfile: ./.docker-config/rails/Dockerfile + command: ./bin/rails server -p 3000 -b '0.0.0.0' + + node: + image: appname_rails + command: ./bin/webpack-dev-server + +This way, when we build the services (with `docker-compose build`), our +image only gets built once. If instead we'd omitted the `image:` +directives and duplicated the `build:` one, we'd be rebuilding the exact +same image twice, wasting your disk space and limited time on this +earth. + +### 4. Cache dependencies in named volumes [\#](#4-cache-dependencies-in-named-volumes "Direct link to 4. Cache dependencies in named volumes"){.anchor} {#4-cache-dependencies-in-named-volumes} + +As mentioned in point #1, we don't bake code dependencies into the image +and instead install them on startup. As you can imagine, this would be +pretty slow if we installed every gem/pip/yarn library from scratch each +time we restarted the services (hello NOKOGIRI), so we use Docker's +named volumes to keep a cache. The config above might become something +like: + + volumes: + gems: + yarn: + + services: + rails: + image: appname_rails + build: + context: . 
+ dockerfile: ./.docker-config/rails/Dockerfile + command: ./bin/rails server -p 3000 -b '0.0.0.0' + volumes: + - .:/app + - gems:/usr/local/bundle + - yarn:/app/node_modules + + node: + image: appname_rails + command: ./bin/webpack-dev-server + volumes: + - .:/app + - yarn:/app/node_modules + +Where specifically you should mount the volumes to will vary by stack, +but the same principle applies: keep the compiled dependencies in named +volumes to massively decrease startup time. + +### 5. Put ephemeral stuff in named volumes [\#](#5-put-ephemeral-stuff-in-named-volumes "Direct link to 5. Put ephemeral stuff in named volumes"){.anchor} {#5-put-ephemeral-stuff-in-named-volumes} + +While we're on the subject of using named volumes to increase +performance, here's another hot tip: put directories that hold files you +don't need to edit into named volumes to stop them from being synced +back to your local machine (which carries a big performance cost). I'm +thinking specifically of `log` and `tmp` directories, in addition to +wherever your app stores uploaded files. A good rule of thumb is, if +it's `.gitignore`'d, it's a good candidate for a volume. + +### 6. Clean up after `apt-get update` [\#](#6-clean-up-after-apt-get-update "Direct link to 6. Clean up after apt-get update"){.anchor} {#6-clean-up-after-apt-get-update} + +If you use Debian-based images as the starting point for your +Dockerfiles, you've noticed that you have to run `apt-get update` before +you're able to `apt-get install` your dependencies. If you don't take +precautions, this is going to cause a bunch of additional data to get +baked into your image, drastically increasing its size. Best practice is +to do the update, install, and cleanup in a single `RUN` command: + + RUN apt-get update && \ + apt-get install -y libgirepository1.0-dev libpoppler-glib-dev && \ + rm -rf /var/lib/apt/lists/* + +### 7. Prefer `exec` to `run` [\#](#7-prefer-exec-to-run "Direct link to 7.
Prefer exec to run"){.anchor} {#7-prefer-exec-to-run} + +If you need to run a command inside a container, you have two options: +`run` and `exec`. The former is going to spin up a new container to run +the command, while the latter attaches to an existing running container. + +In almost every instance, assuming you pretty much always have the +services running while you're working on the app, `exec` (and +specifically `docker-compose exec`) is what you want. It's faster to +spin up and doesn't carry any chance of leaving weird artifacts around +(which will happen if you're not careful about including the `--rm` flag +with `run`). + +### 8. Coordinate services with `wait-for-it` [\#](#8-coordinate-services-with-wait-for-it "Direct link to 8. Coordinate services with wait-for-it"){.anchor} {#8-coordinate-services-with-wait-for-it} + +Given our dependence on shared images and volumes, you may encounter +issues where one of your services starts before another service's +`entrypoint` script finishes executing, leading to errors. When this +occurs, we'll pull in the [`wait-for-it` utility +script](https://github.com/vishnubob/wait-for-it), which takes a web +location to check against and a command to run once that location sends +back a response. Then we update our `docker-compose.yml` to use it: + + volumes: + gems: + yarn: + + services: + rails: + image: appname_rails + build: + context: . + dockerfile: ./.docker-config/rails/Dockerfile + command: ./bin/rails server -p 3000 -b '0.0.0.0' + volumes: + - .:/app + - gems:/usr/local/bundle + - yarn:/app/node_modules + + node: + image: appname_rails + command: [ + "./.docker-config/wait-for-it.sh", + "rails:3000", + "--timeout=0", + "--", + "./bin/webpack-dev-server" + ] + volumes: + - .:/app + - yarn:/app/node_modules + +This way, `webpack-dev-server` won't start until the Rails development +server is fully up and running. + +[]{#9-start-entrypoint-scripts-with-set-e-and-end-with-exec} + +### 9. 
Start entrypoint scripts with `set -e` and end with `exec "$@"` [\#](#9-start-entrypoint-scripts-with-set-e-and-end-with-exec "Direct link to 9. Start entrypoint scripts with set -e and end with exec "$@""){.anchor aria-label="Direct link to 9. Start entrypoint scripts with set -e and end with exec \"$@\""} + +The setup we\'ve described here depends a lot on using +[entrypoint](https://docs.docker.com/compose/compose-file/#entrypoint) +scripts to install dependencies and manage other setup. There are two +things you should include in **every single one** of these scripts, one +at the beginning, one at the end: + +- At the top of the file, right after `#!/bin/bash` (or similar), put + `set -e`. This will ensure that the script exits if any line exits + with an error. +- At the end of the file, put `exec "$@"`. Without this, the + instructions you pass in with the + [command](https://docs.docker.com/compose/compose-file/#command) + directive won\'t execute. + +[Here\'s a good StackOverflow +answer](https://stackoverflow.com/a/48096779) with some more +information. + +### 10. Target different CPU architectures with `BUILDARCH` [\#](#10-target-different-cpu-architectures-with-buildarch "Direct link to 10. Target different CPU architectures with BUILDARCH"){.anchor} {#10-target-different-cpu-architectures-with-buildarch} + +We're presently about evenly split between Intel and Apple Silicon +laptops. Most of the common base images you pull from +[DockerHub](https://hub.docker.com/) are multi-platform (for example, +look at the "OS/Arch" dropdown for the [Ruby +image](https://hub.docker.com/layers/library/ruby/2.7.6/images/sha256-1af3ca0ab535007d18f7bc183cc49c228729fc10799ba974fbd385889e4d658a?context=explore)), +and Docker will pull the correct image for the local architecture. +However, if you're doing anything architecture-specific in your +Dockerfiles, you might encounter difficulties. 
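
If you're not sure which side of that split a given machine falls on, here's a rough sketch that maps the output of `uname -m` to the architecture names Docker uses (`amd64` and `arm64`). The `docker_arch` function name is made up for illustration, and the mapping only covers the two architectures discussed here:

```shell
# Hypothetical helper: translate a `uname -m` machine name into the
# architecture label Docker uses. Only amd64 and arm64 are handled.
docker_arch() {
  case "$1" in
    x86_64|amd64)  echo "amd64" ;;    # Intel/AMD 64-bit
    aarch64|arm64) echo "arm64" ;;    # Apple Silicon and other 64-bit ARM
    *)             echo "unknown" ;;  # anything else is out of scope here
  esac
}

# Report the label for the machine this is running on.
docker_arch "$(uname -m)"
```

Anything that comes back as `unknown` is outside what this post covers, so treat it as a prompt to check the base image's "OS/Arch" list by hand.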
+ +As mentioned previously, we'll often need a specific version of Node.js +running inside a Ruby-based image. A way we'd commonly set this up is +something like this: + + FROM ruby:2.7.6 + + RUN curl -sS https://nodejs.org/download/release/v16.17.0/node-v16.17.0-linux-x64.tar.gz \ + | tar xzf - --strip-components=1 -C "/usr/local" + +This works fine on Intel Macs, but blows up on Apple Silicon -- notice +the `x64` in the above URL? That needs to be `arm64` on an M1. The +easiest option is to specify `platform: linux/amd64` for each service +using this image in your `docker-compose.yml`, but that's going to put +Docker into emulation mode, which has performance drawbacks as well as +[other known +issues](https://docs.docker.com/desktop/mac/apple-silicon/#known-issues). + +Fortunately, Docker exposes a handful of [platform-related +arguments](https://docs.docker.com/engine/reference/builder/#automatic-platform-args-in-the-global-scope) +we can lean on to target specific architectures. We'll use `BUILDARCH`, +the architecture of the local machine. While there's no native +conditional functionality in the Dockerfile spec, we can do a little bit +of shell scripting inside of a `RUN` command to achieve the desired +result: + + FROM ruby:2.7.6 + + ARG BUILDARCH + + RUN if [ "$BUILDARCH" = "arm64" ]; \ + then curl -sS https://nodejs.org/download/release/v16.17.0/node-v16.17.0-linux-arm64.tar.gz \ + | tar xzf - --strip-components=1 -C "/usr/local"; \ + else curl -sS https://nodejs.org/download/release/v16.17.0/node-v16.17.0-linux-x64.tar.gz \ + | tar xzf - --strip-components=1 -C "/usr/local"; \ + fi + +This way, a dev running on Apple Silicon will download and install +`node-v16.17.0-linux-arm64`, and someone with Intel will use +`node-v16.17.0-linux-x64`. + +### 11. Prefer `docker compose` to `docker-compose` [\#](#11-prefer-docker-compose-to-docker-compose "Direct link to 11.
Prefer docker compose to docker-compose"){.anchor} {#11-prefer-docker-compose-to-docker-compose}
+
+Though both `docker compose up` and `docker-compose up` (with or without
+a hyphen) work to spin up your containers, per this [helpful
+StackOverflow answer](https://stackoverflow.com/a/66516826),
+"`docker compose` (with a space) is a newer project to migrate compose
+to Go with the rest of the docker project."
+
+*Thanks [Dylan](https://www.viget.com/about/team/dlederle-ensign/) for
+this one.*
+
+So there you have it, a short list of the best practices we've developed
+over the last several years of working with Docker. We'll try to keep
+this list updated as we get better at doing and documenting this stuff.
+
+If you're interested in reading more, here are a few good links:
+
+-   [Ruby on Whales: Dockerizing Ruby and Rails
+    development](https://evilmartians.com/chronicles/ruby-on-whales-docker-for-ruby-rails-development)
+-   [Docker: Right for Us. Right for
+    You?](https://www.viget.com/articles/docker-right-for-us-right-for-you-1/)
+-   [Docker + Rails: Solutions to Common
+    Hurdles](https://www.viget.com/articles/docker-rails-solutions-to-common-hurdles/)
+
+------------------------------------------------------------------------
+
+1.  
[Namely, there's a significant performance hit when running Docker + on Mac (as we do) in addition to the cognitive hurdle of all your + stuff running inside containers. If I worked at a product shop, + where I was focused on a single codebase for the bulk of my time, + I'd think hard before going all in on local + Docker.[↩︎](#fnref1){.footnote-back role="doc-backlink"}]{#fn1} diff --git a/content/elsewhere/maintenance-matters-continuous-integration/index.md b/content/elsewhere/maintenance-matters-continuous-integration/index.md new file mode 100644 index 0000000..a6413bf --- /dev/null +++ b/content/elsewhere/maintenance-matters-continuous-integration/index.md @@ -0,0 +1,122 @@ +--- +title: "Maintenance Matters: Continuous Integration" +date: 2022-08-26T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/maintenance-matters-continuous-integration/ +--- + +*This article is part of a series focusing on how developers can center +and streamline software maintenance. *The other articles in the +Maintenance Matters series are: **[Code +Coverage](https://www.viget.com/articles/maintenance-matters-code-coverage/){target="_blank"}, +**[Documentation](https://www.viget.com/articles/maintenance-matters-documentation/){target="_blank"},**** +[Default +Formatting](https://www.viget.com/articles/maintenance-matters-default-formatting/){target="_blank"}, [Building +Helpful +Logs](https://www.viget.com/articles/maintenance-matters-helpful-logs/){target="_blank"}, +[Timely +Upgrades](https://www.viget.com/articles/maintenance-matters-timely-upgrades/){target="_blank"}, +and [Code +Reviews](https://www.viget.com/articles/maintenance-matters-code-reviews/){target="_blank"}.** + +As Annie said in her [intro +post](https://www.viget.com/articles/maintenance-matters/): + +> There are many factors that go into a successful project, but in this +> series, we're focusing on the small things that developers usually +> have control over. 
Over the next few months, we'll be expanding on +> many of these in separate articles. + +Today I'd like to talk to you about **Continuous Integration**, as I +feel strongly that it's something no software effort should be without. +Now, before we start, I should clarify: +[Wikipedia](https://en.wikipedia.org/wiki/Continuous_integration) +defines Continuous Integration as "the practice of merging all +developers' working copies to a shared mainline several times a day." +Maybe this was a revolutionary idea in 1991? I don't know, I was in +second grade. Nowadays, at least at Viget, the whole team frequently +merging their work into a common branch is the noncontroversial default. + +For the purposes of this Maintenance Matters article, I'll be focused on +this aspect of CI: + +> In addition to automated unit tests, organisations using CI typically +> use a build server to implement continuous processes of applying +> quality control in general -- small pieces of effort, applied +> frequently. + +If you're not familiar with the concept, it's pretty simple: a typical +Viget dev project includes one or more [GitHub Action +Workflows](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions) +that define a series of tasks that should run every time code is pushed +to the central repository. At a minimum, the workflow checks out the +code, installs the necessary dependencies, and runs the automated test +suite. In most cases, pushes to the `main` branch will trigger automatic +deployment to an internal QA environment (a process known as [continuous +deployment](https://en.wikipedia.org/wiki/Continuous_deployment)). If +any step fails (e.g. the tests don't pass or a dependency can't be +installed), the process aborts and the team gets notified. + +CI is a very tactical, concrete thing you do, but more than that, it's a +mindset -- it's your team's values made concrete. 
It's one thing to say +"all projects must have 100% code coverage"; it's another thing entirely +to move your deploys into a CI task that only runs after the coverage +check, so that nothing can go live until it's fully tested. Continuous +Integration is code that improves the way you write code, and a +commitment to continuous improvement. + +So what can you do with Continuous Integration? I've mentioned the two +primary tasks (running tests and automated deployment), but that's +really just the tip of the iceberg. You can also: + +- Check code coverage +- Run linters (like `rubocop`, `eslint`, or `prettier`) to enforce + coding standards +- Scan for security issues with your dependencies +- Tag releases in Sentry (or your error tracking tool of choice) +- Deploy feature branches to + [Vercel](https://vercel.com/)/[Netlify](https://www.netlify.com/)/[Fly.io](https://fly.io/) + for easy previews during code review +- Build Docker images and push them to a registry +- Create release artifacts + +Really, anything a computer can do, a CI runner can do: + +- Send messages to Slack +- Spin up new servers as part of a blue/green deployment strategy +- Run your seed script, assert that every model has a valid record +- Grep your codebase for git conflict artifacts +- Assert that all images have been properly optimized + +That's not to say you can't overdo it -- you can. It can take a long +time to configure, and workflows can take a long time to run as +codebases grow. It can cost a lot if you're running a lot of builds. It +can be error-prone, with issues that only occur in CI. And it can be +interpersonally fraught -- as I said, it's your team's values made +concrete, and sometimes getting that alignment is the hardest part. + +Nevertheless, I consider some version of CI to be mandatory for any +software project. 
It should be part of initial project setup -- get +aligned with your team on what standards you want to enforce, choose +your CI tool, and get it configured ASAP, ideally before development +begins in earnest. It's much easier to stick with established, codified +standards than to come back and try to add them later. + +As mentioned previously, we're big fans of GitHub Actions and its +seamless integration with the rest of our workflow. [Here's a good guide +for getting started](https://docs.github.com/en/actions/quickstart). +We've also used and enjoyed [CircleCI](https://circleci.com/), [GitLab +CI/CD](https://docs.gitlab.com/ee/ci/), and +[Jenkins](https://www.jenkins.io/). Ultimately, the tool doesn't matter +all that much provided it can reliably trigger jobs on push and report +failures, so find the one that works best for your team. + +That's the what, why, and how of Continuous Integration. Of course, all +this is precipitated by having a high-functioning team. And there's no +[GitHub Action for +**that**](https://github.com/marketplace?type=actions&query=good+development+team), +unfortunately. 
+ +*The next article in this series is [Maintenance Matters: Code +Coverage.](https://www.viget.com/articles/maintenance-matters-code-coverage/)* diff --git a/content/elsewhere/making-an-email-powered-e-paper-picture-frame/index.md b/content/elsewhere/making-an-email-powered-e-paper-picture-frame/index.md new file mode 100644 index 0000000..eeeac46 --- /dev/null +++ b/content/elsewhere/making-an-email-powered-e-paper-picture-frame/index.md @@ -0,0 +1,190 @@ +--- +title: "Making an Email-Powered E-Paper Picture Frame" +date: 2021-05-12T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/making-an-email-powered-e-paper-picture-frame/ +--- + +Over the winter, inspired by this [digital photo +frame](http://toolsandtoys.net/aura-mason-smart-digital-picture-frame/) +that uses email to add new photos, I built and programmed a trio of +e-paper picture frames for my family, and I thought it\'d be cool to +walk through the process in case someone out there wants to try +something similar. + +![image](IMG_0120.jpeg) + +In short, it\'s a Raspberry Pi Zero connected to a roughly 5-by-7-inch +e-paper screen, running some software I wrote in Go and living inside a +frame I put together. This project consists of four main parts: + +1. The email-to-S3 gateway, [described in detail in a previous + post](https://www.viget.com/articles/email-photos-to-an-s3-bucket-with-aws-lambda-with-cropping-in-ruby/); +2. The software to display the photos on the screen; +3. Miscellaneous Raspberry Pi configuration; and +4. The physical frame itself. 
+ +As for materials, you\'ll need the following: + +- [A Raspberry Pi Zero with + headers](https://www.waveshare.com/raspberry-pi-zero-wh.htm) +- [An e-paper + display](https://www.waveshare.com/7.5inch-hd-e-paper-hat.htm) +- A micro SD card (and some way to write to it) +- Some 1x4 lumber (I used oak) +- [4x metal standoffs](https://www.amazon.com/gp/product/B00TX464XQ) +- [A 6x8 piece of + acrylic](https://www.amazon.com/gp/product/B07J4WX7BH) +- Some wood glue to attach the boards, and some wood screws to attach + the standoffs + +I\'ll get more into the woodworking tools down below. + +[]{#the-email-to-s3-gateway} + +## The Email-to-S3 Gateway [\#](#the-email-to-s3-gateway "Direct link to The Email-to-S3 Gateway"){.anchor aria-label="Direct link to The Email-to-S3 Gateway"} + +Like I said, [I\'ve already documented this part pretty +thoroughly](https://www.viget.com/articles/email-photos-to-an-s3-bucket-with-aws-lambda-with-cropping-in-ruby/), +but in short, we use an array of AWS services to set up an email address +that fires off a Lambda function when it receives an email. The function +extracts the attachments from the email, crops them a couple of ways +(one for display on a webpage, the other for display on the screen), and +uploads the results into an S3 bucket. + +![image](Screen_Shot_2021-05-09_at_1_26_39_PM.png) + +[]{#the-software} + +## The Software [\#](#the-software "Direct link to The Software"){.anchor aria-label="Direct link to The Software"} + +The next task was to write the code that runs on the Pi that can update +the display periodically. I also thought it\'d be cool if it could +expose a simple web interface on the local network to let my family +members browse the photos and display them on the frame. 
When selecting +a language, I could have gone with either Ruby or Python, the former +since that\'s what I\'m most familiar with, the latter because that\'s +what [the code provided by +Waveshare](https://github.com/waveshare/e-Paper/tree/master/RaspberryPi_JetsonNano/python/lib/waveshare_epd), +the manufacturer, is written in. + +But I chose neither of those options, reader, opting instead for Go. Why +Go, you ask? + +- **I wanted something robust.** Ideally, this code will run on these + devices for years with no downtime. If something does go wrong, I + won\'t have any way to debug the problems remotely, instead having + to wait until the next time I\'m on the same wifi network with the + failing device. Go\'s explicit error checking was appealing in this + regard. + +- **I wanted deployment to be simple.** I didn\'t have any appetite + for all the configuration required to get a Python or Ruby app + running on the Pi. The fact that I could compile my code into a + single binary that I could `scp` onto the device and manage with + `systemd` was compelling. + +- **I wanted a web UI**, but it wasn\'t the main focus. With Go, I + could just import the built-in `net/http` to add simple web + functionality. + +To interface with the screen, I started with [this super awesome GitHub +project](https://github.com/gandaldf/rpi). Out of the box, it didn\'t +work with my screen, I *think* because Waveshare offers a bunch of +different screens and the specific instructions differ between them. So +I forked it and found the specific Waveshare Python code that worked +with my screen ([this +one](https://github.com/waveshare/e-Paper/blob/master/RaspberryPi_JetsonNano/python/lib/waveshare_epd/epd7in5_HD.py), +I believe), and then it was just a matter of updating the Go code to +match the Python, which was tricky because I don\'t know very much about +low-level electronics programming, but also pretty easy since the Go and +Python are set up in pretty much the same way. 
+
+[Here\'s my
+fork](https://github.com/dce/rpi/blob/master/epd7in5/epd7in5.go) --- if
+you go with the exact screen I linked to above, it *should* work, but
+there\'s a chance you\'ll end up having to do what I did and customize it
+to match Waveshare\'s official source.
+
+Writing the main Go program was a lot of fun. I managed to do it all ---
+interfacing with the screen, displaying a random photo, and serving up a
+web interface --- in one (IMO) pretty clean file. [Here\'s the
+source](https://github.com/dce/e-paper-frame), and I\'ve added some
+scripts to hopefully make hacking on it a bit easier.
+
+[]{#configuring-the-raspberry-pi}
+
+## Configuring the Raspberry Pi [\#](#configuring-the-raspberry-pi "Direct link to Configuring the Raspberry Pi"){.anchor aria-label="Direct link to Configuring the Raspberry Pi"}
+
+Setting up the Pi was pretty straightforward, though not without a lot
+of trial-and-error the first time through:
+
+1.  Flash Raspberry Pi OS onto the SD card
+2.  [Configure your wifi
+    information](https://www.raspberrypi.org/documentation/configuration/wireless/wireless-cli.md)
+    and [enable
+    SSH](https://howchoo.com/g/ote0ywmzywj/how-to-enable-ssh-on-raspbian-without-a-screen#create-an-empty-file-called-ssh)
+3.  Plug it in --- if it doesn\'t join your network, you probably messed
+    something up in step 2
+4.  SSH in (`ssh pi@<192.168.XXX.XXX>`, password `raspberry`) and put
+    your public key in `~/.ssh/authorized_keys`
+5.  Go ahead and run a full system update
+    (`sudo apt update && sudo apt upgrade -y`)
+6.  Install the AWS CLI and NTP (`sudo apt-get install awscli ntp`)
+7.  You\'ll need some AWS credentials --- if you already have a local
+    `~/.aws/config`, just put that file in the same place on the Pi; if
+    not, run `aws configure`
+8.  Enable SPI --- run `sudo raspi-config`, then select \"Interface
+    Options\", \"SPI\"
+9.  Upload `frame-server-arm` from your local machine using `scp`; I
+    have it living in `/home/pi/frame`
+10. 
Copy the [cron + script](https://github.com/dce/e-paper-frame/blob/main/etc/random-photo) + into `/etc/cron.hourly` and make sure it has execute permissions + (then give it a run to pull in the initial photos) +11. Add a line into the root user\'s crontab to run the script on + startup: `@reboot /etc/cron.hourly/random-photo` +12. Copy the [`systemd` + service](https://github.com/dce/e-paper-frame/blob/main/etc/frame-server.service) + into `/etc/systemd/system`, then enable and start it + +And that should be it. The photo gallery should be accessible at a local +IP and the photo should update hourly (though not ON the hour as that\'s +not how `cron.hourly` works for some reason). + +![image](IMG_0122.jpeg) + +[]{#building-the-frame} + +## Building the Frame [\#](#building-the-frame "Direct link to Building the Frame"){.anchor aria-label="Direct link to Building the Frame"} + +This part is strictly optional, and there are lots of ways you can +display your frame. I took (a lot of) inspiration from this [\"DIY +Modern Wood and Acrylic Photo +Stand\"](https://evanandkatelyn.com/2017/10/modern-wood-and-acrylic-photo-stand/) +with just a few modifications: + +- I used just one sheet of acrylic instead of two +- I used a couple small pieces of wood with a shallow groove to create + a shelf for the screen to rest on +- I used a drill press to make a 3/4\" hole in the middle of the board + to run the cable through +- I didn\'t bother with the pocket holes --- wood glue is plenty + strong + +The tools I used were: a table saw, a miter saw, a drill press, a +regular cordless drill (**do not** try to make the larger holes in the +acrylic with a drill press omfg), an orbital sander, and some 12\" +clamps. I\'d recommend starting with some cheap pine before using nicer +wood --- you\'ll probably screw something up the first time if you\'re +anything like me. + +This project was a lot of fun. 
Each part was pretty simple --- I\'m +certainly no expert at AWS, Go programming, or woodworking --- but +combined together they make something pretty special. Thanks for +reading, and I hope this inspires you to make something for your mom or +someone else special to you. + +*Raspberry Pi illustration courtesy of [Jonathan +Rutheiser](https://commons.wikimedia.org/wiki/File:Raspberry_Pi_Vector_Illustration.svg)* diff --git a/content/elsewhere/manual-cropping-with-paperclip/index.md b/content/elsewhere/manual-cropping-with-paperclip/index.md new file mode 100644 index 0000000..c0adf29 --- /dev/null +++ b/content/elsewhere/manual-cropping-with-paperclip/index.md @@ -0,0 +1,81 @@ +--- +title: "Manual Cropping with Paperclip" +date: 2012-05-31T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/manual-cropping-with-paperclip/ +--- + +It's relatively straightforward to add basic manual (browser-based) +cropping support to your +[Paperclip](https://github.com/thoughtbot/paperclip) image attachments. +See [RJCrop](https://github.com/jschwindt/rjcrop) for one valid +approach. What's not so straightforward, though, is adding manual +cropping while preserving Paperclip's built-in thumbnailing +capabilities. Here's how. + +Just so we're on the same page, when we're talking about "thumbnailing," +we're talking about the ability to set a size of `50x50#`, which means +"scale and crop the image into a 50 by 50 pixel square." If the original +image is 200x100, it would first be scaled down to 100x50, and then 25 +pixels trimmed from both sides to arrive at the final dimensions. This +is not a native capability of ImageMagick, but rather the result of some +decently complex code in Paperclip. + +Our goal is to allow a user to select a portion of an image and then +create a thumbnail of *just that selected portion*, ideally taking +advantage of Paperclip\'s existing cropping/scaling logic. 
+
+Any time you're dealing with custom Paperclip image processing, you're
+talking about creating a custom
+[Processor](https://github.com/thoughtbot/paperclip#post-processing). In
+this case, we'll be subclassing the default
+[Thumbnail](https://github.com/thoughtbot/paperclip/blob/master/lib/paperclip/thumbnail.rb)
+processor and making a few small tweaks. We'll imagine you have a model
+with the fields `crop_x`, `crop_y`, `crop_width`, and `crop_height`. How
+those get set is left as an exercise for the reader (though I recommend
+[JCrop](http://deepliquid.com/content/Jcrop.html)). Some code, then:
+
+    module Paperclip
+      class ManualCropper < Thumbnail
+        def initialize(file, options = {}, attachment = nil)
+          super
+          @current_geometry.width = target.crop_width
+          @current_geometry.height = target.crop_height
+        end
+
+        def target
+          @attachment.instance
+        end
+
+        def transformation_command
+          crop_command = [
+            "-crop",
+            "#{target.crop_width}x" \
+              "#{target.crop_height}+" \
+              "#{target.crop_x}+" \
+              "#{target.crop_y}",
+            "+repage"
+          ]
+
+          crop_command + super
+        end
+      end
+    end
+
+In our `initialize` method, we call super, which sets a whole host of
+instance variables, including `@current_geometry`, which is responsible
+for creating the geometry string that will crop and scale our image. We
+then set its `width` and `height` to be the dimensions of our cropped
+image.
+
+We also override the `transformation_command` method, prepending our
+manual crop to the instructions provided by `@current_geometry`. The end
+result is a geometry string which crops the image, repages it, then
+scales the image and crops it a second time. Simple, but certainly
+not intuitive, at least not to me.
+
+From here, you can include this cropper using the `:processors`
+directive in your `has_attached_file` declaration, and you should be
+good to go. This simple approach assumes that the crop dimensions will
+always be set, so tweak accordingly if that's not the case.
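To make the manual-crop arguments concrete, here is a minimal standalone sketch of the array that the overridden `transformation_command` prepends. It deliberately doesn't load Paperclip: `CropBox` and `crop_arguments` are hypothetical names I'm using for illustration, standing in for a model with the `crop_*` fields described above.

```ruby
# Hypothetical stand-in for a model exposing crop_x, crop_y, crop_width,
# and crop_height (not part of Paperclip).
CropBox = Struct.new(:crop_x, :crop_y, :crop_width, :crop_height)

# Builds the ImageMagick arguments for a manual crop: "-crop WxH+X+Y",
# followed by "+repage" so the canvas is reset to the cropped area.
def crop_arguments(box)
  [
    "-crop",
    "#{box.crop_width}x#{box.crop_height}+#{box.crop_x}+#{box.crop_y}",
    "+repage"
  ]
end

box = CropBox.new(10, 20, 100, 50)
puts crop_arguments(box).join(" ")
```

In the real processor these arguments come first, so the thumbnail scaling and cropping that follows operates only on the user's selection.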
diff --git a/content/elsewhere/motivated-to-code/index.md b/content/elsewhere/motivated-to-code/index.md
new file mode 100644
index 0000000..601d328
--- /dev/null
+++ b/content/elsewhere/motivated-to-code/index.md
@@ -0,0 +1,79 @@
+---
+title: "Getting (And Staying) Motivated to Code"
+date: 2009-01-21T00:00:00+00:00
+draft: false
+needs_review: true
+canonical_url: https://www.viget.com/articles/motivated-to-code/
+---
+
+When you're working on code written by another programmer --- whether a
+coworker, open source contributor, or (worst of all) *yourself* from six
+months ago --- it's all too easy to get frustrated and fall into an
+unproductive state. The following are some ways I've found to overcome
+this apprehension and get down to business.
+
+### Tiny Improvements, Tiny Commits
+
+When confronted with a sprawling, outdated codebase, it's easy to get
+overwhelmed. To get started, I suggest making a tiny improvement. Add a
+[named
+scope](http://ryandaigle.com/articles/2008/3/24/what-s-new-in-edge-rails-has-finder-functionality).
+Use a more advanced
+[enumerable](http://www.ruby-doc.org/core/classes/Enumerable.html)
+method. And, as soon as you've finished, commit it. Committing feels
+great and really hammers home that you've accomplished something of
+value. Additionally, committing increases momentum and gives you the
+courage to take on larger changes.
+
+### Make a List
+
+In *Getting Things Done*, [David Allen](http://www.davidco.com/) says,
+
+> You'll invariably feel a relieving of pressure about anything you have
+> a commitment to change or do, when you decide on the very next
+> physical action required to move it forward.
+
+I like to take it a step further: envision the program as I want it to
+be, and then list the steps it will take to get there. Even though the
+list will change substantially along the way, having a path and a
+destination removes a lot of the anxiety of working with unfamiliar
+code.
+ +To manage such lists, I love [Things](https://culturedcode.com/things/), +but a piece of paper works just as well. + +### Delete Something + +As projects grow and requirements change, a lot of code outlives its +usefulness; but it sticks around anyway because, on the surface, its +presence isn't hurting anything. I'm sure you've encountered this --- +hell, I'm sure you've got extraneous code in your current project. When +confronted with such code, delete it. Deleting unused code increases +readability, decreases the likelihood of bugs, and adds to your +understanding of the remaining code. But those reasons aside, it feels +*great*. If I suspect a method isn't being used anywhere, I'll do + + grep -lir "method_name" app/ + +to find all the places where the method name occurs. + +### Stake your Claim + +On one project, I couldn't do any feature development --- or even make +any commits --- until I'd rewritten the entire test suite to use +[Shoulda](http://thoughtbot.com/projects/shoulda/). It was mentally +draining work and took much longer than it shoulda (see what I did +there?). If you need to add functionality to one specific piece of the +site, take the time to address those classes and call it a victory. You +don't have to fix everything at once, and it's much easier to bring code +up to speed one class at a time. With every improvement you make, your +sense of ownership over the codebase will increase and so will your +motivation. + +### In Closing + +As Rails moves from an upstart framework to an established technology, +the number of legacy projects will only increase. But even outside the +scope of Rails development, or working with legacy code at all, I think +maintaining motivation is the biggest challenge we face as developers. +I'd love to hear your tips for getting and staying motivated to code. 
diff --git a/content/elsewhere/multi-line-memoization/index.md b/content/elsewhere/multi-line-memoization/index.md
new file mode 100644
index 0000000..d5db6ad
--- /dev/null
+++ b/content/elsewhere/multi-line-memoization/index.md
@@ -0,0 +1,46 @@
+---
+title: "Multi-line Memoization"
+date: 2009-01-05T00:00:00+00:00
+draft: false
+needs_review: true
+canonical_url: https://www.viget.com/articles/multi-line-memoization/
+---
+
+Here's a quick tip that came out of a code review we did last week. One
+easy way to add caching to your Ruby app is to
+[memoize](https://en.wikipedia.org/wiki/Memoization) the results of
+computationally expensive methods:
+
+``` {#code .ruby}
+def foo
+  @foo ||= expensive_method
+end
+```
+
+The first time the method is called, `@foo` will be `nil`, so
+`expensive_method` will be called and its result stored in `@foo`. On
+subsequent calls, `@foo` will have a value, so the call to
+`expensive_method` will be bypassed. This works well for one-liners, but
+what if our method requires multiple lines to determine its result?
+
+``` {#code .ruby}
+def foo
+  arg1 = expensive_method_1
+  arg2 = expensive_method_2
+  expensive_method_3(arg1, arg2)
+end
+```
+
+A first attempt at memoization yields this:
+
+``` {#code .ruby}
+def foo
+  unless @foo
+    arg1 = expensive_method_1
+    arg2 = expensive_method_2
+    @foo = expensive_method_3(arg1, arg2)
+  end
+  @foo
+end
+```
+
+To me, using `@foo` three times obscures the intent of the method. Let's
+do this instead:
+
+``` {#code .ruby}
+def foo
+  @foo ||= begin
+    arg1 = expensive_method_1
+    arg2 = expensive_method_2
+    expensive_method_3(arg1, arg2)
+  end
+end
+```
+
+This clarifies the role of `@foo` and reduces LOC. Of course, if you use
+the Rails built-in [`memoize`
+method](http://ryandaigle.com/articles/2008/7/16/what-s-new-in-edge-rails-memoization),
+you can avoid accessing these instance variables entirely, but this
+technique has utility in situations where requiring ActiveSupport would
+be overkill.
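If you want to see the `||= begin ... end` form in action, here's a small self-contained sketch. The `Report` class, its `total` method, and the `calls` counter are made up for illustration; the counter is only there to show that the block body runs once. One caveat worth keeping in mind: because `||=` tests truthiness, a block that returns `nil` or `false` will be re-run on every call.

```ruby
# Hypothetical class, used only to demonstrate multi-line memoization.
class Report
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def total
    @total ||= begin
      @calls += 1 # counts how many times the "expensive" body actually runs
      subtotal = 40
      tax = 2
      subtotal + tax
    end
  end
end

report = Report.new
report.total
report.total
puts report.total # 42
puts report.calls # 1
```

If a cached `nil` or `false` is a legitimate result, checking `defined?(@total)` before computing is a common alternative to `||=`.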
diff --git a/content/elsewhere/new-pointless-project-i-dig-durham/index.md b/content/elsewhere/new-pointless-project-i-dig-durham/index.md new file mode 100644 index 0000000..d17acde --- /dev/null +++ b/content/elsewhere/new-pointless-project-i-dig-durham/index.md @@ -0,0 +1,38 @@ +--- +title: "New Pointless Project: I Dig Durham" +date: 2011-02-25T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/new-pointless-project-i-dig-durham/ +--- + +*This post originally appeared on [Pointless +Corp](http://pointlesscorp.com/).* + +There's a lot of love at Viget South for our adopted hometown of Durham, +NC. A few of us decided to use the first [Pointless +Weekend](https://viget.com/flourish/pointless-weekend-3-new-pointless-projects) to +build a tiny application to highlight some of Durham's finer points and, +48 hours later, launched [I Dig Durham](http://idigdurham.com/). Simply +tweet to [\@idigdurham](https://twitter.com/idigdurham) (or include the +hashtag [#idigdurham](https://twitter.com/search?q=%23idigdurham)) or +post a photo to Flickr +tagged [idigdurham](http://www.flickr.com/photos/tags/idigdurham) and +we'll pull it into the site. What's more, you can order +a [t-shirt](https://idigdurham.spreadshirt.com/) with the logo on it, +with all proceeds going to [Urban Ministries of +Durham](http://www.umdurham.org/). + +As [Rails Rumble](http://railsrumble.com/) (and [Node +Knockout](http://nodeknockout.com/)) veterans, we knew that there's +basically no such thing as too simple a product for these competitions +--- no matter how little you think you have to do, you're always +sweating bullets with half an hour left to go. With that in mind, we +kept I Dig Durham as simple as possible, leaving us plenty of time to +really polish the site. 
+ +Though basically feature complete, we've got a few tweaks we plan to +make to the site, and we'd like to expand the underlying app to support +I Dig sites for more of our favorite cities, but it\'s a good start from +[North Carolina\'s top digital +agency](https://www.viget.com/durham)\...though we may be biased. diff --git a/content/elsewhere/new-pointless-project-officegames/index.md b/content/elsewhere/new-pointless-project-officegames/index.md new file mode 100644 index 0000000..ab86c88 --- /dev/null +++ b/content/elsewhere/new-pointless-project-officegames/index.md @@ -0,0 +1,55 @@ +--- +title: "New Pointless Project: OfficeGames" +date: 2012-02-28T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/new-pointless-project-officegames/ +--- + +*This post originally appeared on [Pointless +Corp](http://pointlesscorp.com/).* + +We're a competitive company, so for this year's [Pointless +Weekend](http://www.pointlesscorp.com/blog/2012-pointless-weekend-kicks-off), +the team in Viget's Durham office thought it'd be cool to put together a +simple app for keeping track of competitions around the office. 48 hours +later (give or take), we launched [OfficeGames](http://officegam.es/). +We're proud of this product, and plan to continue improving it in the +coming weeks. Some of the highlights for me: + +## Everyone Doing Everything + +We're a highly collaborative company, but by and large, when it comes to +client work, everyone on the team has a fairly narrow role. +[Zachary](https://www.viget.com/about/team/zporter) writes Ruby code. +[Todd](https://www.viget.com/about/team/tmoy) does UX. +[Jeremy](https://www.viget.com/about/team/jfrank) focuses on the front +end. Not so for Pointless weekend -- UX, design, and development duties +were spread out across the entire team. Everyone had the repo checked +out and was committing code. 
+ +## Responsive Design with Bootstrap + +We used Twitter's [Bootstrap](https://twitter.github.com/bootstrap/) +framework to build our app. The result is a responsive design that +shines on the iPhone but holds up well on larger screens. I was +impressed with how quickly we were able to get a decent-looking site +together, and how well the framework held up once Jeremy and +[Doug](https://www.viget.com/about/team/davery) started implementing +some of [Mark](https://www.viget.com/about/team/msteinruck)'s design +ideas. + +## Rails as a Mature Framework + +I was impressed with the way everything came together on the backend. It +seems to me that we're finally realizing the promise of the Rails +framework: common libraries that handle the application plumbing, while +still being fully customizable, so developers can quickly knock out the +boilerplate and then focus on the unique aspects of their applications. +We used [SimplestAuth](https://github.com/vigetlabs/simplest_auth), +[InheritedResources](https://github.com/josevalim/inherited_resources), +and [SimpleForm](https://github.com/plataformatec/simple_form) to great +effect. + +Sign your office up for [OfficeGames](http://officegam.es/) and then add +your coworkers to start tracking scores. Let us know what you think! diff --git a/content/elsewhere/on-confidence-and-real-time-strategy-games/index.md b/content/elsewhere/on-confidence-and-real-time-strategy-games/index.md new file mode 100644 index 0000000..d739caa --- /dev/null +++ b/content/elsewhere/on-confidence-and-real-time-strategy-games/index.md @@ -0,0 +1,61 @@ +--- +title: "On Confidence and Real-Time Strategy Games" +date: 2011-06-30T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/on-confidence-and-real-time-strategy-games/ +--- + +I want to talk about confidence and how it applies to being a successful +developer. 
But before I do that, I want to talk about +*[Z](https://en.wikipedia.org/wiki/Z_(video_game))*, a real-time +strategy game from the mid-'90s. + +[![](https://upload.wikimedia.org/wikipedia/en/thumb/6/68/Z_The_Bitmap_Brothers.PNG/256px-Z_The_Bitmap_Brothers.PNG)](https://en.wikipedia.org/wiki/File:Z_The_Bitmap_Brothers.PNG) + +In other popular RTSes of the time, like *Warcraft* and *Command and +Conquer*, you collected `/(gold|Tiberium|Vespene gas)/` and used it to +build units with which to smite your enemies. Z was different: no +resources, only territories that were held by either you or your +opponent. The more territories you held, the more factories you had +*and* the faster each of your factories was able to manufacture units. + +If you spent a lot of time playing a Blizzard RTS (and of course you +did), your instinct is to spend the first portion of a match fortifying +your base and amassing an army, after which you head out in search of +your enemy. Try this strategy in Z, though, and by the time you put +together a respectable force, your opponent has three times as many +units and the game is all but decided. Instead, the winning strategy is +to expand early and often, defending your territories as best you can +before pushing forward. + +## So What + +As developers, our confidence comes from the code we've written and the +successes we've had. When we find ourselves in unfamiliar territory +(such as a new technology or problem domain), our instinct is to act +like a Starcraft player --- keep a low profile, build two (ALWAYS TWO) +barracks, and code away until we have something we're confident in. This +will get you pretty far against the Zerg swarm, but it's a losing +strategy in the realm of software development: the rest of the team +isn't waiting around for you to find your comfort zone. They're making +decisions in your absence, and they very likely aren't the same +decisions you'd make. 
Your lack of confidence leads to poor +implementation which leads to less confidence, from both your team and +yourself. + +Instead, I contend that real-world development is closer to Z than it is +to Starcraft: show confidence early (despite lacking total understanding +of the problem) and your teammates and clients will be inclined to trust +your technical leadership, leading to better technical decisions and a +better product, giving you more confidence and your team all the more +reason to follow your advice. Just as territories lead to units lead to +more territories, confidence leads to good code leads to more +confidence. + +**In short:** *display* confidence at the beginning of a project so that +you can *have* confidence when it really counts. + +Do you agree? I'd love to hear your thoughts. Best comment gets [my +personal copy of Z](http://www.flickr.com/photos/deisinger/5888230612) +from 1996. You're on your own for the Windows 95 box. diff --git a/content/elsewhere/otp-a-language-agnostic-programming-challenge/index.md b/content/elsewhere/otp-a-language-agnostic-programming-challenge/index.md new file mode 100644 index 0000000..80f3b04 --- /dev/null +++ b/content/elsewhere/otp-a-language-agnostic-programming-challenge/index.md @@ -0,0 +1,107 @@ +--- +title: "OTP: a Language-Agnostic Programming Challenge" +date: 2015-01-26T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/otp-a-language-agnostic-programming-challenge/ +--- + +We spend our days writing Ruby and JavaScript (and love it), but we're +always looking for what's next or just what's nerdy and interesting. We +have folks exploring Rust, Go, D and Elixir, to name a few. I'm +personally interested in strongly-typed functional languages like +Haskell and OCaml, but I've had little success getting through their +corresponding [animal books](http://www.oreilly.com/). 
I decided that if +I was going to get serious about learning this stuff, I needed a real +problem to solve. + +Inspired by an [online course on +Cryptography](https://www.coursera.org/course/crypto), I specced out a +simple [one-time pad](https://en.wikipedia.org/wiki/One-time_pad) +encryptor/decryptor, [pushed it up to +GitHub](https://github.com/vigetlabs/otp) and issued a challenge to the +whole Viget dev team: write a pair of programs in your language of +choice to encrypt and decrypt a message from the command line. + +## The Challenge {#thechallenge} + +When you [exclusive or](https://en.wikipedia.org/wiki/Exclusive_or) +(XOR) a value by a second value, and then XOR the resulting value by the +second value, you get the original value back. Suppose you and I want to +exchange a secret message, the word "hi", and we've agreed on a secret +key, the hexadecimal number `b33f` (or in binary, 1011 0011 0011 1111). + +**To encrypt:** + +1. Convert the plaintext ("hi") to its corresponding [ASCII + values](https://en.wikipedia.org/wiki/ASCII#ASCII_printable_code_chart) + ("h" becomes 104 or 0110 1000, "i" 105 or 0110 1001). + +2. XOR the plaintext and the key: + + Plaintext: 0110 1000 0110 1001 + Key: 1011 0011 0011 1111 + XOR: 1101 1011 0101 0110 + +3. Convert the result to hexadecimal: + + 1101 = 13 = d + 1011 = 11 = b + 0101 = 5 = 5 + 0110 = 6 = 6 + +4. So the resulting ciphertext is "db56". + +**To decrypt:** + +1. Expand the ciphertext and key to their binary forms, and XOR: + + Ciphertext: 1101 1011 0101 0110 + Key: 1011 0011 0011 1111 + XOR: 0110 1000 0110 1001 + +2. Convert the resulting binary numbers to their corresponding ASCII + values: + + 0110 1000 = 104 = h + 0110 1001 = 105 = i + +3. So, as expected, the resulting plaintext is "hi". + +The [Wikipedia](https://en.wikipedia.org/wiki/One-time_pad) page plus +the [project's +README](https://github.com/vigetlabs/otp#one-time-pad-otp) provide more +detail. 
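To make the walkthrough concrete, here's a rough Ruby sketch of the same steps. It isn't taken from the repo: the method names `xor_encrypt` and `xor_decrypt` are made up for illustration, and it assumes the key is at least as long as the message, so it skips the key cycling a full solution needs.

```ruby
# Hypothetical helpers illustrating the encrypt/decrypt steps above.
# Assumption: the hex key covers at least as many bytes as the message.

# Turn a hex string such as "b33f" into an array of byte values ([0xb3, 0x3f]).
def hex_to_bytes(hex)
  [hex].pack("H*").bytes
end

# XOR each plaintext byte with the matching key byte, then hex-encode it.
def xor_encrypt(plaintext, hex_key)
  key_bytes = hex_to_bytes(hex_key)
  plaintext.bytes.zip(key_bytes).map { |p, k| format("%02x", p ^ k) }.join
end

# XOR each ciphertext byte with the matching key byte to recover the characters.
def xor_decrypt(hex_ciphertext, hex_key)
  key_bytes = hex_to_bytes(hex_key)
  hex_to_bytes(hex_ciphertext).zip(key_bytes).map { |c, k| (c ^ k).chr }.join
end

puts xor_encrypt("hi", "b33f")   # the ciphertext from the walkthrough, "db56"
puts xor_decrypt("db56", "b33f") # back to "hi"
```

Because XOR undoes itself, the two methods are nearly identical; only the input and output encodings differ.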
It's a simple problem conceptually, but in order to create a +solution that passes the test suite, you'll need to figure out: + +- Creating a basic command-line executable +- Reading from `STDIN` and `ARGV` +- String manipulation +- Bitwise operators +- Converting to and from hexadecimal + +\* \* \* + +As of today, we've created solutions in [~~eleven~~ ~~twelve~~ thirteen +languages](https://github.com/vigetlabs/otp/tree/master/languages): + +- [C](https://viget.com/extend/otp-the-fun-and-frustration-of-c) +- D +- [Elixir](https://viget.com/extend/otp-ocaml-haskell-elixir) +- Go +- [Haskell](https://viget.com/extend/otp-ocaml-haskell-elixir) +- JavaScript 5 +- JavaScript 6 +- Julia +- [Matlab](https://viget.com/extend/otp-matlab-solution-in-one-or-two-lines) +- [OCaml](https://viget.com/extend/otp-ocaml-haskell-elixir) +- Ruby +- Rust +- Swift (thanks [wasnotrice](https://github.com/wasnotrice)!) + +The results are varied and fascinating -- stay tuned for future posts +about some of our solutions. [In the +meantime](https://www.youtube.com/watch?v=TDkhl-CgETg), we'd love to see +how you approach the problem, whether in a new language or one we've +already attempted. [Fork the repo](https://github.com/vigetlabs/otp) and +show us what you've got! diff --git a/content/elsewhere/otp-ocaml-haskell-elixir/index.md b/content/elsewhere/otp-ocaml-haskell-elixir/index.md new file mode 100644 index 0000000..33dacf0 --- /dev/null +++ b/content/elsewhere/otp-ocaml-haskell-elixir/index.md @@ -0,0 +1,192 @@ +--- +title: "OTP: a Functional Approach (or Three)" +date: 2015-01-29T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/otp-ocaml-haskell-elixir/ +--- + +I initially started the [OTP +challenge](https://viget.com/extend/otp-a-language-agnostic-programming-challenge) +as a fun way to write some [OCaml](https://ocaml.org/).
It was, so much +so that I wrote solutions in two other functional languages, +[Haskell](https://wiki.haskell.org/Haskell) and +[Elixir](http://elixir-lang.org/). I structured all three sets of +programs the same so that I could easily see their similarities and +differences. Check out the `encrypt` program in +[all](https://github.com/vigetlabs/otp/blob/master/languages/OCaml/encrypt.ml) +[three](https://github.com/vigetlabs/otp/blob/master/languages/Haskell/encrypt.hs) +[languages](https://github.com/vigetlabs/otp/blob/master/languages/Elixir/encrypt) +and then I'll share some of my favorite parts. Go ahead, I'll wait. + +## Don't Cross the Streams {#dontcrossthestreams} + +One tricky part of the OTP challenge is that you have to cycle over the +key if it's shorter than the plaintext. My initial approaches involved +passing around an offset and using the modulo operator, [like +this](https://github.com/vigetlabs/otp/blob/6d607129f78ccafa9a294ca04da9e4c8bf7b7cc1/decrypt.ml#L11-L14): + + let get_mask key index = + let c1 = List.nth key (index mod (List.length key)) + and c2 = List.nth key ((index + 1) mod (List.length key)) in + int_from_hex_chars c1 c2 + +Pretty gross, huh? Fortunately, both +[Haskell](http://hackage.haskell.org/package/base-4.7.0.2/docs/Prelude.html#v:cycle) +and +[Elixir](http://elixir-lang.org/docs/master/elixir/Stream.html#cycle/1) +have built-in functionality for lazy, cyclical lists, and OCaml (with +the [Batteries](http://batteries.forge.ocamlcore.org/) library) has the +[Dllist](http://batteries.forge.ocamlcore.org/doc.preview:batteries-beta1/html/api/Dllist.html) +(doubly-linked list) data structure. The OCaml code above becomes +simply: + + let get_mask key = + let c1 = Dllist.get key + and c2 = Dllist.get (Dllist.next key) in + int_of_hex_chars c1 c2 + +No more passing around indexes or using `mod` to stay within the bounds +of the array -- the Dllist handles that for us. 
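The languages we write every day can do something similar. As a rough sketch (not one of the repo's solutions), Ruby's `Array#cycle` returns an endless enumerator when called without a block, so a hypothetical `encrypt_cycled` helper can pull key bytes with `next` and never touch an index or `%`. The helper names here are invented for illustration.

```ruby
# Hypothetical helper: turn a hex key such as "b33f" into an endless
# enumerator of byte values (0xb3, 0x3f, 0xb3, 0x3f, ...).
def key_stream(hex_key)
  hex_key.scan(/../).map(&:hex).cycle
end

# XOR each plaintext byte with the next key byte. The enumerator wraps
# around on its own when the key is shorter than the message.
def encrypt_cycled(plaintext, hex_key)
  mask = key_stream(hex_key)
  plaintext.bytes.map { |byte| format("%02x", byte ^ mask.next) }.join
end

puts encrypt_cycled("hihi", "b33f") # the two-byte key is reused for bytes 3 and 4
```

The trade-off mirrors the Dllist version: the cycling logic lives in the data structure, at the cost of carrying a bit of state (the enumerator's position) through the loop.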
+ +Similarly, a naïve Elixir approach: + + def get_mask(key, index) do + c1 = Enum.at(key, rem(index, length(key))) + c2 = Enum.at(key, rem(index + 1, length(key))) + int_of_hex_chars(c1, c2) + end + +And with streams activated: + + def get_mask(key) do + Enum.take(key, 2) |> int_of_hex_chars + end + +Check out the source code +([OCaml](https://github.com/vigetlabs/otp/blob/master/languages/OCaml/encrypt.ml), +[Haskell](https://github.com/vigetlabs/otp/blob/master/languages/Haskell/encrypt.hs), +[Elixir](https://github.com/vigetlabs/otp/blob/master/languages/Elixir/encrypt)) +to get a better sense of cyclical data structures in action. + +## Partial Function Application {#partialfunctionapplication} + +Most programming languages have a clear distinction between function +arguments (input) and return values (output). The line is less clear in +[ML](https://en.wikipedia.org/wiki/ML_%28programming_language%29)-derived +languages like Haskell and OCaml. Check this out (from Haskell's `ghci` +interactive shell): + + Prelude> let add x y = x + y + Prelude> add 5 7 + 12 + +We create a function, `add`, that (seemingly) takes two arguments and +returns their sum. + + Prelude> let add5 = add 5 + Prelude> add5 7 + 12 + +But what's this? Using our existing `add` function, we've created +another function, `add5`, that takes a single argument and adds five to +it. So while `add` appears to take two arguments and sum them, it +actually takes one argument and returns a function that takes one +argument and adds it to the argument passed to the initial function. + +When you inspect the type of `add`, you can see this lack of distinction +between input and output: + + Prelude> :type add + add :: Num a => a -> a -> a + +Haskell and OCaml use a concept called +[*currying*](https://en.wikipedia.org/wiki/Currying) or partial function +application. It's a pretty big departure from the C-derived languages +most of us are used to. 
Other languages may offer currying as [an +option](http://ruby-doc.org/core-2.1.1/Proc.html#method-i-curry), but +this is just how these languages work, out of the box, all of the time. + +Let's see this concept in action. To convert a number to its hex +representation, you call `printf "%x" num`. To convert a whole list of +numbers, pass the partially applied function `printf "%x"` to `map`, +[like +so](https://github.com/vigetlabs/otp/blob/master/languages/Haskell/encrypt.hs#L12): + + hexStringOfInts nums = concat $ map (printf "%x") nums + +For more info on currying/partial function application, check out +[*Learn You a Haskell for Great +Good*](http://learnyouahaskell.com/higher-order-functions). + +## A Friendly Compiler {#afriendlycompiler} + +I learned to program with C++ and Java, where `gcc` and `javac` weren't +my friends -- they were jerks, making me jump through a bunch of hoops +without catching any actual issues (or so teenage Dave thought). I've +worked almost exclusively with interpreted languages in the intervening +10+ years, so it was fascinating to work with Haskell and OCaml, +languages with compilers that catch real issues. Here's my original +`decrypt` function in Haskell: + + decrypt ciphertext key = case ciphertext of + [] -> [] + c1:c2:cs -> xor (intOfHexChars [c1, c2]) (getMask key) : decrypt cs (drop 2 key) + +Using pattern matching, I pull off the first two characters of the +ciphertext and decrypt them against the key, and then recurse on the +rest of the ciphertext. If the list is empty, we're done. When I +compiled the code, I received the following: + + decrypt.hs:16:26: Warning: + Pattern match(es) are non-exhaustive + In a case alternative: Patterns not matched: [_] + +The Haskell compiler is telling me that I haven't accounted for a list +consisting of a single character. And sure enough, this is invalid input +that a user could nevertheless use to call the program.
Adding the +following handles the failure and fixes the warning: + + decrypt ciphertext key = case ciphertext of + [] -> [] + [_] -> error "Invalid ciphertext" + c1:c2:cs -> xor (intOfHexChars [c1, c2]) (getMask key) : decrypt cs (drop 2 key) + +## Elixir's \|\> operator {#elixirsoperator} + +According to [*Programming +Elixir*](https://pragprog.com/book/elixir/programming-elixir), the pipe +operator (`|>`) + +> takes the result of the expression to its left and inserts it as the +> first parameter of the function invocation to its right. + +It's borrowed from F#, so it's not an entirely novel concept, but it's +certainly new to me. To build our key, we want to take the first +argument passed into the program, convert it to a list of characters, +and then turn it to a cyclical stream. My initial approach looked +something like this: + + key = Stream.cycle(to_char_list(List.first(System.argv))) + +Using the pipe operator, we can flip that around into something much +more readable: + + key = System.argv |> List.first |> to_char_list |> Stream.cycle + +I like it. Reminds me of Unix pipes or any Western written language. +[Here's how I use the pipe operator in my encrypt +solution](https://github.com/vigetlabs/otp/blob/master/languages/Elixir/encrypt#L25-L31). + +\* \* \* + +At the end of this process, I think Haskell offers the most elegant code +and [Elixir](https://www.viget.com/services/elixir) the most potential +for us at Viget to use professionally. OCaml offers a good middle ground +between theory and practice, though the lack of a robust standard +library is a [bummer, man](https://www.youtube.com/watch?v=24Vlt-lpVOY). + +I had a great time writing and refactoring these solutions. I encourage +you to [check out the +code](https://github.com/vigetlabs/otp/tree/master/languages), fork the +repo, and take the challenge yourself. 
diff --git a/content/elsewhere/out-damned-tabs/index.md b/content/elsewhere/out-damned-tabs/index.md new file mode 100644 index 0000000..5f4573c --- /dev/null +++ b/content/elsewhere/out-damned-tabs/index.md @@ -0,0 +1,65 @@ +--- +title: "Out, Damned Tabs" +date: 2009-04-09T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/out-damned-tabs/ +--- + +Like many developers I know, I'm a little bit OCD about code formatting. +While there are about as many ideas of properly formatted code as there +are coders, I think we can all agree that code with tabs and trailing +whitespace is not it. Git has the `whitespace = fix` option, which does +a fine job removing trailing spaces before commits, but leaves the +spaces in the working copy, and doesn't manage tabs at all. + +I figured there had to be a better way to automate this type of code +formatting, and with help from [Kevin McFadden's +post](http://conceptsahead.com/off-axis/proper-trimming-on-save-with-textmate), +I think I've found one, by configuring +[TextMate](http://macromates.com/) to strip off trailing whitespace and +replace tabs with spaces whenever a file is saved. Here's how to set it +up: + +1. Open the Bundle Editor (Bundles \> Bundle Editor \> Show Bundle + Editor). + +2. Create a new bundle using the "+" menu at the bottom of the page. + Call it something like "Whitespace." + +3. With your new bundle selected, create a new command called "Save + Current File," and give it the following settings: + + - Save: Current File + - Command(s): blank + - Input: None + - Output: Discard + +4. Start recording a new macro (Bundles \> Macros \> Start Recording). + +5. Strip out trailing whitespace (Bundles \> Text \> + Converting/Stripping \> Remove Trailing Spaces in Document). + +6. Replace tabs with spaces (Text \> Convert \> Tabs to Spaces). + +7. Save the current document (Bundles \> Formatting \> Save Current + Document). + +8. 
Stop recording the macro (Bundles \> Macros \> Stop Recording). + +9. Save the macro (Bundles \> Macros \> Save Last Recording). Call it + something like "Strip Whitespace." + +10. Click in the Activation (Key Equivalent) text field and hit + Command+S. + +Alternatively, we've packaged the bundle up and put it up on +[GitHub](https://github.com/vigetlabs/whitespace-tmbundle/tree/master). +Instructions for setting it up are on the page, and patches are +encouraged. + +### How About You? {#how_about_you} + +This approach is working well for me; I'm curious if other people are +doing anything like this. If you've got an alternative way to deal with +extraneous whitespace in your code, please tell us how in the comments. diff --git a/content/elsewhere/pandoc-a-tool-i-use-and-like/index.md b/content/elsewhere/pandoc-a-tool-i-use-and-like/index.md new file mode 100644 index 0000000..f70a8b6 --- /dev/null +++ b/content/elsewhere/pandoc-a-tool-i-use-and-like/index.md @@ -0,0 +1,214 @@ +--- +title: "Pandoc: A Tool I Use and Like" +date: 2022-05-25T00:00:00+00:00 +draft: false +needs_review: true +canonical_url: https://www.viget.com/articles/pandoc-a-tool-i-use-and-like/ +--- + +Today I want to talk to you about one of my favorite command-line tools, +[Pandoc](https://pandoc.org/). From the project website: + +> If you need to convert files from one markup format into another, +> pandoc is your swiss-army knife. + +I spend a lot of time writing, and I love [Vim](https://www.vim.org/), +[Markdown](https://daringfireball.net/projects/markdown/), and the +command line (and avoid browser-based WYSIWYG editors when I can), so +that's where a lot of my Pandoc use comes in, but it has a ton of +utility outside of that -- really, anywhere you need to move between +different text-based formats, Pandoc can probably help. 
A few examples +from recent memory: + +### Markdown ➞ Craft Blog Post + +This website you're reading presently uses [Craft +CMS](https://craftcms.com/), a flexible and powerful content management +system that doesn't perfectly match my writing +process[^1^](#fn1){#fnref1 .footnote-ref role="doc-noteref"}. Rather +than composing directly in Craft, I prefer to write locally, pipe the +output through Pandoc, and put the resulting HTML into a text block in +the CMS. This gets me a few things I really like: + +- Curly quotes in place of straight ones and en-dashes in place of + `--` (from the [`smart` + extension](https://pandoc.org/MANUAL.html#extension-smart)) +- [Daring + Fireball-style](https://daringfireball.net/2005/07/footnotes) + footnotes with return links + +By default, Pandoc uses [Pandoc +Markdown](https://garrettgman.github.io/rmarkdown/authoring_pandoc_markdown.html) +when converting Markdown docs to other formats, an "extended and +slightly revised version" of the original syntax, which is how footnotes +and a bunch of other things work. + +### Markdown ➞ Rich Text (Basecamp) + +I also sometimes find myself writing decently long +[Basecamp](https://basecamp.com/) posts. Basecamp 3 has a fine WYSIWYG +editor (🪦 Textile), but again, I'd rather be in Vim. Pasting HTML into +Basecamp doesn't work (just shows the code verbatim), but I've found +that if I convert my Markdown notes to HTML and open the HTML in a +browser, I can copy and paste that directly into Basecamp with good +results. Leveraging MacOS' `open` command, this one-liner does the +trick[^2^](#fn2){#fnref2 .footnote-ref role="doc-noteref"}: + + cat [filename.md] + | pandoc -t html + > /tmp/output.html + && open /tmp/output.html + && read -n 1 + && rm /tmp/output.html + +This will convert the contents to HTML, save that to a file, open the +file in a browser, wait for the user to hit enter, and then remove the +file.
Without that `read -n 1`, it'll remove the file before the browser +has a chance to open it. + +### HTML ➞ Text + +We built an app for one of our clients that takes in news articles (in +HTML) via an API and sends them as emails to *their* clients (think big +brands) if certain criteria are met. Recently, we were making +improvements to the plain text version of the emails, and we noticed +that some of the articles were coming in without any linebreaks in the +content. When we removed the HTML (via Rails' [`strip_tags` +helper](https://apidock.com/rails/ActionView/Helpers/SanitizeHelper/strip_tags)), +the resulting content was all on one line, which wasn't very readable. +So imagine an article like this: + +

    <h1>Headline</h1><p>A paragraph.</p><ul><li>List item #1</li><li>List item #2</li></ul>
+ +Our initial approach (with `strip_tags`) gives us this: + + Headline A paragraph. List item #1 List item #2 + +Not great! But fortunately, some bright fellow had the idea to pull in +Pandoc, and some even brighter person packaged up some [Ruby +bindings](https://github.com/xwmx/pandoc-ruby) for it. Taking that same +content and running it through `PandocRuby.html(content).to_plain` gives +us: + + Headline + + A paragraph. + + - List item #1 + - List item #2 + +Much better, and though you can't tell from this basic example, Pandoc +does a great job with spacing and wrapping to generate really +nice-looking plain text from HTML. + +### HTML Element ➞ Text + +A few months ago, we were doing Pointless Weekend and needed a domain +for our +[Thrillr](https://www.viget.com/articles/plan-a-killer-party-with-thrillr/) +app. A few of us were looking through lists of fun top-level domains, +but we realized that AWS Route 53 only supports a limited set of them. +In order to get everyone the actual list, I needed a way to get all the +content out of an HTML `` in the DOM view that pops up +- Right click it, then go to "Copy", then "Inner HTML" +- You'll now have all of the `