Pull in Viget posts

This commit is contained in:
David Eisinger
2023-10-22 23:52:56 -04:00
parent 625d374135
commit 0438a6d828
77 changed files with 8219 additions and 5 deletions

---
title: "Adding a NOT NULL Column to an Existing Table"
date: 2014-09-30T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/adding-a-not-null-column-to-an-existing-table/
---
*Despite some exciting advances in the field, like
[Node](http://nodejs.org/), [Redis](http://redis.io/), and
[Go](https://golang.org/), a well-structured relational database fronted
by a Rails or Sinatra (or Django, etc.) app is still one of the most
effective toolsets for building things for the web. In the coming weeks,
I'll be publishing a series of posts about how to be sure that you're
taking advantage of all your RDBMS has to offer.*

Assuming my [last
post](https://viget.com/extend/required-fields-should-be-marked-not-null)
convinced you of the *why* of marking required fields `NOT NULL`, the
next question is *how*. When creating a brand new table, it's
straightforward enough:
```sql
CREATE TABLE employees (
  id integer NOT NULL,
  name character varying(255) NOT NULL,
  created_at timestamp without time zone,
  ...
);
```
When adding a column to an existing table, things get dicier. If there
are already rows in the table, what should the database do when
confronted with a new column that 1) cannot be null and 2) has no
default value? Ideally, the database would allow you to add the column
if there is no existing data, and throw an error if there is. As we'll
see, depending on your choice of database platform, this isn't always
the case.
## A Naïve Approach {#anaïveapproach}
Let's go ahead and add a required `age` column to our employees table,
and let's assume I've laid my case out well enough that you're going to
require it to be non-null. To add our column, we create a migration like
so:
```ruby
class AddAgeToEmployees < ActiveRecord::Migration
  def change
    add_column :employees, :age, :integer, null: false
  end
end
```
The desired behavior on running this migration would be for it to run
cleanly if there are no employees in the system, and to fail if there
are any. Let's try it out, first in Postgres, with no employees:
```
== AddAgeToEmployees: migrating ==============================================
-- add_column(:employees, :age, :integer, {:null=>false})
   -> 0.0006s
== AddAgeToEmployees: migrated (0.0007s) =====================================
```
Bingo. Now, with employees:
```
== AddAgeToEmployees: migrating ==============================================
-- add_column(:employees, :age, :integer, {:null=>false})
rake aborted!
StandardError: An error has occurred, this and all later migrations canceled:

PG::NotNullViolation: ERROR:  column "age" contains null values
```
Exactly as we'd expect. Now let's try SQLite, without data:
```
== AddAgeToEmployees: migrating ==============================================
-- add_column(:employees, :age, :integer, {:null=>false})
rake aborted!
StandardError: An error has occurred, this and all later migrations canceled:

SQLite3::SQLException: Cannot add a NOT NULL column with default value NULL: ALTER TABLE "employees" ADD "age" integer NOT NULL
```
Regardless of whether or not there are existing rows in the table,
SQLite won't let you add `NOT NULL` columns without default values.
Super strange. More information on this ... *quirk* ... is available on
this [StackOverflow
thread](http://stackoverflow.com/questions/3170634/how-to-solve-cannot-add-a-not-null-column-with-default-value-null-in-sqlite3).
Finally, our old friend MySQL. Without data:
```
== AddAgeToEmployees: migrating ==============================================
-- add_column(:employees, :age, :integer, {:null=>false})
   -> 0.0217s
== AddAgeToEmployees: migrated (0.0217s) =====================================
```
Looks good. Now, with data:
```
== AddAgeToEmployees: migrating ==============================================
-- add_column(:employees, :age, :integer, {:null=>false})
   -> 0.0190s
== AddAgeToEmployees: migrated (0.0191s) =====================================
```
It ... worked? Can you guess what our existing employee's age is?
```
> be rails runner "p Employee.first"
#<Employee id: 1, name: "David", created_at: "2014-07-09 00:41:08", updated_at: "2014-07-09 00:41:08", age: 0>
```
Zero. Turns out that MySQL has a concept of an [*implicit
default*](http://stackoverflow.com/questions/22868345/mysql-add-a-not-null-column/22868473#22868473),
which is used to populate existing rows when a default is not supplied.
Neat, but exactly the opposite of what we want in this instance.
## A Better Approach {#abetterapproach}
What's the solution to this problem? Should we just always use Postgres?
[Yes.](https://www.youtube.com/watch?v=bXpsFGflT7U)
But if that's not an option (say your client's support contract only
covers MySQL), there's still a way to write your migrations such that
Postgres, SQLite, and MySQL all behave in the same correct way when
adding `NOT NULL` columns to existing tables: add the column first, then
add the constraint. Your migration would become:
```ruby
class AddAgeToEmployees < ActiveRecord::Migration
  def up
    add_column :employees, :age, :integer
    change_column_null :employees, :age, false
  end

  def down
    remove_column :employees, :age, :integer
  end
end
```
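Under the hood, the two steps amount to roughly this SQL (Postgres syntax; a sketch of the intent, not the exact statements Rails emits):

```sql
ALTER TABLE employees ADD COLUMN age integer;
-- fails if any existing row would be left with a NULL age
ALTER TABLE employees ALTER COLUMN age SET NOT NULL;
```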
Postgres behaves exactly the same as before. SQLite, on the other hand,
shows remarkable improvement. Without data:
```
== AddAgeToEmployees: migrating ==============================================
-- add_column(:employees, :age, :integer)
   -> 0.0024s
-- change_column_null(:employees, :age, false)
   -> 0.0032s
== AddAgeToEmployees: migrated (0.0057s) =====================================
```
Success -- the new column is added with the `NOT NULL` constraint. And
with data:
```
== AddAgeToEmployees: migrating ==============================================
-- add_column(:employees, :age, :integer)
   -> 0.0024s
-- change_column_null(:employees, :age, false)
rake aborted!
StandardError: An error has occurred, this and all later migrations canceled:

SQLite3::ConstraintException: employees.age may not be NULL
```
Perfect! And how about MySQL? Without data:
```
== AddAgeToEmployees: migrating ==============================================
-- add_column(:employees, :age, :integer)
   -> 0.0145s
-- change_column_null(:employees, :age, false)
   -> 0.0176s
== AddAgeToEmployees: migrated (0.0323s) =====================================
```
And with:
```
== AddAgeToEmployees: migrating ==============================================
-- add_column(:employees, :age, :integer)
   -> 0.0142s
-- change_column_null(:employees, :age, false)
rake aborted!
StandardError: An error has occurred, all later migrations canceled:

Mysql2::Error: Invalid use of NULL value: ALTER TABLE `employees` CHANGE `age` `age` int(11) NOT NULL
```
BOOM. [Flawless victory.](https://www.youtube.com/watch?v=kXuCvIbY1v4)
\* \* \*
To summarize: never use `add_column` with `null: false`. Instead, add
the column and then use `change_column_null` to set the constraint for
correct behavior regardless of database platform. In a follow-up post,
I'll focus on what to do when you don't want to simply error out if
there is existing data, but rather migrate it into a good state before
setting `NOT NULL`.

---
title: "Around \"Hello World\" in 30 Days"
date: 2010-06-02T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/around-hello-world-in-30-days/
---
I'll say this up front: I love my job. I love the web. I love Rails. And
I love working here at Viget. But lately, I've gone through periods
where web development feels a bit stale. Hugh Macleod has a great post
called [Beware of Turning Hobbies into
Jobs](http://gapingvoid.com/2008/01/10/beware-of-turning-hobbies-into-jobs/)
that sheds a bit of light on this problem: once you make a career out of
doing what you love, it's not solely yours anymore. There are clearly
[bigger problems one could
have](http://news.nationalgeographic.com/news/2010/06/100601-sinkhole-in-guatemala-2010-world-science/),
but I think this is something all developers struggle with at some point
in their careers.
This problem was weighing on my mind one morning, combined with a
looming speaking engagement I'd committed to for [DevNation
Chicago](http://devnation.us/events/8), when it hit me: I would spend a
month trying a new technology every day, and then share my experiences
in Chicago. Learning is a core value here at Viget, and my coworkers
were incredibly supportive, adding to the list of technologies and
asking to join me in learning several of them. With their help, coming
up with the list was no problem --- it was actually harder to get the
list *down* to 30. Here's what I finally committed to:
1. [Cassandra](http://cassandra.apache.org/)
2. [Chrome Extensions](https://code.google.com/chrome/extensions/)
3. [Clojure](http://clojure.org/)
4. [CoffeeScript](https://jashkenas.github.com/coffee-script/)
5. [CouchDB](http://couchdb.apache.org/)
6. [CSS3](http://www.css3.info/)
7. [Django](https://www.djangoproject.com/)
8. [Erlang](http://www.erlang.org/)
9. [Go](https://golang.org/)
10. [Haskell](http://www.haskell.org/)
11. [HTML5](https://en.wikipedia.org/wiki/HTML5)
12. [Io](http://www.iolanguage.com/)
13. [Jekyll](https://github.com/mojombo/jekyll)
14. [jQTouch](http://www.jqtouch.com/)
15. [Lua](http://www.lua.org/)
16. [MacRuby](http://www.macruby.org/)
17. [Mercurial](http://mercurial.selenic.com/)
18. [MongoDB](http://www.mongodb.org/)
19. [Node.js](http://nodejs.org/)
20. [OCaml](http://caml.inria.fr/)
21. [ooc](http://ooc-lang.org/)
22. [Redis](https://code.google.com/p/redis/)
23. [Riak](http://riak.basho.com/)
24. [Scala](http://www.scala-lang.org/)
25. [Scheme](https://en.wikipedia.org/wiki/Scheme_(programming_language))
26. [Sinatra](http://www.sinatrarb.com/)
27. [Squeak](http://www.squeak.org/)
28. [Treetop](http://treetop.rubyforge.org/)
29. [VIM](http://www.vim.org/)
30. [ZSH](http://www.zsh.org/)
Thirteen languages, most of them functional. Five datastores of various
[NoSQL](https://en.wikipedia.org/wiki/NoSQL) flavors. Five web
frameworks, and seven "others," including a new version control system,
text editor, and shell.
Once I'd committed myself to this project, an hour a day for 30 days, it
was surprisingly easy to stick with it. The hour time slot was critical,
both as a minimum (no giving up when things get too hard or too easy)
and as a maximum (it's easier to sit down with an intimidating piece of
technology at 7 p.m. when you know you'll be done by 8). I did have some
ups and downs, though. High points included Redis, Scheme, Erlang, and
CoffeeScript. Lows included Cassandra and CouchDB, which I couldn't even
get running in the allotted hour.
I created a simple [Tumblr blog](https://techmonth.tumblr.com)
and posted to it after every new tech, which kept me accountable and
spurred discussion on Twitter and at the office. My talk went over
surprisingly well at DevNation ([here are my
slides](http://www.slideshare.net/deisinger/techmonth)), and I hope to
give it again at future events.
All in all, it was a great experience and proved that projects that are
intimidating when considered all at once are easily manageable when
broken down into small pieces. The biggest lesson I took away from the
whole thing was that it's fundamental to find a way to make programming
fun. Working my way through [The Little
Schemer](https://www.amazon.com/Little-Schemer-Daniel-P-Friedman/dp/0262560992)
or building a simple webapp with [Node.js](http://nodejs.org/), I felt
like a kid again, pecking out my first QBasic programs. Learning how to
keep programming exciting is far more beneficial than any concrete
technical knowhow I gained.

---
title: "AWS OpsWorks: Lessons Learned"
date: 2013-10-04T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/aws-opsworks-lessons-learned/
---
We've been using Amazon's [AWS
OpsWorks](http://aws.amazon.com/opsworks/) to manage our infrastructure
on a recent client project. The website describes OpsWorks as
> a DevOps solution for managing applications of any scale or complexity
> on the AWS cloud.
You can think of it as a middleground between something like Heroku and
a manually configured server environment. You can also think of it as
[Chef](http://www.opscode.com/chef/)-as-a-service. Before reading on,
I'd recommend reading this [Introduction to AWS
OpsWorks](http://artsy.github.io/blog/2013/08/27/introduction-to-aws-opsworks/),
a post I wish had existed when I was first diving into this stuff. With
that out of the way, here are a few lessons I had to learn the hard way
so hopefully you won't have to.
### You'll need to learn Chef {#youllneedtolearnchef}
The basis of OpsWorks is [Chef](http://www.opscode.com/chef/), and if
you want to do anything interesting with your instances, you're going to
have to dive in, fork the [OpsWorks
cookbooks](https://github.com/aws/opsworks-cookbooks), and start adding
your own recipes. Suppose, like we did, you want to add
[PDFtk](http://www.pdflabs.com/tools/pdftk-the-pdf-toolkit/) to your
servers to merge some documents:
1. Check the [OpsCode
Community](http://community.opscode.com/cookbooks) site for a
recipe.
2. [A recipe exists.](http://community.opscode.com/cookbooks/pdftk) You
lucky dog.
3. Add the recipe to your fork, push it up, and run it.
4. It fails. Turns out they renamed the `gcj` package to `gcj-jdk`.
Fix.
5. It fails again. The recipe is referencing an old version of PDFtk.
Fix.
6. [Great sexy success.](http://cdn.meme.li/i/d1v84.jpg)
A little bit tedious compared with `wget/tar/make`, for sure, but once
you get it configured properly, you can spin up new servers at will and
be confident that they include all the necessary software.
### Deploy hooks: learn them, love them {#deployhooks:learnthemlovethem}
Chef offers a number of [deploy
callbacks](http://docs.opscode.com/resource_deploy.html#callbacks) you
can use as a stand-in for Capistrano's `before`/`after` hooks. To use
them, create a directory in your app called `deploy` and add files named
for the appropriate callbacks (e.g. `deploy/before_migrate.rb`). For
example, here's how we precompile assets before migration:
```ruby
rails_env = new_resource.environment["RAILS_ENV"]

Chef::Log.info("Precompiling assets for RAILS_ENV=#{rails_env}...")

execute "rake assets:precompile" do
  cwd release_path
  command "bundle exec rake assets:precompile"
  environment "RAILS_ENV" => rails_env
end
```
### Layers: roles, but not *dedicated* roles {#layers:rolesbutnotdedicatedroles}
AWS documentation describes
[layers](http://docs.aws.amazon.com/opsworks/latest/userguide/workinglayers.html)
as
> how to set up and configure a set of instances and related resources
> such as volumes and Elastic IP addresses.
The default layer types ("PHP App Server", "MySQL") imply that layers
distinguish separate components of your infrastructure. While that's
partially true, it's better to think about layers as the *roles* your
EC2 instances fill. For example, you might have two instances in your
"Rails App Server" role, a single, separate instance for your "Resque"
role, and one of the two app servers in the "Cron" role, responsible for
sending nightly emails.
### Altering the Rails environment {#alteringtherailsenvironment}
If you need to manually execute a custom recipe against your existing
instances, the Rails environment is going to be set to "production" no
matter what you've defined in the application configuration. In order to
change this value, add the following to the "Custom Chef JSON" field:
```json
{
  "deploy": {
    "app_name": {
      "rails_env": "staging"
    }
  }
}
```
(Substituting in your own application and environment names.)
------------------------------------------------------------------------
We've found OpsWorks to be a solid choice for repeatable, maintainable
server infrastructure that still offers the root access we all crave.
Certainly, it's slower out of the gate than spinning up a new Heroku app
or logging into a VPS and `apt-get`ting it up, but the investment up
front leads to a more sustainable system over time. If this sounds at
all interesting to you, seriously go check out that [introduction
post](http://artsy.github.io/blog/2013/08/27/introduction-to-aws-opsworks/).
It's the post this post wishes it was.

---
title: "Backup your Database in Git"
date: 2009-05-08T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/backup-your-database-in-git/
---
**Short version**: dump your production database into a git repository
for an instant backup solution.
**Long version**: keeping backups of production data is fundamental for
a well-run web application, but it's tricky to maintain history while
keeping disk usage at a reasonable level. You could continually
overwrite the backup with the latest data, but you risk automatically
replacing good data with bad. You could save each version in a separate,
timestamped file, but since most of the data is static, you would end up
wasting a lot of disk space.
When you think about it, a database dump is just SQL code, so why not
manage it the same way you manage the rest of your code --- in a source
code manager? Setting such a scheme up is dead simple. On your
production server, with git installed:
```
mkdir -p /path/to/backup
cd /path/to/backup
mysqldump -u [user] -p[pass] --skip-extended-insert [database] > [database].sql
git init
git add [database].sql
git commit -m "Initial commit"
```
The `--skip-extended-insert` option tells mysqldump to give each table
row its own `insert` statement. This creates a larger initial commit
than the default bulk insert, but makes future commits much easier to
read and (I suspect) keeps the overall repository size smaller, since
each patch only includes the individual records added/updated/deleted.
From here, all we have to do is set up a cronjob to update the backup:
```
0 * * * * cd /path/to/backup && \
  mysqldump -u [user] -p[pass] --skip-extended-insert [database] > [database].sql && \
  git commit -am "Updating DB backup"
```
You may want to add another entry to run
[`git gc`](http://www.kernel.org/pub/software/scm/git/docs/git-gc.html)
every day or so in order to keep disk space down and performance up.
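For example, a nightly entry along these lines would do it (the schedule here is arbitrary; adjust to taste):

```
0 3 * * * cd /path/to/backup && git gc
```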
Now that you have all of your data in a git repo, you've got a lot of
options. Easily view activity on your site with `git whatchanged -p`.
Update your staging server to the latest data with
`git clone ssh://[hostname]/path/to/backup`. Add a remote on
[Github](https://github.com/) and get offsite backups with a simple
`git push`.
This technique might fall down if your app approaches
[Craigslist](http://craigslist.org/)-level traffic, but it's working
flawlessly for us on [SpeakerRate](http://speakerrate.com), and should
work well for your small- to medium-sized web application.

---
title: "CoffeeScript for Ruby Bros"
date: 2010-08-06T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/coffeescript-for-ruby-bros/
---
Hello there, Ruby friend. You've perhaps heard of
[CoffeeScript](https://jashkenas.github.com/coffee-script/),
"JavaScript's less ostentatious kid brother," but it might *as yet* be
unclear why you'd want to stray from Ruby's loving embrace. Well,
friend, I've been playing with it off-and-on for the past few months,
and I've come to the following conclusion: **CoffeeScript combines the
simplicity of Javascript with the elegance of Ruby.**
## Syntax
Despite its compactness as a language, Javascript has always felt a bit
noisy to me. Its excessive punctuation is pretty much the only thing it
has in common with its namesake. CoffeeScript borrows from the syntaxes
of Ruby and Python to create a sort of minimalist Javascript. From
Python, we get significant whitespace and list comprehensions.
Otherwise, it's all Ruby: semicolons and parentheses around function
arguments are entirely optional. Like Ruby's `||=`, conditional
assignment is handled with `?=`. Conditionals can be inlined
(`something if something_else`). And every statement has an implicit
value, so `return` is unnecessary.
## Functions
Both Javascript and Ruby support functional programming. Ruby offers
numerous language features to make functional programming as concise as
possible, the drawback being the sheer number of ways to define a
function: at least six, by my count (`def`, `do/end`, `{ }`, `lambda`,
`Proc.new`, `proc`).
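For the curious, here are those six forms side by side (a quick sketch; the names are mine, and all six yield something callable):

```ruby
# 1. A plain method, reachable as an object via method().
def double(x); x * 2; end

# 2 & 3. Blocks (do/end or braces) aren't objects until captured with &.
def capture(&blk); blk; end
with_do    = capture do |x| x * 2 end
with_brace = capture { |x| x * 2 }

# 4, 5, 6. lambda, Proc.new, and proc all build Proc objects directly.
with_lambda   = lambda { |x| x * 2 }
with_proc_new = Proc.new { |x| x * 2 }
with_proc     = proc { |x| x * 2 }

callables = [method(:double), with_do, with_brace,
             with_lambda, with_proc_new, with_proc]
callables.all? { |f| f.call(21) == 42 }  # => true
```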
At the other extreme, Javascript offers but one way to define a
function: the `function` keyword. It's certainly simple, but especially
in callback-oriented code, you wind up writing `function` one hell of a
lot. CoffeeScript gives us the `->` operator, combining the brevity of
Ruby with the simplicity of Javascript:
```coffeescript
thrice: (f) ->
  f()
  f()
  f()

thrice -> puts "OHAI"
```
Which translates to:
```javascript
(function(){
  var thrice;
  thrice = function(f) {
    f();
    f();
    return f();
  };
  thrice(function() {
    return puts("OHAI");
  });
})();
```
I'll tell you what that is: MONEY. Money in the BANK.
## It's Node
Though not dependent upon it, CoffeeScript is built to run on top of
[Node.js](http://nodejs.org/). This means you can take advantage of all
the incredible work people are doing with Node, including the
[Express](http://expressjs.com/) web framework, the [Redis Node
Client](https://github.com/fictorial/redis-node-client), and
[Connect](https://github.com/senchalabs/connect), a middleware framework
along the lines of [Rack](http://rack.rubyforge.org/). What's more, its
integration with Node allows you to run CoffeeScript programs from the
command line just like you would Ruby code.
CoffeeScript is an exciting technology, as both a standalone language
and as a piece of a larger Node.js toolkit. Take a look at
[Defer](http://gfxmonk.net/2010/07/04/defer-taming-asynchronous-javascript-with-coffeescript.html)
to see what the language might soon be capable of, and if you're
participating in this year's [Node.js
Knockout](http://nodeknockout.com/), watch out for the
[Rocketpants](http://nodeknockout.com/teams/2eb41a4c31f50c044a280000).

---
title: "Convert a Ruby Method to a Lambda"
date: 2011-04-26T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/convert-ruby-method-to-lambda/
---
Last week I
[tweeted](https://twitter.com/#!/deisinger/status/60706017037660160):
> Convert a method to a lambda in Ruby: lambda(&method(:events_path)).
> OR JUST USE JAVASCRIPT.
It might not be clear what I was talking about or why it would be
useful, so allow me to elaborate. Say you've got the following bit of
Javascript:
```javascript
var ytmnd = function() {
  alert("you're the man now " + (arguments[0] || "dog"));
};
```
Calling `ytmnd()` gets us `you're the man now dog`, while
`ytmnd("david")` yields `you're the man now david`. Calling simply
`ytmnd` gives us a reference to the function that we're free to pass
around and call at a later time. Consider now the following Ruby code:
```ruby
def ytmnd(name = "dog")
  puts "you're the man now #{name}"
end
```
First, aren't default argument values and string interpolation awesome?
Love you, Ruby. Just as with our Javascript function, calling `ytmnd()`
prints "you're the man now dog", and `ytmnd("david")` also works as
you'd expect. But. BUT. Running `ytmnd` returns *not* a reference to the
method, but rather calls it outright, leaving you with nothing but Sean
Connery's timeless words.
To duplicate Javascript's behavior, you can convert the method to a
lambda with `sean = lambda(&method(:ytmnd))`. Now you've got something
you can call with `sean.call` or `sean.call("david")` and pass around
with `sean`.
BUT WAIT. Everything in Ruby is an object, even methods. And as it turns
out, a method object behaves very much like a lambda. So rather than
saying `sean = lambda(&method(:ytmnd))`, you can simply say
`sean = method(:ytmnd)`, and then call it as if it were a lambda with
`.call` or `[]`. Big ups to
[Justin](https://www.viget.com/about/team/jmarney/) for that knowledge
bomb.
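Here's the whole trick in one self-contained sketch (returning the string instead of printing it, so the results are easy to see):

```ruby
# A method we want to pass around like a first-class function.
def ytmnd(name = "dog")
  "you're the man now #{name}"
end

# method() hands back a Method object -- a reference we can store,
# pass around, and call later, just like a lambda.
sean = method(:ytmnd)

sean.call           # => "you're the man now dog"
sean.call("david")  # => "you're the man now david"
sean["neo"]         # Method#[] is an alias for #call
```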
### WHOOOO CARES
All contrivances aside, there are real-life instances where you'd want
to take advantage of this language feature. Imagine a Rails partial that
renders a list of filtered links for a given model. How would you tell
the partial where to send the links? You could pass in a string and use
old-school `:action` and `:controller` params or use `eval` (yuck). You
could create the lambda the long way with something like
`:base_url => lambda { |*args| articles_path(*args) }`, but using
`method(:articles_path)` accomplishes the same thing with much less line
noise.
I'm not sure it would have ever occurred to me to do something like this
before I got into Javascript. Just goes to show that if you want to get
better as a Rubyist, a great place to start is with a different language
entirely.

---
title: "cURL and Your Rails 2 App"
date: 2008-03-28T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/curl-and-your-rails-2-app/
---
If you're anything like me, you've used
[cURL](https://en.wikipedia.org/wiki/CURL) to download a batch of MP3
files from the web, or to move a TAR file from one remote server to
another. It might come as a surprise, then, that cURL is a full-featured
HTTP client, which makes it perfect for interacting with RESTful web
services like the ones encouraged by Rails 2. To illustrate, let's
create a small Rails app called 'tv_show':
```
rails tv_show
cd tv_show
script/generate scaffold character name:string action:string
rake db:migrate
script/server
```
Fire up your web browser and create a few characters. Once you've done
that, open a new terminal window and try the following:
```
curl http://localhost:3000/characters.xml
```
You'll get a nice XML representation of your characters:
```xml
<?xml version="1.0" encoding="UTF-8"?>
<characters type="array">
  <character>
    <id type="integer">1</id>
    <name>George Sr.</name>
    <action>goes to jail</action>
    <created-at type="datetime">2008-03-28T11:01:57-04:00</created-at>
    <updated-at type="datetime">2008-03-28T11:01:57-04:00</updated-at>
  </character>
  <character>
    <id type="integer">2</id>
    <name>Gob</name>
    <action>rides a Segway</action>
    <created-at type="datetime">2008-03-28T11:02:07-04:00</created-at>
    <updated-at type="datetime">2008-03-28T11:02:12-04:00</updated-at>
  </character>
  <character>
    <id type="integer">3</id>
    <name>Tobias</name>
    <action>wears cutoffs</action>
    <created-at type="datetime">2008-03-28T11:02:20-04:00</created-at>
    <updated-at type="datetime">2008-03-28T11:02:20-04:00</updated-at>
  </character>
</characters>
```
You can retrieve the representation of a specific character by
specifying his ID in the URL:
```
dce@roflcopter ~ > curl http://localhost:3000/characters/1.xml
<?xml version="1.0" encoding="UTF-8"?>
<character>
  <id type="integer">1</id>
  <name>George Sr.</name>
  <action>goes to jail</action>
  <created-at type="datetime">2008-03-28T11:01:57-04:00</created-at>
  <updated-at type="datetime">2008-03-28T11:01:57-04:00</updated-at>
</character>
```
To create a new character, issue a POST request, using the `-X` flag to
specify the HTTP method and the `-d` flag to define the request body:
```
curl -X POST -d "character[name]=Lindsay&character[action]=does+nothing" http://localhost:3000/characters.xml
```
Here's where things get interesting: unlike most web browsers, which
only support GET and POST, cURL supports the complete set of HTTP
actions. If we want to update one of our existing characters, we can
issue a PUT request to the URL of that character's representation, like
so:
```
curl -X PUT -d "character[action]=works+at+clothing+store" http://localhost:3000/characters/4.xml
```
If we want to delete a character, issue a DELETE request:
```
curl -X DELETE http://localhost:3000/characters/1.xml
```
For some more sophisticated uses of REST and Rails, check out
[rest-client](https://rest-client.heroku.com/rdoc/) and
[ActiveResource](http://ryandaigle.com/articles/2006/06/30/whats-new-in-edge-rails-activeresource-is-here).

---
title: "DevNation Coming to San Francisco"
date: 2010-07-29T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/devnation-coming-to-san-francisco/
---
On Saturday, August 14th, we're taking the
[DevNation](http://devnation.us/) tour across the country for our first
ever stop in the Bay Area. Our friends at [Engine
Yard](http://www.engineyard.com/) will be hosting us for a day of talks,
hacking, and discussion. The lineup is our finest to date, featuring,
among others, speakers from [Pivotal Labs](http://pivotallabs.com/),
[LinkedIn](http://linkedin.com/), [Basho](http://basho.com/), and
[Yahoo!](http://yahoo.com/) and capped off by a keynote from [Chris
Wanstrath](http://chriswanstrath.com/)
([defunkt](https://twitter.com/defunkt) of [GitHub](https://github.com/)
fame). As always, breakfast and lunch will be provided.
If you're in the Bay Area, we'd love to meet you. Registration is only
\$50 if you sign up by this Saturday, so save your money for the happy
hour and [sign up now](http://devnation.us/events/9).

---
title: "Diving into Go: A Five-Week Intro"
date: 2014-04-25T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/diving-into-go-a-five-week-intro/
---
One of my favorite parts of being a developer here at Viget is our
[developer book club](https://viget.com/extend/confident-ruby-a-review).
We've read [some](http://www.confidentruby.com/)
[fantastic](http://www.poodr.com/)
[books](http://martinfowler.com/books/nosql.html), but for our most
recent go-round, we decided to try something different. A few of us have
been interested in the [Go programming language](https://golang.org/)
for some time, so we decided to combine two free online texts, [*An
Introduction to Programming in Go*](http://www.golang-book.com/) and
[*Go By Example*](https://gobyexample.com/), plus a few other resources,
into a short introduction to the language.
[Chris](https://viget.com/about/team/cjones) and
[Ryan](https://viget.com/about/team/rfoster) put together a curriculum
that I thought was too good not to share with the internet at large.
## Week 1 {#week1}
Chapter 1: [Getting Started](http://www.golang-book.com/1)
- Files and Folders
- The Terminal
- Text Editors
- Go Tools
- **Go By Example**
- [Hello World](https://gobyexample.com/hello-world)
Chapter 2: [Your First Program](http://www.golang-book.com/2)
- How to Read a Go Program
Chapter 3: [Types](http://www.golang-book.com/3)
- Numbers
- Strings
- Booleans
- **Go By Example**
- [Values](https://gobyexample.com/values)
- [Random Numbers](https://gobyexample.com/random-numbers)
- [String Functions](https://gobyexample.com/string-functions)
- [String Formatting](https://gobyexample.com/string-formatting)
- [Regular
Expressions](https://gobyexample.com/regular-expressions)
Chapter 4: [Variables](http://www.golang-book.com/4)
- How to Name a Variable
- Scope
- Constants
- Defining Multiple Variables
- An Example Program
- **Go By Example**
- [Variables](https://gobyexample.com/variables)
- [Constants](https://gobyexample.com/constants)
- [Number Parsing](https://gobyexample.com/number-parsing)
- [Time](https://gobyexample.com/time)
- [Epoch](https://gobyexample.com/epoch)
- [Time Formatting /
Parsing](https://gobyexample.com/time-formatting-parsing)
Chapter 5: [Control Structures](http://www.golang-book.com/5)
- For
- If
- Switch
- **Go By Example**
- [For](https://gobyexample.com/for)
- [If/Else](https://gobyexample.com/if-else)
- [Switch](https://gobyexample.com/switch)
- [Line Filters](https://gobyexample.com/line-filters)
Chapter 6: [Arrays, Slices and Maps](http://www.golang-book.com/6)
- Arrays
- Slices
- Maps
- **Go By Example**
- [Arrays](https://gobyexample.com/arrays)
- [Slices](https://gobyexample.com/slices)
- [Maps](https://gobyexample.com/maps)
- [Range](https://gobyexample.com/range)
- **Blog Posts**
- [Go Slices: usage and
internals](https://blog.golang.org/go-slices-usage-and-internals)
- [Arrays, Slices (and strings): The mechanics of
'append'](https://blog.golang.org/slices)
## Week 2 {#week2}
Chapter 7: [Functions](http://www.golang-book.com/7)
- Your Second Function
- Returning Multiple Values
- Variadic Functions
- Closure
- Recursion
- Defer, Panic & Recover
- **Go By Example**
- [Functions](https://gobyexample.com/functions)
- [Multiple Return
Values](https://gobyexample.com/multiple-return-values)
- [Variadic Functions](https://gobyexample.com/variadic-functions)
- [Closures](https://gobyexample.com/closures)
- [Recursion](https://gobyexample.com/recursion)
- [Panic](https://gobyexample.com/panic)
- [Defer](https://gobyexample.com/defer)
- [Collection
Functions](https://gobyexample.com/collection-functions)
Chapter 8: [Pointers](http://www.golang-book.com/8)
- The \* and & operators
- new
- **Go By Example**
- [Pointers](https://gobyexample.com/pointers)
- [Reading Files](https://gobyexample.com/reading-files)
- [Writing Files](https://gobyexample.com/writing-files)
## Week 3 {#week3}
Chapter 9: [Structs and Interfaces](http://www.golang-book.com/9)
- Structs
- Methods
- Interfaces
- **Go By Example**
- [Structs](https://gobyexample.com/structs)
- [Methods](https://gobyexample.com/methods)
- [Interfaces](https://gobyexample.com/interfaces)
- [Errors](https://gobyexample.com/errors)
- [JSON](https://gobyexample.com/json)
Chapter 10: [Concurrency](http://www.golang-book.com/10)
- Goroutines
- Channels
- **Go By Example**
- [Goroutines](https://gobyexample.com/goroutines)
- [Channels](https://gobyexample.com/channels)
- [Channel Buffering](https://gobyexample.com/channel-buffering)
- [Channel
Synchronization](https://gobyexample.com/channel-synchronization)
- [Channel Directions](https://gobyexample.com/channel-directions)
- [Select](https://gobyexample.com/select)
- [Timeouts](https://gobyexample.com/timeouts)
- [Non-Blocking Channel
Operations](https://gobyexample.com/non-blocking-channel-operations)
- [Closing Channels](https://gobyexample.com/closing-channels)
- [Range over
Channels](https://gobyexample.com/range-over-channels)
- [Timers](https://gobyexample.com/timers)
- [Tickers](https://gobyexample.com/tickers)
- [Worker Pools](https://gobyexample.com/worker-pools)
- [Rate Limiting](https://gobyexample.com/rate-limiting)
## Week 4 {#week4}
- **Videos**
- [Lexical Scanning in
Go](https://www.youtube.com/watch?v=HxaD_trXwRE)
- [Concurrency is not
parallelism](https://blog.golang.org/concurrency-is-not-parallelism)
- **Blog Posts**
- [Share Memory By
Communicating](https://blog.golang.org/share-memory-by-communicating)
- [A GIF decoder: an exercise in Go
interfaces](https://blog.golang.org/gif-decoder-exercise-in-go-interfaces)
- [Error handling and
Go](https://blog.golang.org/error-handling-and-go)
- [Defer, Panic, and
Recover](https://blog.golang.org/defer-panic-and-recover)
## Week 5 {#week5}
Chapter 11: [Packages](http://www.golang-book.com/11)
- Creating Packages
- Documentation
Chapter 12: [Testing](http://www.golang-book.com/12)
Chapter 13: [The Core Packages](http://www.golang-book.com/13)
- Strings
- Input / Output
- Files & Folders
- Errors
- Containers & Sort
- Hashes & Cryptography
- Servers
- Parsing Command Line Arguments
- Synchronization Primitives
- **Go By Example**
- [Sorting](https://gobyexample.com/sorting)
- [Sorting by
Functions](https://gobyexample.com/sorting-by-functions)
- [URL Parsing](https://gobyexample.com/url-parsing)
- [SHA1 Hashes](https://gobyexample.com/sha1-hashes)
- [Base64 Encoding](https://gobyexample.com/base64-encoding)
- [Atomic Counters](https://gobyexample.com/atomic-counters)
- [Mutexes](https://gobyexample.com/mutexes)
- [Stateful
Goroutines](https://gobyexample.com/stateful-goroutines)
- [Command-Line
Arguments](https://gobyexample.com/command-line-arguments)
- [Command-Line Flags](https://gobyexample.com/command-line-flags)
- [Environment
Variables](https://gobyexample.com/environment-variables)
- [Spawning Processes](https://gobyexample.com/spawning-processes)
- [Exec'ing Processes](https://gobyexample.com/execing-processes)
- [Signals](https://gobyexample.com/signals)
- [Exit](https://gobyexample.com/exit)
Chapter 14: [Next Steps](http://www.golang-book.com/14)
- Study the Masters
- Make Something
- Team Up
* * *
Go is an exciting language, and a great complement to the Ruby work we
do. Working through this program was a fantastic intro to the language
and prepared us to create our own Go programs for great justice. Give it
a shot and let us know how it goes.
---
title: "Email Photos to an S3 Bucket with AWS Lambda (with Cropping, in Ruby)"
date: 2021-04-07T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/email-photos-to-an-s3-bucket-with-aws-lambda-with-cropping-in-ruby/
---
In my annual search for holiday gifts, I came across this [digital photo
frame](https://auraframes.com/digital-frames/color/graphite) that lets
you load photos via email. Pretty neat, but I ultimately didn't buy it
for a few reasons: 1) it's pretty expensive, 2) I'd be trusting my
family's data to an unknown entity, and 3) if the company ever goes
under or just decides to stop supporting the product, it might stop
working or at least stop updating. But I got to thinking: could I build
something like this myself? I'll save the full details for a later
article, but the first thing I needed to figure out was how to get
photos from an email into an S3 bucket that could be synced onto a
device.
I try to keep up with the various AWS offerings, and Lambda has been on
my radar for a few years, but I haven't had the opportunity to use it
in anger. Services like this really excel at the extremes of web
software --- at the low end, where you don't want to incur the costs of
an always-on server, and at the high end, where you don't want to pay
for a whole fleet of them. Most of our work falls in the middle, where
developer time is way more costly than hosting infrastructure, and so
using a more full-featured stack running on a handful of conventional
servers is usually the best option. But an email-to-S3 gateway is a
perfect use case for on-demand computing.
## The Services {#the-services}
To make this work, we need to connect several AWS services:
- [Route 53](https://aws.amazon.com/route53/) (for domain registration
and DNS configuration)
- [SES](https://aws.amazon.com/ses/) (for setting up the email address
  and "rule set" that triggers the Lambda function)
- [S3](https://aws.amazon.com/s3/) (for storing the contents of the
incoming emails as well as the resulting photos)
- [SNS](https://aws.amazon.com/sns/) (for notifying the Lambda
function of an incoming email)
- [Lambda](https://aws.amazon.com/lambda) (to process the incoming
email, extract the photos, crop them, and store the results)
- [CloudWatch](https://aws.amazon.com/cloudwatch) (for debugging
issues with the code)
- [IAM](https://aws.amazon.com/iam) (for setting the appropriate
permissions)
It's a lot, to be sure, but it comes together pretty easily:
1. Create a couple buckets in S3, one to hold emails, the other to hold
   photos.
2. Register a domain ("hosted zone") in Route 53.
3. Go to Simple Email Service > Domains and verify a new domain,
   selecting the domain you just registered in Route 53.
4. Go to the SES "rule sets" interface and click "Create Rule."
   Give it a name and an email address you want to send your photos to.
5. For the rule action, pick "S3" and then the email bucket you
   created in step 1 (we have to use S3 rather than just calling the
   Lambda function directly because our emails exceed the maximum
   payload size). Make sure to add an SNS (Simple Notification Service)
   topic to go along with your S3 action, which is how we'll trigger
   our Lambda function.
6. Go to the Lambda interface and create a new function. Give it a name
   that makes sense for you and pick Ruby 2.7 as the language.
7. With your skeleton function created, click "Add Trigger" and
   select the SNS topic you created in step 5. You'll need to add
   ImageMagick as a layer[^1] and bump the memory and timeout (I used 512 MB
   and 30 seconds, respectively, but you should use whatever makes you
   feel good in your heart).
8. Create a couple environment variables: `BUCKET` should be the name of
   the S3 bucket you want to upload photos to, and `AUTHORIZED_EMAILS`
   should hold all the valid email addresses, separated by semicolons.
9. Give your function permissions to read and write to/from the two
   buckets.
10. And finally, the code. We'll manage that locally rather than using
    the web-based interface, since we need to include a couple gems.
## The Code {#the-code}
So as I said literally one sentence ago, we manage the code for this
Lambda function locally, since we need to include a couple gems:
[`mail`](https://github.com/mikel/mail) to parse the emails stored in S3
and [`mini_magick`](https://github.com/minimagick/minimagick) to do the
cropping. If you don't need cropping, feel free to leave that one out
and update the code accordingly. Without further ado:
``` ruby
require 'json'
require 'aws-sdk-s3'
require 'mail'
require 'mini_magick'
BUCKET = ENV["BUCKET"]
AUTHORIZED_EMAILS = ENV["AUTHORIZED_EMAILS"].split(";")
def lambda_handler(event:, context:)
message = JSON.parse(event["Records"][0]["Sns"]["Message"])
s3_info = message["receipt"]["action"]
client = Aws::S3::Client.new(region: "us-east-1")
# Get the incoming email from S3
object = client.get_object(
bucket: s3_info["bucketName"],
key: s3_info["objectKey"]
)
email = Mail.new(object.body.read)
sender = email.from.first
# Confirm that the sender is in the list, otherwise abort
unless AUTHORIZED_EMAILS.include?(sender)
puts "Unauthorized email: #{sender}"
exit
end
# Get all the images out of the email
attachments = email.parts.filter { |p| p.content_type =~ /^image/ }
attachments.each do |attachment|
# First, just put the original photo in the `photos` subdirectory
client.put_object(
body: attachment.body.to_s,
bucket: BUCKET,
key: "photos/#{attachment.filename}"
)
thumb = MiniMagick::Image.read(attachment.body.to_s)
# Crop the photo down for displaying on a webpage
thumb.combine_options do |i|
i.auto_orient
i.resize "440x264^"
i.gravity "center"
i.extent "440x264"
end
client.put_object(
body: thumb.to_blob,
bucket: BUCKET,
key: "thumbs/#{attachment.filename}"
)
dithered = MiniMagick::Image.read(attachment.body.to_s)
# Crop and dither the photo for displaying on an e-ink screen
dithered.combine_options do |i|
i.auto_orient
i.resize "880x528^"
i.gravity "center"
i.extent "880x528"
i.ordered_dither "o8x8"
i.monochrome
end
client.put_object(
body: dithered.to_blob,
bucket: BUCKET,
key: "dithered/#{attachment.filename}"
)
puts "Photo '#{attachment.filename}' uploaded"
end
{
statusCode: 200,
body: JSON.generate("#{attachments.size} photo(s) uploaded.")
}
end
```
If you're unfamiliar with dithering, [here's a great
post](https://surma.dev/things/ditherpunk/) with more info, but in
short, it's a way to simulate grayscale with only black and white
pixels, like what you find on an e-ink/e-paper display.
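To make the idea concrete, here's a toy, plain-Ruby sketch of *ordered* dithering (this is an illustration of the concept, not ImageMagick's actual implementation): each pixel is compared against an entry from a small repeating threshold matrix instead of one global cutoff, so mid-gray regions come out as a mix of black and white pixels.

``` ruby
# Toy ordered dithering on a tiny grayscale "image" (values 0-255).
# ImageMagick's `-ordered-dither o8x8` uses an 8x8 matrix; a 2x2
# Bayer matrix shows the same idea in miniature.
BAYER_2X2 = [
  [0, 2],
  [3, 1],
].freeze

def ordered_dither(pixels)
  n = BAYER_2X2.length
  pixels.each_with_index.map do |row, y|
    row.each_with_index.map do |value, x|
      # Scale the matrix entry into 0-255 to get this pixel's
      # personal threshold, then snap to pure black or white.
      threshold = (BAYER_2X2[y % n][x % n] + 0.5) * 255.0 / (n * n)
      value > threshold ? 255 : 0
    end
  end
end

gradient = [[32, 96, 160, 224]] * 4
ordered_dither(gradient).first # => [255, 0, 255, 255]
```

Because neighboring pixels get different thresholds, a flat gray area dithers into a checker-like pattern rather than a solid block.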
## Deploying {#deploying}
To deploy your code, you'll use the [AWS
CLI](https://aws.amazon.com/cli/). [Here's a pretty good
walkthrough](https://docs.aws.amazon.com/lambda/latest/dg/ruby-package.html)
of how to do it, but I'll summarize:
1. Install your gems locally with
`bundle install --path vendor/bundle`.
2. Edit your code (in our case, it lives in `lambda_function.rb`).
3. Make a simple shell script that zips up your function and gems and
sends it up to AWS:
``` sh
#!/bin/sh
zip -r function.zip lambda_function.rb vendor \
  && aws lambda update-function-code \
       --function-name [lambda-function-name] \
       --zip-file fileb://function.zip
```
And that's it! A simple, resilient, cheap way to email photos into an
S3 bucket with no servers in sight (at least none you care about or have
to manage).
------------------------------------------------------------------------
In closing, this project was a great way to get familiar with Lambda and
the wider AWS ecosystem. It came together in just a few hours and is
still going strong several months later. My typical bill is something on
the order of $0.50 per month. If anything goes wrong, I can pop into
CloudWatch to view the result of the function, but so far, [so
smooth](https://static.viget.com/DP823L7XkAIJ_xK.jpg).
I'll be back in a few weeks detailing the rest of the project. Stay
tuned!
------------------------------------------------------------------------
[^1]: I used the ARN `arn:aws:lambda:us-east-1:182378087270:layer:image-magick:1`
---
title: "Extract Embedded Text from PDFs with Poppler in Ruby"
date: 2022-02-10T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/extract-embedded-text-from-pdfs-with-poppler-in-ruby/
---
A recent client request had us adding an archive of magazine issues
dating back to the 1980s. Pretty straightforward stuff, with the hiccup
that they wanted the magazine content to be searchable. Fortunately, the
example PDFs they provided us had embedded text content[^1], i.e. the
text was selectable. The trick was to figure out how to programmatically
extract that content.
Our first attempt involved the [`pdf-reader`
gem](https://rubygems.org/gems/pdf-reader/versions/2.2.1), which worked
admirably, with the caveat that it had a little bit of trouble with
multi-column / art-directed layouts[^2], which was a lot of the content
we were dealing with.
A bit of research uncovered [Poppler](https://poppler.freedesktop.org/),
"a free software utility library for rendering Portable Document Format
(PDF) documents," which includes text extraction functionality and has a
corresponding [Ruby
library](https://rubygems.org/gems/poppler/versions/3.4.9). This worked
great and here's how to do it.
## Install Poppler
Poppler installs as a standalone library. On Mac:
brew install poppler
On (Debian-based) Linux:
apt-get install libgirepository1.0-dev libpoppler-glib-dev
In a (Debian-based) Dockerfile:

    RUN apt-get update && \
        apt-get install -y libgirepository1.0-dev libpoppler-glib-dev && \
        rm -rf /var/lib/apt/lists/*
Then, in your `Gemfile`:
gem "poppler"
## Use it in your application
Extracting text from a PDF document is super straightforward:
document = Poppler::Document.new(path_to_pdf)
document.map { |page| page.get_text }.join
The results are really good, and Poppler understands complex page
layouts to an impressive degree. Additionally, the library seems to
support a lot more [advanced
functionality](https://www.rubydoc.info/gems/poppler/3.4.9). If you ever
need to extract text from a PDF, Poppler is a good choice.
------------------------------------------------------------------------
[^1]: Note that we're not talking about extracting text from images/OCR;
    if you need to take an image-based PDF and add a selectable text
    layer to it, I recommend
    [OCRmyPDF](https://pypi.org/project/ocrmypdf/).

[^2]: So for a page like this:

        +-----------------+---------------------+
        | This is a story | my life got flipped |
        | all about how   | turned upside-down  |
        +-----------------+---------------------+

    `pdf-reader` would parse this into "This is a story my life got
    flipped all about how turned upside-down," which led to issues when
    searching for multi-word phrases.
---
title: "First-Class Failure"
date: 2014-07-22T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/first-class-failure/
---
As a developer, nothing makes me more nervous than third-party
dependencies and things that can fail in unpredictable ways[^1]. More often
than not, these two go hand-in-hand, taking our elegant, robust
applications and dragging them down to the lowest common denominator of
the services they depend upon. A recent internal project called for
slurping in and then reporting against data from
[Harvest](http://www.getharvest.com/), our time tracking service of
choice and a fickle beast on its very best days.
I knew that both components (`/(im|re)porting/`) were prone to failure.
How to handle that failure in a graceful way, so that our users see
something more meaningful than a 500 page, and our developers have a
fighting chance at tracking and fixing the problem? Here's the approach
we took.
## Step 1: Model the processes {#step1:modeltheprocesses}
Rather than importing the data or generating the report with procedural
code, create ActiveRecord models for them. In our case, the models are
`HarvestImport` and `Report`. When a user initiates a data import or a
report generation, save a new record to the database *immediately*,
before doing any work.
## Step 2: Give 'em status {#step2:giveemstatus}
These models have a `status` column. We default it to "queued," since we
offload most of the work to a series of [Resque](http://resquework.org/)
tasks, but you can use "pending" or somesuch if that's more your speed.
They also have an `error` field for reasons that will become apparent
shortly.
## Step 3: Define an interface {#step3:defineaninterface}
Into both of these models, we include the following module:
    module ProcessingStatus
      def mark_processing
        update_attributes(status: "processing")
      end

      def mark_successful
        update_attributes(status: "success", error: nil)
      end

      def mark_failure(error)
        update_attributes(status: "failed", error: error.to_s)
      end

      def process(cleanup = nil)
        mark_processing
        yield
        mark_successful
      rescue => ex
        mark_failure(ex)
      ensure
        cleanup.try(:call)
      end
    end
Lines 2--12 should be self-explanatory: methods for setting the object's
status. The `mark_failure` method takes an exception object, which it
stores in the model's `error` field, and `mark_successful` clears said
error.
Line 14 (the `process` method) is where things get interesting. Calling
this method immediately marks the object "processing," and then yields
to the provided block. If the block executes without error, the object
is marked "success." If any[^2] exception is raised, the object is
marked "failed" and the error message is logged. Either way, if a
`cleanup` lambda is provided, we call it (courtesy of Ruby's
[`ensure`](http://ruby.activeventure.com/usersguide/rg/ensure.html)
keyword).
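To watch those transitions outside of Rails, here's a plain-Ruby sketch: the module is reproduced verbatim except that `cleanup&.call` stands in for ActiveSupport's `try(:call)`, and `FakeReport` is a hypothetical stand-in that stubs `update_attributes` so we can observe the status directly.

``` ruby
module ProcessingStatus
  def mark_processing
    update_attributes(status: "processing")
  end

  def mark_successful
    update_attributes(status: "success", error: nil)
  end

  def mark_failure(error)
    update_attributes(status: "failed", error: error.to_s)
  end

  def process(cleanup = nil)
    mark_processing
    yield
    mark_successful
  rescue => ex
    mark_failure(ex)
  ensure
    cleanup&.call # plain-Ruby stand-in for ActiveSupport's try(:call)
  end
end

class FakeReport
  include ProcessingStatus
  attr_reader :status, :error

  # Stub of ActiveRecord's update_attributes: just set the ivars.
  def update_attributes(attrs)
    @status = attrs[:status] if attrs.key?(:status)
    @error  = attrs[:error]  if attrs.key?(:error)
  end
end

report = FakeReport.new
report.process { raise "Harvest API timeout" }
report.status # => "failed"
report.error  # => "Harvest API timeout"
```

The exception never escapes `process`; it's captured into the `error` field where users and developers can see it.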
## Step 4: Wrap it up {#step4:wrapitup}
Now we can wrap our nasty, fail-prone reporting code in a `process` call
for great justice.
    class ReportGenerator
      attr_accessor :report

      def generate_report
        report.process -> { File.delete(file_path) } do
          # do some fail-prone work
        end
      end

      # ...
    end
The benefits are almost too numerous to count: 1) no 500 pages, 2)
meaningful feedback for users, and 3) super detailed diagnostic info for
developers -- better than something like
[Honeybadger](https://www.honeybadger.io/), which doesn't provide nearly
the same level of context. (`-> { File.delete(file_path) }` is just a
little bit of file cleanup that should happen regardless of outcome.)
* * *
I've always found it an exercise in futility to try to predict all the
ways a system can fail when integrating with an external dependency.
Being able to blanket rescue any exception and store it in a way that's
meaningful to users *and* developers has been hugely liberating and has
contributed to a seriously robust platform. This technique may not be
applicable in every case, but when it fits, [it's
good](https://www.youtube.com/watch?v=HNfciDzZTNM&t=1m40s).
------------------------------------------------------------------------
[^1]: Well, [almost
    nothing](https://github.com/github/hubot/blob/master/src/scripts/google-images.coffee#L5).

[^2]: [Any descendant of
    `StandardError`](http://stackoverflow.com/a/10048406), in any event.
---
title: "Five Turbo Lessons I Learned the Hard Way"
date: 2021-08-02T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/five-turbo-lessons-i-learned-the-hard-way/
---
We've been using [Turbo](https://turbo.hotwired.dev/) on our latest
client project (a Ruby on Rails web application), and after a slight
learning curve, we've been super impressed by how much dynamic behavior
it's allowed us to add while writing very little code. We have hit some
gotchas (or at least some undocumented behavior), often with solutions
that lie deep in GitHub issue threads. Here are a few of the things
we've discovered along our Turbo journey.
### Turbo Stream fragments are server responses (and you don't have to write them by hand) {#turbo-stream-fragments-are-server-responses}
[The docs on Turbo Streams](https://turbo.hotwired.dev/handbook/streams)
kind of bury the lede. They start out with the markup to update the
client, and only [further
down](https://turbo.hotwired.dev/handbook/streams#streaming-from-http-responses)
illustrate how to use them in a Rails app. Here's the thing: you don't
really need to write any stream markup at all. It's (IMHO) cleaner to
just use the built-in Rails methods, i.e.
render turbo_stream: turbo_stream.update("flash", partial: "shared/flash")
And though [DHH would
disagree](https://github.com/hotwired/turbo-rails/issues/77#issuecomment-757349251),
you can use an array to make multiple updates to the page.
### Send `:unprocessable_entity` to re-render a form with errors {#send-unprocessable-entity-to-re-render-a-form-with-errors}
For create/update actions, we follow the usual pattern of redirect on
success, re-render the form on error. Once you enable Turbo, however,
that direct rendering stops working. The solution is to [return a 422
status](https://github.com/hotwired/turbo-rails/issues/12), though we
prefer the `:unprocessable_entity` alias (so like
`render :new, status: :unprocessable_entity`). This seems to work well
with and without JavaScript and inside or outside of a Turbo frame.
### Use `data-turbo="false"` to break out of a frame {#use-data-turbo-false-to-break-out-of-a-frame}
If you have a link inside of a frame that you want to bypass the default
Turbo behavior and trigger a full page reload, [include the
`data-turbo="false"`
attribute](https://github.com/hotwired/turbo/issues/45#issuecomment-753444256)
(or use `data: { turbo: false }` in your helper).
*Update from good guy [Leo](https://www.viget.com/about/team/lbauza/):
you can also use
[`target="_top"`](https://turbo.hotwired.dev/handbook/frames#targeting-navigation-into-or-out-of-a-frame)
to load all the content from the response without doing a full page
reload, which seems (to me, David) like what you typically want except
under specific circumstances.*
### Use `requestSubmit()` to trigger a Turbo form submission via JavaScript {#use-requestSubmit-to-trigger-a-turbo-form-submission-via-javaScript}
If you have some JavaScript (say, in a Stimulus controller) that you
want to trigger a form submission with a Turbo response, you can't use
the usual `submit()` method. [This discussion
thread](https://discuss.hotwired.dev/t/triggering-turbo-frame-with-js/1622/15)
sums it up well:
> It turns out that the turbo-stream mechanism listens for form
> submission events, and for some reason the submit() function does not
> emit a form submission event. That means that it'll bring back a
> normal HTML response. That said, it looks like there's another method,
> requestSubmit() which does issue a submit event. Weird stuff from
> JavaScript land.
So, yeah, use `requestSubmit()` (i.e. `this.formTarget.requestSubmit()`)
and you're golden (except in Safari, where you might need [this
polyfill](https://github.com/javan/form-request-submit-polyfill)).
### Loading the same URL multiple times in a Turbo Frame {#loading-the-same-url-multiple-times-in-a-turbo-frame}
I hit an interesting issue with a form inside a frame: in a listing of
comments, I set it up where you could click an edit link, and the
content would be swapped out for an edit form using a Turbo Frame.
Update and save your comment, and the new content would render. Issue
was, if you hit the edit link *again*, nothing would happen. Turns out,
a Turbo frame won't reload a URL if it thinks it already has the
contents of that URL (which it tracks in a `src` attribute).
The [solution I
found](https://github.com/hotwired/turbo/issues/245#issuecomment-847711320)
was to append a timestamp to the URL to ensure it's always unique.
Works like a charm.
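As a sketch of that workaround (`busted_url` is a hypothetical helper name, not part of Turbo or Rails):

``` ruby
# Hypothetical helper: append a timestamp query param so a Turbo Frame
# never considers the URL already-loaded. Just the workaround from the
# linked issue, sketched out.
def busted_url(url, now: Time.now)
  separator = url.include?("?") ? "&" : "?"
  "#{url}#{separator}t=#{now.to_i}"
end

busted_url("/comments/5/edit", now: Time.at(1_627_000_000))
# => "/comments/5/edit?t=1627000000"
```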
*Update from good guy
[Joshua](https://www.viget.com/about/team/jpease/): this has been fixed
in a [recent
update](https://github.com/hotwired/turbo/releases/tag/v7.0.0-beta.7).*
These small issues aside, Turbo has been a BLAST to work with and has
allowed us to easily build a highly dynamic app that works surprisingly
well even with JavaScript disabled. We're excited to see how this
technology develops.
---
title: "“Friends” (Undirected Graph Connections) in Rails"
date: 2021-06-09T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/friends-undirected-graph-connections-in-rails/
---
No, sorry, not THOSE friends. But if you're interested in how to do
some graph stuff in a relational database, SMASH that play button and
read on.
My current project is a social network of sorts, and includes the
ability for users to connect with one another. I've built this
functionality once or twice before, but I've never come up with a
database implementation I was perfectly happy with. This type of
relationship is perfect for a [graph
database](https://en.wikipedia.org/wiki/Graph_database), but we're
using a relational database, and introducing a second data store
wouldn't be worth the overhead.
The most straightforward implementation would involve a join model
(`Connection` or somesuch) with two foreign key columns pointed at the
same table (`users` in our case). When you want to pull back a user's
contacts, you'd have to query against both foreign keys, and then pull
back the opposite key to retrieve the list. Alternately, you could store
connections in both directions and hope that your application code
always inserts the connections in pairs (spoiler: at some point, it
won't).
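In ActiveRecord terms, that naive lookup would be something like `Connection.where("sender_id = :id OR receiver_id = :id", id: user.id)` followed by mapping each row to its opposite key. Here's a plain-Ruby sketch of the shape of it (the in-memory data and `contact_ids` helper are hypothetical, just to show the awkwardness):

``` ruby
# Sketch of the "query both foreign keys, pull back the opposite one"
# lookup from the naive join-model approach, run against in-memory data.
Connection = Struct.new(:sender_id, :receiver_id)

def contact_ids(connections, user_id)
  connections.flat_map { |c|
    # A connection counts whichever column the user appears in...
    next [] unless [c.sender_id, c.receiver_id].include?(user_id)
    # ...and the contact is whichever id is left over.
    [c.sender_id, c.receiver_id] - [user_id]
  }.uniq
end

connections = [Connection.new(1, 7), Connection.new(3, 1)]
contact_ids(connections, 1) # => [7, 3]
```

Workable, but every read has to reason about both directions, which is exactly the bookkeeping the view below the fold gets rid of.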
But what if there was a better way? I stumbled on [this article that
talks through the problem in
depth](https://inviqa.com/blog/storing-graphs-database-sql-meets-social-network),
and it led me down the path of using an SQL view and the
[`UNION`](https://www.postgresqltutorial.com/postgresql-union/)
operator, and the result came together really nicely. Let's walk
through it step-by-step.
First, we'll model the connection between two users:
``` ruby
class CreateConnections < ActiveRecord::Migration[6.1]
def change
create_table :connections do |t|
t.references :sender, null: false
t.references :receiver, null: false
t.timestamps
end
add_foreign_key :connections, :users, column: :sender_id, on_delete: :cascade
add_foreign_key :connections, :users, column: :receiver_id, on_delete: :cascade
add_index :connections,
"(ARRAY[least(sender_id, receiver_id), greatest(sender_id, receiver_id)])",
unique: true,
name: :connection_pair_uniq
end
end
```
I chose to call the foreign keys `sender` and `receiver`, not that I
particularly care who initiated the connection, but it seemed better
than `user_1` and `user_2`. Notice the index, which ensures that a
sender/receiver pair is unique *in both directions* (so if a connection
already exists where Alice is the sender and Bob is the receiver, we
can't insert a connection where the roles are reversed). Apparently
Rails has supported [expression-based
indices](https://bigbinary.com/blog/rails-5-adds-support-for-expression-indexes-for-postgresql)
since version 5. Who knew!
With connections modeled in our database, let's set up the
relationships between user and connection. In `connection.rb`:
belongs_to :sender, class_name: "User"
belongs_to :receiver, class_name: "User"
In `user.rb`:
has_many :sent_connections,
class_name: "Connection",
foreign_key: :sender_id
has_many :received_connections,
class_name: "Connection",
foreign_key: :receiver_id
Next, we\'ll turn to the
[Scenic](https://github.com/scenic-views/scenic) gem to create a
database view that normalizes sender/receiver into user/contact. Install
the gem, then run `rails generate scenic:model user_contacts`. That'll
create a file called `db/views/user_contacts_v01.sql`, where we'll put
the following:
SELECT sender_id AS user_id, receiver_id AS contact_id
FROM connections
UNION
SELECT receiver_id AS user_id, sender_id AS contact_id
FROM connections;
Basically, we\'re using the `UNION` operator to merge two queries
together (reversing sender and receiver), then making the result
queryable via a virtual table called `user_contacts`.
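The `UNION` behavior (merge plus deduplicate, unlike `UNION ALL`) can be sketched in plain Ruby with `Array#|`, using made-up id pairs:

```ruby
# [sender_id, receiver_id] rows as stored in connections
connections = [[1, 7], [2, 3]]

# Array#| merges and deduplicates, like SQL UNION; reversing each
# pair produces the second half of the view
user_contacts = connections | connections.map { |s, r| [r, s] }
# => [[1, 7], [2, 3], [7, 1], [3, 2]]
```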
Finally, we\'ll add the contact relationships. In `user_contact.rb`:
belongs_to :user
belongs_to :contact, class_name: "User"
And in `user.rb`, right below the
`sent_connections`/`received_connections` stuff:
has_many :user_contacts
has_many :contacts, through: :user_contacts
And that's it! You'll probably want to write some validations and unit
tests, but I can't give away all my tricks (or all of my client's
code).
Here's our friendship system in action:
``` {.code-block .line-numbers}
[1] pry(main)> u1, u2 = User.first, User.last
=> [#<User id: 1 first_name: "Ross" …>, #<User id: 7 first_name: "Rachel" …>]
[2] pry(main)> u1.sent_connections.create(receiver: u2)
=> #<Connection:0x00007f813cde5f70
id: 1,
sender_id: 1,
receiver_id: 7>
[3] pry(main)> UserContact.all
=> [#<UserContact:0x00007f813ccbefc0 user_id: 7, contact_id: 1>,
#<UserContact:0x00007f813cca40f8 user_id: 1, contact_id: 7>]
[4] pry(main)> u1.contacts
=> [#<User id: 7 first_name: "Rachel" …>]
[5] pry(main)> u2.contacts
=> [#<User id: 1 first_name: "Ross" …>]
[6] pry(main)> # they're lobsters
[7] pry(main)>
```
So there it is, a simple, easily queryable vertex/edge implementation in
a vanilla Rails app. I hope you have a great day, week, month, and even
year.
------------------------------------------------------------------------
[Network Diagram Vectors by
Vecteezy](https://www.vecteezy.com/free-vector/network-diagram)
[*"I'll Be There for You" (Theme from
Friends)*](https://archive.org/details/tvtunes_31736) © 1995 The
Rembrandts

---
title: "Functional Programming in Ruby with Contracts"
date: 2015-03-31T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/functional-programming-in-ruby-with-contracts/
---
I read Thomas Reynolds' [*My Weird
Ruby*](http://awardwinningfjords.com/2015/03/03/my-weird-ruby.html) a
week or two ago, and I **loved** it. I'd never heard of the
[Contracts](https://github.com/egonSchiele/contracts.ruby) gem, but
after reading the post and the [well-written
docs](http://egonschiele.github.io/contracts.ruby/), I couldn't wait to
try it out. I'd been doing some functional programming as part of our
ongoing programming challenge series, and saw an opportunity to use
Contracts to rewrite my Ruby solution to the [One-Time
Pad](https://viget.com/extend/otp-a-language-agnostic-programming-challenge)
problem. Check out my [rewritten `encrypt`
program](https://github.com/vigetlabs/otp/blob/master/languages/Ruby/encrypt):
#!/usr/bin/env ruby
require "contracts"
include Contracts
Char = -> (c) { c.is_a?(String) && c.length == 1 }
Cycle = Enumerator::Lazy
Contract [Char, Char] => Num
def int_of_hex_chars(chars)
chars.join.to_i(16)
end
Contract ArrayOf[Num] => String
def hex_string_of_ints(nums)
nums.map { |n| n.to_s(16) }.join
end
Contract Cycle => Num
def get_mask(key)
int_of_hex_chars key.first(2)
end
Contract [], Cycle => []
def encrypt(plaintext, key)
[]
end
Contract ArrayOf[Char], Cycle => ArrayOf[Num]
def encrypt(plaintext, key)
char = plaintext.first.ord ^ get_mask(key)
[char] + encrypt(plaintext.drop(1), key.drop(2))
end
plaintext = STDIN.read.chars
key = ARGV.last.chars.cycle.lazy
print hex_string_of_ints(encrypt(plaintext, key))
Pretty cool, yeah? Compare with this [Haskell
solution](https://github.com/vigetlabs/otp/blob/master/languages/Haskell/encrypt.hs).
Some highlights:
### Typechecking
At its most basic, Contracts offers typechecking on function input and
output. Give it the expected classes of the arguments and the return
value, and you'll get a nicely formatted error message if the function
is called with something else, or returns something else.
### Custom types with lambdas {#customtypeswithlambdas}
Ruby has no concept of a single character data type -- running
`"string".chars` returns an array of single-character strings. We can
simulate a native char type using a lambda, as seen on line #6, which
says that the argument must be a string and must have a length of one.
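Stripped of the gem machinery, the lambda itself is plain Ruby and easy to poke at:

```ruby
# A "char" is any one-character string
Char = ->(c) { c.is_a?(String) && c.length == 1 }

Char.call("a")   # => true
Char.call("ab")  # => false
Char.call(97)    # => false
```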
### Tuples
If you're expecting an array of a specific length and type, you can
specify it, as I've done on line #9.
### Pattern matching {#patternmatching}
Rather than one `encrypt` method with a conditional to see if the list
is empty, we define the method twice: once for the base case (line #24)
and once for the recursive case (line #29). This keeps our functions
concise and allows us to do case-specific typechecking on the output.
### No unexpected `nil` {#nounexpectednil}
There's nothing worse than `undefined method 'foo' for nil:NilClass`,
except maybe littering your methods with presence checks. Using
Contracts, you can be sure that your functions aren't being called with
`nil`. If it happens that `nil` is an acceptable input to your function,
use `Maybe[Type]` à la Haskell.
### Lazy, circular lists {#lazycircularlists}
Unrelated to Contracts, but similarly inspired by *My Weird Ruby*, check
out the rotating encryption key made with
[`cycle`](http://ruby-doc.org/core-2.1.0/Enumerable.html#method-i-cycle)
and
[`lazy`](http://ruby-doc.org/core-2.1.0/Enumerable.html#method-i-lazy)
on line #36.
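The same trick works standalone with any string; here's a sketch using a made-up key:

```ruby
# An endlessly repeating, lazily evaluated key
key = "ab12".chars.cycle.lazy

key.first(6)  # => ["a", "b", "1", "2", "a", "b"]
```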
* * *
As a professional Ruby developer with an interest in strongly typed
functional languages, I'm totally psyched to start using Contracts on my
projects. While you don't get the benefits of compile-time checking, you
do get cleaner functions, better implicit documentation, and more
overall confidence about your code.
And even if Contracts or FP aren't your thing, from a broader
perspective, this demonstrates that **experimenting with other
programming paradigms makes you a better programmer in your primary
language.** It was so easy to see the utility and application of
Contracts while reading *My Weird Ruby*, which would not have been the
case had I not spent time with Haskell, OCaml, and Elixir.

---
title: "Get Lazy with Custom Enumerators"
date: 2015-09-28T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/get-lazy-with-custom-enumerators/
---
Ruby 2.0 added the ability to create [custom
enumerators](http://ruby-doc.org/core-2.2.0/Enumerator.html#method-c-new)
and they are
[bad](https://themoviegourmet.files.wordpress.com/2010/07/machete1.jpg)
[ass](https://lifevsfilm.files.wordpress.com/2013/11/grindhouse.jpg). I
tend to group [lazy
evaluation](https://en.wikipedia.org/wiki/Lazy_evaluation) with things
like [pattern matching](https://en.wikipedia.org/wiki/Pattern_matching)
and [currying](https://en.wikipedia.org/wiki/Currying) -- super cool but
not directly applicable to our day-to-day work. I recently had the
chance to use a custom enumerator to clean up some hairy business logic,
though, and I thought I'd share.
**Some background:** our client had originally requested the ability to
select two related places to display at the bottom of a given place
detail page, one of the primary pages in our app. Over time, they found
that content editors were not always diligent about selecting these
related places, often choosing only one or none. They requested that two
related places always display, using the following logic:
1. If the place has published, associated places, use those;
2. Otherwise, if there are nearby places, use those;
3. Otherwise, use the most recently updated places.
Straightforward enough. An early, naïve approach:
def associated_places
[
(associated_place_1 if associated_place_1.try(:published?)),
(associated_place_2 if associated_place_2.try(:published?)),
*nearby_places,
*recently_updated_places
].compact.first(2)
end
But if a place *does* have two associated places, we don't want to
perform the expensive call to `nearby_places`, and similarly, if it has
nearby places, we'd like to avoid calling `recently_updated_places`. We
also don't want to litter the method with conditional logic. This is a
perfect opportunity to build a custom enumerator:
def associated_places
Enumerator.new do |y|
y << associated_place_1 if associated_place_1.try(:published?)
y << associated_place_2 if associated_place_2.try(:published?)
nearby_places.each { |place| y << place }
recently_updated_places.each { |place| y << place }
end
end
`Enumerator.new` takes a block with a "yielder" argument. We call the
yielder's `yield` method[^1^](#fn:1 "see footnote"){#fnref:1 .footnote},
aliased as `<<`, to return the next enumerable value. Now, we can just
say `@place.associated_places.take(2)` and we'll always get back two
places with minimum effort.
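Here's a self-contained sketch of that laziness, with a counter standing in for the expensive lookups:

```ruby
calls = 0
fetch = ->(i) { calls += 1; i * 10 }  # stand-in for an expensive query

places = Enumerator.new do |y|
  5.times { |i| y << fetch.call(i) }
end

places.take(2)  # => [0, 10]; the other three values are never computed
```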
This code ticks all the boxes: fast, clean, and nerdy as hell. If you're
interested in learning more about Ruby's lazy enumerators, I recommend
[*Ruby 2.0 Works Hard So You Can Be
Lazy*](http://patshaughnessy.net/2013/4/3/ruby-2-0-works-hard-so-you-can-be-lazy)
by Pat Shaughnessy and [*Lazy
Refactoring*](https://robots.thoughtbot.com/lazy-refactoring) on the
Thoughtbot blog.
* * *
1. ::: {#fn:1}
Confusing name -- not the same as the `yield` keyword.
[ ↩](#fnref:1 "return to article"){.reversefootnote}
:::

---
title: "Getting into Open Source"
date: 2010-12-01T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/getting-into-open-source/
---
When evaluating a potential developer hire, one of the first things we
look for is a profile on [GitHub](https://github.com), and I'm always
surprised when someone doesn't have one. When asked, the most frequent
response is that people don't know where to begin contributing to open
source. This response might've had some validity in the
[SourceForge](http://sourceforge.net) days, but with the rise of GitHub,
it's become a lot easier to get involved. Here are four easy ways to
get started.
## 1. Documentation {#1_documentation}
There's a lot of great open source code out there that goes unused
simply because people can't figure out how to use it. A great way to get
your foot in the door is to improve documentation, whether by updating
the primary README, including examples in the source code, or simply
fixing typos and grammatical errors.
## 2. Something You Use {#2_something_you_use}
The vast majority of the plugins and gems that you use every day are
one-person operations. It is a bit intimidating to attempt to improve
code that someone else has spent so much time on, but if you see
something wrong, fork the project and fix it. You'll be amazed how easy
it is and how grateful the original authors will be.
## 3. Your Blog {#3_your_blog}
I don't necessarily recommend reinventing the wheel when it comes to
blogging platforms, but if you're looking for something small to code up
using your web framework of choice, writing the software that powers
your personal website is a good option. [The
Setup](http://usesthis.com/), one of my favorite sites, includes a link
to the project source in its footer.
## 4. Any Dumb Crap {#4_any_dumb_crap}
One of my favorite talks from RailsConf a few years back was Nathaniel
Talbott's [23
Hacks](http://en.oreilly.com/rails2008/public/schedule/detail/1980),
which encouraged developers to "enjoy tinkering, puttering, and
generally hacking around." Don't worry that your code isn't perfect and
might never light the world on fire; put it out there and keep improving
it. Simply put, there's almost no code worse than *no code*.

---
title: "Gifts For Your Nerd"
date: 2009-12-16T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/gifts-for-your-nerd/
---
Shopping for a nerd this holiday season? A difficult proposition, to be
sure. We are, after all, complicated creatures. Fortunately, Viget
Extend is here to help. Here are some gifts your nerd is sure to love.
[![](https://www.viget.com/uploads/image/dce_iamakey.jpg){.left} **Lacie
iamaKey Flash
Drive**](https://www.amazon.com/LaCie-iamaKey-Flash-Drive-130870/dp/B001V7XPSA)
**(\$30)**
If your nerd goes to tech conferences with any regularity, your
residence is already littered with these things. USB flash drives are a
dime a dozen, but this one's different: stylish and rugged, and since
it's designed to be carried on a keychain, it'll always be around when your
nerd needs it.
[![](https://www.viget.com/uploads/image/dce_aeropress.jpg){.left}
**AeroPress**](https://www.amazon.com/AeroPress-Coffee-and-Espresso-Maker/dp/B000GXZ2GS)
**(\$25)**
A simple device that makes a cup of espresso better than machines
costing twenty times as much. Buy this one for your nerd and wake up to
delicious, homemade espresso every morning. In other words, it's the
gift that keeps on giving. If espresso gives your nerd the jitters, you
can't go wrong with a [french
press](https://www.amazon.com/Bodum-Chambord-4-Cup-Coffee-Press/dp/B00012D0R2/).
[![](https://www.viget.com/uploads/image/dce_charge_tee.jpg){.left}
**SimpleBits Charge
Tee**](http://shop.simplebits.com/product/charge-tee-tri-blend)
**(\$22)**
Simple, vaguely Mac-ish graphic printed on an American Apparel Tri-Blend
tee, no lie the greatest and best t-shirt ever created.
[![](https://www.viget.com/uploads/image/dce_hard_graft.jpg){.left}
**Hard Graft iPhone
Case**](http://shop.hardgraft.com/product/base-phone-case) **(\$60)**
Your nerd probably already has a case for her iPhone, but it's made of
rubber or plastic. Class it up with this handmade leather-and-wool case.
Doubles as a slim wallet if your nerd is of the minimalist mindset, and
here's a hint: we all are.
[![](https://www.viget.com/uploads/image/dce_ignore.jpg){.left} **Ignore
Everybody**](https://www.amazon.com/Ignore-Everybody-Other-Keys-Creativity/dp/159184259X)
**by Hugh MacLeod (\$16)**
Give your nerd the motivation to finish that web application he's been
talking about for the last two years so you can retire.
[![](https://www.viget.com/uploads/image/dce_moleskine.jpg){.left}
**Moleskine
Notebook**](https://www.amazon.com/Moleskine-Squared-Notebook-Cover-Pocket/dp/8883707125)
**(\$10)**
What nerd doesn't love a new notebook? Just make sure it's graph paper;
unlined paper was not created for mathematical formulae and drawings of
robots. Alternatively, take a look at [Field
Notes](http://fieldnotesbrand.com). As for pens, I highly, *highly*
recommend the [Uni-ball
Signo](http://www.jetpens.com/product_info.php/cPath/239_90/products_id/466).
[![](https://www.viget.com/uploads/image/dce_canon.jpg){.left} **Canon
PowerShot S90**](https://www.amazon.com/dp/B002LITT42/) **(\$400)**
Packs the low-light photographic abilities of your nerd's DSLR into a
compact form factor that fits in his shirt pocket, right next to his
slide rule.
[![](https://www.viget.com/uploads/image/dce_newegg.png){.left} **Newegg
Gift
Card**](https://secure.newegg.com/GiftCertificate/GiftCardStep1.aspx)
If all else fails, a gift card from [Newegg](http://newegg.com) shows
you know your nerd a little better than the usual from Amazon.
[![](https://www.viget.com/uploads/image/dce_moto_guzzi.jpg){.left}
**Moto Guzzi V7
Classic**](http://www.autoblog.com/2009/09/30/review-moto-guzzi-v7-classic-is-an-italian-beauty-you-can-live/)
**(\$8500)**
Actually, this one's probably just me.
If your nerd is a little more design-oriented, check out Viget Inspire
for ideas from [Owen](https://www.viget.com/inspire/the-winter-scrooge/)
and
[Rob](https://www.viget.com/inspire/10-t-shirts-you-want-to-buy-a-designer/).
Got any other gift suggestions for the nerd in your life, or ARE YOU
YOURSELF a nerd? Link it up in the comments.

---
title: "How (& Why) to Run Autotest on your Mac"
date: 2009-06-19T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/how-why-to-run-autotest-on-your-mac/
---
If you aren't using Autotest to develop your Ruby application, you're
missing out on effortless continuous testing. If you'd *like* to be
using Autotest, but can't get it running properly, I'll show you how to
set it up.
Autotest is a fantastic way to do TDD/BDD. Here's a rundown of the
benefits from the [project
homepage](http://www.zenspider.com/ZSS/Products/ZenTest/):
- Improves feedback by running tests continuously.
- Continually runs tests based on files you've changed.
- Get feedback as soon as you save. Keeps you in your editor allowing
you to get stuff done faster.
- Focuses on running previous failures until you've fixed them.
Like any responsible Ruby citizen, Autotest changes radically every
month or so. A few weeks ago, some enterprising developers released
autotest-mac (now
[autotest-fsevent](http://www.bitcetera.com/en/techblog/2009/05/27/mac-friendly-autotest/)),
which monitors code changes via native OS X system events rather than by
polling the hard drive, increasing battery and disk life and improving
performance. Here's how to get Autotest running on your Mac, current as of
this morning:
1. Install autotest:
``` {#code}
gem install ZenTest
```
2. Or, if you've already got an older version installed:
``` {#code}
gem update ZenTest
gem cleanup ZenTest
```
3. Install autotest-rails:
``` {#code}
gem install autotest-rails
```
4. Install autotest-fsevent:
``` {#code}
gem install autotest-fsevent
```
5. Install autotest-growl:
``` {#code}
gem install autotest-growl
```
6. Make a `~/.autotest` file, with the following:
``` {#code}
require "autotest/growl"
require "autotest/fsevent"
```
7. Run `autotest` in your app root.
Autotest is a fundamental part of my development workflow, and well
worth the occasional setup headache; give it a shot and I think you'll
agree. These instructions should be enough to get you up and running,
unless you're reading this more than three weeks after it was published,
in which case all. bets. are. off.

---
title: "HTML Sanitization In Rails That Actually Works"
date: 2009-11-23T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/html-sanitization-in-rails-that-actually-works/
---
Assuming you don't want to simply escape everything, sanitizing user
input is one of the relative weak points of the Rails framework. On
[SpeakerRate](http://speakerrate.com/), where users can use
[Markdown](http://daringfireball.net/projects/markdown/) to format
comments and descriptions, we've run up against some of the limitations
of Rails' built-in sanitization features, so we decided to dig in and
fix it ourselves.
In creating our own sanitizer, our goals were threefold: we want to
**let a subset of HTML in**. As the [Markdown
documentation](http://daringfireball.net/projects/markdown/syntax#html)
clearly states, "for any markup that is not covered by Markdown's
syntax, you simply use HTML itself." In keeping with the Markdown
philosophy, we can't simply strip all HTML from incoming comments, so
the included
[HTML::WhiteListSanitizer](https://github.com/rails/rails/blob/master/actionpack/lib/action_controller/vendor/html-scanner/html/sanitizer.rb#LID60)
is the obvious starting point.
Additionally, we want to **escape, rather than remove, non-approved
tags**, since some commenters want to discuss the merits of, say,
[`<h2 class="h2">`](http://speakerrate.com/talks/1698-object-oriented-css#c797).
Contrary to its documentation, WhiteListSanitizer simply removes all
non-whitelisted tags. Someone opened a
[ticket](https://rails.lighthouseapp.com/projects/8994-ruby-on-rails/tickets/916)
about this issue in August of 2008 with an included patch, but the
ticket was marked as resolved without ever applying it. Probably for the
best, as the patch introduces a new bug.
Finally, we want to **escape unclosed tags even if they belong to the
whitelist**. An unclosed `<strong>` tag can wreak havoc on the rest of a
page, not to mention what a `<div>` can do. Self-closing tags are okay.
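All three goals lean on the same primitive: entity-encoding the opening angle bracket so the browser renders the tag as visible text instead of interpreting it. In isolation:

```ruby
# Escaping, not stripping: the tag survives as visible text
html = '<h2 class="h2">'
escaped = html.gsub(/</, "&lt;")
# => '&lt;h2 class="h2">'
```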
With these requirements in mind, we subclassed HTML::WhiteListSanitizer
and fixed it up. Introducing, then:
![Jason
Statham](http://goremasternews.files.wordpress.com/2009/10/jason_statham.jpg "Jason Statham")
[**HTML::StathamSanitizer**](https://gist.github.com/241114).
User-generated markup, you're on notice: this sanitizer will take its
shirt off and use it to kick your ass. At this point, I've written more
about the code than the code itself, so without further ado:
``` {#code .ruby}
module HTML
  class StathamSanitizer < WhiteListSanitizer
    protected

    def tokenize(text, options)
      super.map do |token|
        if token.is_a?(HTML::Tag) && options[:parent].include?(token.name)
          token.to_s.gsub(/</, "&lt;")
        else
          token
        end
      end
    end

    def process_node(node, result, options)
      result << case node
      when HTML::Tag
        if node.closing == :close && options[:parent].first == node.name
          options[:parent].shift
        elsif node.closing != :self
          options[:parent].unshift node.name
        end

        process_attributes_for node, options

        if options[:tags].include?(node.name)
          node
        else
          bad_tags.include?(node.name) ? nil : node.to_s.gsub(/</, "&lt;")
        end
      else
        bad_tags.include?(options[:parent].first) ? nil : node.to_s.gsub(/</, "&lt;")
      end
    end
  end
end
```
As always, download and fork [at the
'hub](https://gist.github.com/241114).

---
title: "Introducing: EmailLabsClient"
date: 2008-07-31T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/introducing-email-labs-client/
---
On my latest project, the client is using
[EmailLabs](http://www.emaillabs.com/) to manage their mailing lists. To
simplify interaction with their system, we've created
[EmailLabsClient](https://github.com/vigetlabs/email_labs_client/tree/master),
a small Ruby client for the EmailLabs API. The core of the program is
the `send_request` method:
``` {#code .ruby}
def self.send_request(request_type, activity)
  xml = Builder::XmlMarkup.new :target => (input = '')
  xml.instruct!
  xml.DATASET do
    xml.SITE_ID SITE_ID
    yield xml
  end
  Net::HTTP.post_form(URI.parse(ENDPOINT),
    :type => request_type,
    :activity => activity,
    :input => input)
end
```
Then you can make API requests like this:
``` {#code .ruby}
def self.subscribe_user(mailing_list, email_address)
  send_request('record', 'add') do |body|
    body.MLID mailing_list
    body.DATA email_address, :type => 'email'
  end
end
```
If you find yourself needing to work with an EmailLabs mailing list,
check it out. At the very least, you should get a decent idea of how to
interact with their API. It's up on
[GitHub](https://github.com/vigetlabs/email_labs_client/tree/master), so
if you add any functionality, send those patches our way.

---
title: "JSON Feed Is Cool (+ a Simple Tool to Create Your Own)"
date: 2017-08-02T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/json-feed-validator/
---
A few months ago, Manton Reece and Brent Simmons [announced the creation
of JSON Feed](https://jsonfeed.org/2017/05/17/announcing_json_feed), a
new JSON-based syndication format similar to (but so much better than)
[RSS](https://en.wikipedia.org/wiki/RSS) and
[Atom](https://en.wikipedia.org/wiki/Atom_(standard)). One might
reasonably contend that Google killed feed-based content aggregation in
2013 when they end-of-lifed™ Google Reader, but RSS continues to enjoy
[underground
popularity](http://www.makeuseof.com/tag/rss-dead-look-numbers/) and
JSON Feed has the potential to make feed creation and consumption even
more widespread. So why are we^[1](#fn:1 "see footnote"){#fnref:1
.footnote}^ so excited about it?
## JSON \> XML {#jsonxml}
RSS and Atom are both XML-based formats, and as someone who's written
code to both produce and ingest these feeds, it's not how I'd choose to
spend a Saturday. Or even a Tuesday. Elements in XML have both
attributes and children, which is a mismatch for most modern languages'
native data structures. You end up having to use libraries like
[Nokogiri](http://www.nokogiri.org/) to write code like
`item.attributes["name"]` and `item.children[0]`. And producing a feed
usually involves a full-blown templating solution like ERB. Contrast
that with JSON, which maps perfectly to JavaScript objects (-_-), Ruby
hashes/arrays, Elixir maps, etc., etc. Producing a feed becomes a call
to `.to_json`, and consuming one, `JSON.parse`.
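For instance, a minimal hypothetical feed round-trips through Ruby's standard library with no templating at all:

```ruby
require "json"

# The three required top-level fields, plus one plaintext item
feed = {
  "version" => "https://jsonfeed.org/version/1",
  "title"   => "Example Blog",
  "items"   => [
    { "id" => "1", "content_text" => "Hello, world" }
  ]
}

json = JSON.generate(feed)  # producing a feed
JSON.parse(json)["items"].first["content_text"]  # => "Hello, world"
```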
## Flexibility
While still largely focused on content syndication, [the
spec](https://jsonfeed.org/version/1) includes support for plaintext and
title-less posts and custom extensions, meaning its potential uses are
myriad. Imagine a new generation of microblogs, Slack bots, and IoT
devices consuming and/or producing JSON feeds.
## Feeds Are (Still) Cool {#feedsarestillcool}
Not to get too high up on my horse or whatever, but as a longtime web
nerd, I'm dismayed by how much content creation has migrated to walled
gardens like Facebook/Instagram/Twitter/Medium that make it super easy
to get content *in*, but very difficult to get it back *out*. [Twitter
killed RSS in 2012](http://mashable.com/2012/09/05/twitter-api-rss), and
have you ever tried to get a list of your most recent Instagram photos
programmatically? I wouldn't. Owning your own content and sharing it
liberally is what the web was made for, and JSON Feed has the potential
to make it easy and fun to do. [It's how things should be. It's how they
could be.](https://www.youtube.com/watch?v=TgqiSBxvdws)
------------------------------------------------------------------------
## Your Turn
If this sounds at all interesting to you, read the
[announcement](https://jsonfeed.org/2017/05/17/announcing_json_feed) and
the [spec](https://jsonfeed.org/version/1), listen to this [interview
with the
creators](https://daringfireball.net/thetalkshow/2017/05/31/ep-192), and
**try out this [JSON Feed
Validator](https://json-feed-validator.herokuapp.com/) I put up this
week**. You can use the [Daring Fireball
feed](https://daringfireball.net/feeds/json) or create your own. It's
pretty simple right now, running your input against a schema I
downloaded from [JSON Schema Store](http://schemastore.org/json/), but
[suggestions and pull requests are
welcome](https://github.com/vigetlabs/json-feed-validator).
------------------------------------------------------------------------
1. [The royal we, you
know?](https://www.youtube.com/watch?v=VLR_TDO0FTg#t=45s)
[ ↩](#fnref:1 "return to article"){.reversefootnote}

---
title: "Large Images in Rails"
date: 2012-09-18T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/large-images-in-rails/
---
The most visually striking feature on the new
[WWF](http://worldwildlife.org/) site, as well as the source of the
largest technical challenges, is the photography. The client team is
working with gorgeous, high-fidelity photographs loaded with metadata,
and it was up to us to make them work in a web context. Here are a few
things we did to make the site look and perform like a veritable [snow
leopard](http://worldwildlife.org/species/snow-leopard).
## Optimize Images
The average photo uploaded into this system is around five megabytes, so
the first order of business was to find ways to get filesize down. Two
techniques turned out to be very effective:
[jpegtran](http://jpegclub.org/jpegtran/) and
[ImageMagick](http://www.imagemagick.org/script/index.php)'s `quality`
option. We run all photos through a custom
[Paperclip](https://github.com/thoughtbot/paperclip) processor that
calls out to jpegtran to losslessly optimize image compression and strip
out metadata. In some cases, we were seeing thumbnailed images go from
60k to 15k by removing unused color profile data. We save the resulting
images out at 75% quality with the following Paperclip directive:
has_attached_file :image,
:convert_options => { :all => "-quality 75" },
:styles => { # ...
Enabling this option has a huge impact on filesize (about a 90%
reduction) with no visible loss of quality. Be aware that we're working
with giant, unoptimized images; if you're going to be uploading images
that have already been saved out for the web, this level of compression
is probably too aggressive.
## Process in Background
Basic maths: large images × lots of crop styles = long processing time.
As the site grew, the delay after uploading a new photo increased until
it became unacceptable. It was time to implement background processing.
[Resque](https://github.com/defunkt/resque) and
[delayed_paperclip](https://github.com/jstorimer/delayed_paperclip) to
the ... rescue (derp). These two gems make it super simple to process
images outside of the request/response flow with a simple
`process_in_background :image` in your model.
A few notes: as of this writing, delayed_paperclip hasn't been updated
recently. [Here's a fork that
works](https://github.com/tommeier/delayed_paperclip) from tommeier. I
recommend using the
[resque-ensure-connected](https://github.com/socialcast/resque-ensure-connected)
gem if you're going to run Resque in production to keep your
long-running processes from losing their DB connections.
## Server Configuration
You'll want to put [far-future expires
headers](http://developer.yahoo.com/performance/rules.html#expires) on
these photos so that browsers know not to redownload them. If you
control the servers from which they'll be served, you can configure
Apache to send these headers with the following bit of configuration:
ExpiresActive On
ExpiresByType image/png "access plus 1 year"
ExpiresByType image/gif "access plus 1 year"
ExpiresByType image/jpeg "access plus 1 year"
([Similarly, for
nginx](http://www.agileweboperations.com/far-future-expires-headers-for-ruby-on-rails-with-nginx).)
When working with a bunch of large files, though, you're probably better
served by uploading them to S3 or RackSpace Cloud Files and serving them
from there.
------------------------------------------------------------------------
Another option to look at might be
[Dragonfly](https://github.com/markevans/dragonfly), which takes a
different approach to photo processing than does Paperclip, resizing on
the fly rather than on upload. This might obviate the need for Resque
but at unknown (by me) cost. We hope that some of this will be helpful
in your next photo-intensive project.

---
title: "Let's Make a Hash Chain in SQLite"
date: 2021-06-30T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/lets-make-a-hash-chain-in-sqlite/
---
I'm not much of a cryptocurrency enthusiast, but there are some neat
ideas in these protocols that I wanted to explore further. Based on my
absolute layperson's understanding, the "crypto" in
"cryptocurrency" describes three things:
1. Some public key/private key stuff to grant access to funds at an
address;
2. For certain protocols (e.g. Bitcoin), the cryptographic
puzzles[^1^](#fn:1 "see footnote"){#fnref:1 .footnote} that miners
have to solve in order to add new blocks to the ledger; and
3. The use of hashed signatures to ensure data integrity.
Of those three uses, the first two (asymmetric cryptography and
proof-of-work) aren't that interesting to me, at least from a technical
perspective. The third concept, though --- using cryptography to make
data verifiable and tamper-resistant --- that's pretty cool, and
something I wanted to dig into. I decided to build a little
proof-of-concept using [SQLite](https://www.sqlite.org/index.html), a
"small, fast, self-contained, high-reliability, full-featured, SQL
database engine."
A couple notes before we dive in: these concepts aren't unique to the
blockchain; Wikipedia has good explanations of [cryptographic hash
functions](https://en.wikipedia.org/wiki/Cryptographic_hash_function),
[Merkle trees](https://en.wikipedia.org/wiki/Merkle_tree), and [hash
chains](https://en.wikipedia.org/wiki/Hash_chain) if any of this piques
your curiosity. This stuff is also [at the core of
git](https://initialcommit.com/blog/git-bitcoin-merkle-tree), which is
really pretty neat.
## Onto the code
Implementing a rudimentary hash chain in SQL is pretty simple. Here's
my approach, which uses "bookmarks" as an arbitrary record type.
```sql
PRAGMA foreign_keys = ON;
SELECT load_extension("./sha1");
CREATE TABLE bookmarks (
id INTEGER PRIMARY KEY,
signature TEXT NOT NULL UNIQUE
CHECK(signature = sha1(url || COALESCE(parent, ""))),
parent TEXT,
url TEXT NOT NULL UNIQUE,
FOREIGN KEY(parent) REFERENCES bookmarks(signature)
);
CREATE UNIQUE INDEX parent_unique ON bookmarks (
ifnull(parent, "")
);
```
This code is available on
[GitHub](https://github.com/dce/sqlite-hash-chain) in case you want to
try this out on your own. Let's break it down a little bit.
- First, we enable foreign key constraints, which aren't on by
  default
- Then we pull in SQLite's [`sha1`
  function](https://www.i-programmer.info/news/84-database/10527-sqlite-317-adds-sha1-extension.html),
  which implements a common hashing algorithm
- Then we define our table
    - `id` isn't mandatory but makes it easier to grab the last entry
    - `signature` is the SHA1 hash of the bookmark URL and parent
      entry's signature; a `CHECK` constraint ensures this is always
      true
    - `parent` is the `signature` of the previous entry in the chain
      (notice that it's allowed to be null)
    - `url` is the data we want to ensure is immutable (though as
      we'll see later, it's not truly immutable since we can still
      do cascading updates)
    - We set a foreign key constraint that `parent` refers to another
      row's `signature` unless it's null
- Then we create a unique index on `parent` that covers the `NULL`
  case, since our very first bookmark won't have a parent, but no
  other row should be allowed to have a null parent, and no two rows
  should be able to have the same parent
Next, let's insert some data:
```sql
INSERT INTO bookmarks (url, signature) VALUES ("google", sha1("google"));
WITH parent AS (SELECT signature FROM bookmarks ORDER BY id DESC LIMIT 1)
INSERT INTO bookmarks (url, parent, signature) VALUES (
"yahoo", (SELECT signature FROM parent), sha1("yahoo" || (SELECT signature FROM parent))
);
WITH parent AS (SELECT signature FROM bookmarks ORDER BY id DESC LIMIT 1)
INSERT INTO bookmarks (url, parent, signature) VALUES (
"bing", (SELECT signature FROM parent), sha1("bing" || (SELECT signature FROM parent))
);
WITH parent AS (SELECT signature FROM bookmarks ORDER BY id DESC LIMIT 1)
INSERT INTO bookmarks (url, parent, signature) VALUES (
"duckduckgo", (SELECT signature FROM parent), sha1("duckduckgo" || (SELECT signature FROM parent))
);
```
OK! Let's fire up `sqlite3` and then `.read` this file. Here's the
result:
```
sqlite> SELECT * FROM bookmarks;
+----+------------------------------------------+------------------------------------------+------------+
| id | signature | parent | url |
+----+------------------------------------------+------------------------------------------+------------+
| 1 | 759730a97e4373f3a0ee12805db065e3a4a649a5 | | google |
| 2 | 64633167b8e44cb833fbfa349731d8a68e942ebc | 759730a97e4373f3a0ee12805db065e3a4a649a5 | yahoo |
| 3 | ce3df1337879e85bc488d4cae129719cc46cad04 | 64633167b8e44cb833fbfa349731d8a68e942ebc | bing |
| 4 | 675570ac126d492e449ebaede091e2b7dad7d515 | ce3df1337879e85bc488d4cae129719cc46cad04 | duckduckgo |
+----+------------------------------------------+------------------------------------------+------------+
```
This has some cool properties. I can't delete an entry in the chain:
`sqlite> DELETE FROM bookmarks WHERE id = 3;`
`Error: FOREIGN KEY constraint failed`
I can't change a URL:
`sqlite> UPDATE bookmarks SET url = "altavista" WHERE id = 3;`
`Error: CHECK constraint failed: signature = sha1(url || parent)`
I can't re-sign an entry:
`sqlite> UPDATE bookmarks SET url = "altavista", signature = sha1("altavista" || parent) WHERE id = 3;`
`Error: FOREIGN KEY constraint failed`
I **can**, however, update the last entry in the chain:
```
sqlite> UPDATE bookmarks SET url = "altavista", signature = sha1("altavista" || parent) WHERE id = 4;
sqlite> SELECT * FROM bookmarks;
+----+------------------------------------------+------------------------------------------+-----------+
| id | signature | parent | url |
+----+------------------------------------------+------------------------------------------+-----------+
| 1 | 759730a97e4373f3a0ee12805db065e3a4a649a5 | | google |
| 2 | 64633167b8e44cb833fbfa349731d8a68e942ebc | 759730a97e4373f3a0ee12805db065e3a4a649a5 | yahoo |
| 3 | ce3df1337879e85bc488d4cae129719cc46cad04 | 64633167b8e44cb833fbfa349731d8a68e942ebc | bing |
| 4 | b583a025b5a43727978c169fe99f5422039194ea | ce3df1337879e85bc488d4cae129719cc46cad04 | altavista |
+----+------------------------------------------+------------------------------------------+-----------+
```
This is because a row isn't really "locked in" until it's pointed to
by another row. It's worth pointing out that an actual blockchain would
use a [consensus
mechanism](https://www.investopedia.com/terms/c/consensus-mechanism-cryptocurrency.asp)
to prevent any updates like this, but that's way beyond the scope of
what we're doing here.
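To make the tamper-resistance concrete, here's a minimal sketch of the same chain logic in plain Python (using `hashlib` rather than the SQLite extension; the helper names are my own, not part of the SQL above):

```python
import hashlib

def sha1(s):
    """Hex SHA1 digest, mirroring SQLite's sha1() extension function."""
    return hashlib.sha1(s.encode()).hexdigest()

def build_chain(urls):
    """Return (url, parent, signature) rows, each signed over its parent."""
    rows, prev = [], None
    for url in urls:
        sig = sha1(url + (prev or ""))
        rows.append((url, prev, sig))
        prev = sig
    return rows

def verify(rows):
    """Re-derive every signature; editing any earlier row breaks the links."""
    prev = None
    for url, parent, sig in rows:
        if parent != prev or sig != sha1(url + (parent or "")):
            return False
        prev = sig
    return True

chain = build_chain(["google", "yahoo", "bing", "duckduckgo"])
print(verify(chain))  # True

# Swap a URL without re-signing, like the UPDATE that SQLite rejected:
tampered = list(chain)
tampered[2] = ("altavista", tampered[2][1], tampered[2][2])
print(verify(tampered))  # False
```

The `CHECK` and foreign key constraints are doing the same re-derivation, just at write time instead of read time.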
## Cascading updates
Given that we can change the last row, it's possible to update any row
in the ledger provided you 1) also re-sign all of its children and 2) do
it all in a single pass. Here's how you'd update row 2 to
"askjeeves" with a [`RECURSIVE`
query](https://www.sqlite.org/lang_with.html#recursive_common_table_expressions)
(and sorry, I know this is a little hairy):
```sql
WITH RECURSIVE
t1(url, parent, old_signature, signature) AS (
SELECT "askjeeves", parent, signature, sha1("askjeeves" || COALESCE(parent, ""))
FROM bookmarks WHERE id = 2
UNION
SELECT t2.url, t1.signature, t2.signature, sha1(t2.url || t1.signature)
FROM bookmarks AS t2, t1 WHERE t2.parent = t1.old_signature
)
UPDATE bookmarks
SET url = (SELECT url FROM t1 WHERE t1.old_signature = bookmarks.signature),
parent = (SELECT parent FROM t1 WHERE t1.old_signature = bookmarks.signature),
signature = (SELECT signature FROM t1 WHERE t1.old_signature = bookmarks.signature)
WHERE signature IN (SELECT old_signature FROM t1);
```
Here's the result of running this update:
```
+----+------------------------------------------+------------------------------------------+-----------+
| id | signature | parent | url |
+----+------------------------------------------+------------------------------------------+-----------+
| 1 | 759730a97e4373f3a0ee12805db065e3a4a649a5 | | google |
| 2 | de357e976171e528088843dfa35c1097017b1009 | 759730a97e4373f3a0ee12805db065e3a4a649a5 | askjeeves |
| 3 | 1b69dff11f3e8ffeade0f42521f9e1bd1bd78539 | de357e976171e528088843dfa35c1097017b1009 | bing |
| 4 | 924660e4f25e2ac8c38ca25bae201ad3a5b6e545 | 1b69dff11f3e8ffeade0f42521f9e1bd1bd78539 | altavista |
+----+------------------------------------------+------------------------------------------+-----------+
```
As you can see, row 2's `url` is updated, and rows 3 and 4 have updated
signatures and parents. Pretty cool, and pretty much the same thing as
what happens when you change a git commit via `rebase` --- all the
successive commits get new SHAs.
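The recursive query is doing the same thing as this little Python sketch (again, hypothetical helper names): walk from the edited row to the end of the chain, re-deriving each signature from the one before it.

```python
import hashlib

def sha1(s):
    return hashlib.sha1(s.encode()).hexdigest()

def resign(rows, index, new_url):
    """Change rows[index]'s url, then re-sign it and every later row.

    rows: list of [url, parent, signature] entries ordered by id.
    Returns a new list; the input is left untouched."""
    rows = [list(r) for r in rows]  # work on a copy
    rows[index][0] = new_url
    for i in range(index, len(rows)):
        parent = rows[i - 1][2] if i > 0 else None
        rows[i][1] = parent
        rows[i][2] = sha1(rows[i][0] + (parent or ""))
    return rows

chain = [["google", None, sha1("google")]]
chain.append(["yahoo", chain[0][2], sha1("yahoo" + chain[0][2])])
chain.append(["bing", chain[1][2], sha1("bing" + chain[1][2])])
updated = resign(chain, 1, "askjeeves")  # rows 1 and 2 both get new signatures
```

Like the SQL version, every row after the edit gets a new signature --- which is exactly what git does to descendant commits during a rebase.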
I'll be honest that I don't have any immediately practical uses for a
cryptographically-signed database table, but I thought it was cool and
helped me understand these concepts a little bit better. Hopefully it
gets your mental wheels spinning a little bit, too. Thanks for reading!
[^1]: [Here's a pretty good explanation of what mining really
    is](https://asthasr.github.io/posts/how-blockchains-work/), but, in
    a nutshell, it's running a hashing algorithm over and over again
    with a random salt until a hash is found that begins with a required
    number of zeroes.

---
title: "Let's Write a Dang ElasticSearch Plugin"
date: 2021-03-15T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/lets-write-a-dang-elasticsearch-plugin/
---
One of our current projects involves a complex interactive query builder
to search a large collection of news items. Some of the conditionals
fall outside of the sweet spot of Postgres (e.g. word X must appear
within Y words of word Z), and so we opted to pull in
[ElasticSearch](https://www.elastic.co/elasticsearch/) alongside it.
It's worked perfectly, hitting all of our condition and grouping needs
with one exception: we need to be able to filter for articles that
contain a term a minimum number of times (so "Apple" must appear in
the article 3 times, for example). Frustratingly, Elastic *totally* has
this information via its
[`term_vector`](https://www.elastic.co/guide/en/elasticsearch/reference/current/term-vector.html)
feature, but you can't use that data inside a query, at least as far as
I can tell.
The solution, it seems, is to write a custom plugin. I figured it out,
eventually, but it was a lot of trial-and-error, as the documentation I
was able to find is largely outdated or incomplete. So I figured I'd
write down what I learned while it's still fresh in my mind, in the
hopes that someone else might have an easier time of it. That's what
internet friends are for, after all.
Quick note before we start: all the version numbers you see are current
and working as of February 25, 2021. Hopefully this post ages well, but
if you try this out and hit issues, bumping the versions of Elastic,
Gradle, and maybe even Java is probably a good place to start. Also, I
use `projectname` a lot in the code examples --- that's not a special
word and you should change it to something that makes sense for you.
## 1. Set up a Java development environment
First off, you're gonna be writing some Java. That's not my usual
thing, so the first step was to get a working environment to compile my
code. To do that, we'll use [Docker](https://www.docker.com/). Here's
a `Dockerfile`:
```dockerfile
FROM adoptopenjdk/openjdk12:jdk-12.0.2_10-ubuntu
RUN apt-get update && \
    apt-get install -y zip unzip && \
    rm -rf /var/lib/apt/lists/*
SHELL ["/bin/bash", "-c"]
RUN curl -s "https://get.sdkman.io" | bash && \
    source "/root/.sdkman/bin/sdkman-init.sh" && \
    sdk install gradle 6.8.2
WORKDIR /plugin
```
We use a base image with all the Java stuff but also a working Ubuntu
install so that we can do normal Linux-y things inside our container.
From your terminal, build the image:
`> docker build . -t projectname-java`
Then, spin up the container and start an interactive shell, mounting
your local working directory into `/plugin`:
`> docker run --rm -it -v ${PWD}:/plugin projectname-java bash`
## 2. Configure Gradle
[Gradle](https://gradle.org/) is a "build automation tool for
multi-language software development," and what Elastic recommends for
plugin development. Configuring Gradle to build the plugin properly was
the hardest part of this whole endeavor. Throw this into `build.gradle`
in your project root:
```groovy
buildscript {
repositories {
mavenLocal()
mavenCentral()
jcenter()
}
dependencies {
classpath "org.elasticsearch.gradle:build-tools:7.11.1"
}
}
apply plugin: 'java'
compileJava {
sourceCompatibility = JavaVersion.VERSION_12
targetCompatibility = JavaVersion.VERSION_12
}
apply plugin: 'elasticsearch.esplugin'
group = "com.projectname"
version = "0.0.1"
esplugin {
name 'contains-multiple'
description 'A script for finding documents that match a term a certain number of times'
classname 'com.projectname.containsmultiple.ContainsMultiplePlugin'
licenseFile rootProject.file('LICENSE.txt')
noticeFile rootProject.file('NOTICE.txt')
}
validateNebulaPom.enabled = false
```
You'll also need files named `LICENSE.txt` and `NOTICE.txt` --- mine
are empty, since the plugin is for internal use only. If you're going
to be releasing your plugin in some public way, maybe talk to a lawyer
about what to put in those files.
## 3. Write the dang plugin
To write the actual plugin, I started with [this example
plugin](https://github.com/elastic/elasticsearch/blob/master/plugins/examples/script-expert-scoring/src/main/java/org/elasticsearch/example/expertscript/ExpertScriptPlugin.java)
which scores a document based on the frequency of a given term. My use
case was fortunately quite similar, though I'm using a `filter` query,
meaning I just want a boolean, i.e. does this document contain this term
the requisite number of times? As such, I implemented a
[`FilterScript`](https://www.javadoc.io/doc/org.elasticsearch/elasticsearch/latest/org/elasticsearch/script/FilterScript.html)
rather than the `ScoreScript` implemented in the example code.
This file lives in (deep breath)
`src/main/java/com/projectname/containsmultiple/ContainsMultiplePlugin.java`:
```java
package com.projectname.containsmultiple;
import org.apache.lucene.index.LeafReaderContext;
import org.apache.lucene.index.PostingsEnum;
import org.apache.lucene.index.Term;
import org.elasticsearch.common.settings.Settings;
import org.elasticsearch.plugins.Plugin;
import org.elasticsearch.plugins.ScriptPlugin;
import org.elasticsearch.script.FilterScript;
import org.elasticsearch.script.FilterScript.LeafFactory;
import org.elasticsearch.script.ScriptContext;
import org.elasticsearch.script.ScriptEngine;
import org.elasticsearch.script.ScriptFactory;
import org.elasticsearch.search.lookup.SearchLookup;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Collection;
import java.util.Map;
import java.util.Set;
/**
* A script for finding documents that match a term a certain number of times
*/
public class ContainsMultiplePlugin extends Plugin implements ScriptPlugin {
@Override
public ScriptEngine getScriptEngine(
Settings settings,
Collection<ScriptContext<?>> contexts
) {
return new ContainsMultipleEngine();
}
// tag::contains_multiple
private static class ContainsMultipleEngine implements ScriptEngine {
@Override
public String getType() {
return "expert_scripts";
}
@Override
public <T> T compile(
String scriptName,
String scriptSource,
ScriptContext<T> context,
Map<String, String> params
) {
if (context.equals(FilterScript.CONTEXT) == false) {
throw new IllegalArgumentException(getType()
+ " scripts cannot be used for context ["
+ context.name + "]");
}
// we use the script "source" as the script identifier
if ("contains_multiple".equals(scriptSource)) {
FilterScript.Factory factory = new ContainsMultipleFactory();
return context.factoryClazz.cast(factory);
}
throw new IllegalArgumentException("Unknown script name "
+ scriptSource);
}
@Override
public void close() {
// optionally close resources
}
@Override
public Set<ScriptContext<?>> getSupportedContexts() {
return Set.of(FilterScript.CONTEXT);
}
private static class ContainsMultipleFactory implements FilterScript.Factory,
ScriptFactory {
@Override
public boolean isResultDeterministic() {
return true;
}
@Override
public LeafFactory newFactory(
Map<String, Object> params,
SearchLookup lookup
) {
return new ContainsMultipleLeafFactory(params, lookup);
}
}
private static class ContainsMultipleLeafFactory implements LeafFactory {
private final Map<String, Object> params;
private final SearchLookup lookup;
private final String field;
private final String term;
private final int count;
private ContainsMultipleLeafFactory(
Map<String, Object> params, SearchLookup lookup) {
if (params.containsKey("field") == false) {
throw new IllegalArgumentException(
"Missing parameter [field]");
}
if (params.containsKey("term") == false) {
throw new IllegalArgumentException(
"Missing parameter [term]");
}
if (params.containsKey("count") == false) {
throw new IllegalArgumentException(
"Missing parameter [count]");
}
this.params = params;
this.lookup = lookup;
field = params.get("field").toString();
term = params.get("term").toString();
count = Integer.parseInt(params.get("count").toString());
}
@Override
public FilterScript newInstance(LeafReaderContext context)
throws IOException {
PostingsEnum postings = context.reader().postings(
new Term(field, term));
if (postings == null) {
/*
* the field and/or term don't exist in this segment,
* so always return 0
*/
return new FilterScript(params, lookup, context) {
@Override
public boolean execute() {
return false;
}
};
}
return new FilterScript(params, lookup, context) {
int currentDocid = -1;
@Override
public void setDocument(int docid) {
/*
* advance has undefined behavior calling with
* a docid <= its current docid
*/
if (postings.docID() < docid) {
try {
postings.advance(docid);
} catch (IOException e) {
throw new UncheckedIOException(e);
}
}
currentDocid = docid;
}
@Override
public boolean execute() {
if (postings.docID() != currentDocid) {
/*
* advance moved past the current doc, so this
* doc has no occurrences of the term
*/
return false;
}
try {
return postings.freq() >= count;
} catch (IOException e) {
throw new UncheckedIOException(e);
}
}
};
}
}
}
// end::contains_multiple
}
```
## 4. Add it to ElasticSearch
With our code in place (and synced into our Docker container with a
mounted volume), it's time to compile it. In the Docker shell you
started up in step 1, build your plugin:
`> gradle build`
Assuming that works, you should now see a `build` directory with a bunch
of stuff in it. The file you care about is
`build/distributions/contains-multiple-0.0.1.zip` (though that'll
obviously change if you call your plugin something different or give it
a different version number). Grab that file and copy it to where you
plan to actually run ElasticSearch. I placed it in a folder
called `.docker/elastic` in the main project repo. In that same
directory, create a new `Dockerfile` that'll actually run Elastic:
```dockerfile
FROM docker.elastic.co/elasticsearch/elasticsearch:7.11.1
COPY .docker/elastic/contains-multiple-0.0.1.zip /plugins/contains-multiple-0.0.1.zip
RUN elasticsearch-plugin install file:///plugins/contains-multiple-0.0.1.zip
```
Then, in your project root, create the following `docker-compose.yml`:
```yaml
version: '3.2'

services:
  elasticsearch:
    image: projectname_elasticsearch
    build:
      context: .
      dockerfile: ./.docker/elastic/Dockerfile
    ports:
      - 9200:9200
    environment:
      - discovery.type=single-node
      - script.allowed_types=inline
      - script.allowed_contexts=filter
```
Those last couple lines are pretty important; your script won't work
without them. Build your image with `docker-compose build` and then
start Elastic with `docker-compose up`.
## 5. Use your plugin
To actually see the plugin in action, first create an index and add some
documents (I'll assume you're able to do this if you've read this far
into this post). Then, make a query with `curl` (or your Elastic wrapper
of choice), substituting `full_text`, `yabba`, and `index_name` with
whatever makes sense for you:
```shell
> curl -H "content-type: application/json" \
  -d '
{
"query": {
"bool": {
"filter": {
"script": {
"script": {
"source": "contains_multiple",
"lang": "expert_scripts",
"params": {
"field": "full_text",
"term": "yabba",
"count": 3
}
}
}
}
}
}
}' \
"localhost:9200/index_name/_search?pretty"
```
The result should be something like:
```
{
"took" : 6,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 0.0,
"hits" : [
{
"_index" : "index_name",
"_type" : "_doc",
"_id" : "10",
...
```
So that's that: an ElasticSearch plugin from start to finish. I'm sure
there are better ways to do some of this stuff, and if you're aware of
any, let us know in the comments or write your own dang blog.

---
title: "Level Up Your Shell Game"
date: 2013-10-24T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/level-up-your-shell-game/
---
The Viget dev team was recently relaxing by the fireplace, sipping a
fine cognac out of those fancy little glasses, when the conversation
turned (as it often does) to the Unix command line. We have good systems
in place for sharing Ruby techniques ([pull request code
reviews](https://viget.com/extend/developer-ramp-up-with-pull-requests))
and [Git tips](https://viget.com/extend/a-gaggle-of-git-tips), but
everyone seemed to have a simple, useful command-line trick or two that
the rest of the team had never encountered. Here are a few of our
favorites:
- [Keyboard
Shortcuts](https://viget.com/extend/level-up-your-shell-game#keyboard-shortcuts)
- [Aliases](https://viget.com/extend/level-up-your-shell-game#aliases)
- [History
Expansions](https://viget.com/extend/level-up-your-shell-game#history-expansions)
- [Argument
Expansion](https://viget.com/extend/level-up-your-shell-game#argument-expansion)
- [Customizing
`.inputrc`](https://viget.com/extend/level-up-your-shell-game#customizing-inputrc)
- [Viewing Processes on a Given Port with
`lsof`](https://viget.com/extend/level-up-your-shell-game#viewing-processes-on-a-given-port-with-lsof)
- [SSH
Configuration](https://viget.com/extend/level-up-your-shell-game#ssh-configuration)
- [Invoking Remote Commands with
SSH](https://viget.com/extend/level-up-your-shell-game#invoking-remote-commands-with-ssh)
Ready to get your :neckbeard: on? Good. Let's go.
## Keyboard Shortcuts
[**Mike:**](https://viget.com/about/team/mackerman) I recently
discovered a few simple Unix keyboard shortcuts that save me some time:
| Shortcut | Result |
| --- | --- |
| `ctrl + u` | Deletes the portion of your command **before** the current cursor position |
| `ctrl + w` | Deletes the **word** preceding the current cursor position |
| `ctrl + left arrow` | Moves the cursor to the **left by one word** |
| `ctrl + right arrow` | Moves the cursor to the **right by one word** |
| `ctrl + a` | Moves the cursor to the **beginning** of your command |
| `ctrl + e` | Moves the cursor to the **end** of your command |
Thanks to [Lawson Kurtz](https://viget.com/about/team/lkurtz) for
pointing out the beginning and end shortcuts.
## Aliases
[**Eli:**](https://viget.com/about/team/efatsi) Sick of typing
`bundle exec rake db:test:prepare` or other long, exhausting lines of
terminal commands? Me too. Aliases can be a big help in alleviating the
pain of typing common commands over and over again.
They can be easily created in your `~/.bash_profile` file, and have the
following syntax:
alias gb="git branch"
I've got a whole slew of git and Rails related ones that are fairly
straightforward:
alias ga="git add .; git add -u ."
alias glo='git log --pretty=format:"%h%x09%an%x09%s"'
alias gpro="git pull --rebase origin"
...
alias rs="rails server"
And a few others I find useful:
alias editcommit="git commit --amend -m"
alias pro="cd ~/Desktop/Projects/"
alias s.="subl ."
alias psgrep="ps aux | grep"
alias cov='/usr/bin/open -a "/Applications/Google Chrome.app" coverage/index.html'
If you ever notice yourself typing these things out over and over, pop
into your `.bash_profile` and whip up some of your own! If
`~/.bash_profile` is hard for you to remember like it is for me, nothing
an alias can't fix: `alias editbash="open ~/.bash_profile"`.
**Note**: you'll need to open a new Terminal window for changes in
`~/.bash_profile` to take effect.
## History Expansions
[**Chris:**](https://viget.com/about/team/cjones) Here are some of my
favorite tricks for working with your history.
**`!!` - previous command**
How many times have you run a command and then immediately re-run it
with `sudo`? The answer is all the time. You could use the up arrow and
then [Mike](https://viget.com/about/team/mackerman)'s `ctrl-a` shortcut
to insert at the beginning of the line. But there's a better way: `!!`
expands to the entire previous command. Observe:
$ rm path/to/thing
Permission denied
$ sudo !!
sudo rm path/to/thing
**`!$` - last argument of the previous command**
How many times have you run a command and then run a different command
with the same argument? The answer is all the time. Don't retype it, use
`!$`:
$ mkdir path/to/thing
$ cd !$
cd path/to/thing
**`!<string>` - most recent command starting with**
Here's a quick shortcut for running the most recent command that *starts
with* the provided string:
$ rake db:migrate:reset db:seed
$ rails s
$ !rake # re-runs that first command
**`!<number>` - numbered command**
All of your commands are stored in `~/.bash_history`, which you can view
with the `history` command. Each entry has a number, and you can use
`!<number>` to run that specific command. Try it with `grep` to filter
for specific commands:
$ history | grep heroku
492 heroku run rake search:reindex -r production
495 heroku maintenance:off -r production
496 heroku run rails c -r production
$ !495
This technique is perfect for an alias:
$ alias h?="history | grep"
$ h? heroku
492 heroku run rake search:reindex -r production
495 heroku maintenance:off -r production
496 heroku run rails c -r production
$ !495
Sweet.
## Argument Expansion
[**Ryan:**](https://viget.com/about/team/rfoster) For commands that take
multiple, similar arguments, you can use `{old,new}` to expand one
argument into two or more. For example:
mv app/models/foo.rb app/models/foobar.rb
can be
mv app/models/{foo,foobar}.rb
or even
mv app/models/foo{,bar}.rb
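If you're ever unsure what a pattern will turn into, you can preview the expansion with `echo` before running the real command (this works in bash and zsh; the paths here are just examples):

```shell
# Preview the brace expansion before running the destructive command
echo mv app/models/foo{,bar}.rb
# mv app/models/foo.rb app/models/foobar.rb
```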
## Customizing .inputrc
[**Brian:**](https://viget.com/about/team/blandau) One of the things I
have found to be a big time saver when using my terminal is configuring
keyboard shortcuts. Luckily if you're still using bash (which I am), you
can configure shortcuts and use them in a number of other REPLs that all
use readline. You can [configure readline keyboard shortcuts by editing
your `~/.inputrc`
file](http://cnswww.cns.cwru.edu/php/chet/readline/readline.html#SEC9).
Each line in the file defines a shortcut. It's made up of two parts, the
key sequence, and the command or macro. Here are three of my favorites:
1. `"\ep": history-search-backward`: This will map to escape-p and will
allow you to search for completions to the current line from your
history. For instance, it will allow you to type "`git`" into your
shell and then hit escape-p to cycle through all the git commands
you have used recently looking for the correct completion.
2. `"\t": menu-complete`: I always hated that when I tried to tab
complete something and then I'd get a giant list of possible
completions. By adding this line you can instead use tab to cycle
through all the possible completions stopping on which ever one is
the correct one.
3. `"\C-d": kill-whole-line`: There's a built-in key command for
killing a line after the cursor (control-k), but no way to kill the
whole line. This solves that. After adding this to your `.inputrc`
just type control-d from anywhere on the line and the whole line is
gone and you're ready to start fresh.
Don't like what I mapped these commands to? Feel free to use different
keyboard shortcuts by changing that first part in quotes. There's a lot
more you can do, just check out [all the commands you can
assign](http://cnswww.cns.cwru.edu/php/chet/readline/readline.html#SEC13)
or create your own macros.
## Viewing Processes on a Given Port with lsof
[**Zachary:**](https://viget.com/about/team/zporter) When working on
projects, I occasionally need to run the application on port 80. While
I could use a tool like [Pow](http://pow.cx/) to accomplish this, I
choose to use [Passenger
Standalone](http://www.modrails.com/documentation/Users%20guide%20Standalone.html).
However, when trying to start Passenger on port 80, I will get a
response that looks something like "The address 0.0.0.0:80 is already in
use by another process". To easily view all processes communicating over
port 80, I use [`lsof`](http://linux.die.net/man/8/lsof) like so:
sudo lsof -i :80
From here, I can pinpoint who the culprit is and kill it.
## SSH Configuration
[**Patrick:**](https://viget.com/about/team/preagan) SSH is a simple
tool to use when you need shell access to a remote server. Everyone is
familiar with the most basic usage:
$ ssh production.host
Command-line options give you control over more options such as the user
and private key file that you use to authenticate:
$ ssh -l www-data -i /Users/preagan/.ssh/viget production.host
However, managing these options on the command line is tedious if you
use different private keys for work-related and personal servers. This
is where your local `.ssh/config` file can help -- by specifying the
host that you connect to, you can set specific options for that
connection:
# ~/.ssh/config
Host production.host
User www-data
IdentityFile /Users/preagan/.ssh/viget
Now, simply running `ssh production.host` will use the correct username
and private key when authenticating. Additionally, services that use SSH
as the underlying transport mechanism will honor these settings -- you
can use this with GitHub to send an alternate private key just as
easily:
Host github.com
IdentityFile /Users/preagan/.ssh/github
**Bonus Tip**
This isn't limited to setting host-specific options; you can also
use this configuration file to create quick aliases for hosts that
aren't addressable by DNS:
Host prod
Hostname 192.168.1.1
Port 6000
User www-data
IdentityFile /Users/preagan/.ssh/production-key
All you need to do is run `ssh prod` and you're good to go. For more
information on what settings are available, check out the manual
([`man ssh_config`](http://linux.die.net/man/5/ssh_config)).
## Invoking Remote Commands with SSH
[**David**:](https://viget.com/about/team/deisinger) You're already
using SSH to launch interactive sessions on your remote servers, but DID
YOU KNOW you can also pass the commands you want to run to the `ssh`
program and use the output just like you would a local operation? For
example, if you want to pull down a production database dump, you could:
1. `ssh` into your production server
2. Run `mysqldump` to generate the data dump
3. Run `gzip` to create a compressed file
4. Log out
5. Use `scp` to grab the file off the remote server
Or! You could use this here one-liner:
ssh user@host.com "mysqldump -u db_user -h db_host -pdb_password db_name | gzip" > production.sql.gz
Rather than starting an interactive shell, you're logging in, running
the `mysqldump` command, piping the result into `gzip`, and then taking
the result and writing it to a local file. From there, you could chain
on decompressing the file, importing it into your local database, etc.
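Taking the idea one step further, you could skip the intermediate file entirely and stream the dump straight into a local database (a sketch reusing the placeholder credentials from above; the local database name is hypothetical):

```shell
# Decompress on the fly and pipe directly into a local MySQL database
ssh user@host.com "mysqldump -u db_user -h db_host -pdb_password db_name | gzip" \
  | gunzip \
  | mysql -u root db_name_development
```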
**Bonus tip:** store long commands like this in
[boom](https://github.com/holman/boom) for easy recall.
------------------------------------------------------------------------
Well, that's all we've got for you. Hope you picked up something useful
along the way. What are your go-to command line tricks? Let us know in
the comments.

---
title: "Local Docker Best Practices"
date: 2022-05-05T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/local-docker-best-practices/
---
Here at Viget, Docker has become an indispensable tool for local
development. We build and maintain a ton of apps across the team,
running different stacks and versions, and being able to package up a
working dev environment makes it much, much easier to switch between
apps and ramp up new devs onto projects. That's not to say that
developing with Docker locally is without its drawbacks[^1], but
they're massively outweighed by the ease and convenience it unlocks.
Over time, we've developed our own set of best practices for effectively
setting Docker up for local development. Please note that last bit ("for
local development") -- if you're creating images for deployment
purposes, most of these principles don't apply. Our typical setup
involves the following containers, orchestrated with Docker Compose:
1. The application (e.g. Rails, Django, or Phoenix)
2. A JavaScript watcher/compiler (e.g. `webpack-dev-server`)
3. A database (typically PostgreSQL)
4. Additional necessary infrastructure (e.g. Redis, ElasticSearch,
Mailhog)
5. Occasionally, additional instances of the app doing things other
than running the development server (think background jobs)
So with that architecture in mind, here are the best practices we've
tried to standardize on:
1. [Don't put code or app-level dependencies into the
    image](#1-dont-put-code-or-app-level-dependencies-into-the-image)
2. [Don't use a Dockerfile if you don't have
    to](#2-dont-use-a-dockerfile-if-you-dont-have-to)
3. [Only reference a Dockerfile once in
`docker-compose.yml`](#3-only-reference-a-dockerfile-once-in-docker-compose-yml)
4. [Cache dependencies in named
volumes](#4-cache-dependencies-in-named-volumes)
5. [Put ephemeral stuff in named
volumes](#5-put-ephemeral-stuff-in-named-volumes)
6. [Clean up after `apt-get update`](#6-clean-up-after-apt-get-update)
7. [Prefer `exec` to `run`](#7-prefer-exec-to-run)
8. [Coordinate services with
`wait-for-it`](#8-coordinate-services-with-wait-for-it)
9. [Start entrypoint scripts with `set -e` and end with
`exec "$@"`](#9-start-entrypoint-scripts-with-set-e-and-end-with-exec)
10. [Target different CPU architectures with
`BUILDARCH`](#10-target-different-cpu-architectures-with-buildarch)
11. [Prefer `docker compose` to
`docker-compose`](#11-prefer-docker-compose-to-docker-compose)
------------------------------------------------------------------------
### 1. Don't put code or app-level dependencies into the image {#1-dont-put-code-or-app-level-dependencies-into-the-image}
Your primary Dockerfile, the one the application runs in, should include
all the necessary software to run the app, but shouldn't include the
actual application code itself -- that'll be mounted into the container
when `docker-compose run` starts and synced between the container and
the local machine.
Additionally, it's important to distinguish between system-level
dependencies (like ImageMagick) and application-level ones (like
Rubygems and NPM packages) -- the former should be included in the
Dockerfile; the latter should not. Baking application-level dependencies
into the image means that it'll have to be rebuilt every time someone
adds a new one, which is both time-consuming and error-prone. Instead,
we install those dependencies as part of a startup script.
### 2. Don't use a Dockerfile if you don't have to {#2-dont-use-a-dockerfile-if-you-dont-have-to}
With point #1 in mind, you might find you don't need to write a
Dockerfile at all. If your app doesn't have any special dependencies,
you might be able to point your `docker-compose.yml` entry right at the
official Docker repository (i.e. just reference `ruby:2.7.6`). This
isn't very common -- most apps and frameworks require some amount of
infrastructure (e.g. Rails needs a working version of Node), but if you
find yourself with a Dockerfile that contains just a single `FROM` line,
you can cut it.
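In that case, the service entry in `docker-compose.yml` can point straight at the upstream image; a minimal sketch (the tag and paths here are illustrative) might be:

```yaml
services:
  app:
    image: ruby:2.7.6        # no build: section, no Dockerfile
    command: ./bin/rails server -p 3000 -b '0.0.0.0'
    working_dir: /app
    volumes:
      - .:/app
```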
### 3. Only reference a Dockerfile once in `docker-compose.yml` {#3-only-reference-a-dockerfile-once-in-docker-compose-yml}
If you're using the same image for multiple services (which you
should!), only provide the build instructions in the definition of a
single service, assign a name to it, and then reference that name for
the additional services. So as an example, imagine a Rails app that uses
a shared image for running the development server and
`webpack-dev-server`. An example configuration might look like this:
services:
rails:
image: appname_rails
build:
context: .
dockerfile: ./.docker-config/rails/Dockerfile
command: ./bin/rails server -p 3000 -b '0.0.0.0'
node:
image: appname_rails
command: ./bin/webpack-dev-server
This way, when we build the services (with `docker-compose build`), our
image only gets built once. If instead we'd omitted the `image:`
directives and duplicated the `build:` one, we'd be rebuilding the exact
same image twice, wasting your disk space and limited time on this
earth.
### 4. Cache dependencies in named volumes {#4-cache-dependencies-in-named-volumes}
As mentioned in point #1, we don't bake code dependencies into the image
and instead install them on startup. As you can imagine, this would be
pretty slow if we installed every gem/pip/yarn library from scratch each
time we restarted the services (hello NOKOGIRI), so we use Docker's
named volumes to keep a cache. The config above might become something
like:
volumes:
gems:
yarn:
services:
rails:
image: appname_rails
build:
context: .
dockerfile: ./.docker-config/rails/Dockerfile
command: ./bin/rails server -p 3000 -b '0.0.0.0'
volumes:
- .:/app
- gems:/usr/local/bundle
- yarn:/app/node_modules
node:
image: appname_rails
command: ./bin/webpack-dev-server
volumes:
- .:/app
- yarn:/app/node_modules
Where specifically you should mount the volumes to will vary by stack,
but the same principle applies: keep the compiled dependencies in named
volumes to massively decrease startup time.
### 5. Put ephemeral stuff in named volumes {#5-put-ephemeral-stuff-in-named-volumes}
While we're on the subject of using named volumes to increase
performance, here's another hot tip: put directories that hold files you
don't need to edit into named volumes to stop them from being synced
back to your local machine (which carries a big performance cost). I'm
thinking specifically of `log` and `tmp` directories, in addition to
wherever your app stores uploaded files. A good rule of thumb is, if
it's `.gitignore`'d, it's a good candidate for a volume.
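Continuing the compose example from earlier, a sketch of shadowing a Rails app's `log` and `tmp` directories with named volumes (the volume names are illustrative) might look like:

```yaml
volumes:
  rails_log:
  rails_tmp:

services:
  rails:
    image: appname_rails
    volumes:
      - .:/app
      - rails_log:/app/log   # mounted over the bind mount, so these
      - rails_tmp:/app/tmp   # directories never sync back to the host
```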
### 6. Clean up after `apt-get update` {#6-clean-up-after-apt-get-update}
If you use Debian-based images as the starting point for your
Dockerfiles, you've noticed that you have to run `apt-get update` before
you're able to `apt-get install` your dependencies. If you don't take
precautions, this is going to cause a bunch of additional data to get
baked into your image, drastically increasing its size. Best practice is
to do the update, install, and cleanup in a single `RUN` command:
    RUN apt-get update && \
        apt-get install -y libgirepository1.0-dev libpoppler-glib-dev && \
        rm -rf /var/lib/apt/lists/*
### 7. Prefer `exec` to `run` {#7-prefer-exec-to-run}
If you need to run a command inside a container, you have two options:
`run` and `exec`. The former is going to spin up a new container to run
the command, while the latter attaches to an existing running container.
In almost every instance, assuming you pretty much always have the
services running while you're working on the app, `exec` (and
specifically `docker-compose exec`) is what you want. It's faster to
spin up and doesn't carry any chance of leaving weird artifacts around
(which will happen if you're not careful about including the `--rm` flag
with `run`).
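Using the service names from the compose files above, the two styles look like this:

```shell
# Attach to the already-running rails container (fast, no new container)
docker compose exec rails bin/rails console

# Spin up a throwaway container instead; --rm removes it afterwards
docker compose run --rm rails bin/rails db:migrate
```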
### 8. Coordinate services with `wait-for-it` {#8-coordinate-services-with-wait-for-it}
Given our dependence on shared images and volumes, you may encounter
issues where one of your services starts before another service's
`entrypoint` script finishes executing, leading to errors. When this
occurs, we'll pull in the [`wait-for-it` utility
script](https://github.com/vishnubob/wait-for-it), which takes a
host and port to check and a command to run once that address starts
accepting connections. Then we update our `docker-compose.yml` to use it:
volumes:
gems:
yarn:
services:
rails:
image: appname_rails
build:
context: .
dockerfile: ./.docker-config/rails/Dockerfile
command: ./bin/rails server -p 3000 -b '0.0.0.0'
volumes:
- .:/app
- gems:/usr/local/bundle
- yarn:/app/node_modules
node:
image: appname_rails
command: [
"./.docker-config/wait-for-it.sh",
"rails:3000",
"--timeout=0",
"--",
"./bin/webpack-dev-server"
]
volumes:
- .:/app
- yarn:/app/node_modules
This way, `webpack-dev-server` won't start until the Rails development
server is fully up and running.
### 9. Start entrypoint scripts with `set -e` and end with `exec "$@"` {#9-start-entrypoint-scripts-with-set-e-and-end-with-exec}
The setup we've described here depends a lot on using
[entrypoint](https://docs.docker.com/compose/compose-file/#entrypoint)
scripts to install dependencies and manage other setup. There are two
things you should include in **every single one** of these scripts, one
at the beginning, one at the end:
- At the top of the file, right after `#!/bin/bash` (or similar), put
`set -e`. This will ensure that the script exits if any line exits
with an error.
- At the end of the file, put `exec "$@"`. Without this, the
instructions you pass in with the
[command](https://docs.docker.com/compose/compose-file/#command)
directive won't execute.
[Here's a good StackOverflow
answer](https://stackoverflow.com/a/48096779) with some more
information.
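Putting the two together, a skeleton entrypoint script might look like this (the commented-out dependency step is illustrative and varies by stack):

```shell
#!/bin/bash
set -e    # abort immediately if any setup step fails

# App-level dependency installation goes here (see best practice #1), e.g.:
#   bundle check || bundle install

# Finally, replace this process with the service's command: directive
exec "$@"
```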
### 10. Target different CPU architectures with `BUILDARCH` {#10-target-different-cpu-architectures-with-buildarch}
We're presently about evenly split between Intel and Apple Silicon
laptops. Most of the common base images you pull from
[DockerHub](https://hub.docker.com/) are multi-platform (for example,
look at the "OS/Arch" dropdown for the [Ruby
image](https://hub.docker.com/layers/library/ruby/2.7.6/images/sha256-1af3ca0ab535007d18f7bc183cc49c228729fc10799ba974fbd385889e4d658a?context=explore)),
and Docker will pull the correct image for the local architecture.
However, if you're doing anything architecture-specific in your
Dockerfiles, you might encounter difficulties.
As mentioned previously, we'll often need a specific version of Node.js
running inside a Ruby-based image. A way we'd commonly set this up is
something like this:
FROM ruby:2.7.6
    RUN curl -sS https://nodejs.org/download/release/v16.17.0/node-v16.17.0-linux-x64.tar.gz \
        | tar xzf - --strip-components=1 -C "/usr/local"
This works fine on Intel Macs, but blows up on Apple Silicon -- notice
the `x64` in the above URL? That needs to be `arm64` on an M1. The
easiest option is to specify `platform: linux/amd64` for each service
using this image in your `docker-compose.yml`, but that's going to put
Docker into emulation mode, which has performance drawbacks as well as
[other known
issues](https://docs.docker.com/desktop/mac/apple-silicon/#known-issues).
Fortunately, Docker exposes a handful of [platform-related
arguments](https://docs.docker.com/engine/reference/builder/#automatic-platform-args-in-the-global-scope)
we can lean on to target specific architectures. We'll use `BUILDARCH`,
the architecture of the local machine. While there's no native
conditional functionality in the Dockerfile spec, we can do a little bit
of shell scripting inside of a `RUN` command to achieve the desired
result:
FROM ruby:2.7.6
ARG BUILDARCH
    RUN if [ "$BUILDARCH" = "arm64" ]; \
        then curl -sS https://nodejs.org/download/release/v16.17.0/node-v16.17.0-linux-arm64.tar.gz \
          | tar xzf - --strip-components=1 -C "/usr/local"; \
        else curl -sS https://nodejs.org/download/release/v16.17.0/node-v16.17.0-linux-x64.tar.gz \
          | tar xzf - --strip-components=1 -C "/usr/local"; \
        fi
This way, a dev running on Apple Silicon will download and install
`node-v16.17.0-linux-arm64`, and someone with Intel will use
`node-v16.17.0-linux-x64`.
### 11. Prefer `docker compose` to `docker-compose` {#11-prefer-docker-compose-to-docker-compose}
Though both `docker compose up` and `docker-compose up` (with or without
a hyphen) work to spin up your containers, per this [helpful
StackOverflow answer](https://stackoverflow.com/a/66516826),
"`docker compose` (with a space) is a newer project to migrate compose
to Go with the rest of the docker project."
*Thanks [Dylan](https://www.viget.com/about/team/dlederle-ensign/) for
this one.*
So there you have it, a short list of the best practices we've developed
over the last several years of working with Docker. We'll try to keep
this list updated as we get better at doing and documenting this stuff.
If you're interested in reading more, here are a few good links:
- [Ruby on Whales: Dockerizing Ruby and Rails
development](https://evilmartians.com/chronicles/ruby-on-whales-docker-for-ruby-rails-development)
- [Docker: Right for Us. Right for
You?](https://www.viget.com/articles/docker-right-for-us-right-for-you-1/)
- [Docker + Rails: Solutions to Common
Hurdles](https://www.viget.com/articles/docker-rails-solutions-to-common-hurdles/)
------------------------------------------------------------------------
[^1]: Namely, there's a significant performance hit when running Docker
    on Mac (as we do), in addition to the cognitive hurdle of all your
    stuff running inside containers. If I worked at a product shop,
    where I was focused on a single codebase for the bulk of my time,
    I'd think hard before going all in on local Docker.

---
title: "Maintenance Matters: Continuous Integration"
date: 2022-08-26T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/maintenance-matters-continuous-integration/
---
*This article is part of a series focusing on how developers can center
and streamline software maintenance.* The other articles in the
Maintenance Matters series are: [Code
Coverage](https://www.viget.com/articles/maintenance-matters-code-coverage/),
[Documentation](https://www.viget.com/articles/maintenance-matters-documentation/),
[Default
Formatting](https://www.viget.com/articles/maintenance-matters-default-formatting/),
[Building Helpful
Logs](https://www.viget.com/articles/maintenance-matters-helpful-logs/),
[Timely
Upgrades](https://www.viget.com/articles/maintenance-matters-timely-upgrades/),
and [Code
Reviews](https://www.viget.com/articles/maintenance-matters-code-reviews/).
As Annie said in her [intro
post](https://www.viget.com/articles/maintenance-matters/):
> There are many factors that go into a successful project, but in this
> series, we're focusing on the small things that developers usually
> have control over. Over the next few months, we'll be expanding on
> many of these in separate articles.
Today I'd like to talk to you about **Continuous Integration**, as I
feel strongly that it's something no software effort should be without.
Now, before we start, I should clarify:
[Wikipedia](https://en.wikipedia.org/wiki/Continuous_integration)
defines Continuous Integration as "the practice of merging all
developers' working copies to a shared mainline several times a day."
Maybe this was a revolutionary idea in 1991? I don't know, I was in
second grade. Nowadays, at least at Viget, the whole team frequently
merging their work into a common branch is the noncontroversial default.
For the purposes of this Maintenance Matters article, I'll be focused on
this aspect of CI:
> In addition to automated unit tests, organisations using CI typically
> use a build server to implement continuous processes of applying
> quality control in general -- small pieces of effort, applied
> frequently.
If you're not familiar with the concept, it's pretty simple: a typical
Viget dev project includes one or more [GitHub Action
Workflows](https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions)
that define a series of tasks that should run every time code is pushed
to the central repository. At a minimum, the workflow checks out the
code, installs the necessary dependencies, and runs the automated test
suite. In most cases, pushes to the `main` branch will trigger automatic
deployment to an internal QA environment (a process known as [continuous
deployment](https://en.wikipedia.org/wiki/Continuous_deployment)). If
any step fails (e.g. the tests don't pass or a dependency can't be
installed), the process aborts and the team gets notified.
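As a sketch, a minimal workflow along those lines for a Rails app might look like this (the file path, action versions, and commands are illustrative, not from the original article):

```yaml
# .github/workflows/ci.yml
name: CI
on: push

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3    # check out the code
      - uses: ruby/setup-ruby@v1     # install Ruby and cached gems
        with:
          bundler-cache: true
      - run: bin/rails test          # run the automated test suite
```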
CI is a very tactical, concrete thing you do, but more than that, it's a
mindset -- it's your team's values made concrete. It's one thing to say
"all projects must have 100% code coverage"; it's another thing entirely
to move your deploys into a CI task that only runs after the coverage
check, so that nothing can go live until it's fully tested. Continuous
Integration is code that improves the way you write code, and a
commitment to continuous improvement.
So what can you do with Continuous Integration? I've mentioned the two
primary tasks (running tests and automated deployment), but that's
really just the tip of the iceberg. You can also:
- Check code coverage
- Run linters (like `rubocop`, `eslint`, or `prettier`) to enforce
coding standards
- Scan for security issues with your dependencies
- Tag releases in Sentry (or your error tracking tool of choice)
- Deploy feature branches to
[Vercel](https://vercel.com/)/[Netlify](https://www.netlify.com/)/[Fly.io](https://fly.io/)
for easy previews during code review
- Build Docker images and push them to a registry
- Create release artifacts
Really, anything a computer can do, a CI runner can do:
- Send messages to Slack
- Spin up new servers as part of a blue/green deployment strategy
- Run your seed script, assert that every model has a valid record
- Grep your codebase for git conflict artifacts
- Assert that all images have been properly optimized
That's not to say you can't overdo it -- you can. It can take a long
time to configure, and workflows can take a long time to run as
codebases grow. It can cost a lot if you're running a lot of builds. It
can be error-prone, with issues that only occur in CI. And it can be
interpersonally fraught -- as I said, it's your team's values made
concrete, and sometimes getting that alignment is the hardest part.
Nevertheless, I consider some version of CI to be mandatory for any
software project. It should be part of initial project setup -- get
aligned with your team on what standards you want to enforce, choose
your CI tool, and get it configured ASAP, ideally before development
begins in earnest. It's much easier to stick with established, codified
standards than to come back and try to add them later.
As mentioned previously, we're big fans of GitHub Actions and its
seamless integration with the rest of our workflow. [Here's a good guide
for getting started](https://docs.github.com/en/actions/quickstart).
We've also used and enjoyed [CircleCI](https://circleci.com/), [GitLab
CI/CD](https://docs.gitlab.com/ee/ci/), and
[Jenkins](https://www.jenkins.io/). Ultimately, the tool doesn't matter
all that much provided it can reliably trigger jobs on push and report
failures, so find the one that works best for your team.
That's the what, why, and how of Continuous Integration. Of course, all
this is precipitated by having a high-functioning team. And there's no
[GitHub Action for
**that**](https://github.com/marketplace?type=actions&query=good+development+team),
unfortunately.
*The next article in this series is [Maintenance Matters: Code
Coverage.](https://www.viget.com/articles/maintenance-matters-code-coverage/)*

---
title: "Making an Email-Powered E-Paper Picture Frame"
date: 2021-05-12T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/making-an-email-powered-e-paper-picture-frame/
---
Over the winter, inspired by this [digital photo
frame](http://toolsandtoys.net/aura-mason-smart-digital-picture-frame/)
that uses email to add new photos, I built and programmed a trio of
e-paper picture frames for my family, and I thought it\'d be cool to
walk through the process in case someone out there wants to try
something similar.
![image](IMG_0120.jpeg)
In short, it's a Raspberry Pi Zero connected to a roughly 5-by-7-inch
e-paper screen, running some software I wrote in Go and living inside a
frame I put together. This project consists of four main parts:
1. The email-to-S3 gateway, [described in detail in a previous
post](https://www.viget.com/articles/email-photos-to-an-s3-bucket-with-aws-lambda-with-cropping-in-ruby/);
2. The software to display the photos on the screen;
3. Miscellaneous Raspberry Pi configuration; and
4. The physical frame itself.
As for materials, you'll need the following:
- [A Raspberry Pi Zero with
headers](https://www.waveshare.com/raspberry-pi-zero-wh.htm)
- [An e-paper
display](https://www.waveshare.com/7.5inch-hd-e-paper-hat.htm)
- A micro SD card (and some way to write to it)
- Some 1x4 lumber (I used oak)
- [4x metal standoffs](https://www.amazon.com/gp/product/B00TX464XQ)
- [A 6x8 piece of
acrylic](https://www.amazon.com/gp/product/B07J4WX7BH)
- Some wood glue to attach the boards, and some wood screws to attach
the standoffs
I'll get more into the woodworking tools down below.
## The Email-to-S3 Gateway
Like I said, [I've already documented this part pretty
thoroughly](https://www.viget.com/articles/email-photos-to-an-s3-bucket-with-aws-lambda-with-cropping-in-ruby/),
but in short, we use an array of AWS services to set up an email address
that fires off a Lambda function when it receives an email. The function
extracts the attachments from the email, crops them a couple of ways
(one for display on a webpage, the other for display on the screen), and
uploads the results into an S3 bucket.
![image](Screen_Shot_2021-05-09_at_1_26_39_PM.png)
## The Software
The next task was to write the code that runs on the Pi that can update
the display periodically. I also thought it'd be cool if it could
expose a simple web interface on the local network to let my family
members browse the photos and display them on the frame. When selecting
a language, I could have gone with either Ruby or Python, the former
since that's what I'm most familiar with, the latter because that's
what [the code provided by
Waveshare](https://github.com/waveshare/e-Paper/tree/master/RaspberryPi_JetsonNano/python/lib/waveshare_epd),
the manufacturer, is written in.
But I chose neither of those options, reader, opting instead for Go. Why
Go, you ask?
- **I wanted something robust.** Ideally, this code will run on these
devices for years with no downtime. If something does go wrong, I
won't have any way to debug the problems remotely, instead having
to wait until the next time I'm on the same wifi network with the
failing device. Go's explicit error checking was appealing in this
regard.
- **I wanted deployment to be simple.** I didn't have any appetite
for all the configuration required to get a Python or Ruby app
running on the Pi. The fact that I could compile my code into a
single binary that I could `scp` onto the device and manage with
`systemd` was compelling.
- **I wanted a web UI**, but it wasn't the main focus. With Go, I
could just import the built-in `net/http` to add simple web
functionality.
To interface with the screen, I started with [this super awesome GitHub
project](https://github.com/gandaldf/rpi). Out of the box, it didn't
work with my screen, I *think* because Waveshare offers a bunch of
different screens and the specific instructions differ between them. So
I forked it and found the specific Waveshare Python code that worked
with my screen ([this
one](https://github.com/waveshare/e-Paper/blob/master/RaspberryPi_JetsonNano/python/lib/waveshare_epd/epd7in5_HD.py),
I believe), and then it was just a matter of updating the Go code to
match the Python, which was tricky because I don't know very much about
low-level electronics programming, but also pretty easy since the Go and
Python are set up in pretty much the same way.
[Here's my
fork](https://github.com/dce/rpi/blob/master/epd7in5/epd7in5.go) --- if
you go with the exact screen I linked to above, it *should* work, but
there's a chance you end up having to do what I did and customize it
to match Waveshare's official source.
Writing the main Go program was a lot of fun. I managed to do it all ---
interfacing with the screen, displaying a random photo, and serving up a
web interface --- in one (IMO) pretty clean file. [Here's the
source](https://github.com/dce/e-paper-frame), and I've added some
scripts to hopefully make hacking on it a bit easier.
## Configuring the Raspberry Pi
Setting up the Pi was pretty straightforward, though not without a lot
of trial-and-error the first time through:
1. Flash Raspberry Pi OS onto the SD card
2. [Configure your wifi
information](https://www.raspberrypi.org/documentation/configuration/wireless/wireless-cli.md)
and [enable
SSH](https://howchoo.com/g/ote0ywmzywj/how-to-enable-ssh-on-raspbian-without-a-screen#create-an-empty-file-called-ssh)
3. Plug it in --- if it doesn't join your network, you probably messed
something up in step 2
4. SSH in (`ssh pi@<192.168.XXX.XXX>`, password `raspberry`) and put
your public key in `.ssh`
5. Go ahead and run a full system update
(`sudo apt update && sudo apt upgrade -y`)
6. Install the AWS CLI and NTP (`sudo apt-get install awscli ntp`)
7. You'll need some AWS credentials --- if you already have a local
`~/.aws/config`, just put that file in the same place on the Pi; if
not, run `aws configure`
8. Enable SPI --- run `sudo raspi-config`, then select "Interface
   Options", "SPI"
9. Upload `frame-server-arm` from your local machine using `scp`; I
have it living in `/home/pi/frame`
10. Copy the [cron
script](https://github.com/dce/e-paper-frame/blob/main/etc/random-photo)
into `/etc/cron.hourly` and make sure it has execute permissions
(then give it a run to pull in the initial photos)
11. Add a line into the root user's crontab to run the script on
startup: `@reboot /etc/cron.hourly/random-photo`
12. Copy the [`systemd`
service](https://github.com/dce/e-paper-frame/blob/main/etc/frame-server.service)
into `/etc/systemd/system`, then enable and start it
And that should be it. The photo gallery should be accessible at a local
IP and the photo should update hourly (though not ON the hour as that's
not how `cron.hourly` works for some reason).
![image](IMG_0122.jpeg)
## Building the Frame
This part is strictly optional, and there are lots of ways you can
display your frame. I took (a lot of) inspiration from this ["DIY
Modern Wood and Acrylic Photo
Stand"](https://evanandkatelyn.com/2017/10/modern-wood-and-acrylic-photo-stand/)
with just a few modifications:
- I used just one sheet of acrylic instead of two
- I used a couple small pieces of wood with a shallow groove to create
a shelf for the screen to rest on
- I used a drill press to make a 3/4" hole in the middle of the board
to run the cable through
- I didn't bother with the pocket holes --- wood glue is plenty
  strong
The tools I used were: a table saw, a miter saw, a drill press, a
regular cordless drill (**do not** try to make the larger holes in the
acrylic with a drill press omfg), an orbital sander, and some 12"
clamps. I'd recommend starting with some cheap pine before using nicer
wood --- you'll probably screw something up the first time if you're
anything like me.
This project was a lot of fun. Each part was pretty simple --- I'm
certainly no expert at AWS, Go programming, or woodworking --- but
combined together they make something pretty special. Thanks for
reading, and I hope this inspires you to make something for your mom or
someone else special to you.
*Raspberry Pi illustration courtesy of [Jonathan
Rutheiser](https://commons.wikimedia.org/wiki/File:Raspberry_Pi_Vector_Illustration.svg)*

---
title: "Manual Cropping with Paperclip"
date: 2012-05-31T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/manual-cropping-with-paperclip/
---
It's relatively straightforward to add basic manual (browser-based)
cropping support to your
[Paperclip](https://github.com/thoughtbot/paperclip) image attachments.
See [RJCrop](https://github.com/jschwindt/rjcrop) for one valid
approach. What's not so straightforward, though, is adding manual
cropping while preserving Paperclip's built-in thumbnailing
capabilities. Here's how.
Just so we're on the same page, when we're talking about "thumbnailing,"
we're talking about the ability to set a size of `50x50#`, which means
"scale and crop the image into a 50 by 50 pixel square." If the original
image is 200x100, it would first be scaled down to 100x50, and then 25
pixels trimmed from both sides to arrive at the final dimensions. This
is not a native capability of ImageMagick, but rather the result of some
decently complex code in Paperclip.
Our goal is to allow a user to select a portion of an image and then
create a thumbnail of *just that selected portion*, ideally taking
advantage of Paperclip's existing cropping/scaling logic.
Any time you're dealing with custom Paperclip image processing, you're
talking about creating a custom
[Processor](https://github.com/thoughtbot/paperclip#post-processing). In
this case, we'll be subclassing the default
[Thumbnail](https://github.com/thoughtbot/paperclip/blob/master/lib/paperclip/thumbnail.rb)
processor and making a few small tweaks. We'll imagine you have a model
with the fields `crop_x`, `crop_y`, `crop_width`, and `crop_height`. How
those get set is left as an exercise for the reader (though I recommend
[JCrop](http://deepliquid.com/content/Jcrop.html)). Some code, then:
module Paperclip
  class ManualCropper < Thumbnail
    def initialize(file, options = {}, attachment = nil)
      super
      @current_geometry.width  = target.crop_width
      @current_geometry.height = target.crop_height
    end

    def target
      @attachment.instance
    end

    def transformation_command
      crop_command = [
        "-crop",
        "#{target.crop_width}x"
        "#{target.crop_height}+"
        "#{target.crop_x}+"
        "#{target.crop_y}",
        "+repage"
      ]
      crop_command + super
    end
  end
end
In our `initialize` method, we call super, which sets a whole host of
instance variables, including `@current_geometry`, which is responsible
for creating the geometry string that will crop and scale our image. We
then set its `width` and `height` to be the dimensions of our cropped
image.
We also override the `transformation_command` method, prepending our
manual crop to the instructions provided by `@current_geometry`. The end
result is a geometry string which crops the image, repages it, then
scales the image and crops it a second time. Simple, but certainly not
intuitive, at least not to me.
From here, you can include this cropper using the `:processors`
directive in your `has_attached_file` declaration, and you should be
good to go. This simple approach assumes that the crop dimensions will
always be set, so tweak accordingly if that's not the case.
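For reference, the wiring might look something like this (a sketch;
the model and style names are illustrative, but `:processors` and the
underscored processor name follow Paperclip's conventions):

```ruby
# Hypothetical model, assuming the processor above lives somewhere
# Paperclip can find it (e.g. lib/paperclip_processors/):
class Photo < ActiveRecord::Base
  has_attached_file :image,
                    styles: { thumb: "50x50#" },
                    processors: [:manual_cropper]
end
```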

---
title: "Getting (And Staying) Motivated to Code"
date: 2009-01-21T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/motivated-to-code/
---
When you're working on code written by another programmer --- whether a
coworker, open source contributor, or (worst of all) *yourself* from six
months ago --- it's all too easy to get frustrated and fall into an
unproductive state. The following are some ways I've found to overcome
this apprehension and get down to business.
### Tiny Improvements, Tiny Commits
When confronted with a sprawling, outdated codebase, it's easy to get
overwhelmed. To get started, I suggest making a tiny improvement. Add a
[named
scope](http://ryandaigle.com/articles/2008/3/24/what-s-new-in-edge-rails-has-finder-functionality).
Use a more advanced
[enumerable](http://www.ruby-doc.org/core/classes/Enumerable.html)
method. And, as soon as you've finished, commit it. Committing feels
great and really hammers home that you've accomplished something of
value. Additionally, committing increases momentum and gives you the
courage to take on larger changes.
### Make a List
In *Getting Things Done*, [David Allen](http://www.davidco.com/) says,
> You'll invariably feel a relieving of pressure about anything you have
> a commitment to change or do, when you decide on the very next
> physical action required to move it forward.
I like to take it a step further: envision the program as I want it to
be, and then list the steps it will take to get there. Even though the
list will change substantially along the way, having a path and a
destination removes a lot of the anxiety of working with unfamiliar
code.
To manage such lists, I love [Things](https://culturedcode.com/things/),
but a piece of paper works just as well.
### Delete Something
As projects grow and requirements change, a lot of code outlives its
usefulness; but it sticks around anyway because, on the surface, its
presence isn't hurting anything. I'm sure you've encountered this ---
hell, I'm sure you've got extraneous code in your current project. When
confronted with such code, delete it. Deleting unused code increases
readability, decreases the likelihood of bugs, and adds to your
understanding of the remaining code. But those reasons aside, it feels
*great*. If I suspect a method isn't being used anywhere, I'll do
grep -lir "method_name" app/
to find all the places where the method name occurs.
### Stake your Claim
On one project, I couldn't do any feature development --- or even make
any commits --- until I'd rewritten the entire test suite to use
[Shoulda](http://thoughtbot.com/projects/shoulda/). It was mentally
draining work and took much longer than it shoulda (see what I did
there?). If you need to add functionality to one specific piece of the
site, take the time to address those classes and call it a victory. You
don't have to fix everything at once, and it's much easier to bring code
up to speed one class at a time. With every improvement you make, your
sense of ownership over the codebase will increase and so will your
motivation.
### In Closing
As Rails moves from an upstart framework to an established technology,
the number of legacy projects will only increase. But even outside the
scope of Rails development, or working with legacy code at all, I think
maintaining motivation is the biggest challenge we face as developers.
I'd love to hear your tips for getting and staying motivated to code.

---
title: "Multi-line Memoization"
date: 2009-01-05T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/multi-line-memoization/
---
Here's a quick tip that came out of a code review we did last week. One
easy way to add caching to your Ruby app is to
[memoize](https://en.wikipedia.org/wiki/Memoization) the results of
computationally expensive methods:
``` {#code .ruby}
def foo
  @foo ||= expensive_method
end
```
The first time the method is called, `@foo` will be `nil`, so
`expensive_method` will be called and its result stored in `@foo`. On
subsequent calls, `@foo` will have a value, so the call to
`expensive_method` will be bypassed. This works well for one-liners, but
what if our method requires multiple lines to determine its result?
``` {#code .ruby}
def foo
  arg1 = expensive_method_1
  arg2 = expensive_method_2
  expensive_method_3(arg1, arg2)
end
```
A first attempt at memoization yields this:
``` {#code .ruby}
def foo
  unless @foo
    arg1 = expensive_method_1
    arg2 = expensive_method_2
    @foo = expensive_method_3(arg1, arg2)
  end
  @foo
end
```
To me, using `@foo` three times obscures the intent of the method. Let's
do this instead:
``` {#code .ruby}
def foo
  @foo ||= begin
    arg1 = expensive_method_1
    arg2 = expensive_method_2
    expensive_method_3(arg1, arg2)
  end
end
```
This clarifies the role of `@foo` and reduces LOC. Of course, if you use
the Rails built-in [`memoize`
method](http://ryandaigle.com/articles/2008/7/16/what-s-new-in-edge-rails-memoization),
you can avoid accessing these instance variables entirely, but this
technique has utility in situations where requiring ActiveSupport would
be overkill.
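To see the caching at work, here's a small made-up example (not from
the original post) where a counter proves the expensive branch only
runs once:

```ruby
# Toy example: the expensive branch inside begin/end runs on the
# first call only; later calls return the memoized @total.
class Report
  def initialize
    @calls = 0
  end

  attr_reader :calls

  def total
    @total ||= begin
      @calls += 1  # stand-in for expensive work
      20 + 22
    end
  end
end

r = Report.new
r.total  # => 42
r.total  # => 42 (cached)
r.calls  # => 1
```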

---
title: "New Pointless Project: I Dig Durham"
date: 2011-02-25T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/new-pointless-project-i-dig-durham/
---
*This post originally appeared on [Pointless
Corp](http://pointlesscorp.com/).*
There's a lot of love at Viget South for our adopted hometown of Durham,
NC. A few of us decided to use the first [Pointless
Weekend](https://viget.com/flourish/pointless-weekend-3-new-pointless-projects) to
build a tiny application to highlight some of Durham's finer points and,
48 hours later, launched [I Dig Durham](http://idigdurham.com/). Simply
tweet to [@idigdurham](https://twitter.com/idigdurham) (or include the
hashtag [#idigdurham](https://twitter.com/search?q=%23idigdurham)) or
post a photo to Flickr
tagged [idigdurham](http://www.flickr.com/photos/tags/idigdurham) and
we'll pull it into the site. What's more, you can order
a [t-shirt](https://idigdurham.spreadshirt.com/) with the logo on it,
with all proceeds going to [Urban Ministries of
Durham](http://www.umdurham.org/).
As [Rails Rumble](http://railsrumble.com/) (and [Node
Knockout](http://nodeknockout.com/)) veterans, we knew that there's
basically no such thing as too simple a product for these competitions
--- no matter how little you think you have to do, you're always
sweating bullets with half an hour left to go. With that in mind, we
kept I Dig Durham as simple as possible, leaving us plenty of time to
really polish the site.
Though basically feature complete, we've got a few tweaks we plan to
make to the site, and we'd like to expand the underlying app to support
I Dig sites for more of our favorite cities, but it's a good start from
[North Carolina's top digital
agency](https://www.viget.com/durham)...though we may be biased.

---
title: "New Pointless Project: OfficeGames"
date: 2012-02-28T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/new-pointless-project-officegames/
---
*This post originally appeared on [Pointless
Corp](http://pointlesscorp.com/).*
We're a competitive company, so for this year's [Pointless
Weekend](http://www.pointlesscorp.com/blog/2012-pointless-weekend-kicks-off),
the team in Viget's Durham office thought it'd be cool to put together a
simple app for keeping track of competitions around the office. 48 hours
later (give or take), we launched [OfficeGames](http://officegam.es/).
We're proud of this product, and plan to continue improving it in the
coming weeks. Some of the highlights for me:
## Everyone Doing Everything
We're a highly collaborative company, but by and large, when it comes to
client work, everyone on the team has a fairly narrow role.
[Zachary](https://www.viget.com/about/team/zporter) writes Ruby code.
[Todd](https://www.viget.com/about/team/tmoy) does UX.
[Jeremy](https://www.viget.com/about/team/jfrank) focuses on the front
end. Not so for Pointless Weekend -- UX, design, and development duties
were spread out across the entire team. Everyone had the repo checked
out and was committing code.
## Responsive Design with Bootstrap
We used Twitter's [Bootstrap](https://twitter.github.com/bootstrap/)
framework to build our app. The result is a responsive design that
shines on the iPhone but holds up well on larger screens. I was
impressed with how quickly we were able to get a decent-looking site
together, and how well the framework held up once Jeremy and
[Doug](https://www.viget.com/about/team/davery) started implementing
some of [Mark](https://www.viget.com/about/team/msteinruck)'s design
ideas.
## Rails as a Mature Framework
I was impressed with the way everything came together on the backend. It
seems to me that we're finally realizing the promise of the Rails
framework: common libraries that handle the application plumbing, while
still being fully customizable, so developers can quickly knock out the
boilerplate and then focus on the unique aspects of their applications.
We used [SimplestAuth](https://github.com/vigetlabs/simplest_auth),
[InheritedResources](https://github.com/josevalim/inherited_resources),
and [SimpleForm](https://github.com/plataformatec/simple_form) to great
effect.
Sign your office up for [OfficeGames](http://officegam.es/) and then add
your coworkers to start tracking scores. Let us know what you think!

---
title: "On Confidence and Real-Time Strategy Games"
date: 2011-06-30T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/on-confidence-and-real-time-strategy-games/
---
I want to talk about confidence and how it applies to being a successful
developer. But before I do that, I want to talk about
*[Z](https://en.wikipedia.org/wiki/Z_(video_game))*, a real-time
strategy game from the mid-'90s.
[![](https://upload.wikimedia.org/wikipedia/en/thumb/6/68/Z_The_Bitmap_Brothers.PNG/256px-Z_The_Bitmap_Brothers.PNG)](https://en.wikipedia.org/wiki/File:Z_The_Bitmap_Brothers.PNG)
In other popular RTSes of the time, like *Warcraft* and *Command and
Conquer*, you collected `/(gold|Tiberium|Vespene gas)/` and used it to
build units with which to smite your enemies. Z was different: no
resources, only territories that were held by either you or your
opponent. The more territories you held, the more factories you had
*and* the faster each of your factories was able to manufacture units.
If you spent a lot of time playing a Blizzard RTS (and of course you
did), your instinct is to spend the first portion of a match fortifying
your base and amassing an army, after which you head out in search of
your enemy. Try this strategy in Z, though, and by the time you put
together a respectable force, your opponent has three times as many
units and the game is all but decided. Instead, the winning strategy is
to expand early and often, defending your territories as best you can
before pushing forward.
## So What
As developers, our confidence comes from the code we've written and the
successes we've had. When we find ourselves in unfamiliar territory
(such as a new technology or problem domain), our instinct is to act
like a Starcraft player --- keep a low profile, build two (ALWAYS TWO)
barracks, and code away until we have something we're confident in. This
will get you pretty far against the Zerg swarm, but it's a losing
strategy in the realm of software development: the rest of the team
isn't waiting around for you to find your comfort zone. They're making
decisions in your absence, and they very likely aren't the same
decisions you'd make. Your lack of confidence leads to poor
implementation which leads to less confidence, from both your team and
yourself.
Instead, I contend that real-world development is closer to Z than it is
to Starcraft: show confidence early (despite lacking total understanding
of the problem) and your teammates and clients will be inclined to trust
your technical leadership, leading to better technical decisions and a
better product, giving you more confidence and your team all the more
reason to follow your advice. Just as territories lead to units lead to
more territories, confidence leads to good code leads to more
confidence.
**In short:** *display* confidence at the beginning of a project so that
you can *have* confidence when it really counts.
Do you agree? I'd love to hear your thoughts. Best comment gets [my
personal copy of Z](http://www.flickr.com/photos/deisinger/5888230612)
from 1996. You're on your own for the Windows 95 box.

---
title: "OTP: a Language-Agnostic Programming Challenge"
date: 2015-01-26T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/otp-a-language-agnostic-programming-challenge/
---
We spend our days writing Ruby and JavaScript (and love it), but we're
always looking for what's next or just what's nerdy and interesting. We
have folks exploring Rust, Go, D and Elixir, to name a few. I'm
personally interested in strongly-typed functional languages like
Haskell and OCaml, but I've had little success getting through their
corresponding [animal books](http://www.oreilly.com/). I decided that if
I was going to get serious about learning this stuff, I needed a real
problem to solve.
Inspired by an [online course on
Cryptography](https://www.coursera.org/course/crypto), I specced out a
simple [one-time pad](https://en.wikipedia.org/wiki/One-time_pad)
encryptor/decryptor, [pushed it up to
GitHub](https://github.com/vigetlabs/otp) and issued a challenge to the
whole Viget dev team: write a pair of programs in your language of
choice to encrypt and decrypt a message from the command line.
## The Challenge {#thechallenge}
When you [exclusive or](https://en.wikipedia.org/wiki/Exclusive_or)
(XOR) a value by a second value, and then XOR the resulting value by the
second value, you get the original value back. Suppose you and I want to
exchange a secret message, the word "hi", and we've agreed on a secret
key, the hexadecimal number `b33f` (or in binary, 1011 0011 0011 1111).
**To encrypt:**
1. Convert the plaintext ("hi") to its corresponding [ASCII
values](https://en.wikipedia.org/wiki/ASCII#ASCII_printable_code_chart)
("h" becomes 104 or 0110 1000, "i" 105 or 0110 1001).
2. XOR the plaintext and the key:
Plaintext: 0110 1000 0110 1001
Key: 1011 0011 0011 1111
XOR: 1101 1011 0101 0110
3. Convert the result to hexadecimal:
1101 = 13 = d
1011 = 11 = b
0101 = 5 = 5
0110 = 6 = 6
4. So the resulting ciphertext is "db56".
**To decrypt:**
1. Expand the ciphertext and key to their binary forms, and XOR:
Ciphertext: 1101 1011 0101 0110
Key: 1011 0011 0011 1111
XOR: 0110 1000 0110 1001
2. Convert the resulting binary numbers to their corresponding ASCII
values:
0110 1000 = 104 = h
0110 1001 = 105 = i
3. So, as expected, the resulting plaintext is "hi".
The [Wikipedia](https://en.wikipedia.org/wiki/One-time_pad) page plus
the [project's
README](https://github.com/vigetlabs/otp#one-time-pad-otp) provide more
detail. It's a simple problem conceptually, but in order to create a
solution that passes the test suite, you'll need to figure out:
- Creating a basic command-line executable
- Reading from `STDIN` and `ARGV`
- String manipulation
- Bitwise operators
- Converting to and from hexadecimal
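As a concrete reference, here's a compact Ruby sketch of the round trip
described above (one possible approach, not the repo's reference
solution; it assumes the key is at least as long as the message, so key
cycling is left as part of the exercise):

```ruby
# XOR each plaintext byte against the corresponding key byte, then
# render the result as hex (and reverse the process for decryption).
def xor_bytes(bytes, key_hex)
  key = [key_hex].pack("H*").bytes  # hex string -> raw key bytes
  bytes.zip(key).map { |b, k| b ^ k }
end

def encrypt(plaintext, key_hex)
  xor_bytes(plaintext.bytes, key_hex).pack("C*").unpack1("H*")
end

def decrypt(ciphertext_hex, key_hex)
  xor_bytes([ciphertext_hex].pack("H*").bytes, key_hex).pack("C*")
end

encrypt("hi", "b33f")   # => "db56"
decrypt("db56", "b33f") # => "hi"
```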
* * *
As of today, we've created solutions in [~~eleven~~ ~~twelve~~ thirteen
languages](https://github.com/vigetlabs/otp/tree/master/languages):
- [C](https://viget.com/extend/otp-the-fun-and-frustration-of-c)
- D
- [Elixir](https://viget.com/extend/otp-ocaml-haskell-elixir)
- Go
- [Haskell](https://viget.com/extend/otp-ocaml-haskell-elixir)
- JavaScript 5
- JavaScript 6
- Julia
- [Matlab](https://viget.com/extend/otp-matlab-solution-in-one-or-two-lines)
- [OCaml](https://viget.com/extend/otp-ocaml-haskell-elixir)
- Ruby
- Rust
- Swift (thanks [wasnotrice](https://github.com/wasnotrice)!)
The results are varied and fascinating -- stay tuned for future posts
about some of our solutions. [In the
meantime](https://www.youtube.com/watch?v=TDkhl-CgETg), we'd love to see
how you approach the problem, whether in a new language or one we've
already attempted. [Fork the repo](https://github.com/vigetlabs/otp) and
show us what you've got!

---
title: "OTP: a Functional Approach (or Three)"
date: 2015-01-29T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/otp-ocaml-haskell-elixir/
---
I initially started the [OTP
challenge](https://viget.com/extend/otp-a-language-agnostic-programming-challenge)
as a fun way to write some [OCaml](https://ocaml.org/). It was, so much
so that I wrote solutions in two other functional languages,
[Haskell](https://wiki.haskell.org/Haskell) and
[Elixir](http://elixir-lang.org/). I structured all three sets of
programs the same so that I could easily see their similarities and
differences. Check out the `encrypt` program in
[all](https://github.com/vigetlabs/otp/blob/master/languages/OCaml/encrypt.ml)
[three](https://github.com/vigetlabs/otp/blob/master/languages/Haskell/encrypt.hs)
[languages](https://github.com/vigetlabs/otp/blob/master/languages/Elixir/encrypt)
and then I'll share some of my favorite parts. Go ahead, I'll wait.
## Don't Cross the Streams {#dontcrossthestreams}
One tricky part of the OTP challenge is that you have to cycle over the
key if it's shorter than the plaintext. My initial approaches involved
passing around an offset and using the modulo operator, [like
this](https://github.com/vigetlabs/otp/blob/6d607129f78ccafa9a294ca04da9e4c8bf7b7cc1/decrypt.ml#L11-L14):
let get_mask key index =
  let c1 = List.nth key (index mod (List.length key))
  and c2 = List.nth key ((index + 1) mod (List.length key)) in
  int_from_hex_chars c1 c2
Pretty gross, huh? Fortunately, both
[Haskell](http://hackage.haskell.org/package/base-4.7.0.2/docs/Prelude.html#v:cycle)
and
[Elixir](http://elixir-lang.org/docs/master/elixir/Stream.html#cycle/1)
have built-in functionality for lazy, cyclical lists, and OCaml (with
the [Batteries](http://batteries.forge.ocamlcore.org/) library) has the
[Dllist](http://batteries.forge.ocamlcore.org/doc.preview:batteries-beta1/html/api/Dllist.html)
(doubly-linked list) data structure. The OCaml code above becomes
simply:
let get_mask key =
  let c1 = Dllist.get key
  and c2 = Dllist.get (Dllist.next key) in
  int_of_hex_chars c1 c2
No more passing around indexes or using `mod` to stay within the bounds
of the array -- the Dllist handles that for us.
Similarly, a naïve Elixir approach:
def get_mask(key, index) do
  c1 = Enum.at(key, rem(index, length(key)))
  c2 = Enum.at(key, rem(index + 1, length(key)))
  int_of_hex_chars(c1, c2)
end
And with streams activated:
def get_mask(key) do
  Enum.take(key, 2) |> int_of_hex_chars
end
Check out the source code
([OCaml](https://github.com/vigetlabs/otp/blob/master/languages/OCaml/encrypt.ml),
[Haskell](https://github.com/vigetlabs/otp/blob/master/languages/Haskell/encrypt.hs),
[Elixir](https://github.com/vigetlabs/otp/blob/master/languages/Elixir/encrypt))
to get a better sense of cyclical data structures in action.
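Ruby, for what it's worth, has the same facility built in; a quick
illustration (my example, not from the challenge repo):

```ruby
# Array#cycle with no block returns a lazy, endlessly repeating
# enumerator, so the key "wraps around" for free.
key = "b33f".chars.cycle
key.first(6) # => ["b", "3", "3", "f", "b", "3"]
```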
## Partial Function Application {#partialfunctionapplication}
Most programming languages have a clear distinction between function
arguments (input) and return values (output). The line is less clear in
[ML](https://en.wikipedia.org/wiki/ML_%28programming_language%29)-derived
languages like Haskell and OCaml. Check this out (from Haskell's `ghci`
interactive shell):
Prelude> let add x y = x + y
Prelude> add 5 7
12
We create a function, `add`, that (seemingly) takes two arguments and
returns their sum.
Prelude> let add5 = add 5
Prelude> add5 7
12
But what's this? Using our existing `add` function, we've created
another function, `add5`, that takes a single argument and adds five to
it. So while `add` appears to take two arguments and sum them, it
actually takes one argument and returns a function that takes one
argument and adds it to the argument passed to the initial function.
When you inspect the type of `add`, you can see this lack of distinction
between input and output:
Prelude> :type add
add :: Num a => a -> a -> a
Haskell and OCaml use a concept called
[*currying*](https://en.wikipedia.org/wiki/Currying) or partial function
application. It's a pretty big departure from the C-derived languages
most of us are used to. Other languages may offer currying as [an
option](http://ruby-doc.org/core-2.1.1/Proc.html#method-i-curry), but
this is just how these languages work, out of the box, all of the time.
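Ruby's opt-in version, via `Proc#curry`, mirrors the Haskell session
above:

```ruby
# Currying in Ruby: curry turns a two-argument lambda into a chain
# of one-argument calls.
add  = ->(x, y) { x + y }
add5 = add.curry[5]

add.curry[5][7] # => 12
add5[7]         # => 12
```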
Let's see this concept in action. To convert a number to its hex
representation, you call `printf "%x" num`. To convert a whole list of
numbers, pass the partially applied function `printf "%x"` to `map`,
[like
so](https://github.com/vigetlabs/otp/blob/master/languages/Haskell/encrypt.hs#L12):
hexStringOfInts nums = concat $ map (printf "%x") nums
For more info on currying/partial function application, check out
[*Learn You a Haskell for Great
Good*](http://learnyouahaskell.com/higher-order-functions).
## A Friendly Compiler {#afriendlycompiler}
I learned to program with C++ and Java, where `gcc` and `javac` weren't
my friends -- they were jerks, making me jump through a bunch of hoops
without catching any actual issues (or so teenage Dave thought). I've
worked almost exclusively with interpreted languages in the intervening
10+ years, so it was fascinating to work with Haskell and OCaml,
languages with compilers that catch real issues. Here's my original
`decrypt` function in Haskell:
decrypt ciphertext key = case ciphertext of
  []       -> []
  c1:c2:cs -> xor (intOfHexChars [c1, c2]) (getMask key) : decrypt cs (drop 2 key)
Using pattern matching, I pull off the first two characters of the
ciphertext and decrypt them against the key, and then recurse on the
rest of the ciphertext. If the list is empty, we're done. When I
compiled the code, I received the following:
decrypt.hs:16:26: Warning:
    Pattern match(es) are non-exhaustive
    In a case alternative: Patterns not matched: [_]
The Haskell compiler is telling me that I haven't accounted for a list
consisting of a single character. And sure enough, this is invalid input
that a user could nevertheless use to call the program. Adding the
following handles the failure and fixes the warning:
decrypt ciphertext key = case ciphertext of
  []       -> []
  [_]      -> error "Invalid ciphertext"
  c1:c2:cs -> xor (intOfHexChars [c1, c2]) (getMask key) : decrypt cs (drop 2 key)
## Elixir's |> operator {#elixirsoperator}
According to [*Programming
Elixir*](https://pragprog.com/book/elixir/programming-elixir), the pipe
operator (`|>`)
> takes the result of the expression to its left and inserts it as the
> first parameter of the function invocation to its right.
It's borrowed from F#, so it's not an entirely novel concept, but it's
certainly new to me. To build our key, we want to take the first
argument passed into the program, convert it to a list of characters,
and then turn it to a cyclical stream. My initial approach looked
something like this:
key = Stream.cycle(to_char_list(List.first(System.argv)))
Using the pipe operator, we can flip that around into something much
more readable:
key = System.argv |> List.first |> to_char_list |> Stream.cycle
I like it. Reminds me of Unix pipes or any Western written language.
[Here's how I use the pipe operator in my encrypt
solution](https://github.com/vigetlabs/otp/blob/master/languages/Elixir/encrypt#L25-L31).
* * *
At the end of this process, I think Haskell offers the most elegant code
and [Elixir](https://www.viget.com/services/elixir) the most potential
for us at Viget to use professionally. OCaml offers a good middle ground
between theory and practice, though the lack of a robust standard
library is a [bummer, man](https://www.youtube.com/watch?v=24Vlt-lpVOY).
I had a great time writing and refactoring these solutions. I encourage
you to [check out the
code](https://github.com/vigetlabs/otp/tree/master/languages), fork the
repo, and take the challenge yourself.

---
title: "Out, Damned Tabs"
date: 2009-04-09T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/out-damned-tabs/
---
Like many developers I know, I'm a little bit OCD about code formatting.
While there are about as many ideas of properly formatted code as there
are coders, I think we can all agree that code with tabs and trailing
whitespace is not it. Git has the `whitespace = fix` option, which does
a fine job removing trailing spaces before commits, but leaves the
spaces in the working copy, and doesn't manage tabs at all.
I figured there had to be a better way to automate this type of code
formatting, and with help from [Kevin McFadden's
post](http://conceptsahead.com/off-axis/proper-trimming-on-save-with-textmate),
I think I've found one, by configuring
[TextMate](http://macromates.com/) to strip off trailing whitespace and
replace tabs with spaces whenever a file is saved. Here's how to set it
up:
1. Open the Bundle Editor (Bundles \> Bundle Editor \> Show Bundle
Editor).
2. Create a new bundle using the "+" menu at the bottom of the page.
Call it something like "Whitespace."
3. With your new bundle selected, create a new command called "Save
Current File," and give it the following settings:
- Save: Current File
- Command(s): blank
- Input: None
- Output: Discard
4. Start recording a new macro (Bundles \> Macros \> Start Recording).
5. Strip out trailing whitespace (Bundles \> Text \>
Converting/Stripping \> Remove Trailing Spaces in Document).
6. Replace tabs with spaces (Text \> Convert \> Tabs to Spaces).
7. Save the current document (Bundles \> Formatting \> Save Current
Document).
8. Stop recording the macro (Bundles \> Macros \> Stop Recording).
9. Save the macro (Bundles \> Macros \> Save Last Recording). Call it
something like "Strip Whitespace."
10. Click in the Activation (Key Equivalent) text field and hit
Command+S.
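If you'd rather automate the same cleanup outside TextMate, a small Ruby sketch (hypothetical, and not part of the TextMate bundle) could do the two transformations -- note the tab expansion is naive and ignores mid-line tab stops:

```ruby
# Strip trailing whitespace from each line, then expand tabs to two
# spaces (the bundle's behavior, approximated in plain Ruby).
def clean_source(text, indent = "  ")
  text.each_line.map { |line|
    line.gsub(/[ \t]+(?=\r?\n|\z)/, "").gsub("\t", indent)
  }.join
end

puts clean_source("\tdef hello  \n\t\treturn 'hi'\t\n")
```

Run over a file with `File.write(path, clean_source(File.read(path)))`.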
Alternatively, we've packaged the bundle up and put it up on
[GitHub](https://github.com/vigetlabs/whitespace-tmbundle/tree/master).
Instructions for setting it up are on the page, and patches are
encouraged.
### How About You? {#how_about_you}
This approach is working well for me; I'm curious if other people are
doing anything like this. If you've got an alternative way to deal with
extraneous whitespace in your code, please tell us how in the comments.
---
title: "Pandoc: A Tool I Use and Like"
date: 2022-05-25T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/pandoc-a-tool-i-use-and-like/
---
Today I want to talk to you about one of my favorite command-line tools,
[Pandoc](https://pandoc.org/). From the project website:
> If you need to convert files from one markup format into another,
> pandoc is your swiss-army knife.
I spend a lot of time writing, and I love [Vim](https://www.vim.org/),
[Markdown](https://daringfireball.net/projects/markdown/), and the
command line (and avoid browser-based WYSIWYG editors when I can), so
that's where a lot of my Pandoc use comes in, but it has a ton of
utility outside of that -- really, anywhere you need to move between
different text-based formats, Pandoc can probably help. A few examples
from recent memory:
### Markdown ➞ Craft Blog Post
This website you're reading presently uses [Craft
CMS](https://craftcms.com/), a flexible and powerful content management
system that doesn't perfectly match my writing
process[^1^](#fn1){#fnref1 .footnote-ref role="doc-noteref"}. Rather
than composing directly in Craft, I prefer to write locally, pipe the
output through Pandoc, and put the resulting HTML into a text block in
the CMS. This gets me a few things I really like:
- Curly quotes in place of straight ones and en-dashes in place of
`--` (from the [`smart`
extension](https://pandoc.org/MANUAL.html#extension-smart))
- [Daring
Fireball-style](https://daringfireball.net/2005/07/footnotes)
footnotes with return links
By default, Pandoc uses [Pandoc
Markdown](https://garrettgman.github.io/rmarkdown/authoring_pandoc_markdown.html)
when converting Markdown docs to other formats, an "extended and
slightly revised version" of the original syntax, which is how footnotes
and a bunch of other things work.
### Markdown ➞ Rich Text (Basecamp)
I also sometimes find myself writing decently long
[Basecamp](https://basecamp.com/) posts. Basecamp 3 has a fine WYSIWYG
editor (🪦 Textile), but again, I'd rather be in Vim. Pasting HTML into
Basecamp doesn't work (just shows the code verbatim), but I've found
that if I convert my Markdown notes to HTML and open the HTML in a
browser, I can copy and paste that directly into Basecamp with good
results. Leveraging MacOS' `open` command, this one-liner does the
trick[^2^](#fn2){#fnref2 .footnote-ref role="doc-noteref"}:
cat [filename.md]
| pandoc -t html
> /tmp/output.html
&& open /tmp/output.html
&& read -n 1
&& rm /tmp/output.html
This will convert the contents to HTML, save that to a file, open the
file in a browser, wait for a keypress, and then remove the file.
Without that `read -n 1`, it'll remove the file before the browser has
a chance to open it.
### HTML ➞ Text
We built an app for one of our clients that takes in news articles (in
HTML) via an API and sends them as emails to *their* clients (think big
brands) if certain criteria are met. Recently, we were making
improvements to the plain text version of the emails, and we noticed
that some of the articles were coming in without any linebreaks in the
content. When we removed the HTML (via Rails' [`strip_tags`
helper](https://apidock.com/rails/ActionView/Helpers/SanitizeHelper/strip_tags)),
the resulting content was all on one line, which wasn't very readable.
So imagine an article like this:
<h1>Headline</h1> <p>A paragraph.</p> <ul><li>List item #1</li> <li>List item #2</li></ul>
Our initial approach (with `strip_tags`) gives us this:
Headline A paragraph. List item #1 List item #2
Not great! But fortunately, some bright fellow had the idea to pull in
Pandoc, and some even brighter person packaged up some [Ruby
bindings](https://github.com/xwmx/pandoc-ruby) for it. Taking that same
content and running it through `PandocRuby.html(content).to_plain` gives
us:
Headline
A paragraph.
- List item #1
- List item #2
Much better, and though you can't tell from this basic example, Pandoc
does a great job with spacing and wrapping to generate really
nice-looking plain text from HTML.
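A toy version of the tag-stripping step shows exactly where the structure gets lost (a hypothetical stand-in -- Rails' real `strip_tags` uses a proper HTML sanitizer, but the effect on block elements is the same):

```ruby
# Toy tag stripper: deleting the tags keeps the text but throws away
# all of the block structure the tags were providing.
def naive_strip_tags(html)
  html.gsub(/<[^>]+>/, "").squeeze(" ").strip
end

article = "<h1>Headline</h1> <p>A paragraph.</p>" \
          " <ul><li>List item #1</li> <li>List item #2</li></ul>"

puts naive_strip_tags(article)
# Headline A paragraph. List item #1 List item #2
```

Pandoc, by contrast, parses the HTML into a document tree first, so it knows where paragraphs and list items begin and end.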
### HTML Element ➞ Text
A few months ago, we were doing Pointless Weekend and needed a domain
for our
[Thrillr](https://www.viget.com/articles/plan-a-killer-party-with-thrillr/)
app. A few of us were looking through lists of fun top-level domains,
but we realized that AWS Route 53 only supports a limited set of them.
In order to get everyone the actual list, I needed a way to get all the
content out of an HTML `<select>` element, and you'll never guess what I
did (unless you guessed "use Pandoc"). In Firefox:
- Right click the select element, then click "Inspect"
- Find the `<select>` in the DOM view that pops up
- Right click it, then go to "Copy", then "Inner HTML"
- You'll now have all of the `<option>` elements on your clipboard
- In your terminal, run `pbpaste | pandoc -t plain`
The result is something like this:
.ac - $76.00
.academy - $12.00
.accountants - $94.00
.agency - $19.00
.apartments - $47.00
.associates - $29.00
.au - $15.00
.auction - $29.00
...
### Preview Mermaid/Markdown (`--standalone`)
A different client recently asked for an architecture diagram of a
complex system that [Andrew](https://www.viget.com/about/team/athomas/)
and I were working on, and we opted to use
[Mermaid](https://mermaid-js.github.io/mermaid/#/) (which is rad BTW) to
create sequence diagrams to illustrate all of the interactions. Both
GitHub and GitLab support Mermaid natively, which is really neat, but we
wanted a way to quickly iterate on our diagrams without having to push
changes to the remote repo.
We devised a simple build chain ([demo version available
here](https://github.com/dce/mermaid-js-demo)) that watches for changes
to a Markdown file, converts the Mermaid blocks to SVG, and then uses
Pandoc to take the resulting document and convert it to a styled HTML
page using the `--standalone` option ([here's the key
line](https://github.com/dce/mermaid-js-demo/blob/main/bin/build#L7=)).
Then we could simply make our changes and refresh the page to see our
progress.
### Generate a PDF
Finally, and this is not something I need to do very often, but Pandoc
also includes several ways to create PDF documents. The simplest (IMO)
is to install `wkhtmltopdf`, then instruct Pandoc to convert its input
to HTML but use `.pdf` in the output filename, so something like:
echo "# Hello\n\nIs it me you're looking for?" | pandoc -t html -o hello.pdf
[The result is quite nice.](https://static.viget.com/hello.pdf)
------------------------------------------------------------------------
I think that's about all I have to say about Pandoc for today. A couple
final thoughts:
- Pandoc is incredibly powerful -- I've really only scratched the
surface here. Look at the [man page](https://manpages.org/pandoc) to
get a sense of everything it can do.
- Pandoc is written in Haskell, and [the
source](https://github.com/jgm/pandoc/blob/master/src/Text/Pandoc/Readers/Markdown.hs)
is pretty fun to look through if you're a certain kind of person.
So install Pandoc with your package manager of choice and give it a
shot. I think you'll find it unexpectedly useful.
*[Swiss army knife icons created by smalllikeart -
Flaticon](https://www.flaticon.com/free-icons/swiss-army-knife "swiss army knife icons")*
------------------------------------------------------------------------
1. [My writing process is (generally):]{#fn1}
1. Write down an idea in my notebook
2. Gradually add a series of bullet points (this can sometimes take
a while)
3. Once I feel like I have a solid outline, copy that into a
Markdown file
4. Start collecting links (in the `[1]:` footnote style)
5. Write an intro
6. Convert the bullet points to headers, edit + rearrange
7. Fill in all the sections, write jokes, etc.
8. Write a conclusion
9. Create a Gist, get feedback from the team
10. Convert Markdown to HTML, copy to clipboard
(`cat [file] | pandoc -t html | pbcopy`)
11. Create a new post in Craft, add a text section, flip to code
view, paste clipboard contents
12. Fill in the rest of the post metadata
13. 🚢 [↩︎](#fnref1){.footnote-back role="doc-backlink"}
2. [I've actually got this wired up as a Vim command in
`.vimrc`:]{#fn2}
command Mdpreview ! cat %
\ | pandoc -t html
\ > /tmp/output.html
\ && open /tmp/output.html
\ && read -n 1
\ && rm /tmp/output.html
[↩︎](#fnref2){.footnote-back role="doc-backlink"}
---
title: "Use .pluck If You Only Need a Subset of Model Attributes"
date: 2014-08-20T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/pluck-subset-rails-activerecord-model-attributes/
---
*Despite some exciting advances in the field, like
[Node](http://nodejs.org/), [Redis](http://redis.io/), and
[Go](https://golang.org/), a well-structured relational database fronted
by a Rails or Sinatra (or Django, etc.) app is still one of the most
effective toolsets for building things for the web. In the coming weeks,
I'll be publishing a series of posts about how to be sure that you're
taking advantage of all your RDBMS has to offer.*
IF YOU ONLY REQUIRE a few attributes from a table, rather than
instantiating a collection of models and then running a `.map` over them
to get the data you need, it's much more efficient to use `.pluck` to
pull back only the attributes you need as an array. The benefits are
twofold: better SQL performance and less time and memory spent in
Rubyland.
To illustrate, let's use an app I've been working on that takes
[Harvest](http://www.getharvest.com/) data and generates reports. As a
baseline, here is the execution time and memory usage of `rails runner`
with a blank instruction:
$ time rails runner ""
real 0m2.053s
user 0m1.666s
sys 0m0.379s
$ memory_profiler.sh rails runner ""
Peak: 109240
In other words, it takes about two seconds and 100MB to boot up the app.
We calculate memory usage with a modified version of [this Unix
script](http://stackoverflow.com/a/1269490).
Now, consider a TimeEntry model in our time tracking application (of
which there are 314,420 in my local database). Let's say we need a list
of the dates of every single time entry in the system. A naïve approach
would look something like this:
dates = TimeEntry.all.map { |entry| entry.logged_on }
It works, but seems a little slow:
$ time rails runner "TimeEntry.all.map { |entry| entry.logged_on }"
real 0m14.461s
user 0m12.824s
sys 0m0.994s
Almost 14.5 seconds. Not exactly webscale. And how about RAM usage?
$ memory_profiler.sh rails runner "TimeEntry.all.map { |entry| entry.logged_on }"
Peak: 1252180
About 1.25 gigabytes of RAM. Now, what if we use `.pluck` instead?
dates = TimeEntry.pluck(:logged_on)
In terms of time, we see major improvements:
$ time rails runner "TimeEntry.pluck(:logged_on)"
real 0m4.123s
user 0m3.418s
sys 0m0.529s
So from roughly 15 seconds to about four. Similarly, for memory usage:
$ memory_profiler.sh bundle exec rails runner "TimeEntry.pluck(:logged_on)"
Peak: 384636
From 1.25GB to less than 400MB. When we subtract the overhead we
calculated earlier, we're going from 15 seconds of execution time to
two, and 1.15GB of RAM to 300MB.
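The shape of the win can be reproduced without Rails at all. In this plain-Ruby sketch (a Struct stands in for an ActiveRecord model, raw arrays for rows from the database driver), wrapping every row in an object just to read one field is pure overhead:

```ruby
require "benchmark"

# Entry plays the role of an ActiveRecord model; rows plays the role
# of raw result rows from the database adapter.
Entry = Struct.new(:id, :logged_on)
rows = Array.new(300_000) { |i| [i, "2014-01-#{(i % 28) + 1}"] }

Benchmark.bm(16) do |x|
  x.report("map over models") { rows.map { |r| Entry.new(*r) }.map(&:logged_on) }
  x.report("pluck analogue")  { rows.map { |r| r[1] } }
end
```

Both produce the same array of dates; only the first pays to build 300,000 intermediate objects.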
## Using SQL Fragments {#usingsqlfragments}
As you might imagine, there's a lot of duplication among the dates on
which time entries are logged. What if we only want unique values? We'd
update our naïve approach to look like this:
dates = TimeEntry.all.map { |entry| entry.logged_on }.uniq
When we profile this code, we see that it performs slightly worse than
the non-unique version:
$ time rails runner "TimeEntry.all.map { |entry| entry.logged_on }.uniq"
real 0m15.337s
user 0m13.621s
sys 0m1.021s
$ memory_profiler.sh rails runner "TimeEntry.all.map { |entry| entry.logged_on }.uniq"
Peak: 1278784
Instead, let's take advantage of `.pluck`'s ability to take a SQL
fragment rather than a symbolized column name:
dates = TimeEntry.pluck("DISTINCT logged_on")
Profiling this code yields surprising results:
$ time rails runner "TimeEntry.pluck('DISTINCT logged_on')"
real 0m2.133s
user 0m1.678s
sys 0m0.369s
$ memory_profiler.sh rails runner "TimeEntry.pluck('DISTINCT logged_on')"
Peak: 107984
Both running time and memory usage are virtually identical to executing
the runner with a blank command, or, in other words, the result is
calculated at an incredibly low cost.
## Using `.pluck` Across Tables {#using.pluckacrosstables}
Requirements have changed, and now, instead of an array of timestamps,
we need an array of two-element arrays consisting of the timestamp and
the employee's last name, stored in the "employees" table. Our naïve
approach then becomes:
dates = TimeEntry.all.map { |entry| [entry.logged_on, entry.employee.last_name] }
Go grab a cup of coffee, because this is going to take a while.
$ time rails runner "TimeEntry.all.map { |entry| [entry.logged_on, entry.employee.last_name] }"
real 7m29.245s
user 6m52.136s
sys 0m15.601s
$ memory_profiler.sh rails runner "TimeEntry.all.map { |entry| [entry.logged_on, entry.employee.last_name] }"
Peak: 3052592
Yes, you're reading that correctly: 7.5 minutes and 3 gigs of RAM. We
can improve performance somewhat by taking advantage of ActiveRecord's
[eager
loading](http://guides.rubyonrails.org/active_record_querying.html#eager-loading-associations)
capabilities.
dates = TimeEntry.includes(:employee).map { |entry| [entry.logged_on, entry.employee.last_name] }
Benchmarking this code, we see significant performance gains, since
we're going from over 300,000 SQL queries to two.
$ time rails runner "TimeEntry.includes(:employee).map { |entry| [entry.logged_on, entry.employee.last_name] }"
real 0m21.270s
user 0m19.396s
sys 0m1.174s
$ memory_profiler.sh rails runner "TimeEntry.includes(:employee).map { |entry| [entry.logged_on, entry.employee.last_name] }"
Peak: 1606204
Faster (from 7.5 minutes to 21 seconds), but certainly not fast enough.
Finally, with `.pluck`:
dates = TimeEntry.includes(:employee).pluck(:logged_on, :last_name)
Benchmarks:
$ time rails runner "TimeEntry.includes(:employee).pluck(:logged_on, :last_name)"
real 0m4.180s
user 0m3.414s
sys 0m0.543s
$ memory_profiler.sh rails runner "TimeEntry.includes(:employee).pluck(:logged_on, :last_name)"
Peak: 407912
A hair over 4 seconds execution time and 400MB RAM -- hardly any more
expensive than without employee names.
## Conclusion
- Prefer `.pluck` to instantiating a collection of ActiveRecord
objects and then using `.map` to build an array of attributes.
- `.pluck` can do more than simply pull back attributes on a single
table: it can run SQL functions, pull attributes from joined tables,
and tack on to any scope.
- Whenever possible, let the database do the heavy lifting.
---
title: "Practical Uses of Ruby Blocks"
date: 2010-10-25T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/practical-uses-of-ruby-blocks/
---
Blocks are one of Ruby's defining features, and though we use them all
the time, a lot of developers are much more comfortable calling methods
that take blocks than writing them. Which is a shame, really, as
learning to use blocks in a tasteful manner is one of the best ways to
up your Ruby game. Here are a few examples extracted from a recent
project to give you a few ideas.
## `if_present?`
Often, I'll want to assign a result to a variable and then execute a
block of code if that variable has a value. Here's the most
straightforward implementation:
    user = User.find_by_login(login)

    if user
      ...
    end
Some people like to inline the assignment and conditional, but this
makes me ([and Ben](https://www.viget.com/extend/a-confusing-rubyism/))
stabby:
    if user = User.find_by_login(login)
      ...
    end
To keep things concise *and* understandable, let's write a method on
`Object` that takes a block:
    class Object
      def if_present?
        yield self if present?
      end
    end
This way, we can just say:
    User.find_by_login(login).if_present? do |user|
      ...
    end
We use Rails' [present?](http://apidock.com/rails/Object/present%3F)
method rather than an explicit `nil?` check to ignore empty collections
and strings.
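Outside of Rails you have to supply `present?` yourself. Here's a self-contained sketch (the `present?` shim below is a simplification of Rails' `!blank?` version -- it doesn't treat whitespace-only strings as blank):

```ruby
class Object
  # Simplified stand-in for Rails' present?: false for nil and for
  # anything that responds to empty? and is empty.
  def present?
    respond_to?(:empty?) ? !empty? : !nil?
  end

  def if_present?
    yield self if present?
  end
end

"dave".if_present? { |login| puts "Welcome, #{login}" }  # block runs
"".if_present?     { |login| puts "never printed" }      # skipped: empty
nil.if_present?    { |login| puts "never printed" }      # skipped: nil
```

Because `if_present?` returns the block's value (or nil), it also composes nicely with further method calls.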
## `if_multiple_pages?`
Methods that take blocks are a great way to wrap up complex conditional
logic. I often have to generate pagination and previous/next links for
JavaScript-powered scrollers, which involves calculating the number of
pages and then, if there are multiple pages, displaying the links.
Here's a helper that calculates the number of pages and then passes the
page count into the provided block:
    def if_multiple_pages?(collection, per_page = 10)
      pages = (collection.size / (per_page || 10).to_f).ceil
      yield pages if pages > 1
    end
Use it like so:
    <% if_multiple_pages? Article.published do |pages| %>
      <ol>
        <% 1.upto(pages) do |page| %>
          <li><%= link_to page, "#" %></li>
        <% end %>
      </ol>
    <% end %>
## `list_items_for`
As you saw above, Rails helpers that take blocks can help create more
elegant view code. Things get tricky when you want your helpers to
output markup, though. Here's a helper I made to create list items for a
collection with "first" and "last" classes on the appropriate elements:
    def list_items_for(collection, opts = {}, &block)
      opts.reverse_merge!(:first_class => "first", :last_class => "last")

      concat(collection.map { |item|
        html_class = [
          opts[:class],
          (opts[:first_class] if item == collection.first),
          (opts[:last_class] if item == collection.last)
        ]

        content_tag :li, capture(item, &block), :class => html_class.compact * " "
      }.join)
    end
Here it is in use:
    <% list_items_for Article.published.most_recent(4) do |article| %>
      <%= link_to article.title, article %>
    <% end %>
Which outputs the following:
    <li class="first"><a href="/articles/4">Article #4</a></li>
    <li><a href="/articles/3">Article #3</a></li>
    <li><a href="/articles/2">Article #2</a></li>
    <li class="last"><a href="/articles/1">Article #1</a></li>
Rather than yield, `list_items_for` uses
[concat](http://apidock.com/rails/ActionView/Helpers/TextHelper/concat)
and
[capture](http://apidock.com/rails/ActionView/Helpers/CaptureHelper/capture)
in order to get the generated markup where it needs to be.
Opportunities to use blocks in your code are everywhere once you start
to look for them, whether in simple cases, like the ones outlined above,
or more complex ones, like Justin's [block/exception tail call
optimization technique](https://gist.github.com/645951). If you've got
any good uses of blocks in your own work, put them in a
[gist](https://gist.github.com/) and link them up in the comments.
---
title: "Protip: TimeWithZone, All The Time"
date: 2008-09-10T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/protip-timewithzone-all-the-time/
---
If you've ever tried to retrieve a list of ActiveRecord objects based on
their timestamps, you've probably been bitten by the quirky time support
in Rails:
    >> Goal.create(:description => "Run a mile")
    => #<Goal id: 1, description: "Run a mile", created_at: "2008-09-09 19:32:57", updated_at: "2008-09-09 19:32:57">
    >> Goal.find(:all, :conditions => ['created_at < ?', Time.now])
    => []
Huh? Checking the logs, we see that the two commands above correspond to
the following queries:
    INSERT INTO "goals" ("updated_at", "description", "created_at") VALUES('2008-09-09 19:32:57', 'Run a mile', '2008-09-09 19:32:57')

    SELECT * FROM "goals" WHERE created_at < '2008-09-09 15:33:17'
Rails stores `created_at` relative to [Coordinated Universal
Time](https://en.wikipedia.org/wiki/Coordinated_Universal_Time), while
`Time.now` is based on the system clock, running four hours behind. The
solution? ActiveSupport's
[TimeWithZone](http://caboo.se/doc/classes/ActiveSupport/TimeWithZone.html):
    >> Goal.find(:all, :conditions => ['created_at < ?', Time.zone.now])
    => [#<Goal id: 1, description: "Run a mile", created_at: "2008-09-09 19:32:57", updated_at: "2008-09-09 19:32:57">]
**Rule of thumb:** always use TimeWithZone in your Rails projects. Date,
Time and DateTime simply don't play well with ActiveRecord. Instantiate
it with `Time.zone.now` and `Time.zone.local`. To discard the time
element, use `beginning_of_day`.
## BONUS TIP {#bonus_protip}
Since it's a subclass of Time, expanding a range of TimeWithZone
objects to an array fills in every second between the two times --- not
so useful if you need a date for every day in a month:
    >> t = Time.zone.now
    => Tue, 09 Sep 2008 14:26:45 EDT -04:00
    >> (t..(t + 1.month)).to_a.size
    [9 minutes later]
    => 2592001
Fortunately, the desired behavior is just a monkeypatch away:
    class ActiveSupport::TimeWithZone
      def succ
        self + 1.day
      end
    end

    >> (t..(t + 1.month)).to_a.size
    => 31
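Plain `Date` objects already behave the way we want, since `Date#succ` steps a day at a time -- the patch above simply gives `TimeWithZone` the same stride:

```ruby
require "date"

# Date#succ advances one day, so a Date range enumerates days, not
# seconds; (d >> 1) is the same date one month later.
d = Date.new(2008, 9, 9)
puts (d..(d >> 1)).to_a.size  # => 31
```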
For more information about time zones in Rails, [Geoff
Buesing](http://mad.ly/2008/04/09/rails-21-time-zone-support-an-overview/)
and [Ryan
Daigle](http://ryandaigle.com/articles/2008/1/25/what-s-new-in-edge-rails-easier-timezones)
have good, up-to-date posts.
---
title: "PUMA on Redis"
date: 2011-07-27T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/puma-on-redis/
---
A few weeks ago, we celebrated the launch of [the new
PUMA.com](https://www.viget.com/blog/relaunching-pumacom-startup-style/),
the culmination of a nearly two-year effort here at Viget. The whole
site is driven by a CMS written in Rails, and I'm very proud of the
technological platform we've developed. I want to focus on one piece of
that platform, [Redis](http://redis.io/), and how it makes the site both
rock solid and screaming fast.
## Fragment Caching
The app was initially created to serve category marketing sites like
[Running](http://www.puma.com/running) and
[Football](http://www.puma.com/football). When we set out to overhaul it
to serve the main PUMA site, we knew performance was of paramount
importance. We made extensive use of fragment caching throughout the
site, using Redis as our cache store. [Some
claim](http://stackoverflow.com/questions/4221735/rails-and-caching-is-it-easy-to-switch-between-memcache-and-redis/4342279#4342279)
that Redis is not as well suited for this purpose as Memcached, but it
held up well in our pre-launch testing and continues to perform well in
production.
We used Redis as our cache store for two reasons. First, we were already
using it for other purposes, so reusing it kept the technology stack
simpler. But more importantly, Redis' wildcard key matching makes cache
expiration a snap. It's well known that cache expiration is one of [two
hard things in computer
science](http://martinfowler.com/bliki/TwoHardThings.html), but using
wildcard key searching, it's dirt simple to pull back all keys that
begin with "views" and contain the word "articles" and expire them
every time an article is changed. Memcached has no such ability.
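The idea looks roughly like this -- a sketch with a plain Hash standing in for Redis (against a real server you'd use the redis gem's `keys` and `del` commands with a glob pattern):

```ruby
# A Hash plays the cache here so the shape of wildcard expiration is
# visible; in Redis this would be redis.del(*redis.keys("views/*articles*")).
cache = {
  "views/articles/1/show" => "<html>…</html>",
  "views/articles/2/show" => "<html>…</html>",
  "views/products/9/show" => "<html>…</html>",
}

def expire_matching(cache, pattern)
  cache.keys.grep(pattern).each { |key| cache.delete(key) }
end

expire_matching(cache, /\Aviews\/.*articles/)
p cache.keys  # only the products fragment survives
```

(Note that Redis' `KEYS` scans the whole keyspace, so it's best reserved for expiration-style housekeeping rather than hot-path lookups.)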
## API Caching
The PUMA site leverages third-party APIs to pull in product
availability, retail store information, and marketing campaigns, among
other things. External APIs are good for only two things: being slow and
returning unexpected results. In a defensive masterstroke, we developed
[CacheBar](https://github.com/vigetlabs/cachebar) to keep our responses
speedy and stable.
CacheBar sits between [HTTParty](http://httparty.rubyforge.org/) and the
web. When it receives a successful response, it stores it in Redis in
two places: as a normal string value with an expiration set on a per-API
basis (usually between an hour and a day) and in a hash of all that
API's responses. When the primary key expires, we attempt to fetch the
data from the API. Successful responses are again stored in both
locations, but if the response is unsuccessful, we pull the saved
response from the hash and set it as the value for the primary key with
a five-minute expiration. This way, we avoid the backup that happens as
a result of too many slow responses.
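A stripped-down sketch of that flow, with in-memory hashes standing in for the two Redis structures (the class and method names here are illustrative, not CacheBar's actual API):

```ruby
class ApiCache
  def initialize(fetcher)
    @fetcher = fetcher  # callable that hits the real API
    @primary = {}       # stands in for the expiring Redis key
    @backup  = {}       # stands in for the no-expiration Redis hash
  end

  def get(url)
    return @primary[url] if @primary.key?(url)

    if (response = @fetcher.call(url))
      @backup[url] = response   # remember the last good answer
      @primary[url] = response  # real code: SET with a per-API TTL
    else
      # API failed: serve the saved response, re-cached briefly
      # (five minutes in CacheBar) so we don't hammer the API.
      @primary[url] = @backup[url]
    end
  end

  # Test hook standing in for the Redis key's TTL elapsing.
  def expire!(url)
    @primary.delete(url)
  end
end
```

The effect is that a flaky API can go down for a while and the app keeps serving the last good response, retrying at a gentle interval.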
More information is available on the [CacheBar GitHub
page](https://github.com/vigetlabs/cachebar).
## Data Structures
The PUMA app uses Redis' hashes, lists, and sets (sorted and unsorted)
as well as normal string values. Having all these data structures at our
disposal has proven incredibly useful, not to mention damn fun to use.
* * *
Redis has far exceeded my expectations in both usefulness and
performance. Add it to your stack, and you'll be amazed at the ways it
can make your app faster and more robust.
If you're in North Carolina's Triangle region and you'd like to hear
more about the PUMA project, come out to tomorrow night's [Refresh the
Triangle](http://refreshthetriangle.org/) meeting, where I'll be talking
about this stuff alongside several other team members.
---
title: "Rails Admin Interface Generators"
date: 2011-05-31T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/rails-admin-interface-generators/
---
Here at Viget, we're always looking for ways to reduce duplicated
effort, and one component that nearly every single one of our
applications needs is an admin interface. As such, we've spent a lot of
time trying to find the perfect drop-in admin interface generator. We've
been happy with [Typus](https://github.com/fesplugas/typus) for the past
year or two and have been able to contribute back to the project on a
[number](https://github.com/fesplugas/typus/commit/3fb53f58ce606ae80beaa712eef81dcf0d6b03bc)
[of](https://github.com/fesplugas/typus/commit/b6ead488b218d187f948e85ec70c3b01a589ebae)
[occasions](https://github.com/fesplugas/typus/commit/00b7b47ebd97a630623e80c006ef5401060bd848).
Lately, though, a pair of new libraries have been making some noise:
[ActiveAdmin](http://activeadmin.info/) and
[RailsAdmin](https://github.com/sferik/rails_admin). How do they stack
up to Typus? Read on, friend.
### Typus
[Typus](https://github.com/fesplugas/typus) is a library by Francesc
Esplugas originally started in 2007. Typus takes a different approach
than the other two libraries in that it uses generated controllers that
live in your Rails app to serve up its pages, rather than keeping all of
its application code within the library. This approach offers (in this
author's opinion) increased extensibility at the expense of code
duplication --- it's dirt simple to override the (e.g.) `create` action
in your `Admin::CommentsController` when the need arises, but you'll
still need a separate controller for every model where the default
behavior is good enough.
Installing Typus is very straightforward: add the gem to your Gemfile,
bundle it, run `rails generate typus` to get a basic admin interface up,
then run `rails generate typus:migration` to get user authentication.
The authors of the plugin recently fixed one of my biggest gripes,
adding generators to make adding new admin controllers a snap.
Configuration is all handled by a few YAML files. In terms of looks,
Typus isn't going to win any awards out of the box, but they've made it
very simple to copy the views into your app's `views/` folder, where
you're free to override them.
### ActiveAdmin
I just heard about [ActiveAdmin](http://activeadmin.info/) from Peter
Cooper's [Ruby Weekly](http://rubyweekly.com/) newsletter, though the
project was started in April 2010. Before anything else, you have to
admit that the project homepage looks pretty nice, and I'm happy to
report that that same attention to aesthetics carries into the project
itself. Configuration files for each model in the admin interface are
written in Ruby and live in `app/admin`. It's clear that a lot of
thought has gone into the configuration API, and the [Github
page](https://github.com/gregbell/active_admin) contains thorough
documentation for how to use it.
I've long been jealous of [Django](https://www.djangoproject.com/)'s
generated admin interfaces, and ActiveAdmin is the first Rails project
I've seen that can rival it in terms of overall slickness, both from a
UI and development standpoint. The trouble with libraries that give you
so much out of the box is that it's often difficult to do things that
the authors didn't anticipate, and I'd need to spend more than an hour
with ActiveAdmin in order to determine if that's the case here.
### RailsAdmin
[RailsAdmin](https://github.com/sferik/rails_admin) is another recent
entry into the admin interface generator space, beginning as a Ruby
Summer of Code project in August of last year. I had some difficulty
getting it installed, finally having success after pointing the Gemfile
entry at the GitHub repository
(`gem 'rails_admin', :git => 'git://github.com/sferik/rails_admin.git'`).
The signup process was similarly unpolished, outsourcing entirely to
[Devise](https://github.com/plataformatec/devise) to the point that
anyone can navigate to `/users/sign_up` in your application and become
an admin.
Once inside the admin interface, things seem to work pretty well.
There's some interesting functionality available for associating models,
and the dashboard has some neat animated graphs. I'll be curious to
watch this project as it develops --- if they can smooth off some of the
rough edges, I think they'll really have something.
### Conclusion
[ActiveAdmin](http://activeadmin.info/) offers an incredibly slick
out-of-the-box experience; [Typus](https://github.com/fesplugas/typus)
seems to offer more ways to override default behavior. If I were
starting a new project today, my choice would depend on how much
customization I expected to need over the life of the project. I put a
small project called
[rails_admin_interfaces](https://github.com/dce/rails_admin_interfaces)
on GitHub with branches for each of these libraries so you can try them
out and draw your own conclusions.
---
title: "Refresh 006: Dr. jQuery"
date: 2008-04-28T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/refresh-006-dr-jquery/
---
This past Thursday night saw the sixth meeting of [Refresh the
Triangle](http://refreshthetriangle.org/), the local chapter of the
Refresh tech network that Viget's helping to organize. [Nathan
Huening](http://onwired.com/about/nathan-huening/) from
[OnWired](http://onwired.com/) gave a great talk called "Dr. jQuery (Or,
How I Learned to Stop Worrying and Love the [DOM]{.caps})," and his
passion for the material was evident. In a series of increasingly
complex examples, Nathan showed off the power and simplicity of the
[jQuery](http://jquery.com/) JavaScript library. He demonstrated that
most of jQuery can be reduced to "grab things, do stuff," starting with
simple [CSS]{.caps} modifications and moving to [AJAX]{.caps},
animation, and custom functionality.
To get a good taste of the presentation, you can use
[FireBug](http://www.getfirebug.com/) to run Nathan's [sample
code](http://dev.onwired.com/refresh/examples.js) against the [demo
page](http://dev.onwired.com/refresh/) he set up. You'll want to be
running [Firefox 2](http://www.getfirefox.com/), as [FF3]{.caps} Beta 5
gave me a lot of grief while I tried to follow Nathan's examples.
Big thanks to Nathan and to Duke's [Blackwell
Interactive](http://www.blackwell.duke.edu/) for hosting the event, as
well as to everyone who came out; maybe we've got you pictured on our
[Flickr](http://www.flickr.com/photos/refreshthetriangle/sets/72157604778999205/)
page.
Hope to see you next month.
---
title: "Refresh Recap: The Future of Data"
date: 2009-09-25T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/refresh-recap-the-future-of-data/
---
Last night's [Refresh the Triangle](http://refreshthetriangle.org/)
meeting featured a fantastic talk from our own [Ben
Scofield](https://www.viget.com/about/team/bscofield) titled "The Future
of Data." Hosted by the kind folks at [Bronto](http://bronto.com/), it
focused on HTML5 client-side storage as well as alternatives to
traditional relational databases. Client-side storage has a lot of
interesting implications, both in terms of improving website performance
by caching user-specific information locally, as well as building
standalone offline applications that run in the browser. On the server
side, the ["NoSQL" movement](https://en.wikipedia.org/wiki/Nosql) has
been gaining a lot of attention among the web development community, and
Ben stressed that all data storage systems have their roles, from simple
key-value stores to sophisticated graph databases.
Ben's given a similar talk at [several](http://www.rubynation.org/)
[technical](http://developer-day.com/events/2009-boston.html)
[conferences](http://windycityrails.org), and he did a great job
refitting the presentation to suit the Refresh crowd of designers,
developers, and "other." A lively discussion followed the talk and
continued at [Tyler's](http://www.tylerstaproom.com/restaurants/durham)
downstairs. Big thanks to Bronto for their gracious hosting, to Ben for
a great talk, and to everyone who came out. To learn more, you can check
out Ben's
[slides](http://www.slideshare.net/bscofield/the-future-of-data) and his
notes on his [personal
site](http://benscofield.com/2009/09/refreshing-the-triangle/). If
you're in the triangle area, we'd love to see you at
[Refresh](http://refreshthetriangle.org/) next month!

---
title: "Regular Expressions in MySQL"
date: 2011-09-28T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/regular-expressions-in-mysql/
---
Did you know MySQL supports using [regular
expressions](https://en.wikipedia.org/wiki/Regular_expression) in
`SELECT` statements? I'm surprised at the number of developers who
don't, despite using SQL and regexes on a daily basis. That's not to say
that putting a regex into your SQL should be a daily occurrence. In
fact, it can [cause more problems than it
solves](https://en.wikiquote.org/wiki/Jamie_Zawinski#Attributed), but
it's a handy tool to have in your belt under certain circumstances.
## Basic Usage
Regular expressions in MySQL are invoked with the
[`REGEXP`](http://dev.mysql.com/doc/refman/5.1/en/regexp.html) keyword,
aliased to `RLIKE`. The most basic usage is a hardcoded regular
expression in the right hand side of a conditional clause, e.g.:
    SELECT * FROM users WHERE email RLIKE '^[a-c].*[0-9]@';
This SQL would grab every user whose email address begins with 'a', 'b',
or 'c' and has a number as the final character of its local portion.
## Something More Advanced
The regex used with `RLIKE` does not need to be hardcoded into the SQL
statement, and can *in fact* be a column in the table being queried. In
a recent project, we were tasked with creating an interface for managing
redirect rules à la
[mod_rewrite](http://httpd.apache.org/docs/current/mod/mod_rewrite.html).
We were able to do the entire match in the database, using SQL like this
(albeit with a few more joins, groups and orders):
    SELECT * FROM redirect_rules WHERE '/news' RLIKE pattern;
In this case, '/news' is the incoming request path and `pattern` is the
column that stores the regular expression. In our benchmarks, we found
this approach to be much faster than doing the regular expression
matching in Ruby, mostly because of the lack of ActiveRecord overhead.
## Caveats
Using regular expressions in your SQL has the potential to be slow.
These queries can't use indexes, so a full table scan is required. If
you can get away with using `LIKE`, which has some regex-like
functionality, you should. As always: benchmark, benchmark, benchmark.
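For instance (an illustrative sketch reusing the `users` table from the
examples above), an anchored prefix match can be written either way, but
only the `LIKE` version can take advantage of an index on `email`:

    SELECT * FROM users WHERE email LIKE 'a%';   -- can use an index
    SELECT * FROM users WHERE email RLIKE '^a';  -- full table scan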
Additionally, MySQL supports
[POSIX](https://en.wikipedia.org/wiki/POSIX) regular expressions, not
[PCRE](http://www.pcre.org/) like Ruby. There are things (like negative
lookaheads) that you simply can't do, though you probably ought not to
be doing them in your SQL anyway.
## In PostgreSQL
Support for regular expressions in PostgreSQL is similar to that of
MySQL, though the syntax is different (e.g. `email ~ '^a'` instead of
`email RLIKE '^a'`). What's more, Postgres contains some useful
functions for working with regular expressions, like `substring` and
`regexp_replace`. See the
[documentation](http://www.postgresql.org/docs/9.0/static/functions-matching.html)
for more information.
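As a quick sketch of what those look like (the `email` and `phone`
columns here are assumptions for illustration, not from a real schema):

    -- grab the local portion of an email address
    SELECT substring(email FROM '^[^@]+') FROM users;

    -- strip every non-digit character from a phone number
    SELECT regexp_replace(phone, '[^0-9]', '', 'g') FROM users;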
## Conclusion
In certain circumstances, regular expressions in SQL are a handy
technique that can lead to faster, cleaner code. Don't use `RLIKE` when
`LIKE` will suffice, and be sure to benchmark your queries with datasets
similar to the ones you'll be facing in production.

---
title: "Required Fields Should Be Marked NOT NULL"
date: 2014-09-25T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/required-fields-should-be-marked-not-null/
---
*Despite some exciting advances in the field, like
[Node](http://nodejs.org/), [Redis](http://redis.io/), and
[Go](https://golang.org/), a well-structured relational database fronted
by a Rails or Sinatra (or Django, etc.) app is still one of the most
effective toolsets for building things for the web. In the coming weeks,
I'll be publishing a series of posts about how to be sure that you're
taking advantage of all your RDBMS has to offer.*
A "NOT NULL constraint" enforces that a database column does not accept
null values. Null, according to
[Wikipedia](https://en.wikipedia.org/wiki/Null_(SQL)), is
> a special marker used in Structured Query Language (SQL) to indicate
> that a data value does not exist in the database. Introduced by the
> creator of the relational database model, E. F. Codd, SQL Null serves
> to fulfill the requirement that all true relational database
> management systems (RDBMS) support a representation of "missing
> information and inapplicable information."
One could make the argument that null constraints in the database are
unnecessary, since Rails includes the `presence` validation. What's
more, the `presence` validation handles blank (e.g. empty string) values
that null constraints do not. For several reasons that I will lay out
through the rest of this section, I contend that null constraints and
presence validations should not be mutually exclusive, and in fact, **if
an attribute's presence is required at the model level, its
corresponding database column should always require a non-null value.**
## Why use non-null columns for required fields? {#whyusenon-nullcolumnsforrequiredfields}
### Data Confidence {#dataconfidence}
The primary reason for using NOT NULL constraints is to have confidence
that your data has no missing values. Simply using a `presence`
validation offers no such confidence. For example,
[`update_attribute`](http://apidock.com/rails/ActiveRecord/Persistence/update_attribute)
ignores validations, as does `save` if you call it with the
[`validate: false`](http://apidock.com/rails/v4.0.2/ActiveRecord/Persistence/save)
option. Additionally, database migrations that manipulate the schema
with raw SQL using `execute` bypass validations.
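As a minimal sketch of those escape hatches (assuming an `Employee`
model with a `name` presence validation; this only runs inside a Rails
app):

``` {#code .ruby}
class Employee < ActiveRecord::Base
  validates :name, presence: true
end

employee = Employee.first

# update_attribute skips validations entirely -- a nil name goes
# straight to the database:
employee.update_attribute(:name, nil)

# So does save with validations explicitly disabled:
employee.name = nil
employee.save(validate: false)

# Without a NOT NULL constraint on employees.name, both calls happily
# persist a record that the presence validation would have rejected.
```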
### Undefined method 'foo' for nil:NilClass {#undefinedmethodfoofornil:nilclass}
One of my biggest developer pet peeves is seeing a
`undefined method 'foo' for nil:NilClass` come through in our error
tracking service du jour. Someone assumed that a model's association
would always be present, and one way or another, that assumption turned
out to be false. The merits of the [Law of
Demeter](https://en.wikipedia.org/wiki/Law_of_Demeter) are beyond the
scope of this post, but suffice it to say that if you're going to say
something like `@athlete.team.name` in your code, you better be damn
sure that a) the athlete's `team_id` has a value and b) it corresponds
to the ID of an actual team. We'll get to that second bit in our
discussion of foreign key constraints in a later post, but the first
part, ensuring that `team_id` has a value, demands a `NOT NULL` column.
### Migration Issues {#migrationissues}
Another benefit of using `NOT NULL` constraints is that they force you
to deal with data migration issues. Suppose a change request comes in to
add a required `age` attribute to the `Employee` model. The easy
approach would be to add the column, allow it to be null, and add a
`presence` validation to the model. This works fine for new employees,
but all of your existing employees are now in an invalid state. If, for
example, an employee then attempts a password reset, updating their
`password_reset_token` field would fail due to the missing age value.
If you'd created the `age` column to require a non-null value, you would
have been forced to deal with the existing employees immediately and
thus avoided this issue. Granted, there may be no obvious value to fill
in for the existing employees' ages, but it's better to have that
discussion at development time than to spend weeks or months dealing
with the fallout of invalid records in the system.
* * *
I hope I've laid out a case for using non-null constraints for all
required database fields for great justice. In the next post, I'll show
the proper way to add non-null columns to existing tables.

---
title: "Romanize: Another Programming Puzzle"
date: 2015-03-06T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/romanize-another-programming-puzzle/
---
We had such a good time working through our [first programming
challenge](https://viget.com/extend/otp-a-language-agnostic-programming-challenge)
that we decided to put another one together. We had several ideas, but
[Pat](https://viget.com/about/team/preagan)'s idea of converting to and
from Roman numerals won out, and a few hours later,
[Romanize](https://github.com/vigetlabs/romanize) was born.
The name of the game is to write, in a language of your choice, a pair
of programs that work like this:
    > ./deromanize I
    1
    > ./deromanize II
    2
    > ./deromanize MCMIV
    1904
    > ./romanize 1
    I
    > ./romanize 2
    II
    > ./romanize 1904
    MCMIV
It's a deceptively difficult problem, especially if, like me, you only
understand how Roman numerals work in the vaguest sense. And it's one
thing to create a solution that passes the test suite, and another
entirely to write something concise and elegant -- going from Arabic to
Roman, especially, seems to defy refactoring.
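To give a flavor of the greedy approach (this is a sketch of the general
technique, not one of the repo's solutions), here's both directions in
a few lines of Ruby:

```ruby
# Greedy conversion: repeatedly peel off the largest Roman value that
# fits; subtractive pairs like CM and IV get their own entries.
NUMERALS = {
  1000 => "M", 900 => "CM", 500 => "D", 400 => "CD",
  100  => "C", 90  => "XC", 50  => "L", 40  => "XL",
  10   => "X", 9   => "IX", 5   => "V", 4   => "IV",
  1    => "I"
}.freeze

def romanize(number)
  NUMERALS.reduce("") do |roman, (value, numeral)|
    count, number = number.divmod(value)
    roman + numeral * count
  end
end

def deromanize(roman)
  values = { "M" => 1000, "D" => 500, "C" => 100, "L" => 50,
             "X" => 10, "V" => 5, "I" => 1 }
  digits = roman.chars.map { |c| values.fetch(c) }
  # A digit counts as negative when it precedes a larger one (IV, CM, ...)
  digits.each_cons(2).sum { |a, b| a < b ? -a : a } + digits.last
end
```

The subtractive cases are where most solutions get ugly; listing them as
first-class "numerals" keeps the Arabic-to-Roman direction to a single
reduce.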
We've created working solutions in ~~seven~~ ~~eight~~ ~~nine~~ ten
languages:
- C (via [Steve132](https://github.com/Steve132))
- Clojure
- Elixir
- Go
- Haskell (plus check out [this cool
thing](https://gist.github.com/sgronblo/e3d73a61c5dd968b7d29) from
[sgronblo](https://github.com/sgronblo) using QuickCheck and Parsec)
- OCaml
- Node.js (via [Xylem](https://github.com/Xylem))
- PHP
- Ruby
- Swift (shout out to [wasnotrice](https://github.com/wasnotrice))
What's gonna be number eleven? **You decide!** [Fork the
repo](https://github.com/vigetlabs/romanize) and give it your best shot.
When you're done, send us a PR.

---
title: "RubyInline in Shared Rails Environments"
date: 2008-05-23T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/rubyinline-in-shared-rails-environments/
---
As an end-to-end web production company, we have a vested interest in
making Rails applications easier to deploy for both development and
production purposes. We've developed
[Tyrant](http://trac.extendviget.com/tyrant/wiki), a Rails app for
running Rails apps, and we're [eagerly
watching](https://www.viget.com/extend/passenger-let-it-ride/) as new
solutions are created and refined.
But it's a new market, and current solutions are not without their share
of obstacles. In working with both Tyrant and [Phusion
Passenger](http://www.modrails.com/), we've encountered difficulties
running applications that use
[RubyInline](http://www.zenspider.com/ZSS/Products/RubyInline/) to embed
C into Ruby code (e.g.,
[ImageScience](http://seattlerb.rubyforge.org/image_science/classes/ImageScience.html),
my image processing library of choice). Try to start up an app that uses
RubyInline code in a shared environment, and you might encounter the
following error:
/Library/Ruby/Gems/1.8/gems/RubyInline-3.6.7/lib/inline.rb:325:in `mkdir': Permission denied - /home/users/www-data/.ruby_inline (Errno::EACCES)
RubyInline uses the home directory of the user who started the server to
compile the inline code; problems occur when the current process is
owned by a different user. "Simple," you think. "I'll just open that
directory up to everybody." Not so fast, hotshot. Try to start the app
again, and you get the following:
/home/users/www-data/.ruby_inline is insecure (40777). It may not be group or world writable. Exiting.
Curses! Fortunately, VigetExtend is here to help. Drop this into your
environment-specific config file:
``` {#code .ruby}
require 'tempfile'

temp = Tempfile.new('ruby_inline', '/tmp')
dir = temp.path
temp.delete
Dir.mkdir(dir, 0755)
ENV['INLINEDIR'] = dir
```
We use the [Tempfile](http://ruby-doc.org/core/classes/Tempfile.html)
library to generate a guaranteed-unique filename in the `/tmp`
directory, prepended with "ruby_inline." After storing the filename, we
delete the tempfile and create a directory with the proper permissions
in its place. We then store the directory path in the `INLINEDIR`
environment variable, so that RubyInline knows to use it to compile.

---
title: "Sessions on PCs and Macs"
date: 2009-02-09T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/sessions-on-pcs-and-macs/
---
When switching from Windows to a Mac, one thing that takes some getting
used to is the difference between closing and quitting a program. On the
Mac, as one [Mac-Forums poster puts
it](http://www.mac-forums.com/forums/switcher-hangout/99903-does-pushing-red-gel-button-really-close-application.html),
"To put it simply...you *close* windows. You *quit* applications."
Windows draws [no such
distinction](http://www.macobserver.com/article/2008/07/03.6.shtml#435860)
--- the application ends when its last window is closed. This may not
seem like much of a difference, but it has serious potential
ramifications when dealing with browsers and sessions; to quote the
[Ruby on Rails
wiki](http://wiki.rubyonrails.org/rails/pages/HowtoChangeSessionOptions):
> You can control when the current session will expire by setting the
> :session_expires value with a Time object. **If not set, the session
> will terminate when the user's browser is closed.**
In other words, if you use the session to persist information like login
state, the user experience for an out-of-the-box Rails app is
dramatically different depending on what operating system is used to
access it (all IE jokes aside). I probably quit my browser three times a
week, whereas I close all browser windows closer to three times an hour.
Were I running Windows, this might not be an option.
On my two most recent projects, I've used Adam Salter's [Sliding
Sessions
plugin](https://github.com/adamsalter/sliding_sessions/tree/master),
which allows me to easily set the duration of the session during every
request. This way, I can set the session to expire two weeks after the
last request, independent of browser activity --- a much saner default
setup, in my opinion.
It's well-known that Mac users are [vastly over-represented among web
developers](http://www.webdirections.org/the-state-of-the-web-2008/browsers-and-operating-systems/#operating-systems),
so I think there's a distinct possibility that a silent majority of
users are receiving a sub-optimal user experience in many Rails
apps --- and nobody really seems concerned.

---
title: "Shoulda Macros with Blocks"
date: 2009-04-29T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/shoulda-macros-with-blocks/
---
When I'm not working on client projects, I keep myself busy
with [SpeakerRate](http://speakerrate.com), a site that lets conference
goers rate the talks they've attended. After a number of similar
suggestions from users, we decided to display the total number of
ratings alongside the averages. Although only talks can be rated,
speakers, events and series also have ratings through their associated
talks. As you can imagine, calculating the total ratings for each of
these required a lot of somewhat repetitive code in the models, and
*very* repetitive code in the associated tests.
Fortunately, since we're using
[Shoulda](http://thoughtbot.com/projects/shoulda/), we were able to DRY
things up considerably with a macro:
``` {#code .ruby}
class Test::Unit::TestCase
  def self.should_sum_total_ratings
    klass = model_class

    context "finding total ratings" do
      setup do
        @ratable = Factory(klass.to_s.downcase)
      end

      should "have zero total ratings if no rated talks" do
        assert_equal 0, @ratable.total_ratings
      end

      should "have one total rating if one delivery & content rating" do
        talk = block_given? ? yield(@ratable) : @ratable
        Factory(:content_rating, :talk => talk)
        Factory(:delivery_rating, :talk => talk)
        assert_equal 1, @ratable.reload.total_ratings
      end
    end
  end
end
```
This way, if we're testing a talk, we can just say:
``` {#code .ruby}
class TalkTest < Test::Unit::TestCase
  context "A Talk" do
    should_sum_total_ratings
  end
end
```
But if we're testing something that has a relationship with multiple
talks, our macro accepts a block that serves as a factory to create a
talk with the appropriate relationship. For events, we can do something
like:
``` {#code .ruby}
class EventTest < Test::Unit::TestCase
  context "An Event" do
    should_sum_total_ratings do |event|
      Factory(:talk, :event => event)
    end
  end
end
```
I'm pretty happy with this solution, but having to type "event" three
times still seems a little verbose. If you've got any suggestions for
refactoring, let us know in the comments.
 

---
title: "Simple APIs using SerializeWithOptions"
date: 2009-07-09T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/simple-apis-using-serializewithoptions/
---
While we were creating the [SpeakerRate
API](http://speakerrate.com/api), we noticed that ActiveRecord's
serialization system, while expressive, requires entirely too much
repetition. As an example, keeping a speaker's email address out of an
API response is simple enough:
``` {.code-block .line-numbers}
@speaker.to_xml(:except => :email)
```
But if we want to include speaker information in a talk response, we
have to exclude the email attribute again:
``` {.code-block .line-numbers}
@talk.to_xml(:include => { :speakers => { :except => :email } })
```
Then imagine that a talk has a set of additional directives, and the API
responses for events and series include lists of talks, and you can see
how our implementation quickly turned into dozens of lines of repetitive
code strewn across several controllers. We figured there had to be a
better way, so when we couldn't find one, we created
[SerializeWithOptions](https://github.com/vigetlabs/serialize_with_options).
At its core, SerializeWithOptions is a simple DSL for describing how to
turn an ActiveRecord object into XML or JSON. To use it, put
a `serialize_with_options` block in your model, like so:
``` {.code-block .line-numbers}
class Speaker < ActiveRecord::Base
# ...
serialize_with_options do
methods :average_rating, :avatar_url
except :email, :claim_code
includes :talks
end
# ...
end
class Talk < ActiveRecord::Base
# ...
serialize_with_options do
methods :average_rating
except :creator_id
includes :speakers, :event, :series
end
# ...
end
```
With this configuration in place, calling `@speaker.to_xml` is the same
as calling:
``` {.code-block .line-numbers}
@speaker.to_xml(
  :methods => [:average_rating, :avatar_url],
  :except => [:email, :claim_code],
  :include => {
    :talks => {
      :methods => :average_rating,
      :except => :creator_id
    }
  }
)
```
Once you've defined your serialization options, your controllers will
end up looking like this:
``` {.code-block .line-numbers}
def show
  @post = Post.find(params[:id])

  respond_to do |format|
    format.html
    format.xml { render :xml => @post }
    format.json { render :json => @post }
  end
end
```
Source code and installation instructions are available on GitHub. We
hope this can help you DRY up your app's API, or, if it doesn't have
one, remove your last excuse.
**UPDATE 6/14:** We've added a few new features to SerializeWithOptions
to handle some real-world scenarios we've encountered. You can now
specify multiple `serialize_with_options` blocks:
``` {.code-block .line-numbers}
class Speaker < ActiveRecord::Base
# ...
serialize_with_options do
methods :average_rating, :avatar_url
except :email, :claim_code
includes :talks
end
serialize_with_options :with_email do
methods :average_rating, :avatar_url
except :claim_code
includes :talks
end
# ...
end
```
You can now call `@speaker.to_xml` and get the default options, or
`@speaker.to_xml(:with_email)` for the second set. When pulling in
nested models, SerializeWithOptions will use configuration blocks with
the same name if available, otherwise it will use the default.
Additionally, you can now pass a hash to `:includes` to set a custom
configuration for included models:
``` {.code-block .line-numbers}
class Speaker < ActiveRecord::Base
# ...
serialize_with_options do
methods :average_rating, :avatar_url
except :email, :claim_code
includes :talks => { :include => :comments }
end
# ...
end
```
Use this method if you want to nest multiple levels of models or
overwrite other settings.

---
title: "Simple App Stats with StatBoard"
date: 2012-11-28T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/simple-app-stats-with-statboard/
---
We build a lot of small apps here at Viget as part of [Pointless
Corp](http://pointlesscorp.com/), like
[SpeakerRate](http://speakerrate.com/),
[OfficeGames](http://officegam.es/), and
[BabyBookie](http://babybookie.com/). It's fun to track how many people
are using them, and rather than write yet another Rakefile to generate
reports, I decided to create a simple [Rails
Engine](http://edgeapi.rubyonrails.org/classes/Rails/Engine.html) to
display some basic stats. Announcing, then,
[StatBoard](https://github.com/vigetlabs/stat_board):
![](https://raw.github.com/vigetlabs/stat_board/master/screenshot.png){style="box-shadow: none"}
Installation is a cinch: add the gem to your Gemfile, mount the app in
`routes.rb`, and set the models to query (full instructions available on
the [GitHub
page](https://github.com/vigetlabs/stat_board#basic-configuration)). The
code itself is embarrassingly simple, so if you have any ideas for
improvements, or just want to see how a simple Rails Engine works, take
a look.
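Those three steps look roughly like this (a sketch from memory -- the
engine and route names here are illustrative, so follow the GitHub
instructions for the actual configuration):

    # Gemfile
    gem "stat_board"

    # config/routes.rb
    mount StatBoard::Engine => "/stats"

    # config/initializers/stat_board.rb -- which models to chart
    StatBoard.models = %w(User Post Comment)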

---
title: "Simple Commit Linting for Issue Number in GitHub Actions"
date: 2023-04-28T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/simple-commit-linting-for-issue-number-in-github-actions/
---
I don't believe there is **a** right way to do software; I think teams
can be effective (or ineffective!) in a lot of different ways using all
sorts of methodologies and technologies. But one hill upon which I will
die is this: referencing tickets in commit messages pays enormous
dividends over the long haul, and you should always do it. As someone who
regularly commits code to apps created in the Obama era, nothing warms
my heart like running
[`:Git blame`](https://github.com/tpope/vim-fugitive#fugitivevim) on
some confusing code and seeing a reference to a GitHub Issue where I can
get the necessary context. And, conversely, nothing sparks nerd rage
like `fix bug` or `PR feedback` or, heaven forbid, `oops`.
In a recent [project
retrospective](https://www.viget.com/articles/get-the-most-out-of-your-internal-retrospectives/),
the team identified that we weren't being as consistent with this as
we'd like, and decided to take action. I figured some sort of commit
linting would be a good candidate for [continuous
integration](https://www.viget.com/articles/maintenance-matters-continuous-integration/)
--- when a team member pushes a branch up to GitHub, check the commits
and make sure they include a reference to a ticket.
I looked into [commitlint](https://commitlint.js.org/), but I found it a
lot more opinionated than I am --- I really just want to make sure
commits begin with either `[#XXX]` (an issue number) or `[n/a]` --- and
rather difficult to reconfigure. After struggling with it for a few
hours, I decided to just DIY it with a simple inline script. If you just
want something you can drop into a GitHub Actions YAML file to lint your
commits, here it is (but stick around and I'll break it down and then
show how to do it in a few other languages):
``` yaml
steps:
- name: Checkout code
uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Set up ruby 3.2.1
uses: ruby/setup-ruby@v1
with:
ruby-version: 3.2.1
- name: Lint commits
run: |
git log --format=format:%s HEAD ^origin/main | ruby -e '
$stdin.each_line do |msg|
next if /^\[(#\d+|n\/a)\]/.match?(msg)
warn %(Commits must begin with [#XXX] or [n/a] (#{msg.strip}))
exit 1
end
'
```
A few notes:
- That `fetch-depth: 0` is essential in order to be able to compare
the branch being built with `main` (or whatever you call your
primary development branch) --- by default, your Action only knows
about the current branch.
- `git log --format=format:%s HEAD ^origin/main` is going to give you
the first line of every commit that's in the source branch but not
in `main`; those are the commits we want to lint.
- With that list of commits, we loop through each message and compare
it with the regular expression `/^\[(#\d+|n\/a)\]/`, i.e. does this
message begin with either `[#XXX]` (where `X` are digits) or
`[n/a]`?
- If any message does **not** match, print an error out to standard
error (that's `warn`) and exit with a non-zero status (so that the
GitHub Action fails).
If you want to try this out locally (or perhaps modify the script to
validate messages in a different way), here's a `docker run` command
you can use:
``` bash
echo '[#123] Message 1
[n/a] Message 2
[#122] Message 3' | docker run --rm -i ruby:3.2.1 ruby -e '
$stdin.each_line do |msg|
next if /^\[(#\d+|n\/a)\]/.match?(msg)
warn %(Commits must begin with [#XXX] or [n/a] (#{msg.strip}))
exit 1
end
'
```
Note that running this command should output nothing since these are all
valid commit messages; modify one of the messages if you want to see the
failure state.
## Other Languages
Since there's a very real possibility you might not otherwise install
Ruby in your GitHub Actions, and because I weirdly enjoy writing the
same code in a bunch of different languages, here are scripts for
several of Viget's other favorites:
### JavaScript
``` bash
git log --format=format:%s HEAD ^origin/main | node -e "
let msgs = require('fs').readFileSync(0).toString().trim().split('\n');
for (let msg of msgs) {
if (msg.match(/^\[(#\d+|n\/a)\]/)) { continue; }
process.stderr.write('Commits must begin with [#XXX] or [n/a] (' + msg + ')');
process.exit(1);
}
"
```
To test:
``` bash
echo '[#123] Message 1
[n/a] Message 2
[#122] Message 3' | docker run --rm -i node:18.15.0 node -e "
let msgs = require('fs').readFileSync(0).toString().trim().split('\n');
for (let msg of msgs) {
if (msg.match(/^\[(#\d+|n\/a)\]/)) { continue; }
process.stderr.write('Commits must begin with [#XXX] or [n/a] (' + msg + ')');
process.exit(1);
}
"
```
### PHP
``` bash
git log --format=format:%s HEAD ^origin/main | php -r '
while ($msg = fgets(STDIN)) {
if (preg_match("/^\[(#\d+|n\/a)\]/", $msg)) { continue; }
    fwrite(STDERR, "Commits must begin with [#XXX] or [n/a] (" . trim($msg) . ")\n");
exit(1);
}
'
```
To test:
``` bash
echo '[#123] Message 1
[n/a] Message 2
[#122] Message 3' | docker run --rm -i php:8.2.4 php -r '
while ($msg = fgets(STDIN)) {
if (preg_match("/^\[(#\d+|n\/a)\]/", $msg)) { continue; }
    fwrite(STDERR, "Commits must begin with [#XXX] or [n/a] (" . trim($msg) . ")\n");
exit(1);
}
'
```
### Python
``` bash
git log --format=format:%s HEAD ^origin/main | python -c '
import sys
import re
for msg in sys.stdin:
if re.match(r"^\[(#\d+|n\/a)\]", msg):
continue
    print("Commits must begin with [#XXX] or [n/a] (%s)" % msg.strip(), file=sys.stderr)
sys.exit(1)
'
```
To test:
``` bash
echo '[#123] Message 1
[n/a] Message 2
[#122] Message 3' | docker run --rm -i python:3.11.3 python -c '
import sys
import re
for msg in sys.stdin:
if re.match(r"^\[(#\d+|n\/a)\]", msg):
continue
    print("Commits must begin with [#XXX] or [n/a] (%s)" % msg.strip(), file=sys.stderr)
sys.exit(1)
'
```
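And if you'd rather not involve any language runtime at all, the same
check can be sketched in plain shell with `grep` (this variant isn't
from the original post):

```shell
lint_commits() {
  # Collect any message that doesn't start with [#XXX] or [n/a].
  bad=$(grep -Ev '^\[(#[0-9]+|n/a)\]' || true)
  if [ -n "$bad" ]; then
    echo "Commits must begin with [#XXX] or [n/a]:" >&2
    echo "$bad" >&2
    return 1
  fi
}

git log --format=format:%s HEAD ^origin/main | lint_commits
```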
------------------------------------------------------------------------
So there you have it: simple GitHub Actions commit linting in most of
Viget's favorite languages (try as I might, I could not figure out how
to do this in [Elixir](https://elixir-lang.org/), at least not in a
concise way). As I said up front, writing good tickets and then
referencing them in commit messages so that they can easily be surfaced
with `git blame` pays **huge** dividends over the life of a codebase. If
you're not already in the habit of doing this, well, the best time to
start was `Initial commit`, but the second best time is today.

---
title: "Simple, Secure File Transmission"
date: 2013-08-29T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/simple-secure-file-transmission/
---
Often I need to send a file containing sensitive information, like a
database dump or a digital certificate, to a client or fellow developer.
It's difficult to know the correct level of paranoia to exhibit in
situations like these. Obviously, nobody's sitting in front of a
computer in a dark room, just *waiting* for me to leak the SSL
certificate for the staging Solr EC2 box. At the same time, I know there
are many people with access to my email, my Dropbox, my Basecamp posts,
and it would be irresponsible of me to rely on their collective good
faith to keep this information secure.
I've settled on a simple solution that doesn't inconvenience the sender
or receiver too terribly much (assuming they're both on modern,
Unix-compatible machines) while making things considerably more
difficult for any would-be eavesdroppers. Suppose I want to send an AWS
PEM certificate to [Chris](https://viget.com/about/team/cjones),
disregarding the fact that he's sitting maybe four feet from me right
now. Here's what I'd do:
### Step 1: Encrypt with OpenSSL
I have a short shell script, `encrypt.sh`, that lives in my `~/.bin`
directory:
    #!/bin/sh
    openssl aes-256-cbc -a -salt -pass "pass:$2" -in $1 -out $1.enc
    echo "openssl aes-256-cbc -d -a -pass \"pass:XXX\" -in $1.enc -out $1"
This script takes two arguments: the file you want to encrypt and a
password (or, preferably, a [passphrase](https://xkcd.com/936/)). To
encrypt the certificate, I'd run:
    encrypt.sh production.pem "I can get you a toe by 3 o'clock this afternoon."
The script creates an encrypted file, `production.pem.enc`, and outputs
instructions for decrypting it, but with the password blanked out.
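To make the hand-off concrete, here's the full round trip the two commands perform, sketched with a throwaway file and passphrase (both hypothetical):

```shell
# Encrypt with a passphrase (what encrypt.sh does under the hood)
echo "top secret key material" > production.pem
openssl aes-256-cbc -a -salt -pass "pass:correct horse battery staple" \
  -in production.pem -out production.pem.enc

# What the recipient runs, per the script's echoed instructions
openssl aes-256-cbc -d -a -pass "pass:correct horse battery staple" \
  -in production.pem.enc -out decrypted.pem

# Identical files: cmp exits 0 and prints nothing
cmp production.pem decrypted.pem
```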
### Step 2: Send the encrypted file
From here, I'd move the encrypted file to my Dropbox public folder and
send Chris the generated link, as well as the output of `encrypt.sh`,
over IM:
![](http://i.imgur.com/lSEsz5z.jpg)
Once he acknowledges that he's received the file, I immediately delete
it.
### Step 3: Send the password (via another channel)
Now I need to send Chris the password. Here's what I **don't** do: send
it to him over the same channel that I used to send the file itself.
Instead, I pull out my phone and send it to him as a text message:
![](http://i.imgur.com/pQHZlkO.jpg)
Now Chris has the file, instructions to decrypt it, and the passphrase,
so he's good to go. An attacker, meanwhile, would need access to both
his Google chat and iOS messages, or at least a sweet [\$5
wrench](http://xkcd.com/538/). (Friday is two-for-one XKCD day, in case
you missed the sign out front.)
------------------------------------------------------------------------
So that's what I've been doing when I have to send private files across
the network. I'm sure a security expert could find a hundred ways that
it's insufficient, but I hope said strawman expert would agree that this
is a much better approach than sending this information in the clear.
I'm curious what others do in these types of situations -- let me know
in the comments below.

View File

@@ -0,0 +1,86 @@
---
title: "Single-Use jQuery Plugins"
date: 2009-07-16T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/single-use-jquery-plugins/
---
One of the best features of [jQuery](http://jquery.com/) is its simple,
powerful plugin system. The most obvious reason to write a plugin is the
same reason you'd create a Rails plugin: to package up functionality for
reuse. While code reuse between projects is certainly a worthy goal, it
sets a prohibitively high bar when deciding whether or not to pull a
piece of functionality into a plugin.
There are a number of good reasons to create jQuery plugins for behavior
specific to the app under development. Consider the following example, a
simple plugin to create form fields for an arbitrary number of nested
resources, adapted from a recent project:
    (function($) {
      $.fn.cloneableFields = function() {
        return this.each(function() {
          var container = $(this);
          var fields = container.find("fieldset:last");
          var label = container.metadata().label || "Add";

          container.count = function() {
            return this.find("fieldset").size();
          };

          // If there are existing entries, hide the form fields by default
          if (container.count() > 1) {
            fields.hide();
          }

          // When link is clicked, add a new set of fields and set their keys to
          // the total number of fieldsets, e.g. instruction_attributes[5][name]
          var addLink = $("<a/>").text(label).click(function() {
            html = fields.html().replace(/\[\d+\]/g, "[" + container.count() + "]");
            $(this).before("<fieldset>" + html + "</fieldset>");
            return false;
          });

          container.append(addLink);
        });
      };
    })(jQuery);
## Cleaner Code {#cleaner_code}
When I was first starting out with jQuery and unobtrusive JavaScript, I
couldn't believe how easy it was to hook into the DOM and add behavior.
I ended up with monstrous `application.js` files consisting solely of
giant `$(document).ready()` functions --- exactly the kind of spaghetti
code I switched to Ruby and Rails to avoid. A pre-refactoring version of
[SpeakerRate](http://www.speakerrate.com) had one over 700 lines long.
By pulling this feature into a plugin, rather than some version of the
above code in our `$(document).ready()` function, we can stash it in a
separate file and replace it with a single line:
    $("div.cloneable").cloneableFields();
Putting feature details into separate files turns our `application.js`
into a high-level view of the behavior of the site.
## State Maintenance {#state_maintenance}
In JavaScript, functions created inside of other functions maintain a
link to variables declared in the outer function. In the above example,
we create variables called `container` and `fields` when the page is
loaded, and then access those variables in the `click()` handler of the
inserted link. This way, we can avoid performing potentially expensive
jQuery selectors every time an event is fired.
Right now, you might be thinking, "But David, isn't
`$(document).ready()` also a function? Shouldn't this same principle
apply?" Yes and no, dear reader. Variables declared in
`$(document).ready()` can be accessed by functions declared there, but
since it's only called once, there will only be one copy of those
variables for the page. By using the standard `return this.each()`
plugin pattern, we ensure that there will be a copy of our variables for
each selector match, so that we can have multiple sets of
CloneableFields on a single page.
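The closure behavior described above is easy to demonstrate outside of jQuery; here's a minimal sketch in plain JavaScript (no library assumed):

```javascript
// Each call to makeCounter gets its own copy of `count`, just as each
// selector match in the plugin gets its own `container` and `fields`.
function makeCounter() {
  var count = 0;
  return function() { return ++count; };
}

var first = makeCounter();
var second = makeCounter();
first();  // 1
first();  // 2
second(); // 1 -- independent state, like two CloneableFields on one page
```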
## Faster Scripts {#faster_scripts}
Aside from being able to store the results of selectors in variables,
there are other performance gains to be had by containing your features
in plugins. If a behavior involves attaching event listeners to five
different DOM elements, rather than running selectors to search for each
of these elements when the page loads, we'll get better performance by
searching for a containing element and then invoking our plugin on it,
since we'll only have to make one call on pages that don't have the
feature. Furthermore, inside your plugin, you'll be more inclined to
scope your selectors properly, further increasing performance.
If you opt to put your features into separate files, make sure to
compress all your JavaScript into one file in production to reduce the
number of HTTP requests.
## Conclusion
As Rubyists, the reasons to package up jQuery features follow many of
the ideas to which we already subscribe: DRY, separation of concerns,
and idiomatic code. Using jQuery plugins is by no means the only way to
achieve clean JavaScript; the April edition of
[JSMag](http://www.jsmag.com/main.issues.description/id=19/) has a great
article about containing features within object literals, a more
framework-agnostic approach. Whatever method you choose, do *something*
to avoid the novel-length `$(document).ready()` function. Your future
self will thank you for it.

View File

@@ -0,0 +1,71 @@
---
title: "Social Media API Gotchas"
date: 2010-09-13T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/social-media-api-gotchas/
---
I've been heads-down for the last few weeks developing the web site for
the new [PUMA Social](http://www.puma.com/social) campaign. A major part
of this site is a web-based game that rewards users for performing
activities on various sites across the internet, and as such, I've
become intimately familiar with the APIs of several popular web sites
and their various --- shall we say --- *quirks*. I've collected the most
egregious here with the hope that I can save the next developer a bit of
anguish.
## Facebook Graph API for "Likes" is busted {#facebook_graph_api_for_8220likes8221_is_busted}
Facebook's [Graph API](https://developers.facebook.com/docs/api) is
awesome. It's fantastic to see them embracing
[REST](https://en.wikipedia.org/wiki/Representational_State_Transfer)
and the open web. That said, the documentation doesn't paint an accurate
picture of the Graph API's progress, and there are aspects that aren't
ready for prime time. Specifically, the "Like" functionality:
- For a page (like
[http://www.facebook.com/puma](https://www.facebook.com/puma)), you
can retrieve a maximum of 500 fans, selected at random. For a page
with more than 2.2 million fans, this is of ... *limited* use.
- For an individual item like a status update or photo, you can
retrieve a list of the people who've "liked" it, but it's a small
subset of the people you can view on the site itself. You might
think this is a question of privacy, but I found that some users who
are returned without providing authentication information are
omitted when authenticated.
- For individual users, accessing the things they've "liked" only
includes pages, not normal wall activity or pages elsewhere on the
web.
## Facebook Tabs retrieve content with POST {#facebook_tabs_retrieve_content_with_post}
Facebook lets you put tabs on your page with content served from
third-party websites. They're understandably strict about what tags
you're allowed to use --- no `<script>` or `<body>` tags, for example
--- and they typically do a good job explaining what rules are being
violated.
On the other hand, I configured a Facebook app to pull in tab content
from our Ruby on Rails application and was greeted with the unhelpful
"We've encountered an error with the page you requested." It took a lot
of digging, but I discovered that Facebook retrieves tab content with
`POST` (rather than `GET`) requests, and what's more, it submits them
with a `Content-Type` header of "application/x-www-form-urlencoded,"
which triggers an InvalidAuthenticityToken exception if you save
anything to the database during the request/response cycle.
## Twitter Search API `from_user_id` is utter crap {#twitter_search_api_from_user_id_is_utter_crap}
Twitter has a fantastic API, with one glaring exception. Results from
the [search
API](http://apiwiki.twitter.com/Twitter-Search-API-Method:-search)
contain fields named `from_user` and `from_user_id`; `from_user` is the
user's Twitter handle and `from_user_id` is a made-up number that has
nothing to do with the user's actual user ID. This is apparently a
[known
issue](https://code.google.com/p/twitter-api/issues/detail?id=214) that
is too complicated to fix. Do yourself a favor and match by screen name
rather than unique ID.
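In practice that means keying any lookups off `from_user`. A small sketch with a made-up search result (the field names are from the post; the data is hypothetical):

```ruby
require "json"

# Hypothetical payload shaped like a Search API response
payload = <<-JSON
  { "results": [
    { "from_user": "viget", "from_user_id": 123, "text": "hello" },
    { "from_user": "dce",   "from_user_id": 456, "text": "world" }
  ] }
JSON

results = JSON.parse(payload)["results"]

# Match on the screen name, never on the unreliable from_user_id
viget_tweets = results.select { |t| t["from_user"] == "viget" }
```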

View File

@@ -0,0 +1,67 @@
---
title: "Static Asset Packaging for Rails 3 on Heroku"
date: 2011-03-29T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/static-asset-packaging-rails-3-heroku/
---
**Short Version:** the easiest way to combine and minify static assets
(CSS and Javascript) in your Rails 3 app running on Heroku is to use
[AssetPackager](https://github.com/sbecker/asset_packager) with [this
fork of Heroku Asset
Packager](https://github.com/cbeier/heroku_asset_packager). It just
works.
**Long version:** in his modern day classic, [High Performance Web
Sites](https://www.amazon.com/High-Performance-Web-Sites-Essential/dp/0596529309),
Steve Souders' very first rule is to "make fewer HTTP requests." In
practical terms, among other things, this means to combine separate CSS
and Javascript files whenever possible. The creators of the Rails
framework took this advice to heart, adding the `:cache => true` option
to the
[`javascript_include_tag`](http://apidock.com/rails/ActionView/Helpers/AssetTagHelper/javascript_include_tag)
and
[`stylesheet_link_tag`](http://apidock.com/rails/ActionView/Helpers/AssetTagHelper/stylesheet_link_tag)
helpers to provide asset concatenation at no cost to the developer.
As time went on, our needs outgrew the capabilities of `:cache => true`
and solutions like
[AssetPackager](https://github.com/sbecker/asset_packager) came onto the
scene, offering increased control over how assets are combined as well
as *minification*, stripping comments and unnecessary whitespace from
CSS and Javascript files before packaging them together. Later, even
more sophisticated solutions like
[Jammit](https://documentcloud.github.com/jammit/) arrived, offering
even more minification capabilities including inlining small images into
CSS.
Of course, static asset packaging wasn't the only part of the Rails
ecosystem that was undergoing major changes during this time. An
increased emphasis on ease of deployment saw the rise of
[Capistrano](https://github.com/capistrano/capistrano/wiki),
[Passenger](http://www.modrails.com/), and eventually
[Heroku](https://heroku.com/), which offers hands-free system
maintenance and simple `git push heroku` deployment. This simplification
is not without trade-offs, though; you can only write to your app's
`tmp` directory and the lack of root access means that you can't install
additional software. Both of these limitations have ramifications for
static asset packaging, namely:
1. Both `:cache => true` and standard AssetPackager work by writing
files into your app's `public` directory, which, as you can likely
guess, is *verboten*.
2. Jammit has several compression options, but all of them require Java
support, which we don't have on Heroku. You have the option of
compressing your assets and checking them into your repository by
hand, but I for one can't stand putting build artifacts in the repo.
I've seen a lot of questions about how to do static asset packaging on
Heroku and just as many bad answers (which I'll avoid linking to here).
The best solution we've found uses
[AssetPackager](https://github.com/sbecker/asset_packager) along with
[this fork of Heroku Asset
Packager](https://github.com/cbeier/heroku_asset_packager) that has been
modified to work with Rails 3. It's not the sexiest solution, but it
works, and you'll never have to think about it again.

View File

@@ -0,0 +1,85 @@
---
title: "Stop Pissing Off Your Designers"
date: 2009-04-01T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/stop-pissing-off-your-designers/
---
A few weeks ago, our local [Refresh](http://refreshthetriangle.org)
group pitted me (representing web developers) against Viget designer
[Mindy](https://www.viget.com/about/team/mwagner) in a battle for the
ages. Our talk, "Ten Things Designers Do That Piss Developers Off (and
Vice Versa)," offered a back-and-forth look at some of the issues that
crop up between web professionals. Despite the overwhelming strength of
my arguments, I won't deny that she got some good shots in. Here are
some of the key lessons I took away.
### Stay off the bandwagon
One of Mindy's best points was the tendency of developers, when
selecting technologies to use on a project, to go with what's new and
hip rather than what's the best fit or what will yield the best final
result. I think we can all relate to learning a new technology or
technique and then wanting to immediately apply it to whatever we're
working on.
Technology bandwagon-jumping goes hand-in-hand with another common
problem: over-engineering. In my experience, when a chosen technology is
a bad fit for a project, it's typically because it's too powerful. An
over-engineered solution is a nightmare for the next developer --- in a
past life, I maintained a Spring-powered, Lucene-searchable monstrosity
running on dedicated hardware that would have been better served with a
WordPress install on Dreamhost.
When selecting technologies, stick with the best fit, whether that's
what you know best or what will lead to the best final product. If
you're just dying to try out some new technology, do what I do: redo
your personal site (in lieu of actually posting any new content to it).
### Avoid the knee-jerk "No"
Picture this: you're sitting at your desk one morning, happily reading
Hacker News, when an IM window pops up on your screen. It's your PM, and
she's got a new feature request from the client. It's not a major
change, but it will involve a substantial overhaul of the messaging
system you built. What's your response? Be honest --- you give her
seventeen reasons why the requested change is a bad idea.
When discussing feature requests, keep in mind that the ultimate goal is
to create the best product possible. Requirements change, and though it
sucks to complicate elegant solutions, sometimes change is necessary. As
an added benefit, if you avoid saying "no" instinctively, when a
*truly* bad idea lands on your plate, your objections will carry a lot
more weight.
### Remember: you are not the user
Mindy noted a trait common to many developers: a lack of empathy for the
user, or rather, the mistaken idea that we ourselves are the typical
user. In other words, developers are prone to creating features that
they would want to use, regardless of how well they might serve the
actual audience of the site.
When deciding on geeky features, it's important to keep your audience in
mind. If you're designing a site about web productivity, by all means,
go nuts --- bookmarklets, keyboard shortcuts, customizable RSS feeds,
the whole nine yards. But if your site's intended audience is, say,
gardening enthusiasts, your time would probably be better spent
elsewhere.
### But in the end
We all want to create the best web sites possible. Disagreements arise
about definitions of "best"; while a designer wants a site that's
attractive and intuitive, the developer wants one that is stable and
maintainable. In the end, these qualities aren't mutually exclusive ---
the highest-quality websites have them all.
Mindy has posted [her
thoughts](https://www.viget.com/inspire/stop-driving-your-developers-crazy)
on the talk, and our slides are available on
[SlideShare](http://www.slideshare.net/mindywagner/10-things-designers-do-that-piss-developers-off-and-vice-versa).
And if you're in Durham (or lesser nearby cities), come on out to the
next [Refresh](http://refreshthetriangle.org) meeting.

View File

@@ -0,0 +1,136 @@
---
title: "Testing Solr and Sunspot (locally and on CircleCI)"
date: 2018-11-27T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/testing-solr-and-sunspot-locally-and-on-circleci/
---
I don't usually write complex search systems, but when I do, I reach
for [Solr](http://lucene.apache.org/solr/) and the awesome
[Sunspot](http://sunspot.github.io/) gem. I pulled them into a recent
client project, and while Sunspot makes it a breeze to define your
search indices and queries, its testing philosophy can best be
described as "figure it out yourself, smartypants."
I found a [seven-year-old code
snippet](https://dzone.com/articles/install-and-test-solrsunspot) that
got me most of the way, but I needed to make some updates for
compatibility with modern RSpec and to account for a delay on Circle
between Solr starting and being available to index documents. Here's
the resulting config, which should live in `spec/support/sunspot.rb`:
``` ruby
require 'sunspot/rails/spec_helper'
require 'net/http'
try_server = proc do |uri|
begin
response = Net::HTTP.get_response uri
response.code != "503"
rescue Errno::ECONNREFUSED
end
end
start_server = proc do |timeout|
server = Sunspot::Rails::Server.new
uri = URI.parse("http://0.0.0.0:#{server.port}/solr/default/update?wt=json")
try_server[uri] or begin
server.start
at_exit { server.stop }
timeout.times.any? do
sleep 1
try_server[uri]
end
end
end
original_session = nil # always nil between specs
sunspot_server = nil # one server shared by all specs
if defined? Spork
Spork.prefork do
sunspot_server = start_server[60] if Spork.using_spork?
end
end
RSpec.configure do |config|
config.before(:each) do |example|
if example.metadata[:solr]
sunspot_server ||= start_server[60] || raise("SOLR connection timeout")
else
original_session = Sunspot.session
Sunspot.session = Sunspot::Rails::StubSessionProxy.new(original_session)
end
end
config.after(:each) do |example|
if example.metadata[:solr]
Sunspot.remove_all!
else
Sunspot.session = original_session
end
original_session = nil
end
end
```
*(Fork me at
<https://gist.github.com/dce/3a9b5d8623326214f2e510839e2cac26>.)*
With this code in place, pass `solr: true` as RSpec metadata[^1]
to your `describe`, `context`, and `it` blocks to test against a live
Solr instance, and against a stub instance otherwise.
## A couple other Sunspot-related things {#a-couple-other-sunspot-related-things}
While I've got you here, thinking about search, here are a few other
neat tricks to make working with Sunspot and Solr easier.
### Use Foreman to start all the things {#use-foreman-to-start-all-the-things}
Install the [Foreman](http://ddollar.github.io/foreman/) gem and create
a `Procfile` like so:
    rails: bundle exec rails server -p 3000
    webpack: bin/webpack-dev-server
    solr: bundle exec rake sunspot:solr:run
Then you can boot up all your processes with a simple `foreman start`.
### Configure Sunspot to use the same Solr instance in dev and test {#configure-sunspot-to-use-the-same-solr-instance-in-dev-and-test}
[By
default](https://github.com/sunspot/sunspot/blob/3328212da79178319e98699d408f14513855d3c0/sunspot_rails/lib/generators/sunspot_rails/install/templates/config/sunspot.yml),
Sunspot wants to run two different Solr processes, listening on two
different ports, for the development and test environments. You only
need one instance of Solr running --- it'll handle setting up a
"core" for each environment. Just set the port to the same number in
`config/sunspot.yml` to avoid starting up and shutting down Solr every
time you run your test suite.
### Sunspot doesn't reindex automatically in test mode {#sunspot-doesnt-reindex-automatically-in-test-mode}
Just a little gotcha: typically, Sunspot updates the index after every
update to an indexed model, but not so in test mode. You'll need to run
some combo of `Sunspot.commit` and `[ModelName].reindex` after making
changes that you want to test against.
------------------------------------------------------------------------
That's all I've got. Have a #blessed Tuesday and a happy holiday
season.

[^1]: e.g. `describe "viewing the list of speakers", solr: true do`

View File

@@ -0,0 +1,54 @@
---
title: "Testing Your Code's Text"
date: 2011-08-31T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/testing-your-codes-text/
---
The "Ubiquitous Automation" chapter of [*The Pragmatic
Programmer*](https://books.google.com/books?id=5wBQEp6ruIAC&lpg=PA254&vq=ubiquitous%20automation&pg=PA230#v=onepage&q&f=false)
opens with the following quote:
> Civilization advances by extending the number of important operations
> we can perform without thinking.
>
> --Alfred North Whitehead
As a responsible and accomplished developer, when you encounter a bug in
your application, what's the first thing you do? Write a failing test
case, of course, and only once that's done do you focus on fixing the
problem. But what about when the bug is not related to the *behavior* of
your application, but rather to its configuration, display, or some
other element outside the purview of normal testing practices? I contend
that you can and should still write a failing test.
**Scenario:** In the process of merging a topic branch into master, you
encounter a conflict in one of your ERB files. You fix the conflict and
commit the resolution, run the test suite, and then deploy your changes
to production. An hour later, you receive an urgent email from your
client wondering what happened to the footer of their site. As it turns
out, there were *two* conflicts in the file, and you only fixed the
first, committing the conflict artifacts of the second into the repo.
Your gut instinct is to zip off a quick fix, write a self-deprecating
commit message, and act like the whole thing never happened. But
consider writing a rake task like this:
    namespace :preflight do
      task :git_conflict do
        paths = `grep -lir '<<<\\|>>>' app lib config`.split(/\n/)

        if paths.any?
          puts "\nERROR: Found git conflict artifacts in the following files\n\n"
          paths.each {|path| puts "  - #{path}" }
          exit 1
        end
      end
    end
This task greps through your `app`, `lib`, and `config` directories
looking for occurrences of `<<<` or `>>>` and, if it finds any, prints a
list of the offending files and exits with an error. Hook this into the
rake task run by your continuous integration server and never worry
about accidentally deploying errant git artifacts again:
    namespace :preflight do
      task :default do
        Rake::Task['cover:ensure'].invoke
        Rake::Task['preflight:all'].invoke
      end

      task :all do
        Rake::Task['preflight:git_conflict'].invoke
      end

      task :git_conflict do
        paths = `grep -lir '<<<\\|>>>' app lib config`.split(/\n/)

        if paths.any?
          puts "\nERROR: Found git conflict artifacts in the following files\n\n"
          paths.each {|path| puts "  - #{path}" }
          exit 1
        end
      end
    end

    Rake::Task['cruise'].clear
    task :cruise => 'preflight:default'
We've used this technique to keep our deployment configuration in order,
to ensure that we're maintaining best practices, and to keep our
applications in shape as they grow and team members change. Think of it
as documentation taken to the next level -- text to explain the best
practice, code to enforce it. Assuming you're diligent about running
your tests, every one of these tasks you write is a problem that will
never make it to production.
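Stripped of the Rake plumbing, the core of the conflict check is a single grep. A standalone sketch (the file and directory names here are hypothetical):

```ruby
require "fileutils"

# Set up a tiny tree with one leftover conflict marker
FileUtils.mkdir_p("app/views")
File.write("app/views/footer.html.erb", "<<<<<<< HEAD\n<footer></footer>\n")
File.write("app/views/header.html.erb", "<header></header>\n")

# The same scan the rake task performs: list files containing <<< or >>>
paths = `grep -lr '<<<\\|>>>' app`.split(/\n/)
# paths now names only the file with conflict artifacts
```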

View File

@@ -0,0 +1,49 @@
---
title: "The Balanced Developer"
date: 2011-10-31T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/the-balanced-developer/
---
In preparation for a recent team offsite, I spent some time thinking
about what I hold dear as a software developer. One idea I kept coming
back to is the notion of *balance.* I see balance manifesting itself
several ways in the work of a successful developer, some of which
follow.
## Speed Versus Quality
The most obvious example is the balance of development speed and
quality. When building software, it's never a good idea to write code as
fast as possible without any attention toward maintainability, just as
it's never a good idea to spend such an inordinate amount of time
designing and tweaking your software that it never ships to customers.
The balanced developer focuses on delivering value both immediately
*and* through the life of the software.
## Shiny Versus Proven
When it comes to selecting tools and technologies, again, balance is
key. An unbalanced developer selects technologies simply because they're
new and exciting, or rejects them simply because they're unknown and
unproven. A balanced developer evaluates new technologies on their own
merits, weighing gains in functionality against the inherent risks.
## Doing Versus Sharing
If you've ever looked at someone's code after hearing them speak at a
conference, you know that there's not necessarily a correlation between
someone's ability to speak about technology and their ability to create
it. At the opposite end of the spectrum, there's the quiet fellow who
sits in your company's basement, writing fast, elegant code that no one
ever notices. The balanced developer understands that doing work and
sharing work are most effective in combination.
## That Said...
To hijack an old saying, you should strive for balance in all things,
including balance itself. Falling perfectly in the middle of every pair
of tradeoffs would be, frankly, *unbalanced*. Everyone has their strong
opinions, and that's a good thing, provided it's balanced out with a
healthy dose of pragmatism.

View File

@@ -0,0 +1,85 @@
---
title: "The Little Schemer Will Expand/Blow Your Mind"
date: 2017-09-21T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/the-little-schemer-will-expand-blow-your-mind/
---
I thought I'd take a break from the usual web dev content we post here
to tell you about my favorite technical book, *The Little Schemer*, by
Daniel P. Friedman and Matthias Felleisen: why you should read it, how
you should read it, and a couple tools to help you on your journey.
## Why read *The Little Schemer* {#why-read-the-little-schemer}
**It teaches you recursion.** At its core, *TLS* is a book about
recursion -- functions that call themselves with modified versions of
their inputs in order to obtain a result. If you're a working
developer, you've probably worked with recursive functions if you've
(for example) modified a deeply-nested JSON structure. *TLS* starts as a
gentle introduction to these concepts, but things quickly get out of
hand.
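For a taste of the book's style in a more familiar syntax, here's the classic `member?` function, sketched in Ruby rather than Scheme (the translation is mine, not the book's):

```ruby
# member?: does lat, a list of atoms, contain the atom a?
# Built the way TLS builds it: answer the empty case first,
# then check the first element, then recur on the rest of the list.
def member?(a, lat)
  return false if lat.empty?
  lat.first == a || member?(a, lat.drop(1))
end

member?("tea", ["coffee", "tea", "milk"])   # => true
member?("juice", ["coffee", "tea", "milk"]) # => false
```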
**It teaches you functional programming.** Again, if you program in a
language like Ruby or JavaScript, you write your fair share of anonymous
functions (or *lambdas* in the parlance of Scheme), but as you work
through the book, you\'ll use recursion to build lambdas that do some
pretty amazing things.
**It teaches you (a) Lisp.**
Scheme/[Racket](https://en.wikipedia.org/wiki/Racket_(programming_language))
is a fun little language that's (in this author's humble opinion) more
approachable than Common Lisp or Clojure. It'll teach you things like
prefix notation and how to make sure your parentheses match up. If you
like it, one of those other languages is a great next step.
**It's different, and it's fun.** *TLS* is *computer science* as a
discipline distinct from "making computers do stuff." It'd be a cool
book even if we didn't have modern personal computers. It's halfway
between a programming book and a collection of logic puzzles. It's
mind-expanding in a way that your typical animal drawing tech book
can't approach.
## How to read *The Little Schemer* {#how-to-read-the-little-schemer}
**Get a paper copy of the book.** You can find PDFs of the book pretty
easily, but do yourself a favor and pick up a dead-tree copy. Make
yourself a bookmark half as wide as the book, and use it to cover the
right side of each page as you work through the questions on the left.
**Actually write the code.** The book does a great job showing you how
to write increasingly complex functions, but if you want to get the most
out of it, write the functions yourself and then check your answers
against the book's.
**Run your code in the Racket REPL.** Put your functions into a file,
and then load them into the interactive Racket console so that you can
try them out with different inputs. I'll give you some tools to help
with this at the end.
**Skip the rote recursion explanations.** This book is a fantastic
introduction to recursion, but by the third or fourth in-depth
walkthrough of how a recursive function gets evaluated, you can probably
just skim. It\'s a little bit overkill.
## And some tools to help you get started {#and-some-tools-to-help-you-get-started}
Once you've obtained a copy of the book, grab Racket
(`brew install racket`) and
[rlwrap](https://github.com/hanslub42/rlwrap) (`brew install rlwrap`),
subbing `brew` for your platform's package manager. Then you can start
an interactive session with `rlwrap racket -i`, which is a much nicer
experience than calling `racket -i` on its own. In true indieweb
fashion, I've put together a simple GitHub repo called [Little Schemer
Workbook](https://github.com/dce/little-schemer-workbook) to help you
get started.
So check out *The Little Schemer.* Just watch out for those jelly
stains.

---
title: "The Right Way to Store and Serve Dragonfly Thumbnails"
date: 2018-06-29T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/the-right-way-to-store-and-serve-dragonfly-thumbnails/
---
We love and use [Dragonfly](https://github.com/markevans/dragonfly) to
manage file uploads in our Rails applications. Specifically, its API for
generating thumbnails is a huge improvement over its predecessors. There
is one area where the library falls short, though: out of the box,
Dragonfly doesn\'t do anything to cache the result of a resize/crop,
meaning a naïve implementation would rerun these operations every time
we wanted to show a thumbnailed image to a user.
[The Dragonfly documentation offers some
suggestions](https://markevans.github.io/dragonfly/cache#processing-on-the-fly-and-serving-remotely)
about how to handle this issue, but makes it clear that you\'re pretty
much on your own:
``` {.code-block .line-numbers}
Dragonfly.app.configure do
# Override the .url method...
define_url do |app, job, opts|
thumb = Thumb.find_by_signature(job.signature)
# If (fetch 'some_uid' then resize to '40x40') has been stored already, give the datastore's remote url ...
if thumb
app.datastore.url_for(thumb.uid)
# ...otherwise give the local Dragonfly server url
else
app.server.url_for(job)
end
end
# Before serving from the local Dragonfly server...
before_serve do |job, env|
# ...store the thumbnail in the datastore...
uid = job.store
# ...keep track of its uid so next time we can serve directly from the datastore
Thumb.create!(uid: uid, signature: job.signature)
end
end
```
To summarize: create a `Thumb` model to track uploaded crops. The
`define_url` callback executes when you ask for the URL for a thumbnail,
checking if a record exists in the database with a matching signature
and, if so, returning the URL to the stored image (e.g. on S3). The
`before_serve` block defines what happens when Dragonfly receives a
request for a thumbnailed image (the ones that look like `/media/...`),
storing the thumbnail and then creating a corresponding record in the
database.
The problem with this approach is that if someone gets ahold of the
initial `/media/...` URL, they can cause your app to reprocess the same
image multiple times, or store multiple copies of the same image, or
just fail outright. Here\'s how we can do it better.
First, create the `thumbs` table, and put unique indexes on both
columns. This ensures we\'ll never store multiple versions of the same
cropping of any given image.
``` {.code-block .line-numbers}
class CreateThumbs < ActiveRecord::Migration[5.2]
def change
create_table :thumbs do |t|
t.string :signature, null: false
t.string :uid, null: false
t.timestamps
end
add_index :thumbs, :signature, unique: true
add_index :thumbs, :uid, unique: true
end
end
```
Then, create the model. Same idea: ensure uniqueness of signature and
UID.
``` {.code-block .line-numbers}
class Thumb < ApplicationRecord
validates :signature,
:uid,
presence: true,
uniqueness: true
end
```
Then replace the `before_serve` block from above with the following:
``` {.code-block .line-numbers}
before_serve do |job, env|
thumb = Thumb.find_by_signature(job.signature)
if thumb
throw :halt,
[301, { "Location" => job.app.remote_url_for(thumb.uid) }, [""]]
else
uid = job.store
Thumb.create!(uid: uid, signature: job.signature)
end
end
```
*([Here\'s the full resulting
config.](https://gist.github.com/dce/4e79183a105e415ca0e5e1f1709089b8))*
The key difference here is that, before manipulating, storing, and
serving an image, we check if we already have a thumbnail with the
matching signature. If we do, we take advantage of a [cool
feature](http://markevans.github.io/dragonfly/v0.9.15/file.URLs.html#Overriding_responses)
of Dragonfly (and of Ruby) and `throw`^1^ a Rack response that redirects
to the existing asset which Dragonfly
[catches](https://github.com/markevans/dragonfly/blob/a6835d2a9a1195df840c643d6f24df88b1981c91/lib/dragonfly/server.rb#L55)
and returns to the user.
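If you haven\'t seen `throw`/`catch` before, here\'s a minimal, Dragonfly-free sketch of the mechanism (the values are arbitrary):

```ruby
# throw/catch is a control-flow tool, distinct from exception handling:
# `throw` unwinds the stack to the nearest matching `catch` block,
# which then returns the thrown value.
result = catch(:halt) do
  10.times do |i|
    throw :halt, i if i == 3
  end
  :finished # only reached if nothing was thrown
end

result # => 3
```

In Dragonfly\'s case, the thrown value is a complete Rack response triple, which the server catches and returns as-is.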
------------------------------------------------------------------------
So that\'s that: a bare minimum approach to storing and serving your
Dragonfly thumbnails without the risk of duplicates. Your app\'s needs
may vary slightly, but I think this serves as a better default than what
the docs recommend. Let me know if you have any suggestions for
improvement in the comments below.
*Dragonfly illustration courtesy of
[Vecteezy](https://www.vecteezy.com/vector-art/165467-free-insect-line-icon-vector).*
1. For more information on Ruby\'s `throw`/`catch` mechanism, [here is
a good explanation from *Programming
Ruby*](http://phrogz.net/ProgrammingRuby/tut_exceptions.html#catchandthrow)
or see chapter 4.7 of Avdi Grimm\'s [*Confident
Ruby*](https://pragprog.com/book/agcr/confident-ruby).

---
title: "Things About Which The Viget Devs Are Excited (May 2020 Edition)"
date: 2020-05-14T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/things-about-which-the-viget-devs-are-excited-may-2020-edition/
---
A couple months back, the Viget dev team convened in central Virginia to
reflect on the year and plan for the future. As part of the meeting, we
did a little show-and-tell, where everyone got the chance to talk about
a technology or resource that\'s attracted their interest. Needless to
say, *plans have changed*, but what hasn\'t changed are our collective
curiosity about nerdy things and our desire to share them with one
another and with you, internet person. So with that said, here\'s
what\'s got us excited in the world of programming, technology, and web
development.
[]{#annie}
## [Annie](https://www.viget.com/about/team/akiley) [\#](#annie "Direct link to Annie"){.anchor aria-label="Direct link to Annie"}
I'm excited about Wagtail CMS for Django projects. It provides a lot of
high-value content management features (hello permissions management and
photo cropping) so you don't need to reinvent the wheel, but it lets you
customize behavior when you need to. We've had two projects lately that
need a sophisticated CMS to manage data that we serve through an API,
and Wagtail has allowed us to get a solid CMS up and running and focus
on the business logic behind the API.
- <https://wagtail.io/>
[]{#chris-m}
## [Chris M.](https://www.viget.com/about/team/cmanning) [\#](#chris-m "Direct link to Chris M."){.anchor aria-label="Direct link to Chris M."}
Svelte is a component framework for building user interfaces. Its
purpose is similar to other frameworks like React and Vue, but I'm
excited about Svelte because of its differences.
For example, instead of a virtual DOM, Svelte compiles your component to
more performant imperative code. And Svelte is reactive, so updating a
variable is just that: there's no need to call `setState` or a similar
API to trigger updates. Differences like these offer less boilerplate
and better performance from the start.
The Svelte community is also working on Sapper, an application framework
for server-side rendering similar to Next.js.
- <https://svelte.dev/>
- <https://sapper.svelte.dev/>
[]{#danny}
## [Danny](https://www.viget.com/about/team/dbrown) [\#](#danny "Direct link to Danny"){.anchor aria-label="Direct link to Danny"}
I\'ve been researching the Golang MVC framework, Revel. At Viget, we
often use Ruby on Rails for any projects that need an MVC framework. I
enjoy programming in Go, so I started researching what they have to
offer in that department. Revel seems to have been created to mimic
Rails and other MVC frameworks, which made it very easy to pick up for a
Rails developer. The main differences stem from the distinctions
between Golang and Ruby.
Namely, the statically-typed, very explicit Go contrasts greatly with
Ruby when comparing similar features. For most cases, it takes more
lines of code in Go to achieve similar things, though that is not
necessarily a bad thing. Additionally, standard Go ships with an
incredible amount of useful packages, making complex features in
Go/Revel require fewer dependencies. Finally, Go is an incredibly fast
language. For very large projects, Go is often a great language to use,
so that you can harness its power. However, for smaller-scale projects,
it can be a bit overkill.
- <https://revel.github.io/>
[]{#david}
## [David](https://www.viget.com/about/team/deisinger) [\#](#david "Direct link to David"){.anchor aria-label="Direct link to David"}
I'm excited about [Manjaro Linux running the i3 tiling window
manager](https://manjaro.org/download/community/i3/). I picked up an old
Thinkpad and installed this combo, and I've been impressed with how fun
and usable it is, and how well it runs on a circa-2008 computer. For
terminal-focused workflows, this setup is good as hell. Granted, it's
Linux, so there's still a fair bit of fiddling required to get things
working exactly as you'd like, but for a hobbyist OS nerd like me,
that's all part of the fun.
[]{#doug}
## [Doug](https://www.viget.com/about/team/davery) [\#](#doug "Direct link to Doug"){.anchor aria-label="Direct link to Doug"}
The improvements to iOS Machine Learning have been exciting --- it's
easier than ever to build iOS apps that can recognize speech, identify
objects, and even train themselves on-device without needing a network.
The Create ML tool simplifies the model training flow, allowing any iOS
dev to jump in with very little experience.
ML unlocks powerful capabilities for mobile apps, allowing apps to
recognize and act on data the way a user would. It's a fascinating area
of native app dev, and one we'll see a lot more of in apps over the next
few years.
- <https://developer.apple.com/machine-learning/core-ml/>
- <https://developer.apple.com/videos/play/wwdc2018/703>
- <https://developer.apple.com/documentation/createml/creating_an_image_classifier_model>
[]{#dylan}
## [Dylan](https://www.viget.com/about/team/dlederle-ensign) [\#](#dylan "Direct link to Dylan"){.anchor aria-label="Direct link to Dylan"}
I\'ve been diving into LiveView, a new library for the Elixir web
framework, Phoenix. It enables the sort of fluid, realtime interfaces
we\'d normally make with a Javascript framework like React, without
writing JavaScript by hand. Instead, the logic stays on the server and
the LiveView.js library is responsible for updating the DOM when state
changes. It\'s a cool new approach that could be a nice option in
between static server rendered pages and a full single page app
framework.
- <https://www.viget.com/articles/what-is-phoenix-liveview/>
- <https://blog.appsignal.com/2019/06/18/elixir-alchemy-building-go-with-phoenix-live-view.html>
[]{#eli}
## [Eli](https://www.viget.com/about/team/efatsi) [\#](#eli "Direct link to Eli"){.anchor aria-label="Direct link to Eli"}
I've been building a "Connected Chessboard" off and on for the last 3
years with my brother. There's a lot of fun stuff on the firmware side
of things, using OO concepts to organize move detection using analog
light sensors. But the coolest thing I recently learned was how to make
use of [analog multiplexers](https://www.sparkfun.com/products/13906).
With three input pins, you provide a binary representation of a number
0-7, and the multiplexer then performs magic to pipe input/output data
through one of 8 pins. By linking 8 of these together, and then a 9th
multiplexer on top of those (thanks chessboard for being an 8x8 square),
I can take 64 analog readings using only 7 IO pins. #how-neat-is-that
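The select-line encoding is simple enough to sketch in a few lines of Ruby (illustrative only; real firmware would write these values out to GPIO pins):

```ruby
# An 8-channel analog multiplexer routes one of channels 0-7 based on
# the binary value presented on its three select pins (S0, S1, S2).
def select_pin_values(channel)
  raise ArgumentError, "channel must be 0-7" unless (0..7).cover?(channel)
  (0..2).map { |bit| channel[bit] } # Integer#[] reads the bit at that index
end

select_pin_values(5) # => [1, 0, 1], i.e. S0=1, S1=0, S2=1 (5 = 0b101)
```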
[]{#joe}
## [Joe](https://www.viget.com/about/team/jjackson) [\#](#joe "Direct link to Joe"){.anchor aria-label="Direct link to Joe"}
I\'m a self-taught developer and I\'ve explored and been interested in
some foundational topics in CS, like boolean logic, assembly/machine
code, and compiler design. This book, [The Elements of Computing
Systems: Building a Modern Computer from First
Principles](https://www.amazon.com/Elements-Computing-Systems-Building-Principles/dp/0262640686/ref=ed_oe_p),
and its [companion website](https://www.nand2tetris.org/) is a great
resource that gives you enough depth in everything from circuit design,
to compiler design.
[]{#margaret}
## [Margaret](https://www.viget.com/about/team/mwilliford) [\#](#margaret "Direct link to Margaret"){.anchor aria-label="Direct link to Margaret"}
I've enjoyed working with Administrate, a lightweight Rails engine built
by Thoughtbot that helps you put together an admin dashboard. It solves
the same problem as Active Admin, but without a custom DSL. It's easy to
quickly throw up an admin interface for resource CRUD, but anything
beyond your most simple use case will require going custom. The
documentation is straightforward and sufficient, and lays out how to go
about customizing and building on top of the existing framework. The
source code is available on Github and easy to follow. I haven't tried
it with a large scale application, but for getting something small-ish
up and running quickly, it's a great option.
[]{#shaan}
## [Shaan](https://www.viget.com/about/team/ssavarirayan) [\#](#shaan "Direct link to Shaan"){.anchor aria-label="Direct link to Shaan"}
I\'m excited about Particle\'s embedded IoT development platform. We
built almost all of our hardware projects using Particle\'s stack, and
there\'s a good reason for it. They sell microcontrollers that come
out of the box with WiFi and Bluetooth connectivity built in. They make it
incredibly easy to build connected devices, by allowing you to expose
functions on your device to the web through their API. Your web app can
then make calls to your device to either trigger functionality or get
data. It\'s really easy to manage multiple devices and they make remote
deployment of your device (setting up WiFi, etc.) a piece of cake.
- <https://docs.particle.io/quickstart/photon/>
[]{#sol}
## [Sol](https://www.viget.com/about/team/shawk) [\#](#sol "Direct link to Sol"){.anchor aria-label="Direct link to Sol"}
I'm excited about old things that are still really good. It's easy to
get lost in the hype of the new and shiny, but our industry has a long
history and many common problems have been solved over and over by
people smarter than you or I. One of those problems is running tasks
intelligently based on changed files. The battle-tested tool that has
solved this problem for decades is
[Make](https://www.gnu.org/software/make/manual/make.html). Even if you
aren't leveraging the intelligence around caching intermediary targets,
you can use Make for project-specific aliases to save yourself and your
coworkers some brain cycles remembering and typing out common commands
(e.g. across a number of other tools used for a given project).
TL;DR Make is old and still great.
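As a sketch, a Makefile used purely for project-specific aliases (ignoring Make's dependency tracking entirely) might look like this; the targets are hypothetical:

```make
# These targets are commands, not files to build, so mark them phony.
.PHONY: setup server test

setup:    ## install project dependencies
	bundle install

server:   ## run the app locally
	bundle exec rails server

test:     ## run the test suite
	bundle exec rspec
```

Then `make setup` or `make test` works the same for everyone on the project, whatever tools they wrap underneath.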
------------------------------------------------------------------------
So there it is, some cool tech from your friendly Viget dev team. Hope
you found something worth exploring further, and if you like technology
and camaraderie, [we\'re always looking for great, nerdy
folks](https://www.viget.com/careers/).

---
title: "Three Magical Git Aliases"
date: 2012-04-25T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/three-magical-git-aliases/
---
Git is an enormously powerful tool, but certainly not the most
beginner-friendly. The basic commands are straightforward enough, but
until you wrap your head around its internal model, it's easy to wind up
in a jumble of merge commits or worse. Here are three aliases I use as
part of my daily workflow that help me avoid many of the common
pitfalls.
### GPP (`git pull --rebase && git push`)
**I can't push without pulling, and I can't pull without rebasing.** I'm
not sure this is still a point of debate, but if so, let me make my side
known: I hate hate *hate* merge commits. And of course, what does Git
tell you after an unsuccessful push?
    Merge the remote changes (e.g. 'git pull') before pushing again.
This will create a merge commit, regardless of whether there are any
conflicts between your changes and the remote. There are ways to prevent
these merge commits [at the configuration
level](https://viget.com/extend/only-you-can-prevent-git-merge-commits),
but they aren't foolproof. This alias is.
### GMF (`git merge --ff-only`)
**I can't create merge commits.** Similar to the last, this alias
prevents me from ever creating merge commits. I do my work in a topic
branch, and when the time comes to merge it back to the mainline
development branch, I check that branch out and pull down the latest
changes. At this point, if I were to type `git merge [branchname]`, I'd
create a merge commit.
Using this alias, though, the merge fails and I receive a warning that
this is not a [fast-forward
merge](https://365git.tumblr.com/post/504140728/fast-forward-merge). I
then check out my topic branch, rebase master, and then run the merge
successfully.
### GAP (`git add --patch`)
**I can't commit a code change without looking at it first.** Running
this command rather than `git add .` or using a commit flag lets me view
individual changes and decide whether or not I want to stage them. This
forces me to give everything I'm committing a final check and ensure
there isn't any undesirable code. It also allows me to break a set of
changes up into multiple commits, even if those changes are in the same
file.
What `git add --patch` doesn't do is stage new files, so you'll have to
add those by hand once you're done patching.
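For reference, the three aliases can be defined in your shell config roughly like so (the file location is up to you):

```shell
# e.g. in ~/.aliases, sourced from your shell rc file
alias gpp='git pull --rebase && git push'  # pull with rebase, then push
alias gmf='git merge --ff-only'            # refuse non-fast-forward merges
alias gap='git add --patch'                # review each hunk before staging
```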
------------------------------------------------------------------------
Hope you find one or more of these aliases helpful. These *and more!*
available in my
[dotfiles](https://github.com/dce/dotfiles/blob/master/.aliases#L7).

---
title: "Unfuddle User Feedback"
date: 2009-06-02T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/unfuddle-user-feedback/
---
Recently, we wanted a better system for managing feedback from
[SpeakerRate](http://speakerrate.com/) users. While we do receive some
general site suggestions, most of the feedback we get involves discrete
corrections to data (a speaker who has been entered into the system
twice, for example). We started to create a simple admin interface for
managing these requests, when we realized that the ticket tracking
system we use internally, [Unfuddle](http://unfuddle.com/), already has
all the features we need.
Fortunately, Unfuddle has a full-featured
[API](http://unfuddle.com/docs/api), so programmatically creating tickets
is simply a matter of adding
[HTTParty](http://railstips.org/2008/7/29/it-s-an-httparty-and-everyone-is-invited)
to our `Feedback` model:
``` {#code .ruby}
class Feedback < ActiveRecord::Base
  include HTTParty
  base_uri "viget.unfuddle.com/projects/#{UNFUDDLE[:project]}"

  validates_presence_of :description

  after_create :post_to_unfuddle, :if => proc { Rails.env == "production" }

  def post_to_unfuddle
    self.class.post("/tickets.xml",
                    :basic_auth => UNFUDDLE[:auth],
                    :query      => { :ticket => ticket })
  end

  private

  def ticket
    returning(Hash.new) do |ticket|
      ticket[:summary]      = "#{self.topic}"
      ticket[:description]  = "#{self.name} (#{self.email}) - #{self.created_at}:\n\n#{self.description}"
      ticket[:milestone_id] = UNFUDDLE[:milestone]
      ticket[:priority]     = 3
    end
  end
end
```
We store our Unfuddle configuration in
`config/initializers/unfuddle.rb`:
``` {#code .ruby}
UNFUDDLE = {
  :project   => 12345,
  :milestone => 12345, # the 'feedback' milestone
  :auth      => { :username => "username", :password => "password" }
}
```
Put your user feedback into Unfuddle, and you get all of its features:
email notification, bulk ticket updates, commenting, file attachments,
etc. This technique isn't meant to replace customer-service oriented
software like [Get Satisfaction](http://getsatisfaction.com/) (we're
using both on SpeakerRate), and if you're not already using a ticketing
system to manage your project, this is probably overkill; something like
[Lighthouse](http://lighthouseapp.com/) or [GitHub
Issues](https://github.com/blog/411-github-issue-tracker) would better
suit your needs, and both have APIs if you want to set up a similar
system. But for us here at Viget, who manage all aspects of our projects
through a ticketing system, seeing actionable user feedback in the same
place as the rest of our tasks has been extremely convenient.

---
title: "Using Microcosm Presenters to Manage Complex Features"
date: 2017-06-14T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/using-microcosm-presenters-to-manage-complex-features/
---
We made [Microcosm](http://code.viget.com/microcosm/) to help us manage
state and data flow in our JavaScript applications. We think it\'s
pretty great. We recently used it to help our friends at
[iContact](https://www.icontact.com/) launch a [brand new email
editor](https://www.icontact.com/big-news). Today, I\'d like to show you
how I used one of my favorite features of Microcosm to ship a
particularly gnarly feature.
In addition to adding text, photos, and buttons to their emails, users
can add *code blocks* which let them manually enter HTML to be inserted
into the email. The feature in question was to add server-side code
sanitization, to make sure user-submitted HTML isn\'t invalid or
potentially malicious. The logic is roughly defined as follows:
- User modifies the HTML & hits \"preview\";
- HTML is sent up to the server and sanitized;
- The resulting HTML is displayed in the canvas;
- If the code is unmodified, user can \"apply\" the code or continue
editing;
- If the code is modified, user can \"apply\" the modified code or
\"reject\" the changes and continue editing;
- If at any time the user unfocuses the block, the code should return
to the last applied state.
Here\'s a flowchart that might make things clearer (did for me, in any
event):
![](http://i.imgur.com/URfAcl9.png)
This feature is too complex to handle with React component state, but
too localized to store in application state (the main Microcosm
instance). Fortunately, Microcosm gives us the perfect tool to handle
this scenario:
[Presenters](http://code.viget.com/microcosm/api/Presenter.html).
Using a Presenter, we can build an app-within-an-app, with a unique
domain, actions, and state, and communicate with the main repository as
necessary.
First, we define some
[Actions](http://code.viget.com/microcosm/api/actions.html) that only
pertain to this Presenter:
``` {.code-block .line-numbers}
const changeInputHtml = html => html
const acceptChanges = () => {}
const rejectChanges = () => {}
```
We don\'t export these functions, so they only exist in the context of
this file.
Next, we\'ll define the Presenter itself:
``` {.code-block .line-numbers}
class CodeEditor extends Presenter {
setup(repo, props) {
repo.addDomain('html', {
getInitialState() {
return {
originalHtml: props.block.attributes.htmlCode,
inputHtml: props.block.attributes.htmlCode,
unsafeHtml: null,
status: 'start'
}
},
```
The `setup` function is invoked when the Presenter is created. It
receives a fork of the main Microcosm repo as its first argument. We
invoke the
[`addDomain`](http://code.viget.com/microcosm/api/microcosm.html#adddomainkey-config-options)
function to add a new domain to the forked repo. The main repo will
never know about this new bit of state.
Now, let\'s instruct our new domain to listen for some actions:
``` {.code-block .line-numbers}
register() {
return {
[scrubHtml]: this.scrubSuccess,
[changeInputHtml]: this.inputHtmlChanged,
[acceptChanges]: this.changesAccepted,
[rejectChanges]: this.changesRejected
}
},
```
The
[`register`](http://code.viget.com/microcosm/api/domains.html#register)
method defines the mapping of Actions to handler functions. You should
recognize those actions from the top of the file, minus `scrubHtml`,
which is defined in a separate API module.
Now, still inside the domain object, let\'s define some handlers:
``` {.code-block .line-numbers}
inputHtmlChanged(state, inputHtml) {
let status = inputHtml === state.originalHtml ? 'start' : 'changed'
return { ...state, inputHtml, status }
},
scrubSuccess(state, { html, modified }) {
if (modified) {
return {
...state,
status: 'modified',
unsafeHtml: state.inputHtml,
inputHtml: html
}
} else {
return { ...state, status: 'validated' }
}
},
```
Handlers always take `state` as their first argument and must return a new
state object. Now, let\'s add some more methods to our main `CodeEditor`
class.
``` {.code-block .line-numbers}
renderPreview = ({ html }) => {
this.send(updateBlock, this.props.block.id, {
attributes: { htmlCode: html }
})
}
componentWillUnmount() {
this.send(updateBlock, this.props.block.id, {
attributes: { htmlCode: this.repo.state.html.originalHtml }
})
}
```
Couple cool things going on here. The `renderPreview` function uses
[`this.send`](http://code.viget.com/microcosm/api/presenter.html#sendaction-...params)
to send an action to the main Microcosm instance, telling it to update
the canvas with the given HTML. And `componentWillUnmount` is noteworthy
in that it demonstrates that Presenters are just React components under
the hood.
Next, let\'s add some buttons to let the user trigger these actions.
``` {.code-block .line-numbers}
buttons(status, html) {
switch (status) {
case 'changed':
return (
<div styleName="buttons">
<ActionButton
action={scrubHtml}
value={html}
onDone={this.renderPreview}
>
Preview changes
</ActionButton>
</div>
)
case 'validated':
return (
<div styleName="buttons">
<ActionButton action={acceptChanges}>
Apply changes
</ActionButton>
</div>
)
// ...
```
The
[ActionButton](http://code.viget.com/microcosm/api/action-button.html)
component is pretty much exactly what it says on the tin --- a button
that triggers an action when pressed. Its callback functionality (e.g.
`onOpen`, `onDone`) lets you update the button as the action moves
through its lifecycle.
Finally, let\'s bring it all home and create our model and view:
``` {.code-block .line-numbers}
getModel() {
return {
status: state => state.html.status,
inputHtml: state => state.html.inputHtml
}
}
render() {
const { status, inputHtml } = this.model
const { name } = this.props
return (
<div>
{this.buttons(status, inputHtml)}
<textarea
id={name}
name={name}
value={inputHtml}
onChange={e => this.repo.push(changeInputHtml, e.target.value)}
disabled={status === 'modified'}
styleName="textarea"
/>
</div>
)
}
}
```
The
[docs](http://code.viget.com/microcosm/api/presenter.html#getmodelprops-state)
explain `getModel` better than I can:
> `getModel` assigns a model property to the presenter, similarly to
> `props` or `state`. It is recalculated whenever the Presenter's
> `props` or `state` changes, and functions returned from model keys are
> invoked every time the repo changes.
The `render` method is pretty straightahead React, though it
demonstrates how you interact with the model.
------------------------------------------------------------------------
The big takeaways here:
**Presenters can have their own repos.** These can be defined inline (as
I\'ve done) or in a separate file/object. I like seeing everything in
one place, but you can trot your own trot.
**Presenters can manage their own state.** Presenters receive a fork of
the main app state when they\'re instantiated, and changes to that state
(e.g. via an associated domain) are not automatically synced back to the
main repo.
**Presenters can use `send` to communicate with the main repository.**
Despite holding a fork of state, you can still use `this.send` (as we do
in `renderPreview` above) to push changes up the chain.
**Presenters can have their own actions.** The three actions defined at
the top of the file only exist in the context of this file, which is
exactly what we want, since that\'s the only place they make any sense.
**Presenters are just React components.** Despite all this cool stuff
we\'re able to do in a Presenter, under the covers, they\'re nothing but
React components. This way you can still take advantage of lifecycle
methods like `componentWillUnmount` (and `render`, natch).
------------------------------------------------------------------------
So those are Microcosm Presenters. We think they\'re pretty cool, and
hope you do, too. If you have any questions, hit us up on
[GitHub](https://github.com/vigetlabs/microcosm) or right down there.

---
title: "Viget Devs Storm Chicago"
date: 2009-09-15T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/viget-devs-storm-chicago/
---
[![](http://farm1.static.flickr.com/28/53100874_f605bd5f42_m.jpg){align="right"}](http://www.flickr.com/photos/laffy4k/53100874/)
This past weekend, Ben and I travelled to Chicago to speak at [Windy
City Rails](http://windycityrails.org/). It was a great conference;
highlights included [Dean Wampler](http://www.deanwampler.com/)'s
discussion of functional programming in Ruby, [Noel
Rappin](http://www.railsprescriptions.com/)'s talk on handling
difficult-to-test portions of your applications, and our own [Ben
Scofield](https://www.viget.com/about/team/bscofield)'s very polished
presentation about modeling difficult domains. This was the second time
I'd seen [Yehuda Katz](http://yehudakatz.com/)'s keynote on the history
and future of the Rails framework, and that the talk has come so far in
the last three months says promising things for the progress of Rails 3.
Speaking at Windy City Rails was just as much an excuse to hang out in
Chicago as it was a chance to share knowledge with the Ruby community.
We knocked out the major [food](http://www.portillos.com/)
[groups](http://www.loumalnatis.com/) on the first day, and I set aside
a few days after the conference to wander the city. Big thanks to the
conference organizers and attendees. Videos of all of the talks will be
appearing on the [conference website](http://windycityrails.org/) soon.

---
title: "What's in a Word? Building a Verbose Party Game"
date: 2023-05-25T00:00:00+00:00
draft: false
canonical_url: https://www.viget.com/articles/whats-in-a-word-building-a-verbose-party-game/
---
Fun group party game. Somehow incorporate AI. Build it quickly. These
were the constraints we put on our mostly Colorado-based Pointless Corp.
team before [Pointless
Palooza](https://www.viget.com/articles/the-enduring-point-of-pointless-corp/).
A few of us wanted to partake in the old Pointless tradition of trying
out another role on the project, so simplicity was key. Thankfully,
Haley was itching to build a digital version of a very fun, but
relatively straightforward, game.
[Just One](https://boardgamegeek.com/boardgame/254640/just-one) is a
collaborative word association game that encourages both teamwork and
creativity. On a given turn, everyone except one player (the Guesser)
knows a random mystery word and must attempt to get the Guesser to guess
the mystery word via one-word hints. The catch is that every hint must
be unique, so duplicate hints are eliminated and not shown to the
Guesser.
Haley mentioned that she often hacked the board game to accommodate more
than 7 players and thought that doing so added a fun element of chaos to
the rounds. A digital version of the game would facilitate expanding the
party and it seemed like an easy enough lift for our team.
It's easier to play than explain, so mosey on over to
[verbose.club](https://verbose.club) and give it a try. And, if you want
to know more about how each of us fared going heads down on one project
for 48 hours (and counting), read on.
![image](662shots_so-1.png)
## [**Haley**](https://www.viget.com/about/team/hjohnson/) **\| Pointless Role: Design \| Day Job: PM** {#haley-pointless-role-design-day-job-pm dir="ltr"}
**My favorite part of building verbose.club** was being granted
permission to focus on one project with my teammates. We hopped on Meets
or huddles to discuss requirements. Nicole and I jammed in the same
Figma file or wireframe. I got to drop everything to QA a 600-word word
bank. Viget has great ways of collaborating remotely, but it was also
fun to be part of the in-office crew, having late night snacks between
cram sessions like we were in college again.
**Something I learned**: I tried my hand at being a "designer" and
learned quickly that nothing is too precious. Sometimes the code
translates things differently. Also, design systems are essential to
throwing together screens quickly. And Figma has tons of libraries that
you can use instead of starting from scratch!
------------------------------------------------------------------------
## [**Haroon**](https://www.viget.com/about/team/hmatties/) **\| Pointless Role: Dev \| Day Job: Product Design** {#haroon-pointless-role-dev-day-job-product-design dir="ltr"}
**My favorite part of building verbose.club** was stepping into a new
role, or at least trying to. I got a chance to build out styled
components and pages for our game with React, Typescript, and Tailwind.
Though my constant questions for Andrew and David were likely annoying,
it was an extremely rewarding experience to see a project come to life
from another perspective.
**Something I learned** is that it's best to keep commits atomic,
meaning contributions to the codebase are small, isolated, and clear.
Though a best practice for many, this approach made it easier for me as
a novice to contribute quickly, and likely made it easier for Andrew to
fix things later.
------------------------------------------------------------------------
## [**Nicole**](https://www.viget.com/about/team/nrymarz/) **\| Pointless Role: Design \| Day Job: PM** {#nicole-pointless-role-design-day-job-pm dir="ltr"}
**My favorite part of building verbose.club** was seeing our team
immediately dive in with a "we're in this together" approach. I am still
relatively new to Viget and it was my first time working with a handful
of my teammates, so I really appreciated the collaborative environment
and how everyone was really open to hearing new ideas, trying something
new, and working together to make something cool.
**Something I learned** was how to use [Whimsical](http://whimsical.com)
and [Figma](http://figma.com) to make wireframes and designs. I had used
these tools before, but it was my first time "building" anything at
Viget --- and it was super fun. I'm glad I got to try something outside
of my usual PM role.
------------------------------------------------------------------------
## [**Andrew**](https://www.viget.com/about/team/athomas/) **\| Pointless Role: CEO Dev \| Day Job: Dev**
**My favorite part of building verbose.club** was coordinating work
among my teammates. With less than 3 days to do everything, we had to
hit the ground running. To start, our PMs and designer jumped on
wireframing and design thinking while David brought to life a Rails
back-end & API. I was in a good spot to act as the messenger between
these two lines of work: parsing high-level thinking & decisions from
the designers into information useful to David in crafting the API; then
shuttling David's feedback back to the broader team.
Next up, it was time to build out the user-facing Remix app. We were
able to split this work into two parallel streams. I built out unstyled
screens, game business logic, and the necessary glue code, resulting in
a functional app with a barebones UI. In parallel, Haroon built out
high-fidelity screens, using static placeholders where dynamic content
would eventually live. From there, collaborating to merge our work and
upgrade the UI was a simple task.
I think our team came out of this project inspired by what can be
accomplished with smart coordination. We managed to efficiently build
consensus, parallelize work, and learn from one another --- and the
result speaks for itself.
**Something I learned** was that parallelizing a project is tricky and
communication is key. When multiple streams of work are progressing in
parallel, decisions are constantly being made --- some big, some small
--- and many having more influence than immediately apparent. Keeping
all contributors 100% in the loop is likely impossible, and certainly
unrealistic. Smart coordination and communication is the magic sauce
that makes it all work.
------------------------------------------------------------------------
## [**David**](https://www.viget.com/about/team/deisinger/) **\| Pointless Role: Dev \| Day Job: Dev** {#david-pointless-role-dev-day-job-dev dir="ltr"}
**My favorite part of working on verbose.club** was helping from afar. I
was 1,500 miles and several time zones away from most of the team, so I
really focused on doing low-level work that would enable the rest of the
team to be successful: setting up an API, getting the app running on
Docker for easy local development, and making it straightforward for
others to deploy code. It was sort of like being the bass player in a
rad band.
**Something I learned** is that [Caddy](https://caddyserver.com) is
super legit. Here's the entire web server config file, which
automatically sets up HTTPS and proxies traffic to our Remix app:
    verbose.club
        reverse_proxy remix-prod:3001
Our overall architecture (running with `docker compose`) looks like
this:
![image](verbose-arch.png)
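In compose terms, that architecture might look roughly like the following. This is a hypothetical sketch, not our production file: the service names and the Remix port are inferred from the Caddyfile above, and everything else is a guess.

```yaml
# Hypothetical sketch -- service names/ports inferred from the Caddyfile
services:
  caddy:
    image: caddy:2
    ports: ["80:80", "443:443"]
    volumes:
      - ./Caddyfile:/etc/caddy/Caddyfile
  remix-prod:
    build: .
    expose:
      - "3001"
```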
------------------------------------------------------------------------
In two days' time, we had a beautiful, functioning game that we played
during our Pointless celebration happy hour. Since then, we've added
some cool animations and the ability to pull in AI players --- no human
friends required! So, grab some friends (robot or otherwise) and check
out [verbose.club](https://verbose.club)!

View File

@@ -0,0 +1,69 @@
---
title: "“What's new since the last deploy?”"
date: 2014-03-11T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/whats-new-since-the-last-deploy/
---
Managing deployments is one of the trickier aspects of creating software
for the web. Several times a week, a project manager will ask the dev
team something to the effect of "what's new since the last deploy?" --
if we did a deploy right now, what commits would that include?
Fortunately, the tooling around this stuff has never been better (as Tim
Bray says, ["These are the good old
days."](https://www.tbray.org/ongoing/When/201x/2014/01/01/Software-in-2014#p-8)).
Easy enough to pull this info via command line and paste a list into
Campfire, but if you're using GitHub and Capistrano, here's a nifty way
to see this information on the website without bothering the team. As
the saying goes, teach a man to `fetch` and whatever shut up.
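The command-line version, for reference, is something like this. The only real assumption is the `production-deploy-*` tag naming scheme that the Capistrano recipe sets up; adjust the pattern to match your own tags.

```shell
# Find the most recent production deploy tag, then list every commit since it
last_tag=$(git tag --list 'production-deploy-*' --sort=-creatordate | head -n 1)
git log --oneline "${last_tag}..HEAD"
```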
## Tag deploys with Capistrano {#tagdeployswithcapistrano}
The first step is to tag each deploy. Drop this recipe in your
`config/deploy.rb` ([original
source](http://wendbaar.nl/blog/2010/04/automagically-tagging-releases-in-github/)):
    namespace :git do
      task :push_deploy_tag do
        user = `git config --get user.name`.chomp
        email = `git config --get user.email`.chomp

        puts `git tag #{stage}-deploy-#{release_name} #{current_revision} -m "Deployed by #{user} <#{email}>"`
        puts `git push --tags origin`
      end
    end
Then throw an `after 'deploy:restart', 'git:push_deploy_tag'` into the
appropriate deploy environment files. Note that this task works with
Capistrano version 2 with the
[capistrano-ext](https://rubygems.org/gems/capistrano-ext) library. For
Cap 3, check out [this
gist](https://gist.github.com/zporter/3e70b74ce4fe9b8a17bd) from
[Zachary](https://viget.com/about/team/zporter).
## GitHub Tag Interface {#githubtaginterface}
Now that you're tagging the head commit of each deploy, you can take
advantage of an (as far as I can tell) unadvertised GitHub feature: the
tags interface. Simply visit (or have your PM visit)
`github.com/<organization>/<repo>/tags` (e.g.
<https://github.com/rails/rails/tags>) to see a list of tags in reverse
chronological order. From here, they can click the most recent tag
(`production-deploy-2014...`), and then the link that says "\[N\]
commits to master since this tag" to see everything that would go out in
a new deploy. Or if you're more of a visual learner, here's a gif for
great justice:
![](http://i.imgur.com/GeKYwA5.gif)
------------------------------------------------------------------------
This approach assumes a very basic development and deployment model,
where deploys are happening straight from the same branch that features
are being merged into. As projects grow more complex, [so must your
deployment
model](https://viget.com/advance/successful-release-management-and-how-to-communicate-about-it).
Automatically tagging deploys as we've outlined here breaks down under
more complex systems, but the GitHub tag interface continues to provide
value if you're tagging your deploys in any manner.

View File

@@ -0,0 +1,161 @@
---
title: "Why I Still Like Ruby (and a Few Things I Don't Like)"
date: 2020-08-06T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/why-i-still-like-ruby-and-a-few-things-i-dont-like/
---
*(Illustration by
[roseannepage](https://www.deviantart.com/roseannepage/art/Groudon-Seat-500169718))*
The Stack Overflow [2020 Developer
Survey](https://insights.stackoverflow.com/survey/2020#technology-most-loved-dreaded-and-wanted-languages-loved)
came out a couple months back, and while I don't put a ton of stock in
surveys like this, I was surprised to see Ruby fare so poorly --
most notably its rank on the "most dreaded" list. Again, who cares,
right? But it did make me take a step back and try to take an honest
assessment of Ruby's pros and cons, as someone who's been using Ruby
professionally for 13 years but loves playing around with other
languages and paradigms. First off, some things I really like.
### It's a great scripting language
Matz's original goal in creating Ruby was to build a truly
object-oriented scripting language[^1^](#fn1){#fnref1}, and that's my
favorite use of the language: simple, reusable programs that automate
repetitive tasks. It has fantastic regex and unix support (check out
[`Open3`](https://docs.ruby-lang.org/en/2.0.0/Open3.html) as an
example). I might not always build Ruby apps, but I'll probably always
reach for it for scripting and glue code.
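As a sketch of what that glue code tends to look like (the `echo` command here is just an illustration), `Open3` hands you stdout, stderr, and exit status in a single call:

```ruby
require "open3"

# Capture stdout, stderr, and exit status separately. Arguments are passed
# directly to the command rather than through a shell, so no quoting woes.
stdout, stderr, status = Open3.capture3("echo", "hello world")
raise stderr unless status.success?

# ...and regexes are right at hand for quick text munging
puts stdout.scan(/\w+/).map(&:capitalize).join(" ")  # prints "Hello World"
```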
### A class is a little program
This one took me a while to get my head around, and I'm not sure I'll do
a perfect job explaining it, but when a Ruby class gets loaded, it's
evaluated as a series of expressions from top-to-bottom, like a normal
program. This means you can, for example, define a class method and then
turn around and call it in the class' top-level source; in fact, this is
how things like `has_many` in Rails work -- you're just calling a class
method defined in the parent class. In other languages, you'd have to
reach for something like macros to accomplish this same functionality.
[Here's an alternate
explanation](https://mufid.github.io/blog/2016/ruby-class-evaluation/)
of what I'm getting at and [here's a cool
post](https://dev.to/baweaver/decorating-ruby-part-two-method-added-decoration-48mj)
that illustrates what sort of power this unlocks.
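Here's a stripped-down version of the trick, with class and method names invented for the example. Define a class method, call it bare in a subclass body, and it runs at load time -- the same mechanism behind `has_many`:

```ruby
class Model
  # A "macro" is just a class method that generates instance methods
  def self.attribute(name)
    define_method(name)       { instance_variable_get("@#{name}") }
    define_method("#{name}=") { |value| instance_variable_set("@#{name}", value) }
  end
end

class Employee < Model
  # The class body is evaluated top-to-bottom, so this is an ordinary
  # method call executed at load time
  attribute :name
end

employee = Employee.new
employee.name = "Ada"
puts employee.name  # prints "Ada"
```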
### The community is open source-focused
Ruby has a rich ecosystem of third-party code that Viget both benefits
from and contributes to, and with a few notable
exceptions[^2^](#fn2){#fnref2}, it's all made available without the
expectation of direct profit. This means that you can pull a library
into your codebase and not have to worry about the funding status of the
company that built it (thinking specifically of things like
[Gatsby](https://www.gatsbyjs.org/) and [Strapi](https://strapi.io/)).
Granted, with time, money, and a dedicated staff, the potential is there
to build better open source products than what small teams can do in
their free time, but in my experience, open source development and the
profit motive tend not to mix well.
### Bundler is good
It's simple, universal, and works well, and it makes it tough to get
into other languages that haven't figured this stuff out.
### It has a nice aesthetic
It's easy to make code that looks good (at least to my eye) and is easy
to understand. There's less temptation to spend a lot of time polishing
code the way I've experienced with some functional languages.
## And some things I don't like as much
Lest ye think I'm some diehard Ruby fan, I've got some gripes, as well.
### It's not universal
As I said, my favorite use of Ruby is as a scripting language, so it's
unfortunate that it doesn't come installed by default on most unix-y
systems, unlike Perl, Python, and Bash. If you want to share some Ruby
code with someone who isn't already a Ruby dev, you have to talk about,
like, asdf, rbenv, or Docker first.
### Functions aren't really first-class
You can write code in an FP style in Ruby, but there's a difference
between that and what you get in a truly functional language. I guess
the biggest difference is that a method and a lambda/block (I know
they're [a little
different](https://yehudakatz.com/2012/01/10/javascript-needs-blocks/)
don't @ me) are distinct things, and the block/yield syntax, while nice,
isn't as nice as just passing functions around. I wish I could just do:
    square = -> (x) { x * x }
    [1, 2, 3].map(square)
Or even!
    [1, 2, 3].map(@object.square)
(Where `@object.square` gives me the handle to a function that then gets
passed each item in the array. I recognize this is incompatible with
optional parentheses but let me dream.)
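For what it's worth, you can get most of the way there today with `&` and `Object#method`, it just takes an extra hop of ceremony (the `Numbers` class is made up for the example):

```ruby
square = ->(x) { x * x }

# A lambda has to be converted to a block explicitly with &
p [1, 2, 3].map(&square)  # prints [1, 4, 9]

class Numbers
  def square(x)
    x * x
  end
end

# Object#method returns a bound Method object -- a handle you can pass
# around, which & also converts to a block
handle = Numbers.new.method(:square)
p [1, 2, 3].map(&handle)  # prints [1, 4, 9]
```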
### It is probably too flexible
Just like skiing, the most dangerous time to be a Ruby developer is the
"early intermediate" phase -- you've learned the syntax and language
features, and all of a sudden EVERYTHING is possible. Want to craft the
perfect DSL? Do it. Want to redefine what `+` does for your Integer
subclass? Do it. Want to open up a third-party library and inject a
custom header? You get my point.
As I've said, Ruby makes it easy to write nice-looking code, but it
takes restraint (and mistakes) to write maintainable code. I suppose the
same could be said about the programming discipline in general, but I
can see the appeal of simpler languages like Go.
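To make that concrete, here's the kind of thing Ruby will happily let you do -- none of this is an error, which is exactly why restraint matters:

```ruby
# Reopen a core class and bolt on whatever you like
class String
  def shout
    upcase + "!"
  end
end

# Every String in the entire process now has this method
puts "ruby is flexible".shout  # prints "RUBY IS FLEXIBLE!"
```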
### Type checking is cool and Ruby doesn't have it
The first languages I learned were C++ and Java, which formed my
opinions of explicit typing and compilation and made Ruby such a
revelation, but a lot has changed in the subsequent decade, and modern
typed languages are awesome. It'd be neat to be able to compile a
project and have some level of confidence about its correctness before
running the test suite. That said, I sure do appreciate the readability
of things like [RSpec](https://rspec.info/) that rely on the
dynamic/message-passing nature of Ruby. Hard to imagine writing
something as nice as this in, like, Haskell:
    it { is_expected.not_to allow_values("Landlord", "Tenant").for(:client_type) }
(As I was putting this post together, I became aware of a lot of
movement in the "typed Ruby" space, so we'll see where that goes. Check
out
[RBS](https://developer.squareup.com/blog/the-state-of-ruby-3-typing/)
and [Sorbet](https://sorbet.org/) for more info.)
------------------------------------------------------------------------
So those are my thoughts. In the end, it's probably best to know several
languages well in order to really be able to understand
strengths/weaknesses and pick the appropriate one for the task at hand.
If you're interested in other thoughts along these lines, you could
check out [this Reddit
thread](https://www.reddit.com/r/ruby/comments/hpta1o/i_am_tired_of_hearing_that_ruby_is_fine/)
(and [this
comment](https://www.reddit.com/r/ruby/comments/hpta1o/i_am_tired_of_hearing_that_ruby_is_fine/fxvfzgo/)
in particular) or [this blog
post](http://codefol.io/posts/when-should-you-not-use-rails/), but what
really matters is whether or not Ruby is suitable for your needs and
tastes, not what bloggers/commenters/survey-takers think.
------------------------------------------------------------------------
1. [[*The History of
Ruby*](https://www.sitepoint.com/history-ruby/)[](#fnref1)]{#fn1}
2. [I.e. [Phusion
Passenger](https://www.phusionpassenger.com/)[](#fnref2)]{#fn2}

View File

@@ -0,0 +1,190 @@
---
title: "Write You a Parser for Fun and Win"
date: 2013-11-26T00:00:00+00:00
draft: false
needs_review: true
canonical_url: https://www.viget.com/articles/write-you-a-parser-for-fun-and-win/
---
As a software developer, you're probably familiar with the concept of a
parser, at least at a high level. Maybe you took a course on compilers
in school, or downloaded a copy of [*Create Your Own Programming
Language*](http://createyourproglang.com), but this isn't the sort of
thing many of us get paid to work on. I'm writing this post to describe
a real-world web development problem to which creating a series of
parsers was the best, most elegant solution. This is more in-the-weeds
than I usually like to go with these things, but stick with me -- this
is cool stuff.
## The Problem
Our client, the [Chronicle of Higher Education](http://chronicle.com/),
[hired us](https://viget.com/work/chronicle-vitae) to build
[Vitae](http://chroniclevitae.com/), a series of tools for academics to
find and apply to jobs, chief among which is the *profile*, an online
résumé of sorts. I'm not sure when the last time you looked at a career
academic's CV was, but these suckers are *long*, packed with degrees,
publications, honors, etc. We created some slick [Backbone-powered
interactions](https://viget.com/extend/backbone-js-on-vitae) for
creating and editing individual items, but a user with 70 publications
still faced a long road to create her profile.
Since academics are accustomed to following well-defined formats (e.g.
bibliographies), [KV](https://viget.com/about/team/kvigneault) had the
idea of creating formats for each profile element, and giving users the
option to create and edit all their data of a given type at once, as
text. So, for example, a user might enter his degrees in the following
format:
    Duke University
    ; Ph.D.; Biomedical Engineering

    University of North Carolina
    2010; M.S.; Biology
    2007; B.S.; Biology
That is to say, the user has a bachelor's and a master's in Biology from
UNC, and is working on a Ph.D. in Biomedical Engineering at Duke.
## The Solution
My initial, naïve approach to processing this input involved splitting
it up by line and attempting to suss out what each line was supposed to
be. It quickly became apparent that this was untenable for even one
model, let alone the 15+ that we eventually needed.
[Chris](https://viget.com/about/team/cjones) suggested creating custom
parsers for each resource, an approach I'd initially written off as
being too heavy-handed for our needs.
What is a parser, you ask? [According to
Wikipedia](https://en.wikipedia.org/wiki/Parsing#Computer_languages),
it's
> a software component that takes input data (frequently text) and
> builds a data structure -- often some kind of parse tree, abstract
> syntax tree or other hierarchical structure -- giving a structural
> representation of the input, checking for correct syntax in the
> process.
Sounds about right. I investigated
[Treetop](http://treetop.rubyforge.org/), the most well-known Ruby
library for creating parsers, but I found it to be targeted more toward
building standalone tools rather than use inside a larger application.
Searching further, I found
[Parslet](http://kschiess.github.io/parslet/), a "small Ruby library for
constructing parsers in the PEG (Parsing Expression Grammar) fashion."
Parslet turned out to be the perfect tool for the job. Here, for
example, is a basic parser for the above degree input:
    class DegreeParser < Parslet::Parser
      root :degree_groups

      rule(:degree_groups) { degree_group.repeat(0, 1) >>
                             additional_degrees.repeat(0) }

      rule(:degree_group) { institution_name >>
                            (newline >> degree).repeat(1).as(:degrees_attributes) }

      rule(:additional_degrees) { blank_line.repeat(2) >> degree_group }

      rule(:institution_name) { line.as(:institution_name) }

      rule(:degree) { year.as(:year).maybe >>
                      semicolon >>
                      name >>
                      semicolon >>
                      field_of_study }

      rule(:name) { segment.as(:name) }
      rule(:field_of_study) { segment.as(:field_of_study) }

      rule(:year) { spaces >>
                    match("[0-9]").repeat(4, 4) >>
                    spaces }

      rule(:line) { spaces >>
                    match('[^ \r\n]').repeat(1) >>
                    match('[^\r\n]').repeat(0) }

      rule(:segment) { spaces >>
                       match('[^ ;\r\n]').repeat(1) >>
                       match('[^;\r\n]').repeat(0) }

      rule(:blank_line) { spaces >> newline >> spaces }
      rule(:newline) { str("\r").maybe >> str("\n") }
      rule(:semicolon) { str(";") }
      rule(:space) { str(" ") }
      rule(:spaces) { space.repeat(0) }
    end
Let's take this line-by-line:
**2:** the `root` directive tells the parser what rule to start parsing
with.
**4-5:** `degree_groups` is a Parslet rule. It can reference other
rules, Parslet instructions, or both. In this case, `degree_groups`, our
parsing root, is made up of zero or one `degree_group` followed by any
number of `additional_degrees`.
**7-8:** a `degree_group` is defined as an institution name followed by
one or more newline + degree combinations. The `.as` method defines
the keys in the resulting output hash. Use names that match up with your
ActiveRecord objects for great justice.
**10:** `additional_degrees` is just a blank line followed by another
`degree_group`.
**12:** `institution_name` makes use of our `line` directive (which
we'll discuss in a minute) and simply gives it a name.
**14-18:** Here's where a degree (e.g. "1997; M.S.; Psychology") is
defined. We use the `year` rule, defined on line 23 as four digits in a
row, give it the name "year," and make it optional with the `.maybe`
method. `.maybe` is similar to the `.repeat(0, 1)` we used earlier, the
difference being that the latter will always put its results in an
array. After that, we have a semicolon, the name of the degree, another
semicolon, and the field of study.
**20-21:** `name` and `field_of_study` are segments, text content
terminated by semicolons.
**23-25:** a `year` is exactly four digits with optional whitespace on
either side.
**27-29:** a `line` (used here for our institution name) is at least one
non-newline, non-whitespace character plus everything up to the next
newline.
**31-33:** a `segment` is like a `line`, except it also terminates at
semicolons.
**35-39:** here we put names to some literal string matches, like
semicolons, spaces, and newlines.
In the actual app, the common rules between parsers (year, segment,
newline, etc.) are part of a parent class so that only the
resource-specific instructions would be included in this parser. Here's
what we get when we pass our degree info to this new parser:
    [{:institution_name=>"Duke University"@0,
      :degrees_attributes=>
       [{:name=>" Ph.D."@17, :field_of_study=>" Biomedical Engineering"@24}]},
     {:institution_name=>"University of North Carolina"@49,
      :degrees_attributes=>
       [{:year=>"2010"@78, :name=>" M.S."@83, :field_of_study=>" Biology"@89},
        {:year=>"2007"@98, :name=>" B.S."@103, :field_of_study=>" Biology"@109}]}]
The values are Parslet nodes, and the `@XX` indicates where in the input
the rule was matched. With a little bit of string coercion, this output
can be fed directly into an ActiveRecord model. If the user's input is
invalid, Parslet makes it similarly straightforward to point out the
offending line.
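That coercion step can be a short recursive walk. This is a sketch: the `parsed` hash below stands in for Parslet's output (whose slice values respond to `to_s`), and `Institution` is a hypothetical ActiveRecord model.

```ruby
# Recursively turn Parslet slices into plain, trimmed strings so the hash
# can be handed to something like Institution.create!(attrs)
def coerce(node)
  case node
  when Hash  then node.transform_values { |value| coerce(value) }
  when Array then node.map { |value| coerce(value) }
  else            node.to_s.strip
  end
end

parsed = { institution_name: "Duke University",
           degrees_attributes: [{ name: " Ph.D.",
                                  field_of_study: " Biomedical Engineering" }] }

p coerce(parsed)
```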
------------------------------------------------------------------------
This component of Vitae was incredibly satisfying to work on, because it
solved a real-world issue for our users while scratching a nerdy
personal itch. I encourage you to learn more about parsers (and
[Parslet](http://kschiess.github.io/parslet/) specifically) and to look
for ways to use them in projects both personal and professional.