Github Ruby Styleguide

rauljara · on Dec 12, 2012

There was one thing with block chaining that I was really hoping they'd hit on. They didn't. They had:

  # good
  names.select { |name| name.start_with?("S") }.map { |name| name.upcase }

I've noticed that sort of chaining is changing towards:

  names.select { |name| name.start_with?("S") }
       .map    { |name| name.upcase }

Which can be further chained (if need be) like so:

  names.select { |name| name.start_with?("S") }
       .map    { |name| name.upcase }
       .sort
       .join(', ')

Which, imo, is just better than trying to do it all on one line or using intermediate variables. It's more of a value call when it comes to the intermediate variables, though. I'd also use

  .map(&:upcase)

instead of

  .map { |word| word.upcase }

But I think it's perfectly understandable if you find the word.upcase more explicit and readable.

jordoh · on Dec 12, 2012

This seems to be a popular style in javascript (particularly when using jQuery), but it's not actually valid ruby (edit: ruby 1.8.7, that is - see reply). The first line forms a valid statement, so the dots would need to be at the end of each line to indicate a line continuation. So it would need to be:

  names.select { |name| name.start_with?("S") }.
        map    { |name| name.upcase }.
        sort.
        join(', ')

Which is slightly more annoying in that you need to modify two lines to add a new chained line (the line you are adding and the dot added to the previous line).

This also creates a mild headache when refactoring, as any change to the length of the "names" variable would force you to re-align the entire chain. You could do something like:

  names.
    select { |name| name.start_with?("S") }.
    map    { |name| name.upcase }.
    sort.
    join(', ')

rauljara · on Dec 12, 2012

> but it's not actually valid ruby.

You are correct that it is invalid for 1.8.6. The code runs just fine in 1.9, though. I take advantage of many things that don't work in 1.8.6, so I'm not particularly concerned with my method chaining breaking compatibility.

I prefer the more-than-2-spaces indentation because, by the shape of the code, it is clearer to me that this is a method chain and not a change in control structure. That is also obvious by actually reading the code, though, so I'm fine putting that down to personal preference.

jordoh · on Dec 12, 2012

Great to know about 1.9 - just looked through the 1.9 changelog and saw it described as "Newlines allowed before ternary colon operator (:) and method call dot operator (.)", so it does seem like they are encouraging this specific formatting style.

I agree that the fully indented code is much easier to visually parse - whether that outweighs the refactoring cost is certainly going to vary depending on the team (in addition to personal preference, as you mention).

prawks · on Dec 12, 2012

Agreed. I'm no rubyist, but it's not only much more readable, but functionally easier to manage in code. I can easily comment out one or more blocks or methods in the chain in this case if need be, for example. I'm surprised they advocate it on one line, actually.

edwinnathaniel · on Dec 12, 2012

Startin' to look like Java (with its fluent-interface pattern) :). Keep it up!

thumbs up

natefinch · on Dec 12, 2012

I'm not a Ruby programmer, but a couple of these things just look completely wrong:

# good

def some_method(some_arr)

  some_arr.size

end

Omitting the return statement is a good thing? How do you know if this method even returns anything? .size could be a method that doesn't return anything, right, since parens are optional?

#good

do_something if something_else

This, to me, is horrible.

If you're just scanning down the file, you can easily miss the if statement, and assume the do_something always goes off.

Also, it's incredibly hard to process when I'm mentally executing the code in my head. Generally if there's a method call on the left, you just do it... but now we hit this if statement, ok, so mentally undo the method call we just processed, now check the if statement... now skip back to the do_something and reapply the method call... and now skip the if statement and continue on in the code.

...or you could just do if something_else then do_something (if you really must have single line if statements, which I am also against, but not so much as the if statement after the method it's controlling access to)

xentronium · on Dec 12, 2012

> Omitting the return statement is a good thing? How do you know if this method even returns anything?

All methods return something.

> If you're just scanning down the file, you can easily miss the if statement, and assume the do_something always goes off.

This is actually one of the arguable points in that styleguide. Some people share your disapproval for inline if modifiers. Anyway it's not that bad after getting used to as long as lines aren't too long and predicates aren't too complex.

natefinch · on Dec 12, 2012

> All methods return something.

Ahh, that helps, thanks.

matt_yoho · on Dec 12, 2012

> Omitting the return statement is a good thing? How do you know if this method even returns anything?

In Ruby, all methods return the value of the last expression executed in the method.

> .size could be a method that doesn't return anything, right, since parens are optional?

This is in theory true but never happens in practice.

Edit: Ruby has no "non-method" properties on objects. Everything is a method call, regardless of use of parens.

> If you're just scanning down the file, you can easily miss the if statement, and assume the do_something always goes off.

An experienced Rubyist would catch it, if not at first glance, with the second. This is used relatively sparingly, and in scenarios conceptually similar to an early bailout return statement. It makes sense where it makes sense, if you will.

Neither of these is a problem in practice. It may be an issue of familiarity and comfort with the language, but in my experience it doesn't take long to be comfortable with either of those conventions.

masklinn · on Dec 12, 2012

> .size could be a method that doesn't return anything, right, since parens are optional?

It can only be a method. Instance attributes can not be public in ruby (and are delineated by the `@` prefix sigil), the closest equivalent (attr_reader/attr_writer/attr_accessor) is a dynamically generated method (or pair of methods)

lucian1900 · on Dec 12, 2012

Returning the last expression is quite handy for a lot of functional patterns. One gets used to it very quickly, especially one has used lisp at all.

Parens being optional is indeed a flaw in Ruby's design.

Post-if can be handy, sometimes.

xentronium · on Dec 12, 2012

> Parens being optional is indeed a flaw in Ruby's design.

This flaw allows us to have nice DSLs unencumbered with parens.

hamburglar · on Dec 12, 2012

There may be such a thing as a "nice DSL" but the absurd proliferation of poorly-thought-out DSLs that results from ruby's syntax is really not a positive thing.

Ha ha, my cute gem makes it look like you're making static declarations in a domain-specific language, but what you're really creating is a bunch of code that executes at unpredictable times and in an order that's impossible to untangle! Good luck! On the upside, you can create a trivial example of [something] in four lines of code!

xentronium · on Dec 12, 2012

Understandable concern, but nobody can force you to use the said gem, can they? Such fluff shouldn't be a big deal as long as it is not ingrained in most developers' minds.

Besides, it looks like ruby community is currently gravitating towards more javaesque style of programming, heavy on patterns like dependency injection, very explicit, with lots of classes and almost none metaprogramming.

lloeki · on Dec 12, 2012

> I'm not a Ruby programmer

and (without any form of negative judgement), this is why

> a couple of these things just look completely wrong

to you, because they're idiomatic of Ruby.

First, lack of return.

A random example in Rails's ActiveSupport[0]:

    def to_s
      "(GMT#{formatted_offset}) #{name}"
    end

It just reads like ":to_s is defined as being a String constructed this way". Note that formatted_offset and name are resolved as methods in the context (i.e self), and self. is simply omitted, which is also idiomatic.

Not only functions always return something (their last statement's return value), every statement actually returns something, so

    foo = if a == b then 1 else 2

works and is used, but also with case/when, and whatever you can think of. With proper indentation this is perfectly sane and readable, and avoid redundant, 'side effect' (from the point of view of code flow) assignments, easing comprehension and refactoring, while making mistakes less likely (foo will always and obviously be assigned something).

What's more, a Ruby construct known as blocks are used with methods (such as Enumerable#map or Enumerable#select) needing return values, and omitting return is a readability and typing boon.

That's not to say return has no use, but then when you use it, you mean it, so that it is used in match/exit pattern methods, where you're doing some checks, and dispatch-return according to various situations, i.e you return early.

It is also used in Procs, which differ from lambdas in that a return will return from the caller, not the Proc.

Second, right-side if.

This aids readability when you're concerned about doing a number of sequential tasks, and some of them are conditioned. All possible tasks are aligned, and all conditions are on the right side. An absence of condition means "always". Sounds almost like column-based source code (fortran, RPG on AS400...).

You mention scanning down the file, but idiomatic ruby would not make you scan down the file: you would be faced with a function, whose size would make its structure/pattern apparent, and with such a pattern the logic would be readily seen, dangling on the right side.

Such conditions are often used with the first part of this discussion, so that you see immediately what is returned right there on the left, possibly conditioned.

[0]: https://github.com/rails/rails/blob/5fe88b11f11bb3b30bc23c57...

kreeger · on Dec 12, 2012

I can't stress enough how important it is to _not_ use `and` and `or`. As a former die-hard fan of those operators, it bit me in the ass so hard when those didn't exactly work the way they were supposed to. I've been using && and || now for years and I haven't looked back. /PSA

Pewpewarrows · on Dec 12, 2012

I agree. But in all code that I write, regardless of pretty much whatever language it's written in, I always explicitly add parentheses around what I want the order of operations to be.

"1 * 3 + 4 / 20 && foo" is unreadable. No one wants to memorize the order of operations for every language they use and have to figure it out in their head. Make life easier for the people who have to maintain that line of code you just wrote.

dpritchett · on Dec 12, 2012

I’ve learned a lot of programming languages over the years, and I’ve taken care never to learn the operator precedence rules in any of them. It’s easy to get them wrong and get bitten and why should I require that people reading my code learn those stupid rules.

- Tim Bray

[1] http://www.tbray.org/ongoing/When/201x/2010/06/29/No-Default...

oinksoft · on Dec 12, 2012

For language-specific operators I could understand ... but to me, the cognitive load of the parentheses is worse than having to think for a moment about this particular set of operators which have the same precedence in every language I've used (and the arithmetic operators which I've had to know the order of since early grade school).

Maybe I'll eat my foot some day, but

  (a == 1) && (b == 2)

or

  (2 * 3) + 1

is nothing but code smell to me.

Pewpewarrows · on Dec 12, 2012

There are plenty of things that you can leave out when writing parseable source code, but don't because it would make the code horribly un-readable.

That is, there's a reason for whitespace around operators, after commas, extraneous newlines, etc. It makes the code easier to read and maintain. Now while I'm willing to admit that this whole area is largely aesthetic and personal, if you submitted a line of code to me that was just this without the parentheses:

> 2 * 3 + 1

I would reject the code review in a heartbeat. Part of making your code readable to others and less error-prone during maintenance is adding extraneous stuff to it that isn't necessary for the code to parse/compile/interpret.

sirclueless · on Dec 12, 2012

Those are both perfectly reasonable to me. There are in fact languages that would interpret your example #1 incorrectly, so if someone wants to disambiguate I have no qualms. Your example #2 is sort of braindead, but in a more complicated expression, where 2 3 or 1 were method calls or something, I would appreciate the parens.

prawks · on Dec 12, 2012

I can't believe more people don't do this. It seems like such a no-brainer when I'm writing. Not only for others, but for myself as well.

alanctgardner2 · on Dec 12, 2012

To anyone who's wondering, it seems like 'and' and 'or' are the same as && and ||, but with lower precedence [0]. Does anyone have a good reason why they both exist? It's a pretty big gotcha.

[0] http://stackoverflow.com/questions/1426826/difference-betwee...

philwelch · on Dec 12, 2012

It's an imported Perlism, back from the days of the old assert idiom `foo == bar or die`.

If `foo == bar` evaluates to true, the `or` is short-circuited, else it calls the built-in `die` function which kills the process.

lloeki · on Dec 12, 2012

More precisely && and || are meant to stand in boolean tests, whereas and and or are meant to tie expressions together DSL style, as an alternative to if/then, like so:

   @current_user.logged_in? or redirect_to login_path
   @current_user.can? :do_this or render :status => 403
   shirt.blue? and return bar
   put_suit_on or put_pants_on

graywh · on Dec 12, 2012

That's a good example

  shirt.blue? and return bar

would be equivalent to

  return bar if shirt.blue?

You can do the same with or and unless.

philwelch · on Dec 12, 2012

I find the postfix if and unless far more readable than the Perlish and and or operators. The "or die" construction for asserts is really amusing though.

gurkendoktor · on Dec 12, 2012

&&/|| can bite you just as well:

if first_user = User.find(4) && second_user = User.find(6) ... end

rauljara · on Dec 12, 2012

I may be missing something obvious here. How would this code bite you?

alanctgardner2 · on Dec 12, 2012

> first_user = User.find(4) && second_user = User.find(6)

It looks like it would evaluate as: first_user = (User.find(4) and second_user = User.find(6))

That is to say, the assignment comes after the boolean operation, which is unexpected.

pooriaazimi · on Dec 12, 2012

Well, I expect

    first_user = User.find(4) && second_user = User.find(6)

to be literally equivalent to

    second_user = User.find(6)
    first_user = User.find(4) && second_user

I honestly can't see why you might want it to mean something else...

Edit: I see. if you do something like

    if (first_user = User.find(4) && second_user = User.find(6)) {
      ..
    }

it might bite you. You might expect it to be equivalent to

    first_user = User.find(4)
    second_user = User.find(6)
    if (first_user && second_user) {
      ..
    }

which I personally think is a very, very bad practice. Parenthesis should always be used when there's even the slightest possibility that you or another maintainer/contributor might be confused about.

alanctgardner2 · on Dec 12, 2012

Personally, I'm almost painfully verbose with my parens to avoid exactly this scenario. Better to be explicit about my intent to the interpreter and other coders.

Having an assignment that could be skipped by short-circuiting also seems like bad practice, but I realize it was designed to be a toy example

gurkendoktor · on Dec 13, 2012

Skipping the second assignment if I don't need it is more of an optimisation. The "long form" would be,

first_user = User.find(4) if first_user second_user = User.find(6) if second_user ... end end

gurkendoktor · on Dec 13, 2012

> Parenthesis should always be used when there's even the slightest possibility that you or another maintainer/contributor might be confused about.

Yes, but you can make the same point in favor of using "and" and "or" (and deprecating &&/||).

graywh · on Dec 12, 2012

Avdi Grimm gives a good explanation: http://devblog.avdi.org/2010/08/02/using-and-and-or-in-ruby/

look_lookatme · on Dec 12, 2012

Agreed. It's just not worth it, especially with implicit return.

sabat · on Dec 12, 2012

They're not synonyms; "and" is not a more readable version of "&&".

As Avdi Grimm says: and and or, despite an apparent similarity to && and ||, have very different roles.

It's worth using "and" and "or", especially considering that && and || bind too tightly for some purposes. I try to avoid && and || unless I'm doing a specific boolean operation (e.g. x = foo || bar). Not for flow control, though.

If you really can't grok the logic, use parens.

yalue · on Dec 12, 2012

I agree with almost everything except for using named groups over the $1-$9 variables in regular expression matching. This may be the case if there are many capture groups in the expression, but I would argue that

    /(?<meaningful_var>regexp)/ =~ string

is a fair deal more difficult to read than simply

    /(regexp)/ =~ string

Almost everybody familiar with PCRE will be familiar with the simpler form, and it's usually the case that the shorter the regular expression, the easier it is to understand and read.

viseztrance · on Dec 12, 2012

The following are equivalent:

  if /(?<meaningful_var>regexp)/ =~ string
    puts meaningful_var
  end

  if /(regexp)/ =~ string
    puts $1
  end

This may not look like much, but bare in mind you may be using the current selection in several places, at which point you'll probably use the following for readability either way:

  meaningful_var = $1

In any case I think that sometimes the latter is preferable, that's why I'm not into these kind of black and white conventions. OMG, this line is 85 characters, you suck.

yalue · on Dec 12, 2012

It is true that one will probably immediately reassign the $- variables. And that's definitely true that there are cases when conventions impede good style. I would probably say, though, that in most cases the latter of your examples is preferable.

    if string =~ /First: (.*?) Last: (.*?)\s/
      first_name = $1
      last_name = $2
      # etc...
    end

looks far better to me than

    if string =~ /First: (?<first_name>.*?) Last: (?<last_name>.*?)\s/
      # etc...
    end

fox91 · on Dec 12, 2012

I think it's good to name these variables, obviously you need to use short names, otherwise you'll get very long regexps. but in any case it will increase readability imho.

jhund · on Dec 12, 2012

I use pretty much the same style for my Ruby work. One thing I do differently is replacing

  %w(word1 word2 word3)

with

  %w[word1 word2 word3]

both result in

  ["word1", "word2", "word3"]

I prefer the bracket delimiters since they better indicate the array nature than the parens. In % notation you can choose your own delimiters, so why not use brackets?

yxhuvud · on Dec 12, 2012

Both are fine.

Just don't use linebreaks or weird utf8 characters, please.

EDIT: As delimeters. Feel free to use them inside the strings.

dsr12 · on Dec 12, 2012

Now-a-days we have computers which can show more than 80 characters at a time. "Keep lines fewer than 80 characters", Is this recommendation still valid?

bryanlarsen · on Dec 12, 2012

Shorter lines are easier to read. Short lines let you put up several editor windows side by side even on a cramped laptop. Short lines let you use a larger font to save your eyes.

There can also be a code win: when forced to deal with short lines, it's often easier to split a long line of code into two easier to understand lines than it is to split on an operator.

Sometimes a split line is a lot worse than a long line, which is why the rule shouldn't be applied religious -- you should only be called on it if you have many lines over the limit.

IMO, 80 is a little bit tight; I prefer 100. But I wouldn't bike shed it if there was a standard in place.

fox91 · on Dec 12, 2012

I really can't stand longer lines, I think that this is a best practice for every code. I usually work with the monitor splitted in two parts with maybe 2 different files in each half, long lines would make that really difficult to read.

Moreover if you need lines longer than 80 chars you're probably doing it the wrong way: long lines are difficult to understand

philwelch · on Dec 12, 2012

I find 80 a bit cramped myself. There are a lot of lovely one-line idioms, like postfix if and iterator blocks, which feel constrained at 80 characters. Worse of all, I spend time worrying about line width.

In principle, having a line width limit is a worthwhile idea. 120 rarely gives me problems, and I can fit 240 characters wide in any case.

olgeni · on Dec 12, 2012

> Now-a-days we have computers which can show more than 80 characters at a time

Of course we do. We also have serial ports that go way faster than 9600, so at last "ed" can be used to its full potential.

lucisferre · on Dec 12, 2012

Lots of reasons to keep lines sort, though you could argue 80 is pushing it. I've found 100-120 is generally fine for most scenarios on modern hardware.

Just keep in mind that code will be read and shown in a variety of different places. For example reading the code on github or other source control tracking tools, reading it on different terminals with splits active (tmux and VIm splits), etc. Keeping it short just makes it more portable for readability purposes. That's definitely a good thing.

barclay · on Dec 12, 2012

Yes(!) Goddamn you kids and your Xcode.

Now get off my damned lawn.

Karunamon · on Dec 12, 2012

I'm a little curious why

    %w(strings go here)

is preferred to

     [strings, go, here]

The first form requires extra work to escape spaces, the second form is a representation of an array everywhere in Ruby, including in the debugger and IRB.

danenania · on Dec 12, 2012

I'm sure this would only apply to an array with single word strings. If you need strings with spaces, %w definitely isn't appropriate.

masklinn · on Dec 12, 2012

> The first form requires extra work to escape spaces

The first form is generally used for symbol-type strings (word lists, essentially, hence "w"). It is less syntactically noisy, shorter and requires less waffling around with non-word keys.

That's especially flagrant if you don't ignore half the actual array,

    %w(foo bar baz)

really results in

    ["foo", "bar", "baz"]

(the quotes are rather important, because they're noisy, repetitive and annoying to type).

It's one of the few things I regularly wish were in more languages.

Karunamon · on Dec 12, 2012

Ahh.. I forgot about the quotes. That's what I get for posting immediately after waking up. Thanks :)

onedognight · on Dec 12, 2012

I do not know of the single language, including the language of math, that gives equal precedence to their and and or operators. Sadly Ruby does, and it's a blight on the language. It's like giving equal precedence to + and \. It's a choice, but not one that anyone would expect.

What's worse, Ruby give the "correct" relative precedence to &&/||. I suspect this inconsistency is the reason the OP cautions against using and/or*. It's too bad because I miss the low precedence and/or operators (from perl) that can be used to avoid parenthesis that &&/|| would require in the presence of assignment, etc.

graywh · on Dec 12, 2012

Ruby's and and or are for flow control like if and unless, not binary logic operators like (or synonyms for) && and ||.

danenania · on Dec 12, 2012

Agree nearly 100%, except I see no problem with omitting parentheses around arguments in method definitions. I've never had any trouble parsing this and it's a bit cleaner to my eye.

OriginalSyn · on Dec 12, 2012

In team environments you typically sacrifice convenience for maintainability and while you may not have a problem with optional parenthesis, it invites ambiguity and a more junior programmer may make a mistake because of it so the idea is to close off that edge case completely and not have to worry about it. This is the same reason a lot of shops ban the terniary operator for anything more than the simplest of variable assignments. It's not saying it's not useful, it's just that there are cases where if they had just been there in the first place coding errors could have been avoided.

danenania · on Dec 12, 2012

I agree with your general point, but I don't really see how omitting those parentheses invites ambiguity.

def reply_to_post text

end

What could anyone imagine text to be in this context aside from an argument? I get that using parentheses is more familiar coming from languages with C syntax, but I see no other rationale. I've never worked with anyone who stumbled over this. It's always just: 'oh cool, I didn't realize you could omit parentheses there.'

xentronium · on Dec 12, 2012

It just doesn't scan well. It's quite easy to misread this as reply_to_post_text, for example.

OriginalSyn · on Dec 12, 2012

I may have been thinking of a edge case in CoffeeScript when I wrote that, but you're right.

lucisferre · on Dec 12, 2012

I think the issue is around the ambiguity and cognitive friction associated with figuring out where you require them and where you don't. Generally speaking I don't have a lot of problems with it, but I could see why someone might just say "always" to this one.

yxhuvud · on Dec 12, 2012

Agreed, unless the definition line spans multiple lines. Then ()'s are necessary for readability.

pmahoney · on Dec 12, 2012

Regarding "def self.method" vs. "class << self; def method; end", one very big advantage of using the "class << self" technique (which Github discourages) is that it allows grepping for "def method" to find the location of the method definition.

Perhaps my editor is old and featureless (and Ruby's dynamic nature doesn't help), but I grep for such things multiple times per day.

zxcdw · on Dec 12, 2012

This makes me want to learn Ruby so bad. :(

TommyDANGerous · on Dec 12, 2012

Love it. Great read.

sabat · on Dec 12, 2012

I like what I've read, but with a couple of cavaets. Ruby is the way it is on purpose. Discouraging people from using features of the language because the OP (presumably) does not understand how to use them: that's an anti-pattern.

The and and or keywords are banned. It's just not worth it. Always use && and || instead.

If you find "and" and "or" not to be "worth it", it tells me that you don't understand the difference. The "&&" and "||" versions bind more tightly, and that is not always desirable. Perl has the same set of constructs, and the Perl monks have been using them correctly for years.

As you can see all the classes in a class hierarchy actually share one class variable.

Well, yeah. That's meant to be a feature of the language, not "nasty behavior". It's good advice to use class instance variables, sure. But let's not imply that Ruby is a badly designed language; it's quite well-designed. This isn't Javascript. :-)

edit: formatting

bryanlarsen · on Dec 12, 2012

Ruby is mostly well-designed, but it certainly has its warts. One of them is that 'or' and 'and' have the same precedence. They don't in Perl. That's one reason while Perl monks use them, but Ruby rockstars don't.

And please don't try to tell me that the lambda/proc/block mess is "well-designed".