I made a post about the niceties of blocks: https://maxlap.dev/blog/2022/02/10/what-makes-ruby-blocks-gr...
This runs you into problems similar to missing a break in such statements in many languages.
Care to share an example problem?
Return in lambda/proc is error-prone
People want to do early returns from looping over a collection, so take the easy solution of adding more messy language semantics instead of finding a semantically simple solution. (For that matter, have fun working out the semantics of break and next when used in a block that isn't an argument to an enumeration method. How do you as a method author opt in to distinguishing between the two after yielding to a block?)
This is generally the case with Ruby everywhere. Why does that thing have an edge case with weird semantics? To work around the edge case with weird semantics somewhere else. It's all fine if you want to just try things until something works. But if you want to really understand what's going on and write code that's a first-class participant in the language features, it gets really frustrating to try to deal with it all.
For my (buggy, unfinished, languishing without updates) prototype Ruby compiler, lambda and proc (and blocks) are all implemented nearly the same way, with the exception that for proc/blocks, return/break/next will act as if having unwound the stack to the scope where the proc was defined first (or throw an error if escaped from there).
The distinction is very obvious when you think of it in that way - a proc acts as if called in the context/scope it was defined, while a lambda acts as if called in the scope it is called in.
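A minimal sketch of that distinction, using made-up method names (return_from_lambda and return_from_proc are not from any library). It only shows where control lands after return, not how any implementation does the unwinding:

```ruby
# Hypothetical method names, for illustration only.
def return_from_lambda
  l = lambda { return :from_lambda }
  l.call              # return exits only the lambda
  :method_finished    # so execution reaches this line
end

def return_from_proc
  pr = proc { return :from_proc }
  pr.call             # return exits return_from_proc itself
  :method_finished    # never reached
end

p return_from_lambda  # => :method_finished
p return_from_proc    # => :from_proc
```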
> How do you as a method author opt in to distinguishing between the two after yielding to a block?
You don't. A block acts as a proc, not a lambda. If you want lambda semantics, then take a lambda as an argument - don't take a block/proc and be surprised it doesn't act like something it isn't.
I want to write a method that takes a block and distinguishes between next and break, exactly like the methods of enumeration do. It's obviously possible because a super common interface does it.
Last time I looked, that interface does it by being written in native code that interfaces with the interpreter. That is, it's not part of the language semantics. It's a special case with weird rules unlike what anything else gets to do.
Or at least it was. Maybe the language has actually made it accessible since then, but I'm not optimistic. That's not the ruby way.
Effectively blocks are self-contained chunks of code. You can change things around them, but you can’t change how keywords work inside them. Because you’re crossing a method boundary when you call a block you’re not able to access next and break. (Or capture return.)
Ruby is defining scope here and C methods are not limited by the language they define.
I think it'd be nice if Ruby language constructs offered a few of the building blocks used by the standard library so that you could implement more of the standard library without dipping down into some low level mechanism - for my prototype Ruby compiler a priority was to implement as much as possible in Ruby, and there are certainly more things in the core classes that can't be done in pure Ruby than I'd like.
But it's not that much, and while there are plenty of warts, the inconsistencies are often smaller than people think.
Assuming you don't care whether a "next" was actually called, but only whether you've exited the block and whether or not you exited it via a break, you can do this check in a number of ways, but I will agree it's a bit dirty that sometimes you do need to rely on standard library functionality rather than language constructs if you want to do these things.
Here's one way of doing it, since "break" within a Fiber triggers a LocalJumpError:
    def next_or_break(&block)
      Fiber.new(&block).resume
      :next_or_end_of_block_reached
    rescue LocalJumpError
      :break
    end

    p(next_or_break do next end)
    p(next_or_break do break end)
You don't need to introduce a Fiber or rely on the standard library, you can just use the core language, leveraging the fact that code in ensure sections gets called even when you are returning "past" the calling method.
(You may need a dirty catch-all rescue so you can set a flag before reraising that lets you distinguish "bypassing direct return by exception" from "bypassing direct return by break or proc-semantics return", but that's still the core language, not standard library.)
I put example code in another comment: https://news.ycombinator.com/item?id=44065226
However, in Ruby blocks aren't just about flexibility, more importantly they're about generality. They're not there to resolve an edge case at all (Ruby also has keywords for loops). They're a generalization of control flow that is just slightly less general than continuations. In practical use they provide an alternative to Lisp macros for many use cases.
These were some of the problems Matz was trying to sort out with his design of Ruby--creating a language that was fun and easy to use for day-to-day programming, with as much of the meta-programming power of Lisp and Smalltalk as possible. Blocks are one of his true innovations that came from trying to balance that tension.
How do they differ from Smalltalk blocks? (I don't know.)
One way to think about it is this: anonymous functions as originally implemented in early Lisps are code as an object, closures are code with its lexical environment as an object. You can think of a Ruby block as code with its lexical environment and its call stack as an object.
So they don't just handle return differently than closures, they have access to the call stack of the site where they're created like a continuation. This is why they handle return differently, but this is just one of the things that falls out from that. It also comes with other control flow features like "redo", "retry", "next", "rescue", "ensure", and others. These are all call stack control (control flow) conveniences, just like return is. All of them can be thought of as being abstractions built on top of continuations (just ask a Scheme hacker).
Originally Ruby was basically a Lisp without macros, but with continuations, a Smalltalk-like object system, and a lot of syntactic affordances inspired by Perl and other languages. Blocks are one of the conveniences built on top of the Lispy semantics.
Note that I'm explaining how blocks work as an abstraction (vidarh below explains how they work as a concretion, as implemented in MRI).
At-a-glance afaict Smalltalk provides those features too, so I would guess Smalltalk blocks may have access to the call stack too?
The innovation is to have those features tied to convenient syntax.
(I should check how Smalltalk blocks behave.)
https://wirfs-brock.com/allen/things/smalltalk-things/effici...
Although I would say he didn’t get 100% there, at this point Ruby isn’t too far from that.
These are ideas that I think are worth trying to take even further. In fact, I’ve been experimenting with that.
Continuations were introduced in Ruby 1.8, via a callcc method in a Kernel module or something like that.
Since 1.9 they are in some sort of deprecated status.
I don't think that even the yield() stuff is implemented using continuations.
I’m also sure you’re right that they’re not implemented using continuations. However, my understanding is that Ruby was originally conceived as a language with continuations. I’ll see if I can find a reference for that. But from what I recall reading in a blog post from someone who was at a programming language conference in 1997 when Matz introduced the language that’s how he described it.
What does that even mean?
If you create a block deeply within a cascade of nested function calls, nineteen activation levels deep, and return that block out of all those nestings, is it still aware of the nineteen levels that have terminated, and to what purpose/benefit?
What example Ruby code would break without continuing access to the dynamic scope that has terminated, rather than just the lexical scope?
A block cannot be returned, because it is not an object; it has to be used to create a Proc object with either proc or lambda semantics to be returned.
With lambda semantics, it's just a closure, doesn't care about the dynamic scope, and returns immediately to whatever calls it with return, next, or break.
With proc semantics, it does retain the connection, and return or break will result in an error when those scopes have terminated, but next will still return to the caller. (You don't generally want to return a proc for that reason, the use for procs is passing down a call chain, not returning them up.)
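A small sketch of the proc case, with hypothetical method names. Each method returns a proc whose defining scope has already terminated by the time the proc is called:

```ruby
# Hypothetical method names, for illustration only.
def escaped_proc_with_next
  proc { next :from_next }
end

def escaped_proc_with_break
  proc { break :from_break }
end

# next still just hands a value back to whoever called the proc.
p escaped_proc_with_next.call   # => :from_next

# break has nowhere left to jump to, so it raises.
begin
  escaped_proc_with_break.call
rescue LocalJumpError => e
  p e.class                     # => LocalJumpError
end
```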
Blocks should be nothing special. They're anonymous functions that capture the environment mutably. The only new part is all the special bits added to handle the weird edge cases that they're trying to pretend don't exist.
I think the "Building an intuition" section of my blog post[1] makes a good case for that.
When dealing with loops, you have 3 nested constructs interacting: a wrapping function, a loop statement and the loop's body; and you have 3 keywords to choose where the flow of the code goes.
return returns from the wrapping function
break leaves the loop statement
next / continue leaves the loop's body
When dealing with blocks or anonymous functions, it's instead 3 nested "functions" that are interacting: a wrapping function, a called function and an anonymous function (or block).
Ruby's blocks let you use the same 3 keywords to choose where the flow of the code goes.
return returns from the wrapping function (ex: my_func)
break returns from the called function (ex: each, map)
next returns from the block
Quite consistent. But since we are talking about functions instead of statements (loop), return values are also involved. Allowing both break and next to provide a return value fits well in that model and is quite useful. The 3 keywords are basically return, but they have different targets.
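To make the three targets concrete, here is a rough sketch. The method name classify and its thresholds are invented for the example; only each is real Ruby:

```ruby
# classify is a hypothetical wrapping function; each is the called function.
def classify(numbers)
  result = numbers.each do |n|
    next if n.odd?          # next: leave the block, each moves on to the next element
    return :tiny if n < 4   # return: leave classify entirely
    break n * 2 if n > 10   # break: leave each, and n * 2 becomes the value of the each call
  end
  [:each_returned, result]
end

p classify([1, 2])    # => :tiny
p classify([1, 12])   # => [:each_returned, 24]
p classify([1, 6])    # => [:each_returned, [1, 6]]
```

In the last call nothing breaks or returns, so each finishes normally and evaluates to its receiver.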
[1] https://maxlap.dev/blog/2022/02/10/what-makes-ruby-blocks-gr...
What is the edge case? It seems to be there so that Ruby enumeration methods can provide the behavior expected of looping statements (which is kind of necessary if you want the looping statements to just be semantic sugar for enumeration methods so that they can work correctly with any enumerable.)
> Blocks should be nothing special.
"Special" compared to what?
> How do you as a method author opt in to distinguishing between [break and next] after yielding to a block?
I don't use Ruby much lately, but if I yield to a block which calls break, control will pass to the code following the block (and not back to my method). If the block calls next or simply finishes, control passes back to me though I cannot know if next was called or not (but do I care? I can go ahead and yield the next element of a collection either way)
Yes, you can. That's what ensure (the Ruby equivalent of "finally") is for. Or, better, use File.open with a block where you do the work, which uses ensure under the hood.
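A sketch of the ensure pattern, assuming a made-up with_resource method where an array stands in for a real resource such as a file handle. This is not how File.open is written, just the same general shape:

```ruby
# Hypothetical resource wrapper; the log array stands in for a real resource.
def with_resource(log)
  log << :opened
  yield
ensure
  log << :closed   # runs on normal completion, break, return, or exception
end

log = []
with_resource(log) { break }   # break skips the rest of with_resource, but not ensure
p log                          # => [:opened, :closed]
```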
The semantics are:
* next returns from the block, optionally taking a return value that will be returned as the result of the yield (this is identical to the behavior of "return" from a Proc with lambda semantics)
* break returns from the function that yielded to the block, optionally taking a return value that will be returned as the result of the function (EDIT: deleted comparison to returns from a Proc with proc rather than lambda semantics here, because it wasn't quite accurate.)
This is, incidentally, also exactly the semantics when they are called from a block passed to an enumeration method, there is no special case there.
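For instance, with a hypothetical yielding method named describe (not an enumeration method, just an ordinary method that yields once):

```ruby
# describe is a made-up method that reports what the block handed back.
def describe
  value = yield                  # a value passed to next shows up here
  "block gave #{value.inspect}"
end

p(describe { next 1 })    # => "block gave 1"
p(describe { 2 })         # => "block gave 2"
p(describe { break 3 })   # => 3
```

In the break case the string is never built, because control leaves describe at the yield and 3 becomes the result of the whole call.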
> How do you as a method author opt in to distinguishing between the two after yielding to a block?
If control returns to your method, it was a return from the block, either via return (from a block with lambda semantics), next, or running through the end of the block. If, OTOH, break is called (or return from a block with proc semantics), control won't return to the calling method normally, but code in any "ensure" sections applicable will be run, which can be used to identify that this has occurred, and even to override the return value provided by the break.
The simplest possible function illustrating this:
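    def how_exited
      yield
      direct = true   # reached only if the block handed control back to us
      return "next, completion, or lambda return"
    rescue
      direct = true   # an exception is not a break, so skip the override below
      return "exception"
    ensure
      # direct is still nil only when a break (or proc-semantics return) is unwinding past us
      return "break" if not direct
    end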
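A usage sketch along the same lines. exit_kind is a renamed variant written for this example, with symbols instead of strings. One assumption to be aware of: it only rescues StandardError, so other exception classes would be reported as a break:

```ruby
# exit_kind is a hypothetical variant of the method above.
def exit_kind
  finished = false
  yield
  finished = true
  :next_or_completion
rescue StandardError
  finished = true
  :exception
ensure
  # finished is still false only when control is unwinding past this method
  return :break_or_proc_return unless finished
end

p(exit_kind { next })           # => :next_or_completion
p(exit_kind { })                # => :next_or_completion
p(exit_kind { break })          # => :break_or_proc_return
p(exit_kind { raise "boom" })   # => :exception
```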
A block is a part of the AST. Like a pair of braces.
A proc is a function.
A lambda is a proc that treats args and returns differently.
FWIW, Matz himself called this difference "a design mistake".
E.g., given:
    def foo = proc do 42 end
    def bar = proc do return 42 end
Then `foo.call` is fine, but `bar.call` will indeed give a LocalJumpError as you say.
But a return in a proc that hasn't escaped its defining scope is fine:
    def baz = proc do return 42 end.call
Calling `baz` here will just return 42 to the surrounding scope.
    def foo
      proc do
        return 42
      end.call
    end
    def bar
      proc do
        return 42
      end
    end
    p foo
    p bar.call
Will produce:
    42
    test.rb:11:in `block in bar': unexpected return (LocalJumpError)
    from test.rb:16:in `<main>'
So in other words, it's handled.
https://www.goodreads.com/book/show/35980970-mastering-ruby-... (not sure why the book was removed from the Pragmatic Bookshelf website).
vidarh•1mo ago
You can obtain a value of a block by naming it, and when you do, what you obtain is a proc.
A Ruby implementation could if it chooses make any block a proc, because you shouldn't be able to tell the difference without extensive contortions (e.g. you could iterate over ObjectSpace and show that a block causes the creation - or not - of an object of the Proc class). And in fact my (woefully buggy and incomplete) Ruby compiler prototype does just that.
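Naming the block looks roughly like this. The capture method is invented for the example; the &block parameter syntax and Proc#lambda? are standard Ruby:

```ruby
# capture is a hypothetical method that just hands back its block.
def capture(&block)
  block
end

captured = capture { |x| x * 2 }
p captured.class      # => Proc
p captured.lambda?    # => false
p captured.call(21)   # => 42
```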
jez•1mo ago
Ruby doesn’t have to allocate a full-on, garbage-collected, closure object every time a function accepts a block: it only has to do this if the block gets stored to a variable. If the block is only ever yield’d to, the allocation can be skipped.
And when your language’s primary looping mechanism is done with blocks, the difference adds up:
Ruby was able to get away with its closure-heavy standard library APIs without a JIT for almost 3 decades because of the affordances that blocks provide over procs/lambdas.
drnewman•1mo ago