Ruby Under a Microscope

分词与语法解析

使用 Ripper 输出 lex 结果。

require 'ripper'
require 'pp'
#ripper is not parser, it can't find error.
code = <<STR
10.times do |n|
  puts n
end
STR

puts code
pp Ripper.lex(code)

Ripper.sexp 输出 parse 结果，也可以使用命令行 ruby --dump parsetree xxxx.rb 得到。前者是 Ripper 的 AST 展示格式，后者是实际内部的 c 语言 node 节点信息。
Ruby 使用手写的 tokenizer ，以及 bison 写的 parser —— parse.y ,bison生成的解释器是 LALR Parser。

编译

Ruby 1.8 没有编译器， Ruby 1.9 之后引入了 YARV( yet another ruby vm) 中间指令。但是 Ruby 并没有独立的编译器，而是在运行时动态编译成字节码，并交给 VM 解释执行。很可惜， Ruby 还是没有 JIT，将字节码编译成本地机器码。但是从测试来看， 1.9 之后的性能，已经远比 1.8 高（简单测试是 4.25 倍左右）， 1.8 还是原始的解释执行 AST 的方式。
编译的过程本质是遍历 AST ，然后生成 YARV 字节码的过程，具体参考 https://github.com/ruby/ruby/blob/trunk/compile.c 中的 iseq_compile_each 函数，一个大的 switch 派发。
NODE_SCOPE 表示开始一个新的作用域，作用域绑定着一个本地表 local table，类似 JVM 里的局部变量区，参数和局部变量的信息会放在这里。
查看 YARV 字节码：

code = <<END
10.times do |n|
  puts n
end
END

puts RubyVM::InstructionSequence.compile(code).disasm

输出

== disasm: #<ISeq:<compiled>@<compiled>>================================
== catch table
| catch type: break  st: 0002 ed: 0008 sp: 0000 cont: 0008
|------------------------------------------------------------------------
0000 trace            1                                               (   1)
0002 putobject        10
0004 send             <callinfo!mid:times, argc:0>, <callcache>, block in <compiled>
0008 leave
== disasm: #<ISeq:block in <compiled>@<compiled>>=======================
== catch table
| catch type: redo   st: 0002 ed: 0010 sp: 0000 cont: 0002
| catch type: next   st: 0002 ed: 0010 sp: 0000 cont: 0010
|------------------------------------------------------------------------
local table (size: 2, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 2] n<Arg>
0000 trace            256                                             (   1)
0002 trace            1                                               (   2)
0004 putself
0005 getlocal_OP__WC__0 2
0007 opt_send_without_block <callinfo!mid:puts, argc:1, FCALL|ARGS_SIMPLE>, <callcache>
0010 trace            512                                             (   3)
0012 leave                                                            (   2)

其中的 local table 就是本地表，<callinfo!mid:times, argc:0>, <callcache>, block in <compiled> 这里表示为 10.times 传递了一个 Block，它的指令在下面。

此外，想函数的默认参数、命名参数都是通过生成额外的指令来支持，前者就是加入判断，后者是引入匿名的 hash 表。

YARV 执行代码

整体上， YARV 跟 JVM 的构造机器类似。 YARV 也是有一个调用栈，每个栈帧 rb_control_frame_t 包含 sp( stack pointer，指向栈顶）， pc（程序计数器，当前指令地址），self（接收者）和 type （节点类型）等信息。CFP (current frame pointer) 指向当前的 rb_control_frame_t。调用就是压入和弹出栈帧，栈帧内部维护操作数栈，pc 指向指令地址，对操作数和接收者进行入栈出栈操作，根据指令求值。YARV 也被称为是双堆栈虚拟机。
所有 YARV 指令定义在 https://github.com/ruby/ruby/blob/bd2fd73196bbff7dc5349c624342e212c09d174e/insns.def，最终经过 Miniruby 转成 vm.inc 的 c 语言代码。
指令基本格式

  instruction comment
  @c: category
  @e: english description
  @j: japanese description
  instruction form:
    DEFINE_INSN
    instruction_name
    (instruction_operands, ..)
    (pop_values, ..)
    (return value)
    {
       .. // insn body
    }

DEFINE_INSN
getlocal
(lindex_t idx, rb_num_t level)
()
(VALUE val)
{
    int i, lev = (int)level;
    const VALUE *ep = GET_EP();

    /* optimized insns generated for level == (0|1) in defs/opt_operand.def */
    for (i = 0; i < lev; i++) {
	ep = GET_PREV_EP(ep);
    }
    val = *(ep - idx);
}

本地变量的访问，通过 getlocal 和 setlocal 指令，当 CFP 变化的时候，为了访问本栈帧之外的 local 变量， YARV 还引入了一个叫 EP（ environment pointer) 的指针，它被设置为 SP-1。栈帧之间的 EP 形成了一种层次结构（其实就是嵌套作用域），通过 EP 的移动来访问外部环境的本地变量。
内部栈还有两个特殊栈帧 special 和 svar/cref， special 用于保存传递了 Block 代码块的指针，指向代码块所在的栈帧，让 EP 可以找到正确的栈帧。后者 svar 用于保存特殊变量，$ 开头的一些特殊变量，特别是跟正则相关的，比如 $&, $~ 等。而 cref 用于标示是否要在一个新的词法作用域内(lexical scope)执行。 Ruby 中开启新的词法作用域的只有：使用class关键字定义一个类；使用module 定义一个模块；使用def关键字定义一个方法。而 Block 是没有的。这一块在 ruby 元编程里有详细描述。

控制结构和方法调度

if 语句本质上是使用 branchunless 或者 branchif 根据 test 的计算结果为 true/false 来决定跳转到哪个代码分支继续执行。
跨作用域的跳转（比如 break 跳转到父作用域），则是使用 throw 指令 + 捕获表实现，向下找到最近的捕获表的 break 指针，然后充值 pc 和 ep 指针，从指针后的代码开始执行。rescue、ensure、retry、redo 和 next 的实现与此类似。

for 只是 each 的封装，查看

code = <<END
for i in 0..5
  puts i
end
END

puts RubyVM::InstructionSequence.compile(code).disasm

输出：

== disasm: #<ISeq:<compiled>@<compiled>>================================
== catch table
| catch type: break  st: 0002 ed: 0008 sp: 0000 cont: 0008
|------------------------------------------------------------------------
local table (size: 2, argc: 0 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 2] i
0000 trace            1                                               (   1)
0002 putobject        0..5
0004 send             <callinfo!mid:each, argc:0>, <callcache>, block in <compiled>
0008 leave
== disasm: #<ISeq:block in <compiled>@<compiled>>=======================
== catch table
| catch type: redo   st: 0006 ed: 0014 sp: 0000 cont: 0006
| catch type: next   st: 0006 ed: 0014 sp: 0000 cont: 0014
|------------------------------------------------------------------------
local table (size: 2, argc: 1 [opts: 0, rest: -1, post: 0, block: -1, kw: -1@-1, kwrest: -1])
[ 2] ?<Arg>
0000 getlocal_OP__WC__0 2                                             (   3)
0002 setlocal_OP__WC__1 2                                             (   1)
0004 trace            256
0006 trace            1                                               (   2)
0008 putself
0009 getlocal_OP__WC__1 2
0011 opt_send_without_block <callinfo!mid:puts, argc:1, FCALL|ARGS_SIMPLE>, <callcache>
0014 trace            512                                             (   3)
0016 leave

可以看到是调用了 0..5 的 each 方法，然后将 block 参数拷贝给了局部变量 i。

send 是最核心和最复杂的控制结构， ruby 有 11 种方法类型
ISEQ 是普通方法， CFUNC 是 c 语言编写的代码，都是 ruby 的内部实现。ATTRSET 是 attr_writer， IVAR 是 attr_reader， BMETHOD 表示 define_method 传入的 proc 对象定义的方法，UNDEF 用于移除方法， MISSING 是方法不存在的时候调用，其他等等。
对于 attr_reader 和 attr_writer， ruby 内部做了优化，不会创建新的栈帧，因为方法非常简短并且不会出错，也就需要错误的堆栈信息。内部使用 c 语言实现的 vm_setivar 和 vm_getivar 快速调用。
命名参数本质是创建了一个 hash 来包装，如果 hash.key? 返回 false，也就是不存在，就使用默认值。具体可以看下面这段代码的输出：

code = <<END
  def add_two(a: 2, b: 3)
    a + b
  end

  puts add_two(1, 1)
END

puts RubyVM::InstructionSequence.compile(code).disasm

对象与类

Ruby 对象 RObject

在 include/ruby/ruby.h 中定义：

struct RBasic {
    VALUE flags;
    const VALUE klass;
}
#ifdef __GNUC__
    __attribute__((aligned(sizeof(VALUE))))
#endif
;

struct RObject {
    struct RBasic basic;
    union {
	struct {
	    uint32_t numiv;
	    VALUE *ivptr;
            void *iv_index_tbl; /* shortcut for RCLASS_IV_INDEX_TBL(rb_obj_class(obj)) */
	} heap;
	VALUE ary[ROBJECT_EMBED_LEN_MAX];
    } as;
};

其中：

 RBasic 里的 class 指针指向了 RClass，也就是对象所属的 class。
 flags 用于存储内部专用的各种标志位。
 numiv 表示实例变量数目
 ivptr 实例变量数组
 iv_index_tbl  指向散列表的指针，该散列表保存了实例变量名(ID) 及其在 ivptr 数组中位置的映射，这些散列值存储在 RClass 的结构体中，该指针只是一个简单的缓存来加速访问。

基本类型对象，比如字符串、数组、正则表达式等有单独的结构体，例如 RString 、RArray 和 RRegexp 等等，每个实例内部同样有 basic 指针。

struct RString {
    struct RBasic basic;
    union {
	struct {
	    long len;
	    char *ptr;
	    union {
		long capa;
		VALUE shared;
	    } aux;
	} heap;
	char ary[RSTRING_EMBED_LEN_MAX + 1];
    } as;
};

struct RArray {
    struct RBasic basic;
    union {
	struct {
	    long len;
	    union {
		long capa;
		VALUE shared;
	    } aux;
	    const VALUE *ptr;
	} heap;
	const VALUE ary[RARRAY_EMBED_LEN_MAX];
    } as;
};

等等。但是数字、符号等一些简单的立即值，没有使用任何结构体，而是直接将它们放到 VALUE 内，并且留前面几个 bit 位来标记类型：

   [   Integer value   | Flags ]

基本类型对象也有实例变量，但是单独保存在 generic_iv_tbl 的特殊散列表里。

RClass 结构体

Ruby 2.3.0 开始将 RClass 从 include/ruby/ruby.h 迁移到了 internal.h 中，为了信息隐藏：

struct rb_classext_struct {
    struct st_table *iv_index_tbl;
    struct st_table *iv_tbl;
    struct rb_id_table *const_tbl;
    struct rb_id_table *callable_m_tbl;
    rb_subclass_entry_t *subclasses;
    rb_subclass_entry_t **parent_subclasses;
    /**
     * In the case that this is an `ICLASS`, `module_subclasses` points to the link
     * in the module's `subclasses` list that indicates that the klass has been
     * included. Hopefully that makes sense.
     */
    rb_subclass_entry_t **module_subclasses;
    rb_serial_t class_serial;
    const VALUE origin_;
    VALUE refined_class;
    rb_alloc_func_t allocator;
};

struct RClass {
    struct RBasic basic;
    VALUE super;
    rb_classext_t *ptr;
    struct rb_id_table *m_tbl;
};

其中：

 m_table 是方法的散列表，以方法名或者 ID 为键，以每个方法的定义的指针为值。
 iv_index_tbl 是属性名散列表，是实例变量名和 RObject 实例变量数组属性值索引位置的映射。RObject 里有 iv_index_tbl 缓存指向这个散列表。
 iv_tbl 类级别的实例变量和类变量，包括他们的名字和值。
 const_tbl 常量散列表。
 origin 用于实现 Module#prepend 特性。
 allocator 用于分配内存。
 super 指向超类 RClass 的指针。

类变量(@@)和类的实例变量(@)的区别：类变量在所有子类中共用；类实例变量在该类和子类中创建各自独自的值。但是他们都保存在 iv_tbl，通过名字区分。
类变量的查找顺序是会从该类和超类找起来，找到最高层的的超类的变量副本。而类的实例变量只在当前类查找。
类的类方法 (self.xxx) 是保存在元类 metaclass，类的 RBasic 里 klass 指向的就是它的元类。一个小实验：

irb(main):001:0> ObjectSpace.count_objects[:T_CLASS]
=> 912
irb(main):002:0> class Test end
=> nil
irb(main):003:0> ObjectSpace.count_objects[:T_CLASS]
=> 914

方法查找和常量查找

moule 也是 class，但是移除了 iv_index_tbl， allocator 等，因为模块没有对象级别的属性和方法。
include 一个模块，本质上是拷贝该模块的 RClass 结构体形成一个新副本，然后作为类的新超类，模块副本加入了祖先继承链。include 的核心逻辑在 ruby.c 里的 rb_include_module 和 include_modules_at 函数里。复制发生在 rb_include_class_new 函数。

VALUE
rb_include_class_new(VALUE module, VALUE super)
{
    VALUE klass = class_alloc(T_ICLASS, rb_cClass);

    if (BUILTIN_TYPE(module) == T_ICLASS) {
	module = RBASIC(module)->klass;
    }
    if (!RCLASS_IV_TBL(module)) {
	RCLASS_IV_TBL(module) = st_init_numtable();
    }
    if (!RCLASS_CONST_TBL(module)) {
	RCLASS_CONST_TBL(module) = rb_id_table_create(0);
    }
    RCLASS_IV_TBL(klass) = RCLASS_IV_TBL(module);
    RCLASS_CONST_TBL(klass) = RCLASS_CONST_TBL(module);

    RCLASS_M_TBL(OBJ_WB_UNPROTECT(klass)) =
      RCLASS_M_TBL(OBJ_WB_UNPROTECT(RCLASS_ORIGIN(module))); /* TODO: unprotected? */

    RCLASS_SET_SUPER(klass, super);
    if (RB_TYPE_P(module, T_ICLASS)) {
	RBASIC_SET_CLASS(klass, RBASIC(module)->klass);
    }
    else {
	RBASIC_SET_CLASS(klass, module);
    }
    OBJ_INFECT(klass, module);
    OBJ_INFECT(klass, super);

    return (VALUE)klass;
}

static int include_modules_at(const VALUE klass, VALUE c, VALUE module, int search_super);

void
rb_include_module(VALUE klass, VALUE module)
{
    int changed = 0;

    rb_frozen_class_p(klass);
    Check_Type(module, T_MODULE);
    OBJ_INFECT(klass, module);

    changed = include_modules_at(klass, RCLASS_ORIGIN(klass), module, TRUE);
    if (changed < 0)
	rb_raise(rb_eArgError, "cyclic include detected");
}

static enum rb_id_table_iterator_result
add_refined_method_entry_i(ID key, VALUE value, void *data)
{
    rb_add_refined_method_entry((VALUE)data, key);
    return ID_TABLE_CONTINUE;
}

static int
include_modules_at(const VALUE klass, VALUE c, VALUE module, int search_super)
{
    VALUE p, iclass;
    int method_changed = 0, constant_changed = 0;
    struct rb_id_table *const klass_m_tbl = RCLASS_M_TBL(RCLASS_ORIGIN(klass));

    while (module) {
	int superclass_seen = FALSE;
	struct rb_id_table *tbl;

	if (RCLASS_ORIGIN(module) != module)
	    goto skip;
	if (klass_m_tbl && klass_m_tbl == RCLASS_M_TBL(module))
	    return -1;
	/* ignore if the module included already in superclasses */
	for (p = RCLASS_SUPER(klass); p; p = RCLASS_SUPER(p)) {
	    int type = BUILTIN_TYPE(p);
	    if (type == T_ICLASS) {
		if (RCLASS_M_TBL(p) == RCLASS_M_TBL(module)) {
		    if (!superclass_seen) {
			c = p;  /* move insertion point */
		    }
		    goto skip;
		}
	    }
	    else if (type == T_CLASS) {
		if (!search_super) break;
		superclass_seen = TRUE;
	    }
	}
	iclass = rb_include_class_new(module, RCLASS_SUPER(c));
	c = RCLASS_SET_SUPER(c, iclass);

	{
	    VALUE m = module;
	    if (BUILTIN_TYPE(m) == T_ICLASS) m = RBASIC(m)->klass;
	    rb_module_add_to_subclasses_list(m, iclass);
	}

	if (FL_TEST(klass, RMODULE_IS_REFINEMENT)) {
	    VALUE refined_class =
		rb_refinement_module_get_refined_class(klass);

	    rb_id_table_foreach(RMODULE_M_TBL(module), add_refined_method_entry_i, (void *)refined_class);
	    FL_SET(c, RMODULE_INCLUDED_INTO_REFINEMENT);
	}

	tbl = RMODULE_M_TBL(module);
	if (tbl && rb_id_table_size(tbl)) method_changed = 1;

	tbl = RMODULE_CONST_TBL(module);
	if (tbl && rb_id_table_size(tbl)) constant_changed = 1;
      skip:
	module = RCLASS_SUPER(module);
    }

    if (method_changed) rb_clear_method_cache_by_class(klass);
    if (constant_changed) rb_clear_constant_cache();

    return method_changed;
}

在代码最后，如果发现方法有变化，就清空方法缓存，如果常量有变化，就清空常量缓存。参考 http://ju.outofmemory.cn/entry/135587

Ruby 的方法缓存包含两层：全局的方法缓存，用于缓存接收者和实现类之间的映射，因为方法查找是要遍历整个继承链的，缓存可以加速这个调用。其次是内联方法缓存，缓存 Ruby 已经执行的已编译的 YARV 指令信息，这样可以避免查找，加速的原理和 clojure 的 direct linking 技术是一样的。无论是定义新方法、include 模块或者其他类似的操作， Ruby 都会去清空这两个缓冲。
多次include 不同模块，最近 include 的模块作为直接超类向上延伸。
模块也可以 include 模块，规则与类 include 模块一致，也是副本插入作为超类，作为目标类和原始超类之间新的超类。
Module prepend 例子：

module Professor
  def name
    "Prof. #{super}"
  end
end
class Mathematician
  attr_accessor :name
  prepend Professor
end

p  = Mathematician.new
p.name = 'Johann Carl Friedrich Gauss'

p p.name

prepend 虽然仍然会将 Professor 设置为 Mathematician 的新超类，但是同时会拷贝一份 Mathematician 作为 Mathematician 原生类（Origin class)，将这个原生类作为 Professor 的超类，这就可以解释为什么 Professor#name 的 super 能调用到 Mathematician 的 name 方法。参考 http://ju.outofmemory.cn/entry/135588
修改已被 include 模块，比如增加方法，所有 include 该模块的类都将包含新方法，因为共享 m_tbl 方法表，Ruby 在 include 的时候拷贝的只是 RClass struct，不拷贝底层的方法表，看下面例子：

module Professor
  def letcures ; end
end

class Mathematician
  attr_accessor :first_name
  attr_accessor :last_name

  include Professor
end

p = Mathematician.new
p.first_name = 'hello'
p.last_name = 'world'

p p.methods.sort

#open Professor, adds new method
module Professor
  def classroom; end
end

p p.methods.sort

但是修改已被 include 模块中 include 的其他模块，不会影响插入到 include 类的已经被拷贝的模块副本，也就不会增加或者删除方法。
当创建一个 class或者模块的时候，其实是新建了一层词法作用域，Ruby 用两个指针来标示： nd_clss ，指向当前作用域对应的模块或者类；nd_next 指向父层或者上下文的词法作用域。形成一个作用域链条。
常量的查找跟方法的查找类似，只是方法的查找是通过祖先连(super) 来查找，而常量是通过迭代词法作用域链（nd_next)来查找。
Ruby 优先通过词法作用域来查找常量：

class SuperClass
  FIND_ME = "Found in Superclass"
end

module ParentLexicalScope
  FIND_ME = "Found in ParentLexicalScope"

  module ChildLexicalScope

    class SubClass < SuperClass
      p FIND_ME
    end
  end
end

输出 "Found in ParentLexicalScope"

真实的 Ruby 常量查找还需要加入 autoload 关键字：

『检索词法作用域链 -> 为了个作用域的类检查 autoload -> 检索超类链 -> 为每个超类检查 autoload -> 调用 const_missing。』。

关于 autoload http://www.rubyinside.com/ruby-techniques-revealed-autoload-1652.html