Breaking the barrier between different programming languages is both interesting and beneficial. For example, ByteDance developed rust2go[1] to facilitate migrating Golang projects to Rust smoothly[2].
Importing Rust into Ruby can not only improve performance but also reduce tedious, repetitive work. For instance, Rust has implemented OpenTelemetry metrics, whereas Ruby hasn't. Wrapping the Rust implementation for use in Ruby can save a lot of effort. Ccache[3] is an experimental project exploring this direction.
Ccache is an etag-based local cache that saves cache values in Arc, which are then queried by Ruby. It needs to consider how to handle memory efficiently between the two systems.
This article introduces an idea for making Rust's Arc work between Rust and Ruby without copying.
Programs allocate memory and, after usage, need to consider how to return that memory.
The most straightforward way is managing memory manually, where programmers need to know when a variable can be freed and release it to the system. This is how C works; however, it's error-prone, and many memory bugs can be found in C programs.
Rust improves this by providing ownership. Variables belong to a specific scope, and when execution goes out of that scope, Rust releases the memory. This makes memory management explicit in the code.
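As a small illustration of scope-based release, the sketch below records the order in which values are dropped. The names `Tracked`, `consume`, and `drop_order` are invented for this example and do not come from Ccache.

```rust
use std::cell::RefCell;

/// A value that writes its name into a shared log when it is dropped.
struct Tracked<'a> {
    name: &'static str,
    log: &'a RefCell<Vec<&'static str>>,
}

impl Drop for Tracked<'_> {
    fn drop(&mut self) {
        self.log.borrow_mut().push(self.name);
    }
}

/// Takes ownership of the value, so the value is dropped when this returns.
fn consume(_value: Tracked<'_>) {}

/// Returns the names in the order the values were dropped.
fn drop_order() -> Vec<&'static str> {
    let log = RefCell::new(Vec::new());
    {
        let _outer = Tracked { name: "outer", log: &log };
        {
            let _inner = Tracked { name: "inner", log: &log };
        } // `_inner` leaves its scope and is dropped here
        let moved = Tracked { name: "moved", log: &log };
        consume(moved); // ownership moves into `consume`, which drops it
    } // `_outer` is dropped here
    log.into_inner()
}

fn main() {
    println!("{:?}", drop_order());
}
```

No call to a free function appears anywhere: the closing braces and the move into `consume` decide when each value is released.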
Arc (an atomically reference-counted pointer) lets several owners share one value. Cloning an Arc increments the reference count, and dropping a clone decrements it. Rust releases the memory when the reference count reaches zero.
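The counting can be observed with the standard library alone. This is a minimal sketch; the function name `strong_counts` is made up for the example.

```rust
use std::sync::Arc;
use std::thread;

/// Returns the strong count at three points: after creation,
/// after cloning, and after the clone has been dropped by a thread.
fn strong_counts() -> (usize, usize, usize) {
    let shared = Arc::new(String::from("cached value"));
    let initial = Arc::strong_count(&shared);

    // Cloning copies the pointer and bumps the count; the String is not copied.
    let for_thread = Arc::clone(&shared);
    let after_clone = Arc::strong_count(&shared);

    // The closure owns `for_thread` and drops it when it finishes.
    let handle = thread::spawn(move || for_thread.len());
    handle.join().unwrap();
    let after_join = Arc::strong_count(&shared);

    (initial, after_clone, after_join)
}

fn main() {
    println!("{:?}", strong_counts());
}
```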
Ruby's Garbage Collection (GC) allows programs to use memory without considering when to release it. The Ruby VM triggers GC when necessary, and during GC, it finds unreachable variables and returns their memory to the system or memory pool.
The problem arises when we use Rust's Arc and need to pass the value to the Ruby part.
#[macro_use]
extern crate rutie;
#[macro_use]
extern crate lazy_static; // needed by wrappable_struct!

use rutie::{AnyObject, Class, NilClass, Object, RString};
use std::collections::HashMap;
use std::sync::Arc;

pub struct Store {
    hash_map: HashMap<String, Arc<AnyObject>>,
}

impl Store {
    fn new() -> Self {
        Store {
            hash_map: HashMap::new(),
        }
    }
}

// Lets a Store be wrapped inside a Ruby object.
wrappable_struct!(Store, StoreWrapper, STORE_WRAPPER);

class!(RubyStore);

methods!(
    RubyStore,
    rtself,
    fn ruby_new() -> AnyObject {
        let store = Store::new();
        Class::from_existing("RubyStore").wrap_data(store, &*STORE_WRAPPER)
    },
    fn ruby_insert(key: RString, obj: AnyObject) -> AnyObject {
        let rbself = rtself.get_data_mut(&*STORE_WRAPPER);
        // The Arc wraps only the Ruby handle; Ruby still owns the object.
        rbself
            .hash_map
            .insert(key.unwrap().to_string(), Arc::new(obj.unwrap()));
        NilClass::new().into()
    },
    fn ruby_get(rb_key: RString) -> AnyObject {
        let rbself = rtself.get_data_mut(&*STORE_WRAPPER);
        let key = rb_key.unwrap().to_string();
        // Panics if the key is missing; acceptable for a demo.
        let val = rbself.hash_map.get(&key).unwrap();
        AnyObject::from(val.value())
    },
);

#[allow(non_snake_case)]
#[no_mangle]
pub extern "C" fn Init_ruby_example() {
    Class::new("RubyStore", None).define(|klass| {
        klass.def_self("new", ruby_new);
        klass.def("insert", ruby_insert);
        klass.def("get", ruby_get);
    });
}
The Store struct is a simple HashMap whose values are Arcs of Ruby AnyObject. The Arc is there so that cached values can be shared between concurrent users.
And it seems to work well:
it 'works' do
  store = RubyStore.new
  foo = Foo.new(1, 2)
  store.insert("key", foo)
  sleep 0.1
  GC.start
  expect(store.get("key").class).to eq Foo
  expect(store.get("key").a).to eq 1
  expect(store.get("key").b).to eq 2
end
The above example works only because the Foo object is still referenced by the foo variable, so Ruby's GC has not freed its memory.
To trigger a segmentation fault or other memory issues, we create the Foo object and pass it directly to RubyStore, without keeping a Ruby variable that refers to it.
it 'has memory issues :(' do
  store = RubyStore.new
  store.insert("key", Foo.new(1, 2))
  sleep 0.1
  GC.start
  expect(store.get("key").class).to eq Foo
end
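This time it raises a segmentation fault. Ruby's GC cannot see the reference held inside the Rust HashMap, so it treats the Foo object as unreachable and reclaims it, leaving Rust with a dangling handle.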
We can avoid this by deep cloning the Arc's value; however, that is not zero cost. Serializing a large object may take more than 1 ms in Ruby[3]. To improve performance, we need to consider other methods.
To work around this situation, one option is to let Rust allocate the memory and free it through drop. However, this means Rust would need to know whether the allocated memory is still being used by Ruby, which is not feasible. Thus, the memory must be allocated by Ruby.
Rust's ownership model is useful here because it makes each owner responsible for cleaning up what it owns. We need to divide the responsibilities between Ruby and Rust. The memory is allocated by Ruby, so Ruby has the duty to release it. Rust uses the memory but does not own it. Thus, we can still use Arc; the only difference is that we do not return the memory when the Arc is dropped.
pub struct RubyObject {
    value: rutie::types::Value,
}

impl Drop for RubyObject {
    // Drop nothing: the memory belongs to Ruby and is reclaimed by Ruby's GC.
    fn drop(&mut self) {}
}
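To show the split of responsibilities without a Ruby VM, here is a std-only simulation. `FakeRubyHeap`, `BorrowedObject`, and `liveness_after_drop_and_sweep` are hypothetical stand-ins invented for this sketch; real code would hold a rutie Value as above.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for memory owned by the Ruby VM.
struct FakeRubyHeap {
    slots: Vec<Option<String>>,
}

impl FakeRubyHeap {
    fn new() -> Self {
        FakeRubyHeap { slots: Vec::new() }
    }

    /// "Ruby" allocates the object and hands Rust a non-owning handle.
    fn allocate(&mut self, contents: &str) -> BorrowedObject {
        self.slots.push(Some(contents.to_string()));
        BorrowedObject { slot: self.slots.len() - 1 }
    }

    /// Only the simulated GC is allowed to free a slot.
    fn sweep(&mut self, slot: usize) {
        self.slots[slot] = None;
    }

    fn is_live(&self, slot: usize) -> bool {
        self.slots[slot].is_some()
    }
}

/// Non-owning handle, playing the role of RubyObject.
struct BorrowedObject {
    slot: usize,
}

impl Drop for BorrowedObject {
    // Intentionally empty: the handle never frees the slot it points to.
    fn drop(&mut self) {}
}

/// Returns whether the slot is live after all Arcs are gone,
/// and again after the simulated GC has swept it.
fn liveness_after_drop_and_sweep() -> (bool, bool) {
    let mut heap = FakeRubyHeap::new();
    let handle = Arc::new(heap.allocate("Foo(1, 2)"));
    let slot = handle.slot;
    let second = Arc::clone(&handle);

    drop(handle);
    drop(second); // count hits zero, the empty drop runs, nothing is freed
    let after_drop = heap.is_live(slot);

    heap.sweep(slot);
    let after_sweep = heap.is_live(slot);
    (after_drop, after_sweep)
}

fn main() {
    println!("{:?}", liveness_after_drop_and_sweep());
}
```

The reference count still tracks how many Rust users exist, but reaching zero has no effect on the underlying storage.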
Rust has done its job properly, so we need to consider Ruby's job now. Ruby allocates the memory, so Ruby must free it. Ruby also passes the object to Rust, so it needs to record that the object is still in use.
klass.def("insert_inner", ruby_insert);
class RubyStore
  def insert(key, val)
    @_val = val
    insert_inner(key, val)
  end
end
The value is assigned to the instance variable @_val, making it reachable through the RubyStore instance. This prevents Ruby's GC from reclaiming its memory. When the key is deleted, we can set @_val = nil to "free" its memory.
Now, everything works well.
Clearly, the current solution is far from ideal: a single @_val only keeps the most recently inserted value alive. To improve it, we can use a doubly linked list, held in a Ruby instance variable, to save all the objects referenced by an Arc. When Drop is called, instead of doing nothing, we can remove the reference from the doubly linked list.
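The bookkeeping can be sketched in plain Rust. To keep the example short, it swaps the doubly linked list for a HashSet of ids, which also removes entries in constant time. `PinSet`, `PinnedObject`, `pinned_ids`, and `pin_lifecycle` are hypothetical names for this sketch, and the set only simulates the Ruby-side structure.

```rust
use std::collections::HashSet;
use std::sync::{Arc, Mutex};

/// Simulates the Ruby-side structure that keeps objects reachable.
type PinSet = Arc<Mutex<HashSet<u64>>>;

/// Handle stored inside an Arc; it registers itself on creation.
struct PinnedObject {
    id: u64,
    pins: PinSet,
}

impl PinnedObject {
    fn new(id: u64, pins: &PinSet) -> Self {
        pins.lock().unwrap().insert(id);
        PinnedObject { id, pins: Arc::clone(pins) }
    }
}

impl Drop for PinnedObject {
    // Runs once, when the last Arc is dropped: unpin so the GC may collect.
    fn drop(&mut self) {
        if let Ok(mut set) = self.pins.lock() {
            set.remove(&self.id);
        }
    }
}

/// Sorted snapshot of the ids that are currently pinned.
fn pinned_ids(pins: &PinSet) -> Vec<u64> {
    let mut ids: Vec<u64> = pins.lock().unwrap().iter().copied().collect();
    ids.sort();
    ids
}

/// Returns the pinned ids at the start, after some drops, and at the end.
fn pin_lifecycle() -> (Vec<u64>, Vec<u64>, Vec<u64>) {
    let pins: PinSet = Arc::new(Mutex::new(HashSet::new()));
    let a = Arc::new(PinnedObject::new(1, &pins));
    let b = Arc::new(PinnedObject::new(2, &pins));
    let a2 = Arc::clone(&a);
    let start = pinned_ids(&pins);

    drop(a); // object 1 is still shared through `a2`
    drop(b); // last reference to object 2, so it is unpinned
    let middle = pinned_ids(&pins);

    drop(a2); // last reference to object 1
    let end = pinned_ids(&pins);
    (start, middle, end)
}

fn main() {
    println!("{:?}", pin_lifecycle());
}
```

Because the removal lives in Drop, an object stays pinned exactly as long as at least one Arc to it exists, and no manual nil assignment is needed.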
With this solution, Rust depends on Ruby's GC to reclaim memory, and it requires Ruby's cooperation. Rust does not trust programmers completely, so by Rust's standards this is not strictly safe. Ruby, however, works differently: it assumes programmers will do the right thing (though they often don't). Thus, this solution is acceptable for Ruby. To make it safer and cleaner, we can handle the reference bookkeeping in Rust code.
Another interesting direction is to make ownership work in Ruby, not only for Arc, but also for mutable/immutable and lock usage. This could make the interaction smoother and make Ruby safer.
This article discussed a method to integrate Rust's Arc into Ruby, ensuring memory safety without the need for deep cloning.
You can find the whole example in rust_arc_demo[4] and a use case in Ccache.