<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>yfractal (yang)</title>
    <link>https://ruby-china.org/yfractal</link>
    <description>Perspective matters</description>
    <language>en-us</language>
    <item>
      <title>Recurrent Neural Network (RNN) Introduction</title>
      <description>&lt;p&gt;Original post: &lt;a href="https://github.com/yfractal/blog/blob/master/blog/2025-12-23-rnn-introduction.md" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/blog/blob/master/blog/2025-12-23-rnn-introduction.md&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Implementation: &lt;a href="https://github.com/yfractal/rnn-rb" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/rnn-rb&lt;/a&gt;&lt;/p&gt;</description>
      <author>yfractal</author>
      <pubDate>Fri, 26 Dec 2025 09:18:14 +0800</pubDate>
      <link>https://ruby-china.org/topics/44427</link>
      <guid>https://ruby-china.org/topics/44427</guid>
    </item>
    <item>
      <title>How SDB Scans the Ruby Stack Without the GVL</title>
      <description>&lt;p&gt;Link: &lt;a href="https://github.com/yfractal/blog/blob/master/blog/2025-01-15-non-blocking-stack-profiler.md" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/blog/blob/master/blog/2025-01-15-non-blocking-stack-profiler.md&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This post explains why SDB can scan the Ruby stack without taking the global lock (GVL) and still get the results it needs.&lt;/p&gt;</description>
      <author>yfractal</author>
      <pubDate>Wed, 15 Jan 2025 23:03:05 +0800</pubDate>
      <link>https://ruby-china.org/topics/44021</link>
      <guid>https://ruby-china.org/topics/44021</guid>
    </item>
    <item>
      <title>A Simple Stack Profiler</title>
      <description>&lt;p&gt;&lt;a href="https://github.com/yfractal/sdb_signal" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/sdb_signal&lt;/a&gt;&lt;/p&gt;

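&lt;p&gt;This is a very simple stack profiler, built mainly as a performance baseline to compare against &lt;a href="https://github.com/yfractal/sdb" rel="nofollow" target="_blank" title=""&gt;SDB&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Because it is simple, it is a good way to learn how a Ruby stack profiler is implemented and how to write a Ruby extension in Rust.&lt;/p&gt;

&lt;p&gt;A typical stack profiler first installs a signal handler and then triggers the signal at a fixed interval, for example every 1 ms. Inside the signal handler, it scans the thread's current stack with Ruby's built-in &lt;code&gt;rb_profile_thread_frames&lt;/code&gt;. Finally, it resolves symbols with functions such as &lt;code&gt;rb_profile_frame_full_label&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Usage:&lt;/p&gt;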
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s1"&gt;'sdb_signal'&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="nb"&gt;sleep&lt;/span&gt; &lt;span class="mi"&gt;10000000&lt;/span&gt;
  &lt;span class="k"&gt;else&lt;/span&gt;
    &lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;


&lt;span class="n"&gt;threads&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;times&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
  &lt;span class="n"&gt;threads&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;Thread&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="no"&gt;SdbSignal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setup_signal_handler&lt;/span&gt;
&lt;span class="no"&gt;SdbSignal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start_scheduler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Nothing is printed here. To see output, find the corresponding Rust code and add a print or log statement.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
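&lt;p&gt;The signal-driven sampling loop described earlier can be sketched in a few lines of Python. This only illustrates the technique and is not how sdb_signal is implemented: it uses Python's &lt;code&gt;signal.setitimer&lt;/code&gt; instead of Ruby's C API, it samples only the main thread (sdb_signal is given a list of threads), and &lt;code&gt;frame_labels&lt;/code&gt;, &lt;code&gt;profile&lt;/code&gt;, and &lt;code&gt;busy&lt;/code&gt; are hypothetical names. It needs a Unix-like system and must run on the main thread.&lt;/p&gt;

```python
import collections
import signal


def frame_labels(frame):
    """Walk a frame chain and return function names, innermost first."""
    labels = []
    while frame is not None:
        labels.append(frame.f_code.co_name)
        frame = frame.f_back
    return labels


def profile(work, interval=0.001):
    """Run work() and sample the main thread's stack every `interval` seconds of CPU time.

    Hypothetical helper: a stand-in for "install a handler, fire a timer, scan the stack".
    """
    samples = collections.Counter()

    def on_signal(signum, frame):
        # `frame` is the interrupted frame; walking it plays the role of the stack scan.
        samples[tuple(frame_labels(frame))] += 1

    previous = signal.signal(signal.SIGPROF, on_signal)
    signal.setitimer(signal.ITIMER_PROF, interval, interval)
    try:
        work()
    finally:
        signal.setitimer(signal.ITIMER_PROF, 0, 0)  # stop the timer
        signal.signal(signal.SIGPROF, previous)     # restore the old handler
    return samples


def busy():
    """CPU-bound work so that the profiling timer has something to interrupt."""
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total


if __name__ == "__main__":
    for stack, count in profile(busy).most_common(3):
        print(count, " / ".join(stack))
```

&lt;p&gt;Each sample is stored as a tuple of function names, so identical stacks are counted together, which is the raw material for a flame graph.&lt;/p&gt;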
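&lt;p&gt;SDB is somewhat more complex because it aims to be non-blocking: it does not take the global lock, so it cannot use Ruby functions such as &lt;code&gt;rb_profile_thread_frames&lt;/code&gt; and has to implement its own stack-scanning logic. It also includes some architectural optimizations and some concurrency-related design choices (needed to stay non-blocking), such as using spinlocks and memory barriers for concurrency control.&lt;/p&gt;</description>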
      <author>yfractal</author>
      <pubDate>Wed, 08 Jan 2025 08:23:32 +0800</pubDate>
      <link>https://ruby-china.org/topics/44007</link>
      <guid>https://ruby-china.org/topics/44007</guid>
    </item>
    <item>
      <title>Understanding the Page Table Step by Step</title>
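      <description>&lt;p&gt;I never really understood page tables. While rereading "xv6, a simple Unix-like teaching operating system" recently, I realized that a page table is just a special hash map: part of the virtual address serves as the key (index) into the page table, and the lowest bits serve as the offset within the page (which keeps the memory within a page contiguous and saves memory). Multi-level page tables exist so that memory can be allocated lazily, which also saves memory.&lt;/p&gt;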
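
&lt;p&gt;The hash-map view can be made concrete with a small Python model. This is a minimal sketch, not code from xv6: the constants follow the three-level layout with 9 index bits per level and a 12-bit page offset (as in the RISC-V version of xv6, which is an assumption about the layout the post has in mind), and &lt;code&gt;split_address&lt;/code&gt; and &lt;code&gt;PageTable&lt;/code&gt; are hypothetical names. Nested dicts stand in for page-table pages.&lt;/p&gt;

```python
PAGE_SIZE = 4096         # 12 offset bits
ENTRIES_PER_TABLE = 512  # 9 index bits per level
LEVELS = 3               # assumed three-level layout


def split_address(vaddr):
    """Split a virtual address into per-level indexes (top level first) and a page offset."""
    page_number, offset = divmod(vaddr, PAGE_SIZE)
    indexes = []
    for _ in range(LEVELS):
        page_number, index = divmod(page_number, ENTRIES_PER_TABLE)
        indexes.append(index)
    indexes.reverse()
    return indexes, offset


class PageTable:
    """Toy multi-level page table: each level is a dict keyed by an index."""

    def __init__(self):
        self.root = {}

    def map(self, vaddr, physical_page):
        indexes, _ = split_address(vaddr)
        table = self.root
        for index in indexes[:-1]:
            # Lower-level tables are created only when first needed (lazy allocation).
            table = table.setdefault(index, {})
        table[indexes[-1]] = physical_page

    def translate(self, vaddr):
        indexes, offset = split_address(vaddr)
        entry = self.root
        for index in indexes:
            entry = entry[index]  # a KeyError here models a page fault
        return entry * PAGE_SIZE + offset


if __name__ == "__main__":
    table = PageTable()
    table.map(0x1000, 7)                # virtual page 1 maps to physical page 7
    print(hex(table.translate(0x1234)))
```

&lt;p&gt;Only the tables on the path to a mapped page exist, so a sparse address space costs a handful of small tables rather than one huge flat array.&lt;/p&gt;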

&lt;p&gt;I wrote a post to record this: &lt;a href="https://github.com/yfractal/blog/blob/master/blog/2025-01-01.md" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/blog/blob/master/blog/2025-01-01.md&lt;/a&gt;&lt;/p&gt;</description>
      <author>yfractal</author>
      <pubDate>Wed, 01 Jan 2025 22:18:30 +0800</pubDate>
      <link>https://ruby-china.org/topics/43998</link>
      <guid>https://ruby-china.org/topics/43998</guid>
    </item>
    <item>
      <title>SDB-Generated RubyChina (Homeland) Call Graph</title>
      <description>&lt;p&gt;The beginning:&lt;/p&gt;

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/84de73ab-0556-4776-84a8-9df76df1b372.png!large" title="" alt=""&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/83b1b268-d019-4cea-9ee0-b02848e3d142.png!large" title="" alt=""&gt;&lt;/p&gt;

&lt;p&gt;....&lt;/p&gt;

&lt;p&gt;The bottom:&lt;/p&gt;

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/e2d67aee-195a-4603-b8dc-95bb7a41174f.png!large" title="" alt=""&gt;&lt;/p&gt;

&lt;p&gt;An overview:&lt;/p&gt;

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/4711def2-592e-46fb-96f7-99ec96e7a9bf.png!large" title="" alt=""&gt;&lt;/p&gt;

&lt;p&gt;I can't upload the original image because it is too large.&lt;/p&gt;

&lt;p&gt;The image was generated by &lt;a href="https://github.com/yfractal/sdb" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/sdb&lt;/a&gt;, a Ruby stack profiling tool that is still at an experimental stage.&lt;/p&gt;</description>
      <author>yfractal</author>
      <pubDate>Fri, 18 Oct 2024 23:34:11 +0800</pubDate>
      <link>https://ruby-china.org/topics/43921</link>
      <guid>https://ruby-china.org/topics/43921</guid>
    </item>
    <item>
      <title>Observing Puma Thread Scheduling through eBPF</title>
      <description>&lt;h2 id="Introduction"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Observation tools can help us understand and improve system performance. In this article, I will introduce how to observe Puma thread scheduling through eBPF.&lt;/p&gt;

&lt;p&gt;The eBPF code used in this example can be found here:
&lt;a href="https://github.com/yfractal/sdb/blob/main/scripts/thread_schedule.py" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/sdb/blob/main/scripts/thread_schedule.py&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="How It Works"&gt;How It Works&lt;/h2&gt;
&lt;p&gt;eBPF[1] allows us to probe kernel functions. We can use the command &lt;code&gt;sudo bpftrace -l | grep -E "kprobe|kfunc"&lt;/code&gt; to find all available kernel functions.&lt;/p&gt;

&lt;p&gt;Inspired by &lt;a href="https://github.com/iovisor/bcc/blob/master/tools/offcputime.py" rel="nofollow" target="_blank" title=""&gt;BCC offcputime.py&lt;/a&gt;, I use &lt;code&gt;finish_task_switch&lt;/code&gt; as the instrumentation point. This function is called after the context has switched to the new task (thread)[2]. The &lt;code&gt;prev&lt;/code&gt; argument gives us the previous task (thread), and the current thread ID identifies the task that has just been switched in. Its signature is:&lt;/p&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;rq&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nf"&gt;finish_task_switch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;task_struct&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;__releases&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rq&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;lock&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The program is straightforward: on each switch, it records a start timestamp for the current thread. When the system later suspends that thread, the thread appears as the &lt;code&gt;prev&lt;/code&gt; argument. At that point, we record the end timestamp and submit the event.&lt;/p&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;oncpu&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;pt_regs&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;task_struct&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;u32&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tgid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;u64&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_ktime_get_ns&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="c1"&gt;// current task&lt;/span&gt;
    &lt;span class="n"&gt;u64&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;bpf_get_current_pid_tgid&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="n"&gt;tgid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pid_tgid&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__u32&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;pid_tgid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event_t&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt;
    &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tgid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tgid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;bpf_get_current_comm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_ts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;events_map&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;tgid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// previous task&lt;/span&gt;
    &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;tgid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prev&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;tgid&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;eventp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;events_map&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lookup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;tgid&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;eventp&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;bpf_trace_printk&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"prev is nil"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;eventp&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;end_ts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;perf_submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;eventp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;eventp&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id="The Results"&gt;The Results&lt;/h2&gt;
&lt;p&gt;Next, I created a simple HTTP server using Roda and used a Ruby script to send HTTP requests. After collecting the events, I converted them into the &lt;a href="https://ui.perfetto.dev/" rel="nofollow" target="_blank" title=""&gt;Perfetto&lt;/a&gt; trace format.&lt;/p&gt;
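
&lt;p&gt;As a rough illustration of the conversion step, the sketch below turns events shaped like the eBPF &lt;code&gt;event_t&lt;/code&gt; fields (&lt;code&gt;pid&lt;/code&gt;, &lt;code&gt;tgid&lt;/code&gt;, &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;start_ts&lt;/code&gt;, &lt;code&gt;end_ts&lt;/code&gt;) into the JSON Trace Event format, which the Perfetto UI can open. The actual converter is not shown in this post, so the function name &lt;code&gt;to_trace_events&lt;/code&gt;, the dict-based event shape, and the sample values are assumptions.&lt;/p&gt;

```python
import json


def to_trace_events(events):
    """Convert scheduling events into Trace Event "complete" slices (ph = X)."""
    trace = []
    for event in events:
        trace.append({
            "name": event["name"],
            "ph": "X",                                            # complete event: start plus duration
            "ts": event["start_ts"] / 1000,                       # nanoseconds to microseconds
            "dur": (event["end_ts"] - event["start_ts"]) / 1000,  # nanoseconds to microseconds
            "pid": event["tgid"],                                 # kernel tgid is the process id
            "tid": event["pid"],                                  # kernel pid is the thread id
        })
    return {"traceEvents": trace}


if __name__ == "__main__":
    # Hypothetical sample event; the timestamps and tgid are made up for illustration.
    sample = [{"name": "puma srv", "pid": 23630, "tgid": 23600,
               "start_ts": 1000, "end_ts": 3000}]
    print(json.dumps(to_trace_events(sample), indent=2))
```

&lt;p&gt;Each on-CPU interval becomes one slice on its thread's track, which is what makes the short bursts of the server thread visible in the timeline.&lt;/p&gt;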

&lt;p&gt;The result looks like this (&lt;a href="https://github.com/yfractal/sdb-analyzer/blob/main/data/thread_schedule_trace.json" rel="nofollow" target="_blank" title=""&gt;the trace is available here&lt;/a&gt;):
&lt;img src="https://l.ruby-china.com/photo/yfractal/eb6b8e9a-e8cb-4d39-894e-6250ae245d6d.png!large" title="" alt=""&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;puma srv 23630&lt;/code&gt; thread (the last thread in the image) is Puma’s server thread, which pulls ready I/O events through &lt;code&gt;nio&lt;/code&gt; and distributes them to worker threads (the ThreadPool). That is why it is active only for very short periods.&lt;/p&gt;
&lt;h2 id="Others"&gt;Others&lt;/h2&gt;
&lt;p&gt;One interesting finding is that when I use &lt;code&gt;wrk&lt;/code&gt; for sending requests, I can barely see Puma’s server thread being active. This is because &lt;code&gt;wrk&lt;/code&gt; enables keep-alive by default, and Puma reuses the previous connection, so the Puma server doesn’t need to wait for a new request.&lt;/p&gt;

&lt;p&gt;Without eBPF, we only know the system schedules threads, but we don't know how frequently this happens or how long a thread runs. This visibility helps us understand the system better.&lt;/p&gt;

&lt;p&gt;Next, I plan to link scheduling events with lock events to understand how the GVL and other locks affect a Ruby HTTP server.&lt;/p&gt;
&lt;h2 id="Links"&gt;Links&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href="https://ebpf.io/" rel="nofollow" target="_blank"&gt;https://ebpf.io/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://elixir.bootlin.com/linux/v6.4-rc7/source/kernel/sched/core.c#L5157" rel="nofollow" target="_blank"&gt;https://elixir.bootlin.com/linux/v6.4-rc7/source/kernel/sched/core.c#L5157&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
      <author>yfractal</author>
      <pubDate>Tue, 08 Oct 2024 09:04:04 +0800</pubDate>
      <link>https://ruby-china.org/topics/43902</link>
      <guid>https://ruby-china.org/topics/43902</guid>
    </item>
    <item>
      <title>Symbolizing Ruby ISeq Through eBPF</title>
      <description>&lt;h2 id="Introduction"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;A stack profiler scans the function stack, where we can find the function's address. To make this address meaningful, we need to retrieve the function name and other information—a process known as symbolization.&lt;/p&gt;

&lt;p&gt;In this article, I will introduce how to symbolize Ruby instruction sequences (ISeqs) using eBPF and explain why I chose eBPF for this purpose. The code is available at &lt;a href="https://github.com/yfractal/sdb/pull/7" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/sdb/pull/7&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="Background"&gt;Background&lt;/h2&gt;
&lt;p&gt;We can think of the Ruby VM as a stack machine[1]. When it executes a function, it pushes the function's address (ISeq) onto its stack, which is an array of &lt;code&gt;rb_control_frame_struct&lt;/code&gt;. Simplified code is shown below:&lt;/p&gt;

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/08805eab-6a25-41f2-87f9-1c56089124d0.png!large" title="" alt=""&gt;&lt;/p&gt;

&lt;p&gt;A stack profiler can scan the &lt;code&gt;rb_control_frame_struct&lt;/code&gt; array and retrieve the functions that are currently executing.&lt;/p&gt;

&lt;p&gt;Ruby natively supports this through &lt;code&gt;rb_profile_frames&lt;/code&gt;, which fetches relevant information (iseq and line number). We can then retrieve additional details using functions like &lt;code&gt;rb_profile_frame_method_name&lt;/code&gt;. Several tools make use of this approach, such as Shopify's &lt;code&gt;stack_frames&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id="Why Another Stack Profiler?"&gt;Why Another Stack Profiler?&lt;/h2&gt;
&lt;p&gt;Ruby already has several stack profiling tools, such as &lt;a href="https://github.com/tmm1/stackprof" rel="nofollow" target="_blank" title=""&gt;stackprof&lt;/a&gt; and Shopify's &lt;a href="https://github.com/Shopify/stack_frames" rel="nofollow" target="_blank" title=""&gt;stack_frames&lt;/a&gt;. These tools use &lt;code&gt;rb_profile_frames&lt;/code&gt;, which requires holding the Global VM Lock (GVL) and therefore blocks the execution of all other threads. Although Ruby has Ractors, the lock still blocks all threads within a Ractor, and Ractors don’t seem to be widely adopted. Even without considering the GVL, these tools run in the application thread, which adds delay to the application.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/yfractal/sdb" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/sdb&lt;/a&gt; solves these issues by pulling stack frames without holding the GVL (see &lt;a href="https://github.com/yfractal/sdb/blob/main/ext/sdb/src/lib.rs#L228-L233" rel="nofollow" target="_blank" title=""&gt;this code&lt;/a&gt;). As it doesn’t affect application threads, it can be used on the fly, even in production environments.&lt;/p&gt;
&lt;h2 id="Troubles After Releasing the GVL"&gt;Troubles After Releasing the GVL&lt;/h2&gt;
&lt;p&gt;The Ruby GVL ensures the integrity of VM data, including ISeqs. To read an ISeq's fields safely, we need to reacquire the GVL, and for performance reasons we retrieve ISeq information in batches. However, the puller thread cannot keep a reference to an ISeq, because it does not hold the GVL. By the time we read an ISeq’s information, the ISeq may already have been freed by the GC, which can cause a segmentation fault.&lt;/p&gt;

&lt;p&gt;We could mitigate this by waiting for Ruby VM to load all the code, checking the &lt;a href="https://github.com/yfractal/sdb/pull/1" rel="nofollow" target="_blank" title=""&gt;ISeq type&lt;/a&gt;, or catching segmentation faults.&lt;/p&gt;

&lt;p&gt;That said, we can still improve the process. If asynchronous ISeq retrieval is error-prone, we can opt for synchronous retrieval. Since Ruby doesn't load code all the time, the performance impact is minimal, and I believe this is a reasonable trade-off.&lt;/p&gt;
&lt;h2 id="SDB eBPF Symbolizer"&gt;SDB eBPF Symbolizer&lt;/h2&gt;
&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/e8ac2e72-2382-40c2-aa1c-a27544b3ab61.png!large" title="" alt=""&gt;&lt;/p&gt;

&lt;p&gt;eBPF allows us to probe both kernel and user functions through &lt;code&gt;kprobe&lt;/code&gt; and &lt;code&gt;uprobe&lt;/code&gt;. It inserts a breakpoint instruction, and when this instruction is executed, it jumps to a predefined handler function[2][3].&lt;/p&gt;

&lt;p&gt;As we can see, the handler runs synchronously with the probed function. So we can insert probes at the points where the VM creates an ISeq and capture the relevant information there. Functions like &lt;code&gt;rb_iseq_new_with_opt&lt;/code&gt; and &lt;code&gt;rb_iseq_new_with_callback&lt;/code&gt; serve this purpose well.&lt;/p&gt;

&lt;p&gt;Using &lt;a href="https://github.com/iovisor/bcc" rel="nofollow" target="_blank" title=""&gt;bcc&lt;/a&gt; makes this relatively simple:&lt;/p&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;BPF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;bpf_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;binary_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/home/ec2-user/.rvm/rubies/ruby-3.1.5/lib/libruby.so.3.1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;attach_uprobe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;binary_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sym&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rb_iseq_new_with_opt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fn_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rb_iseq_new_with_opt_instrument&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;attach_uretprobe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;binary_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sym&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rb_iseq_new_with_opt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fn_name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rb_iseq_new_with_opt_return_instrument&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In &lt;code&gt;rb_iseq_new_with_opt_instrument&lt;/code&gt;, we can read the arguments with the &lt;code&gt;PT_REGS_PARMX&lt;/code&gt; macros. For example, in Ruby 3.1.5, the second argument of &lt;code&gt;rb_iseq_new_with_opt&lt;/code&gt; is the function’s name, which we can obtain as follows:&lt;/p&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="n"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;RString&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nf"&gt;bpf_probe_read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nc"&gt;PT_REGS_PARM2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Since &lt;code&gt;RString&lt;/code&gt; is not a C string, we need to convert it to a C string. However, because eBPF operates in a sandboxed environment and cannot call user-space functions, we need to implement the conversion ourselves.&lt;/p&gt;

&lt;p&gt;Here is a simple implementation:&lt;/p&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kr"&gt;inline&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;read_rstring&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;RString&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;buff&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;u64&lt;/span&gt; &lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ptr&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;unsigned&lt;/span&gt; &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="n"&gt;bpf_probe_read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;basic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flags&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Bit 13 (RSTRING_NOEMBED) set: the string is heap-allocated; otherwise it is embedded&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;flags&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;13&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;bpf_probe_read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;as&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;heap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;len&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;bpf_probe_read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;ptr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ptr&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;as&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;heap&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ptr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ptr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// buff is a char *, so sizeof(buff) is only the pointer size; assumes buff holds MAX_STR_LENGTH bytes&lt;/span&gt;
            &lt;span class="n"&gt;bpf_probe_read_str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_STR_LENGTH&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ptr&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;len&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_embed_ary_len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;as&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_STR_LENGTH&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;bpf_probe_read_str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buff&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MAX_STR_LENGTH&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;str&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;as&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;embed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ary&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After obtaining the necessary information, we can submit it to the user program.&lt;/p&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nc"&gt;BPF_PERF_OUTPUT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="n"&gt;rb_iseq_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
&lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="nf"&gt;rb_iseq_new_with_opt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;const&lt;/span&gt; &lt;span class="n"&gt;rb_ast_body_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ast&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="n"&gt;realpath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="o"&gt;//&lt;/span&gt;                      &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="n"&gt;first_lineno&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;const&lt;/span&gt; &lt;span class="n"&gt;rb_iseq_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;parent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="n"&gt;isolated_depth&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="o"&gt;//&lt;/span&gt;                      &lt;span class="n"&gt;enum&lt;/span&gt; &lt;span class="n"&gt;iseq_type&lt;/span&gt; &lt;span class="nb"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;const&lt;/span&gt; &lt;span class="n"&gt;rb_compile_option_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;option&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;rb_iseq_new_with_opt_instrument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;pt_regs&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;event_t&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt;

    &lt;span class="n"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;RString&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nf"&gt;bpf_probe_read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nc"&gt;PT_REGS_PARM2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="nf"&gt;read_rstring&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then, the data can be read in the user program as below:&lt;/p&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;print_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cpu&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ctypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ctypes&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;POINTER&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="n"&gt;contents&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_dict&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;

&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;events&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;open_perf_buffer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;print_event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_buffer_poll&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;KeyboardInterrupt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;
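&lt;p&gt;The callback above relies on an &lt;code&gt;Event&lt;/code&gt; class that is not shown. A minimal sketch of what it could look like, assuming the BPF-side &lt;code&gt;event_t&lt;/code&gt; holds a single fixed-size &lt;code&gt;name&lt;/code&gt; buffer. The field layout, the value of &lt;code&gt;MAX_STR_LENGTH&lt;/code&gt; and the helper &lt;code&gt;decode_event&lt;/code&gt; are assumptions for illustration, not taken from the real code:&lt;/p&gt;

```python
import ctypes
import json

# Assumed buffer size; it must match MAX_STR_LENGTH in the BPF program.
MAX_STR_LENGTH = 128


class Event(ctypes.Structure):
    # Hypothetical mirror of the BPF-side struct event_t (one char buffer).
    _fields_ = [("name", ctypes.c_char * MAX_STR_LENGTH)]

    def to_dict(self):
        # c_char arrays read back as bytes truncated at the first NUL byte.
        return {"name": self.name.decode("utf-8", errors="replace")}


def decode_event(data):
    # Same cast that print_event performs on the raw perf buffer pointer.
    return ctypes.cast(data, ctypes.POINTER(Event)).contents


# Simulate a perf buffer record without loading any BPF program.
raw = ctypes.create_string_buffer(b"initialize", MAX_STR_LENGTH)
event = decode_event(raw)
print(json.dumps(event.to_dict()))
```

&lt;p&gt;Keeping the Python structure byte-for-byte in sync with the C struct matters here, because the cast performs no validation.&lt;/p&gt;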
&lt;p&gt;The full code is here &lt;a href="https://github.com/yfractal/sdb/pull/7" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/sdb/pull/7&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="Others"&gt;Others&lt;/h2&gt;
&lt;p&gt;Probing ISeq creation alone is not enough. An ISeq can be moved to another address during GC compaction. To detect this, we could probe &lt;code&gt;gc_move&lt;/code&gt; and record the scan (source) and free (destination) addresses. As Ruby disables GC compaction by default, I leave this as future work.&lt;/p&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;static VALUE gc_move(rb_objspace_t *objspace, VALUE scan, VALUE free, size_t slot_size);
&lt;/code&gt;&lt;/pre&gt;
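&lt;p&gt;If &lt;code&gt;gc_move&lt;/code&gt; were probed, the user program would mostly need to re-key its address table. A rough sketch of that bookkeeping, where &lt;code&gt;iseq_names&lt;/code&gt;, &lt;code&gt;on_iseq_created&lt;/code&gt; and &lt;code&gt;on_gc_move&lt;/code&gt; are hypothetical names and the addresses are made up:&lt;/p&gt;

```python
# Hypothetical user-side bookkeeping: ISeq address -> method name.
iseq_names = {}


def on_iseq_created(addr, name):
    # Called for each event from the ISeq creation probe.
    iseq_names[addr] = name


def on_gc_move(scan, free):
    # gc_move(objspace, scan, free, slot_size): the object at `scan`
    # is relocated to `free`, so the table entry is re-keyed.
    name = iseq_names.pop(scan, None)
    if name is not None:
        iseq_names[free] = name


on_iseq_created(0x7F00AA10, "initialize")
on_gc_move(0x7F00AA10, 0x7F00BB20)
on_gc_move(0x7F00CC30, 0x7F00DD40)  # not an ISeq we track: ignored
print(iseq_names)
```

&lt;p&gt;Since &lt;code&gt;gc_move&lt;/code&gt; relocates every kind of object, moves of untracked addresses are simply ignored.&lt;/p&gt;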
&lt;p&gt;Besides eBPF, binary instrumentation or ptrace could offer better alternatives, as they can access the application’s functions. However, since &lt;a href="https://github.com/yfractal/sdb" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/sdb&lt;/a&gt; is still experimental, I chose eBPF for its simplicity.&lt;/p&gt;
&lt;h2 id="References"&gt;References&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;Ruby Under a Microscope.&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.kernel.org/trace/kprobes.html#id2" rel="nofollow" target="_blank"&gt;https://docs.kernel.org/trace/kprobes.html#id2&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://eli.thegreenplace.net/2011/01/27/how-debuggers-work-part-2-breakpoints" rel="nofollow" target="_blank"&gt;https://eli.thegreenplace.net/2011/01/27/how-debuggers-work-part-2-breakpoints&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
      <author>yfractal</author>
      <pubDate>Thu, 03 Oct 2024 10:24:36 +0800</pubDate>
      <link>https://ruby-china.org/topics/43901</link>
      <guid>https://ruby-china.org/topics/43901</guid>
    </item>
    <item>
      <title>Inspecting Ruby HTTPS Request Content Without Root Privileges or Certificates</title>
      <description>&lt;h2 id="Introduction"&gt;Introduction&lt;/h2&gt;
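&lt;p&gt;The code for this article is at &lt;a href="https://github.com/yfractal/sdb/tree/main/sdb-shim" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/sdb/tree/main/sdb-shim&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;During development or troubleshooting, we often need to inspect request content. The well-known tcpdump, for example, can show HTTP content. There are corresponding tools for HTTPS as well, such as the eBPF-based ecapture (the name eBPF distinguishes it from the earlier BPF, though some people suggest simply calling it BPF; tcpdump also uses BPF, and Van Jacobson is an author of both).&lt;/p&gt;

&lt;p&gt;However, these tools all require root privileges. In some environments, for example under a company's odd security requirements or inside Docker, the user has no root privileges. In that case, inspecting HTTPS request content becomes very difficult.&lt;/p&gt;

&lt;p&gt;This article introduces a way to inspect HTTPS request content that requires neither changes to the Ruby code nor root privileges.&lt;/p&gt;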
&lt;h2 id="How it works"&gt;How it works&lt;/h2&gt;
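&lt;p&gt;We can let the application itself tell us what its request content is. The most direct way is to override Ruby's HTTP request methods and print the decrypted content once the request returns.&lt;/p&gt;

&lt;p&gt;But that requires introducing a gem, and for a compiled language such as Go it also requires recompiling.&lt;/p&gt;

&lt;p&gt;An eBPF uprobe inserts a trap instruction such as int3 at the function's address. When execution reaches that address, it traps to the registered handler first instead of running the original function directly. However, eBPF uprobes require root privileges.&lt;/p&gt;

&lt;p&gt;We want both to avoid intruding into the code and to run without root privileges. To have it both ways, on macOS we can use the &lt;code&gt;__interpose section&lt;/code&gt; to replace the original function and then use &lt;code&gt;DYLD_INSERT_LIBRARIES&lt;/code&gt; to link the library into the program, achieving an effect similar to Ruby's alias method.&lt;/p&gt;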
&lt;h2 id="How to implement"&gt;How to implement&lt;/h2&gt;
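&lt;p&gt;First, we need to find the right function. Ruby uses OpenSSL for encryption and decryption. But I am not familiar with the Ruby standard library or OpenSSL, so reading the code directly would be rather tedious.&lt;/p&gt;

&lt;p&gt;We can construct a request in Ruby and use a stack profiling tool to see which methods are called, which narrows the scope. I used &lt;a href="https://github.com/yfractal/sdb" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/sdb&lt;/a&gt;, and even though this tool is not very easy to use, it helped me pin the location down to read_nonblock at openssl/buffering.rb:204.&lt;/p&gt;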

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/7628f305-6162-4bed-b92f-ec36675c0ea3.png!large" title="" alt=""&gt;&lt;/p&gt;

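&lt;p&gt;After that, through debugging and reading the code, we can learn that Ruby calls SSL_read to decrypt.&lt;/p&gt;

&lt;p&gt;Once the function is found, we need to tell macOS to perform the replacement.&lt;/p&gt;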
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="cp"&gt;#include&lt;/span&gt; &lt;span class="cpf"&gt;"openssl/ssl.h"&lt;/span&gt;&lt;span class="cp"&gt;
&lt;/span&gt;
&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;__osx_interpose&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;new_func&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;orig_func&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;Real__SSL_read&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ssl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;SSL_read&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ssl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;extern&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="nf"&gt;__interpose_SSL_read&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ssl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;__osx_interpose&lt;/span&gt; &lt;span class="n"&gt;__osx_interpose_SSL_read&lt;/span&gt; &lt;span class="n"&gt;__attribute__&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;used&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;section&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"__DATA, __interpose"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)((&lt;/span&gt;&lt;span class="kt"&gt;uintptr_t&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;__interpose_SSL_read&lt;/span&gt;&lt;span class="p"&gt;))),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)((&lt;/span&gt;&lt;span class="kt"&gt;uintptr_t&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SSL_read&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
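&lt;p&gt;Inside &lt;code&gt;__interpose_SSL_read&lt;/code&gt;, we call &lt;code&gt;SSL_read&lt;/code&gt; to get the decrypted content. Because the HTTP body is compressed, we first need to locate the body (the headers and the body are decrypted together) and decompress it, after which we have a readable body.&lt;/p&gt;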
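&lt;p&gt;The locate-then-decompress step can be sketched in a few lines. The sketch below uses Python rather than the C used by the shim, and it assumes the whole response arrives in a single buffer, is gzip-encoded, and does not use chunked transfer encoding; &lt;code&gt;readable_body&lt;/code&gt; is a hypothetical helper name:&lt;/p&gt;

```python
import gzip


def readable_body(plaintext):
    # Headers and body arrive together; a blank line separates them.
    head, sep, body = plaintext.partition(b"\r\n\r\n")
    if not sep:
        return None  # headers incomplete, nothing to decode yet
    headers = {}
    for line in head.split(b"\r\n")[1:]:  # skip the status line
        key, _, value = line.partition(b":")
        headers[key.strip().lower()] = value.strip().lower()
    if headers.get(b"content-encoding") == b"gzip":
        body = gzip.decompress(body)
    return body.decode("utf-8", errors="replace")


# Build a fake decrypted response to exercise the function.
payload = gzip.compress(b'{"ok": true}')
response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Encoding: gzip\r\n"
    b"Content-Length: " + str(len(payload)).encode() + b"\r\n"
    b"\r\n" + payload
)
print(readable_body(response))
```

&lt;p&gt;A real shim also has to handle bodies split across several &lt;code&gt;SSL_read&lt;/code&gt; calls, which this sketch deliberately leaves out.&lt;/p&gt;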

&lt;p&gt;The code is at &lt;a href="https://github.com/yfractal/sdb/blob/main/sdb-shim/src/https_instrument.c" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/sdb/blob/main/sdb-shim/src/https_instrument.c&lt;/a&gt; .&lt;/p&gt;
&lt;h2 id="Others"&gt;Others&lt;/h2&gt;
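&lt;p&gt;For now &lt;code&gt;https_instrument&lt;/code&gt; is just a toy; I have only tested the simplest example. For me, writing a tool like this is a lot of fun. Besides, my company's development machines have &lt;strong&gt;&lt;em&gt;no root privileges&lt;/em&gt;&lt;/strong&gt;; it is not a tech company, after all...&lt;/p&gt;

&lt;p&gt;Compared with eBPF, besides not needing root privileges, this approach is also easier to develop, needs no extra support, and lets you use any library freely.&lt;/p&gt;

&lt;p&gt;For example, &lt;a href="https://github.com/odigos-io/opentelemetry-go-instrumentation" rel="nofollow" target="_blank" title=""&gt;opentelemetry-go-instrumentation&lt;/a&gt; uses eBPF for instrumentation, but eBPF runs in a separate memory space, so manipulating complex Go data structures, such as hash maps, is extremely difficult.&lt;/p&gt;

&lt;p&gt;Linux does not support this approach directly, but &lt;code&gt;LD_PRELOAD&lt;/code&gt; can be used to replace functions from dynamically linked libraries; see the corresponding &lt;a href="https://github.com/yfractal/sdb/pull/3/commits/0b6ad4caae85b5c59e8c2ae66a8d8147a97907cc" rel="nofollow" target="_blank" title=""&gt;code&lt;/a&gt;. My previous &lt;a href="https://ruby-china.org/topics/43883" title=""&gt;article&lt;/a&gt; also covers this.&lt;/p&gt;

&lt;p&gt;Although &lt;code&gt;LD_PRELOAD&lt;/code&gt; can instrument OpenSSL, it cannot change the program's own code. In theory, modifying the binary, for example inserting int3 at the relevant address to produce a new binary, should achieve a similar effect; alternatively, the same could be done at compile time, or by modifying the ELF file.&lt;/p&gt;

&lt;p&gt;Compared with eBPF, I personally prefer function interposing for instrumentation. Although it requires cooperation from the application, it is more controllable than eBPF. More importantly, development is simpler and more flexible.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/yfractal/sdb" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/sdb&lt;/a&gt; is, for now, also a toy: not only is it hard to use, it can also segfault.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;https_instrument&lt;/code&gt; probably still has many problems, which I will fix gradually as I use it. If someone really needs it, I will look for ways to make it easier to use.&lt;/p&gt;</description>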
      <author>yfractal</author>
      <pubDate>Sun, 15 Sep 2024 00:19:34 +0800</pubDate>
      <link>https://ruby-china.org/topics/43886</link>
      <guid>https://ruby-china.org/topics/43886</guid>
    </item>
    <item>
      <title>Detect Ruby GVL contention through dynamic link library functions</title>
      <description>&lt;h2 id="Introduction"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Ruby's Global VM Lock (GVL) protects the Ruby VM's data but reduces parallel execution because only one thread can hold the lock at a time.&lt;/p&gt;

&lt;p&gt;The GVL can affect application performance. For example, in a Puma server with several threads, when one thread holds the lock, it causes delays for other threads.&lt;/p&gt;

&lt;p&gt;Ruby 3.2 introduced a GVL instrumentation API, and there are several tools for visualizing it. However, such observability requires Ruby VM support. Observability that depends on VM support is slow to develop, struggles to cover all scenarios, and adds maintenance overhead for Ruby.&lt;/p&gt;

&lt;p&gt;This article explores a more dynamic solution that provides similar observability without modifying Ruby code. It uses &lt;code&gt;LD_PRELOAD&lt;/code&gt;[1] and &lt;code&gt;dlsym&lt;/code&gt;[2] to wrap pthread lock functions, achieving behavior similar to Ruby's alias method.&lt;/p&gt;

&lt;p&gt;The code is at &lt;a href="https://github.com/yfractal/sdb" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/sdb&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="How it Works"&gt;How it Works&lt;/h2&gt;
&lt;p&gt;Ruby’s GVL is implemented with a mutex and a condition variable, whose functions are resolved through the dynamic linker. On Linux, the dynamic linker allows us to override those functions using &lt;code&gt;LD_PRELOAD&lt;/code&gt;. In the overridden functions, we can log relevant events and locate the original function through &lt;code&gt;dlsym&lt;/code&gt;. This approach is similar to Ruby's alias method but for dynamically linked functions.&lt;/p&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nd"&gt;#[no_mangle]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;unsafe&lt;/span&gt; &lt;span class="k"&gt;extern&lt;/span&gt; &lt;span class="s"&gt;"C"&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;pthread_mutex_lock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mutex&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;pthread_mutex_t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;i32&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// log acquire event ...&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="nf"&gt;Some&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;real_pthread_mutex_lock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;REAL_PTHREAD_MUTEX_LOCK&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;ret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;real_pthread_mutex_lock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mutex&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="c1"&gt;// log acquired event ...&lt;/span&gt;
        &lt;span class="n"&gt;ret&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nd"&gt;eprintln!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Failed to resolve pthread_mutex_lock"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// then we could do similar things for pthread_mutex_unlock, pthread_cond_wait and pthread_cond_signal&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To identify the mutex’s address, we need to access Ruby's &lt;code&gt;rb_thread_t&lt;/code&gt; object.&lt;/p&gt;

&lt;p&gt;Here’s a simplified version of the code:&lt;/p&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;unsafe&lt;/span&gt; &lt;span class="k"&gt;extern&lt;/span&gt; &lt;span class="s"&gt;"C"&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;log_gvl_addr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;_module&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_val&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// find rb_thread_t from thread value&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;thread_ptr&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;RTypedData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;thread_val&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;RTypedData&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;rb_thread_ptr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;thread_ptr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="py"&gt;.data&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;rb_thread_t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// access gvl_addr through offset directly&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;gvl_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;rb_thread_ptr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="py"&gt;.ractor&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;u64&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;344&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;gvl_ref&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gvl_addr&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;rb_global_vm_lock_t&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;lock_addr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;gvl_ref&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="py"&gt;.lock&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;u64&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// log gvl address ...&lt;/span&gt;
    &lt;span class="nf"&gt;rb_ll2inum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lock_addr&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;i64&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id="Testing"&gt;Testing&lt;/h2&gt;
&lt;p&gt;I used the following script for testing:&lt;/p&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="c1"&gt;# example.rb&lt;/span&gt;
&lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s1"&gt;'sdb'&lt;/span&gt;

&lt;span class="no"&gt;Sdb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log_gvl_addr&lt;/span&gt;

&lt;span class="n"&gt;threads&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;times&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="n"&gt;thread&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Thread&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
    &lt;span class="no"&gt;Sdb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log_gvl_addr&lt;/span&gt;
    &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
    &lt;span class="mi"&gt;10000&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;times&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
      &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="n"&gt;threads&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;thread&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;threads&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;each&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;thread&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;thread&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We can run it with this command: &lt;code&gt;LD_PRELOAD=./target/release/libsdb_shim.so bundle exec ruby example.rb&lt;/code&gt; (&lt;code&gt;libsdb_shim.so&lt;/code&gt; is the compiled Rust library).&lt;/p&gt;

&lt;p&gt;We then see logs similar to these:&lt;/p&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;2024-09-10 21:09:11.540956679 [INFO] [lock] thread_id=281472580841568, rb_thread_addr=187651089870448, gvl_mutex_addr=187651083330256

2024-09-10 21:09:11.53981372 [INFO] [lock][mutex][acquire]: thread=281472580841568, lock_addr=187651083330256
2024-09-10 21:09:11.539815804 [INFO] [lock][mutex][acquired]: thread=281472580841568, lock_addr=187651083330256
2024-09-10 21:09:11.539816595 [INFO] [lock][cond][acquire]: thread=281472580841568, lock_addr=187651083330256, cond_var_addr=187651089870568
2024-09-10 21:09:11.540927137 [INFO] [lock][cond][acquired]: thread=281472580841568, lock_addr=187651083330256, cond_var_addr=187651089870568
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id="Others"&gt;Others&lt;/h2&gt;&lt;h3 id="Does the GVL Matter?"&gt;Does the GVL Matter?&lt;/h3&gt;
&lt;p&gt;Ruby uses the GVL to protect its VM and releases the lock during I/O operations, so the GVL is rarely a problem for I/O-bound applications.&lt;/p&gt;

&lt;p&gt;However, background threads or code instrumentation (like NewRelic) can not only consume CPU resources but also introduce delays to all Ruby application threads.&lt;/p&gt;
&lt;h3 id="eBPF Solution"&gt;eBPF Solution&lt;/h3&gt;
&lt;p&gt;We could use eBPF to probe these functions without modifying the application, but eBPF programs usually require root privileges and have more dependencies.&lt;/p&gt;

&lt;p&gt;LD_PRELOAD alters the application’s library loading but is a much lighter solution compared to eBPF.&lt;/p&gt;
&lt;h3 id="Improvements"&gt;Improvements&lt;/h3&gt;
&lt;p&gt;The code demonstrates how to use &lt;code&gt;LD_PRELOAD&lt;/code&gt; and &lt;code&gt;dlsym&lt;/code&gt; to instrument the Ruby VM without modifying Ruby code.&lt;/p&gt;

&lt;p&gt;Since Ruby’s GVL is complex (it uses condition variables and only acquires the lock when the GVL has an owner and the current thread is not the timer thread), instrumenting the mutex and condition variable doesn’t fully capture &lt;code&gt;gvl_acquire&lt;/code&gt; and &lt;code&gt;gvl_release&lt;/code&gt;. However, we can still infer GVL delays from the locking patterns.&lt;/p&gt;

&lt;p&gt;The code logs events to a file, allowing for async analysis. We could use fast_log[4], which buffers logs in memory and writes them to a file in batches.&lt;/p&gt;

&lt;p&gt;However, since the Ruby VM accesses the GVL very frequently, &lt;code&gt;example.rb&lt;/code&gt; alone can generate over 80,000 lines of logs. As in LDB[3], performance could be further improved by logging lock events only when the delay exceeds a threshold.&lt;/p&gt;
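&lt;p&gt;The same threshold idea can be tried offline on an existing log file. A minimal sketch in Python, assuming the line format shown in the Testing section, that all lines come from the same day, and that the fractional part of each timestamp is a plain decimal fraction of a second; &lt;code&gt;slow_waits&lt;/code&gt; and its &lt;code&gt;threshold&lt;/code&gt; parameter are hypothetical names, not part of sdb:&lt;/p&gt;

```python
import re
from decimal import Decimal

# Matches the shim's log lines shown in the Testing section.
LINE = re.compile(
    r"\d{4}-\d{2}-\d{2} (\d+):(\d+):(\d+\.\d+) \[INFO\] "
    r"\[lock\]\[(mutex|cond)\]\[(acquire|acquired)\]: "
    r"thread=(\d+), lock_addr=(\d+)"
)


def slow_waits(lines, threshold):
    # Pair each acquire with the next acquired event of the same
    # (kind, thread, lock) and keep waits longer than threshold seconds.
    pending, slow = {}, []
    for line in lines:
        m = LINE.match(line)
        if m is None:
            continue
        hours, minutes, seconds, kind, phase, thread, lock = m.groups()
        now = int(hours) * 3600 + int(minutes) * 60 + Decimal(seconds)
        key = (kind, thread, lock)
        if phase == "acquire":
            pending[key] = now
        elif key in pending:
            waited = now - pending.pop(key)
            if waited > threshold:
                slow.append((kind, thread, waited))
    return slow


sample = [
    "2024-09-10 21:09:11.53981372 [INFO] [lock][mutex][acquire]: thread=281472580841568, lock_addr=187651083330256",
    "2024-09-10 21:09:11.539815804 [INFO] [lock][mutex][acquired]: thread=281472580841568, lock_addr=187651083330256",
    "2024-09-10 21:09:11.539816595 [INFO] [lock][cond][acquire]: thread=281472580841568, lock_addr=187651083330256, cond_var_addr=187651089870568",
    "2024-09-10 21:09:11.540927137 [INFO] [lock][cond][acquired]: thread=281472580841568, lock_addr=187651083330256, cond_var_addr=187651089870568",
]
for kind, thread, waited in slow_waits(sample, Decimal("0.001")):
    print(kind, thread, waited)
```

&lt;p&gt;On the four sample lines, only the condition variable wait (about 1.1 ms) exceeds the 1 ms threshold; the mutex acquisition takes roughly 2 µs and is dropped.&lt;/p&gt;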
&lt;h2 id="Summary"&gt;Summary&lt;/h2&gt;
&lt;p&gt;This article uses &lt;code&gt;LD_PRELOAD&lt;/code&gt; and &lt;code&gt;dlsym&lt;/code&gt; to instrument the GVL without modifying Ruby code. You can find the code at &lt;a href="https://github.com/yfractal/sdb" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/sdb&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="References"&gt;References&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href="https://man7.org/linux/man-pages/man8/ld.so.8.html" rel="nofollow" target="_blank"&gt;https://man7.org/linux/man-pages/man8/ld.so.8.html&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://linux.die.net/man/3/dlsym" rel="nofollow" target="_blank"&gt;https://linux.die.net/man/3/dlsym&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;LDB: An Efficient Latency Profiling Tool for Multithreaded Applications&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/rbatis/fast_log" rel="nofollow" target="_blank"&gt;https://github.com/rbatis/fast_log&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
      <author>yfractal</author>
      <pubDate>Tue, 10 Sep 2024 22:05:55 +0800</pubDate>
      <link>https://ruby-china.org/topics/43883</link>
      <guid>https://ruby-china.org/topics/43883</guid>
    </item>
    <item>
      <title>[Translation] An Introduction to Garbage Collection and Ruby's RGenGC</title>
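      <description>&lt;h2 id="背景"&gt;Background&lt;/h2&gt;
&lt;p&gt;Recently, I have been working on an experimental project called Ccache (&lt;a href="https://github.com/yfractal/ccache" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/ccache&lt;/a&gt;), which implements its core functionality in Rust and integrates with languages such as Ruby and Golang. Rust is a systems programming language that does not use garbage collection (GC) for memory management, whereas Ruby and Golang do. This raises an interesting question: how can Rust interact safely and efficiently with languages that use GC? So I spent some time learning how Ruby's GC works.&lt;/p&gt;
&lt;h2 id="介绍"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In this article, I describe how RGenGC (Restricted Generational GC), introduced in Ruby 2.1, works. To make things easier to follow, I first explain the basics of garbage collection, then describe the particular problems RGenGC solves along with its mechanism and source code. Along the way, I try to leave out unnecessary details.&lt;/p&gt;
&lt;h2 id="垃圾回收"&gt;Garbage Collection&lt;/h2&gt;
&lt;p&gt;A program allocates virtual memory from the operating system, which maps it to physical memory (or other resources, such as files). Because physical memory is limited and performance matters, a program needs to return memory after using it.&lt;/p&gt;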

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/ba3fa818-4a96-451c-96f9-a9ae9158c1be.png!large" title="" alt=""&gt;&lt;/p&gt;

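&lt;p&gt;Programs use two kinds of memory: stack and heap. Rust primarily uses stack memory, which requires knowing the size of a variable before allocating it. For dynamically sized structures such as vectors, Rust allocates memory on the heap. To manage memory safely and efficiently, Rust does not allow shared mutable references or cyclic references. These restrictions let Rust use reference counting to manage heap memory, but they make some data structures, such as doubly linked lists, hard to implement [1].&lt;/p&gt;

&lt;p&gt;C requires programmers to allocate and free heap memory manually, which is hard to use and error-prone. Rust uses lifetimes and ownership to manage memory, making it explicit, safe, and efficient.&lt;/p&gt;

&lt;p&gt;Languages like Ruby and Go use GC: programmers do not need to think about when to free memory, because the GC takes care of it. The GC therefore needs to return unused memory to the system efficiently.&lt;/p&gt;
&lt;h2 id="Mark and Sweep 算法"&gt;The Mark and Sweep Algorithm&lt;/h2&gt;
&lt;p&gt;Mark and Sweep is a garbage collection algorithm that detects and frees memory that is no longer in use (dead).&lt;/p&gt;

&lt;p&gt;To find dead memory, Mark and Sweep follows references to traverse every reachable object. Objects that are never visited are considered dead and can be freed. In other words, the mark phase identifies and marks all reachable objects; in the sweep phase, the marked objects are kept and all others are freed.&lt;/p&gt;

&lt;p&gt;To traverse all reachable objects, Mark and Sweep uses breadth-first search (BFS), repeatedly following the direct references held by each object. To avoid infinite loops, an object is flagged as visited when it is first reached, so it is never processed twice.&lt;/p&gt;

&lt;p&gt;The algorithm maintains a queue of the reachable objects known so far. It dequeues an item and, for each of the item's references that has not been visited yet, marks that reference as visited and enqueues it. References already marked as visited are not added to the queue. The process continues until the queue is empty, at which point every reachable object has been visited.&lt;/p&gt;

&lt;p&gt;Here is the pseudocode:&lt;/p&gt;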
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;init_queue&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nb"&gt;object&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;global_objects&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
  &lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;visited&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;true&lt;/span&gt;
  &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enqueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_empty&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;true&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
  &lt;span class="nb"&gt;object&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dequeue&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;reference&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;references&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;reference&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;visited&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;false&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
       &lt;span class="n"&gt;reference&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;visited&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;true&lt;/span&gt;
       &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enqueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reference&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
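&lt;p&gt;Now that all reachable objects are known, the sweep phase iterates over all objects. If an object's visited field is false, the object is unreachable and can be freed.&lt;/p&gt;

&lt;p&gt;To summarize:&lt;/p&gt;

&lt;p&gt;Mark phase: breadth-first search marks all live objects.
Sweep phase: scan memory and free the unmarked objects.&lt;/p&gt;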
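&lt;p&gt;The two phases can be put together in a small runnable Python sketch. The &lt;code&gt;Obj&lt;/code&gt; class and the &lt;code&gt;mark&lt;/code&gt; and &lt;code&gt;sweep&lt;/code&gt; helpers are illustrative names, and the heap is modelled as a plain list rather than real memory.&lt;/p&gt;

```python
from collections import deque


class Obj:
    """A heap object that holds references to other objects."""

    def __init__(self, name):
        self.name = name
        self.references = []
        self.visited = False


def mark(roots):
    """Mark phase: breadth-first traversal starting from the roots."""
    queue = deque()
    for obj in roots:
        obj.visited = True
        queue.append(obj)
    while queue:
        obj = queue.popleft()
        for ref in obj.references:
            if not ref.visited:
                ref.visited = True
                queue.append(ref)


def sweep(heap):
    """Sweep phase: keep marked objects, drop the rest, reset the marks."""
    live = [obj for obj in heap if obj.visited]
    for obj in live:
        obj.visited = False  # ready for the next collection
    return live


a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.references.append(b)
b.references.append(a)  # a cycle does not cause an infinite loop
c.references.append(d)  # c and d are unreachable from the root
heap = [a, b, c, d]
mark([a])
heap = sweep(heap)
print([obj.name for obj in heap])  # ['a', 'b']
```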
&lt;h2 id="Mark and Compact"&gt;Mark and Compact&lt;/h2&gt;
&lt;p&gt;Mark and Sweep can free unreachable objects, but it does not address fragmentation: it leaves holes in memory.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/e6d98eee-fe56-42f6-b455-731df5e0fb4e.png!large" title="" alt=""&gt;&lt;/p&gt;

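&lt;p&gt;Because the freed objects are not contiguous, it may be impossible to allocate memory for a large structure even when the total free memory is large enough.&lt;/p&gt;

&lt;p&gt;Mark and Compact solves this problem: while marking objects, it rearranges the reachable objects so that the free memory becomes contiguous.&lt;/p&gt;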

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/63f75045-9f90-412a-ac22-448de8fd3b5d.png!large" title="" alt=""&gt;&lt;/p&gt;
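&lt;p&gt;The fragmentation problem and the effect of compaction can be illustrated with a toy heap. This is only a sketch: the heap is a Python list, &lt;code&gt;None&lt;/code&gt; stands for a free slot, and &lt;code&gt;largest_free_run&lt;/code&gt; is a hypothetical helper.&lt;/p&gt;

```python
def largest_free_run(heap):
    """Return the length of the longest run of contiguous free slots."""
    best = run = 0
    for slot in heap:
        run = run + 1 if slot is None else 0
        best = max(best, run)
    return best


# None marks a free slot; letters mark live objects left after a sweep.
heap = ["a", None, "b", None, "c", None, "d", None]
total_free = heap.count(None)
print(total_free, largest_free_run(heap))  # 4 1

# Moving the live objects to the front makes the free space contiguous.
live = [slot for slot in heap if slot is not None]
compacted = live + [None] * (len(heap) - len(live))
print(largest_free_run(compacted))  # 4
```

&lt;p&gt;Four slots are free in both layouts, but only the compacted heap can satisfy a request for more than one contiguous slot.&lt;/p&gt;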

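&lt;p&gt;The idea is to split memory into two halves: FROM space and TO space. When FROM space is "full", the mark phase starts. During the mark phase, reachable objects are moved to TO space. Because other objects may reference a moved object, the object's old location records that it has moved and where to (a forward reference). The objects it references are then moved to TO space as well, and the addresses are updated. When the mark phase ends, every reachable object has been moved to TO space, so FROM space can be freed.&lt;/p&gt;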

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/63f75045-9f90-412a-ac22-448de8fd3b5d.png!large" title="" alt=""&gt;&lt;/p&gt;

&lt;p&gt;TO space can also be viewed as the queue from the Mark and Sweep algorithm, with two pointers marking the start and end of the queue.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/caf4add4-445b-40fd-a52f-dd37295035a5.png!large" title="" alt=""&gt;&lt;/p&gt;
&lt;h2 id="Generational Garbage Collection"&gt;Generational Garbage Collection&lt;/h2&gt;
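&lt;p&gt;To free FROM space, Mark and Compact has to scan every object, which is time-consuming. Generational garbage collection improves on this by scanning only new objects most of the time.&lt;/p&gt;

&lt;p&gt;Generational GC is based on a simple observation: if an object has lived for a long time, it tends to live even longer [3], and new objects are more likely to be collected. This means that most of the time we only need to check new objects. For example, when a request arrives, Rails creates a controller instance, which should be reclaimed after the request finishes. A database connection instance, however, may have existed for a while; it should not be reclaimed and does not need to be.&lt;/p&gt;

&lt;p&gt;Put simply, generational GC divides objects into two groups: the young generation and the old generation. Young objects were created recently, while old objects have been around for some time. The goal is to scan the young objects and free the memory they no longer need.&lt;/p&gt;

&lt;p&gt;For an object to be considered dead, there must be no references pointing to it. In the mark phase our goal is to scan only young objects. If an old object references a young object, we record the old object specially and scan it during the mark phase as well. This guarantees that when an object is freed, no other object still references it.&lt;/p&gt;

&lt;p&gt;When we create a reference, for example &lt;code&gt;a.b = &amp;amp;c&lt;/code&gt;, nothing special is needed if &lt;code&gt;a&lt;/code&gt; and &lt;code&gt;c&lt;/code&gt; are in the same generation or &lt;code&gt;a&lt;/code&gt; is younger than &lt;code&gt;c&lt;/code&gt;, because &lt;code&gt;a&lt;/code&gt; is scanned normally during the mark phase. When an old object references a new object, that is, &lt;code&gt;a&lt;/code&gt; is older than &lt;code&gt;c&lt;/code&gt;, we need to record &lt;code&gt;a&lt;/code&gt; and scan &lt;code&gt;a&lt;/code&gt; during the mark phase.&lt;/p&gt;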
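&lt;p&gt;A minimal Python sketch of this rule, under simplifying assumptions: there are only two generations, and &lt;code&gt;Obj&lt;/code&gt;, &lt;code&gt;write_field&lt;/code&gt;, &lt;code&gt;remembered_set&lt;/code&gt;, and &lt;code&gt;minor_mark&lt;/code&gt; are illustrative names, not Ruby internals. The check inside &lt;code&gt;write_field&lt;/code&gt; records an old object when it starts referencing a young one, and the minor mark scans those recorded objects in addition to the roots.&lt;/p&gt;

```python
class Obj:
    def __init__(self, name, old=False):
        self.name = name
        self.old = old
        self.fields = {}


remembered_set = []  # old objects that may reference young objects


def write_field(a, field, c):
    """Store `a.field = c`, recording `a` if it is old and `c` is young."""
    if a.old and not c.old and a not in remembered_set:
        remembered_set.append(a)
    a.fields[field] = c


def minor_mark(roots):
    """Return the names of young objects reachable from roots or recorded old objects."""
    stack = [obj for obj in roots if not obj.old]
    marked = {obj.name for obj in stack}
    stack.extend(remembered_set)  # scanned, but not themselves collected
    while stack:
        obj = stack.pop()
        for child in obj.fields.values():
            if not child.old and child.name not in marked:
                marked.add(child.name)
                stack.append(child)
    return marked


db = Obj("db_connection", old=True)
controller = Obj("controller")
tmp = Obj("tmp")
entry = Obj("cache_entry")
write_field(controller, "tmp", tmp)  # young to young: nothing recorded
write_field(db, "last", entry)       # old to young: db is recorded
print([obj.name for obj in remembered_set])  # ['db_connection']
print(sorted(minor_mark([controller])))      # ['cache_entry', 'controller', 'tmp']
```

&lt;p&gt;Without the record made in &lt;code&gt;write_field&lt;/code&gt;, &lt;code&gt;cache_entry&lt;/code&gt; would never be reached from the young roots and would be freed while still in use.&lt;/p&gt;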
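&lt;h2 id="Ruby 的 RGenGC"&gt;Ruby's RGenGC&lt;/h2&gt;&lt;h2 id="Ruby 需要解决的问题"&gt;The Problem Ruby Needs to Solve&lt;/h2&gt;
&lt;p&gt;Before Ruby 2.1, Ruby only had a non-generational mark-and-sweep GC [4]. To support generational GC, we need a write barrier to record the cases where an old object references a new object. However, because of Ruby's large code base and the existence of third-party C extensions, write barriers could not be added everywhere without breaking backward compatibility.&lt;/p&gt;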

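&lt;p&gt;To solve this compatibility problem, the Ruby team introduced the concept of Write-Barrier-Unprotected Objects.&lt;/p&gt;
&lt;h2 id="Write-Barrier-Unprotected Objects"&gt;Write-Barrier-Unprotected Objects&lt;/h2&gt;
&lt;p&gt;The write barrier is the recording described earlier, which must happen whenever an old object references a new object. Unprotected means that an old object may reference a new object without that reference having been recorded.&lt;/p&gt;

&lt;p&gt;Because not every Ruby object can be made to use a write barrier when an old object references a new object, we cannot know whether these objects reference new objects. They therefore have to be scanned during minor GC.&lt;/p&gt;
&lt;h2 id="Ruby RGenGC 标记步骤[4]"&gt;Ruby RGenGC Marking Steps [4]&lt;/h2&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. Put the root-set objects and objects referenced from the remembered set objects into the work queue.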
2. Repeat the following until the work queue is empty:
a. Dequeue an object p from the work queue.
b. For each object c referenced from p do:
    i. If p is an old object:
       • If c is already marked, makes c an old object and add c to the remembered set.
       • If c is not marked and not an old object, makes c’s age two (becomes an old object at the next step).
   ii. Increment the age of c by one, mark c, and then put c to work queue if c was not marked and is not an old object. Note that, in our implementation, if the age of an object becomes 3, the object becomes an old object.
&lt;/code&gt;&lt;/pre&gt;
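&lt;p&gt;To keep Write-Barrier-Unprotected Objects safe, Ruby detects these objects and puts them into a set so that they are scanned during minor GC. This lets Ruby reclaim objects safely.&lt;/p&gt;

&lt;p&gt;Here is the relevant Ruby 2.2 code (with code unrelated to this article removed) [6]:&lt;/p&gt;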
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
&lt;span class="nf"&gt;gc_mark_ptr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rb_objspace_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;rgengc_check_relation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;gc_mark_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="cm"&gt;/* already marked */&lt;/span&gt;
    &lt;span class="n"&gt;gc_aging&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;gc_grey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
&lt;span class="nf"&gt;rgengc_check_relation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rb_objspace_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="n"&gt;old_parent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;rgengc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parent_object&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;old_parent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* parent object is old */&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RVALUE_WB_UNPROTECTED&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;gc_remember_unprotected&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;RVALUE_OLD_P&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RVALUE_MARKED&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="cm"&gt;/* An object pointed from an OLD object should be OLD. */&lt;/span&gt;
                    &lt;span class="n"&gt;RVALUE_AGE_SET_OLD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;is_incremental_marking&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;RVALUE_MARKING&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                            &lt;span class="n"&gt;gc_grey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                        &lt;span class="p"&gt;}&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="n"&gt;rgengc_remember&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="n"&gt;RVALUE_AGE_SET_CANDIDATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
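&lt;/code&gt;&lt;/pre&gt;&lt;h2 id="总结"&gt;Summary&lt;/h2&gt;
&lt;p&gt;This article introduced several basic GC algorithms to make Ruby's RGenGC easier to understand. It leaves out many interesting topics, such as parallel garbage collectors [8]. Because it only explains the ideas, it also deliberately skips some details. For example, in Mark and Compact, once FROM space has been freed, the two spaces swap roles: FROM space becomes TO space and TO space becomes FROM space (a flip). Such details are easy to understand and do not affect the core of the algorithm. For more details, see references [2][3][4][5][7][8].&lt;/p&gt;

&lt;p&gt;Original article: &lt;a href="https://ruby-china.org/topics/43798" rel="nofollow" target="_blank"&gt;https://ruby-china.org/topics/43798&lt;/a&gt;&lt;/p&gt;
&lt;h2 id="引用"&gt;References&lt;/h2&gt;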
&lt;ol&gt;
&lt;li&gt;GhostCell: Separating Permissions from Data in Rust&lt;/li&gt;
&lt;li&gt;&lt;a href="https://ruby-china.org/topics/32226" rel="nofollow" target="_blank"&gt;https://ruby-china.org/topics/32226&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;A Real-Time Garbage Collector Based on the Lifetimes of Objects&lt;/li&gt;
&lt;li&gt;Gradual Write-Barrier Insertion into a Ruby Interpreter&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.peterzhu.ca/notes-on-ruby-gc/" rel="nofollow" target="_blank"&gt;https://blog.peterzhu.ca/notes-on-ruby-gc/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/ruby/ruby/releases/tag/v2_2_1" rel="nofollow" target="_blank"&gt;https://github.com/ruby/ruby/releases/tag/v2_2_1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://ocw.mit.edu/courses/6-172-performance-engineering-of-software-systems-fall-2018/resources/lecture-11-storage-allocation/" rel="nofollow" target="_blank"&gt;https://ocw.mit.edu/courses/6-172-performance-engineering-of-software-systems-fall-2018/resources/lecture-11-storage-allocation/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://inside.java/2022/08/01/sip062/" rel="nofollow" target="_blank"&gt;https://inside.java/2022/08/01/sip062/&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
      <author>yfractal</author>
      <pubDate>Sun, 07 Jul 2024 11:10:23 +0800</pubDate>
      <link>https://ruby-china.org/topics/43802</link>
      <guid>https://ruby-china.org/topics/43802</guid>
    </item>
    <item>
      <title>Garbage Collection 101 and Ruby's RGenGC (Restricted Generational GC)</title>
      <description>&lt;h2 id="Background"&gt;Background&lt;/h2&gt;
&lt;p&gt;Recently, I’ve been working on an experimental project called Ccache (&lt;a href="https://github.com/yfractal/ccache" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/ccache&lt;/a&gt;), which implements core functions in Rust and integrates with other languages such as Ruby and Golang. Rust is a systems programming language that doesn’t use garbage collection (GC) for memory management. However, Ruby and Golang do use GC. This raises an interesting question: how can Rust interact safely and effectively with languages that use GC? To answer this question, I spent some time understanding how Ruby's GC works.&lt;/p&gt;
&lt;h2 id="Introduction"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;In this article, I will describe how RGenGC (Restricted Generational GC), introduced in Ruby 2.1, works. To make things easier, I will first explain how garbage collection works in general, and then describe the unique problems that RGenGC solves, along with its mechanisms and source code.&lt;/p&gt;
&lt;h2 id="Garbage Collection 101"&gt;Garbage Collection 101&lt;/h2&gt;
&lt;p&gt;Programs allocate virtual memory from the operating system, which maps the virtual memory to physical memory (or other resources such as files). Due to the limitations of physical memory and performance requirements, programs need to return the memory back once they are done using it.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/ba3fa818-4a96-451c-96f9-a9ae9158c1be.png!large" title="" alt=""&gt;&lt;/p&gt;

&lt;p&gt;Programs use two kinds of memory: stack and heap. Rust primarily uses stack memory, which requires knowing the size of variables before allocating memory. For dynamically sized structures, such as vectors, Rust allocates memory on the heap. To manage memory safely and efficiently, Rust doesn’t allow shared mutable and cyclic references. These restrictions allow Rust to use reference counting for managing heap memory but make it challenging to implement certain structures, like doubly linked lists[1].&lt;/p&gt;

&lt;p&gt;C requires programmers to allocate and free heap memory manually, which is extremely difficult to get right. Rust uses lifetimes and ownership to make memory management explicit and safe while keeping runtime cost low, as a systems language must.&lt;/p&gt;

&lt;p&gt;Languages like Ruby and Go use GC—programmers do not need to consider when memory is freed, as it’s handled by the GC. Simply put, GC needs to return unused memory back to the system efficiently.&lt;/p&gt;
&lt;h2 id="Mark and Sweep"&gt;Mark and Sweep&lt;/h2&gt;
&lt;p&gt;Mark-and-sweep is a garbage collection algorithm that detects and frees inactive memory.&lt;/p&gt;

&lt;p&gt;To find unused memory, the mark-and-sweep algorithm traverses all reachable objects through references. Unvisited objects are deemed inactive and can be freed. The mark phase identifies all reachable objects and marks them as such. In the sweep phase, these marked objects are retained while the others are freed.&lt;/p&gt;

&lt;p&gt;To traverse all reachable objects, it uses breadth-first search (BFS): starting from the roots, it repeatedly follows the direct references held by each item. To avoid infinite loops, an item is marked as visited the first time it is reached, ensuring it won't be revisited.&lt;/p&gt;

&lt;p&gt;The algorithm maintains a queue to hold currently known reachable items. It dequeues an item and, for each of its references that has not been visited yet, marks that reference as visited and enqueues it. If a reference has already been visited, it isn’t added to the queue. This process continues until the queue is empty, indicating all reachable items have been visited.&lt;/p&gt;

&lt;p&gt;Here's the pseudocode:&lt;/p&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;init_queue&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nb"&gt;object&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;global_objects&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
  &lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;visited&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;true&lt;/span&gt;
  &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enqueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_empty&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;true&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
  &lt;span class="nb"&gt;object&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dequeue&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;reference&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;references&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;reference&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;visited&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;false&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
       &lt;span class="n"&gt;reference&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;visited&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;true&lt;/span&gt;
       &lt;span class="n"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enqueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reference&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now that all reachable objects are known, the sweep phase involves looping through all objects. If an object’s visited field is false, it means the object is unreachable and can be freed.&lt;/p&gt;

&lt;p&gt;To summarize, the process is:&lt;/p&gt;

&lt;p&gt;Mark stage: breadth-first search marks all of the live objects.
Sweep stage: scan over memory and free the unmarked objects.&lt;/p&gt;
&lt;h2 id="Mark and Compact"&gt;Mark and Compact&lt;/h2&gt;
&lt;p&gt;While mark-and-sweep can free unreachable objects, it doesn’t deal with fragmentation, which creates holes in memory.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/e6d98eee-fe56-42f6-b455-731df5e0fb4e.png!large" title="" alt=""&gt;&lt;/p&gt;

&lt;p&gt;Because the freed objects are not contiguous, it may be impossible to allocate memory for a large structure even when enough total memory is free.&lt;/p&gt;

&lt;p&gt;Mark and Compact solves this issue by not only freeing unused memory but also rearranging live objects to make free memory contiguous.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/1ce7fcd9-e0bc-430b-a500-b91f292a1dee.png!large" title="" alt=""&gt;&lt;/p&gt;

&lt;p&gt;The idea is to divide memory into two halves, FROM space and TO space. When FROM space is "full," the mark phase starts. During the mark phase, reachable objects are moved from FROM space to TO space. Since other objects may refer to a moved object, a forward reference to its new address is recorded in the object's old memory. The objects it references are also moved to TO space, and its pointers are updated to their new addresses. After the marking phase, all reachable objects have been moved to TO space, and FROM space can be freed.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/63f75045-9f90-412a-ac22-448de8fd3b5d.png!large" title="" alt=""&gt;&lt;/p&gt;

&lt;p&gt;You can also consider TO space as the queue in the mark-and-sweep algorithm, with two pointers representing the start and end of the queue.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/caf4add4-445b-40fd-a52f-dd37295035a5.png!large" title="" alt=""&gt;&lt;/p&gt;
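&lt;p&gt;A small Python sketch of the copying scheme described above, with TO space doubling as the queue. It is a simplified model, not how any real collector lays out memory: TO space is a list, &lt;code&gt;scan&lt;/code&gt; is the queue's start, the end of the list is the queue's end, and the &lt;code&gt;forward&lt;/code&gt; field plays the role of the forward reference. All names are illustrative.&lt;/p&gt;

```python
class Obj:
    def __init__(self, name):
        self.name = name
        self.references = []
        self.forward = None  # set once the object has been copied


def copy(obj, to_space):
    """Copy obj into TO space once; later calls return the same copy."""
    if obj.forward is None:
        clone = Obj(obj.name)
        clone.references = list(obj.references)  # still point into FROM space
        obj.forward = clone
        to_space.append(clone)
    return obj.forward


def collect(roots):
    to_space = []
    new_roots = [copy(root, to_space) for root in roots]
    scan = 0  # start of the queue; len(to_space) is its end
    while scan != len(to_space):
        obj = to_space[scan]
        # Copy everything obj references and update obj's pointers.
        obj.references = [copy(ref, to_space) for ref in obj.references]
        scan += 1
    return new_roots, to_space


a, b, c, d = Obj("a"), Obj("b"), Obj("c"), Obj("d")
a.references = [b, c]
b.references = [c]
c.references = [a]  # cycle back to a
d.references = [a]  # d itself is unreachable
new_roots, to_space = collect([a])
print([obj.name for obj in to_space])  # ['a', 'b', 'c']
```

&lt;p&gt;Everything left in the old list, here only &lt;code&gt;d&lt;/code&gt; and the stale originals, can then be discarded in one step.&lt;/p&gt;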
&lt;h2 id="Generational Garbage Collection"&gt;Generational Garbage Collection&lt;/h2&gt;
&lt;p&gt;To free FROM space, the mark-and-compact algorithm has to scan through all objects, which is time-consuming. Generational garbage collection improves this by scanning only new objects most of the time.&lt;/p&gt;

&lt;p&gt;Generational garbage collection is based on a simple observation: if an object lives for a long time, it tends to live even longer[3]. This means we only need to check new objects most of the time. For example, when a request comes in, Rails creates a controller instance, which should be reclaimed after the request finishes. However, DB connection instances can live for a while and should not be reclaimed.&lt;/p&gt;

&lt;p&gt;To simplify, objects are divided into two categories: new generation and old generation. New-generation objects have been created recently, while old-generation objects have been around for a while. The goal is to free memory related to new-generation objects after scanning them.&lt;/p&gt;

&lt;p&gt;For an object to be marked as dead, there must be no references to it. During the marking phase, we scan all new-generation objects. If an old object references a new object, we record the old object and scan it during the marking phase as well.&lt;/p&gt;

&lt;p&gt;Consider creating a reference such as &lt;code&gt;a.b = c&lt;/code&gt;. If the referencing object &lt;code&gt;a&lt;/code&gt; is in the same generation as &lt;code&gt;c&lt;/code&gt;, or is younger, nothing special is needed, because &lt;code&gt;a&lt;/code&gt; is scanned in the marking phase anyway. If &lt;code&gt;a&lt;/code&gt; is older than &lt;code&gt;c&lt;/code&gt;, &lt;code&gt;a&lt;/code&gt; is recorded separately and also scanned in the marking phase. Because every object that can reference a new object is covered, we can safely treat an unmarked new object as dead.&lt;/p&gt;
&lt;h2 id="Ruby RGenGC"&gt;Ruby RGenGC&lt;/h2&gt;&lt;h2 id="The Write Barrier"&gt;The Write Barrier&lt;/h2&gt;
&lt;p&gt;During generational garbage collection, we scan only young-generation objects most of the time to reduce GC cost. To do this safely, we need to record every case where an old object comes to reference a young object. The mechanism that makes this record at write time is called a write barrier.&lt;/p&gt;
&lt;h2 id="The Unique Problem Ruby Faces"&gt;The Unique Problem Ruby Faces&lt;/h2&gt;
&lt;p&gt;Before Ruby 2.1, Ruby only had a non-generational mark-and-sweep GC[4]. To support generational GC, we need a write barrier to record when an old object references a new object. However, this is challenging because Ruby has a large code base and many third-party C extensions.&lt;/p&gt;

&lt;p&gt;To solve this compatibility issue, the Ruby team created the concept of Write-Barrier-Unprotected Objects.&lt;/p&gt;
&lt;h2 id="Write-Barrier-Unprotected Objects"&gt;Write-Barrier-Unprotected Objects&lt;/h2&gt;
&lt;p&gt;Since we can't let all Ruby objects use a Write Barrier when an old object references a new object, we don't know if these objects reference new objects or not. During minor GC, we must scan these objects.&lt;/p&gt;
&lt;h2 id="Ruby RGenGC Marking Steps without WB-unprotected objects[4]:"&gt;Ruby RGenGC Marking Steps without WB-unprotected objects[4]:&lt;/h2&gt;&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. Put the root-set objects and objects referenced from the remembered set objects into the work queue.
2. Repeat the following until the work queue is empty:
a. Dequeue an object p from the work queue.
b. For each object c referenced from p do:
    i. If p is an old object:
       • If c is already marked, makes c an old object and add c to the remembered set.
       • If c is not marked and not an old object, makes c’s age two (becomes an old object at the next step).
   ii. Increment the age of c by one, mark c, and then put c to work queue if c was not marked and is not an old object. Note that, in our implementation, if the age of an object becomes 3, the object becomes an old object.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Based on the above steps, Ruby keeps Write-Barrier-Unprotected Objects safe by detecting them and recording them in a set, which is then scanned during minor GC.&lt;/p&gt;
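&lt;p&gt;One way to read step 2.b of the quoted procedure is the following Python sketch. It is my interpretation, not Ruby's implementation: &lt;code&gt;Obj&lt;/code&gt;, &lt;code&gt;visit_reference&lt;/code&gt;, and &lt;code&gt;OLD_AGE&lt;/code&gt; are hypothetical names, and the extra check that &lt;code&gt;c&lt;/code&gt; is not already old in the first bullet is an assumption that mirrors the &lt;code&gt;RVALUE_OLD_P&lt;/code&gt; guard in the Ruby code that follows.&lt;/p&gt;

```python
from collections import deque

OLD_AGE = 3  # per the quoted steps, an object whose age reaches 3 is old


class Obj:
    def __init__(self, name, age=0):
        self.name = name
        self.age = age
        self.marked = False

    @property
    def old(self):
        return self.age >= OLD_AGE


def visit_reference(p, c, queue, remembered_set):
    """Apply step 2.b to a single reference from p to c."""
    if p.old:  # step i
        if c.marked and not c.old:
            c.age = OLD_AGE           # promote immediately
            remembered_set.append(c)  # its children may still be young
        elif not c.marked and not c.old:
            c.age = OLD_AGE - 1       # becomes old in step ii
    if not c.marked and not c.old:  # step ii
        c.age += 1
        c.marked = True
        queue.append(c)


queue, remembered = deque(), []
old_parent = Obj("old_parent", age=3)
young_parent = Obj("young_parent")

fresh = Obj("fresh")
visit_reference(young_parent, fresh, queue, remembered)
print(fresh.age, fresh.old)  # 1 False

adopted = Obj("adopted")
visit_reference(old_parent, adopted, queue, remembered)
print(adopted.age, adopted.old)  # 3 True

seen = Obj("seen", age=1)
seen.marked = True  # already marked earlier in this GC
visit_reference(old_parent, seen, queue, remembered)
print(seen.old, [obj.name for obj in remembered])  # True ['seen']
```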

&lt;p&gt;Here is the relevant Ruby 2.2 code (simplified by removing unrelated code)[6]:&lt;/p&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
&lt;span class="nf"&gt;gc_mark_ptr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rb_objspace_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;rgengc_check_relation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;gc_mark_set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="cm"&gt;/* already marked */&lt;/span&gt;
    &lt;span class="n"&gt;gc_aging&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;gc_grey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt;
&lt;span class="nf"&gt;rgengc_check_relation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rb_objspace_t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="n"&gt;old_parent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;rgengc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parent_object&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;old_parent&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* parent object is old */&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RVALUE_WB_UNPROTECTED&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;gc_remember_unprotected&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;RVALUE_OLD_P&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RVALUE_MARKED&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="cm"&gt;/* An object pointed from an OLD object should be OLD. */&lt;/span&gt;
                    &lt;span class="n"&gt;RVALUE_AGE_SET_OLD&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;is_incremental_marking&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;RVALUE_MARKING&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                            &lt;span class="n"&gt;gc_grey&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                        &lt;span class="p"&gt;}&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                        &lt;span class="n"&gt;rgengc_remember&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                    &lt;span class="p"&gt;}&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="n"&gt;RVALUE_AGE_SET_CANDIDATE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;objspace&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id="Summary"&gt;Summary&lt;/h2&gt;
&lt;p&gt;This article introduced several basic GC algorithms to make Ruby's RGenGC easier to understand. It leaves out many interesting topics, such as the Parallel Garbage Collector[8], because the focus is on the basics and on what is relevant to Ruby. Some details are intentionally omitted, such as the flip in Mark and Compact, as they are easy to work out. For more details, you can refer to these links: [2][3][4][5][7][8].&lt;/p&gt;
&lt;h2 id="References"&gt;References&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;GhostCell: Separating Permissions from Data in Rust&lt;/li&gt;
&lt;li&gt;&lt;a href="https://ruby-china.org/topics/32226" rel="nofollow" target="_blank"&gt;https://ruby-china.org/topics/32226&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;A Real-Time Garbage Collector Based on the Lifetimes of Objects&lt;/li&gt;
&lt;li&gt;Gradual Write-Barrier Insertion into a Ruby Interpreter&lt;/li&gt;
&lt;li&gt;&lt;a href="https://blog.peterzhu.ca/notes-on-ruby-gc/" rel="nofollow" target="_blank"&gt;https://blog.peterzhu.ca/notes-on-ruby-gc/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/ruby/ruby/releases/tag/v2_2_1" rel="nofollow" target="_blank"&gt;https://github.com/ruby/ruby/releases/tag/v2_2_1&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://ocw.mit.edu/courses/6-172-performance-engineering-of-software-systems-fall-2018/resources/lecture-11-storage-allocation/" rel="nofollow" target="_blank"&gt;https://ocw.mit.edu/courses/6-172-performance-engineering-of-software-systems-fall-2018/resources/lecture-11-storage-allocation/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://inside.java/2022/08/01/sip062/" rel="nofollow" target="_blank"&gt;https://inside.java/2022/08/01/sip062/&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
      <author>yfractal</author>
      <pubDate>Sat, 06 Jul 2024 11:37:07 +0800</pubDate>
      <link>https://ruby-china.org/topics/43798</link>
      <guid>https://ruby-china.org/topics/43798</guid>
    </item>
    <item>
      <title>Calling Rust from Ruby: How Rutie Works</title>
      <description>&lt;h2 id="Introduction"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Recently, I've been developing an experimental project called ccache (&lt;a href="https://github.com/yfractal/ccache" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/ccache&lt;/a&gt;), a Redis client-side cache that guarantees consistency. Since it runs on the client side, it needs to support different programming languages.&lt;/p&gt;

&lt;p&gt;The common practice is to rewrite the same logic in each language, as with Redis clients and the OpenTelemetry instrumentation libraries. However, this approach involves tedious and repetitive work.&lt;/p&gt;

&lt;p&gt;One potential solution is to write the core functionality in Rust and integrate it with different languages. To achieve this, we need to address the discrepancies between Rust and other languages, such as how to represent data and manage memory safely.&lt;/p&gt;

&lt;p&gt;In this article, I will introduce Rutie, which bridges the gap between Ruby and Rust.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://west-barber-f19.notion.site/image/https%3A%2F%2Fprod-files-secure.s3.us-west-2.amazonaws.com%2F3ad64ea3-ffc7-46f7-a5e8-78e7a3534dae%2Fc3ecbef2-8cb2-4bae-bf1d-7744801cf570%2FUntitled.png?table=block&amp;amp;id=8150861d-5437-4eb8-9d6a-2c6373196403&amp;amp;spaceId=3ad64ea3-ffc7-46f7-a5e8-78e7a3534dae&amp;amp;width=1420&amp;amp;userId=&amp;amp;cache=v2" title="" alt="Untitled"&gt;&lt;/p&gt;
&lt;h2 id="How it works"&gt;How it works&lt;/h2&gt;
&lt;p&gt;Ruby MRI is written in C, so it works natively with C. The idea is to expose Rust code through the C ABI while keeping memory management safe.&lt;/p&gt;
&lt;h2 id="Ruby Calls Rust Functions"&gt;Ruby Calls Rust Functions&lt;/h2&gt;
&lt;p&gt;First, we can let &lt;code&gt;cargo&lt;/code&gt; compile Rust files into a dynamic library by specifying &lt;code&gt;crate-type = ["dylib"]&lt;/code&gt; in &lt;code&gt;Cargo.toml&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Then we can call the function through &lt;code&gt;fiddle&lt;/code&gt;[2].&lt;/p&gt;
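
&lt;p&gt;Before turning to Rust, the Fiddle call pattern can be tried on its own. The following is a minimal sketch, assuming a Linux or macOS system where &lt;code&gt;Fiddle.dlopen(nil)&lt;/code&gt; exposes the C library's symbols; it calls libc's &lt;code&gt;strlen&lt;/code&gt; rather than any Rust code, and the variable names are only illustrative.&lt;/p&gt;

```ruby
require "fiddle"

# Open the current process; on Linux and macOS this exposes libc symbols.
libc = Fiddle.dlopen(nil)

# Describe the C signature: size_t strlen(const char *s)
strlen = Fiddle::Function.new(
  libc["strlen"],          # address of the symbol
  [Fiddle::TYPE_VOIDP],    # argument types
  Fiddle::TYPE_SIZE_T      # return type
)

puts strlen.call("hello from Ruby") # prints 15
```

&lt;p&gt;The same three pieces (symbol address, argument types, return type) are all Fiddle needs to call a function exported from a Rust dynamic library.&lt;/p&gt;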

&lt;p&gt;For example:&lt;/p&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nd"&gt;#[allow(non_snake_case)]&lt;/span&gt;
&lt;span class="nd"&gt;#[no_mangle]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;extern&lt;/span&gt; &lt;span class="s"&gt;"C"&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;rust_method&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nd"&gt;println!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"hello from Rust"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s2"&gt;"fiddle"&lt;/span&gt;

&lt;span class="n"&gt;handle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Fiddle&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dlopen&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"./target/release/libruby_example.dylib"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="no"&gt;Fiddle&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Function&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"rust_method"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="no"&gt;Fiddle&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;TYPE_VOID&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id="Bind C Functions to Ruby Class"&gt;Bind C Functions to Ruby Class&lt;/h2&gt;
&lt;p&gt;Ruby allows us to define Ruby methods in C, which is more convenient than using &lt;code&gt;fiddle&lt;/code&gt;. For example, &lt;code&gt;void rb_define_method(VALUE klass, const char *name, VALUE (*func)(ANYARGS), int argc)&lt;/code&gt; defines an instance method on a class. The first argument is the class, the second is the method’s name, the third is the callback function, and the fourth is the number of arguments the method takes (&lt;code&gt;-1&lt;/code&gt; means a variable number, passed to the callback as a C array).&lt;/p&gt;
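
&lt;p&gt;As a rough pure-Ruby analogy (a sketch, not what the C API does internally), a method registered with an argument count of &lt;code&gt;-1&lt;/code&gt; behaves like a method that takes a splat: the callback sees the argument count, the argument array, and the receiver. The class and method names below are placeholders.&lt;/p&gt;

```ruby
# Pure-Ruby analogy for a method registered with argc = -1.
class SomeClass
  # *argv collects every argument, like the C (argc, argv, self) form.
  def a_method(*argv)
    argc = argv.length
    [argc, argv, self]
  end
end

obj = SomeClass.new
argc, argv, receiver = obj.a_method(1, "two", :three)

puts argc                                        # prints 3
puts argv.inspect                                # prints [1, "two", :three]
puts receiver.equal?(obj)                        # prints true
puts SomeClass.instance_method(:a_method).arity  # prints -1
```

&lt;p&gt;The reported arity of &lt;code&gt;-1&lt;/code&gt; is the same value passed as the last argument of &lt;code&gt;rb_define_method&lt;/code&gt; for a variadic method.&lt;/p&gt;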

&lt;p&gt;After the method is defined through &lt;code&gt;rb_define_method(SomeClass, "a_method", call_back_ptr, -1)&lt;/code&gt;, calling it from Ruby, for example &lt;code&gt;SomeClass.new.a_method&lt;/code&gt;, makes the Ruby VM invoke the callback function. The callback receives the argument count, the argument array, and the receiver (&lt;code&gt;self&lt;/code&gt; in Ruby). For example, the callback function in C could be:&lt;/p&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt;
&lt;span class="nf"&gt;ruby_insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// ...&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id="Bind C Struct to Ruby Object"&gt;Bind C Struct to Ruby Object&lt;/h2&gt;
&lt;p&gt;Both C and Rust organize data in &lt;code&gt;struct&lt;/code&gt;s and define functions that operate on them. Binding a struct to a Ruby object lets us reuse existing code. Ruby supports this through the &lt;code&gt;rb_data_typed_object_wrap&lt;/code&gt; and &lt;code&gt;rb_check_typeddata&lt;/code&gt; functions.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;rb_data_typed_object_wrap&lt;/code&gt; creates a new instance with the struct. Its signature is &lt;code&gt;VALUE rb_data_typed_object_wrap(VALUE klass, void *datap, const rb_data_type_t *type)&lt;/code&gt;. &lt;code&gt;datap&lt;/code&gt; is the pointer to our struct, and the return value is the created instance.&lt;/p&gt;

&lt;p&gt;Then we can use &lt;code&gt;rb_check_typeddata&lt;/code&gt; to find the struct. Its signature is &lt;code&gt;void * rb_check_typeddata(VALUE obj, const rb_data_type_t *data_type)&lt;/code&gt;. The first argument is the Ruby instance, and it returns the struct’s pointer.&lt;/p&gt;
&lt;h2 id="Making Ruby Work with Rust"&gt;Making Ruby Work with Rust&lt;/h2&gt;&lt;h3 id="Binding Rust Methods to Ruby Classes"&gt;Binding Rust Methods to Ruby Classes&lt;/h3&gt;
&lt;p&gt;In the sections above, I explained how Ruby works with C. Now, I will introduce how Rust works with Ruby.&lt;/p&gt;

&lt;p&gt;Rust lets us define functions that follow the C calling convention using the &lt;code&gt;extern&lt;/code&gt; keyword:&lt;/p&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;extern&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;ruby_insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;rutie&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;types&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;rutie&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;AnyObject&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;rtself&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SomeClass&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;AnyObject&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// ......&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then the method can be bound to a class through:&lt;/p&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nn"&gt;Class&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"SomeRubyClass"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.define&lt;/span&gt;&lt;span class="p"&gt;(|&lt;/span&gt;&lt;span class="n"&gt;klass&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;klass&lt;/span&gt;&lt;span class="nf"&gt;.def&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"rs_insert"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ruby_insert&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;h3 id="Binding Rust Structs to Ruby Objects"&gt;Binding Rust Structs to Ruby Objects&lt;/h3&gt;
&lt;p&gt;To bind a struct to a Ruby object, we need to manage memory properly, since Ruby uses garbage collection (GC) while Rust relies on ownership. &lt;code&gt;rutie&lt;/code&gt; solves this by delegating the struct’s memory management to Ruby. When it wraps data (in the &lt;code&gt;Class::wrap_data&lt;/code&gt; method), it uses &lt;code&gt;Box::into_raw(Box::new(data)) as *mut c_void&lt;/code&gt;: &lt;code&gt;Box::new&lt;/code&gt; allocates the struct on the heap, and &lt;code&gt;Box::into_raw&lt;/code&gt; gives up Rust's ownership of it, so Rust does not free the struct when the variable goes out of scope. Instead, when Ruby reclaims the object that wraps the struct, it frees the struct as well. When the struct is wrapped by calling &lt;code&gt;rb_data_typed_object_wrap&lt;/code&gt;, the last argument includes the free callback, which is:&lt;/p&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;extern&lt;/span&gt; &lt;span class="s"&gt;"C"&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="n"&gt;free&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Sized&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="nb"&gt;c_void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Memory is freed when the box goes out of the scope&lt;/span&gt;
    &lt;span class="k"&gt;unsafe&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;Box&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from_raw&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
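&lt;p&gt;This callback is defined in Rutie. Rebuilding the &lt;code&gt;Box&lt;/code&gt; from the raw pointer returns ownership to Rust, so the struct is dropped as soon as the box goes out of scope.&lt;/p&gt;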

&lt;p&gt;Moreover, Rutie provides several macros and methods to make this process easier:&lt;/p&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="nd"&gt;wrappable_struct!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SomeStruct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SomeStructWrapper&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SOME_STRUCT_WRAPPER&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nd"&gt;class!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;SomeRubyClass&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nn"&gt;class&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;wrap_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;Class&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from_existing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"SomeRubyClass"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.value&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;some_struct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;*&lt;/span&gt;&lt;span class="n"&gt;SOME_STRUCT_WRAPPER&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;wrappable_struct!&lt;/code&gt; defines a wrapper struct:&lt;/p&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;SomeStructWrapper&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;data_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;rutie&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;types&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;DataType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;_marker&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;std&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;marker&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;PhantomData&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;&lt;code&gt;PhantomData&lt;/code&gt; ties the wrapper to the type &lt;code&gt;T&lt;/code&gt; without storing a value of that type, and &lt;code&gt;data_type&lt;/code&gt; is passed as the last argument of &lt;code&gt;rb_data_typed_object_wrap&lt;/code&gt;. The macro also defines a global variable &lt;code&gt;SOME_STRUCT_WRAPPER&lt;/code&gt;, which plays a role similar to a singleton in Ruby[3].&lt;/p&gt;

&lt;p&gt;&lt;code&gt;class!(RubySomeStruct);&lt;/code&gt; defines a Ruby class for wrapping.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;class::wrap_data(Class::from_existing("RubySomeStruct").value(), some_struct, &amp;amp;*SOME_STRUCT_WRAPPER);&lt;/code&gt; is called in &lt;code&gt;initialize&lt;/code&gt; to bind the struct. As mentioned, &lt;code&gt;SOME_STRUCT_WRAPPER&lt;/code&gt; holds a &lt;code&gt;SomeStructWrapper&lt;/code&gt;, whose &lt;code&gt;data_type&lt;/code&gt; field carries the &lt;code&gt;free&lt;/code&gt; callback:&lt;/p&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;extern&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;ruby_initialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;argc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;rutie&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;types&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Argc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;rutie&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;AnyObject&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="n"&gt;rtself&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;RubyStore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;AnyObject&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nn"&gt;class&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;wrap_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nn"&gt;Class&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from_existing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"RubySomeStruct"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.value&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;some_struct&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;*&lt;/span&gt;&lt;span class="n"&gt;SOME_STRUCT_WRAPPER&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;To use the struct:&lt;/p&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;pub extern fn ruby_insert(
    argc: ::rutie::types::Argc,
    argv: *const ::rutie::AnyObject,
    mut rtself: RubyStore,
) -&amp;gt; AnyObject {
   let rs_struct = rtself.get_data_mut(&amp;amp;*SOME_STRUCT_WRAPPER);
   // ......
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We need the global variable &lt;code&gt;SOME_STRUCT_WRAPPER&lt;/code&gt; because &lt;code&gt;get_data_mut&lt;/code&gt; calls &lt;code&gt;rb_check_typeddata&lt;/code&gt;, whose second argument is the &lt;code&gt;rb_data_type_t&lt;/code&gt; provided by the struct wrapper’s &lt;code&gt;data_type&lt;/code&gt; field.&lt;/p&gt;
&lt;h2 id="Memory Safety"&gt;Memory Safety&lt;/h2&gt;
&lt;p&gt;A wrapped Rust struct is safe because the responsibility for freeing memory has been delegated to Ruby, allowing Ruby to free the memory during garbage collection (GC).&lt;/p&gt;

&lt;p&gt;Values passed to Rust are structs that wrap pointers, so when Rust drops them it does not free the contents the pointers point to. Since this is safe in C, it is safe in Rust as well.&lt;/p&gt;

&lt;p&gt;Return values are allocated through Ruby's memory system, and when they are returned, their ownership passes to the caller, Ruby, which is also safe.&lt;/p&gt;

&lt;p&gt;However, it becomes unsafe when Rust keeps a reference to a Ruby object inside its own struct: Ruby doesn't know that Rust holds the reference and might free the object, causing a use-after-free. A simple solution is that whenever Rust holds such a reference, for example in an &lt;code&gt;Arc&lt;/code&gt;, it asks Ruby to keep an additional reference to the object, and when Rust drops the &lt;code&gt;Arc&lt;/code&gt;, it asks Ruby to remove that additional reference.&lt;/p&gt;
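
&lt;p&gt;The Ruby half of that idea can be sketched as a small counting registry. This is a hypothetical helper, not part of Rutie: &lt;code&gt;PinRegistry&lt;/code&gt;, &lt;code&gt;pin&lt;/code&gt;, and &lt;code&gt;unpin&lt;/code&gt; are names invented for illustration, and the Rust side is assumed to call &lt;code&gt;pin&lt;/code&gt; when it stores a reference and &lt;code&gt;unpin&lt;/code&gt; when it drops it. While an object sits in the registry's Hash it stays reachable, so the GC will not reclaim it.&lt;/p&gt;

```ruby
# Hypothetical Ruby-side registry; not part of Rutie.
module PinRegistry
  @pinned = {} # maps object_id to a pair [object, count]

  # Called when Rust starts holding a reference to obj.
  def self.pin(obj)
    entry = (@pinned[obj.object_id] ||= [obj, 0])
    entry[1] += 1
    obj
  end

  # Called when Rust drops its reference; returns false if obj was not pinned.
  def self.unpin(obj)
    entry = @pinned[obj.object_id]
    return false if entry.nil?

    entry[1] -= 1
    @pinned.delete(obj.object_id) if entry[1].zero?
    true
  end

  def self.pinned?(obj)
    @pinned.key?(obj.object_id)
  end
end

value = "cached value"
PinRegistry.pin(value)           # Rust stores a first reference
PinRegistry.pin(value)           # Rust stores a second reference
PinRegistry.unpin(value)         # one reference dropped
puts PinRegistry.pinned?(value)  # prints true
PinRegistry.unpin(value)         # last reference dropped
puts PinRegistry.pinned?(value)  # prints false
```

&lt;p&gt;Counting the pins rather than storing a single flag lets several Rust-side holders share one Ruby object without releasing it too early.&lt;/p&gt;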
&lt;h2 id="Summary"&gt;Summary&lt;/h2&gt;
&lt;p&gt;This article introduced how Rutie enables Ruby to use Rust code and discussed memory safety. Many details are not covered here, such as how Rutie lets Rust call Ruby, how to build for different platforms, and how it translates structs and binds C methods. I encourage you to try Rutie and explore its source code for a deeper understanding.&lt;/p&gt;

&lt;p&gt;I believe Rutie can greatly benefit the Ruby community. Not only can Rust enhance performance, but it also allows Ruby to leverage Rust implementations, such as gRPC and OpenTelemetry metrics.&lt;/p&gt;

&lt;p&gt;A complete example of using Rutie can be found at &lt;a href="https://github.com/yfractal/ccache/tree/main/ccache_rb" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/ccache/tree/main/ccache_rb&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="References"&gt;References&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href="https://github.com/danielpclark/rutie" rel="nofollow" target="_blank"&gt;https://github.com/danielpclark/rutie&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/ruby/fiddle" rel="nofollow" target="_blank"&gt;https://github.com/ruby/fiddle&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://refactoring.guru/design-patterns/singleton/ruby/example" rel="nofollow" target="_blank"&gt;https://refactoring.guru/design-patterns/singleton/ruby/example&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://ruby-china.org/topics/43728" rel="nofollow" target="_blank"&gt;https://ruby-china.org/topics/43728&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
      <author>yfractal</author>
      <pubDate>Thu, 27 Jun 2024 21:46:43 +0800</pubDate>
      <link>https://ruby-china.org/topics/43777</link>
      <guid>https://ruby-china.org/topics/43777</guid>
    </item>
    <item>
      <title>Rust2go: calls Go from Rust </title>
      <description>&lt;h2 id="Introduction"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Recently, I've been developing an experimental project called &lt;a href="https://github.com/yfractal/ccache" rel="nofollow" target="_blank" title=""&gt;ccache&lt;/a&gt;, a Redis client-side cache that guarantees consistency. Since it runs on the client side, it needs to support different programming languages.&lt;/p&gt;

&lt;p&gt;The common practice is to rewrite the same logic in each language, as with Redis clients and the OpenTelemetry instrumentation libraries. However, this approach involves tedious and repetitive work.&lt;/p&gt;

&lt;p&gt;One potential solution is to write the core functionality in Rust and integrate it with different languages. To achieve this, we need to address the discrepancies between Rust and other languages, such as how to represent data and manage memory safely.&lt;/p&gt;

&lt;p&gt;Rust2go is a practical FFI framework that enables calling Go from Rust. In this article, I will explain how it works.&lt;/p&gt;
&lt;h2 id="Benefits"&gt;Benefits&lt;/h2&gt;
&lt;p&gt;Due to its low overhead and safety guarantees, Rust has been integrated into many other systems traditionally written in C, such as Ruby and Linux.&lt;/p&gt;

&lt;p&gt;Integrating Rust with other high-level languages is beneficial as it can improve performance and reduce repetitive work.&lt;/p&gt;

&lt;p&gt;For example, ByteDance reduced CPU usage by more than 30% after migrating a core service from Golang to Rust[2].&lt;/p&gt;

&lt;p&gt;Additionally, OpenTelemetry supports 11 languages[3]. Using Rust for core functionality can significantly reduce development efforts and prevent inconsistencies between different language implementations.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://west-barber-f19.notion.site/image/https%3A%2F%2Fprod-files-secure.s3.us-west-2.amazonaws.com%2F3ad64ea3-ffc7-46f7-a5e8-78e7a3534dae%2Fc3ecbef2-8cb2-4bae-bf1d-7744801cf570%2FUntitled.png?table=block&amp;amp;id=8150861d-5437-4eb8-9d6a-2c6373196403&amp;amp;spaceId=3ad64ea3-ffc7-46f7-a5e8-78e7a3534dae&amp;amp;width=1420&amp;amp;userId=&amp;amp;cache=v2" title="" alt="Untitled"&gt;&lt;/p&gt;
&lt;h2 id="How Rust2go Works"&gt;How Rust2go Works&lt;/h2&gt;&lt;h2 id="Calling Go Functions"&gt;Calling Go Functions&lt;/h2&gt;
&lt;p&gt;After building the Go code into a library and linking it to Rust, the Go functions become accessible within the Rust project.&lt;/p&gt;

&lt;p&gt;However, Rust and Go have different calling conventions, so Rust cannot directly call Go functions. One solution is to use a trampoline to handle this issue[4]. Due to the unstable Rust ABI and the desire to address goroutine stack expansion, the author of rust2go chose not to use this method.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/ihciah/rust2go?tab=readme-ov-file" rel="nofollow" target="_blank" title=""&gt;rust2go&lt;/a&gt; uses the C ABI as a "bridge" between Rust and Go. The Go functions are exposed as C functions through cgo, and Rust calls these C functions.&lt;/p&gt;
&lt;h2 id="Memory Representation"&gt;Memory Representation&lt;/h2&gt;
&lt;p&gt;Rust and Go represent structs in different ways. In Rust2go, a struct is first converted to a C struct and then to a Go struct. For example, a Rust struct &lt;code&gt;DemoUser&lt;/code&gt; is converted to &lt;code&gt;DemoUserRef&lt;/code&gt; and then to a Go &lt;code&gt;DemoUser&lt;/code&gt;.&lt;/p&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;DemoUser&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;typedef&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;DemoUserRef&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;StringRef&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kt"&gt;uint8_t&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="n"&gt;DemoUserRef&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;newDemoUser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="n"&gt;C&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DemoUserRef&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;DemoUser&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;DemoUser&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;newString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;newC_uint8_t&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;It then converts the primitive types. For example, &lt;code&gt;StringRef&lt;/code&gt; is converted to a Go string by &lt;code&gt;newString&lt;/code&gt;:&lt;/p&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;newString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s_ref&lt;/span&gt; &lt;span class="n"&gt;C&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StringRef&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;unsafeString&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s_ref&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ptr&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s_ref&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;unsafeString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ptr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;length&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;sliceHeader&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;reflect&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SliceHeader&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Data&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;uintptr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ptr&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
        &lt;span class="n"&gt;Len&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;Cap&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="n"&gt;length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="n"&gt;unsafe&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Pointer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sliceHeader&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;I will explain why Rust2go uses &lt;code&gt;XXXRef&lt;/code&gt; in the next section.&lt;/p&gt;
&lt;h2 id="Passing Variables Between Rust and Go"&gt;Passing Variables Between Rust and Go&lt;/h2&gt;
&lt;p&gt;In the previous section, I explained how rust2go understands structs in Rust and Go. Now, I will explain how it passes variables between the two languages.&lt;/p&gt;
&lt;h2 id="Passing Arguments to Go"&gt;Passing Arguments to Go&lt;/h2&gt;
&lt;p&gt;The simplest and most straightforward method is to use a serialization format such as Thrift or Protocol Buffers. Rust2go does not choose this method because serializing and deserializing the data on every call wastes CPU time.&lt;/p&gt;

&lt;p&gt;Instead, it passes arguments through pointers and converts the data into a representation that both Rust and Go understand. This avoids deep copies of values such as strings and binary data.&lt;/p&gt;

&lt;p&gt;This method adheres to Rust's safety rules because the arguments are only "borrowed" by Go while the memory remains "owned" by Rust. Once Go finishes using the data, it frees whatever memory it allocated itself, but it never frees the memory that Rust allocated for the arguments.&lt;/p&gt;
&lt;h2 id="Receiving Return Variables from Go"&gt;Receiving Return Variables from Go&lt;/h2&gt;
&lt;p&gt;Return values are created by Go, so Go may free them whenever it decides they are no longer needed. A Rust-to-Rust call does not have this problem because the caller takes ownership of the returned value, as in &lt;code&gt;let x = some_func();&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Rust2go handles this by copying the variable in the C callback so that Rust and Go can manage the "same" variable independently.&lt;/p&gt;
&lt;h2 id="Summary"&gt;Summary&lt;/h2&gt;
&lt;p&gt;This article provides an introduction to how rust2go works. For more details, please refer to the author's article[2].&lt;/p&gt;
&lt;h2 id="References"&gt;References&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href="https://github.com/ihciah/rust2go?tab=readme-ov-file" rel="nofollow" target="_blank" title=""&gt;https://github.com/ihciah/rust2go&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.ihcblog.com/rust2go/" rel="nofollow" target="_blank"&gt;https://en.ihcblog.com/rust2go/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://opentelemetry.io/status/" rel="nofollow" target="_blank"&gt;https://opentelemetry.io/status/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://metalbear.co/blog/hooking-go-from-rust-hitchhikers-guide-to-the-go-laxy/" rel="nofollow" target="_blank"&gt;https://metalbear.co/blog/hooking-go-from-rust-hitchhikers-guide-to-the-go-laxy/&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
      <author>yfractal</author>
      <pubDate>Mon, 24 Jun 2024 21:23:37 +0800</pubDate>
      <link>https://ruby-china.org/topics/43765</link>
      <guid>https://ruby-china.org/topics/43765</guid>
    </item>
    <item>
      <title>Memory Safety between Rust and Ruby — Making Ruby Allocated Memory Work in Rust</title>
      <description>&lt;h2 id="Introduction"&gt;Introduction&lt;/h2&gt;
&lt;p&gt;Breaking the barrier between different programming languages is both interesting and beneficial. For example, ByteDance developed rust2go[1] to facilitate migrating Golang projects to Rust smoothly[2].&lt;/p&gt;

&lt;p&gt;Importing Rust into Ruby can not only improve performance but also reduce tedious, repetitive work. For instance, Rust has implemented OpenTelemetry metrics, whereas Ruby hasn't. Wrapping the Rust implementation for use in Ruby can save a lot of effort. Ccache[3] is an experimental project exploring this direction.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/yfractal/ccache" rel="nofollow" target="_blank" title=""&gt;Ccache&lt;/a&gt; is an etag-based local cache that saves cache values in Arc, which are then queried by Ruby. It needs to consider how to handle memory efficiently between the two systems.&lt;/p&gt;

&lt;p&gt;This article introduces an idea about how to make Rust &lt;code&gt;Arc&lt;/code&gt; work between Rust and Ruby without copying.&lt;/p&gt;
&lt;h2 id="Background"&gt;Background&lt;/h2&gt;
&lt;p&gt;Programs allocate memory and, after usage, need to consider how to return that memory.&lt;/p&gt;

&lt;p&gt;The most straightforward way is managing memory manually, where programmers need to know when a variable can be freed and release it to the system. This is how C works; however, it's error-prone, and many memory bugs can be found in C programs.&lt;/p&gt;

&lt;p&gt;Rust improves this by providing ownership. Variables belong to a specific scope, and when execution goes out of that scope, Rust releases the memory. This makes memory management explicit in the code.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;Arc&lt;/code&gt; lets several owners share one value by keeping a reference count. Each new owner increases the count when it starts using the value and decreases it when it is finished. Rust releases the memory when the reference count reaches zero.&lt;/p&gt;
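&lt;p&gt;The counting rule can be sketched in a few lines of Ruby. This is only an illustration: &lt;code&gt;RefCounted&lt;/code&gt;, &lt;code&gt;acquire&lt;/code&gt;, &lt;code&gt;release&lt;/code&gt; and &lt;code&gt;on_free&lt;/code&gt; are hypothetical names, and the sketch is single-threaded, whereas a real &lt;code&gt;Arc&lt;/code&gt; updates its counter atomically.&lt;/p&gt;

```ruby
# Hypothetical sketch of reference counting; not how Rust implements Arc.
class RefCounted
  attr_reader :count

  # on_free is any callable; it runs once, when the last reference is released.
  def initialize(value, on_free)
    @value = value
    @on_free = on_free
    @count = 1 # the creator holds the first reference
  end

  # Comparable to cloning an Arc: one more owner shares the same value.
  def acquire
    @count += 1
    self
  end

  # Comparable to dropping an Arc: the last owner to leave frees the value.
  def release
    @count -= 1
    @on_free.call(@value) if @count.zero?
  end
end

freed = []
shared = RefCounted.new("payload", freed.method(:push))
other = shared.acquire # two owners now share the value
shared.release         # one owner left, the value stays alive
other.release          # no owners left, on_free runs
```

&lt;p&gt;The value is handed to &lt;code&gt;on_free&lt;/code&gt; only after both owners have released it, which is the property that breaks down once one of the owners is Ruby's GC rather than another counted reference.&lt;/p&gt;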

&lt;p&gt;Ruby's Garbage Collection (GC) allows programs to use memory without considering when to release it. The Ruby VM triggers GC when necessary, and during GC, it finds unreachable variables and returns their memory to the system or memory pool.&lt;/p&gt;

&lt;p&gt;The problem arises when we use Rust's &lt;code&gt;Arc&lt;/code&gt; and need to pass the value to the Ruby part.&lt;/p&gt;
&lt;h2 id="Producing the Problem"&gt;Producing the Problem&lt;/h2&gt;&lt;h2 id="An Example Using Rust Arc"&gt;An Example Using Rust &lt;code&gt;Arc&lt;/code&gt;
&lt;/h2&gt;&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;Store&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;hash_map&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;HashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Arc&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AnyObject&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="n"&gt;Store&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;Self&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;Store&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;hash_map&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nn"&gt;HashMap&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nd"&gt;wrappable_struct!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Store&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;StoreWrapper&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;STORE_WRAPPER&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nd"&gt;class!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RubyStore&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nd"&gt;methods!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;RubyStore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;rtself&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;ruby_new&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;AnyObject&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nn"&gt;Store&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="nn"&gt;Class&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from_existing&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"RubyStore"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.wrap_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;*&lt;/span&gt;&lt;span class="n"&gt;STORE_WRAPPER&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;ruby_insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;RString&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AnyObject&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;AnyObject&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;rbself&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rtself&lt;/span&gt;&lt;span class="nf"&gt;.get_data_mut&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;*&lt;/span&gt;&lt;span class="n"&gt;STORE_WRAPPER&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="n"&gt;rbself&lt;/span&gt;
            &lt;span class="py"&gt;.hash_map&lt;/span&gt;
            &lt;span class="nf"&gt;.insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="nf"&gt;.unwrap&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="nf"&gt;.to_string&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nn"&gt;Arc&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;obj&lt;/span&gt;&lt;span class="nf"&gt;.unwrap&lt;/span&gt;&lt;span class="p"&gt;()));&lt;/span&gt;
        &lt;span class="nn"&gt;NilClass&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="nf"&gt;.into&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;ruby_get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rb_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;RString&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;AnyObject&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;rbself&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rtself&lt;/span&gt;&lt;span class="nf"&gt;.get_data_mut&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;*&lt;/span&gt;&lt;span class="n"&gt;STORE_WRAPPER&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rb_key&lt;/span&gt;&lt;span class="nf"&gt;.unwrap&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="nf"&gt;.to_string&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rbself&lt;/span&gt;&lt;span class="py"&gt;.hash_map&lt;/span&gt;&lt;span class="nf"&gt;.get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.unwrap&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="nn"&gt;AnyObject&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="nf"&gt;.value&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nd"&gt;#[allow(non_snake_case)]&lt;/span&gt;
&lt;span class="nd"&gt;#[no_mangle]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;extern&lt;/span&gt; &lt;span class="s"&gt;"C"&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;Init_ruby_example&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nn"&gt;Class&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"RubyStore"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="nf"&gt;.define&lt;/span&gt;&lt;span class="p"&gt;(|&lt;/span&gt;&lt;span class="n"&gt;klass&lt;/span&gt;&lt;span class="p"&gt;|&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;klass&lt;/span&gt;&lt;span class="nf"&gt;.def_self&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"new"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ruby_new&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;klass&lt;/span&gt;&lt;span class="nf"&gt;.def&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"insert"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ruby_insert&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;klass&lt;/span&gt;&lt;span class="nf"&gt;.def&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"get"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ruby_get&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;Store&lt;/code&gt; struct wraps a simple &lt;code&gt;HashMap&lt;/code&gt; whose values are &lt;code&gt;Arc&lt;/code&gt;s of Ruby &lt;code&gt;AnyObject&lt;/code&gt;. The &lt;code&gt;Arc&lt;/code&gt; is there so the values can be shared in concurrent usage.&lt;/p&gt;

&lt;p&gt;And it seems to work well:&lt;/p&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s1"&gt;'works'&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;RubyStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;
  &lt;span class="n"&gt;foo&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"key"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;foo&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

  &lt;span class="nb"&gt;sleep&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;
  &lt;span class="no"&gt;GC&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;

  &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"key"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;class&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;eq&lt;/span&gt; &lt;span class="no"&gt;Foo&lt;/span&gt;
  &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"key"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;a&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;eq&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"key"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;b&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;eq&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id="The Segmentation Fault"&gt;The Segmentation Fault&lt;/h2&gt;
&lt;p&gt;The above example works because the created &lt;code&gt;Foo&lt;/code&gt; object is still referenced by the &lt;code&gt;foo&lt;/code&gt; variable, so its memory has not been freed.&lt;/p&gt;

&lt;p&gt;To trigger a segmentation fault or other memory issues, we create the &lt;code&gt;Foo&lt;/code&gt; object inline and pass it directly to &lt;code&gt;RubyStore&lt;/code&gt;, so that no Ruby variable references it after &lt;code&gt;insert&lt;/code&gt; returns.&lt;/p&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="s1"&gt;'has memory issues :('&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;store&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;RubyStore&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;
  &lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"key"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;Foo&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

  &lt;span class="nb"&gt;sleep&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;
  &lt;span class="no"&gt;GC&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;

  &lt;span class="n"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"key"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;class&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;to&lt;/span&gt; &lt;span class="n"&gt;eq&lt;/span&gt; &lt;span class="no"&gt;Foo&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This time the test crashes with a segmentation fault.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/4032a1d5-433c-47c2-954d-498837df983d.png!large" title="" alt=""&gt;&lt;/p&gt;
&lt;h2 id="A Simple Solution"&gt;A Simple Solution&lt;/h2&gt;
&lt;p&gt;We can avoid this by deep cloning the &lt;code&gt;Arc&lt;/code&gt;’s value; however, it is not zero cost. Serializing a large object may take more than 1ms in Ruby[3]. To improve performance, we need to consider other methods.&lt;/p&gt;

&lt;p&gt;To work around this situation, one option is to let Rust allocate the memory and free it through &lt;code&gt;drop&lt;/code&gt;. However, this means Rust needs to figure out whether the allocated memory is being used by Ruby, which is not feasible. Thus, memory must be allocated by Ruby.&lt;/p&gt;

&lt;p&gt;Rust's ownership model is a useful guide here because it makes each owner responsible for its own resources. We need to consider the responsibilities of Ruby and Rust. The memory is allocated by Ruby, so Ruby has the duty to release it. Rust uses the memory but doesn’t own it. Thus, we can still use &lt;code&gt;Arc&lt;/code&gt;; the only difference is that we do not free the underlying memory when the &lt;code&gt;Arc&lt;/code&gt; is dropped.&lt;/p&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;RubyObject&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nn"&gt;rutie&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;types&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;impl&lt;/span&gt; &lt;span class="nb"&gt;Drop&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;RubyObject&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Intentionally a no-op: the memory is reclaimed by Ruby's GC&lt;/span&gt;
    &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;drop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;That covers Rust's side, so we need to consider Ruby's job now. Ruby allocated the object, so Ruby must eventually free it. Because it also passes the object to Rust, it needs to record that fact so the GC does not reclaim the object while Rust still uses it.&lt;/p&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="n"&gt;klass&lt;/span&gt;&lt;span class="nf"&gt;.def&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"insert_inner"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ruby_insert&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;RubyStore&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="vi"&gt;@_val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;
    &lt;span class="n"&gt;insert_inner&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;val&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
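&lt;p&gt;The value is assigned to the instance variable &lt;code&gt;@_val&lt;/code&gt;, making it reachable through the &lt;code&gt;RubyStore&lt;/code&gt; instance. This prevents Ruby's GC from reclaiming its memory. When the key is deleted, we can set &lt;code&gt;@_val = nil&lt;/code&gt; to “free” its memory. Note that a single instance variable only retains the most recently inserted value, so this is enough for the demo but not for a store that holds several keys.&lt;/p&gt;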
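&lt;p&gt;Because &lt;code&gt;@_val&lt;/code&gt; is overwritten on every insert, a store with more than one key would need one pinned reference per key. A minimal sketch of that idea follows. &lt;code&gt;PinnedStore&lt;/code&gt; and its &lt;code&gt;@inner&lt;/code&gt; Hash are hypothetical stand-ins for &lt;code&gt;RubyStore&lt;/code&gt; and the Rust-backed methods, so the block runs without the native extension. In this stand-in &lt;code&gt;@inner&lt;/code&gt; would itself keep the objects alive; with the real Rust store, only &lt;code&gt;@pinned&lt;/code&gt; is visible to Ruby's GC.&lt;/p&gt;

```ruby
# Hypothetical sketch: PinnedStore and @inner stand in for RubyStore and
# the Rust-backed insert_inner/get methods from the example above.
class PinnedStore
  def initialize
    @pinned = {} # key to value; keeps every stored object reachable
    @inner = {}  # placeholder for the Rust HashMap
  end

  def insert(key, val)
    @pinned[key] = val # pin before handing the object to the native side
    @inner[key] = val
    nil
  end

  def get(key)
    @inner[key]
  end

  def delete(key)
    @inner.delete(key)
    @pinned.delete(key) # unpin last, so Ruby's GC may reclaim the object
    nil
  end

  def pinned_count
    @pinned.size
  end
end

store = PinnedStore.new
store.insert("a", Object.new)
store.insert("b", Object.new)
GC.start
```

&lt;p&gt;Overwriting a key replaces its pinned reference, and deleting a key drops it, so the set of pinned objects always matches the keys held on the native side.&lt;/p&gt;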

&lt;p&gt;Now, everything works well.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/7f53ffa0-9db1-4415-b90d-652e1caff48a.png!large" title="" alt=""&gt;&lt;/p&gt;
&lt;h2 id="Discussion"&gt;Discussion&lt;/h2&gt;
&lt;p&gt;Clearly, the current solution is far from ideal. To improve it, we can use a doubly linked list, held in a Ruby variable, to save every object referenced by an &lt;code&gt;Arc&lt;/code&gt;. When &lt;code&gt;Drop&lt;/code&gt; is called, instead of doing nothing, it can remove the corresponding reference from the doubly linked list.&lt;/p&gt;
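&lt;p&gt;One possible shape for that bookkeeping is sketched below. It is an assumption, not code from the demo: &lt;code&gt;PinRegistry&lt;/code&gt;, &lt;code&gt;pin&lt;/code&gt; and &lt;code&gt;unpin&lt;/code&gt; are hypothetical names, and a Hash keyed by an integer handle is swapped in for the doubly linked list, since it also removes an entry in constant time on average.&lt;/p&gt;

```ruby
# Hypothetical sketch: PinRegistry, pin and unpin are illustrative names.
# A Hash keyed by handle replaces the doubly linked list described above.
class PinRegistry
  def initialize
    @objects = {}
    @next_handle = 0
  end

  # Called on insert: keeps obj reachable and returns a handle for Rust to store.
  def pin(obj)
    @next_handle += 1
    @objects[@next_handle] = obj
    @next_handle
  end

  # Assumed to be called from the Rust Drop implementation.
  def unpin(handle)
    @objects.delete(handle)
    nil
  end

  def size
    @objects.size
  end
end

registry = PinRegistry.new
first = registry.pin(Object.new)
second = registry.pin(Object.new)
registry.unpin(first) # as if the matching Arc had been dropped
```

&lt;p&gt;With this layout the Rust side would only need to remember the integer handle next to each value and pass it back when the value is dropped.&lt;/p&gt;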

&lt;p&gt;With this solution, Rust depends on Ruby's GC to reclaim memory, which requires Ruby’s cooperation. Rust doesn’t trust programmers completely, so by Rust's standards the solution isn’t strictly safe. However, Ruby works in another way; it assumes programmers can do the right things (though they often don’t). Thus, this solution is acceptable for Ruby. To make it safer and cleaner, we can handle the reference bookkeeping in Rust code.&lt;/p&gt;

&lt;p&gt;Another interesting direction is to make ownership work in Ruby, not only for &lt;code&gt;Arc&lt;/code&gt; but also for &lt;code&gt;mutable/immutable&lt;/code&gt; and &lt;code&gt;lock&lt;/code&gt; usage. This could make the interaction smoother and make Ruby safer.&lt;/p&gt;
&lt;h2 id="Summary"&gt;Summary&lt;/h2&gt;
&lt;p&gt;This article discussed a method to integrate Rust's &lt;code&gt;Arc&lt;/code&gt; into Ruby, ensuring memory safety without the need for deep cloning.
You can find the whole example in rust_arc_demo[4] and a use case in &lt;a href="https://github.com/yfractal/ccache" rel="nofollow" target="_blank" title=""&gt;Ccache&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id="References"&gt;References&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;&lt;a href="https://github.com/ihciah/rust2go" rel="nofollow" target="_blank"&gt;https://github.com/ihciah/rust2go&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://en.ihcblog.com/rust2go/" rel="nofollow" target="_blank"&gt;https://en.ihcblog.com/rust2go/&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/yfractal/ccache" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/ccache&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/yfractal/rust_arc_demo" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/rust_arc_demo&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;</description>
      <author>yfractal</author>
      <pubDate>Sat, 08 Jun 2024 17:55:42 +0800</pubDate>
      <link>https://ruby-china.org/topics/43728</link>
      <guid>https://ruby-china.org/topics/43728</guid>
    </item>
    <item>
      <title>eBPF USDT in Rust</title>
      <description>&lt;p&gt;I wrote an &lt;a href="https://github.com/yfractal/blog/issues/15" rel="nofollow" target="_blank" title=""&gt;article&lt;/a&gt; on how to use USDT in Rust.&lt;/p&gt;

&lt;p&gt;Here is an example that uses USDT: &lt;a href="https://github.com/yfractal/ccache/pull/7/files" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/ccache/pull/7/files&lt;/a&gt;&lt;/p&gt;</description>
      <author>yfractal</author>
      <pubDate>Wed, 22 May 2024 22:42:51 +0800</pubDate>
      <link>https://ruby-china.org/topics/43702</link>
      <guid>https://ruby-china.org/topics/43702</guid>
    </item>
    <item>
      <title>Is Redis Really That Fast?</title>
      <description>&lt;p&gt;This mainly explains why Redis is fast, and why Redis is still not fast enough.&lt;/p&gt;

&lt;p&gt;I can't upload screenshots at the moment, so I'm posting a link for now. I'll move the content here once image uploads work.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.zhihu.com/question/563209865/answer/2736513961" rel="nofollow" target="_blank"&gt;https://www.zhihu.com/question/563209865/answer/2736513961&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I have recently been doing some &lt;a href="https://github.com/yfractal/blog/issues/11" rel="nofollow" target="_blank" title=""&gt;research&lt;/a&gt; on in-memory caches, which is what led to this Zhihu answer.&lt;/p&gt;</description>
      <author>yfractal</author>
      <pubDate>Sun, 30 Oct 2022 12:48:50 +0800</pubDate>
      <link>https://ruby-china.org/topics/42715</link>
      <guid>https://ruby-china.org/topics/42715</guid>
    </item>
    <item>
      <title>Does anyone want to talk about pitchfork, Shopify's new app server?</title>
      <description>&lt;p&gt;Related resources&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GitHub &lt;a href="https://github.com/Shopify/pitchfork" rel="nofollow" target="_blank"&gt;https://github.com/Shopify/pitchfork&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Discussion on Reddit &lt;a href="https://www.reddit.com/r/ruby/comments/xwcvty/a_new_web_server_github_shopifypitchfork/" rel="nofollow" target="_blank"&gt;https://www.reddit.com/r/ruby/comments/xwcvty/a_new_web_server_github_shopifypitchfork/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Twitter &lt;a href="https://nitter.net/_byroot/status/1577647475686514689" rel="nofollow" target="_blank"&gt;https://nitter.net/_byroot/status/1577647475686514689&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;</description>
      <author>yfractal</author>
      <pubDate>Sun, 09 Oct 2022 10:18:50 +0800</pubDate>
      <link>https://ruby-china.org/topics/42687</link>
      <guid>https://ruby-china.org/topics/42687</guid>
    </item>
    <item>
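      <title>JeMalloc Resources</title>
      <description>&lt;h2 id="Background"&gt;Background&lt;/h2&gt;
&lt;p&gt;I have recently been curious about how JeMalloc is implemented, so I read some related material as well as the source code. Below is a collection of those resources.&lt;/p&gt;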
&lt;h2 id="Resources"&gt;Resources&lt;/h2&gt;&lt;h3 id="General"&gt;General&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;Pseudomonarchia jemallocum

&lt;ul&gt;
&lt;li&gt;link: &lt;a href="http://phrack.org/issues/68/10.html" rel="nofollow" target="_blank"&gt;http://phrack.org/issues/68/10.html&lt;/a&gt;
&lt;/li&gt;
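&lt;li&gt;Although it focuses on JeMalloc security issues, it analyzes the JeMalloc implementation very well and is a high-quality technical article. The only drawback is that the version it describes is fairly old.&lt;/li&gt;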
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;A Scalable Concurrent malloc(3) Implementation for FreeBSD

&lt;ul&gt;
&lt;li&gt;link: &lt;a href="https://people.freebsd.org/~jasone/jemalloc/bsdcan2006/jemalloc.pdf" rel="nofollow" target="_blank"&gt;https://people.freebsd.org/~jasone/jemalloc/bsdcan2006/jemalloc.pdf&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;A paper by Jason Evans, the author of JeMalloc.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Scalable memory allocation using jemalloc

&lt;ul&gt;
&lt;li&gt;link: &lt;a href="https://www.facebook.com/notes/10158791475077200/" rel="nofollow" target="_blank"&gt;https://www.facebook.com/notes/10158791475077200/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;A Facebook engineering blog post; it covers JeMalloc 2.1.0.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;How JeMalloc Works

&lt;ul&gt;
&lt;li&gt;link: &lt;a href="https://github.com/yfractal/blog/blob/master/blog/2022-10-05-jemalloc.md" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/blog/blob/master/blog/2022-10-05-jemalloc.md&lt;/a&gt;
&lt;/li&gt;
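&lt;li&gt;My own summary. It focuses on how JeMalloc allocates and frees memory and on the relationships among its structs, and can serve as a general overview of JeMalloc's mechanisms. It covers version 5.2.1.&lt;/li&gt;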
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id="源码分析"&gt;Source Code Analysis&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;JeMalloc 源码分析 (JeMalloc Source Code Analysis)

&lt;ul&gt;
&lt;li&gt;link: &lt;a href="https://youjiali1995.github.io/allocator/jemalloc/" rel="nofollow" target="_blank"&gt;https://youjiali1995.github.io/allocator/jemalloc/&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Very detailed; the version covered is probably 5.0.1.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;JeMalloc

&lt;ul&gt;
&lt;li&gt;link: &lt;a href="https://zhuanlan.zhihu.com/p/48957114" rel="nofollow" target="_blank"&gt;https://zhuanlan.zhihu.com/p/48957114&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Covers version 5.1.0.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id="Purging 相关"&gt;Purging&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;Tick Tock, malloc Needs a Clock

&lt;ul&gt;
&lt;li&gt;link:&lt;span class="embed-responsive embed-responsive-16by9"&gt;&lt;iframe class="embed-responsive-item" src="//www.youtube.com/embed/RcWp5vwGlYU" allowfullscreen=""&gt;&lt;/iframe&gt;&lt;/span&gt;
&lt;/li&gt;
&lt;li&gt;A talk on purging by Jason Evans, the author of JeMalloc.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Implement decay-based unused dirty page purging

&lt;ul&gt;
&lt;li&gt;link: &lt;a href="https://github.com/jemalloc/jemalloc/issues/325" rel="nofollow" target="_blank"&gt;https://github.com/jemalloc/jemalloc/issues/325&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;The design rationale behind decay-based purging.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Implement two-phase purging

&lt;ul&gt;
&lt;li&gt;link: &lt;a href="https://github.com/jemalloc/jemalloc/issues/521" rel="nofollow" target="_blank"&gt;https://github.com/jemalloc/jemalloc/issues/521&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;jemalloc purge 改进 (jemalloc purge improvements)

&lt;ul&gt;
&lt;li&gt;link: &lt;a href="https://youjiali1995.github.io/allocator/jemalloc-purge/" rel="nofollow" target="_blank"&gt;https://youjiali1995.github.io/allocator/jemalloc-purge/&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id="调试"&gt;Debugging&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;使用 GDB 查看 JeMalloc 内存布局 (Inspecting the JeMalloc memory layout with GDB)

&lt;ul&gt;
&lt;li&gt; link:  &lt;a href="https://blog.csdn.net/hl09083253cy/article/details/79147625" rel="nofollow" target="_blank"&gt;https://blog.csdn.net/hl09083253cy/article/details/79147625&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;jemalloc heap exploitation framework

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/CENSUS/shadow" rel="nofollow" target="_blank"&gt;https://github.com/CENSUS/shadow&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3 id="相关"&gt;Related&lt;/h3&gt;
&lt;ol&gt;
&lt;li&gt;ptmalloc、tcmalloc 与 jemalloc 对比分析 (A comparison of ptmalloc, tcmalloc and jemalloc)

&lt;ul&gt;
&lt;li&gt;link: &lt;a href="https://www.cyningsun.com/07-07-2018/memory-allocator-contrasts.html" rel="nofollow" target="_blank"&gt;https://www.cyningsun.com/07-07-2018/memory-allocator-contrasts.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;TCMalloc : Thread-Caching Malloc

&lt;ul&gt;
&lt;li&gt;link: &lt;a href="https://google.github.io/tcmalloc/design.html" rel="nofollow" target="_blank"&gt;https://google.github.io/tcmalloc/design.html&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;</description>
      <author>yfractal</author>
      <pubDate>Sat, 08 Oct 2022 15:09:44 +0800</pubDate>
      <link>https://ruby-china.org/topics/42685</link>
      <guid>https://ruby-china.org/topics/42685</guid>
    </item>
    <item>
      <title>Share HashMap for Different Processes by `mmap` in Ruby</title>
      <description>&lt;h2 id="Why"&gt;Why&lt;/h2&gt;
&lt;p&gt;A cache in the application can reduce both latency and network usage. Such a cache can replace Redis in some situations too. Erlang's ETS is a really good example.&lt;/p&gt;

&lt;p&gt;But as we know, most Rails applications are deployed in cluster mode, with three or more processes per cluster, and data held by one process can't normally be accessed by another.&lt;/p&gt;
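
&lt;p&gt;The isolation is easy to see with a plain Ruby &lt;code&gt;Hash&lt;/code&gt;. The following is a minimal sketch, not code from the project: the method name &lt;code&gt;parent_view_after_fork&lt;/code&gt; is made up for illustration, and it assumes a platform where &lt;code&gt;Process.fork&lt;/code&gt; is available (for example Linux or macOS). The child's write lands in its own copy of the hash, so the parent only ever sees its own entry.&lt;/p&gt;

```ruby
# Returns what the parent sees in a plain Hash after a forked child writes to it.
# Hypothetical helper for illustration; requires a platform that supports fork.
def parent_view_after_fork
  cache = {}

  pid = Process.fork do
    cache[1] = 10 # changes only the child's private copy of the hash
    exit!(0)      # leave the child immediately, skipping at_exit handlers
  end
  Process.wait(pid)

  cache[2] = 20 # insert in the parent process
  cache
end

p parent_view_after_fork
```

&lt;p&gt;The returned hash should hold only key &lt;code&gt;2&lt;/code&gt;; the child's entry for key &lt;code&gt;1&lt;/code&gt; disappears when the child exits.&lt;/p&gt;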

&lt;p&gt;If each process has its own cache, memory is wasted and the cache hit rate drops.&lt;/p&gt;

&lt;p&gt;So we need a HashMap that can be accessed by different processes.&lt;/p&gt;

&lt;p&gt;We need something that works like:&lt;/p&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="n"&gt;_pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fork&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# insert in child process&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="n"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# insert in parent process&lt;/span&gt;
&lt;span class="no"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wait&lt;/span&gt;

&lt;span class="nb"&gt;display&lt;/span&gt; &lt;span class="c1"&gt;# should display the 2 elements&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;&lt;h2 id="How it works"&gt;How it works&lt;/h2&gt;
&lt;p&gt;As we know, a program accesses physical memory through virtual memory, and the mapping from virtual to physical memory is managed by the operating system.&lt;/p&gt;

&lt;p&gt;So we can use &lt;code&gt;mmap&lt;/code&gt; to map the same physical memory to different processes.&lt;/p&gt;

&lt;p&gt;&lt;img src="https://l.ruby-china.com/photo/yfractal/7c83355b-d13f-43d3-a365-379b083e7df5.png!large" title="" alt=""&gt;&lt;/p&gt;

&lt;p&gt;The code is simple:&lt;/p&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;create_shared_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;size_t&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;protection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;PROT_READ&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;PROT_WRITE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;visibility&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;MAP_SHARED&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;MAP_ANONYMOUS&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;mmap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;size&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;protection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;visibility&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then we need to allocate shared memory for both the array of pointers (the HashMap is really just an array) and the array data:&lt;/p&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;DataItem&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;hashArray&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;DataItem&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;create_shared_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;SIZE&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;dataArea&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;create_shared_memory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;DataItem&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;SIZE&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Then we can insert an item into the array:&lt;/p&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;DataItem&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;DataItem&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;char&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="n"&gt;dataArea&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="k"&gt;sizeof&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="n"&gt;DataItem&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;hashIndex&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// use the shared memory; cast to char * so the byte offset is standard C&lt;/span&gt;
&lt;span class="n"&gt;hashArray&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;hashIndex&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
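&lt;p&gt;The same slot arithmetic can be sketched in Ruby, with a binary string standing in for the shared data area. This is an illustrative sketch under assumptions that are not from the project: the names &lt;code&gt;SLOT_COUNT&lt;/code&gt;, &lt;code&gt;SLOT_BYTES&lt;/code&gt;, &lt;code&gt;slot_index&lt;/code&gt;, &lt;code&gt;insert_item&lt;/code&gt; and &lt;code&gt;fetch_item&lt;/code&gt; are hypothetical, the hash function is a plain modulo, each item is assumed to be two 64-bit integers, collisions are not handled, and an all-zero slot cannot be told apart from key 0.&lt;/p&gt;

```ruby
SLOT_COUNT = 8   # hypothetical table size, plays the role of SIZE
SLOT_BYTES = 16  # assumed item layout: 64-bit key followed by 64-bit value

# Map a key to a slot number (assumed hash function: plain modulo).
def slot_index(key)
  key % SLOT_COUNT
end

# Write the item at byte offset slot * SLOT_BYTES,
# mirroring dataArea + sizeof(item) * hashIndex in the C code.
def insert_item(area, key, value)
  area[slot_index(key) * SLOT_BYTES, SLOT_BYTES] = [key, value].pack("q2")
end

# Read the slot back; returns nil when the slot holds a different key.
def fetch_item(area, key)
  stored_key, value = area[slot_index(key) * SLOT_BYTES, SLOT_BYTES].unpack("q2")
  stored_key == key ? value : nil
end

# A zero-filled binary buffer stands in for the mmap'ed region.
area = ("\0" * (SLOT_COUNT * SLOT_BYTES)).b
insert_item(area, 1, 10)
insert_item(area, 2, 20)
p fetch_item(area, 1)
p fetch_item(area, 2)
```

&lt;p&gt;Because every item lives at a fixed offset inside one flat region, no pointer into a process-private heap is needed, which is what makes the layout suitable for memory shared between processes.&lt;/p&gt;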
&lt;p&gt;Ruby makes it easy to write a C extension:&lt;/p&gt;
&lt;pre class="highlight c"&gt;&lt;code&gt;&lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;Init_extension&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="n"&gt;CFromRubyExample&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rb_define_module&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"CacheRb"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="n"&gt;NativeHelpers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rb_define_class_under&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CFromRubyExample&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"NativeHelpers"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rb_cObject&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="n"&gt;rb_define_singleton_method&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;NativeHelpers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"insert"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rb_insert&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
   &lt;span class="c1"&gt;// ......&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now we can test it:&lt;/p&gt;
&lt;pre class="highlight ruby"&gt;&lt;code&gt;&lt;span class="no"&gt;CacheRb&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;NativeHelpers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;init&lt;/span&gt;

&lt;span class="n"&gt;_pid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fork&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="no"&gt;CacheRb&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;NativeHelpers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="no"&gt;CacheRb&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;NativeHelpers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="no"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;wait&lt;/span&gt;

&lt;span class="no"&gt;CacheRb&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;NativeHelpers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;display&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After &lt;code&gt;display&lt;/code&gt; is executed, we can see that the hash has two elements, &lt;code&gt;1 =&amp;gt; 10&lt;/code&gt; and &lt;code&gt;2 =&amp;gt; 20&lt;/code&gt;: one inserted by the child process and one by the parent process.&lt;/p&gt;

&lt;p&gt;All the code is at &lt;a href="https://github.com/yfractal/cache_rb" rel="nofollow" target="_blank"&gt;https://github.com/yfractal/cache_rb&lt;/a&gt;. You can compile it with &lt;code&gt;rake compile&lt;/code&gt; and run &lt;code&gt;CacheRb.demo&lt;/code&gt; in the bundle console.&lt;/p&gt;
&lt;h2 id="What's the next?"&gt;What's next?&lt;/h2&gt;
&lt;p&gt;This is just a simple, even silly, example to prove that the idea works.&lt;/p&gt;

&lt;p&gt;To make it useful, I will find or write a good hash map and handle memory allocation and freeing properly in the coming days.&lt;/p&gt;</description>
      <author>yfractal</author>
      <pubDate>Sun, 28 Aug 2022 19:21:41 +0800</pubDate>
      <link>https://ruby-china.org/topics/42616</link>
      <guid>https://ruby-china.org/topics/42616</guid>
    </item>
    <item>
      <title>MIT 6.824 Distributed Systems Reading Notes</title>
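      <description>&lt;h2 id="阅读笔记"&gt;Reading Notes&lt;/h2&gt;
&lt;p&gt;I worked through the Distributed Systems course on and off for a long time, and I have finally finished it.&lt;/p&gt;

&lt;p&gt;Unlike other courses, where I read the material and did the projects, this time I read the material and took notes. Partly I wasn't that interested in the projects, but the main reason is laziness... I will finish the projects later when I get the chance.&lt;/p&gt;

&lt;p&gt;Here are the notes:&lt;/p&gt;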

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/yfractal/blog/issues/8#issuecomment-1100841630" rel="nofollow" target="_blank" title=""&gt;Scaling Memcache at Facebook Reading Note&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/yfractal/blog/issues/8#issuecomment-1115903733" rel="nofollow" target="_blank" title=""&gt;The Google File System Reading Note&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/yfractal/blog/issues/8#issuecomment-1140428705" rel="nofollow" target="_blank" title=""&gt;Amazon Aurora Reading Note&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/yfractal/blog/issues/8#issuecomment-1193444846" rel="nofollow" target="_blank" title=""&gt;Spark Paper Reading Note&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/yfractal/blog/issues/8#issuecomment-1207530839" rel="nofollow" target="_blank" title=""&gt;COPS Reading Note&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/yfractal/blog/issues/8#issuecomment-1225041302" rel="nofollow" target="_blank" title=""&gt;Google Spanner Reading Note&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
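&lt;h2 id="有的没的"&gt;Odds and Ends&lt;/h2&gt;&lt;h3 id="Why"&gt;Why&lt;/h3&gt;
&lt;p&gt;I have long been interested in concurrency, and my work also calls for a better understanding of architecture.&lt;/p&gt;

&lt;p&gt;At my previous job, the backend architecture was reworked several times over a few years: from an initial design that was fairly reasonable and met modest performance requirements, to partial refactoring in the middle period, to a full refactoring later on to meet higher performance and scalability requirements.&lt;/p&gt;

&lt;p&gt;These were all architectural changes, and watching different people build the same project in different ways was a lot of fun.&lt;/p&gt;

&lt;p&gt;The MIT course mainly walks through papers, from MapReduce to Spanner, and on to Bitcoin and Blockstack.&lt;/p&gt;

&lt;p&gt;Concrete things are more fun than abstract ones. DDIA, for example, explains the principles behind data-intensive applications very well, but they are not easy to remember.&lt;/p&gt;

&lt;p&gt;Papers are much more fun: each one tackles a concrete problem, and you can compare it side by side with similar problems.&lt;/p&gt;

&lt;p&gt;I really like the instructor of this course, who always manages to describe the essence of a problem in the simplest terms.&lt;/p&gt;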
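&lt;h3 id="感受"&gt;Takeaways&lt;/h3&gt;
&lt;p&gt;The biggest takeaway is a broader perspective. After finishing the course, I have at least a rough picture of how backends are built, so I won't be at a loss for words when shooting the breeze.&lt;/p&gt;

&lt;p&gt;Distributed systems are also fun: there is no perfect solution, but you can make trade-offs, for example among performance, availability, and consistency.&lt;/p&gt;

&lt;p&gt;You can see how software evolves, for example from the early MapReduce to the later Spark.&lt;/p&gt;

&lt;p&gt;You can also see different approaches: the Facebook Memcache cluster, which gave consistency a lower priority from the start; Spanner, which chose strong consistency; and COPS, which chose causal consistency.&lt;/p&gt;

&lt;p&gt;The backend is fun. Whether it is engineering problems or pure technology and theory, interesting things keep turning up.&lt;/p&gt;</description>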
      <author>yfractal</author>
      <pubDate>Wed, 24 Aug 2022 13:58:36 +0800</pubDate>
      <link>https://ruby-china.org/topics/42609</link>
      <guid>https://ruby-china.org/topics/42609</guid>
    </item>
  </channel>
</rss>
