<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://wiyi.org/feed.xml" rel="self" type="application/atom+xml" /><link href="https://wiyi.org/" rel="alternate" type="text/html" /><updated>2026-02-05T07:28:11+00:00</updated><id>https://wiyi.org/feed.xml</id><title type="html">kikcat</title><subtitle>coding for fun
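</subtitle><author><name>kikcat</name></author><entry><title type="html">High-Concurrency Inventory Deduction in E-commerce Systems</title><link href="https://wiyi.org/inventory-system.html" rel="alternate" type="text/html" title="High-Concurrency Inventory Deduction in E-commerce Systems" /><published>2026-01-31T00:00:00+00:00</published><updated>2026-01-31T00:00:00+00:00</updated><id>https://wiyi.org/inventory</id><content type="html" xml:base="https://wiyi.org/inventory-system.html"><![CDATA[<p>High-concurrency inventory deduction is a classic hard problem in e-commerce systems. Plenty of articles cover it, but few explain it systematically or offer a design that can actually be put into production. I recently discussed the topic with a friend and learned a lot, so while it is still fresh in my mind, this article works through it systematically.</p>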

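<h2 id="前提">Prerequisites</h2>

<p>This article assumes the following setup:</p>

<ul>
  <li>The backend is distributed: the order service and the goods service are separate services deployed on different nodes.</li>
  <li>MySQL is the database.</li>
  <li>Redis is the in-memory data store.</li>
</ul>

<p>Familiarity with the following topics will make this article easier to follow:</p>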

<ul>
  <li>Event Sourcing</li>
  <li>Atomic operations with Redis Lua scripts</li>
  <li>The basic principles of Redis locks</li>
</ul>

<!--more-->

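<h2 id="基于数据库的库存扣减">Database-based inventory deduction</h2>

<h3 id="数据库锁">Database locks</h3>

<p>I once saw a claim on a forum that inventory deduction cannot be implemented directly on a database. That is a misconception. Database locks are perfectly capable of it, for example:</p>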

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">update</span> <span class="n">goods</span> <span class="k">set</span> <span class="n">quantity</span> <span class="o">=</span> <span class="n">quantity</span> <span class="o">-</span> <span class="mi">1</span> <span class="k">where</span> <span class="n">id</span> <span class="o">=</span> <span class="err">$</span><span class="p">{</span><span class="n">id</span><span class="p">}</span> <span class="k">and</span> <span class="n">quantity</span> <span class="o">&gt;</span> <span class="mi">0</span>
</code></pre></div></div>

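<p>The UPDATE statement above is atomic. From the affected-row count the database returns, you can tell whether the update succeeded and then commit or roll back accordingly. So why do so many people advise against relying on database locks directly? Because it is very inefficient: under high concurrency it can easily bring the system to a standstill.</p>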
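<p>The affected-row check can be sketched as follows. This is a minimal illustration that uses Python's built-in <code class="language-plaintext highlighter-rouge">sqlite3</code> module as a stand-in for MySQL; the <code class="language-plaintext highlighter-rouge">deduct_one</code> helper and the table layout are assumptions made for the example.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import sqlite3

def deduct_one(conn, goods_id):
    """Try to deduct one unit; return True only if a row was actually updated."""
    cur = conn.execute(
        "UPDATE goods SET quantity = quantity - 1 WHERE id = ? AND quantity > 0",
        (goods_id,),
    )
    if cur.rowcount == 1:
        conn.commit()    # stock was available and has been decremented
        return True
    conn.rollback()      # no row matched: sold out or unknown id
    return False

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE goods (id INTEGER PRIMARY KEY, quantity INTEGER NOT NULL)")
conn.execute("INSERT INTO goods (id, quantity) VALUES (1, 2)")
conn.commit()

# Three attempts against a stock of two: the third one must fail.
results = [deduct_one(conn, 1) for _ in range(3)]
remaining = conn.execute("SELECT quantity FROM goods WHERE id = 1").fetchone()[0]
print(results, remaining)
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">quantity > 0</code> predicate is what prevents overselling: once stock reaches zero the statement matches no row, and the caller sees an affected-row count of 0.</p>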

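<h3 id="数据库瓶颈">Database bottlenecks</h3>

<ul>
  <li>
    <p>Under heavy contention, transactions are serialized, and concurrency becomes a burden rather than a benefit (thread context switches and so on).</p>

    <p>The X lock on a given row can be held by only one transaction; all the others join the lock wait queue, so throughput degrades to "a single thread, bounded by how long each transaction holds the lock".</p>
  </li>
  <li>
    <p>MySQL's deadlock detection adds overhead.</p>
  </li>
  <li>
    <p>Database reads and writes are disk I/O.</p>

    <ul>
      <li>Reads and writes take milliseconds; combined with serialized transactions, throughput is capped by the execution speed of a single transaction.</li>
      <li>Write amplification: writing one row also means writing indexes, the undo log, and the redo log.</li>
    </ul>
  </li>
  <li>
    <p>Connection limits</p>

    <p>A database usually allows only a modest number of connections and refuses service once the limit is exceeded.</p>
  </li>
</ul>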

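<p>For an OLTP system, latency is the most important metric. If a flood of concurrent writes hits the database, serialized transactions make requests wait too long, the connections are exhausted quickly, and the database can no longer serve requests. That is clearly unacceptable.</p>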

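<h2 id="基于redis的分布式锁">Distributed locks based on Redis</h2>

<h3 id="锁的粒度">Lock granularity</h3>

<p>Redis is an in-memory data store; individual operations have very low latency, so it can sustain high concurrency. Redis also provides the SET NX form:</p>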

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">set</span> <span class="s2">"global_lock"</span> <span class="s2">"9a00ca3e-cb19-4cce-9f5c-2e94d5dce7c2"</span> nx
</code></pre></div></div>

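<p>With nx, the command above succeeds only if the key <code class="language-plaintext highlighter-rouge">global_lock</code> does not already exist. This is the foundation of Redis locks: the result of SET NX tells you whether the lock is already held.</p>

<p>Redis locks are the solution seen most often online. The execution path looks roughly like this:</p>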

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">createOrder</span><span class="p">():</span>
  <span class="k">with</span> <span class="n">redis</span><span class="p">.</span><span class="n">lock</span><span class="p">(</span><span class="s">"global_lock"</span><span class="p">)</span> <span class="k">as</span> <span class="n">lock</span><span class="p">:</span>
    <span class="n">successful</span> <span class="o">=</span> <span class="n">goods</span><span class="p">.</span><span class="n">updateStock</span><span class="p">(</span><span class="nb">id</span><span class="p">,</span> <span class="n">quantity</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">successful</span><span class="p">:</span>
      <span class="n">order</span><span class="p">.</span><span class="n">create</span><span class="p">()</span>
  
</code></pre></div></div>

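<p>A Redis lock intercepts most of the concurrent requests and keeps them from flooding the database and exhausting its connections. As you may have noticed, the <code class="language-plaintext highlighter-rouge">global_lock</code> above is too coarse, and throughput actually drops: unrelated SKUs compete for the same lock, which serializes deduction across all SKUs. What if we lock at the SKU level instead?</p>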

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">createOrder</span><span class="p">():</span>
  <span class="k">with</span> <span class="n">redis</span><span class="p">.</span><span class="n">lock</span><span class="p">(</span><span class="n">sku_id</span><span class="p">)</span> <span class="k">as</span> <span class="n">lock</span><span class="p">:</span>
    <span class="n">successful</span> <span class="o">=</span> <span class="n">goods</span><span class="p">.</span><span class="n">updateStock</span><span class="p">(</span><span class="nb">id</span><span class="p">,</span> <span class="n">quantity</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">successful</span><span class="p">:</span>
      <span class="n">order</span><span class="p">.</span><span class="n">create</span><span class="p">()</span>
</code></pre></div></div>

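<p>Locking per SKU eases the throughput problem somewhat, but a careful look shows that it does not solve it: for a hot item (that is, a single SKU), <strong>transactions still degrade to a single thread</strong>. Large numbers of concurrent requests wait for the lock holder's transaction to release, and throughput stays limited.</p>

<p>Fine-grained locks also bring a new problem: <strong>the lock implementation becomes more complex</strong>.</p>

<p>Users usually order several SKUs at once, for example:</p>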

<p>User A:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">[</span><span class="w">
  </span><span class="p">{</span><span class="w">
    </span><span class="nl">"sku_id"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w">
    </span><span class="nl">"quantity"</span><span class="p">:</span><span class="w"> </span><span class="mi">2</span><span class="w">
  </span><span class="p">},</span><span class="w">
  </span><span class="p">{</span><span class="w">
    </span><span class="nl">"sku_id"</span><span class="p">:</span><span class="w"> </span><span class="mi">2</span><span class="p">,</span><span class="w">
    </span><span class="nl">"quantity"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">]</span><span class="w">
</span></code></pre></div></div>

<p>User B:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">[</span><span class="w">
  </span><span class="p">{</span><span class="w">
    </span><span class="nl">"sku_id"</span><span class="p">:</span><span class="w"> </span><span class="mi">2</span><span class="p">,</span><span class="w">
    </span><span class="nl">"quantity"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="w">
  </span><span class="p">},</span><span class="w">
  </span><span class="p">{</span><span class="w">
    </span><span class="nl">"sku_id"</span><span class="p">:</span><span class="w"> </span><span class="mi">3</span><span class="p">,</span><span class="w">
    </span><span class="nl">"quantity"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="w">
  </span><span class="p">},</span><span class="w">
  </span><span class="p">{</span><span class="w">
    </span><span class="nl">"sku_id"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="p">,</span><span class="w">
    </span><span class="nl">"quantity"</span><span class="p">:</span><span class="w"> </span><span class="mi">1</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">]</span><span class="w">
</span></code></pre></div></div>

<p>If two users place orders like these, a little carelessness can produce a deadlock.</p>

<blockquote>
  <p>A: lock 1</p>

  <p>B: lock 2</p>

  <p>A: wait B</p>

  <p>B: lock 3</p>

  <p>B: wait A</p>
</blockquote>

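<p>We can sort the SKU ids and acquire locks in that order to avoid deadlock. But if the lock is implemented as one lock per SKU, its reliability drops sharply as the number of SKUs in an order grows. We may soon find ourselves needing something like MySQL's gap lock to avoid this class of problem.</p>

<p>So the finer the lock granularity, the harder the implementation becomes, possibly exponentially so.</p>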
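<p>The ordered-locking idea can be sketched as follows. This is a single-process illustration in which <code class="language-plaintext highlighter-rouge">threading.Lock</code> stands in for a per-SKU distributed lock; the <code class="language-plaintext highlighter-rouge">place_order</code> and <code class="language-plaintext highlighter-rouge">worker</code> helpers and the stock numbers are assumptions made for the example.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import threading

# One in-process lock per SKU; a stand-in for per-SKU distributed locks.
sku_locks = {sku_id: threading.Lock() for sku_id in (1, 2, 3)}
stock = {1: 100, 2: 100, 3: 100}

def place_order(items):
    """items maps sku_id to quantity. Locks are always taken in ascending sku_id order."""
    ordered = sorted(items)
    for sku_id in ordered:
        sku_locks[sku_id].acquire()
    try:
        if any(items[s] > stock[s] for s in ordered):
            return False              # some SKU is short: deduct nothing
        for s in ordered:
            stock[s] -= items[s]
        return True
    finally:
        for sku_id in reversed(ordered):
            sku_locks[sku_id].release()

def worker(items, rounds):
    for _ in range(rounds):
        place_order(items)

# User A and User B from the example above, each ordering repeatedly.
a = threading.Thread(target=worker, args=({1: 2, 2: 1}, 20))
b = threading.Thread(target=worker, args=({2: 1, 3: 1, 1: 1}, 20))
a.start(); b.start()
a.join(); b.join()
print(stock)
</code></pre></div></div>

<p>Because both threads request SKU 1 before SKU 2, neither can hold one lock while waiting for the other in the opposite order. The sketch does not address lock expiry or partial acquisition failures, which is where a distributed implementation gets hard.</p>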

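<h3 id="原子提交问题">The atomic commit problem</h3>

<p>Even if the lock complexity is solved, another hard problem remains: atomic commit.</p>

<p>Because order and goods are separate services and the business flow is usually synchronous (the user is redirected to the order page as soon as the order is placed), once inventory deduction succeeds we must ensure that order and goods succeed or fail together. Introducing a distributed transaction such as 2PC or TCC would make the latency of each transaction even worse.</p>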

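<h3 id="优点">Advantages</h3>

<p>It builds on mature infrastructure, is simple to implement, and suits scenarios such as a single-item flash sale.</p>

<h3 id="缺点">Drawbacks</h3>

<p>Throughput is limited and it only suits single-item flash sales. If one order may contain several SKUs, the lock implementation becomes very complex.</p>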

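<h2 id="基于内存的库存系统">An in-memory inventory system</h2>

<p>Looking back, the overall throughput of every approach so far is held down by slow database operations. Database operations are disk I/O, so latency cannot go very low. What if we moved the entire inventory into memory? It sounds too radical to be feasible, but that is exactly the subject of this section.</p>

<p>Memory is a volatile storage medium, but with some engineering, overselling and underselling can be avoided as far as possible (reduced to an acceptable level).</p>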

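<h3 id="库存状态机">The inventory state machine</h3>

<p>Viewed abstractly, a change in a SKU's stock is a change in that SKU's state. If we treat each SKU as a state machine, a stock change is a state-change event; applying all events to the corresponding SKU in the DB or in memory via Event Sourcing then yields a consistent state. Stock state changes come mainly from two operations:</p>

<ol>
  <li>The merchant (or platform) purchases new units of a SKU to replenish stock.</li>
  <li>Users buy and consume stock.</li>
</ol>

<h4 id="sku补货流程">SKU replenishment flow</h4>

<p>Once a SKU completes the procurement flow, a stock record is inserted into the sku_state_events table, and the quantity column of the sku table is updated as well, as shown below:</p>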

<p><img src="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/replenishment.jpg" alt="image-20260130030941986" /></p>

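<p>The stock event consumer consumes events from sku_state_events and pushes the state changes to Redis, keeping the Redis snapshot up to date. It is best to use a Lua script here as well and make the operation idempotent, so that an event is never applied twice.</p>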
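<p>Idempotent consumption can be sketched as follows. This is a plain in-memory model rather than a Lua script; the event fields <code class="language-plaintext highlighter-rouge">event_id</code>, <code class="language-plaintext highlighter-rouge">sku_id</code> and <code class="language-plaintext highlighter-rouge">delta</code> are hypothetical names chosen for the example.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def apply_event(snapshot, applied_ids, event):
    """Apply one stock event to the in-memory snapshot at most once."""
    if event["event_id"] in applied_ids:
        return False                      # duplicate delivery: ignore it
    sku_id = event["sku_id"]
    snapshot[sku_id] = snapshot.get(sku_id, 0) + event["delta"]
    applied_ids.add(event["event_id"])    # remember the id so a retry is a no-op
    return True

snapshot, applied_ids = {}, set()
events = [
    {"event_id": "e1", "sku_id": 1, "delta": 50},   # first replenishment
    {"event_id": "e2", "sku_id": 1, "delta": 20},   # second replenishment
    {"event_id": "e1", "sku_id": 1, "delta": 50},   # e1 delivered again
]
outcomes = [apply_event(snapshot, applied_ids, e) for e in events]
print(outcomes, snapshot)
</code></pre></div></div>

<p>In Redis, the membership check and the increment would need to live in the same Lua script so that they run as one atomic step.</p>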

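<h4 id="下单流程">Order placement flow</h4>

<p>The other path for stock changes is order placement on the user side. When a user submits an order:</p>

<ol>
  <li>First check whether the stock in Redis is sufficient.</li>
  <li>If it is, create the order immediately. If order creation fails, record the failure in a local ledger that is reported periodically for reconciliation.</li>
  <li>Each row in order_item corresponds to one SKU state-change event.</li>
  <li>The stock event consumer applies these state changes to the corresponding SKUs.</li>
</ol>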
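<p>The first step, checking and deducting several SKUs as one unit, can be sketched as follows. This is an in-memory model of what a single Redis Lua script would do; the <code class="language-plaintext highlighter-rouge">deduct_all_or_nothing</code> helper is an assumption made for the example, and it assumes each SKU appears at most once per order.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def deduct_all_or_nothing(stock, items):
    """Deduct every (sku_id, quantity) pair or none of them."""
    # Phase 1: verify every line before touching anything.
    for sku_id, qty in items:
        if qty > stock.get(sku_id, 0):
            return False
    # Phase 2: all checks passed, so apply every deduction.
    for sku_id, qty in items:
        stock[sku_id] -= qty
    return True

stock = {1: 2, 2: 1, 3: 0}
ok_a = deduct_all_or_nothing(stock, [(1, 2), (2, 1)])          # User A's order
ok_b = deduct_all_or_nothing(stock, [(2, 1), (3, 1), (1, 1)])  # User B's order
print(ok_a, ok_b, stock)
</code></pre></div></div>

<p>In this model nothing can interleave between the two phases. In Redis that property comes from running both phases inside one Lua script, which is also why no per-SKU lock is needed on this path.</p>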

<p><img src="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/order.jpg" alt="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/order.jpg" /></p>

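<p>As the figure above shows, inventory deduction happens in Redis and the database no longer takes part in it. This solves two big problems:</p>

<ol>
  <li>
    <p>Slow database I/O</p>

    <p>Deduction runs in Redis, so throughput is high and latency is low. The database serves only as the <strong>Source of Truth</strong>; even if its stock figures lag slightly, the business keeps running normally.</p>
  </li>
  <li>
    <p>Atomic commit</p>

    <p>Rows in order_item are converted into SKU state-change events, and Event Sourcing syncs the stock changes back to the goods db, so stock stays consistent without an atomic commit.</p>
  </li>
</ol>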

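<h3 id="处理超卖">Handling overselling</h3>

<p>Because Redis is an in-memory store, an incident (power loss, abnormal exit, a failover and so on) will inevitably leave data inconsistent, so we need a mechanism to <strong>decide when the stock data in Redis can be trusted</strong>.</p>

<h4 id="熔断">Circuit breaking</h4>

<p>When the application finds that the Redis data cannot be trusted, it must trip a circuit breaker and stop deducting inventory until the coordinator finishes failure recovery (that is, a forced inventory sync).</p>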

<p><img src="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/circuit_break.jpg" alt="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/circuit_break.jpg" /></p>

<p>Prefix the deduction Lua script with an if check: if the current snapshot is stale, reject the application's write.</p>

<div class="language-lua highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">local</span> <span class="n">stale</span> <span class="o">=</span> <span class="n">redis</span><span class="p">.</span><span class="n">call</span><span class="p">(</span><span class="s2">"GET"</span><span class="p">,</span> <span class="s2">"is_stale"</span><span class="p">)</span>
<span class="k">if</span> <span class="p">(</span><span class="ow">not</span> <span class="n">stale</span><span class="p">)</span> <span class="ow">or</span> <span class="p">(</span><span class="n">stale</span> <span class="o">~=</span> <span class="s2">"0"</span><span class="p">)</span> <span class="k">then</span>
  <span class="k">return</span> <span class="mi">0</span>
<span class="k">end</span>
</code></pre></div></div>

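<p>Now introduce a new role, the <code class="language-plaintext highlighter-rouge">coordinator</code>, which maintains the stale state of Redis. When it finds the Redis snapshot stale, it runs a forced inventory sync. The design varies with how Redis is deployed.</p>

<h4 id="单节点">Single node</h4>

<p>With a single Redis node the circuit breaker is very simple. Every time Redis starts, also run a command that sets the <code class="language-plaintext highlighter-rouge">is_stale</code> key to <code class="language-plaintext highlighter-rouge">1</code>; that alone trips the breaker whenever the application tries to deduct inventory. The key goes back to <code class="language-plaintext highlighter-rouge">0</code> only after the coordinator completes the forced inventory sync.</p>

<h4 id="哨兵">Sentinel</h4>

<p>Under Redis Sentinel, failovers happen outside our control and things get more complicated. The simple <code class="language-plaintext highlighter-rouge">is_stale</code> approach no longer works, so we need a different idea.</p>

<p>Now store a key named <code class="language-plaintext highlighter-rouge">master:epoch</code> on the master, whose value is the Redis Sentinel <code class="language-plaintext highlighter-rouge">config epoch</code>. What is an epoch? Here is the definition from the official Redis documentation:</p>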

<blockquote>
  <p>Sentinels require to get authorizations from a majority in order to start a failover for a few important reasons:</p>

  <p>When a Sentinel is authorized, it gets a unique <strong>configuration epoch</strong> for the master it is failing over. This is a number that will be used to version the new configuration after the failover is completed. Because a majority agreed that a given version was assigned to a given Sentinel, no other Sentinel will be able to use it. This means that every configuration of every failover is versioned with a unique version. We’ll see why this is so important.</p>
</blockquote>

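<p>The epoch is a monotonically increasing number. Each time a sentinel performs a failover, it needs the agreement of a majority of sentinels, changes this number, and stores it in the configuration on success. The epoch acts much like a fencing token, which we will come back to in detail.</p>

<p>Before deducting inventory, we now need to check that the epoch is the same as before:</p>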

<div class="language-lua highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">local</span> <span class="n">epoch</span> <span class="o">=</span> <span class="n">ARGV</span><span class="p">[</span><span class="mi">1</span><span class="p">]</span>

<span class="kd">local</span> <span class="n">current</span> <span class="o">=</span> <span class="n">redis</span><span class="p">.</span><span class="n">call</span><span class="p">(</span><span class="s1">'GET'</span><span class="p">,</span> <span class="s1">'master:epoch'</span><span class="p">)</span>
<span class="k">if</span> <span class="p">(</span><span class="ow">not</span> <span class="n">current</span><span class="p">)</span> <span class="ow">or</span> <span class="p">(</span><span class="n">epoch</span> <span class="o">~=</span> <span class="n">current</span><span class="p">)</span> <span class="k">then</span>
  <span class="k">return</span> <span class="mi">0</span>
<span class="k">end</span>

<span class="c1">-- ...</span>
</code></pre></div></div>
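<p>The fencing behaviour of this check can be modelled outside Redis as follows. Each dictionary stands in for the keyspace of one Redis node; the <code class="language-plaintext highlighter-rouge">guarded_deduct</code> helper and the <code class="language-plaintext highlighter-rouge">stock:42</code> key name are hypothetical and exist only for this sketch.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def guarded_deduct(store, client_epoch, sku_id, qty):
    """Return 1 on success, 0 when fenced off or out of stock."""
    current = store.get("master:epoch")
    if current is None or current != client_epoch:
        return 0                       # missing or mismatched epoch: refuse the write
    key = "stock:" + str(sku_id)
    if qty > int(store.get(key, "0")):
        return 0                       # not enough stock
    store[key] = str(int(store[key]) - qty)
    return 1

old_master = {"master:epoch": "1", "stock:42": "5"}   # node that missed the failover
new_master = {"master:epoch": "2", "stock:42": "5"}   # node promoted with epoch 2

on_old = guarded_deduct(old_master, "2", 42, 1)   # client already knows epoch 2
on_new = guarded_deduct(new_master, "2", 42, 1)
print(on_old, on_new, old_master["stock:42"], new_master["stock:42"])
</code></pre></div></div>

<p>A client holding the newer epoch cannot write to a node that still carries the older one, so a request routed to the wrong master fails instead of deducting stock twice.</p>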

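<p>Introducing the epoch serves two purposes:</p>

<ol>
  <li>It lets the coordinator decide when to run a forced inventory sync.</li>
  <li>It limits the impact of split brain.</li>
</ol>

<p>Split brain is what we focus on next. When Sentinel performs a failover, there are two main scenarios:</p>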

<ol>
  <li>
    <p>The old master really is down</p>

    <p>This has little effect on the business: every client will connect to the new master, and two masters never exist at once.</p>

    <p><img src="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/master_down.jpg" alt="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/master_down.jpg" /></p>
  </li>
  <li>
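    <p>A network partition occurred and the master is not actually down</p>

    <p>This affects the business badly, because for some window two master nodes may exist at once; this is split brain. If both nodes deduct inventory at the same time, overselling occurs. We must reduce the impact of this situation as far as possible.</p>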

    <p><img src="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/split_brain1.jpg" alt="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/split_brain1.jpg" /></p>
  </li>
</ol>

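<p>We say "reduce" because split brain cannot be avoided in Redis: writes do not need quorum acknowledgement, so a network partition can leave two masters alive at once. All we can do is bring the impact down to an acceptable level.</p>

<p><strong>Handling split brain</strong></p>

<p>Since two masters cannot be ruled out, the goal is to let only one of them do work when it happens; as long as two masters never work at the same time, there is no overselling. Effort can go into the following areas:</p>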

<ol>
  <li>
    <p>Redis configuration</p>

    <p>Redis provides two settings that mitigate split-brain writes:</p>

    <blockquote>
      <p>min-replicas-to-write &lt;N&gt;
min-replicas-max-lag &lt;M&gt;</p>
    </blockquote>

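    <p>min-replicas-to-write: the master accepts writes only while <strong>at least N healthy replicas</strong> are present.</p>

    <p>min-replicas-max-lag: the replication heartbeat/ACK <strong>lag of those replicas must not exceed M seconds</strong>; a replica beyond that no longer counts as a good replica.</p>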

    <div class="language-conf highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">min</span>-<span class="n">replicas</span>-<span class="n">to</span>-<span class="n">write</span> <span class="m">1</span>
<span class="n">min</span>-<span class="n">replicas</span>-<span class="n">max</span>-<span class="n">lag</span> <span class="m">10</span>
</code></pre></div>    </div>

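    <p>If the master <strong>cannot count even one good replica</strong>, it stops accepting writes. If a network partition keeps replica ACKs from arriving for more than 10 seconds, the old master starts rejecting writes after roughly 10 seconds.</p>

    <p>Of course, this guarantees very little; it is only a safeguard for the worst case.</p>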
  </li>
  <li>
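    <p>Circuit breaking in the application</p>

    <p>The application needs to maintain a health check against Sentinel, periodically asking the sentinels for the current master and epoch. If it loses contact for a certain period, it considers itself unhealthy and trips the circuit breaker on the application side.</p>

    <p>Note that the application must query at least a majority of the sentinel nodes at the same time (at least 2 out of 3, or at least 3 out of 5) for the information to be reliable. The figure below shows a case where sentinel2 is isolated by a network partition and returns a stale epoch.</p>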

    <p><img src="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/health_check.jpg" alt="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/health_check.jpg" /></p>
  </li>
  <li>
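    <p>Circuit breaking on the epoch in the Redis Lua script</p>

    <p>In the previous step the application keeps an up-to-date epoch. When deducting inventory it passes the epoch in, and the Redis Lua script rejects the write if the epochs differ. The figure below shows stale clients connecting to the old master to deduct inventory and being rejected by the epoch check.</p>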

    <p><img src="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/split_brain2.jpg" alt="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/split_brain2.jpg" /></p>
  </li>
</ol>
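<p>The majority rule from the second point can be sketched as follows. The <code class="language-plaintext highlighter-rouge">agreed_view</code> helper, the tuple layout and the addresses are assumptions made for the example; a real client would build the replies from what each sentinel actually reports.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def agreed_view(replies, total_sentinels):
    """replies: one (epoch, master_addr) tuple per sentinel that answered.

    Returns the view with the highest epoch, or None when fewer than a
    majority answered, in which case the caller should trip its breaker.
    """
    majority = total_sentinels // 2 + 1
    if majority > len(replies):
        return None
    return max(replies, key=lambda view: view[0])

# Three sentinels; one of the two that answered still reports the old epoch.
replies = [(2, "10.0.0.2:6379"), (1, "10.0.0.1:6379")]
healthy = agreed_view(replies, 3)
isolated = agreed_view(replies[1:], 3)   # only the stale sentinel was reachable
print(healthy, isolated)
</code></pre></div></div>

<p>Taking the highest epoch among a majority of answers is a design choice for this sketch: a failover must be authorized by a majority, so any majority of respondents should overlap with the sentinels that took part in the latest one.</p>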

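<p><strong>The long-pause problem</strong></p>

<p>Circuit breaking in the application and in the Redis Lua script still falls well short of covering reality; the classic long-pause problem is one example. In the real world, several things can suspend a process for a long time:</p>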

<ol>
  <li>
    <p>Long GC pauses</p>

    <p>If the program is written in a garbage-collected language, a stop-the-world (STW) GC phase can suspend the process for a long time.</p>
  </li>
  <li>
    <p>OS swapping</p>

    <p>When memory is short, OS swapping can stall the process.</p>
  </li>
  <li>
    <p>Throttling caused by Docker cgroup limits</p>

    <p>A program running in a container may have its process paused by throttling imposed by its cgroup limits.</p>
  </li>
</ol>

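<p>Once any of these happens, two masters may end up deducting inventory at the same time. Consider this scenario:</p>

<ol>
  <li>The client has just finished a health check and obtained epoch = 1.</li>
  <li>A long GC then pauses the process for 5 s. During those 5 s a failover takes place and a new master is elected with epoch = 2.</li>
  <li>Because of the network partition, the master the client is connected to still has epoch 1.</li>
  <li>The GC pause ends and the client deducts inventory. Two masters are now deducting at the same time, and overselling becomes possible.</li>
</ol>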

<p><img src="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/split_brain3.jpg" alt="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/split_brain3.jpg" /></p>

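<p>To mitigate long pauses, the client can ask the current master for its current time before running the health check and keep that timestamp. When deducting inventory it passes the timestamp in, and the Lua script gains an extra check:</p>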

<div class="language-lua highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">-- ARGV[1] checked_at</span>

<span class="kd">local</span> <span class="n">t</span> <span class="o">=</span> <span class="n">redis</span><span class="p">.</span><span class="n">call</span><span class="p">(</span><span class="s2">"TIME"</span><span class="p">)</span>
<span class="kd">local</span> <span class="n">now</span> <span class="o">=</span> <span class="nb">tonumber</span><span class="p">(</span><span class="n">t</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>

<span class="kd">local</span> <span class="n">checked_at</span> <span class="o">=</span> <span class="nb">tonumber</span><span class="p">(</span><span class="n">ARGV</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
<span class="kd">local</span> <span class="n">max_age</span> <span class="o">=</span> <span class="mi">5</span>

<span class="k">if</span> <span class="p">(</span><span class="ow">not</span> <span class="n">checked_at</span><span class="p">)</span> <span class="k">then</span>
  <span class="k">return</span> <span class="p">{</span><span class="n">err</span><span class="o">=</span><span class="s2">"BAD_ARGS"</span><span class="p">}</span>
<span class="k">end</span>

<span class="k">if</span> <span class="p">(</span><span class="n">now</span> <span class="o">-</span> <span class="n">checked_at</span><span class="p">)</span> <span class="o">&gt;</span> <span class="n">max_age</span> <span class="k">then</span>
  <span class="k">return</span> <span class="p">{</span><span class="n">err</span><span class="o">=</span><span class="s2">"STALE_VIEW"</span><span class="p">}</span>
<span class="k">end</span>
</code></pre></div></div>

<p>This check mitigates the long-pause problem, but it only mitigates it. Overselling can still occur in the following cases:</p>

<ol>
  <li>
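    <p>The failover happens inside the timeout window</p>

    <p>If GC pauses the client for 3 s and the failover completes within those 3 s, two masters still end up working at the same time.</p>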
  </li>
  <li>
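    <p>Clock rollback</p>

    <p>All the timing above relies on physical clocks. In a distributed system physical clocks are unreliable: they drift and need periodic correction. So the following can happen:</p>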

    <ul>
      <li>The client obtains the latest epoch.</li>
      <li>A long pause suspends it for 10 s.</li>
      <li>The Redis clock is rolled back to 10 seconds earlier.</li>
      <li>The time check in the Lua script passes.</li>
    </ul>
  </li>
</ol>

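<p>Both cases can be mitigated by deliberately lengthening the time before the new master resumes service. For example, the coordinator's forced inventory sync should take no less than 2 to 3 times the redis timeout. A longer start-up for the new master makes it as unlikely as possible that two masters deduct inventory at the same time.</p>

<h4 id="集群">Cluster</h4>

<p>As the Sentinel architecture shows, the design is already very complex and still cannot rule out overselling. With Redis Cluster the difficulty rises by an order of magnitude, mainly because:</p>

<ol>
  <li>Redis Lua scripts cannot guarantee atomicity across shards; when an order's SKUs live on different shards, the operation cannot be atomic.</li>
  <li>Nodes joining, leaving, and rebalancing make consistency between Redis and the db even harder to achieve.</li>
</ol>

<p>So I do not think this design suits a Redis Cluster architecture; it may end up too complex to implement.</p>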

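<h4 id="故障恢复">Failure recovery</h4>

<p>To recover from a failure (that is, to force an inventory sync), we first need to know when to run it. With a single node, polling the value of <code class="language-plaintext highlighter-rouge">is_stale</code> is enough. With Sentinel, the coordinator must ask a majority for the epoch and master information.</p>

<p>When the coordinator finds Redis in an abnormal state, it starts failure recovery and runs a forced inventory sync, which involves these steps:</p>

<ol>
  <li>Finish processing every unsynced SKU in order_item.</li>
  <li>Finish processing every pending event in sku_state_events.</li>
  <li>Clear the circuit-breaker state in Redis.</li>
</ol>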

<p><img src="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/sync.jpg" alt="https://raw.githubusercontent.com/xingty/assets/refs/heads/main/images/shop/sync.jpg" /></p>

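<p>After these steps, every SKU in the goods db is in the correct state. That state is then force-written to Redis, and once this succeeds Redis holds an up-to-date, correct snapshot.</p>

<p>When all steps are done, the coordinator marks Redis as ready (is_stale=0, or master:epoch updated to the current epoch). The application-side circuit breaker then ends as well.</p>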
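<p>The recovery sequence can be sketched as follows. Everything here is an in-memory model: the <code class="language-plaintext highlighter-rouge">force_sync</code> helper, the tuple layouts and the <code class="language-plaintext highlighter-rouge">stock:</code> key prefix are assumptions made for the example, not part of any real coordinator.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def force_sync(db_stock, pending_order_items, pending_events, redis_store, epoch):
    """Rebuild the Redis snapshot from the database, then lift the breaker."""
    # Step 1: apply order items that were never synced back (each consumes stock).
    while pending_order_items:
        sku_id, qty = pending_order_items.pop(0)
        db_stock[sku_id] -= qty
    # Step 2: drain outstanding stock events (replenishments carry a positive delta).
    while pending_events:
        sku_id, delta = pending_events.pop(0)
        db_stock[sku_id] += delta
    # Step 3: overwrite the snapshot with the now-correct database state.
    for sku_id, qty in db_stock.items():
        redis_store["stock:" + str(sku_id)] = str(qty)
    # Step 4: only now mark the snapshot as trustworthy again.
    redis_store["master:epoch"] = str(epoch)

db_stock = {1: 10, 2: 5}
redis_store = {}
force_sync(db_stock, [(1, 2)], [(2, 20)], redis_store, epoch=2)
print(db_stock, redis_store)
</code></pre></div></div>

<p>The ordering is the point of the sketch: the readiness marker is written last, so clients stay fenced off until the snapshot has been fully rebuilt.</p>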

<h3 id="处理少卖">Handling underselling</h3>

<p>Underselling mainly happens when the Redis deduction succeeds but order creation then fails. There are two cases:</p>

<ol>
  <li>
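    <p>The failure is written to the local ledger</p>

    <p>If the record reaches the local ledger (SQLite or an MQ both work; it is a trade-off), the anomalies in the ledger can be reported periodically to the coordinator, which then reconciles them.</p>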
  </li>
  <li>
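    <p>Even the local ledger write fails</p>

    <p>If a power loss, an abnormal process exit, or a restart causes the record to be lost for good, a heavier reconciliation mechanism is needed. For example, at 3 a.m. every day, pull every SKU traded the previous day and compare its stock in Redis with its stock in the db. For correctness, ordering may have to be suspended for a while (from a few minutes up to about 10 minutes). This is a heavy operation, so it comes down to a trade-off.</p>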
  </li>
</ol>
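<p>The comparison step of such a nightly reconciliation can be sketched as follows. The <code class="language-plaintext highlighter-rouge">reconcile</code> helper and the sample figures are assumptions made for the example; it presumes that pending events have already been applied and that ordering is paused while it runs.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>def reconcile(db_stock, redis_stock):
    """Return {sku_id: db_quantity - redis_quantity} for every SKU that disagrees."""
    diffs = {}
    for sku_id, db_qty in db_stock.items():
        gap = db_qty - redis_stock.get(sku_id, 0)
        if gap != 0:
            diffs[sku_id] = gap
    return diffs

db_stock = {1: 10, 2: 5, 3: 7}
redis_stock = {1: 10, 2: 3, 3: 7}   # SKU 2 lost two units to orders that were never created
diffs = reconcile(db_stock, redis_stock)
print(diffs)
</code></pre></div></div>

<p>A positive gap means Redis holds fewer units than the database, which is the underselling case: those units can be returned to the snapshot.</p>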

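<h3 id="缺点-1">Drawbacks</h3>

<p><strong>1. Some availability is sacrificed</strong></p>

<p>A single Redis node avoids overselling, but it is also a single point of failure: once the Redis node goes down, the whole service is unavailable.</p>

<p>Deploying Redis with Sentinel improves availability somewhat. But during a network partition, nodes in an isolated partition cannot reach Sentinel and cannot tell which node is the current master, so they become unavailable and cannot take orders (otherwise split brain could cause overselling).</p>

<p>In the extreme case, an unstable network that triggers frequent failovers also makes availability very poor.</p>

<p><strong>2. Reconciliation complexity</strong></p>

<p>If the process dies or power is lost before the write to the local db, the record is gone. Reconciliation then depends on a heavier ledger, which generates compensating stock_state_event records and inserts them into the database. With a large order volume the time this takes grows linearly, and so does the time ordering is suspended.</p>

<p><strong>3. Overselling</strong></p>

<p>Under Redis Sentinel there is still a window in which overselling can occur. That is the price this design pays for availability. To rule out overselling absolutely you must give up availability and switch to a single Redis node; it is a trade-off. If you need high throughput and will not give up availability, the only option is to spell out compensation terms in the user agreement.</p>

<p>As we have seen, even with many elaborate mechanisms, Sentinel mode cannot fundamentally solve overselling. The root cause is that we are trying to build a CP guarantee on top of an AP system like Sentinel, which is essentially impossible.</p>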

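<h3 id="优点-1">Advantages</h3>

<ol>
  <li>It achieves high throughput and low latency while striking a balance on data consistency, keeping overselling and underselling within an acceptable range.</li>
  <li>It can sustain from 10,000 to several tens of thousands of TPS and is especially good at flash sales, where TPS is enormous but actual stock is tiny.</li>
  <li>It uses only general-purpose infrastructure, so it can be run and maintained without a large investment.</li>
</ol>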

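<h2 id="基于alisql的inventory-hint">Inventory Hint based on AliSQL</h2>

<p>When discussing the database approach we listed several database bottlenecks, chiefly:</p>

<ol>
  <li>Disk I/O in the database limits overall throughput.</li>
  <li>Write amplification, deadlock detection and the like add overhead.</li>
</ol>

<p>Is there a way to optimize at the database level? That is what AliSQL does: it optimizes the database kernel so that a single transaction executes faster, which raises overall throughput.</p>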

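<h3 id="inventory-hint">Inventory Hint</h3>

<p>Hints are MySQL's mechanism for steering the optimizer. Suppose we want the optimizer to use a hash join for a join query; a hint lets us control that behavior (since MySQL 8.0.20 the BNL hint governs hash join):</p>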

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">SELECT</span> <span class="cm">/*+ BNL(t1, t2) */</span> <span class="o">*</span>
<span class="k">FROM</span> <span class="n">t1</span>
<span class="k">JOIN</span> <span class="n">t2</span> <span class="k">ON</span> <span class="n">t1</span><span class="p">.</span><span class="n">c1</span> <span class="o">=</span> <span class="n">t2</span><span class="p">.</span><span class="n">c1</span><span class="p">;</span>
</code></pre></div></div>

<p>Here <code class="language-plaintext highlighter-rouge">BNL(t1, t2)</code> is a hint that controls the optimizer's join algorithm. AliSQL adds several hints aimed specifically at inventory deduction, called <a href="https://help.aliyun.com/zh/rds/apsaradb-rds-for-mysql/inventory-hint">inventory hint</a>.</p>

<p>Syntax</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/*+ COMMIT_ON_SUCCESS */</span>
<span class="cm">/*+ ROLLBACK_ON_FAIL */</span>

<span class="k">UPDATE</span> <span class="cm">/*+ COMMIT_ON_SUCCESS ROLLBACK_ON_FAIL */</span> <span class="n">T</span>
<span class="k">SET</span> <span class="k">c</span> <span class="o">=</span> <span class="k">c</span> <span class="o">-</span> <span class="mi">1</span>
<span class="k">WHERE</span> <span class="n">id</span> <span class="o">=</span> <span class="mi">1</span><span class="p">;</span>
</code></pre></div></div>

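<p>With these hints, the current transaction is committed or rolled back as soon as the UPDATE statement finishes, which shortens each transaction and therefore raises throughput. Alibaba <a href="https://help.aliyun.com/zh/rds/support/test-method-and-results-of-hot-data-updates-on-a-single-row">officially claims</a> 30,000 TPS on a 90-core, 720 GB machine (dedicated physical host).</p>

<p>Optimizing transaction commit alone is surely far from enough, of course: disk I/O remains a bottleneck, so some caching is presumably involved internally, otherwise the numbers are hard to explain. Very little has been disclosed so far, though, so I cannot verify this.</p>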

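<h3 id="局限">Limitations</h3>

<p>Inventory hint looks appealing, but the scenarios it fits are fairly narrow. Using it causes the transaction to be committed or rolled back immediately, which means that if the deduction is to be bundled with other operations in one transaction, it must be the last step; otherwise the hint's immediate commit makes data consistency impossible to guarantee.</p>

<p>Suppose one purchase covers several SKUs. The UPDATE statements might have to look like this:</p>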

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">begin</span><span class="p">;</span>
<span class="k">update</span> <span class="n">goods</span> <span class="k">set</span> <span class="n">quantity</span> <span class="o">=</span> <span class="n">quantity</span> <span class="o">-</span> <span class="mi">1</span> <span class="k">where</span> <span class="n">sku_id</span> <span class="o">=</span> <span class="mi">1</span> <span class="k">and</span> <span class="n">quantity</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">;</span>

<span class="k">update</span> <span class="cm">/*+ COMMIT_ON_SUCCESS ROLLBACK_ON_FAIL TARGET_AFFECT_ROW(1) */</span>
<span class="n">goods</span> <span class="k">set</span> <span class="n">quantity</span> <span class="o">=</span> <span class="n">quantity</span> <span class="o">-</span> <span class="mi">1</span> 
<span class="k">where</span> <span class="n">sku_id</span> <span class="o">=</span> <span class="mi">2</span> <span class="k">and</span> <span class="n">quantity</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="p">;</span>

<span class="k">commit</span><span class="p">;</span>
</code></pre></div></div>

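<p>Because of how the hint works, it has to go on the last SQL statement. What about the statement before it? In this case we are probably back to the problems of plain MySQL, and with inventory hint in the mix things may even be worse than before. So its limitations are considerable.</p>

<h3 id="优点-2">Advantages</h3>

<p>Stronger consistency.</p>

<h3 id="缺点-2">Drawbacks</h3>

<ol>
  <li>
    <p>Limited business scenarios</p>

    <p>For a temporary scenario such as a single-item flash sale, a Redis lock with a segmented-lock design can also achieve fairly high throughput at a small cost.</p>
  </li>
  <li>
    <p>Lock-in to the Alibaba Cloud ecosystem</p>

    <p>Using AliSQL ties your business to the Alibaba Cloud ecosystem. Binding your architecture to one cloud vendor for the sake of a fairly narrow use case is a high price to pay.</p>
  </li>
</ol>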

<h2 id="结语">Conclusion</h2>

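<p>This article analyzed several ways to implement high-concurrency inventory deduction, and none of them is perfect. Moreover, every one of them is built on top of a whole stack of infrastructure. Even with Alibaba's heavily modified database kernel, an Inventory Hint alone does not give a business system high throughput. That reflects how complex and detail-heavy distributed systems are; without hands-on production experience it is hard to appreciate all the pitfalls.</p>]]></content><author><name>kikcat</name></author><category term="分布式系统" /><summary type="html"><![CDATA[High-concurrency inventory deduction is a classic hard problem in e-commerce systems. Plenty of articles cover it, but few explain it systematically or offer a design that can actually be put into production. I recently discussed the topic with a friend and learned a lot, so while it is still fresh in my mind, this article works through it systematically. Prerequisites This article assumes the following setup: The backend is distributed: the order service and the goods service are separate services deployed on different nodes. MySQL is the database. Redis is the in-memory data store. Familiarity with the following topics will make this article easier to follow: Event Sourcing Atomic operations with Redis Lua scripts The basic principles of Redis locks]]></summary></entry><entry><title type="html">How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs</title><link href="https://wiyi.org/how-to-make.html" rel="alternate" type="text/html" title="How to Make a Multiprocessor Computer That Correctly Executes Multiprocess Programs" /><published>2023-02-09T00:00:00+00:00</published><updated>2023-02-09T00:00:00+00:00</updated><id>https://wiyi.org/how-to-make</id><content type="html" xml:base="https://wiyi.org/how-to-make.html"><![CDATA[<p>Sequential consistency in distributed systems originates from this paper by Lamport. The original PDF is not very legible, so I transcribed it as text. In practice, enforcing sequential consistency in a system as low-level as a CPU is costly in performance. At the end, the author also mentions sequential consistency at a finer granularity, for example going from the memory-module level down to individual memory cells while other operations remain out of order, which in effect turns a total order into a partial order.</p>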

<p>If you are interested, the original is available for download: <a href="https://www.microsoft.com/en-us/research/uploads/prod/2016/12/How-to-Make-a-Multiprocessor-Computer-That-Correctly-Executes-Multiprocess-Programs.pdf">view the original paper</a>.</p>

<blockquote>
  <p><strong>Abstract–Many large sequential computers execute operations in a different order than is specified by the program. A correct execution is achieved if the results produced are the same as would be produced by executing the program steps in order. For a multiprocessor computer, such a correct execution by each processor does not guarantee the correct execution of the entire program. Additional conditions are given which do guarantee that a computer correctly executes multiprocess programs.</strong></p>
</blockquote>

<blockquote>
  <p>Index Terms-Computer design, concurrent computing, hardware correctness, multiprocessing, parallel processing.</p>
</blockquote>

<p>A high-speed processor may execute operations in a different order than is specified by the program. The correctness of the execution is guaranteed if the processor satisfies the following condition: <strong>the result of an execution is the same as if the operations had been executed in the order specified by the program</strong>. A processor satisfying this condition will be called sequential. Consider a computer composed of several such processors accessing a common memory. The customary approach to designing and proving the correctness of multiprocess algorithms for such a computer assumes that the following condition is satisfied: <strong>the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program</strong>. A multiprocessor satisfying this condition will be called sequentially consistent. The sequentiality of each individual processor does not guarantee that the multiprocessor computer is sequentially consistent. In this brief note, we describe a method of interconnecting sequential processors with memory modules that insures the sequential consistency of the resulting multiprocessor.</p>

<!--more-->

<p><strong>We assume that the computer consists of a collection of processors and memory modules, and that the processors communicate with one another only through the memory modules</strong>. (Any special communication registers may be regarded as separate memory modules.) <strong>The only processor operations that concern us are the operations of sending fetch and store requests to memory modules.</strong> We assume that each processor issues a sequence of such requests. (It must sometimes wait for requests to be executed, but that does not concern us.)</p>

<p>We illustrate the problem by considering a simple two-process mutual exclusion protocol. Each process contains a critical section, and the purpose of the protocol is to insure that only one process may be executing its critical section at any time. The protocol is as follows.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Process 1
 a :<span class="o">=</span> 1<span class="p">;</span>
 <span class="k">if </span>b <span class="o">=</span> 0 <span class="k">then </span>critical section<span class="p">;</span>
  a :<span class="o">=</span> 0
 <span class="k">else</span>
  ...
 <span class="k">fi</span>
</code></pre></div></div>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Process 2
 b :<span class="o">=</span> 1<span class="p">;</span>
 <span class="k">if </span>a <span class="o">=</span> 0 <span class="k">then </span>critical section<span class="p">;</span>
  b :<span class="o">=</span> 0
 <span class="k">else</span>
  ...
 <span class="k">fi</span>
</code></pre></div></div>

<p>The else clauses contain some mechanism for guaranteeing eventual access to the critical section, but that is irrelevant to the discussion. It is easy to prove that this protocol guarantees mutually exclusive access to the critical sections. (Devising a proof provides a nice exercise in using the assertional techniques of [2] and [3], and is left to the reader.) Hence, when this two-process program is executed by a sequentially consistent multiprocessor computer, the two processors cannot both be executing their critical sections at the same time.</p>

<p>We first observe that a sequential processor could execute the “a := 1” and “fetch b” operations of process 1 in either order. (When process 1’s program is considered by itself, it does not matter in which order these two operations are performed.) However, it is easy to see that executing the “fetch b” operation first can lead to an error: both processes could then execute their critical sections at the same time. This immediately suggests our first requirement for a multiprocessor computer.</p>

<p><strong>Requirement R1: Each processor issues memory requests in the order specified by its program.</strong></p>
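<p>As a rough check of the argument above, the following Java sketch enumerates every interleaving of the four memory operations of the two processes. The class name, the operation codes and the helper methods are assumptions made for this illustration; they do not come from the paper. With process 1 issuing its requests in program order, no interleaving lets both processes read 0; with “fetch b” issued first, one interleaving does.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Sketch only: a brute-force model of the two-process protocol.
final class MutexInterleavings {
    // Hypothetical operation codes for process 1.
    static final int STORE_A = 0; // a := 1
    static final int FETCH_B = 1; // read b for the test "b = 0"

    // Counts the interleavings in which both processes read 0, so that both
    // would enter their critical sections. p1Ops is the order in which
    // process 1 issues its two operations; process 2 always issues
    // "b := 1" and then "fetch a".
    static int countViolations(int[] p1Ops) {
        return explore(p1Ops, 0, 0, 0, 0, -1, -1);
    }

    // i and j count the operations already executed by process 1 and 2.
    // seenB is the value process 1 fetched for b, seenA the value process 2
    // fetched for a (-1 while the fetch has not happened yet).
    private static int explore(int[] p1Ops, int i, int j, int a, int b, int seenB, int seenA) {
        if (i == 2) {
            if (j == 2) {
                // Both fetches have run, so each value is 0 or 1.
                return (seenA + seenB == 0) ? 1 : 0;
            }
        }
        int count = 0;
        if (i != 2) { // process 1 takes the next step
            if (p1Ops[i] == STORE_A) {
                count += explore(p1Ops, i + 1, j, 1, b, seenB, seenA);
            } else {
                count += explore(p1Ops, i + 1, j, a, b, b, seenA);
            }
        }
        if (j != 2) { // process 2 takes the next step
            if (j == 0) {
                count += explore(p1Ops, i, j + 1, a, 1, seenB, seenA); // b := 1
            } else {
                count += explore(p1Ops, i, j + 1, a, b, seenB, a);     // fetch a
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countViolations(new int[] {STORE_A, FETCH_B})); // program order
        System.out.println(countViolations(new int[] {FETCH_B, STORE_A})); // fetch issued first
    }
}
</code></pre></div></div>

<p>The first call reports 0 violating interleavings and the second reports 1, namely “fetch b”, “b := 1”, “fetch a”, “a := 1”.</p>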

<p>Satisfying Requirement R1 is complicated by the fact that storing a value is possible only after the value has been computed. A processor will often be ready to issue a memory fetch request before it knows the value to be stored by a preceding store request. To minimize waiting, the processor can issue the store request to the memory module without specifying the value to be stored. Of course, the store request cannot actually be executed by the memory module until it receives the value to be stored.</p>

<p>Requirement R1 is not sufficient to guarantee correct execution. To see this, suppose that each memory module has several ports, and each port services one processor (or I/O channel). Let the values of “a” and “b” be stored in separate memory modules, and consider the following sequence of events.</p>

<ol>
  <li>Processor 1 sends the “a := 1” request to its port in memory module 1. The module is currently busy executing an operation for some other processor (or I/O channel).</li>
  <li>Processor 1 sends the “fetch b” request to its port in memory module 2. The module is free, and execution is begun.</li>
  <li>Processor 2 sends its “b := 1” request to memory module 2. This request will be executed after processor 1’s “fetch b” request is completed.</li>
  <li>Processor 2 sends its “fetch a” request to its port in memory module 1. The module is still busy.</li>
</ol>

<p>There are now two operations waiting to be performed by memory module 1. If processor 2’s “fetch a” operation is performed first, then both processes can enter their critical sections at the same time, and the protocol fails. This could happen if the memory module uses a round robin scheduling discipline in servicing its ports.</p>

<p>In this situation, an error occurs only if the two requests to memory module 1 are not executed in the same order in which they were received. This suggests the following requirement.</p>

<p><strong>Requirement R2: Memory requests from all processors issued to an individual memory module are serviced from a single FIFO queue. Issuing a memory request consists of entering the request on this queue.</strong></p>
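<p>The four-step scenario above can be replayed in a few lines of Java. This is only a sketch under simplifying assumptions: the class and method names are invented here, and each memory module is reduced to a plain variable. The single flag decides whether memory module 1 services its two pending requests in arrival order, as R2 demands, or services processor 2’s fetch first.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Sketch only: replays the event sequence that motivates Requirement R2.
final class ModuleOrder {
    // Returns true when both processors end up in their critical sections.
    // fifo == true means module 1 services pending requests in arrival order.
    static boolean bothEnter(boolean fifo) {
        int a = 0; // lives in memory module 1
        int b = 0; // lives in memory module 2

        // Module 2 is free: processor 1's "fetch b" executes first,
        // then processor 2's "b := 1".
        int p1SawB = b;
        b = 1;

        // Module 1 was busy, so two requests are pending on it:
        // processor 1's "a := 1" (arrived first) and processor 2's "fetch a".
        int p2SawA;
        if (fifo) {
            a = 1;      // store serviced first
            p2SawA = a; // the fetch then observes 1
        } else {
            p2SawA = a; // fetch serviced first, observes 0
            a = 1;
        }
        // Each processor enters its critical section only if it fetched 0.
        return p1SawB + p2SawA == 0;
    }

    public static void main(String[] args) {
        System.out.println(bothEnter(true));
        System.out.println(bothEnter(false));
    }
}
</code></pre></div></div>

<p>With arrival-order servicing processor 2 reads 1 and stays out of its critical section; with the other order both processors read 0 and the protocol fails.</p>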

<p>Condition R1 implies that a processor may not issue any further memory requests until after its current request has been entered on the queue. Hence, it must wait if the queue is full. If two or more processors are trying to enter requests in the queue at the same time, then it does not matter in which order they are serviced.</p>

<p>Note. If a fetch requests the contents of a memory location for which there is already a write request on the queue, then the fetch need not be entered on the queue. It may simply return the value from the last such write request on the queue.</p>
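<p>A minimal sketch of the shortcut described in this note, assuming a single-threaded model with a small fixed-capacity queue of pending stores. The class, its fields and its methods are hypothetical names chosen for the example. A fetch first scans the pending stores from newest to oldest and returns the value of the last one for that location; only when there is none does it fall back to the memory contents.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Sketch only: a request queue whose fetches can be answered from pending stores.
final class ForwardingQueue {
    private final int[] addrs = new int[16];  // location of each pending store
    private final int[] values = new int[16]; // value of each pending store
    private int size = 0;                     // number of pending stores
    private final int[] memory = new int[8];  // the memory module's cells

    // Enters a store request on the queue.
    void enqueueStore(int addr, int value) {
        if (size == addrs.length) {
            // In the paper the processor would have to wait here.
            throw new IllegalStateException("queue full");
        }
        addrs[size] = addr;
        values[size] = value;
        size++;
    }

    // Answers a fetch. If a store to the same location is pending, the value
    // of the last such store is returned without queueing the fetch.
    int fetch(int addr) {
        for (int k = size - 1; k != -1; k--) {
            if (addrs[k] == addr) {
                return values[k];
            }
        }
        // Simplification: with no pending store to this location, the value
        // in memory is what a queued fetch would return in this model.
        return memory[addr];
    }

    // Services all pending stores in FIFO order.
    void drain() {
        for (int k = 0; k != size; k++) {
            memory[addrs[k]] = values[k];
        }
        size = 0;
    }

    public static void main(String[] args) {
        ForwardingQueue q = new ForwardingQueue();
        q.enqueueStore(3, 7);
        q.enqueueStore(3, 9);
        System.out.println(q.fetch(3)); // answered from the queue
        q.drain();
        System.out.println(q.fetch(3)); // answered from memory
    }
}
</code></pre></div></div>

<p>Both fetches in <code class="language-plaintext highlighter-rouge">main</code> return 9: the first from the last pending store, the second from memory after the queue has been drained.</p>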

<p>Requirements R1 and R2 insure that if the individual processors are sequential, then the entire multiprocessor computer is sequentially consistent. To demonstrate this, one first introduces a relation “—&gt;” on memory requests as follows. Define “A —&gt; B” if and only if 1) A and B are issued by the same processor and A is issued before B, or 2) A and B are issued to the same memory module, and A is entered in the queue before B (and is thus executed before B). It is easy to see that R1 and R2 imply that “—&gt;” is a partial ordering on the set of memory requests. Using the sequentiality of each processor, one can then prove the following result: each fetch and store operation fetches or stores the same value as if all the operations were executed sequentially in any order such that A —&gt; B implies that A is executed before B. This in turn proves the sequential consistency of the multiprocessor computer.</p>

<p><strong>Requirement R2 states that a memory module’s request queue must be serviced in a FIFO order.</strong> This implies that the memory module must remain idle if the request at the head of its queue is a store request for which the value to be stored has not yet been received. Condition R2 can be weakened to allow the memory module to service other requests in this situation. We need only require that all requests to the same memory cell be serviced in the order that they appear in the queue. Requests to different memory cells may be serviced out of order. Sequential consistency is preserved because such a service policy is logically equivalent to considering each memory cell to be a separate memory module with its own request queue. (The fact that these modules may share some hardware affects the rate at which they service requests and the capacity of their queues, but it does not affect the logical property of sequential consistency.)</p>
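<p>The weakened form of R2 can be sketched as a selection rule over the pending requests of one memory module. The code below is an illustration under assumptions: the array-based representation and all names are invented here, and the arrays are taken to hold only requests that are still pending, in queue order. A request may be serviced once it is ready and no earlier pending request targets the same memory cell.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Sketch only: per-cell ordering instead of strict FIFO for a whole module.
final class PerCellService {
    // cells[k] is the memory cell targeted by the k-th pending request.
    // ready[k] is false for a store whose value has not arrived yet.
    // Returns the index of the first request that may be serviced now,
    // or -1 if the module has to stay idle.
    static int nextServiceable(int[] cells, boolean[] ready) {
        for (int k = 0; k != cells.length; k++) {
            if (!ready[k]) {
                continue; // still waiting for the value to be stored
            }
            boolean blocked = false;
            for (int e = 0; e != k; e++) {
                if (cells[e] == cells[k]) {
                    blocked = true; // an earlier request to this cell goes first
                }
            }
            if (!blocked) {
                return k;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Request 0: store to cell 5, value not yet received.
        // Request 1: fetch from cell 5. Request 2: fetch from cell 6.
        int[] cells = {5, 5, 6};
        boolean[] ready = {false, true, true};
        System.out.println(nextServiceable(cells, ready));
    }
}
</code></pre></div></div>

<p>In this example the call returns 2: the fetch from cell 6 can proceed, while the fetch from cell 5 has to wait behind the unfinished store. A strict FIFO module would remain idle.</p>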

<p>The requirements needed to guarantee sequential consistency rule out some techniques which can be used to speed up individual sequential processors. For some applications, achieving sequential consistency may not be worth the price of slowing down the processors. In this case, one must be aware that conventional methods for designing multiprocess algorithms cannot be relied upon to produce correctly executing programs. Protocols for synchronizing the processors must be designed at the lowest level of the machine instruction code, and verifying their correctness becomes a monumental task.</p>]]></content><author><name>kikcat</name></author><category term="分布式系统" /><category term="CPU" /><summary type="html"><![CDATA[Sequential consistency in distributed systems originates from this paper by Lamport. The typeface of the original PDF is hard to read, so I transcribed it into text. Implementing sequential consistency in a microscopic system such as a CPU also costs a lot of performance. At the end, the author mentions a finer-grained form of sequential consistency, lowering the granularity from the memory module to the memory cell, while other operations may still be serviced out of order; this effectively turns a total order into a partial order. If you are interested, you can download the original: click to view the original. Abstract: Many large sequential computers execute operations in a different order than is specified by the program. A correct execution is achieved if the results produced are the same as would be produced by executing the program steps in order. For a multiprocessor computer, such a correct execution by each processor does not guarantee the correct execution of the entire program. Additional conditions are given which do guarantee that a computer correctly executes multiprocess programs. Index Terms: Computer design, concurrent computing, hardware correctness, multiprocessing, parallel processing. A high-speed processor may execute operations in a different order than is specified by the program. The correctness of the execution is guaranteed if the processor satisfies the following condition: the result of an execution is the same as if the operations had been executed in the order specified by the program. A processor satisfying this condition will be called sequential. 
Consider a computer composed of several such processors accessing a common memory. The customary approach to designing and proving the correctness of multiprocess algorithms for such a computer assumes that the following condition is satisfied: the result of any execution is the same as if the operations of all the processors were executed in some sequential order, and the operations of each individual processor appear in this sequence in the order specified by its program. A multiprocessor satisfying this condition will be called sequentially consistent. The sequentiality of each individual processor does not guarantee that the multi-processor computer is sequentially consistent. In this brief note, we describe a method of interconnecting sequential processors with memory modules that insures the sequential consistency of the resulting multiprocessor.]]></summary></entry><entry><title type="html">How Java Polymorphism Works: Static and Dynamic Dispatch in the JVM</title><link href="https://wiyi.org/java-polymorphism-in-deep.html" rel="alternate" type="text/html" title="How Java Polymorphism Works: Static and Dynamic Dispatch in the JVM" /><published>2023-02-03T00:00:00+00:00</published><updated>2023-02-03T00:00:00+00:00</updated><id>https://wiyi.org/java-polymorphism-in-deep</id><content type="html" xml:base="https://wiyi.org/java-polymorphism-in-deep.html"><![CDATA[<h2 id="0-前言">0. Preface</h2>

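<p>Polymorphism is a core concept of the object-oriented programming model, and it helps us write more flexible code. Every Java developer is probably fluent in using polymorphism, but many people understand the concept only superficially and lack a deeper understanding of the call process and the implementation behind it. This article peels back the implementation of polymorphism layer by layer to give you a thorough understanding of it.</p>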

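<p>Postscript: This article sat in my drafts for a long time before I recently dug it out and turned it into a post. I have written very little about Java over the past year. One reason is that such topics have limited depth, and I do not want to spend time competing with others where it does not matter much. Another is that there are many topics I would rather study and research, such as reading more distributed-systems papers and some CS theory. Learning real knowledge is always slow and dry, and the process is more or less painful, but once you have mastered it, it feels fulfilling and fun. I will probably write few articles of this kind in the future.</p>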

<h2 id="1多态的种类">1. Kinds of Polymorphism</h2>

<p>The term polymorphism has different meanings in different contexts. In type theory, polymorphism is divided into several kinds, of which the three most common are <code class="language-plaintext highlighter-rouge">Ad hoc polymorphism</code>, <code class="language-plaintext highlighter-rouge">Subtyping</code>, and <code class="language-plaintext highlighter-rouge">Parametric polymorphism</code>. Java implements all three. In OOP (object-oriented programming), what we usually call polymorphism is Subtyping in the type-theoretic sense.</p>

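<p>To make the polymorphic call mechanism easier to understand, this article focuses on ad hoc polymorphism and subtyping. Parametric polymorphism involves type erasure and monomorphization, so for reasons of length it is not covered in detail. Readers who want a systematic overview can consult the Wikipedia entry <a href="https://en.wikipedia.org/wiki/Polymorphism_(computer_science)">Polymorphism (computer science)</a>.</p>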

<ul>
  <li>
    <p><em><a href="https://en.wikipedia.org/wiki/Ad_hoc_polymorphism">Ad hoc polymorphism</a></em></p>

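    <p>In Java, method overloading is a form of ad hoc polymorphism. Several methods share the same name (and, in the example below, the same return type) and are told apart by the number and types of their parameters. For example, the String class has several valueOf methods that differ in their parameter type.</p>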

    <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">//code 1-1</span>
<span class="kd">public</span> <span class="kd">static</span> <span class="nc">String</span> <span class="nf">valueOf</span><span class="o">(</span><span class="kt">boolean</span> <span class="n">b</span><span class="o">)</span> <span class="o">{</span>
  <span class="k">return</span> <span class="n">b</span> <span class="o">?</span> <span class="s">"true"</span> <span class="o">:</span> <span class="s">"false"</span><span class="o">;</span>
<span class="o">}</span>
  
<span class="kd">public</span> <span class="kd">static</span> <span class="nc">String</span> <span class="nf">valueOf</span><span class="o">(</span><span class="kt">char</span> <span class="n">c</span><span class="o">)</span> <span class="o">{</span>
  <span class="kt">char</span> <span class="n">data</span><span class="o">[]</span> <span class="o">=</span> <span class="o">{</span><span class="n">c</span><span class="o">};</span>
  <span class="k">return</span> <span class="k">new</span> <span class="nf">String</span><span class="o">(</span><span class="n">data</span><span class="o">,</span> <span class="kc">true</span><span class="o">);</span>
<span class="o">}</span>
</code></pre></div>    </div>
  </li>
  <li>
    <p><em><a href="https://en.wikipedia.org/wiki/Parametric_polymorphism">Parametric polymorphism</a></em></p>

    <p>Parametric polymorphism can be understood as generic programming.</p>

    <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">//code 1-2</span>
<span class="nc">ArrayList</span><span class="o">&lt;</span><span class="nc">String</span><span class="o">&gt;</span> <span class="n">list</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">ArrayList</span><span class="o">&lt;&gt;();</span>
<span class="n">list</span><span class="o">.</span><span class="na">add</span><span class="o">(</span><span class="s">"generic programming"</span><span class="o">);</span>
</code></pre></div>    </div>
  </li>
  <li>
    <p><em><a href="https://en.wikipedia.org/wiki/Subtyping">Subtyping</a></em></p>

    <p>Subtyping is the kind of polymorphism we are most familiar with. It describes a substitutability relationship between a subtype and a supertype, as in the code below.</p>

    <div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">//code 1-3</span>
<span class="nc">CharSequence</span> <span class="n">s</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">String</span><span class="o">(</span><span class="s">"bigcat"</span><span class="o">);</span>
</code></pre></div>    </div>

    <p>String implements CharSequence, so here we say that String is a subtype of CharSequence.</p>

    <p>Subtyping resembles the subset relation (⊆) on sets. In the example above, String can be viewed as a subset of CharSequence. Because String implements CharSequence, referring to a String through a CharSequence is type-safe.</p>
  </li>
</ul>
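<p>To make the substitutability behind subtyping concrete, here is a minimal sketch. The class <code class="language-plaintext highlighter-rouge">SubtypingSketch</code> and the helper <code class="language-plaintext highlighter-rouge">twiceLength</code> are hypothetical names introduced for this example. The helper relies only on the CharSequence contract, so any subtype can be passed in.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Sketch only: any subtype of CharSequence can stand in for a CharSequence.
final class SubtypingSketch {
    // Hypothetical helper that depends only on the CharSequence interface.
    static int twiceLength(CharSequence s) {
        return 2 * s.length();
    }

    public static void main(String[] args) {
        CharSequence fromString = "bigcat";                     // String is a subtype
        CharSequence fromBuilder = new StringBuilder("kikcat"); // so is StringBuilder
        System.out.println(twiceLength(fromString));
        System.out.println(twiceLength(fromBuilder));
    }
}
</code></pre></div></div>

<p>Both calls print 12. The caller never needs to know which concrete class sits behind the CharSequence reference.</p>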

<!--more-->

<h2 id="2方法分派method-dispatch">2. Method Dispatch</h2>

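<p>Polymorphism brings many benefits, but it also introduces a problem: when a method is called, how do we decide which version to invoke? As described for overloading in the previous section, String has several valueOf methods. Likewise, CharSequence and String both declare methods with identical signatures (such as <code class="language-plaintext highlighter-rouge">length</code> and <code class="language-plaintext highlighter-rouge">charAt</code>). For each call, exactly one version must be selected. The process of selecting that version is called method dispatch. It comes in two kinds, static dispatch and dynamic dispatch, which are described in turn below.</p>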

<h3 id="21-静态分派static-dispatch">2.1 Static Dispatch</h3>

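<p>Static dispatch means that <strong>the method is selected at compile time</strong>. It mainly implements ad hoc polymorphism and parametric polymorphism (that is, overloading and generics). Consider the following code.</p>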

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">//code 2-1</span>
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">StaticDispatch</span> <span class="o">{</span>

    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">speak</span><span class="o">(</span><span class="nc">Animal</span> <span class="n">animal</span><span class="o">)</span> <span class="o">{</span>
        <span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">"undefined..."</span><span class="o">);</span>
    <span class="o">}</span>

    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">speak</span><span class="o">(</span><span class="nc">Cat</span> <span class="n">cat</span><span class="o">)</span> <span class="o">{</span>
        <span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">"miao..."</span><span class="o">);</span>
    <span class="o">}</span>

    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">speak</span><span class="o">(</span><span class="nc">Dog</span> <span class="n">dog</span><span class="o">)</span> <span class="o">{</span>
        <span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">"wang..."</span><span class="o">);</span>
    <span class="o">}</span>

    <span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="nc">String</span><span class="o">[]</span> <span class="n">args</span><span class="o">)</span> <span class="o">{</span>
        <span class="nc">Animal</span> <span class="n">cat</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Cat</span><span class="o">();</span>
        <span class="nc">Animal</span> <span class="n">dog</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Dog</span><span class="o">();</span>
        <span class="nc">StaticDispatch</span> <span class="n">speaker</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">StaticDispatch</span><span class="o">();</span>
        <span class="n">speaker</span><span class="o">.</span><span class="na">speak</span><span class="o">(</span><span class="n">cat</span><span class="o">);</span>
        <span class="n">speaker</span><span class="o">.</span><span class="na">speak</span><span class="o">(</span><span class="n">dog</span><span class="o">);</span>
        <span class="n">speaker</span><span class="o">.</span><span class="na">speak</span><span class="o">((</span><span class="nc">Cat</span><span class="o">)</span><span class="n">cat</span><span class="o">);</span>
        <span class="n">speaker</span><span class="o">.</span><span class="na">speak</span><span class="o">((</span><span class="nc">Dog</span><span class="o">)</span><span class="n">dog</span><span class="o">);</span>
    <span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>

<p>The code above prints:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># undefined...</span>
<span class="c"># undefined...</span>
<span class="c"># miao...</span>
<span class="c"># wang...</span>
</code></pre></div></div>

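<p>Readers familiar with Java method overloading can probably guess this output easily. In the code above, cat and dog are declared as Animal, which is perfectly legal thanks to subtyping. However, at <strong>compile time</strong> the compiler cannot know the actual type of cat and dog (which object in memory each reference really points to). As the output shows, at <strong>compile time</strong> it can only choose a method implementation based on the static type of the argument.</p>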

<p>To deepen our understanding of static dispatch, let's look at the bytecode of StaticDispatch. The command <code class="language-plaintext highlighter-rouge">javap -v org.moonto.java.StaticDispatch</code> prints the detailed bytecode information.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">//code 2-2</span>
<span class="c1">//Note: parts that do not affect understanding have been trimmed for readability</span>

<span class="kd">public</span> <span class="kd">class</span> <span class="nc">org</span><span class="o">.</span><span class="na">moonto</span><span class="o">.</span><span class="na">java</span><span class="o">.</span><span class="na">StaticDispatch</span>
  <span class="n">minor</span> <span class="nl">version:</span> <span class="mi">0</span>
  <span class="n">major</span> <span class="nl">version:</span> <span class="mi">52</span>
  <span class="nl">flags:</span> <span class="no">ACC_PUBLIC</span><span class="o">,</span> <span class="no">ACC_SUPER</span>
<span class="nc">Constant</span> <span class="nl">pool:</span>
   <span class="err">#</span><span class="mi">1</span> <span class="o">=</span> <span class="nc">Methodref</span>          <span class="err">#</span><span class="mi">16</span><span class="o">.</span><span class="err">#</span><span class="mi">41</span>        <span class="c1">// java/lang/Object."&lt;init&gt;":()V</span>
   <span class="err">#</span><span class="mi">2</span> <span class="o">=</span> <span class="nc">Methodref</span>          <span class="err">#</span><span class="mi">42</span><span class="o">.</span><span class="err">#</span><span class="mi">43</span>        <span class="c1">// org/moonto/java/Animal.speak:()V</span>
   <span class="err">#</span><span class="mi">3</span> <span class="o">=</span> <span class="nc">Fieldref</span>           <span class="err">#</span><span class="mi">44</span><span class="o">.</span><span class="err">#</span><span class="mi">45</span>        <span class="c1">// java/lang/System.out:Ljava/io/PrintStream;</span>
   <span class="err">#</span><span class="mi">4</span> <span class="o">=</span> <span class="nc">String</span>             <span class="err">#</span><span class="mi">46</span>            <span class="c1">// miao...</span>
   <span class="err">#</span><span class="mi">5</span> <span class="o">=</span> <span class="nc">Methodref</span>          <span class="err">#</span><span class="mi">47</span><span class="o">.</span><span class="err">#</span><span class="mi">48</span>        <span class="c1">// java/io/PrintStream.println:(Ljava/lang/String;)V</span>
   <span class="err">#</span><span class="mi">6</span> <span class="o">=</span> <span class="nc">String</span>             <span class="err">#</span><span class="mi">49</span>            <span class="c1">// wang...</span>
   <span class="err">#</span><span class="mi">7</span> <span class="o">=</span> <span class="nc">Class</span>              <span class="err">#</span><span class="mi">50</span>            <span class="c1">// org/moonto/java/Cat</span>
   <span class="err">#</span><span class="mi">8</span> <span class="o">=</span> <span class="nc">Methodref</span>          <span class="err">#</span><span class="mi">7</span><span class="o">.</span><span class="err">#</span><span class="mi">41</span>         <span class="c1">// org/moonto/java/Cat."&lt;init&gt;":()V</span>
   <span class="err">#</span><span class="mi">9</span> <span class="o">=</span> <span class="nc">Class</span>              <span class="err">#</span><span class="mi">51</span>            <span class="c1">// org/moonto/java/Dog</span>
  <span class="err">#</span><span class="mi">10</span> <span class="o">=</span> <span class="nc">Methodref</span>          <span class="err">#</span><span class="mi">9</span><span class="o">.</span><span class="err">#</span><span class="mi">41</span>         <span class="c1">// org/moonto/java/Dog."&lt;init&gt;":()V</span>
  <span class="err">#</span><span class="mi">11</span> <span class="o">=</span> <span class="nc">Class</span>              <span class="err">#</span><span class="mi">52</span>            <span class="c1">// org/moonto/java/StaticDispatch</span>
  <span class="err">#</span><span class="mi">12</span> <span class="o">=</span> <span class="nc">Methodref</span>          <span class="err">#</span><span class="mi">11</span><span class="o">.</span><span class="err">#</span><span class="mi">41</span>        <span class="c1">// org/moonto/java/StaticDispatch."&lt;init&gt;":()V</span>
  <span class="err">#</span><span class="mi">13</span> <span class="o">=</span> <span class="nc">Methodref</span>          <span class="err">#</span><span class="mi">11</span><span class="o">.</span><span class="err">#</span><span class="mi">53</span>        <span class="c1">// org/moonto/java/StaticDispatch.speak:(Lorg/moonto/java/Animal;)V</span>
  <span class="err">#</span><span class="mi">14</span> <span class="o">=</span> <span class="nc">Methodref</span>          <span class="err">#</span><span class="mi">11</span><span class="o">.</span><span class="err">#</span><span class="mi">54</span>        <span class="c1">// org/moonto/java/StaticDispatch.speak:(Lorg/moonto/java/Cat;)V</span>
  <span class="err">#</span><span class="mi">15</span> <span class="o">=</span> <span class="nc">Methodref</span>          <span class="err">#</span><span class="mi">11</span><span class="o">.</span><span class="err">#</span><span class="mi">55</span>        <span class="c1">// org/moonto/java/StaticDispatch.speak:(Lorg/moonto/java/Dog;)V</span>
<span class="o">{</span>
  <span class="kd">public</span> <span class="n">org</span><span class="o">.</span><span class="na">moonto</span><span class="o">.</span><span class="na">java</span><span class="o">.</span><span class="na">StaticDispatch</span><span class="o">();</span>
    <span class="nl">descriptor:</span> <span class="o">()</span><span class="no">V</span>
    <span class="nl">flags:</span> <span class="no">ACC_PUBLIC</span>
    <span class="nl">Code:</span>
      <span class="n">stack</span><span class="o">=</span><span class="mi">1</span><span class="o">,</span> <span class="n">locals</span><span class="o">=</span><span class="mi">1</span><span class="o">,</span> <span class="n">args_size</span><span class="o">=</span><span class="mi">1</span>
         <span class="mi">0</span><span class="o">:</span> <span class="n">aload_0</span>
         <span class="mi">1</span><span class="o">:</span> <span class="n">invokespecial</span> <span class="err">#</span><span class="mi">1</span>                  <span class="c1">// Method java/lang/Object."&lt;init&gt;":()V</span>
         <span class="mi">4</span><span class="o">:</span> <span class="k">return</span>
      <span class="nl">LineNumberTable:</span>
        <span class="n">line</span> <span class="mi">3</span><span class="o">:</span> <span class="mi">0</span>
      <span class="nl">LocalVariableTable:</span>
        <span class="nc">Start</span>  <span class="nc">Length</span>  <span class="nc">Slot</span>  <span class="nc">Name</span>   <span class="nc">Signature</span>
            <span class="mi">0</span>       <span class="mi">5</span>     <span class="mi">0</span>  <span class="k">this</span>   <span class="nc">Lorg</span><span class="o">/</span><span class="n">moonto</span><span class="o">/</span><span class="n">java</span><span class="o">/</span><span class="nc">StaticDispatch</span><span class="o">;</span>

  <span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="n">java</span><span class="o">.</span><span class="na">lang</span><span class="o">.</span><span class="na">String</span><span class="o">[]);</span>
    <span class="nl">descriptor:</span> <span class="o">([</span><span class="nc">Ljava</span><span class="o">/</span><span class="n">lang</span><span class="o">/</span><span class="nc">String</span><span class="o">;)</span><span class="no">V</span>
    <span class="nl">flags:</span> <span class="no">ACC_PUBLIC</span><span class="o">,</span> <span class="no">ACC_STATIC</span>
    <span class="nl">Code:</span>
      <span class="n">stack</span><span class="o">=</span><span class="mi">2</span><span class="o">,</span> <span class="n">locals</span><span class="o">=</span><span class="mi">4</span><span class="o">,</span> <span class="n">args_size</span><span class="o">=</span><span class="mi">1</span>
         <span class="mi">0</span><span class="o">:</span> <span class="k">new</span>           <span class="err">#</span><span class="mi">7</span>                  <span class="c1">// class org/moonto/java/Cat</span>
         <span class="mi">3</span><span class="o">:</span> <span class="n">dup</span>
         <span class="mi">4</span><span class="o">:</span> <span class="n">invokespecial</span> <span class="err">#</span><span class="mi">8</span>                  <span class="c1">// Method org/moonto/java/Cat."&lt;init&gt;":()V</span>
         <span class="mi">7</span><span class="o">:</span> <span class="n">astore_1</span>
         <span class="mi">8</span><span class="o">:</span> <span class="k">new</span>           <span class="err">#</span><span class="mi">9</span>                  <span class="c1">// class org/moonto/java/Dog</span>
        <span class="mi">11</span><span class="o">:</span> <span class="n">dup</span>
        <span class="mi">12</span><span class="o">:</span> <span class="n">invokespecial</span> <span class="err">#</span><span class="mi">10</span>                 <span class="c1">// Method org/moonto/java/Dog."&lt;init&gt;":()V</span>
        <span class="mi">15</span><span class="o">:</span> <span class="n">astore_2</span>
        <span class="mi">16</span><span class="o">:</span> <span class="k">new</span>           <span class="err">#</span><span class="mi">11</span>                 <span class="c1">// class org/moonto/java/StaticDispatch</span>
        <span class="mi">19</span><span class="o">:</span> <span class="n">dup</span>
        <span class="mi">20</span><span class="o">:</span> <span class="n">invokespecial</span> <span class="err">#</span><span class="mi">12</span>                 <span class="c1">// Method "&lt;init&gt;":()V</span>
        <span class="mi">23</span><span class="o">:</span> <span class="n">astore_3</span>
        <span class="mi">24</span><span class="o">:</span> <span class="n">aload_3</span>
        <span class="mi">25</span><span class="o">:</span> <span class="n">aload_1</span>
        <span class="mi">26</span><span class="o">:</span> <span class="n">invokevirtual</span> <span class="err">#</span><span class="mi">13</span>                 <span class="c1">// Method speak:(Lorg/moonto/java/Animal;)V</span>
        <span class="mi">29</span><span class="o">:</span> <span class="n">aload_3</span>
        <span class="mi">30</span><span class="o">:</span> <span class="n">aload_2</span>
        <span class="mi">31</span><span class="o">:</span> <span class="n">invokevirtual</span> <span class="err">#</span><span class="mi">13</span>                 <span class="c1">// Method speak:(Lorg/moonto/java/Animal;)V</span>
        <span class="mi">34</span><span class="o">:</span> <span class="n">aload_3</span>
        <span class="mi">35</span><span class="o">:</span> <span class="n">aload_1</span>
        <span class="mi">36</span><span class="o">:</span> <span class="n">checkcast</span>     <span class="err">#</span><span class="mi">7</span>                  <span class="c1">// class org/moonto/java/Cat</span>
        <span class="mi">39</span><span class="o">:</span> <span class="n">invokevirtual</span> <span class="err">#</span><span class="mi">14</span>                 <span class="c1">// Method speak:(Lorg/moonto/java/Cat;)V</span>
        <span class="mi">42</span><span class="o">:</span> <span class="n">aload_3</span>
        <span class="mi">43</span><span class="o">:</span> <span class="n">aload_2</span>
        <span class="mi">44</span><span class="o">:</span> <span class="n">checkcast</span>     <span class="err">#</span><span class="mi">9</span>                  <span class="c1">// class org/moonto/java/Dog</span>
        <span class="mi">47</span><span class="o">:</span> <span class="n">invokevirtual</span> <span class="err">#</span><span class="mi">15</span>                 <span class="c1">// Method speak:(Lorg/moonto/java/Dog;)V</span>
        <span class="mi">50</span><span class="o">:</span> <span class="k">return</span>
      <span class="nl">LineNumberTable:</span>
      <span class="nl">LocalVariableTable:</span>
        <span class="nc">Start</span>  <span class="nc">Length</span>  <span class="nc">Slot</span>  <span class="nc">Name</span>   <span class="nc">Signature</span>
            <span class="mi">0</span>      <span class="mi">51</span>     <span class="mi">0</span>  <span class="n">args</span>   <span class="o">[</span><span class="nc">Ljava</span><span class="o">/</span><span class="n">lang</span><span class="o">/</span><span class="nc">String</span><span class="o">;</span>
            <span class="mi">8</span>      <span class="mi">43</span>     <span class="mi">1</span>   <span class="n">cat</span>   <span class="nc">Lorg</span><span class="o">/</span><span class="n">moonto</span><span class="o">/</span><span class="n">java</span><span class="o">/</span><span class="nc">Animal</span><span class="o">;</span>
           <span class="mi">16</span>      <span class="mi">35</span>     <span class="mi">2</span>   <span class="n">dog</span>   <span class="nc">Lorg</span><span class="o">/</span><span class="n">moonto</span><span class="o">/</span><span class="n">java</span><span class="o">/</span><span class="nc">Animal</span><span class="o">;</span>
           <span class="mi">24</span>      <span class="mi">27</span>     <span class="mi">3</span> <span class="n">speaker</span>   <span class="nc">Lorg</span><span class="o">/</span><span class="n">moonto</span><span class="o">/</span><span class="n">java</span><span class="o">/</span><span class="nc">StaticDispatch</span><span class="o">;</span>
<span class="o">}</span>

</code></pre></div></div>

<p>Look at the bytecode in <code class="language-plaintext highlighter-rouge">code 2-2</code> and note the instructions at offsets 26 and 31 of the main method. Both are <code class="language-plaintext highlighter-rouge">invokevirtual #13</code>, where <code class="language-plaintext highlighter-rouge">#13</code> refers to the constant at index 13 of the constant pool, namely the method signature of speak(Animal): <code class="language-plaintext highlighter-rouge">speak:(Lorg/moonto/java/Animal;)V</code>.</p>

<p>同样，再观察编号39和47的指令，分别为<code class="language-plaintext highlighter-rouge">invokevirtual #14</code>和<code class="language-plaintext highlighter-rouge">invokevirtual #15</code>，而常量池中14和15代表的是<code class="language-plaintext highlighter-rouge">speak:(Lorg/moonto/java/Cat;)V</code>、<code class="language-plaintext highlighter-rouge">speak:(Lorg/moonto/java/Dog;)V</code>。</p>

<p><code class="language-plaintext highlighter-rouge">code 2-2</code>的字节码能看出调用重载方法在编译期就已经选择好了某个版本的方法。因此在上面的例子中，cat和dog的实际类型并不会影响方法的选择，编译器只能根据它定义的类型进行方法选择。</p>

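<h3 id="22-动态分派">2.2 Dynamic dispatch</h3>

<p>In contrast to static dispatch, dynamic dispatch means that <strong>method selection happens at runtime</strong>, because some information cannot be determined at compile time (from the compiler's point of view a reference is just a symbol, and it is only resolved to an actual reference at runtime). Consider the following code:</p>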

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">//code 2-3</span>
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">Animal</span> <span class="o">{</span>
  <span class="kd">public</span> <span class="kt">void</span> <span class="nf">speak</span><span class="o">()</span> <span class="o">{</span>
    <span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">"undefined"</span><span class="o">);</span>
  <span class="o">}</span>
  
  <span class="kd">public</span> <span class="kt">void</span> <span class="nf">run</span><span class="o">()</span> <span class="o">{</span>
    
  <span class="o">}</span>
<span class="o">}</span>

<span class="kd">public</span> <span class="kd">class</span> <span class="nc">Dog</span> <span class="kd">extends</span> <span class="nc">Animal</span><span class="o">{</span>
    <span class="nd">@Override</span>
    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">speak</span><span class="o">()</span> <span class="o">{</span>
        <span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">"wang.."</span><span class="o">);</span>
    <span class="o">}</span>
<span class="o">}</span>

<span class="kd">public</span> <span class="kd">class</span> <span class="nc">Cat</span> <span class="kd">extends</span> <span class="nc">Animal</span><span class="o">{</span>
    <span class="nd">@Override</span>
    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">speak</span><span class="o">()</span> <span class="o">{</span>
        <span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">"miao.."</span><span class="o">);</span>
    <span class="o">}</span>
<span class="o">}</span>

<span class="kd">public</span> <span class="kd">class</span> <span class="nc">DynamicDispatch</span> <span class="o">{</span>

  <span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="nc">String</span><span class="o">[]</span> <span class="n">args</span><span class="o">)</span> <span class="o">{</span>
    <span class="nc">DynamicDispatch</span> <span class="n">dispatcher</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">DynamicDispatch</span><span class="o">();</span>
    <span class="n">dispatcher</span><span class="o">.</span><span class="na">doDispatch</span><span class="o">(</span><span class="k">new</span> <span class="nc">Cat</span><span class="o">());</span>
    <span class="n">dispatcher</span><span class="o">.</span><span class="na">doDispatch</span><span class="o">(</span><span class="k">new</span> <span class="nc">Dog</span><span class="o">());</span>
  <span class="o">}</span>

  <span class="kd">public</span> <span class="kt">void</span> <span class="nf">doDispatch</span><span class="o">(</span><span class="nc">Animal</span> <span class="n">animal</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">animal</span><span class="o">.</span><span class="na">speak</span><span class="o">();</span>
  <span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>

<p>Running the code above prints:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#miao..</span>
<span class="c">#wang..</span>
</code></pre></div></div>

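<p>Any reader with an OOP background can predict this output. If the method were chosen by static type, as our understanding of static dispatch would suggest, the output would be "undefined". But in this example the method is clearly selected by the <strong>actual type</strong> of the reference. We call this kind of dispatch logic dynamic dispatch.</p>

<p>To understand this more deeply, let's again look at the bytecode of DynamicDispatch:</p>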

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">//code 2-4</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">doDispatch</span><span class="o">(</span><span class="n">org</span><span class="o">.</span><span class="na">moonto</span><span class="o">.</span><span class="na">java</span><span class="o">.</span><span class="na">Animal</span><span class="o">);</span>
  <span class="nl">descriptor:</span> <span class="o">(</span><span class="nc">Lorg</span><span class="o">/</span><span class="n">moonto</span><span class="o">/</span><span class="n">java</span><span class="o">/</span><span class="nc">Animal</span><span class="o">;)</span><span class="no">V</span>
  <span class="nl">flags:</span> <span class="no">ACC_PUBLIC</span>
  <span class="nl">Code:</span>
    <span class="n">stack</span><span class="o">=</span><span class="mi">1</span><span class="o">,</span> <span class="n">locals</span><span class="o">=</span><span class="mi">2</span><span class="o">,</span> <span class="n">args_size</span><span class="o">=</span><span class="mi">2</span>
       <span class="mi">0</span><span class="o">:</span> <span class="n">aload_1</span>
       <span class="mi">1</span><span class="o">:</span> <span class="n">invokevirtual</span> <span class="err">#</span><span class="mi">9</span>                  <span class="c1">// Method org/moonto/java/Animal.speak:()V</span>
       <span class="mi">4</span><span class="o">:</span> <span class="k">return</span>
    <span class="nl">LineNumberTable:</span>
      <span class="n">line</span> <span class="mi">11</span><span class="o">:</span> <span class="mi">0</span>
      <span class="n">line</span> <span class="mi">12</span><span class="o">:</span> <span class="mi">4</span>
    <span class="nl">LocalVariableTable:</span>
      <span class="nc">Start</span>  <span class="nc">Length</span>  <span class="nc">Slot</span>  <span class="nc">Name</span>   <span class="nc">Signature</span>
        <span class="mi">0</span>       <span class="mi">5</span>     <span class="mi">0</span>  <span class="k">this</span>   <span class="nc">Lorg</span><span class="o">/</span><span class="n">moonto</span><span class="o">/</span><span class="n">java</span><span class="o">/</span><span class="nc">DynamicDispatch</span><span class="o">;</span>
        <span class="mi">0</span>       <span class="mi">5</span>     <span class="mi">1</span> <span class="n">animal</span>   <span class="nc">Lorg</span><span class="o">/</span><span class="n">moonto</span><span class="o">/</span><span class="n">java</span><span class="o">/</span><span class="nc">Animal</span><span class="o">;</span>
</code></pre></div></div>

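<p>The call instruction in the bytecode above is <code class="language-plaintext highlighter-rouge">invokevirtual #9</code>, and the trailing comment shows that the referenced method is Animal's speak. So why are the speak methods of Cat and Dog the ones that end up being called? Because the dispatch logic is defined by the <code class="language-plaintext highlighter-rouge">invokevirtual</code> instruction itself.</p>

<p>The Java Virtual Machine Specification defines the behavior of <code class="language-plaintext highlighter-rouge">invokevirtual</code> as follows:</p>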

<blockquote>
  <p>Let C be the class of <em>objectref</em>. The actual method to be invoked is selected by the following lookup procedure:</p>

  <ul>
    <li>If C contains a declaration for an instance method <code class="language-plaintext highlighter-rouge">m</code> that overrides (<a href="https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-5.html#jvms-5.4.5">§5.4.5</a>) the resolved method, then <code class="language-plaintext highlighter-rouge">m</code> is the method to be invoked, and the lookup procedure terminates.</li>
    <li>Otherwise, if C has a superclass, this same lookup procedure is performed recursively using the direct superclass of C; the method to be invoked is the result of the recursive invocation of this lookup procedure.</li>
    <li>Otherwise, an <code class="language-plaintext highlighter-rouge">AbstractMethodError</code> is raised.</li>
  </ul>
</blockquote>

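<p>In plainer terms, the passage says roughly the following:</p>

<p>Let C be the class of the object that objectref refers to. The method that is actually invoked is found by the following lookup:</p>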

<ol>
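  <li>If C declares a method <code class="language-plaintext highlighter-rouge">m</code> that overrides the resolved method, that method is invoked and the lookup ends.</li>
  <li>If no such method is found in the first step and C has a superclass, the first step is repeated on that superclass.</li>
  <li>If nothing is found at all, an AbstractMethodError is thrown.</li>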
</ol>
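
<p>The three steps above can be sketched in a few lines of Java. This is only a toy model under stated assumptions: <code class="language-plaintext highlighter-rouge">ClassInfo</code> and <code class="language-plaintext highlighter-rouge">select</code> are hypothetical names invented for this illustration, Animal is treated as the root class for brevity, and the recursion of the specification is written as a loop.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public class LookupSketch {

    // Hypothetical, simplified model of a loaded class: its name, its direct
    // superclass (null for the root), and the methods it declares itself.
    static final class ClassInfo {
        final String name;
        final ClassInfo superclass;
        final String[] declaredMethods;

        ClassInfo(String name, ClassInfo superclass, String... declaredMethods) {
            this.name = name;
            this.superclass = superclass;
            this.declaredMethods = declaredMethods;
        }
    }

    // Steps 1 to 3 of the lookup, written as a loop instead of recursion.
    static String select(ClassInfo c, String resolvedMethod) {
        for (ClassInfo k = c; k != null; k = k.superclass) {
            for (String m : k.declaredMethods) {
                if (m.equals(resolvedMethod)) {
                    return k.name + "." + m; // step 1: declared in k, stop here
                }
            }
            // step 2: not declared in k, continue with its direct superclass
        }
        throw new AbstractMethodError(resolvedMethod); // step 3: nothing found
    }

    public static void main(String[] args) {
        ClassInfo animal = new ClassInfo("Animal", null, "speak:()V", "run:()V");
        ClassInfo dog = new ClassInfo("Dog", animal, "speak:()V");

        System.out.println(select(dog, "speak:()V")); // Dog.speak:()V
        System.out.println(select(dog, "run:()V"));   // Animal.run:()V
    }
}
</code></pre></div></div>

<p>Dog declares speak itself, so the lookup stops at Dog; run is not declared in Dog, so the lookup moves up and finds it in Animal.</p>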

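<p>The definition of <code class="language-plaintext highlighter-rouge">invokevirtual</code> shows that the polymorphic (subtyping) calls we use every day are provided by the logic of this instruction itself.</p>

<p>Note that although the JVM specification describes the lookup for <code class="language-plaintext highlighter-rouge">invokevirtual</code> this way, an actual JVM implementation only has to provide the behavior the specification requires. It does not have to perform the lookup in this rigid way.</p>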

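<h3 id="23-单分派和多分派">2.3 Single dispatch and multiple dispatch</h3>

<p>In the dispatch logic above, two factors affect the result of method dispatch: <strong>the type of the method receiver (the object the method is called on)</strong> and <strong>the types of the arguments</strong>.</p>

<p><strong>Dispatch based only on the actual type of the method receiver is called single dispatch; dispatch based on a combination of the receiver, the arguments and so on is called multiple dispatch</strong>. In Java, method calls are single dispatch (Note 2): the JVM dispatches only on the actual type of the receiver, while the method to call (its signature) has already been fixed at compile time, so the actual types of the arguments can no longer influence method selection at runtime.</p>

<p><strong>Note 1</strong>: In OOP, calling a method is described as sending a message to an object. The object being called is the receiver of that message, hence the term method receiver.</p>

<p><strong>Note 2</strong>: Whether Java's static dispatch counts as multiple dispatch is debatable. In the book 《深入理解Java虚拟机》, the author considers static dispatch to be multiple dispatch, presumably because method selection during static dispatch does depend on static types. Personally I find this interpretation open to question. For the reader, understanding the concept and why the disagreement exists matters more.</p>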
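
<p>A small example may make single dispatch more concrete. The classes below are made up for this illustration (they are not the ones from the earlier listings): the receiver's actual type decides which override runs, while the argument's static type has already decided which overload is called.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public class SingleDispatchDemo {
    static class Animal {}
    static class Cat extends Animal {}

    static class Speaker {
        String speak(Animal a) { return "Speaker.speak(Animal)"; }
        String speak(Cat c)    { return "Speaker.speak(Cat)"; }
    }

    static class LoudSpeaker extends Speaker {
        @Override String speak(Animal a) { return "LoudSpeaker.speak(Animal)"; }
        @Override String speak(Cat c)    { return "LoudSpeaker.speak(Cat)"; }
    }

    static String call() {
        Speaker speaker = new LoudSpeaker(); // static type Speaker, actual type LoudSpeaker
        Animal animal = new Cat();           // static type Animal, actual type Cat
        // Compile time: overload speak(Animal) is chosen from the static type of animal.
        // Runtime: the override in LoudSpeaker is chosen from the actual type of speaker.
        return speaker.speak(animal);
    }

    public static void main(String[] args) {
        System.out.println(call()); // LoudSpeaker.speak(Animal)
    }
}
</code></pre></div></div>

<p>A language with multiple dispatch would also take the actual type of the argument into account and end up in speak(Cat). Java does not, which is why the Visitor pattern is commonly used to simulate double dispatch.</p>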

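<h2 id="3-jvm运行时内存区域">3. JVM runtime memory areas</h2>

<p>The previous section covered part of how polymorphic calls work, but how methods are actually stored in memory is still quite abstract. To fully understand method dispatch, we need to understand how methods are laid out in memory. Before getting to that, let's first review the runtime memory areas of the Java virtual machine.</p>

<h3 id="31-内存区域">3.1 Memory areas</h3>

<p>The Java virtual machine divides memory into several areas: the method area, heap, java stacks, pc registers and native method stacks. Each holds a different kind of data.</p>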

<p><img src="https://user-images.githubusercontent.com/3600657/175875845-c9a742f8-b393-45f9-ad6b-0c921f121288.png" alt="jvm's runtime memory area" /></p>

<p>The diagram above will look familiar to anyone who knows the JVM. In this article we focus on three areas: the method area, the heap and the java stacks.</p>

<ul>
  <li>
    <p>Stacks (Java Virtual Machine Stacks)</p>

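    <p>Every JVM thread has its own private block of memory called the stack, which lives as long as the thread does. It mainly stores stack frames. Later in this article we will see that before a method is invoked, the object reference (objectref) is first pushed onto the frame's operand stack.</p>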
  </li>
  <li>
    <p>Heap</p>

    <p>The heap is a memory area shared by all threads, and all objects (class instances) are allocated here. Later we will see how an ordinary object is laid out in this area.</p>
  </li>
  <li>
    <p>Method Area</p>

    <p>The method area is also shared by all threads and mainly stores class information, such as the constant pool and the method data of each class.</p>
  </li>
</ul>

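<h3 id="32-类的结构和类加载">3.2 Class structure and class loading</h3>

<p>Java source code is compiled into class files, which contain the constants, methods and other information we defined. The structure is as follows:</p>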

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ClassFile <span class="o">{</span>
    u4             magic<span class="p">;</span>
    u2             minor_version<span class="p">;</span>
    u2             major_version<span class="p">;</span>
    u2             constant_pool_count<span class="p">;</span>
    cp_info        constant_pool[constant_pool_count-1]<span class="p">;</span>
    u2             access_flags<span class="p">;</span>
    u2             this_class<span class="p">;</span>
    u2             super_class<span class="p">;</span>
    u2             interfaces_count<span class="p">;</span>
    u2             interfaces[interfaces_count]<span class="p">;</span>
    u2             fields_count<span class="p">;</span>
    field_info     fields[fields_count]<span class="p">;</span>
    u2             methods_count<span class="p">;</span>
    method_info    methods[methods_count]<span class="p">;</span>
    u2             attributes_count<span class="p">;</span>
    attribute_info attributes[attributes_count]<span class="p">;</span>
<span class="o">}</span>
</code></pre></div></div>

<p>When a class is loaded by a ClassLoader, its information is stored in the method area in an implementation-specific format.</p>

<p><img src="https://user-images.githubusercontent.com/3600657/175875859-4d4a86ad-2017-4e82-8054-464259b322b4.png" alt="jvm's classloading" /></p>

<p><img src="https://user-images.githubusercontent.com/3600657/175875863-b87cad03-1369-4685-a5bb-251ca6a2a9fa.png" alt="image-20220626131609198" /></p>

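<h2 id="4-方法的内存布局">4. Memory layout of methods</h2>

<p>The Java Virtual Machine Specification does not dictate how an implementation lays out methods in memory; it leaves those details to the VM implementer. An implementation only has to preserve the semantics defined for the <code class="language-plaintext highlighter-rouge">invokevirtual</code> instruction.</p>

<h3 id="41-虚方法virtual-method">4.1 Virtual methods</h3>

<p>What <code class="language-plaintext highlighter-rouge">invokevirtual</code> actually calls are virtual methods, so before describing the method layout we need to look at what a virtual method is.</p>

<p>Virtual methods are also called virtual functions, a concept that readers who have studied C++ will know well. Put simply, virtual methods are methods that can be inherited and overridden. Because of this, the version to call cannot be determined at compile time and has to be dispatched dynamically at runtime.</p>

<p>In Java, methods that are not static, final or private are virtual methods, and their version cannot be determined at compile time. Conversely, methods marked static, final or private cannot be overridden, so their version can be determined when the class is loaded and no dynamic dispatch is needed. For example, a call to a private method compiled for Java 10 or earlier uses <code class="language-plaintext highlighter-rouge">invokespecial</code> rather than <code class="language-plaintext highlighter-rouge">invokevirtual</code>. Newer javac versions may emit <code class="language-plaintext highlighter-rouge">invokevirtual</code> for private methods, but those methods still cannot be overridden.</p>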
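
<p>The difference is easy to observe with a static method. In the sketch below (the class and method names are invented for this example), the subclass declares a static method with the same signature. That hides the superclass method instead of overriding it, so the call is bound to the static type of the reference.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public class VirtualVsStatic {
    static class Animal {
        static String kind() { return "Animal.kind"; } // static: not virtual
        String speak() { return "Animal.speak"; }      // virtual
    }

    static class Dog extends Animal {
        static String kind() { return "Dog.kind"; }    // hides Animal.kind, no override
        @Override String speak() { return "Dog.speak"; }
    }

    static String[] demo() {
        Animal a = new Dog();
        // Calling a static method through an instance is discouraged style;
        // it is done here only to show that the static type Animal decides.
        return new String[] { a.speak(), a.kind() };
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println(result[0]); // Dog.speak
        System.out.println(result[1]); // Animal.kind
    }
}
</code></pre></div></div>

<p>The instance method is dispatched on the actual type Dog, while the static call is fixed to Animal.kind at compile time.</p>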

<p>For more on virtual methods, see the <a href="https://en.wikipedia.org/wiki/Virtual_function">Wikipedia page</a>.</p>

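<h3 id="42-虚方法表virtual-method-table">4.2 Virtual method table</h3>

<p>The virtual method table goes by many names, such as virtual function table or vtable. For convenience, we use vtable below.</p>

<p>Recall the definition of <code class="language-plaintext highlighter-rouge">invokevirtual</code>: it involves a recursive lookup. Dynamic dispatch is an extremely frequent operation, and searching recursively on every dispatch would be very inefficient. With the vtable approach, a method table is kept in memory for every class. Think of it as an array of pointers, each pointing to the data of a concrete method.</p>

<p>The vtable records the methods a class declares itself as well as the methods it inherits from its superclass.</p>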

<p><img src="https://user-images.githubusercontent.com/3600657/175875872-9307132b-cba8-4b0b-8db2-021635fee1af.png" alt="virtual method table" /></p>

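<p>For a method inherited from the superclass, if the subclass overrides it, the pointer in the method table points to the subclass's method data; if the subclass does not override it, the pointer points to the superclass's method data.</p>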

<p><img src="https://user-images.githubusercontent.com/3600657/175875890-46a24b7d-4e8a-4257-af8a-8f3b1d03e246.png" alt="virtual method table" /></p>

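<p>The figure above shows Dog's method table. The dark gray entries are methods inherited from Object; Dog does not override them, so the table holds pointers to Object's method data. The light gray run entry is inherited from Animal and is not overridden either, so it holds a pointer to Animal's method data. The white entry is also inherited from Animal, but Dog overrides it, so it points to Dog's method data.</p>

<p>Note that the vtable only records virtual methods. For non-virtual methods the symbolic reference can be turned into a direct reference during class loading, so they do not need to be recorded in the vtable. In other words, non-virtual methods need no dynamic dispatch.</p>

<p>The method table replaces the recursive lookup of <code class="language-plaintext highlighter-rouge">invokevirtual</code> with a table lookup, which improves performance while still meeting the requirements of the <code class="language-plaintext highlighter-rouge">invokevirtual</code> instruction.</p>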
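
<p>The idea can be sketched in plain Java. This is only a toy model: <code class="language-plaintext highlighter-rouge">MethodData</code>, the slot constants and the table-building helpers are made up for illustration and say nothing about how HotSpot actually stores its tables. It does show the two properties described above: a subclass table starts as a copy of the superclass table, and dispatch becomes a single indexed load.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code>public class VtableSketch {
    // Hypothetical stand-in for "method data": something that can be invoked.
    interface MethodData { String invoke(); }

    // Each virtual method gets a fixed slot that is shared by the whole hierarchy.
    static final int SPEAK = 0;
    static final int RUN = 1;

    static MethodData[] animalVtable() {
        MethodData[] table = new MethodData[2];
        table[SPEAK] = () -> "Animal.speak";
        table[RUN] = () -> "Animal.run";
        return table;
    }

    // The subclass table starts as a copy of the superclass table;
    // only the overridden slots are repointed.
    static MethodData[] dogVtable() {
        MethodData[] table = animalVtable().clone();
        table[SPEAK] = () -> "Dog.speak";
        return table;
    }

    // Dispatch is one indexed load plus a call, with no walk up the hierarchy.
    static String invokeVirtual(MethodData[] vtable, int slot) {
        return vtable[slot].invoke();
    }

    public static void main(String[] args) {
        MethodData[] dog = dogVtable();
        System.out.println(invokeVirtual(dog, SPEAK)); // Dog.speak
        System.out.println(invokeVirtual(dog, RUN));   // Animal.run
    }
}
</code></pre></div></div>

<p>Because run keeps the same slot index in every table, the caller does not need to know the receiver's actual class to find the right entry.</p>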

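<h2 id="5-对象的结构">5. Object structure</h2>
<p>With the memory layout of methods understood, the last step is to see how an object finds the data in its method table. When the Java virtual machine executes a new instruction, it loads the class if it has not been loaded yet and then allocates memory for the object. The object has the following format:</p>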

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#The layout of regular objects in memory</span>

+---------------+         +---------------+
| Object Header |  <span class="nt">---</span><span class="o">&gt;&gt;</span>  |   mark word   |
+---------------+         +---------------+
| Instance Data |         | klass pointer | <span class="nt">---</span><span class="o">&gt;&gt;</span> ptr to special structure
+---------------+         +---------------+
|    Padding    |
+---------------+

</code></pre></div></div>

<p>An object in memory consists of three main parts: Header, Data and Padding.</p>

<ul>
  <li>
    <p>Object Header</p>

    <ul>
      <li>
        <p>mark word</p>

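        <p>Readers familiar with Java's multithreading and garbage collection will recognize the mark word. It stores the object's GC state, lock state, hashcode and similar information.</p>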
      </li>
      <li>
        <p>klass pointer</p>

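        <p>The klass pointer points to another structure that describes the layout of the object's methods. Put simply, you can think of it as a pointer that leads to the vtable.</p>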
      </li>
    </ul>
  </li>
  <li>
    <p>Instance Data</p>

    <p>This part holds the object's fields, including those inherited from its superclasses.</p>
  </li>
  <li>
    <p>Padding</p>

    <p>This part is used for memory alignment. Because the size of the Instance Data varies, extra bytes are added to align the object.</p>
  </li>
</ul>

<p>Below is a more complete diagram of the object structure:</p>

<p><img src="https://user-images.githubusercontent.com/3600657/175878256-2690842b-63ad-47b5-b5d6-8374306399a1.jpg" alt="" /></p>

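<p>Look again at the bytecode in <code class="language-plaintext highlighter-rouge">code 2-4</code>: before <code class="language-plaintext highlighter-rouge">invokevirtual</code> runs, an aload_1 instruction is executed. It loads the variable in slot 1 of the frame's local variables and pushes it onto the operand stack.</p>

<p>In <code class="language-plaintext highlighter-rouge">code 2-4</code>, slot 1 holds the parameter animal, whose declared type is <code class="language-plaintext highlighter-rouge">Lorg/moonto/java/Animal;</code>. At runtime this slot contains a reference to the actual object (objectref). When invokevirtual executes, the JVM takes this objectref from the operand stack and resolves the symbolic method reference <code class="language-plaintext highlighter-rouge">#9</code> from the constant pool. Because the object's header contains a pointer to its class, the JVM can find the object's actual class and call the method according to the dispatch logic of <code class="language-plaintext highlighter-rouge">invokevirtual</code>.</p>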

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">//duplicate code 2-4</span>
<span class="kd">public</span> <span class="kt">void</span> <span class="nf">doDispatch</span><span class="o">(</span><span class="n">org</span><span class="o">.</span><span class="na">moonto</span><span class="o">.</span><span class="na">java</span><span class="o">.</span><span class="na">Animal</span><span class="o">);</span>
  <span class="nl">descriptor:</span> <span class="o">(</span><span class="nc">Lorg</span><span class="o">/</span><span class="n">moonto</span><span class="o">/</span><span class="n">java</span><span class="o">/</span><span class="nc">Animal</span><span class="o">;)</span><span class="no">V</span>
  <span class="nl">flags:</span> <span class="no">ACC_PUBLIC</span>
  <span class="nl">Code:</span>
    <span class="n">stack</span><span class="o">=</span><span class="mi">1</span><span class="o">,</span> <span class="n">locals</span><span class="o">=</span><span class="mi">2</span><span class="o">,</span> <span class="n">args_size</span><span class="o">=</span><span class="mi">2</span>
       <span class="mi">0</span><span class="o">:</span> <span class="n">aload_1</span>
       <span class="mi">1</span><span class="o">:</span> <span class="n">invokevirtual</span> <span class="err">#</span><span class="mi">9</span>                  <span class="c1">// Method org/moonto/java/Animal.speak:()V</span>
       <span class="mi">4</span><span class="o">:</span> <span class="k">return</span>
    <span class="nl">LocalVariableTable:</span>
      <span class="nc">Start</span>  <span class="nc">Length</span>  <span class="nc">Slot</span>  <span class="nc">Name</span>   <span class="nc">Signature</span>
        <span class="mi">0</span>       <span class="mi">5</span>     <span class="mi">0</span>  <span class="k">this</span>   <span class="nc">Lorg</span><span class="o">/</span><span class="n">moonto</span><span class="o">/</span><span class="n">java</span><span class="o">/</span><span class="nc">DynamicDispatch</span><span class="o">;</span>
        <span class="mi">0</span>       <span class="mi">5</span>     <span class="mi">1</span> <span class="n">animal</span>   <span class="nc">Lorg</span><span class="o">/</span><span class="n">moonto</span><span class="o">/</span><span class="n">java</span><span class="o">/</span><span class="nc">Animal</span><span class="o">;</span>
</code></pre></div></div>

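<p>With that, the complete flow of a method call has been covered.</p>

<h2 id="6-小结">6. Summary</h2>

<p>This article walked through Java's method dispatch in detail, covering static dispatch, dynamic dispatch and the memory layout of methods. Static dispatch happens at compile time and is performed by the compiler. Dynamic dispatch happens at runtime, and its logic is defined by the virtual machine's <code class="language-plaintext highlighter-rouge">invokevirtual</code> instruction.</p>

<p>The most common mechanism for supporting dynamic dispatch is the virtual method table. Both Java's HotSpot virtual machine and C++ use it to implement dynamic dispatch.</p>

<p>Although this article describes method dispatch in Java, every object-oriented language faces the same problem and has some mechanism for it. Interested readers can follow the references to learn more and carry the ideas over to other languages.</p>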

<h2 id="7-引用资料">7. References</h2>

<p>《Inside the Java Virtual Machine》 <br />
《深入理解Java虚拟机: JVM高级特性与最佳实践》 <br />
<a href="https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.12.1">https://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html</a> <br />
<a href="https://en.wikipedia.org/wiki/Subtyping">https://en.wikipedia.org/wiki/Subtyping</a> <br />
<a href="https://en.wikipedia.org/wiki/Dynamic_dispatch">https://en.wikipedia.org/wiki/Dynamic_dispatch</a> <br />
<a href="https://en.wikipedia.org/wiki/Virtual_method_table">https://en.wikipedia.org/wiki/Virtual_method_table</a> <br />
<a href="https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html#jvms-6.5.invokevirtual">https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-6.html</a></p>]]></content><author><name>kikcat</name></author><category term="polymorphism" /><summary type="html"><![CDATA[0. 前言 多态是面向对象编程模型中一个核心概念，它可以帮助我们写出更具有弹性的代码。相信每个Java开发者都对多态的使用非常熟练，不过可能大多人对于”多态”这一概念的理解仅是浮于表面，对它内部的调用过程以及实现原理缺乏更深一步的认识。本文将对多态的实现原理抽丝剥茧，带大家深入理解多态。 后记: 这篇文章在我的草稿中躺了很久，近期翻出来整理成了一篇文章。近一年来很少写Java相关的东西，一来是深度有限，不想把时间花在没多大意义的地方上跟别人卷;再者还有很多感兴趣的课题想去学习和研究，比如多读一些分布式系统的论文、CS的一些理论等。 学习真正的知识总是缓慢而枯燥的，过程或多或少会感到煎熬，掌握后就会觉得相当充实和充满趣味。以后可能会很少写这类文章了。 1.多态的种类 多态(Polymorphism)这个术语在不同的上下文中会有不同的含义。在类型学说(Type Theory)中，多态分为好几个种类，其中最常见的有3种类型，分别为Ad hoc polymorphism、Subtyping,Parametric polymorphism ，而Java实现了这3种类型的多态；在OOP(面向对象编程)中，我们常说的多态指的是类型学说中的Subtyping。 为了方便理解多态调用原理，本文着重介绍前两者。后者因为涉及到类型擦除和单态化(monomorphized)，碍于篇幅，不进行详细介绍。如果读者想系统了解请点击维基百科原词条Polymorphism (computer science)。 Ad hoc polymorphism 在Java中，方法重载(Method Overloading)属于Ad hoc polymorphism。在这种模式下，我们使用同一个方法名字和返回值，不同的方法参数和类型来区分不同的方法。比如String类中，多个valueOf使用不同的参数类型区分。 //code 1-1 public static String valueOf(boolean b) { return b ? 
"true" : "false"; } public static String valueOf(char c) { char data[] = {c}; return new String(data, true); } Parametric polymorphism 我们可以把Parametric polymorphism理解为泛型编程 //code 1-2 ArrayList&lt;String&gt; list = new ArrayList&lt;&gt;(); list.add("generic programming"); Subtyping Subtyping即是我们最熟悉的一种多态，它描述的是subtype和supertype之间的一种可替换关系，比如下面代码 //code 1-3 Charsequence s = new String("bigcat"); String实现了Charsequence，在这里我们说String是Charsequence的subtype(String is a subtype of Charsequence)。 Subtyping类似于集合中的”包含关系(⊆)”。在上面的例子中，可以理解为String是Charsequence的子集。因为它继承自Charsequence，因此使用Charsequence来表达String满足类型安全。]]></summary></entry><entry><title type="html">解决黑苹果Monterey蓝牙睡眠后不工作问题</title><link href="https://wiyi.org/fixed-sleep.html" rel="alternate" type="text/html" title="解决黑苹果Monterey蓝牙睡眠后不工作问题" /><published>2022-10-15T00:00:00+00:00</published><updated>2022-10-15T00:00:00+00:00</updated><id>https://wiyi.org/fixed-sleep</id><content type="html" xml:base="https://wiyi.org/fixed-sleep.html"><![CDATA[<p>自Monterey(macOS 12.x)以来，博通BCM94360的网卡蓝牙模块可能会出现问题，具体表现为睡眠唤醒后，蓝牙会出现睡死的情况，即需要进入系统把蓝牙关了重新打开才能正常工作。</p>

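<p>Since Apple moved to Apple silicon, OpenCore no longer seems to pay as much attention to Hackintosh, and this Bluetooth problem has been around for quite a while. The current workaround is to turn Bluetooth off before sleep and back on after wake, which saves a lot of hassle. Of course we won't do this by hand: there happens to be a macOS tool called sleepwatcher that can monitor sleep and wake events.</p>

<h4 id="准备工作">Preparation</h4>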

<ol>
  <li>
    <p>Download sleepwatcher</p>

    <p>Downloading directly from the official site is recommended: https://www.bernhard-baehr.de/. (The site may require a proxy to access.)</p>
  </li>
  <li>
    <p>Install blueutil</p>

    <p>blueutil is a Bluetooth utility that can turn Bluetooth on and off from the command line. Installing it with Homebrew is recommended:</p>

    <div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>brew <span class="nb">install </span>blueutil
</code></pre></div>    </div>
  </li>
</ol>

<h4 id="安装sleepwatcher">Install sleepwatcher</h4>

<p>First, open a terminal and change into the sleepwatcher folder. In my case the version is <code class="language-plaintext highlighter-rouge">sleepwatcher_2.2.1</code>:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">cd </span>sleepwatcher_2.2.1
<span class="nb">sudo cp</span> ./sleepwatcher /usr/local/sbin
<span class="nb">sudo cp</span> ./sleepwatcher.8 /usr/local/share/man/man8

<span class="c"># Create the data folder. The location is up to you, but if you choose a different path, the paths below must be changed to match</span>
<span class="nb">mkdir</span> ~/.sleep
</code></pre></div></div>
<!--more-->

<p>Next, create two scripts, one to turn Bluetooth on and one to turn it off, and put both in <code class="language-plaintext highlighter-rouge">~/.sleep/</code>. <br />
Below is rc.sleep, which turns Bluetooth off on sleep:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>
<span class="c"># rc.sleep</span>
<span class="c"># Stop Bluetooth Module on Mac OS X</span>
<span class="c">#</span>
<span class="c"># Requires Blueutil to be installed: http://brewformulas.org/blueutil</span>

<span class="nv">BT</span><span class="o">=</span><span class="s2">"/usr/local/bin/blueutil"</span>

log<span class="o">()</span> <span class="o">{</span>
	<span class="c"># logger -p notice -t bt_restarter "$@"</span>
	<span class="nb">echo</span> <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span> <span class="o">&gt;&gt;</span> ~/.sleep/sleepwatcher.log
<span class="o">}</span>

err<span class="o">()</span> <span class="o">{</span>
	<span class="nb">echo</span> <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span> <span class="o">&gt;</span>&amp;2
	<span class="nb">echo</span> <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span> <span class="o">&gt;&gt;</span> ~/.sleep/sleepwatcher.log
	<span class="c"># logger -p error -t bt_restarter "$@"</span>
<span class="o">}</span>

log <span class="s2">""</span>
log <span class="s2">"sleep at </span><span class="si">$(</span><span class="nb">date</span> +<span class="s2">"%Y-%m-%dT%H:%M:%S"</span><span class="si">)</span><span class="s2">"</span>
<span class="k">if</span> <span class="o">[</span> <span class="nt">-f</span> <span class="s2">"</span><span class="nv">$BT</span><span class="s2">"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then
	if</span> <span class="o">[[</span> <span class="si">$(</span><span class="s2">"</span><span class="nv">$BT</span><span class="s2">"</span> <span class="nt">-p</span><span class="si">)</span> <span class="o">==</span> 0 <span class="o">]]</span><span class="p">;</span> <span class="k">then
		</span>log <span class="s2">"Bluetooth is off, nothing to do."</span>
	<span class="k">else
		</span>log <span class="s2">"Bluetooth on, stopping ..."</span>
		<span class="k">if</span> <span class="s2">"</span><span class="nv">$BT</span><span class="s2">"</span> <span class="nt">-p</span> 0 &amp;&gt; /dev/null<span class="p">;</span> <span class="k">then</span>
			log <span class="s2">"Successfully stopped Bluetooth"</span>
			<span class="nb">exit </span>0
		<span class="k">else</span>
			<span class="c"># exit here, not inside a ( ... ) subshell, so the script really stops on failure</span>
			err <span class="s2">"Couldn't stop Bluetooth Module"</span>
			<span class="nb">exit </span>1
		<span class="k">fi</span>
	<span class="k">fi
else
	</span>err <span class="s2">"Couldn't find blueutil, please install http://brewformulas.org/blueutil"</span> <span class="o">&amp;&amp;</span> <span class="nb">exit </span>1
<span class="k">fi</span>
</code></pre></div></div>

<p>Below is rc.wakeup, which turns Bluetooth on at wake:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c">#!/bin/bash</span>
<span class="c"># rc.wakeup</span>
<span class="c"># Restart Bluetooth Module on Mac OS X</span>
<span class="c">#</span>
<span class="c"># Requires Blueutil to be installed: http://brewformulas.org/blueutil</span>

<span class="nv">BT</span><span class="o">=</span><span class="s2">"/usr/local/bin/blueutil"</span>

log<span class="o">()</span> <span class="o">{</span>
	<span class="c"># logger -p notice -t bt_restarter "$@"</span>
	<span class="nb">echo</span> <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span> <span class="o">&gt;&gt;</span> ~/.sleep/sleepwatcher.log
<span class="o">}</span>

err<span class="o">()</span> <span class="o">{</span>
	<span class="nb">echo</span> <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span> <span class="o">&gt;</span>&amp;2
	<span class="nb">echo</span> <span class="s2">"</span><span class="nv">$@</span><span class="s2">"</span> <span class="o">&gt;&gt;</span> ~/.sleep/sleepwatcher.log
	<span class="c"># logger -p error -t bt_restarter "$@"</span>
<span class="o">}</span>

log <span class="s2">"wakeup at </span><span class="si">$(</span><span class="nb">date</span> +<span class="s2">"%Y-%m-%dT%H:%M:%S"</span><span class="si">)</span><span class="s2">"</span>
<span class="k">if</span> <span class="o">[</span> <span class="nt">-f</span> <span class="s2">"</span><span class="nv">$BT</span><span class="s2">"</span> <span class="o">]</span><span class="p">;</span> <span class="k">then
	if</span> <span class="o">[[</span> <span class="si">$(</span><span class="s2">"</span><span class="nv">$BT</span><span class="s2">"</span> <span class="nt">-p</span><span class="si">)</span> <span class="o">==</span> 0 <span class="o">]]</span><span class="p">;</span> <span class="k">then
		</span>log <span class="s2">"Bluetooth is off, Starting Bluetooth..."</span>
		<span class="k">if</span> <span class="s2">"</span><span class="nv">$BT</span><span class="s2">"</span> <span class="nt">-p</span> 1 &amp;&gt; /dev/null<span class="p">;</span> <span class="k">then</span>
			log <span class="s2">"Successfully started Bluetooth"</span>
			<span class="nb">exit </span>0
		<span class="k">else</span>
			err <span class="s2">"Couldn't start Bluetooth Module"</span>
			<span class="nb">exit </span>1
		<span class="k">fi</span>
	<span class="k">else
		</span>log <span class="s2">"Bluetooth on, restarting ..."</span>
		<span class="c"># Power-cycle Bluetooth; stop at the first step that fails</span>
		<span class="k">if</span> <span class="o">!</span> <span class="s2">"</span><span class="nv">$BT</span><span class="s2">"</span> <span class="nt">-p</span> 0 &amp;&gt; /dev/null<span class="p">;</span> <span class="k">then</span>
			err <span class="s2">"Couldn't stop Bluetooth Module"</span>
			<span class="nb">exit </span>1
		<span class="k">fi</span>
		<span class="k">if</span> <span class="o">!</span> <span class="s2">"</span><span class="nv">$BT</span><span class="s2">"</span> <span class="nt">-p</span> 1 &amp;&gt; /dev/null<span class="p">;</span> <span class="k">then</span>
			err <span class="s2">"Couldn't start Bluetooth Module"</span>
			<span class="nb">exit </span>1
		<span class="k">fi</span>
		log <span class="s2">"Successfully restarted Bluetooth"</span>
		<span class="nb">exit </span>0
	<span class="k">fi
else
	</span>err <span class="s2">"Couldn't find blueutil, please install http://brewformulas.org/blueutil"</span> <span class="o">&amp;&amp;</span> <span class="nb">exit </span>1
<span class="k">fi</span>
</code></pre></div></div>

<p>Next, make both files executable:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">cd</span> ~/.sleep
<span class="nb">chmod </span>u+x rc.sleep rc.wakeup
</code></pre></div></div>

<p>Once that is done, create the file <code class="language-plaintext highlighter-rouge">~/Library/LaunchAgents/de.bernhard-baehr.sleepwatcher.plist</code> and paste in the following:</p>

<div class="language-xml highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">&lt;?xml version="1.0" encoding="UTF-8"?&gt;</span>
<span class="cp">&lt;!DOCTYPE plist PUBLIC "-//Apple Computer//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd"&gt;</span>
<span class="nt">&lt;plist</span> <span class="na">version=</span><span class="s">"1.0"</span><span class="nt">&gt;</span>
<span class="nt">&lt;dict&gt;</span>
	<span class="nt">&lt;key&gt;</span>Label<span class="nt">&lt;/key&gt;</span>
	<span class="nt">&lt;string&gt;</span>de.bernhard-baehr.sleepwatcher<span class="nt">&lt;/string&gt;</span>
	<span class="nt">&lt;key&gt;</span>ProgramArguments<span class="nt">&lt;/key&gt;</span>
	<span class="nt">&lt;array&gt;</span>
		<span class="nt">&lt;string&gt;</span>/usr/local/sbin/sleepwatcher<span class="nt">&lt;/string&gt;</span>
		<span class="nt">&lt;string&gt;</span>-V<span class="nt">&lt;/string&gt;</span>
		<span class="nt">&lt;string&gt;</span>-s ~/.sleep/rc.sleep<span class="nt">&lt;/string&gt;</span>
		<span class="nt">&lt;string&gt;</span>-w ~/.sleep/rc.wakeup<span class="nt">&lt;/string&gt;</span>
	<span class="nt">&lt;/array&gt;</span>
	<span class="nt">&lt;key&gt;</span>RunAtLoad<span class="nt">&lt;/key&gt;</span>
	<span class="nt">&lt;true/&gt;</span>
	<span class="nt">&lt;key&gt;</span>KeepAlive<span class="nt">&lt;/key&gt;</span>
	<span class="nt">&lt;true/&gt;</span>
<span class="nt">&lt;/dict&gt;</span>
<span class="nt">&lt;/plist&gt;</span>
</code></pre></div></div>

<p>Finally, run this command:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>launchctl load ~/Library/LaunchAgents/de.bernhard-baehr.sleepwatcher.plist
</code></pre></div></div>

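<p>After running it, macOS may show a dialog asking for authorization; approve it as usual. You can then put the machine to sleep and wake it again to check the result.</p>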

<p>To see the log of Bluetooth being turned off on sleep and back on at wake-up, run <code class="language-plaintext highlighter-rouge">cat ~/.sleep/sleepwatcher.log</code>.</p>

<h3 id="总结">Summary</h3>

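<p>sleepwatcher is an interesting tool. Monitoring sleep and wake-up can be handy in some scenarios; here it happens to solve the Bluetooth sleep problem. In addition, my Hackintosh used to wake up unexpectedly from sleep now and then, and since Bluetooth started being turned off before sleep, that has not happened again.</p>]]></content><author><name>kikcat</name></author><category term="hackintosh" /><summary type="html"><![CDATA[Since Monterey (macOS 12.x), the Bluetooth module on Broadcom BCM94360 wireless cards can misbehave: after waking from sleep, Bluetooth stays dead, and you have to turn it off and on again in the system before it works. Since Apple moved to Apple silicon, OpenCore also seems to pay less attention to Hackintosh, and the Bluetooth problem has been around for quite a while. The current workaround is to turn Bluetooth off before sleep and turn it back on after wake-up, which saves a lot of trouble. Of course we are not going to do that by hand; macOS happens to have an app called sleepwatcher that can monitor sleep and wake-up. Preparation Download sleepwatcher Downloading it directly from the official site is recommended: https://www.bernhard-baehr.de/. (The site may need a proxy to reach from some networks.) Install blueutil blueutil is a Bluetooth tool that can turn Bluetooth on and off from the command line; installing it with homebrew is recommended brew install blueutil Install sleepwatcher First, cd into the sleepwatcher folder in a terminal; mine, for example, is version sleepwatcher_2.2.1 cd sleepwatcher_2.2.1 sudo cp ./sleepwatcher /usr/local/sbin sudo cp ./sleepwatcher.8 /usr/local/share/man/man8 # Create the data folder. The location is up to you; if you pick a different path, the paths below must be changed to match mkdir ~/.sleep]]></summary></entry><entry><title type="html">The Evolution of Stars</title><link href="https://wiyi.org/evolution-of-the-stars.html" rel="alternate" type="text/html" title="The Evolution of Stars" /><published>2022-07-18T00:00:00+00:00</published><updated>2022-07-18T00:00:00+00:00</updated><id>https://wiyi.org/evolution-of-the-stars</id><content type="html" xml:base="https://wiyi.org/evolution-of-the-stars.html"><![CDATA[<p>YouTube has a fun channel called Discovery With Andy. Most of its videos are about astronomy, and they are told in a very engaging way. This episode on stellar evolution is excellent; the same facts are easy to find on Wikipedia, but plain text is nowhere near as fun.</p>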

<p>The soundtrack and source material in Andy's videos are top-notch, and this stellar evolution episode is my favorite.</p>

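<p>Back when I started blogging more than ten years ago, I loved sharing all sorts of random things; these days the blog is all technical material, which feels a bit monotonous.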
<!--more--></p>

<p>The Life of a Star (Part 1): From Birth to White Dwarf</p>
<iframe width="560" height="315" src="https://www.youtube.com/embed/Qlfxj6qLRJs" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>

<p>The Life of a Star (Part 2): Supernovae and Black Holes</p>
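<iframe width="560" height="315" src="https://www.youtube.com/embed/g2byfaClqOA" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe>]]></content><author><name>kikcat</name></author><category term="water" /><summary type="html"><![CDATA[YouTube has a fun channel called Discovery With Andy. Most of its videos are about astronomy, and they are told in a very engaging way. This episode on stellar evolution is excellent; the same facts are easy to find on Wikipedia, but plain text is nowhere near as fun. The soundtrack and source material in Andy's videos are top-notch, and this stellar evolution episode is my favorite. Back when I started blogging more than ten years ago, I loved sharing all sorts of random things; these days the blog is all technical material, which feels a bit monotonous.]]></summary></entry><entry><title type="html">EIP55: Ethereum Account Address Checksum Algorithm</title><link href="https://wiyi.org/eth-eip55.html" rel="alternate" type="text/html" title="EIP55: Ethereum Account Address Checksum Algorithm" /><published>2022-04-27T00:00:00+00:00</published><updated>2022-04-27T00:00:00+00:00</updated><id>https://wiyi.org/eth-address-checksum</id><content type="html" xml:base="https://wiyi.org/eth-eip55.html"><![CDATA[<p>An Ethereum account address is a 40-character hex string. Attentive readers may have noticed that some addresses mix uppercase and lowercase letters, yet in practice an all-lowercase address works just as well in transactions.</p>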

<p>After some digging, it turns out that addresses containing uppercase letters are checksummed addresses:</p>

<blockquote>
  <ul>
    <li><code class="language-plaintext highlighter-rouge">0x7cb57b5a97eabe94205c07890be4c1ad31e486a8</code></li>
    <li><code class="language-plaintext highlighter-rouge">0x7cB57B5A97eAbe94205C07890BE4c1aD31E486A8</code></li>
  </ul>
</blockquote>

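<p>The difference between the two addresses above is that the first carries no checksum while the second does. What is the checksum for? An Ethereum address by itself contains no check information, so if a user mistypes it there is nothing to catch the mistake, and the funds sent in that transaction are lost for good.</p>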

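<p>To solve this problem, in 2016 an Improvement Proposal titled "<a href="">Yet another cool checksum address encoding</a>" was opened on GitHub, proposing a backward-compatible algorithm for checksumming account addresses. Such an algorithm has to satisfy the following conditions:</p>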

<ul>
  <li>Backward compatible</li>
  <li>Leaves the length and the information of the account address unchanged</li>
  <li>A sufficiently low collision probability</li>
</ul>

<p>The proposal describes an algorithm that meets these requirements. The original wording is:</p>

<blockquote>
  <p>In English, convert the address to hex, but if the ith digit is a letter (ie. it’s one of <code class="language-plaintext highlighter-rouge">abcdef</code>) print it in uppercase if the ith bit of the hash of the address (in binary form) is 1 otherwise print it in lowercase.</p>
</blockquote>

<p>The description is easier to follow alongside the code. In my own words:</p>

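<p>Lowercase the account address, hash it, and write the digest as a hex string. Then walk through the address character by character: if a character is a letter and the hex digit at the same position in the digest is 8 or greater, print that letter in uppercase; otherwise leave it in lowercase.</p>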

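<p>The algorithm was eventually adopted as an Ethereum standard, EIP-55. The idea behind it is very simple, and the core loop takes fewer than 10 lines. Below is a JS implementation of EIP-55.</p>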

<div class="language-javascript highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">const</span> <span class="nx">createKeccakHash</span> <span class="o">=</span> <span class="nx">require</span><span class="p">(</span><span class="dl">'</span><span class="s1">keccak</span><span class="dl">'</span><span class="p">)</span>

<span class="kd">function</span> <span class="nx">toChecksumAddress</span> <span class="p">(</span><span class="nx">address</span><span class="p">)</span> <span class="p">{</span>
  <span class="nx">address</span> <span class="o">=</span> <span class="nx">address</span><span class="p">.</span><span class="nx">toLowerCase</span><span class="p">().</span><span class="nx">replace</span><span class="p">(</span><span class="dl">'</span><span class="s1">0x</span><span class="dl">'</span><span class="p">,</span> <span class="dl">''</span><span class="p">)</span>
  <span class="kd">var</span> <span class="nx">hash</span> <span class="o">=</span> <span class="nx">createKeccakHash</span><span class="p">(</span><span class="dl">'</span><span class="s1">keccak256</span><span class="dl">'</span><span class="p">).</span><span class="nx">update</span><span class="p">(</span><span class="nx">address</span><span class="p">).</span><span class="nx">digest</span><span class="p">(</span><span class="dl">'</span><span class="s1">hex</span><span class="dl">'</span><span class="p">)</span>
  <span class="kd">var</span> <span class="nx">ret</span> <span class="o">=</span> <span class="dl">'</span><span class="s1">0x</span><span class="dl">'</span>

  <span class="k">for</span> <span class="p">(</span><span class="kd">var</span> <span class="nx">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="nx">i</span> <span class="o">&lt;</span> <span class="nx">address</span><span class="p">.</span><span class="nx">length</span><span class="p">;</span> <span class="nx">i</span><span class="o">++</span><span class="p">)</span> <span class="p">{</span>
    <span class="k">if</span> <span class="p">(</span><span class="nb">parseInt</span><span class="p">(</span><span class="nx">hash</span><span class="p">[</span><span class="nx">i</span><span class="p">],</span> <span class="mi">16</span><span class="p">)</span> <span class="o">&gt;</span> <span class="mi">7</span><span class="p">)</span> <span class="p">{</span>
      <span class="nx">ret</span> <span class="o">+=</span> <span class="nx">address</span><span class="p">[</span><span class="nx">i</span><span class="p">].</span><span class="nx">toUpperCase</span><span class="p">()</span>
    <span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
      <span class="nx">ret</span> <span class="o">+=</span> <span class="nx">address</span><span class="p">[</span><span class="nx">i</span><span class="p">]</span>
    <span class="p">}</span>
  <span class="p">}</span>

  <span class="k">return</span> <span class="nx">ret</span>
<span class="p">}</span>
</code></pre></div></div>
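<p>The case rule on its own can also be sketched in Java. This is only an illustration: the JDK has no built-in Keccak-256 (its SHA3-256 is a different function), so the hypothetical helper <code class="language-plaintext highlighter-rouge">applyChecksumCase</code> below assumes the hex digest of the lowercase address has already been computed elsewhere and is passed in as <code class="language-plaintext highlighter-rouge">hashHex</code>.</p>

```java
final class Eip55CaseSketch {
  /**
   * Applies the EIP-55 case rule.
   * addressHex: lowercase hex address without the "0x" prefix.
   * hashHex: assumed to be the Keccak-256 hex digest of addressHex, computed elsewhere.
   */
  static String applyChecksumCase(String addressHex, String hashHex) {
    StringBuilder out = new StringBuilder("0x");
    for (int i = 0; i < addressHex.length(); i++) {
      char c = addressHex.charAt(i);
      int nibble = Character.digit(hashHex.charAt(i), 16);
      // Digits have no case; a letter becomes uppercase when the hash nibble is 8..15.
      out.append(Character.isLetter(c) && nibble > 7 ? Character.toUpperCase(c) : c);
    }
    return out.toString();
  }
}
```

<p>Validating an address a user typed would then amount to recomputing the mixed-case form from the lowercase address and comparing it with the input.</p>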

<p>References:</p>

<p><a href="mailto:vitalik.buterin@ethereum.org">Vitalik Buterin</a>, <a href="mailto:avsa@ethereum.org">Alex Van de Sande</a>, “EIP-55: Mixed-case checksum address encoding,” <em>Ethereum Improvement Proposals</em>, no. 55, January 2016. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-55. <br />
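<a href="https://support.mycrypto.com/general-knowledge/ethereum-blockchain/ethereum-address-has-uppercase-and-lowercase-letters/">Ethereum Address Has Uppercase and Lowercase Letters</a></p>]]></content><author><name>kikcat</name></author><category term="blockchain" /><summary type="html"><![CDATA[An Ethereum account address is a 40-character hex string. Attentive readers may have noticed that some addresses mix uppercase and lowercase letters, yet in practice an all-lowercase address works just as well in transactions. After some digging, it turns out that addresses containing uppercase letters are checksummed addresses: 0x7cb57b5a97eabe94205c07890be4c1ad31e486a8 0x7cB57B5A97eAbe94205C07890BE4c1aD31E486A8 The difference between the two addresses above is that the first carries no checksum while the second does. What is the checksum for? An Ethereum address by itself contains no check information, so if a user mistypes it there is nothing to catch the mistake, and the funds sent in that transaction are lost for good. To solve this problem, in 2016 an Improvement Proposal titled "Yet another cool checksum address encoding" was opened on GitHub, proposing a backward-compatible algorithm for checksumming account addresses. Such an algorithm has to satisfy the following conditions: Backward compatible Leaves the length and the information of the account address unchanged A sufficiently low collision probability The proposal describes an algorithm that meets these requirements. The original wording is: In English, convert the address to hex, but if the ith digit is a letter (ie. it’s one of abcdef) print it in uppercase if the ith bit of the hash of the address (in binary form) is 1 otherwise print it in lowercase. The description is easier to follow alongside the code. In my own words: Lowercase the account address, hash it, and write the digest as a hex string. Then walk through the address character by character: if a character is a letter and the hex digit at the same position in the digest is 8 or greater, print that letter in uppercase; otherwise leave it in lowercase. The algorithm was eventually adopted as an Ethereum standard, EIP-55. The idea behind it is very simple, and the core loop takes fewer than 10 lines. Below is a JS implementation of EIP-55. const createKeccakHash = require('keccak') function toChecksumAddress (address) { address = address.toLowerCase().replace('0x', '') var hash = createKeccakHash('keccak256').update(address).digest('hex') var ret = '0x' for (var i = 0; i &lt; address.length; i++) { if (parseInt(hash[i], 16) &gt; 7) { ret += address[i].toUpperCase() } else { ret += address[i] } } return ret } References: Vitalik Buterin, Alex Van de Sande, “EIP-55: Mixed-case checksum address encoding,” Ethereum Improvement Proposals, no. 55, January 2016. [Online serial]. Available: https://eips.ethereum.org/EIPS/eip-55. 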
Ethereum Address Has Uppercase and Lowercase Letters]]></summary></entry><entry><title type="html">An Introduction to the shadowsocks rc4-md5 Algorithm</title><link href="https://wiyi.org/ss-rc4-md5-guide.html" rel="alternate" type="text/html" title="An Introduction to the shadowsocks rc4-md5 Algorithm" /><published>2021-12-08T00:00:00+00:00</published><updated>2021-12-08T00:00:00+00:00</updated><id>https://wiyi.org/ss-rc4-md5</id><content type="html" xml:base="https://wiyi.org/ss-rc4-md5-guide.html"><![CDATA[<p>Early versions of the shadowsocks protocol encrypted data with RC4. Because every piece of data was encrypted with the same keystream, this was a serious security risk, and the protocol later moved to the RC4-MD5 algorithm.</p>

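<p>Material on the RC4-MD5 protocol is hard to find, and RC4-MD5 is no longer secure, but it is still worth walking through: it is simple and easy to understand, and a later post will use it to show the weaknesses of RC4.</p>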

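<p>RC4-MD5 still encrypts the data with RC4. The MD5 part means that the key is run through MD5 to obtain the RC4 key, which RC4 then uses to generate the keystream. In short, the user's password is turned into the RC4 key through the following steps:</p>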

<ol>
  <li>k1 = toBytes(password)</li>
  <li>k2 = ssKey(k1)</li>
  <li>rc4-md5-key = md5(toBytes(k2,iv))</li>
</ol>

<!--more-->
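<p>In the steps above, the first converts the password the user entered into a byte array; the second derives the general-purpose ssKey through the ss function; the third produces the key that RC4-MD5 needs, by placing the ssKey and a random IV one after the other in a byte array and running MD5 over it. The result is the key that RC4 will use.</p>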

<p>The ssKey is generated as follows:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">private</span> <span class="kd">static</span> <span class="kt">byte</span><span class="o">[]</span> <span class="nf">generateSSKey</span><span class="o">(</span><span class="nc">String</span> <span class="n">password</span><span class="o">)</span> <span class="o">{</span>
  <span class="kt">byte</span><span class="o">[]</span> <span class="n">keys</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">byte</span><span class="o">[</span><span class="mi">32</span><span class="o">];</span>
  <span class="kt">byte</span><span class="o">[]</span> <span class="n">bytes</span> <span class="o">=</span> <span class="n">password</span><span class="o">.</span><span class="na">getBytes</span><span class="o">(</span><span class="nc">StandardCharsets</span><span class="o">.</span><span class="na">UTF_8</span><span class="o">);</span>
  <span class="kt">byte</span><span class="o">[]</span> <span class="n">hash</span> <span class="o">=</span> <span class="nc">Hash</span><span class="o">.</span><span class="na">md5</span><span class="o">(</span><span class="n">bytes</span><span class="o">);</span>
  <span class="kt">byte</span><span class="o">[]</span> <span class="n">tmp</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">byte</span><span class="o">[</span><span class="n">bytes</span><span class="o">.</span><span class="na">length</span> <span class="o">+</span> <span class="n">hash</span><span class="o">.</span><span class="na">length</span><span class="o">];</span>
  <span class="nc">System</span><span class="o">.</span><span class="na">arraycopy</span><span class="o">(</span><span class="n">hash</span><span class="o">,</span> <span class="mi">0</span><span class="o">,</span> <span class="n">keys</span><span class="o">,</span> <span class="mi">0</span><span class="o">,</span> <span class="n">hash</span><span class="o">.</span><span class="na">length</span><span class="o">);</span>

  <span class="k">for</span> <span class="o">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="n">hash</span><span class="o">.</span><span class="na">length</span><span class="o">;</span><span class="n">i</span><span class="o">&lt;</span><span class="n">keys</span><span class="o">.</span><span class="na">length</span><span class="o">;</span><span class="n">i</span><span class="o">+=</span><span class="n">hash</span><span class="o">.</span><span class="na">length</span><span class="o">)</span> <span class="o">{</span>
    <span class="nc">System</span><span class="o">.</span><span class="na">arraycopy</span><span class="o">(</span><span class="n">hash</span><span class="o">,</span> <span class="mi">0</span><span class="o">,</span> <span class="n">tmp</span><span class="o">,</span> <span class="mi">0</span><span class="o">,</span> <span class="n">hash</span><span class="o">.</span><span class="na">length</span><span class="o">);</span>
    <span class="nc">System</span><span class="o">.</span><span class="na">arraycopy</span><span class="o">(</span><span class="n">bytes</span><span class="o">,</span> <span class="mi">0</span><span class="o">,</span> <span class="n">tmp</span><span class="o">,</span> <span class="n">hash</span><span class="o">.</span><span class="na">length</span><span class="o">,</span> <span class="n">bytes</span><span class="o">.</span><span class="na">length</span><span class="o">);</span>
    <span class="n">hash</span> <span class="o">=</span> <span class="nc">Hash</span><span class="o">.</span><span class="na">md5</span><span class="o">(</span><span class="n">tmp</span><span class="o">);</span>
    <span class="nc">System</span><span class="o">.</span><span class="na">arraycopy</span><span class="o">(</span><span class="n">hash</span><span class="o">,</span> <span class="mi">0</span><span class="o">,</span> <span class="n">keys</span><span class="o">,</span> <span class="n">i</span><span class="o">,</span> <span class="n">hash</span><span class="o">.</span><span class="na">length</span><span class="o">);</span>
  <span class="o">}</span>

  <span class="k">return</span> <span class="n">keys</span><span class="o">;</span>
<span class="o">}</span>
</code></pre></div></div>

<p>The rc4-md5-key is generated as follows:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">static</span> <span class="kt">byte</span><span class="o">[]</span> <span class="nf">rc4md5Key</span><span class="o">(</span><span class="nc">String</span> <span class="n">password</span><span class="o">,</span><span class="kt">byte</span><span class="o">[]</span> <span class="n">iv</span><span class="o">)</span> <span class="o">{</span>
  <span class="kt">byte</span><span class="o">[]</span> <span class="n">ssKeys</span> <span class="o">=</span> <span class="n">generateSSKey</span><span class="o">(</span><span class="n">password</span><span class="o">);</span>
  <span class="kt">byte</span><span class="o">[]</span> <span class="n">keys</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">byte</span><span class="o">[</span><span class="n">ssKeys</span><span class="o">.</span><span class="na">length</span> <span class="o">+</span> <span class="n">iv</span><span class="o">.</span><span class="na">length</span><span class="o">];</span>
  <span class="nc">System</span><span class="o">.</span><span class="na">arraycopy</span><span class="o">(</span><span class="n">ssKeys</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="n">keys</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="n">ssKeys</span><span class="o">.</span><span class="na">length</span><span class="o">);</span>
  <span class="nc">System</span><span class="o">.</span><span class="na">arraycopy</span><span class="o">(</span><span class="n">iv</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="n">keys</span><span class="o">,</span><span class="n">ssKeys</span><span class="o">.</span><span class="na">length</span><span class="o">,</span><span class="n">iv</span><span class="o">.</span><span class="na">length</span><span class="o">);</span>

  <span class="k">return</span> <span class="nc">Hash</span><span class="o">.</span><span class="na">md5</span><span class="o">(</span><span class="n">keys</span><span class="o">);</span>
<span class="o">}</span>
</code></pre></div></div>

<p>Once we have the key, we can use it to encrypt data:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kt">byte</span><span class="o">[]</span> <span class="nf">encrypt</span><span class="o">(</span><span class="kt">byte</span><span class="o">[]</span> <span class="n">keys</span><span class="o">,</span><span class="kt">byte</span><span class="o">[]</span> <span class="n">plaintext</span><span class="o">)</span> <span class="kd">throws</span> <span class="nc">Exception</span><span class="o">{</span>
  <span class="nc">Cipher</span> <span class="n">encoder</span> <span class="o">=</span> <span class="nc">Cipher</span><span class="o">.</span><span class="na">getInstance</span><span class="o">(</span><span class="s">"RC4"</span><span class="o">);</span>
  <span class="n">encoder</span><span class="o">.</span><span class="na">init</span><span class="o">(</span><span class="nc">Cipher</span><span class="o">.</span><span class="na">ENCRYPT_MODE</span><span class="o">,</span> <span class="k">new</span> <span class="nc">SecretKeySpec</span><span class="o">(</span><span class="n">keys</span><span class="o">,</span> <span class="s">"RC4"</span><span class="o">));</span>
  <span class="k">return</span> <span class="n">encoder</span><span class="o">.</span><span class="na">update</span><span class="o">(</span><span class="n">plaintext</span><span class="o">);</span>
<span class="o">}</span>
</code></pre></div></div>

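<p>Besides the ciphertext, we also have to send the IV to ss-server, which uses the same algorithm to compute the rc4-md5-key and decrypt the data. The two are assembled like this:</p>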

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kt">void</span> <span class="nf">combine</span><span class="o">()</span> <span class="kd">throws</span> <span class="nc">Exception</span> <span class="o">{</span>
  <span class="nc">String</span> <span class="n">plaintext</span> <span class="o">=</span> <span class="s">"123456"</span><span class="o">;</span>
  <span class="nc">String</span> <span class="n">password</span> <span class="o">=</span> <span class="s">"654321"</span><span class="o">;</span>
  <span class="kt">byte</span><span class="o">[]</span> <span class="n">iv</span> <span class="o">=</span> <span class="n">randomIV</span><span class="o">();</span>
  
  <span class="kt">byte</span><span class="o">[]</span> <span class="n">key</span> <span class="o">=</span> <span class="n">rc4md5Key</span><span class="o">(</span><span class="n">password</span><span class="o">,</span><span class="n">iv</span><span class="o">);</span>
  <span class="kt">byte</span><span class="o">[]</span> <span class="n">ciphertext</span> <span class="o">=</span> <span class="n">encrypt</span><span class="o">(</span><span class="n">key</span><span class="o">,</span><span class="n">plaintext</span><span class="o">.</span><span class="na">getBytes</span><span class="o">());</span>
  <span class="c1">// Prepend the IV (never the key) so that ss-server can derive the same rc4-md5-key</span>
  <span class="nc">ByteArrayOutputStream</span> <span class="n">os</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">ByteArrayOutputStream</span><span class="o">(</span><span class="n">iv</span><span class="o">.</span><span class="na">length</span> <span class="o">+</span> <span class="n">ciphertext</span><span class="o">.</span><span class="na">length</span><span class="o">);</span>
  <span class="n">os</span><span class="o">.</span><span class="na">writeBytes</span><span class="o">(</span><span class="n">iv</span><span class="o">);</span>
  <span class="n">os</span><span class="o">.</span><span class="na">writeBytes</span><span class="o">(</span><span class="n">ciphertext</span><span class="o">);</span>
  
  <span class="kt">byte</span><span class="o">[]</span> <span class="n">data</span> <span class="o">=</span> <span class="n">os</span><span class="o">.</span><span class="na">toByteArray</span><span class="o">();</span>
<span class="o">}</span>
</code></pre></div></div>

<p>The resulting data is what gets sent to the server.</p>
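<p>To make the IV's role concrete, here is a minimal round-trip sketch using only the JDK. It is an illustration under assumptions, not the shadowsocks implementation: the class <code class="language-plaintext highlighter-rouge">Rc4Md5RoundTrip</code> and its methods <code class="language-plaintext highlighter-rouge">seal</code> and <code class="language-plaintext highlighter-rouge">open</code> are hypothetical names, the ssKey is assumed to have been derived already (for example by generateSSKey above), and the IV length is passed in by the caller.</p>

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.SecretKeySpec;

final class Rc4Md5RoundTrip {
  /** MD5 over the concatenation of the given byte arrays. */
  static byte[] md5(byte[]... parts) {
    try {
      MessageDigest md = MessageDigest.getInstance("MD5");
      for (byte[] part : parts) {
        md.update(part);
      }
      return md.digest();
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  /** RC4 is symmetric: the same operation encrypts and decrypts. */
  static byte[] rc4(byte[] key, byte[] input) {
    try {
      Cipher cipher = Cipher.getInstance("RC4");
      cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "RC4"));
      return cipher.doFinal(input);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Sender side: the packet is the IV followed by the ciphertext. */
  static byte[] seal(byte[] ssKey, byte[] iv, byte[] plaintext) {
    byte[] ciphertext = rc4(md5(ssKey, iv), plaintext);
    byte[] packet = Arrays.copyOf(iv, iv.length + ciphertext.length);
    System.arraycopy(ciphertext, 0, packet, iv.length, ciphertext.length);
    return packet;
  }

  /** Receiver side: split off the IV, re-derive the RC4 key, decrypt the rest. */
  static byte[] open(byte[] ssKey, int ivLength, byte[] packet) {
    byte[] iv = Arrays.copyOfRange(packet, 0, ivLength);
    byte[] ciphertext = Arrays.copyOfRange(packet, ivLength, packet.length);
    return rc4(md5(ssKey, iv), ciphertext);
  }
}
```

<p>Only the IV travels in the clear; the password and the derived keys never leave either side, and a fresh IV per connection gives each connection a different RC4 key.</p>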

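<p>RC4-MD5 traffic has long been detectable, but the algorithm is simple enough to help us see why stream ciphers are fragile, which makes it an excellent first algorithm to study. Later posts will go on to cover how RC4 works and how it can be attacked.</p>]]></content><author><name>kikcat</name></author><category term="shadowsocks" /><category term="socks5" /><summary type="html"><![CDATA[Early versions of the shadowsocks protocol encrypted data with RC4. Because every piece of data was encrypted with the same keystream, this was a serious security risk, and the protocol later moved to the RC4-MD5 algorithm. Material on the RC4-MD5 protocol is hard to find, and RC4-MD5 is no longer secure, but it is still worth walking through: it is simple and easy to understand, and a later post will use it to show the weaknesses of RC4. RC4-MD5 still encrypts the data with RC4. The MD5 part means that the key is run through MD5 to obtain the RC4 key, which RC4 then uses to generate the keystream. In short, the user's password is turned into the RC4 key through the following steps: k1 = toBytes(password) k2 = ssKey(k1) rc4-md5-key = md5(toBytes(k2,iv))]]></summary></entry><entry><title type="html">Implementing a Socks5 Proxy in Java, Step by Step</title><link href="https://wiyi.org/socks5-implementation.html" rel="alternate" type="text/html" title="Implementing a Socks5 Proxy in Java, Step by Step" /><published>2021-11-27T00:00:00+00:00</published><updated>2021-11-27T00:00:00+00:00</updated><id>https://wiyi.org/socks5-implementation</id><content type="html" xml:base="https://wiyi.org/socks5-implementation.html"><![CDATA[<h2 id="1-前言">1. Introduction</h2>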

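<p><a href="https://wiyi.org/socks5-protocol-in-deep.html">The previous post</a> covered how the socks5 protocol works and its details. As that post showed, socks5 has three main phases: negotiation, request, and relay. This post walks through a simple socks5 proxy implementation in Java.</p>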

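<p>Note: the purpose of this post is only to deepen understanding of the socks5 protocol. The code here is not rigorous and does not handle edge cases; real-world development has to account for many more failure scenarios.</p>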

<p>The previous post included a sequence diagram of the overall socks5 flow; this post implements each of those steps in Java.</p>

<p><img src="https://wiyi.org/assets/images/socks5/client-socks5_f.jpg" alt="~replace~/assets/images/socks5/client-socks5_f.jpg" /></p>

<!--more-->
<h2 id="2-准备">2. Preparation</h2>

<p>The complete code is available at <a href="https://github.com/xingty/socks5-server">socks5-server</a>. To keep things easy to follow, this post uses blocking I/O (BIO), i.e. the <code class="language-plaintext highlighter-rouge">ServerSocket</code> and <code class="language-plaintext highlighter-rouge">Socket</code> classes. Before getting started, we need a class that uses ServerSocket to accept client connections:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">class</span> <span class="nc">Socks5Acceptor</span> <span class="kd">implements</span> <span class="nc">Runnable</span><span class="o">{</span>
  <span class="kd">private</span> <span class="kd">final</span> <span class="kt">int</span> <span class="n">port</span><span class="o">;</span>
  <span class="kd">private</span> <span class="kd">final</span> <span class="nc">BlockingQueue</span><span class="o">&lt;</span><span class="nc">Socket</span><span class="o">&gt;</span> <span class="n">queue</span><span class="o">;</span>

  <span class="nc">Socks5Acceptor</span><span class="o">(</span><span class="kt">int</span> <span class="n">port</span><span class="o">,</span><span class="nc">BlockingQueue</span><span class="o">&lt;</span><span class="nc">Socket</span><span class="o">&gt;</span> <span class="n">queue</span><span class="o">)</span> <span class="o">{</span>
    <span class="k">this</span><span class="o">.</span><span class="na">port</span> <span class="o">=</span> <span class="n">port</span><span class="o">;</span>
    <span class="k">this</span><span class="o">.</span><span class="na">queue</span> <span class="o">=</span> <span class="n">queue</span><span class="o">;</span>
  <span class="o">}</span>

  <span class="nd">@Override</span>
  <span class="kd">public</span> <span class="kt">void</span> <span class="nf">run</span><span class="o">()</span> <span class="o">{</span>
    <span class="k">try</span> <span class="o">{</span>
      <span class="nc">ServerSocket</span> <span class="n">socket</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">ServerSocket</span><span class="o">(</span><span class="n">port</span><span class="o">);</span>
      <span class="k">while</span> <span class="o">(</span><span class="kc">true</span><span class="o">)</span> <span class="o">{</span>
        <span class="nc">Socket</span> <span class="n">client</span> <span class="o">=</span> <span class="n">socket</span><span class="o">.</span><span class="na">accept</span><span class="o">();</span>
        <span class="n">queue</span><span class="o">.</span><span class="na">put</span><span class="o">(</span><span class="n">client</span><span class="o">);</span>
      <span class="o">}</span>
    <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="nc">Exception</span> <span class="n">e</span><span class="o">)</span> <span class="o">{</span>
      <span class="n">e</span><span class="o">.</span><span class="na">printStackTrace</span><span class="o">();</span>
    <span class="o">}</span>
  <span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>

<p>One thread accepts connections from clients and puts each Socket into a queue; alongside it, we need a Processor to handle the clients the Acceptor has accepted:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">class</span> <span class="nc">Socks5Processor</span> <span class="kd">implements</span> <span class="nc">Runnable</span><span class="o">{</span>
    <span class="kd">private</span> <span class="kd">final</span> <span class="nc">BlockingQueue</span><span class="o">&lt;</span><span class="nc">Socket</span><span class="o">&gt;</span> <span class="n">queue</span><span class="o">;</span>
    <span class="kd">private</span> <span class="kd">final</span> <span class="nc">Socks5Handler</span> <span class="n">handler</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Socks5Handler</span><span class="o">();</span>

    <span class="kd">public</span> <span class="nf">Socks5Processor</span><span class="o">(</span><span class="nc">BlockingQueue</span><span class="o">&lt;</span><span class="nc">Socket</span><span class="o">&gt;</span> <span class="n">queue</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">this</span><span class="o">.</span><span class="na">queue</span> <span class="o">=</span> <span class="n">queue</span><span class="o">;</span>
    <span class="o">}</span>

    <span class="nd">@Override</span>
    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">run</span><span class="o">()</span> <span class="o">{</span>
        <span class="k">while</span> <span class="o">(</span><span class="kc">true</span><span class="o">)</span> <span class="o">{</span>
            <span class="k">try</span> <span class="o">{</span>
                <span class="nc">Socket</span> <span class="n">client</span> <span class="o">=</span> <span class="n">queue</span><span class="o">.</span><span class="na">take</span><span class="o">();</span>
                <span class="n">handler</span><span class="o">.</span><span class="na">handle</span><span class="o">(</span><span class="n">client</span><span class="o">,</span><span class="kc">true</span><span class="o">);</span>
            <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="nc">InterruptedException</span> <span class="n">e</span><span class="o">)</span> <span class="o">{</span>
                <span class="n">e</span><span class="o">.</span><span class="na">printStackTrace</span><span class="o">();</span>
            <span class="o">}</span>
        <span class="o">}</span>
    <span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">Socks5Processor</code> takes client objects out of the BlockingQueue and hands each one to its <code class="language-plaintext highlighter-rouge">Socks5Handler</code>. <code class="language-plaintext highlighter-rouge">Socks5Handler</code> is the subject of the rest of this post; it contains the socks5 negotiation, authentication, and relay phases.</p>

<h2 id="3-socks5代理">3. The Socks5 Proxy</h2>

<h3 id="31-协商阶段">3.1 Negotiation Phase</h3>

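<p>According to <a href="https://datatracker.ietf.org/doc/html/rfc1928"><strong>RFC1928</strong></a>, when a socks5 client first connects it sends an authentication negotiation request. The layout from the previous post is repeated here:</p>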

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/*
 *  request:
 *  +----+----------+----------+
 *  |VER | NMETHODS | METHODS  |
 *  +----+----------+----------+
 *  | 1  |    1     | 1 to 255 |
 *  +----+----------+----------+
 *
 *   response:
 *   +----+--------+
 *   |VER | METHOD |
 *   +----+--------+
 *   | 1  |   1    |
 *   +----+--------+ 		
 */</span>
</code></pre></div></div>
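<p>The request and response layouts above can be turned into a small pure function, which is convenient to test without a socket. This is a sketch under assumptions rather than the code from the repository: the class <code class="language-plaintext highlighter-rouge">MethodSelection</code> and the method <code class="language-plaintext highlighter-rouge">reply</code> are hypothetical names, and the policy (prefer no-auth when anonymous access is allowed and offered, otherwise username/password) is one possible choice. The method codes 0x00, 0x02 and 0xFF come from RFC1928.</p>

```java
final class MethodSelection {
  static final byte NO_AUTH = 0x00;               // NO AUTHENTICATION REQUIRED
  static final byte USER_PASS = 0x02;             // USERNAME/PASSWORD
  static final byte NO_ACCEPTABLE = (byte) 0xFF;  // NO ACCEPTABLE METHODS

  /**
   * Builds the two-byte (VER, METHOD) response for a negotiation request.
   * buf holds the bytes read from the client and len is how many are valid.
   */
  static byte[] reply(byte[] buf, int len, boolean allowAnon) {
    // VER and NMETHODS must both be present, and VER must be 5.
    if (len < 2 || buf[0] != 0x05) {
      return new byte[]{5, NO_ACCEPTABLE};
    }
    int nMethods = buf[1] & 0xFF; // NMETHODS is an unsigned byte
    if (len < 2 + nMethods) {
      return new byte[]{5, NO_ACCEPTABLE};
    }
    boolean offersNoAuth = false;
    boolean offersUserPass = false;
    for (int i = 0; i < nMethods; i++) {
      byte method = buf[2 + i];
      if (method == NO_AUTH) offersNoAuth = true;
      if (method == USER_PASS) offersUserPass = true;
    }
    // Only pick a method the client actually offered.
    if (allowAnon && offersNoAuth) {
      return new byte[]{5, NO_AUTH};
    }
    if (offersUserPass) {
      return new byte[]{5, USER_PASS};
    }
    return new byte[]{5, NO_ACCEPTABLE};
  }
}
```

<p>A handler could write the returned bytes to the client's output stream and close the connection whenever the second byte is 0xFF.</p>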

<p>We need to read the byte stream from the socket and parse it field by field. The connect method looks like this:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">private</span> <span class="kt">void</span> <span class="nf">connect</span><span class="o">(</span><span class="nc">Socket</span> <span class="n">client</span><span class="o">,</span> <span class="kt">boolean</span> <span class="n">allowAnon</span><span class="o">)</span> <span class="kd">throws</span> <span class="nc">IOException</span> <span class="o">{</span>
  <span class="nc">InputStream</span> <span class="n">is</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="na">getInputStream</span><span class="o">();</span>
  <span class="nc">OutputStream</span> <span class="n">os</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="na">getOutputStream</span><span class="o">();</span>

  <span class="c1">//--------segment 1-----------</span>
  <span class="kt">byte</span><span class="o">[]</span> <span class="n">buffer</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">byte</span><span class="o">[</span><span class="mi">257</span><span class="o">];</span>
  <span class="kt">int</span> <span class="n">len</span> <span class="o">=</span> <span class="n">is</span><span class="o">.</span><span class="na">read</span><span class="o">(</span><span class="n">buffer</span><span class="o">);</span>
  <span class="k">if</span> <span class="o">(</span><span class="n">len</span> <span class="o">&lt;=</span> <span class="mi">0</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">os</span><span class="o">.</span><span class="na">close</span><span class="o">();</span>
    <span class="k">return</span><span class="o">;</span>
  <span class="o">}</span>
  
  <span class="c1">//VER</span>
  <span class="kt">int</span> <span class="n">version</span> <span class="o">=</span> <span class="n">buffer</span><span class="o">[</span><span class="mi">0</span><span class="o">];</span>
  <span class="k">if</span> <span class="o">(</span><span class="n">version</span> <span class="o">!=</span> <span class="mh">0x05</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">os</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="k">new</span> <span class="kt">byte</span><span class="o">[]{</span><span class="mi">5</span><span class="o">,-</span><span class="mi">1</span><span class="o">});</span>
    <span class="k">return</span><span class="o">;</span>
  <span class="o">}</span>
  <span class="c1">//--------segment 1-----------</span>

  <span class="c1">//--------segment 2-----------</span>
  <span class="c1">//NO AUTHENTICATION REQUIRED</span>
  <span class="k">if</span> <span class="o">(</span><span class="n">allowAnon</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">os</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="k">new</span> <span class="kt">byte</span><span class="o">[]{</span><span class="mi">5</span><span class="o">,</span><span class="mi">0</span><span class="o">});</span>
    <span class="n">waitingRequest</span><span class="o">(</span><span class="n">client</span><span class="o">);</span>
    <span class="k">return</span><span class="o">;</span>
  <span class="o">}</span>

  <span class="k">if</span> <span class="o">(</span><span class="n">len</span> <span class="o">&lt;=</span> <span class="mi">1</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">os</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="k">new</span> <span class="kt">byte</span><span class="o">[]{</span><span class="mi">5</span><span class="o">,-</span><span class="mi">1</span><span class="o">});</span> <span class="c1">//-1 = 0xFF</span>
    <span class="k">return</span><span class="o">;</span>
  <span class="o">}</span>
  <span class="c1">//--------segment 2-----------</span>

  <span class="c1">//--------segment 3-----------</span>
  <span class="c1">//NMETHODS</span>
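  <span class="kt">int</span> <span class="n">methods</span> <span class="o">=</span> <span class="n">buffer</span><span class="o">[</span><span class="mi">1</span><span class="o">]</span> <span class="o">&amp;</span> <span class="mh">0xFF</span><span class="o">;</span> <span class="c1">//NMETHODS is an unsigned byte</span>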
  <span class="k">for</span> <span class="o">(</span><span class="kt">int</span> <span class="n">i</span><span class="o">=</span><span class="mi">0</span><span class="o">;</span><span class="n">i</span><span class="o">&lt;</span><span class="n">methods</span><span class="o">;</span><span class="n">i</span><span class="o">++)</span> <span class="o">{</span>
    <span class="c1">//username password authentication</span>
    <span class="k">if</span> <span class="o">(</span><span class="n">buffer</span><span class="o">[</span><span class="n">i</span><span class="o">+</span><span class="mi">2</span><span class="o">]</span> <span class="o">==</span> <span class="mh">0x02</span><span class="o">)</span> <span class="o">{</span>
      <span class="n">os</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="k">new</span> <span class="kt">byte</span><span class="o">[]{</span><span class="mi">5</span><span class="o">,</span><span class="mi">2</span><span class="o">});</span>
      <span class="k">if</span> <span class="o">(</span><span class="n">doAuthentication</span><span class="o">(</span><span class="n">client</span><span class="o">))</span> <span class="o">{</span>
        <span class="n">waitingRequest</span><span class="o">(</span><span class="n">client</span><span class="o">);</span>
      <span class="o">}</span>

      <span class="k">return</span><span class="o">;</span>
    <span class="o">}</span>
  <span class="o">}</span>

  <span class="n">os</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="k">new</span> <span class="kt">byte</span><span class="o">[]{</span><span class="mi">5</span><span class="o">,-</span><span class="mi">1</span><span class="o">});</span>
<span class="o">}</span>
</code></pre></div></div>

<p>The code above is divided into three segments, which do the following:</p>

<p><strong>segment 1</strong></p>

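<p>Segment 1 creates a 257-byte array as the buffer (the largest possible greeting: 1 byte VER, 1 byte NMETHODS and up to 255 METHODS), reads the client's first request into it, then takes the first byte and checks the version number. If the version is not 5, it replies [5,0xFF] and stops handling the connection.</p>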

<p><strong>segment 2</strong></p>

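<p>Segment 2 checks whether the server is configured to allow anonymous access. If it is, the server replies [5,0] right away and no further authentication is needed.</p>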

<p><strong>segment 3</strong></p>

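<p>Segment 3 uses the <code class="language-plaintext highlighter-rouge">0x02</code> method, i.e. username/password authentication, which is described in detail in <a href="https://datatracker.ietf.org/doc/html/rfc1929">RFC1929</a>. If the client did not offer this method, the server replies [5,0xFF], meaning no acceptable method.</p>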
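<p>The connect method above relies on a single read() returning the whole greeting, and segment 2 answers [5,0] without checking that the client actually offered <code class="language-plaintext highlighter-rouge">0x00</code>. The following is a minimal sketch of a stricter selection step, not the code used in this article: the class name MethodNegotiation and the method selectMethod are hypothetical, and it assumes the server prefers anonymous access when both sides allow it.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper, not part of the article's server.
class MethodNegotiation {
    static final int NO_AUTH = 0x00;
    static final int USER_PASS = 0x02;
    static final int NO_ACCEPTABLE = 0xFF;

    /** Reads VER, NMETHODS and METHODS, then returns the method the server picks. */
    static int selectMethod(InputStream in, boolean allowAnon) throws IOException {
        DataInputStream data = new DataInputStream(in);
        if (data.readUnsignedByte() != 0x05) {
            return NO_ACCEPTABLE;
        }
        // readUnsignedByte avoids the sign problem of a plain byte-to-int cast.
        byte[] offered = new byte[data.readUnsignedByte()];
        // readFully blocks until every METHODS byte has arrived (or throws EOFException).
        data.readFully(offered);

        boolean offersNoAuth = false;
        boolean offersUserPass = false;
        for (byte b : offered) {
            int method = b & 0xFF;
            if (method == NO_AUTH) offersNoAuth = true;
            if (method == USER_PASS) offersUserPass = true;
        }
        if (allowAnon && offersNoAuth) return NO_AUTH;
        if (offersUserPass) return USER_PASS;
        return NO_ACCEPTABLE;
    }

    public static void main(String[] args) throws IOException {
        byte[] greeting = {5, 2, 0, 2}; // VER=5, NMETHODS=2, METHODS={0x00, 0x02}
        System.out.println(selectMethod(new ByteArrayInputStream(greeting), true));  // prints 0
        System.out.println(selectMethod(new ByteArrayInputStream(greeting), false)); // prints 2
    }
}
```

<p>The caller would then write [5, method] back to the client and branch on the returned value, much as segments 2 and 3 do.</p>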

<h3 id="32-认证阶段子协商">3.2 Authentication phase (sub-negotiation)</h3>

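<p>If we returned <code class="language-plaintext highlighter-rouge">0x02</code> in the previous phase, we enter the username/password authentication phase. The SOCKS5 client now sends its username and password, and the server answers with a status. The request and the response have the following formats:</p>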

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/**
*  +----+------+----------+------+----------+
*  |VER | ULEN |  UNAME   | PLEN |  PASSWD  |
*  +----+------+----------+------+----------+
*  | 1  |  1   | 1 to 255 |  1   | 1 to 255 |
*  +----+------+----------+------+----------+
*
*  +----+--------+
*  |VER | STATUS |
*  +----+--------+
*  | 1  |   1    |
*  +----+--------+
*/</span>
</code></pre></div></div>

<p>The doAuthentication method called in the 3.1 code reads this data from the client. The code is as follows:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">private</span> <span class="kd">static</span> <span class="kt">boolean</span> <span class="nf">doAuthentication</span><span class="o">(</span><span class="nc">Socket</span> <span class="n">client</span><span class="o">)</span> <span class="kd">throws</span> <span class="nc">IOException</span><span class="o">{</span>
  <span class="nc">InputStream</span> <span class="n">is</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="na">getInputStream</span><span class="o">();</span>
  <span class="nc">OutputStream</span> <span class="n">os</span> <span class="o">=</span> <span class="n">client</span><span class="o">.</span><span class="na">getOutputStream</span><span class="o">();</span>
  <span class="kt">byte</span><span class="o">[]</span> <span class="n">buffer</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">byte</span><span class="o">[</span><span class="mi">513</span><span class="o">];</span> <span class="c1">//max size: 1 + 1 + 255 + 1 + 255</span>
  <span class="kt">int</span> <span class="n">len</span> <span class="o">=</span> <span class="n">is</span><span class="o">.</span><span class="na">read</span><span class="o">(</span><span class="n">buffer</span><span class="o">);</span>
  <span class="k">if</span> <span class="o">(</span><span class="n">len</span> <span class="o">&lt;=</span> <span class="mi">0</span><span class="o">)</span> <span class="o">{</span>
    <span class="c1">//TODO throw exception</span>
    <span class="n">client</span><span class="o">.</span><span class="na">close</span><span class="o">();</span>
    <span class="k">return</span> <span class="kc">false</span><span class="o">;</span>
  <span class="o">}</span>

  <span class="kt">int</span> <span class="n">ver</span> <span class="o">=</span> <span class="n">buffer</span><span class="o">[</span><span class="mi">0</span><span class="o">];</span>
  <span class="k">if</span> <span class="o">(</span><span class="n">ver</span> <span class="o">!=</span> <span class="mh">0x01</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">os</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="k">new</span> <span class="kt">byte</span><span class="o">[]{</span><span class="mi">1</span><span class="o">,</span><span class="mi">1</span><span class="o">});</span> <span class="c1">//sub-negotiation VER is 0x01, not 0x05</span>
    <span class="k">return</span> <span class="kc">false</span><span class="o">;</span>
  <span class="o">}</span>

  <span class="k">if</span> <span class="o">(</span><span class="n">len</span> <span class="o">&lt;=</span> <span class="mi">1</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">os</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="k">new</span> <span class="kt">byte</span><span class="o">[]{</span><span class="mi">1</span><span class="o">,</span><span class="mi">1</span><span class="o">});</span> <span class="c1">//sub-negotiation VER is 0x01, not 0x05</span>
    <span class="k">return</span> <span class="kc">false</span><span class="o">;</span>
  <span class="o">}</span>

  <span class="nc">UserInfo</span> <span class="n">info</span> <span class="o">=</span> <span class="nc">UserInfo</span><span class="o">.</span><span class="na">parse</span><span class="o">(</span><span class="n">buffer</span><span class="o">);</span>
  <span class="k">if</span> <span class="o">(</span><span class="n">info</span><span class="o">.</span><span class="na">match</span><span class="o">(</span><span class="s">"bigbyto"</span><span class="o">,</span><span class="s">"123456"</span><span class="o">))</span> <span class="o">{</span>
    <span class="c1">//SUCCESSFUL</span>
    <span class="n">os</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="k">new</span> <span class="kt">byte</span><span class="o">[]{</span><span class="mi">1</span><span class="o">,</span><span class="mi">0</span><span class="o">});</span>
    <span class="k">return</span> <span class="kc">true</span><span class="o">;</span>
  <span class="o">}</span>

  <span class="c1">//AUTHENTICATION FAILURE</span>
  <span class="n">os</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="k">new</span> <span class="kt">byte</span><span class="o">[]{</span><span class="mi">1</span><span class="o">,</span><span class="mi">1</span><span class="o">});</span>
  <span class="k">return</span> <span class="kc">false</span><span class="o">;</span>
<span class="o">}</span>

<span class="kd">private</span> <span class="kd">static</span> <span class="kd">class</span> <span class="nc">UserInfo</span> <span class="o">{</span>
  <span class="nc">String</span> <span class="n">username</span><span class="o">;</span>
  <span class="nc">String</span> <span class="n">password</span><span class="o">;</span>

  <span class="kd">public</span> <span class="kd">static</span> <span class="nc">UserInfo</span> <span class="nf">parse</span><span class="o">(</span><span class="kt">byte</span><span class="o">[]</span> <span class="n">data</span><span class="o">)</span> <span class="o">{</span>
    <span class="kt">int</span> <span class="n">uLen</span> <span class="o">=</span> <span class="n">data</span><span class="o">[</span><span class="mi">1</span><span class="o">]</span> <span class="o">&amp;</span> <span class="mh">0xFF</span><span class="o">;</span>
    <span class="kt">byte</span><span class="o">[]</span> <span class="n">uBytes</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">byte</span><span class="o">[</span><span class="n">uLen</span><span class="o">];</span>
    <span class="nc">System</span><span class="o">.</span><span class="na">arraycopy</span><span class="o">(</span><span class="n">data</span><span class="o">,</span><span class="mi">2</span><span class="o">,</span><span class="n">uBytes</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="n">uBytes</span><span class="o">.</span><span class="na">length</span><span class="o">);</span>

    <span class="nc">UserInfo</span> <span class="n">info</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">UserInfo</span><span class="o">();</span>
    <span class="n">info</span><span class="o">.</span><span class="na">username</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">String</span><span class="o">(</span><span class="n">uBytes</span><span class="o">);</span>

    <span class="kt">int</span> <span class="n">pLen</span> <span class="o">=</span> <span class="n">data</span><span class="o">[</span><span class="n">uLen</span> <span class="o">+</span> <span class="mi">2</span><span class="o">]</span> <span class="o">&amp;</span> <span class="mh">0xFF</span><span class="o">;</span>
    <span class="kt">byte</span><span class="o">[]</span> <span class="n">pBytes</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">byte</span><span class="o">[</span><span class="n">pLen</span><span class="o">];</span>
    <span class="nc">System</span><span class="o">.</span><span class="na">arraycopy</span><span class="o">(</span><span class="n">data</span><span class="o">,</span><span class="n">uLen</span> <span class="o">+</span> <span class="mi">3</span><span class="o">,</span><span class="n">pBytes</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="n">pBytes</span><span class="o">.</span><span class="na">length</span><span class="o">);</span>
    <span class="n">info</span><span class="o">.</span><span class="na">password</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">String</span><span class="o">(</span><span class="n">pBytes</span><span class="o">);</span>

    <span class="k">return</span> <span class="n">info</span><span class="o">;</span>
  <span class="o">}</span>

  <span class="kd">public</span> <span class="kt">boolean</span> <span class="nf">match</span><span class="o">(</span><span class="nc">String</span> <span class="n">username</span><span class="o">,</span><span class="nc">String</span> <span class="n">password</span><span class="o">)</span> <span class="o">{</span>
    <span class="k">return</span> <span class="n">username</span><span class="o">.</span><span class="na">equals</span><span class="o">(</span><span class="k">this</span><span class="o">.</span><span class="na">username</span><span class="o">)</span> <span class="o">&amp;&amp;</span> <span class="n">password</span><span class="o">.</span><span class="na">equals</span><span class="o">(</span><span class="k">this</span><span class="o">.</span><span class="na">password</span><span class="o">);</span>
  <span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>

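<p>There is not much to say about this code. The logic is basically the same as in 3.1: read the authentication request sent by the SOCKS5 client, parse out the username and password, compare them, and return the authentication result.</p>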
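<p>UserInfo.parse above trusts the ULEN and PLEN bytes without checking them against the number of bytes actually read. As a hedged sketch of a more defensive variant (the class UserPassRequest is hypothetical, and decoding as UTF-8 is an assumption, since RFC1929 does not name a charset), the parser can validate every offset first and return null for a malformed message:</p>

```java
import java.nio.charset.StandardCharsets;

// Hypothetical defensive parser for the RFC1929 request; not the article's UserInfo class.
class UserPassRequest {
    final String username;
    final String password;

    UserPassRequest(String username, String password) {
        this.username = username;
        this.password = password;
    }

    /** Parses VER, ULEN, UNAME, PLEN, PASSWD; returns null if the message is malformed. */
    static UserPassRequest parse(byte[] data, int len) {
        if (len < 2 || data[0] != 0x01) {
            return null; // too short, or wrong sub-negotiation version
        }
        int uLen = data[1] & 0xFF;   // lengths are unsigned bytes
        int pLenIndex = 2 + uLen;    // position of the PLEN field
        if (len <= pLenIndex) {
            return null;             // username is truncated or PLEN is missing
        }
        int pLen = data[pLenIndex] & 0xFF;
        if (len < pLenIndex + 1 + pLen) {
            return null;             // password is truncated
        }
        String username = new String(data, 2, uLen, StandardCharsets.UTF_8);
        String password = new String(data, pLenIndex + 1, pLen, StandardCharsets.UTF_8);
        return new UserPassRequest(username, password);
    }

    public static void main(String[] args) {
        byte[] message = {1, 3, 'b', 'o', 'b', 2, 'p', 'w'};
        UserPassRequest request = parse(message, message.length);
        System.out.println(request.username + " / " + request.password); // prints bob / pw
    }
}
```

<p>With a parser like this, doAuthentication could answer [1,1] whenever parse returns null instead of failing with an array index exception.</p>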

<h3 id="33-请求阶段">3.3 Request phase</h3>

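<p>When the server does not require authentication, or once the client has passed authentication, we enter the request phase, in which the SOCKS5 client sends the address of the target site to the SOCKS5 server.</p>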

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/**
*   socks5 client request
*   +----+-----+-------+------+----------+----------+
*   |VER | CMD |  RSV  | ATYP | DST.ADDR | DST.PORT |
*   +----+-----+-------+------+----------+----------+
*   | 1  |  1  | X'00' |  1   | Variable |    2     |
*   +----+-----+-------+------+----------+----------+
*
*   socks5 server response
*   +----+-----+-------+------+----------+----------+
*   |VER | REP |  RSV  | ATYP | BND.ADDR | BND.PORT |
*   +----+-----+-------+------+----------+----------+
*   | 1  |  1  | X'00' |  1   | Variable |    2     |
*   +----+-----+-------+------+----------+----------+
*/</span>
</code></pre></div></div>

<p>If this part is hard to follow, see the previous article, <a href="https://wiyi.org/socks5-protocol-in-deep.html">Understanding how the SOCKS5 protocol works and its details</a>, or refer to <a href="https://datatracker.ietf.org/doc/html/rfc1928">RFC1928</a>.</p>
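<p>To make the server response in the diagram above concrete: when ATYP is IPv4 (0x01), the reply is always 10 bytes, namely VER, REP, RSV, ATYP, 4 address bytes and 2 port bytes. Below is a small sketch of a reply builder. The class Socks5Reply and its constant names are hypothetical; the REP values are the ones listed in RFC1928, and leaving BND.ADDR and BND.PORT as zeros is a simplification.</p>

```java
// Hypothetical reply builder; REP codes follow RFC1928 section 6.
class Socks5Reply {
    static final int SUCCEEDED = 0x00;
    static final int GENERAL_FAILURE = 0x01;
    static final int COMMAND_NOT_SUPPORTED = 0x07;
    static final int ADDRESS_TYPE_NOT_SUPPORTED = 0x08;

    /** Builds a 10-byte reply with ATYP=IPv4 and BND.ADDR/BND.PORT left as zeros. */
    static byte[] build(int rep) {
        return new byte[]{
            0x05,        // VER
            (byte) rep,  // REP
            0x00,        // RSV
            0x01,        // ATYP = IPv4
            0, 0, 0, 0,  // BND.ADDR (4 bytes)
            0, 0         // BND.PORT (2 bytes)
        };
    }

    public static void main(String[] args) {
        byte[] reply = build(COMMAND_NOT_SUPPORTED);
        System.out.println(reply.length + " bytes, REP=" + reply[1]); // prints 10 bytes, REP=7
    }
}
```

<p>Building every reply through one helper keeps the length fixed at 10 bytes, so a reply cannot end up one byte short by accident.</p>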

<p>As described earlier, in this phase we need to parse the target server address sent by the client and then initiate a TCP/UDP request to it.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">private</span> <span class="kt">void</span> <span class="nf">waitingRequest</span><span class="o">(</span><span class="nc">Socket</span> <span class="n">socket</span><span class="o">)</span> <span class="kd">throws</span> <span class="nc">IOException</span><span class="o">{</span>
  <span class="nc">InputStream</span> <span class="n">is</span> <span class="o">=</span> <span class="n">socket</span><span class="o">.</span><span class="na">getInputStream</span><span class="o">();</span>
  <span class="nc">OutputStream</span> <span class="n">os</span> <span class="o">=</span> <span class="n">socket</span><span class="o">.</span><span class="na">getOutputStream</span><span class="o">();</span>

  <span class="kt">byte</span><span class="o">[]</span> <span class="n">buffer</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">byte</span><span class="o">[</span><span class="mi">262</span><span class="o">];</span> <span class="c1">//max size: 4 + (1 + 255) + 2</span>
  <span class="kt">int</span> <span class="n">len</span> <span class="o">=</span> <span class="n">is</span><span class="o">.</span><span class="na">read</span><span class="o">(</span><span class="n">buffer</span><span class="o">);</span>
  <span class="k">if</span> <span class="o">(</span><span class="n">len</span> <span class="o">&lt;=</span> <span class="mi">0</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">socket</span><span class="o">.</span><span class="na">close</span><span class="o">();</span>
    <span class="k">return</span><span class="o">;</span>
  <span class="o">}</span>

  <span class="kt">int</span> <span class="n">ver</span> <span class="o">=</span> <span class="n">buffer</span><span class="o">[</span><span class="mi">0</span><span class="o">];</span>
  <span class="k">if</span> <span class="o">(</span><span class="n">ver</span> <span class="o">!=</span> <span class="mh">0x05</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">os</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="k">new</span> <span class="kt">byte</span><span class="o">[]{</span><span class="mi">5</span><span class="o">,</span><span class="mi">1</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">1</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">});</span> <span class="c1">//an IPv4 reply is 10 bytes</span>
    <span class="k">return</span><span class="o">;</span>
  <span class="o">}</span>

  <span class="kt">int</span> <span class="n">cmd</span> <span class="o">=</span> <span class="n">buffer</span><span class="o">[</span><span class="mi">1</span><span class="o">];</span>
  <span class="c1">//ONLY ACCEPT CONNECT</span>
  <span class="k">if</span> <span class="o">(</span><span class="n">cmd</span> <span class="o">!=</span> <span class="mh">0x01</span><span class="o">)</span> <span class="o">{</span>
    <span class="n">os</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="k">new</span> <span class="kt">byte</span><span class="o">[]{</span><span class="mi">5</span><span class="o">,</span><span class="mi">7</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">1</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">});</span> <span class="c1">//REP 0x07 = command not supported</span>
    <span class="k">return</span><span class="o">;</span>
  <span class="o">}</span>

  <span class="nc">RemoteAddr</span> <span class="n">addr</span> <span class="o">=</span> <span class="n">getRemoteAddrInfo</span><span class="o">(</span><span class="n">buffer</span><span class="o">,</span><span class="n">len</span><span class="o">);</span>
  <span class="n">socket</span><span class="o">.</span><span class="na">getOutputStream</span><span class="o">().</span><span class="na">write</span><span class="o">(</span><span class="k">new</span> <span class="kt">byte</span><span class="o">[]{</span><span class="mi">5</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">1</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="mi">0</span><span class="o">});</span>

  <span class="n">relayHandler</span><span class="o">.</span><span class="na">doRelay</span><span class="o">(</span><span class="n">socket</span><span class="o">,</span> <span class="n">addr</span><span class="o">.</span><span class="na">addr</span><span class="o">,</span><span class="n">addr</span><span class="o">.</span><span class="na">port</span><span class="o">);</span>
<span class="o">}</span>

<span class="kd">private</span> <span class="nc">RemoteAddr</span> <span class="nf">getRemoteAddrInfo</span><span class="o">(</span><span class="kt">byte</span><span class="o">[]</span> <span class="n">bytes</span><span class="o">,</span><span class="kt">int</span> <span class="n">len</span><span class="o">)</span> <span class="o">{</span>
    <span class="kt">byte</span> <span class="n">atype</span> <span class="o">=</span> <span class="n">bytes</span><span class="o">[</span><span class="mi">3</span><span class="o">];</span>
    <span class="nc">String</span> <span class="n">addr</span><span class="o">;</span>
    <span class="k">try</span> <span class="o">{</span>
        <span class="k">if</span> <span class="o">(</span><span class="n">atype</span> <span class="o">==</span> <span class="n">ATYPE_IPv4</span><span class="o">)</span> <span class="o">{</span>
            <span class="kt">byte</span><span class="o">[]</span> <span class="n">ipv4</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">byte</span><span class="o">[</span><span class="mi">4</span><span class="o">];</span>
            <span class="nc">System</span><span class="o">.</span><span class="na">arraycopy</span><span class="o">(</span><span class="n">bytes</span><span class="o">,</span><span class="mi">4</span><span class="o">,</span><span class="n">ipv4</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="n">ipv4</span><span class="o">.</span><span class="na">length</span><span class="o">);</span>
            <span class="n">addr</span> <span class="o">=</span> <span class="nc">Inet4Address</span><span class="o">.</span><span class="na">getByAddress</span><span class="o">(</span><span class="n">ipv4</span><span class="o">).</span><span class="na">getHostAddress</span><span class="o">();</span>
        <span class="o">}</span>
        <span class="k">else</span> <span class="nf">if</span> <span class="o">(</span><span class="n">atype</span> <span class="o">==</span> <span class="n">ATYPE_IPv6</span><span class="o">)</span> <span class="o">{</span>
            <span class="kt">byte</span><span class="o">[]</span> <span class="n">ipv6</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">byte</span><span class="o">[</span><span class="mi">16</span><span class="o">];</span>
            <span class="nc">System</span><span class="o">.</span><span class="na">arraycopy</span><span class="o">(</span><span class="n">bytes</span><span class="o">,</span><span class="mi">4</span><span class="o">,</span><span class="n">ipv6</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="n">ipv6</span><span class="o">.</span><span class="na">length</span><span class="o">);</span>
            <span class="n">addr</span> <span class="o">=</span> <span class="nc">Inet6Address</span><span class="o">.</span><span class="na">getByAddress</span><span class="o">(</span><span class="n">ipv6</span><span class="o">).</span><span class="na">getHostAddress</span><span class="o">();</span>
        <span class="o">}</span>
        <span class="k">else</span> <span class="nf">if</span> <span class="o">(</span><span class="n">atype</span> <span class="o">==</span> <span class="no">ATYPE_DOMAINNAME</span><span class="o">)</span> <span class="o">{</span>
            <span class="kt">int</span> <span class="n">domainLen</span> <span class="o">=</span> <span class="n">bytes</span><span class="o">[</span><span class="mi">4</span><span class="o">]</span> <span class="o">&amp;</span> <span class="mh">0xFF</span><span class="o">;</span>
            <span class="kt">byte</span><span class="o">[]</span> <span class="n">domain</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">byte</span><span class="o">[</span><span class="n">domainLen</span><span class="o">];</span>
            <span class="nc">System</span><span class="o">.</span><span class="na">arraycopy</span><span class="o">(</span><span class="n">bytes</span><span class="o">,</span><span class="mi">5</span><span class="o">,</span><span class="n">domain</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="n">domain</span><span class="o">.</span><span class="na">length</span><span class="o">);</span>
            <span class="n">addr</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">String</span><span class="o">(</span><span class="n">domain</span><span class="o">);</span>
        <span class="o">}</span>
        <span class="k">else</span> <span class="o">{</span>
            <span class="k">throw</span> <span class="k">new</span> <span class="nf">RuntimeException</span><span class="o">(</span><span class="s">"Unknown address type: "</span> <span class="o">+</span> <span class="n">atype</span><span class="o">);</span>
        <span class="o">}</span>
    <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="nc">UnknownHostException</span> <span class="n">e</span><span class="o">)</span> <span class="o">{</span>
        <span class="k">throw</span> <span class="k">new</span> <span class="nf">RuntimeException</span><span class="o">(</span><span class="n">e</span><span class="o">);</span>
    <span class="o">}</span>

    <span class="nc">RemoteAddr</span> <span class="n">info</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">RemoteAddr</span><span class="o">();</span>
    <span class="n">info</span><span class="o">.</span><span class="na">addr</span> <span class="o">=</span> <span class="n">addr</span><span class="o">.</span><span class="na">trim</span><span class="o">();</span>

    <span class="nc">ByteBuffer</span> <span class="n">buffer</span> <span class="o">=</span> <span class="nc">ByteBuffer</span><span class="o">.</span><span class="na">wrap</span><span class="o">(</span><span class="k">new</span> <span class="kt">byte</span><span class="o">[]{</span><span class="n">bytes</span><span class="o">[</span><span class="n">len</span><span class="o">-</span><span class="mi">2</span><span class="o">],</span><span class="n">bytes</span><span class="o">[</span><span class="n">len</span><span class="o">-</span><span class="mi">1</span><span class="o">]});</span>
    <span class="n">info</span><span class="o">.</span><span class="na">port</span> <span class="o">=</span> <span class="n">buffer</span><span class="o">.</span><span class="na">asCharBuffer</span><span class="o">().</span><span class="na">get</span><span class="o">();</span>

    <span class="k">return</span> <span class="n">info</span><span class="o">;</span>
<span class="o">}</span>
</code></pre></div></div>

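<p>The code above parses the data sent by the client and extracts the target host's address (IP or domain name) and port. At the end, it hands this information to relayHandler to do the relaying.</p>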

<p>At this point our SOCKS5 server has taken on the role of the middleman and established a client -&gt; socks5 -&gt; target server link.</p>
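<p>getRemoteAddrInfo above takes the port from the last two bytes that were read, which only works if the read returned exactly one request. As an alternative sketch (the class Socks5Target is hypothetical, and decoding the domain name as US-ASCII is an assumption), the port position can be computed from ATYP and the address length, with every offset checked against the bytes actually received:</p>

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical parser for the DST.ADDR / DST.PORT part of a SOCKS5 request.
class Socks5Target {
    final String host;
    final int port;

    Socks5Target(String host, int port) {
        this.host = host;
        this.port = port;
    }

    static Socks5Target parse(byte[] req, int len) {
        if (len < 5) throw new IllegalArgumentException("request too short");
        int atyp = req[3] & 0xFF;
        int addrStart;
        int addrLen;
        switch (atyp) {
            case 0x01: addrStart = 4; addrLen = 4; break;               // IPv4
            case 0x04: addrStart = 4; addrLen = 16; break;              // IPv6
            case 0x03: addrStart = 5; addrLen = req[4] & 0xFF; break;   // length-prefixed domain
            default: throw new IllegalArgumentException("unknown ATYP " + atyp);
        }
        int portIndex = addrStart + addrLen;
        if (len < portIndex + 2) throw new IllegalArgumentException("truncated request");

        String host;
        if (atyp == 0x03) {
            host = new String(req, addrStart, addrLen, StandardCharsets.US_ASCII);
        } else {
            try {
                byte[] raw = Arrays.copyOfRange(req, addrStart, portIndex);
                host = InetAddress.getByAddress(raw).getHostAddress();
            } catch (UnknownHostException e) {
                throw new IllegalArgumentException(e);
            }
        }
        // DST.PORT is a 16-bit unsigned value in network byte order (big-endian).
        int port = ((req[portIndex] & 0xFF) << 8) | (req[portIndex + 1] & 0xFF);
        return new Socks5Target(host, port);
    }

    public static void main(String[] args) {
        byte[] req = {5, 1, 0, 3, 11, 'e', 'x', 'a', 'm', 'p', 'l', 'e', '.', 'c', 'o', 'm', 1, (byte) 0xBB};
        Socks5Target target = parse(req, req.length);
        System.out.println(target.host + ":" + target.port); // prints example.com:443
    }
}
```

<p>Because the port is located from the parsed address length, any extra bytes the client sends right after the request do not shift the result.</p>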

<h3 id="34-relay阶段">3.4 Relay phase</h3>

<p>The relay phase forwards all data from the client's InputStream to the target server's OutputStream and, in the other direction, all data from the target server's InputStream to the client's OutputStream. The Socks5RelayHandler code is as follows:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">class</span> <span class="nc">Socks5RelayHandler</span> <span class="o">{</span>

    <span class="kd">public</span> <span class="kt">void</span> <span class="nf">doRelay</span><span class="o">(</span><span class="nc">Socket</span> <span class="n">client</span><span class="o">,</span> <span class="nc">String</span> <span class="n">addr</span><span class="o">,</span> <span class="kt">int</span> <span class="n">port</span><span class="o">)</span> <span class="o">{</span>
        <span class="nc">Socket</span> <span class="n">relay</span> <span class="o">=</span> <span class="kc">null</span><span class="o">;</span>
        <span class="k">try</span> <span class="o">{</span>
            <span class="n">relay</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Socket</span><span class="o">(</span><span class="n">addr</span><span class="o">,</span><span class="n">port</span><span class="o">);</span>
            <span class="n">relay</span><span class="o">.</span><span class="na">setSoTimeout</span><span class="o">(</span><span class="mi">30</span> <span class="o">*</span> <span class="mi">1000</span><span class="o">);</span>
            <span class="nc">Socks5Pipe</span> <span class="n">p1</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Socks5Pipe</span><span class="o">(</span><span class="n">client</span><span class="o">,</span><span class="n">relay</span><span class="o">,</span><span class="s">"client"</span><span class="o">);</span>
            <span class="nc">Socks5Pipe</span> <span class="n">p2</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Socks5Pipe</span><span class="o">(</span><span class="n">relay</span><span class="o">,</span><span class="n">client</span><span class="o">,</span><span class="s">"server"</span><span class="o">);</span>

            <span class="n">p1</span><span class="o">.</span><span class="na">relay</span><span class="o">();</span>
            <span class="n">p2</span><span class="o">.</span><span class="na">relay</span><span class="o">();</span>
        <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="nc">IOException</span> <span class="n">e</span><span class="o">)</span> <span class="o">{</span>
            <span class="k">try</span> <span class="o">{</span>
                <span class="k">if</span> <span class="o">(</span><span class="n">relay</span> <span class="o">!=</span> <span class="kc">null</span> <span class="o">&amp;&amp;</span> <span class="o">!</span><span class="n">relay</span><span class="o">.</span><span class="na">isClosed</span><span class="o">())</span> <span class="o">{</span>
                    <span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">printf</span><span class="o">(</span><span class="s">"address: %s, reason: %s"</span><span class="o">,</span><span class="n">relay</span><span class="o">.</span><span class="na">getInetAddress</span><span class="o">(),</span><span class="n">e</span><span class="o">.</span><span class="na">getMessage</span><span class="o">());</span>
                    <span class="n">relay</span><span class="o">.</span><span class="na">close</span><span class="o">();</span>
                <span class="o">}</span>

                <span class="k">if</span> <span class="o">(!</span><span class="n">client</span><span class="o">.</span><span class="na">isClosed</span><span class="o">())</span> <span class="o">{</span>
                    <span class="n">client</span><span class="o">.</span><span class="na">close</span><span class="o">();</span>
                <span class="o">}</span>
            <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="nc">Exception</span> <span class="n">e1</span><span class="o">)</span> <span class="o">{</span>
                <span class="n">e1</span><span class="o">.</span><span class="na">printStackTrace</span><span class="o">();</span>
            <span class="o">}</span>
        <span class="o">}</span>
    <span class="o">}</span>

    <span class="kd">static</span> <span class="kd">class</span> <span class="nc">Socks5Pipe</span> <span class="kd">implements</span> <span class="nc">Runnable</span><span class="o">{</span>
        <span class="kd">private</span> <span class="nc">String</span> <span class="n">id</span><span class="o">;</span>
        <span class="kd">private</span> <span class="kd">final</span> <span class="nc">Socket</span> <span class="n">source</span><span class="o">;</span>
        <span class="kd">private</span> <span class="kd">final</span> <span class="nc">Socket</span> <span class="n">target</span><span class="o">;</span>

        <span class="nc">Socks5Pipe</span><span class="o">(</span><span class="nc">Socket</span> <span class="n">source</span><span class="o">,</span> <span class="nc">Socket</span> <span class="n">target</span><span class="o">,</span><span class="nc">String</span> <span class="n">id</span><span class="o">)</span> <span class="o">{</span>
            <span class="k">this</span><span class="o">.</span><span class="na">source</span> <span class="o">=</span> <span class="n">source</span><span class="o">;</span>
            <span class="k">this</span><span class="o">.</span><span class="na">target</span> <span class="o">=</span> <span class="n">target</span><span class="o">;</span>
            <span class="k">this</span><span class="o">.</span><span class="na">id</span> <span class="o">=</span> <span class="n">id</span><span class="o">;</span>
        <span class="o">}</span>

        <span class="kd">public</span> <span class="kt">void</span> <span class="nf">relay</span><span class="o">()</span> <span class="o">{</span>
            <span class="nc">Thread</span> <span class="n">t</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Thread</span><span class="o">(</span><span class="k">this</span><span class="o">);</span>
            <span class="n">t</span><span class="o">.</span><span class="na">setName</span><span class="o">(</span><span class="s">"Socks5-Thread-"</span> <span class="o">+</span> <span class="n">target</span><span class="o">.</span><span class="na">getInetAddress</span><span class="o">().</span><span class="na">toString</span><span class="o">());</span>
            <span class="n">t</span><span class="o">.</span><span class="na">start</span><span class="o">();</span>
        <span class="o">}</span>

        <span class="nd">@Override</span>
        <span class="kd">public</span> <span class="kt">void</span> <span class="nf">run</span><span class="o">()</span> <span class="o">{</span>
            <span class="k">try</span> <span class="o">{</span>
                <span class="nc">InputStream</span> <span class="n">sis</span> <span class="o">=</span> <span class="n">source</span><span class="o">.</span><span class="na">getInputStream</span><span class="o">();</span>
                <span class="nc">OutputStream</span> <span class="n">tos</span> <span class="o">=</span> <span class="n">target</span><span class="o">.</span><span class="na">getOutputStream</span><span class="o">();</span>

                <span class="kt">byte</span><span class="o">[]</span> <span class="n">buffer</span> <span class="o">=</span> <span class="k">new</span> <span class="kt">byte</span><span class="o">[</span><span class="mi">1024</span><span class="o">];</span>
                <span class="kt">int</span> <span class="n">len</span><span class="o">;</span>
                <span class="k">while</span> <span class="o">((</span><span class="n">len</span> <span class="o">=</span> <span class="n">sis</span><span class="o">.</span><span class="na">read</span><span class="o">(</span><span class="n">buffer</span><span class="o">))</span> <span class="o">&gt;</span> <span class="mi">0</span><span class="o">)</span> <span class="o">{</span>
                    <span class="n">tos</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="n">buffer</span><span class="o">,</span><span class="mi">0</span><span class="o">,</span><span class="n">len</span><span class="o">);</span>
                <span class="o">}</span>

                <span class="n">close</span><span class="o">();</span>
            <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="nc">IOException</span> <span class="n">e</span><span class="o">)</span> <span class="o">{</span>
                <span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">printf</span><span class="o">(</span><span class="s">"address: %s, reason: %s\n"</span><span class="o">,</span><span class="n">source</span><span class="o">.</span><span class="na">getInetAddress</span><span class="o">(),</span><span class="n">e</span><span class="o">.</span><span class="na">getMessage</span><span class="o">());</span>
                <span class="n">close</span><span class="o">();</span>
                <span class="n">e</span><span class="o">.</span><span class="na">printStackTrace</span><span class="o">();</span>
            <span class="o">}</span>
        <span class="o">}</span>

        <span class="kd">public</span> <span class="kt">void</span> <span class="nf">close</span><span class="o">()</span> <span class="o">{</span>
            <span class="k">try</span> <span class="o">{</span>
                <span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="n">id</span> <span class="o">+</span> <span class="s">" close"</span><span class="o">);</span>

                <span class="k">if</span> <span class="o">(!</span><span class="n">source</span><span class="o">.</span><span class="na">isClosed</span><span class="o">())</span> <span class="o">{</span>
                    <span class="n">source</span><span class="o">.</span><span class="na">close</span><span class="o">();</span>
                <span class="o">}</span>

                <span class="k">if</span> <span class="o">(!</span><span class="n">target</span><span class="o">.</span><span class="na">isClosed</span><span class="o">())</span> <span class="o">{</span>
                    <span class="n">target</span><span class="o">.</span><span class="na">shutdownInput</span><span class="o">();</span>
                <span class="o">}</span>
            <span class="o">}</span> <span class="k">catch</span> <span class="o">(</span><span class="nc">IOException</span> <span class="n">e</span><span class="o">)</span> <span class="o">{</span>
                <span class="nc">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span><span class="s">"close socket error"</span><span class="o">);</span>
                <span class="n">e</span><span class="o">.</span><span class="na">printStackTrace</span><span class="o">();</span>
            <span class="o">}</span>
        <span class="o">}</span>
    <span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>

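<h2 id="4-组装执行">4. Assembling and running</h2>

<p>The code above covers the main flow. What remains is to wire the pieces together and run them, which takes very little code:</p>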

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">public</span> <span class="kd">class</span> <span class="nc">Socks5Server</span> <span class="o">{</span>
    <span class="kd">public</span> <span class="kd">static</span> <span class="kt">void</span> <span class="nf">main</span><span class="o">(</span><span class="nc">String</span><span class="o">[]</span> <span class="n">args</span><span class="o">)</span> <span class="kd">throws</span> <span class="nc">Exception</span><span class="o">{</span>
        <span class="nc">LinkedBlockingQueue</span><span class="o">&lt;</span><span class="nc">Socket</span><span class="o">&gt;</span> <span class="n">queue</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">LinkedBlockingQueue</span><span class="o">&lt;&gt;();</span>

        <span class="nc">Socks5Acceptor</span> <span class="n">acceptor</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Socks5Acceptor</span><span class="o">(</span><span class="mi">7582</span><span class="o">,</span><span class="n">queue</span><span class="o">);</span>
        <span class="nc">Socks5Processor</span> <span class="n">processor</span> <span class="o">=</span> <span class="k">new</span> <span class="nc">Socks5Processor</span><span class="o">(</span><span class="n">queue</span><span class="o">);</span>

        <span class="k">new</span> <span class="nf">Thread</span><span class="o">(</span><span class="n">acceptor</span><span class="o">).</span><span class="na">start</span><span class="o">();</span>
        <span class="k">new</span> <span class="nf">Thread</span><span class="o">(</span><span class="n">processor</span><span class="o">).</span><span class="na">start</span><span class="o">();</span>

        <span class="nc">Thread</span><span class="o">.</span><span class="na">currentThread</span><span class="o">().</span><span class="na">join</span><span class="o">();</span>
    <span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>

<p>Once this code is running, the server listens on local port 7582. We can use curl to check that it works:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># No authentication required</span>
curl <span class="nt">-x</span> socks5h://127.0.0.1:7582 http://a.baidu.com

<span class="c"># With authentication</span>
curl <span class="nt">-x</span> socks5h://bigbyto:123456@127.0.0.1:7582 http://a.baidu.com

<span class="c"># The response body is OK</span>
<span class="c">#OK</span>
</code></pre></div></div>

<h2 id="5-相关阅读">5. Related reading</h2>

<ul>
  <li><a href="https://wiyi.org/socks5-protocol-in-deep.html">Understanding how the socks5 protocol works and its protocol details</a></li>
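</ul>]]></content><author><name>kikcat</name></author><category term="socks5" /><summary type="html"><![CDATA[1. Preface The previous article covered how the socks5 protocol works and its details. From it we learned that socks5 has three main stages: negotiation, request, and relay. This article walks through implementing a simple socks5 proxy in Java. Note: the purpose of this article is only to deepen understanding of the socks5 protocol; the code is not rigorous and does not handle other cases. Real-world development needs to account for many more failure scenarios. The previous article included a sequence diagram of the overall socks5 workflow; this article implements each of those steps in Java.]]></summary></entry><entry><title type="html">Understanding how the socks5 protocol works and its protocol details</title><link href="https://wiyi.org/socks5-protocol-in-deep.html" rel="alternate" type="text/html" title="Understanding how the socks5 protocol works and its protocol details" /><published>2021-11-22T00:00:00+00:00</published><updated>2021-11-22T00:00:00+00:00</updated><id>https://wiyi.org/socks5-protocol</id><content type="html" xml:base="https://wiyi.org/socks5-protocol-in-deep.html"><![CDATA[<h2 id="1-前言">1. Preface</h2>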

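<p>This article walks through the socks5 protocol step by step. It starts with a brief introduction to the socks protocol, then covers the scenarios where socks5 is used, then how it works, and finally the protocol details (handshake and data forwarding).</p>

<h2 id="2-协议介绍">2. Protocol overview</h2>

<h3 id="21-什么是socks协议">2.1 What is the socks protocol</h3>

<p>What is the socks protocol? Here is Wikipedia's definition:</p>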

<blockquote>
  <p>SOCKS is an Internet protocol that exchanges network packets between a client and server through a proxy server</p>
</blockquote>

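<p>In other words, socks is an Internet protocol that exchanges network data between a client and a server through a proxy server. Put simply, it is a proxy protocol that acts as a middleman, forwarding data between the client and the target host.</p>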

<p><img src="https://user-images.githubusercontent.com/3600657/169195754-0ed9658d-717e-4edd-915f-e2f0b7596818.jpeg" alt="~replace~/assets/images/socks5/socks5_01.jpeg" /></p>

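<p>The socks protocol sits at layer 5 of the OSI model, the session layer.</p>

<!--more-->
<h3 id="22-socks协议有什么用">2.2 What the socks protocol is for</h3>

<p>For many Chinese internet users, the word "proxy" immediately brings to mind circumventing the Great Firewall, and since socks5 is a proxy protocol, surely it can be used for that too. Unfortunately, although it is a proxy protocol, it is not suitable for circumvention: its data is transmitted in plaintext and is easily blocked by the firewall.</p>

<p>The socks protocol has a long history. When it first appeared, the Chinese internet had barely taken shape, let alone the Great Firewall, so it was never designed for circumvention. In the early days of the internet, corporate internal networks were placed behind firewalls for security, with the side effect that accessing internal resources became cumbersome. The socks protocol was created to solve this problem.</p>

<p>socks effectively opens a hole in the firewall through which legitimate users can connect to the internal network, access internal resources, and perform administration.</p>

<h3 id="23-什么是socks5协议">2.3 What is the socks5 protocol</h3>

<p>As the name suggests, socks5 is the fifth version of the socks protocol. It extends socks4 by adding <strong>UDP forwarding</strong> and <strong>authentication</strong>. The one regret is that socks5 is not compatible with socks4. socks5 was formally published by the IETF in 1996. After all these years, socks5 is what the internet mostly uses, and socks4 has left the stage.</p>

<p>In practice you do not need to go back and study socks4, because socks5 can fully replace it, so there is no need to feel any pressure about that.</p>

<h3 id="24-工作过程">2.4 How it works</h3>

<p>Before describing how socks5 works, let us first look at how a request proceeds when the browser has no proxy configured. Suppose the reader visits this blog in a browser (assume over HTTP). The flow is:</p>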

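<ol>
  <li>
    <p>Establish a TCP connection</p>

    <p>The browser opens a TCP connection to the server hosting this blog. After the three-way handshake, the two sides have a connection for data transfer.</p>
  </li>
  <li>
    <p>Send an HTTP request</p>

    <p>Once the TCP connection is established, the browser sends an HTTP request over it:</p>

    <div class="language-http highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="err">GET / HTTP/1.1
Host: wiyi.org
</span></code></pre></div>    </div>
  </li>
  <li>
    <p>The server responds with HTML, which the browser renders on receipt.</p>
  </li>
</ol>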

<p><img src="https://user-images.githubusercontent.com/3600657/169195743-8bae7f81-a217-4f43-bb8b-64480ed6fe5b.png" alt="https://user-images.githubusercontent.com/3600657/169195743-8bae7f81-a217-4f43-bb8b-64480ed6fe5b.png" /></p>

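<p><strong>Figure 2.1</strong></p>

<p>That is the normal request flow. If the reader configures a socks5 proxy in the browser, things get a little more involved. Assume the socks5 proxy runs on the reader's local machine on port 7582. The workflow is:</p>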

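<ol>
  <li>
    <p>The browser establishes a TCP connection to the socks5 proxy</p>

    <p>Unlike the flow above, there is now a middleman between the browser and the server, namely socks5, so the browser first needs to establish a connection to the socks5 server.</p>
  </li>
  <li>
    <p>socks5 negotiation stage</p>

    <p>Before the browser formally sends a request to the socks5 server, the two sides must negotiate things such as the protocol version and the supported authentication methods. Negotiation must succeed before they can proceed. The details are described in the next subsection.</p>
  </li>
  <li>
    <p>socks5 request stage</p>

    <p>After a successful negotiation, the browser sends a request to the socks5 proxy. The request contains the domain name or IP of the server it wants to reach, the port, and so on.</p>
  </li>
  <li>
    <p>socks5 relay stage</p>

    <p>On receiving the browser's request, socks5 parses it and then opens a TCP connection to the target server.</p>
  </li>
  <li>
    <p>Data transfer stage</p>

    <p>After the steps above, we have established the connections browser –&gt; socks5 and socks5 –&gt; target server. In this stage the browser starts sending data to the socks5 proxy, which forwards it to the target server.</p>
  </li>
</ol>

<p>There are more steps than before, but the essence is unchanged and easy to follow. The flow is summarized in the figure below:</p>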

<p><img src="https://wiyi.org/assets/images/socks5/client-socks5_f.jpg" alt="" /></p>

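<p><strong>Figure 2.2</strong></p>

<h3 id="25-协议细节">2.5 Protocol details</h3>

<p>The previous subsection outlined how a socks5 proxy works. We can summarize the process in three stages: the handshake stage, the request stage, and the relay stage.</p>

<h4 id="251-握手阶段">2.5.1 Handshake stage</h4>

<p>The handshake stage consists of negotiation and sub-negotiation; we discuss them separately.</p>

<p><strong>2.5.1.1 Negotiation stage</strong></p>

<p>In this stage, the client sends a request to the socks5 server with the following content:</p>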

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>+----+----------+----------+
|VER | NMETHODS | METHODS  |
+----+----------+----------+
| 1  |    1     | 1 to 255 |
+----+----------+----------+

<span class="c"># The numbers in the table are field sizes in bytes; the tables below follow the same convention</span>
</code></pre></div></div>

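<p>VER: protocol version; <code class="language-plaintext highlighter-rouge">0x05</code> for socks5</p>

<p>NMETHODS: the number of authentication methods the client supports</p>

<p>METHODS: corresponds to NMETHODS; the METHODS field is exactly NMETHODS bytes long. The RFC predefines the meaning of some values:</p>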

<ul>
  <li>X’00’ NO AUTHENTICATION REQUIRED</li>
  <li>X’01’ GSSAPI</li>
  <li>X’02’ USERNAME/PASSWORD</li>
  <li>X’03’ to X’7F’ IANA ASSIGNED</li>
  <li>X’80’ to X’FE’ RESERVED FOR PRIVATE METHODS</li>
  <li>X’FF’ NO ACCEPTABLE METHODS</li>
</ul>

<p><img src="https://wiyi.org/assets/images/socks5/socks5_ne_01.jpg" alt="~replace~/assets/images/socks5/socks5_ne_01.jpg" /></p>

<p>The socks5 server must select one METHOD and return it to the client in the following format:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>+----+--------+
|VER | METHOD |
+----+--------+
| 1  |   1    |
+----+--------+
</code></pre></div></div>

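<p>When the client receives <code class="language-plaintext highlighter-rouge">0x00</code>, it skips the authentication stage and goes straight to the request stage; when it receives <code class="language-plaintext highlighter-rouge">0xFF</code>, it closes the connection. For any other value it enters the corresponding authentication stage.</p>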

<p><img src="https://wiyi.org/assets/images/socks5/socks5_ne_02.jpg" alt="~replace~/assets/images/socks5/socks5_ne_02.jpg" /></p>
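<p>The negotiation exchange above can be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not the code from the companion article: the class name <code class="language-plaintext highlighter-rouge">Socks5Greeting</code>, the method <code class="language-plaintext highlighter-rouge">negotiate</code>, and the <code class="language-plaintext highlighter-rouge">requireAuth</code> flag are hypothetical names, and the server is assumed to support only methods <code class="language-plaintext highlighter-rouge">0x00</code> and <code class="language-plaintext highlighter-rouge">0x02</code>.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical helper: server side of the socks5 method negotiation.
final class Socks5Greeting {
    static final int NO_AUTH = 0x00;
    static final int USER_PASS = 0x02;
    static final int NO_ACCEPTABLE = 0xFF;

    /** Reads VER, NMETHODS, METHODS and returns the 2-byte reply {VER, METHOD}. */
    static byte[] negotiate(DataInputStream in, boolean requireAuth) throws IOException {
        int ver = in.readUnsignedByte();
        if (ver != 0x05) {
            throw new IOException("unsupported SOCKS version: " + ver);
        }
        int nMethods = in.readUnsignedByte();
        byte[] methods = new byte[nMethods];   // METHODS is exactly NMETHODS bytes long
        in.readFully(methods);

        // Assumption: the server accepts exactly one method, chosen by requireAuth.
        int wanted = requireAuth ? USER_PASS : NO_AUTH;
        int chosen = NO_ACCEPTABLE;
        for (byte m : methods) {
            if ((m & 0xFF) == wanted) {
                chosen = wanted;
                break;
            }
        }
        return new byte[] {0x05, (byte) chosen};
    }

    public static void main(String[] args) throws IOException {
        // Client offers two methods: 0x00 (no auth) and 0x02 (username/password).
        byte[] greeting = {0x05, 0x02, 0x00, 0x02};
        byte[] reply = negotiate(new DataInputStream(new ByteArrayInputStream(greeting)), true);
        System.out.printf("VER=0x%02X METHOD=0x%02X%n", reply[0], reply[1]);
    }
}
```

<p>If none of the offered methods matches, the sketch answers with <code class="language-plaintext highlighter-rouge">0xFF</code>, which is the case where the client is expected to close the connection.</p>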

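<p><strong>2.5.1.2 Authentication stage (also called sub-negotiation)</strong></p>

<p>As a sub-step of negotiation, the authentication stage is <strong>optional</strong>. The socks5 server decides whether authentication is required; if it is not, the authentication stage is skipped entirely.</p>

<p>If authentication is required, the client sends an authentication request to the socks5 server. Taking method <code class="language-plaintext highlighter-rouge">0x02</code> as an example:</p>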

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>+----+------+----------+------+----------+
|VER | ULEN |  UNAME   | PLEN |  PASSWD  |
+----+------+----------+------+----------+
| 1  |  1   | 1 to 255 |  1   | 1 to 255 |
+----+------+----------+------+----------+
</code></pre></div></div>

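<p>VER: version of the sub-negotiation, <code class="language-plaintext highlighter-rouge">0x01</code> for username/password authentication</p>

<p>ULEN: length of the username</p>

<p>UNAME: the username bytes</p>

<p>PLEN: length of the password</p>

<p>PASSWD: the password bytes</p>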

<p><img src="https://wiyi.org/assets/images/socks5/socks5_ne_03_auth.jpg" alt="~replace~/assets/images/socks5/socks5_ne_03_auth.jpg" /></p>

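<p>On receiving the client's authentication request, the socks5 server parses it, verifies the credentials, and responds to the client with the result in the following format:</p>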

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>+----+--------+
|VER | STATUS |
+----+--------+
| 1  |   1    |
+----+--------+
</code></pre></div></div>

<p>A STATUS of <code class="language-plaintext highlighter-rouge">0x00</code> means authentication succeeded; any other value means it failed. When the client receives a failure response, it closes the connection.</p>

<p><img src="https://wiyi.org/assets/images/socks5/socks5_ne_04_auth.jpg" alt="~replace~/assets/images/socks5/socks5_ne_04_auth.jpg" /></p>
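<p>The username/password sub-negotiation can be sketched the same way. The class <code class="language-plaintext highlighter-rouge">Socks5UserPassAuth</code> and its methods are hypothetical names used only for illustration. Comparing against a single hard-coded credential pair and decoding the bytes as UTF-8 are both simplifying assumptions; a real server would look credentials up elsewhere.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: username/password sub-negotiation (method 0x02).
final class Socks5UserPassAuth {

    /** Client side: encodes VER, ULEN, UNAME, PLEN, PASSWD. */
    static byte[] encode(String user, String pass) {
        byte[] u = user.getBytes(StandardCharsets.UTF_8);
        byte[] p = pass.getBytes(StandardCharsets.UTF_8);
        byte[] out = new byte[3 + u.length + p.length];
        out[0] = 0x01;                          // sub-negotiation version
        out[1] = (byte) u.length;               // ULEN
        System.arraycopy(u, 0, out, 2, u.length);
        out[2 + u.length] = (byte) p.length;    // PLEN
        System.arraycopy(p, 0, out, 3 + u.length, p.length);
        return out;
    }

    /** Server side: parses the request and returns the 2-byte reply {VER, STATUS}. */
    static byte[] authenticate(DataInputStream in, String expectedUser, String expectedPass)
            throws IOException {
        int ver = in.readUnsignedByte();
        byte[] uname = new byte[in.readUnsignedByte()];
        in.readFully(uname);
        byte[] passwd = new byte[in.readUnsignedByte()];
        in.readFully(passwd);

        boolean ok = ver == 0x01
                && expectedUser.equals(new String(uname, StandardCharsets.UTF_8))
                && expectedPass.equals(new String(passwd, StandardCharsets.UTF_8));
        // STATUS 0x00 means success; any other value means failure.
        return new byte[] {0x01, (byte) (ok ? 0x00 : 0x01)};
    }

    public static void main(String[] args) throws IOException {
        byte[] request = encode("bigbyto", "123456");
        byte[] reply = authenticate(
                new DataInputStream(new ByteArrayInputStream(request)), "bigbyto", "123456");
        System.out.printf("VER=0x%02X STATUS=0x%02X%n", reply[0], reply[1]);
    }
}
```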

<h4 id="252-请求阶段">2.5.2 Request stage</h4>

<p>After negotiation succeeds, the client sends the request details to the socks5 server in the following format:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>+----+-----+-------+------+----------+----------+
|VER | CMD |  RSV  | ATYP | DST.ADDR | DST.PORT |
+----+-----+-------+------+----------+----------+
| 1  |  1  | X<span class="s1">'00'</span> |  1   | Variable |    2     |
+----+-----+-------+------+----------+----------+
</code></pre></div></div>

<ul>
  <li>VER: version number; <code class="language-plaintext highlighter-rouge">0x05</code> for socks5</li>
  <li>CMD
    <ul>
      <li><code class="language-plaintext highlighter-rouge">0x01</code>: CONNECT request</li>
      <li><code class="language-plaintext highlighter-rouge">0x02</code>: BIND request</li>
      <li><code class="language-plaintext highlighter-rouge">0x03</code>: UDP forwarding (UDP ASSOCIATE)</li>
    </ul>
  </li>
  <li>RSV: reserved field, value <code class="language-plaintext highlighter-rouge">0x00</code></li>
  <li>ATYP: destination address type; the data in DST.ADDR is interpreted according to this field.
    <ul>
      <li><code class="language-plaintext highlighter-rouge">0x01</code>: IPv4 address; DST.ADDR is 4 bytes</li>
      <li><code class="language-plaintext highlighter-rouge">0x03</code>: domain name; DST.ADDR is a variable-length domain name</li>
      <li><code class="language-plaintext highlighter-rouge">0x04</code>: IPv6 address; DST.ADDR is 16 bytes</li>
    </ul>
  </li>
  <li>DST.ADDR: the destination address, variable length</li>
  <li>DST.PORT: the destination port, always 2 bytes in network byte order</li>
</ul>

<p>Of the fields above, DST.ADDR is variable-length; its length is determined by ATYP. <del>We can extract this part by trimming the head and the tail.</del> There are three cases:</p>

<ul>
  <li>
    <p>X’01’</p>

    <p>A 4-byte IPv4 address</p>
  </li>
  <li>
    <p>X’03’</p>

    <p>A variable-length domain name. In this case the first byte of <code class="language-plaintext highlighter-rouge">DST.ADDR</code> is the length of the domain name and the remainder is the name itself.</p>
  </li>
  <li>
    <p>X’04’</p>

    <p>A 16-byte IPv6 address</p>
  </li>
</ul>

<p><img src="https://wiyi.org/assets/images/socks5/socks5_05_req_01.jpg" alt="~replace~/assets/images/socks5/socks5_05_req_01.jpg" /></p>
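<p>Because the length of <code class="language-plaintext highlighter-rouge">DST.ADDR</code> depends on <code class="language-plaintext highlighter-rouge">ATYP</code>, a parser has to branch on the address type before it can find the port. The sketch below shows one way to do that; <code class="language-plaintext highlighter-rouge">Socks5Request</code>, <code class="language-plaintext highlighter-rouge">parse</code>, and <code class="language-plaintext highlighter-rouge">readIp</code> are hypothetical names, and decoding the domain name as ASCII is an assumption.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.IOException;
import java.net.InetAddress;
import java.nio.charset.StandardCharsets;

// Hypothetical value class holding a parsed socks5 request.
final class Socks5Request {
    final int cmd;
    final String host;
    final int port;

    Socks5Request(int cmd, String host, int port) {
        this.cmd = cmd;
        this.host = host;
        this.port = port;
    }

    /** Parses VER, CMD, RSV, ATYP, DST.ADDR, DST.PORT from the stream. */
    static Socks5Request parse(DataInputStream in) throws IOException {
        int ver = in.readUnsignedByte();
        if (ver != 0x05) {
            throw new IOException("unsupported SOCKS version: " + ver);
        }
        int cmd = in.readUnsignedByte();
        in.readUnsignedByte();                  // RSV, expected to be 0x00
        int atyp = in.readUnsignedByte();

        String host;
        switch (atyp) {
            case 0x01:                          // IPv4: 4 bytes
                host = readIp(in, 4);
                break;
            case 0x03:                          // domain: 1 length byte, then the name
                byte[] name = new byte[in.readUnsignedByte()];
                in.readFully(name);
                host = new String(name, StandardCharsets.US_ASCII);
                break;
            case 0x04:                          // IPv6: 16 bytes
                host = readIp(in, 16);
                break;
            default:
                throw new IOException("unsupported ATYP: " + atyp);
        }
        int port = in.readUnsignedShort();      // 2 bytes, big-endian
        return new Socks5Request(cmd, host, port);
    }

    private static String readIp(DataInputStream in, int length) throws IOException {
        byte[] raw = new byte[length];
        in.readFully(raw);
        // getByAddress only wraps the raw bytes; it performs no DNS lookup.
        return InetAddress.getByAddress(raw).getHostAddress();
    }

    public static void main(String[] args) throws IOException {
        // CONNECT to 127.0.0.1:7582 (7582 = 0x1D9E)
        byte[] raw = {0x05, 0x01, 0x00, 0x01, 127, 0, 0, 1, 0x1D, (byte) 0x9E};
        Socks5Request req = parse(new DataInputStream(new ByteArrayInputStream(raw)));
        System.out.println(req.host + ":" + req.port);
    }
}
```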

<p>After receiving the client's request, the socks5 server must return a response with the following structure:</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>+----+-----+-------+------+----------+----------+
|VER | REP |  RSV  | ATYP | BND.ADDR | BND.PORT |
+----+-----+-------+------+----------+----------+
| 1  |  1  | X<span class="s1">'00'</span> |  1   | Variable |    2     |
+----+-----+-------+------+----------+----------+
</code></pre></div></div>

<ul>
  <li>VER: socks version, <code class="language-plaintext highlighter-rouge">0x05</code> here</li>
  <li>REP: reply field, with the following values:
    <ul>
      <li>X’00’ succeeded</li>
      <li>X’01’ general SOCKS server failure</li>
      <li>X’02’ connection not allowed by ruleset</li>
      <li>X’03’ Network unreachable</li>
      <li>X’04’ Host unreachable</li>
      <li>X’05’ Connection refused</li>
      <li>X’06’ TTL expired</li>
      <li>X’07’ Command not supported</li>
      <li>X’08’ Address type not supported</li>
      <li>X’09’ to X’FF’ unassigned</li>
    </ul>
  </li>
  <li>RSV: reserved field</li>
  <li>ATYP: same as ATYP in the request</li>
  <li>BND.ADDR: the address the service is bound to</li>
  <li>BND.PORT: the port the service is bound to</li>
</ul>

<p>In the response, <code class="language-plaintext highlighter-rouge">BND.ADDR</code> and <code class="language-plaintext highlighter-rouge">BND.PORT</code> deserve special attention. Some readers may be puzzled here: what are the returned address and port for?</p>

<p>Looking back at <strong>Figure 2.2</strong>, socks5 acts as both the socks server and the relay server. These two roles can in fact be separated. When the socks5 server and the relay server are not the same machine, the client needs to be told the address of the relay server, and that address is BND.ADDR and BND.PORT.</p>

<p>When the relay server and the socks5 server are the same machine, <code class="language-plaintext highlighter-rouge">BND.ADDR</code> and <code class="language-plaintext highlighter-rouge">BND.PORT</code> can simply be all zeros.</p>
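<p>For the common case where the relay server and the socks5 server are the same machine, the reply is a fixed 10-byte message. A minimal sketch follows; <code class="language-plaintext highlighter-rouge">Socks5Reply</code> and <code class="language-plaintext highlighter-rouge">build</code> are hypothetical names, and choosing the IPv4 address type for the zeroed address is an assumption.</p>

```java
import java.nio.ByteBuffer;

// Hypothetical helper: builds a socks5 reply with BND.ADDR and BND.PORT set to zero.
final class Socks5Reply {

    /** Returns VER, REP, RSV, ATYP, BND.ADDR (4 zero bytes), BND.PORT (2 zero bytes). */
    static byte[] build(int rep) {
        return ByteBuffer.allocate(10)
                .put((byte) 0x05)       // VER
                .put((byte) rep)        // REP, 0x00 means succeeded
                .put((byte) 0x00)       // RSV
                .put((byte) 0x01)       // ATYP: IPv4 (assumption)
                .putInt(0)              // BND.ADDR = 0.0.0.0
                .putShort((short) 0)    // BND.PORT = 0
                .array();
    }

    public static void main(String[] args) {
        StringBuilder hex = new StringBuilder();
        for (byte b : build(0x00)) {
            hex.append(String.format("%02X ", b));
        }
        System.out.println(hex.toString().trim());
    }
}
```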

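<h4 id="253-relay阶段">2.5.3 Relay stage</h4>

<p>After receiving the request, the socks5 server parses it. For a UDP request, the server forwards the data directly; for a TCP request, the server opens a TCP connection to the target server and from then on forwards all of the client's data to it.</p>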
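<p>For the TCP case, forwarding boils down to copying bytes from one stream to another until the source reaches end of stream. The sketch below shows a single direction using in-memory streams so that it runs without a network; <code class="language-plaintext highlighter-rouge">RelayCopy</code> and <code class="language-plaintext highlighter-rouge">pump</code> are hypothetical names. A real proxy would presumably run one such loop per direction on the socket streams.</p>

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: one direction of a TCP relay.
final class RelayCopy {

    /** Copies everything from 'from' to 'to' and returns the number of bytes forwarded. */
    static long pump(InputStream from, OutputStream to) throws IOException {
        byte[] buffer = new byte[1024];
        long total = 0;
        int len;
        // read() returns -1 once the peer has closed its side of the connection
        while ((len = from.read(buffer)) != -1) {
            to.write(buffer, 0, len);
            total += len;
        }
        to.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] payload = "GET / HTTP/1.1\r\nHost: wiyi.org\r\n\r\n".getBytes(StandardCharsets.US_ASCII);
        ByteArrayOutputStream target = new ByteArrayOutputStream();
        long forwarded = pump(new ByteArrayInputStream(payload), target);
        System.out.println("forwarded " + forwarded + " bytes");
    }
}
```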

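<h2 id="3总结--下载">3. Summary &amp; download</h2>

<p>This article gave a brief introduction to what the socks5 protocol is for and how it processes a connection. The next article implements a socks5 proxy server in Java step by step, to get a closer look at how the protocol is processed.</p>

<p>Readers can click <a href="/assets/files/socks5.pcapng">socks5.pcapng</a> to download the packet capture and open it with <a href="https://www.wireshark.org/">wireshark</a> to inspect the examples from this article.</p>

<h2 id="4-相关阅读">4. Related reading</h2>

<ul>
  <li><a href="https://wiyi.org/socks5-implementation.html">A hands-on socks5 proxy implementation in Java</a></li>
</ul>

<h2 id="5参考资料">5. References</h2>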

<p><a href="https://en.wikipedia.org/wiki/SOCKS">https://en.wikipedia.org/wiki/SOCKS</a> <br />
<a href="https://datatracker.ietf.org/doc/html/rfc1928">https://datatracker.ietf.org/doc/html/rfc1928</a> <br />
<a href="https://datatracker.ietf.org/doc/html/rfc1929">https://datatracker.ietf.org/doc/html/rfc1929</a><br />
<a href="https://www.rapidseedbox.com/blog/guide-to-socks5-proxy">https://www.rapidseedbox.com/blog/guide-to-socks5-proxy</a></p>]]></content><author><name>kikcat</name></author><category term="socks5" /><summary type="html"><![CDATA[1. Preface This article walks through the socks5 protocol step by step. It starts with a brief introduction to the socks protocol, then covers the scenarios where socks5 is used, then how it works, and finally the protocol details (handshake and data forwarding). 2. Protocol overview 2.1 What is the socks protocol What is the socks protocol? Here is Wikipedia's definition: SOCKS is an Internet protocol that exchanges network packets between a client and server through a proxy server In other words, socks is an Internet protocol that exchanges network data between a client and a server through a proxy server. Put simply, it is a proxy protocol that acts as a middleman, forwarding data between the client and the target host. The socks protocol sits at layer 5 of the OSI model, the session layer.]]></summary></entry><entry><title type="html">Understanding the Bridge Method in Java</title><link href="https://wiyi.org/bridge-method-in-java.html" rel="alternate" type="text/html" title="Understanding the Bridge Method in Java" /><published>2021-07-27T00:00:00+00:00</published><updated>2021-07-27T00:00:00+00:00</updated><id>https://wiyi.org/bridge-method-in-java</id><content type="html" xml:base="https://wiyi.org/bridge-method-in-java.html"><![CDATA[<p>A bridge method is a synthetic method that the Java compiler generates automatically. It does not appear in the source code and cannot be called explicitly. Let us start with an example to get an intuitive feel for bridge methods.</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Code 1-1</span>
<span class="kd">class</span> <span class="nc">Animal</span> <span class="o">{</span>
  <span class="kd">public</span> <span class="nc">Animal</span> <span class="nf">getAnimal</span><span class="o">()</span> <span class="o">{</span>
    <span class="k">return</span> <span class="k">new</span> <span class="nf">Animal</span><span class="o">();</span>
  <span class="o">}</span>
<span class="o">}</span>

<span class="kd">class</span> <span class="nc">Dog</span> <span class="kd">extends</span> <span class="nc">Animal</span> <span class="o">{</span>
  <span class="kd">public</span> <span class="nc">Dog</span> <span class="nf">getAnimal</span><span class="o">()</span> <span class="o">{</span>
    <span class="k">return</span> <span class="k">new</span> <span class="nf">Dog</span><span class="o">();</span>
  <span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>

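<p>The code above defines two classes, Animal and Dog. Dog is a subclass of Animal, and Dog overrides Animal's <code class="language-plaintext highlighter-rouge">getAnimal</code> method. If you compile Dog now, it will almost certainly compile without problems, but compiling it with JDK 1.4 or earlier is a different story.</p>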

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code>javac org/wiyi/bridge/Dog.java
org/wiyi/bridge/Dog.java:6: getAnimal<span class="o">()</span> <span class="k">in </span>org.wiyi.bridge.Dog cannot override g
etAnimal<span class="o">()</span> <span class="k">in </span>org.wiyi.bridge.Animal<span class="p">;</span> attempting to use incompatible <span class="k">return </span><span class="nb">type

</span>found   : org.wiyi.bridge.Dog
required: org.wiyi.bridge.Animal
  public Dog getAnimal<span class="o">()</span> <span class="o">{</span>
             ^
1 error
</code></pre></div></div>
<!--more-->

<p>Older JDKs fail to compile this, with the error “cannot override getAnimal() in org.wiyi.bridge.Animal;”. Why can't the getAnimal method be overridden? To understand this, we need to know how the JVM defines method overriding.</p>

<blockquote>
  <p>An instance method <code class="language-plaintext highlighter-rouge">m1</code> declared in class C overrides another instance method <code class="language-plaintext highlighter-rouge">m2</code> declared in class A iff either <code class="language-plaintext highlighter-rouge">m1</code> is the same as <code class="language-plaintext highlighter-rouge">m2</code>, or all of the following are true:</p>

  <ul>
    <li>C is a subclass of A.</li>
    <li><code class="language-plaintext highlighter-rouge">m1</code> has the same name and descriptor as <code class="language-plaintext highlighter-rouge">m2</code>.</li>
    <li><code class="language-plaintext highlighter-rouge">m1</code> is not marked <code class="language-plaintext highlighter-rouge">ACC_PRIVATE</code>.</li>
    <li>One of the following is true:
      <ul>
        <li><code class="language-plaintext highlighter-rouge">m2</code> is marked <code class="language-plaintext highlighter-rouge">ACC_PUBLIC</code>; or is marked <code class="language-plaintext highlighter-rouge">ACC_PROTECTED</code>; or is marked neither <code class="language-plaintext highlighter-rouge">ACC_PUBLIC</code> nor <code class="language-plaintext highlighter-rouge">ACC_PROTECTED</code> nor <code class="language-plaintext highlighter-rouge">ACC_PRIVATE</code> and A belongs to the same run-time package as C.</li>
        <li><code class="language-plaintext highlighter-rouge">m1</code> overrides a method <code class="language-plaintext highlighter-rouge">m'</code> (<code class="language-plaintext highlighter-rouge">m'</code> distinct from <code class="language-plaintext highlighter-rouge">m1</code> and <code class="language-plaintext highlighter-rouge">m2</code>) such that <code class="language-plaintext highlighter-rouge">m'</code> overrides <code class="language-plaintext highlighter-rouge">m2</code>.</li>
      </ul>
    </li>
  </ul>
</blockquote>

<p>Using the definition above, we can check one by one which condition the Dog class fails to satisfy. Compile Animal and Dog with a newer JDK and compare their bytecode.</p>

<div class="language-shell highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c"># openjdk version "11.0.6"</span>
javap <span class="nt">-v</span> org.wiyi.bridge.Dog
javap <span class="nt">-v</span> org.wiyi.bridge.Animal
</code></pre></div></div>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">//Dog.class</span>
<span class="n">org</span><span class="o">.</span><span class="na">wiyi</span><span class="o">.</span><span class="na">java9</span><span class="o">.</span><span class="na">generic</span><span class="o">.</span><span class="na">Dog</span> <span class="nf">getAnimal</span><span class="o">();</span> <span class="c1">// 1. the method name is getAnimal in both classes</span>
    <span class="nl">descriptor:</span> <span class="o">()</span><span class="nc">Lorg</span><span class="o">/</span><span class="n">wiyi</span><span class="o">/</span><span class="n">java9</span><span class="o">/</span><span class="n">generic</span><span class="o">/</span><span class="nc">Dog</span><span class="o">;</span> <span class="c1">// 2. Dog's descriptor is ()Lorg/wiyi/java9/generic/Dog;</span>
    <span class="nl">flags:</span> <span class="o">(</span><span class="mh">0x0000</span><span class="o">)</span>
    <span class="nl">Code:</span>
      <span class="n">stack</span><span class="o">=</span><span class="mi">2</span><span class="o">,</span> <span class="n">locals</span><span class="o">=</span><span class="mi">1</span><span class="o">,</span> <span class="n">args_size</span><span class="o">=</span><span class="mi">1</span>
         <span class="mi">0</span><span class="o">:</span> <span class="k">new</span>           <span class="err">#</span><span class="mi">2</span>                  <span class="c1">// class org/wiyi/java9/generic/Dog</span>
         <span class="mi">3</span><span class="o">:</span> <span class="n">dup</span>
         <span class="mi">4</span><span class="o">:</span> <span class="n">invokespecial</span> <span class="err">#</span><span class="mi">3</span>                  <span class="c1">// Method "&lt;init&gt;":()V</span>
         <span class="mi">7</span><span class="o">:</span> <span class="n">areturn</span>
        <span class="nc">Start</span>  <span class="nc">Length</span>  <span class="nc">Slot</span>  <span class="nc">Name</span>   <span class="nc">Signature</span>
            <span class="mi">0</span>       <span class="mi">8</span>     <span class="mi">0</span>  <span class="k">this</span>   <span class="nc">Lorg</span><span class="o">/</span><span class="n">wiyi</span><span class="o">/</span><span class="n">java9</span><span class="o">/</span><span class="n">generic</span><span class="o">/</span><span class="nc">Dog</span><span class="o">;</span>
</code></pre></div></div>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">//Animal.class</span>
<span class="n">org</span><span class="o">.</span><span class="na">wiyi</span><span class="o">.</span><span class="na">java9</span><span class="o">.</span><span class="na">generic</span><span class="o">.</span><span class="na">Animal</span> <span class="nf">getAnimal</span><span class="o">();</span> <span class="c1">// 1. the method name is getAnimal</span>
    <span class="nl">descriptor:</span> <span class="o">()</span><span class="nc">Lorg</span><span class="o">/</span><span class="n">wiyi</span><span class="o">/</span><span class="n">java9</span><span class="o">/</span><span class="n">generic</span><span class="o">/</span><span class="nc">Animal</span><span class="o">;</span> <span class="c1">// 2. Animal's descriptor is ()Lorg/wiyi/java9/generic/Animal;</span>
    <span class="nl">flags:</span> <span class="o">(</span><span class="mh">0x0000</span><span class="o">)</span>
    <span class="nl">Code:</span>
      <span class="n">stack</span><span class="o">=</span><span class="mi">2</span><span class="o">,</span> <span class="n">locals</span><span class="o">=</span><span class="mi">1</span><span class="o">,</span> <span class="n">args_size</span><span class="o">=</span><span class="mi">1</span>
         <span class="mi">0</span><span class="o">:</span> <span class="k">new</span>           <span class="err">#</span><span class="mi">2</span>                  <span class="c1">// class org/wiyi/java9/generic/Animal</span>
         <span class="mi">3</span><span class="o">:</span> <span class="n">dup</span>
         <span class="mi">4</span><span class="o">:</span> <span class="n">invokespecial</span> <span class="err">#</span><span class="mi">3</span>                  <span class="c1">// Method "&lt;init&gt;":()V</span>
         <span class="mi">7</span><span class="o">:</span> <span class="n">areturn</span>
      <span class="nl">LocalVariableTable:</span>
        <span class="nc">Start</span>  <span class="nc">Length</span>  <span class="nc">Slot</span>  <span class="nc">Name</span>   <span class="nc">Signature</span>
            <span class="mi">0</span>       <span class="mi">8</span>     <span class="mi">0</span>  <span class="k">this</span>   <span class="nc">Lorg</span><span class="o">/</span><span class="n">wiyi</span><span class="o">/</span><span class="n">java9</span><span class="o">/</span><span class="n">generic</span><span class="o">/</span><span class="nc">Animal</span><span class="o">;</span>
</code></pre></div></div>

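<p>The bytecode shows that Dog's <code class="language-plaintext highlighter-rouge">getAnimal()</code> method and Animal's <code class="language-plaintext highlighter-rouge">getAnimal</code> method have different descriptors. At the JVM level they are therefore <strong>two completely unrelated methods</strong> and do not form an override. Because the method names and parameter lists are identical, they do not form an overload either, so the compiler reports an error.</p>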
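<p>The role of the return type in a descriptor can also be seen without <code class="language-plaintext highlighter-rouge">javap</code>. The sketch below uses the JDK's <code class="language-plaintext highlighter-rouge">java.lang.invoke.MethodType</code> to print the descriptor of a no-argument method for each return type. <code class="language-plaintext highlighter-rouge">DescriptorDemo</code> and <code class="language-plaintext highlighter-rouge">noArgDescriptor</code> are hypothetical names, and the nested Animal and Dog are stand-ins for the classes above, so the class names in the printed descriptors differ from the javap output.</p>

```java
import java.lang.invoke.MethodType;

// Hypothetical demo: descriptors include the return type.
final class DescriptorDemo {
    static class Animal {
        public Animal getAnimal() {
            return new Animal();
        }
    }

    static class Dog extends Animal {
        @Override
        public Dog getAnimal() {
            return new Dog();
        }
    }

    /** Descriptor of a method that takes no arguments and returns the given type. */
    static String noArgDescriptor(Class<?> returnType) {
        return MethodType.methodType(returnType).toMethodDescriptorString();
    }

    public static void main(String[] args) {
        String animal = noArgDescriptor(Animal.class);
        String dog = noArgDescriptor(Dog.class);
        System.out.println(animal);
        System.out.println(dog);
        System.out.println("same descriptor: " + animal.equals(dog));
    }
}
```

<p>The two strings differ only in the return type, which is enough for the JVM to treat the methods as unrelated.</p>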

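<p>Why, then, do newer Java compilers accept it? Because JDK 1.5 and later support covariant return types: when compiling the source, the compiler generates a bridge method to implement the override. If we look at the complete bytecode of the Dog class, we find two getAnimal methods.</p>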

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nc">Classfile</span> <span class="nc">Dog</span><span class="o">.</span><span class="na">class</span>
  <span class="nc">Compiled</span> <span class="n">from</span> <span class="s">"Dog.java"</span>
<span class="kd">public</span> <span class="kd">class</span> <span class="nc">org</span><span class="o">.</span><span class="na">wiyi</span><span class="o">.</span><span class="na">java9</span><span class="o">.</span><span class="na">generic</span><span class="o">.</span><span class="na">Dog</span> <span class="kd">extends</span> <span class="n">org</span><span class="o">.</span><span class="na">wiyi</span><span class="o">.</span><span class="na">java9</span><span class="o">.</span><span class="na">generic</span><span class="o">.</span><span class="na">Animal</span>
  <span class="n">minor</span> <span class="nl">version:</span> <span class="mi">0</span>
  <span class="n">major</span> <span class="nl">version:</span> <span class="mi">55</span>
<span class="o">{</span>
  <span class="kd">public</span> <span class="n">org</span><span class="o">.</span><span class="na">wiyi</span><span class="o">.</span><span class="na">java9</span><span class="o">.</span><span class="na">generic</span><span class="o">.</span><span class="na">Dog</span><span class="o">();</span>
    <span class="nl">descriptor:</span> <span class="o">()</span><span class="no">V</span>
    <span class="nl">flags:</span> <span class="o">(</span><span class="mh">0x0001</span><span class="o">)</span> <span class="no">ACC_PUBLIC</span>
    <span class="nl">Code:</span>
      <span class="n">stack</span><span class="o">=</span><span class="mi">1</span><span class="o">,</span> <span class="n">locals</span><span class="o">=</span><span class="mi">1</span><span class="o">,</span> <span class="n">args_size</span><span class="o">=</span><span class="mi">1</span>
         <span class="mi">0</span><span class="o">:</span> <span class="n">aload_0</span>
         <span class="mi">1</span><span class="o">:</span> <span class="n">invokespecial</span> <span class="err">#</span><span class="mi">1</span>                  <span class="c1">// Method org/wiyi/java9/generic/Animal."&lt;init&gt;":()V</span>
         <span class="mi">4</span><span class="o">:</span> <span class="k">return</span>
      <span class="nl">LineNumberTable:</span>
        <span class="n">line</span> <span class="mi">3</span><span class="o">:</span> <span class="mi">0</span>
      <span class="nl">LocalVariableTable:</span>
        <span class="nc">Start</span>  <span class="nc">Length</span>  <span class="nc">Slot</span>  <span class="nc">Name</span>   <span class="nc">Signature</span>
            <span class="mi">0</span>       <span class="mi">5</span>     <span class="mi">0</span>  <span class="k">this</span>   <span class="nc">Lorg</span><span class="o">/</span><span class="n">wiyi</span><span class="o">/</span><span class="n">java9</span><span class="o">/</span><span class="n">generic</span><span class="o">/</span><span class="nc">Dog</span><span class="o">;</span>

  <span class="n">org</span><span class="o">.</span><span class="na">wiyi</span><span class="o">.</span><span class="na">java9</span><span class="o">.</span><span class="na">generic</span><span class="o">.</span><span class="na">Dog</span> <span class="nf">getAnimal</span><span class="o">();</span> <span class="c1">//bigbyto's note: the first getAnimal</span>
    <span class="nl">descriptor:</span> <span class="o">()</span><span class="nc">Lorg</span><span class="o">/</span><span class="n">wiyi</span><span class="o">/</span><span class="n">java9</span><span class="o">/</span><span class="n">generic</span><span class="o">/</span><span class="nc">Dog</span><span class="o">;</span>
    <span class="nl">flags:</span> <span class="o">(</span><span class="mh">0x0000</span><span class="o">)</span>
    <span class="nl">Code:</span>
      <span class="n">stack</span><span class="o">=</span><span class="mi">2</span><span class="o">,</span> <span class="n">locals</span><span class="o">=</span><span class="mi">1</span><span class="o">,</span> <span class="n">args_size</span><span class="o">=</span><span class="mi">1</span>
         <span class="mi">0</span><span class="o">:</span> <span class="k">new</span>           <span class="err">#</span><span class="mi">2</span>                  <span class="c1">// class org/wiyi/java9/generic/Dog</span>
         <span class="mi">3</span><span class="o">:</span> <span class="n">dup</span>
         <span class="mi">4</span><span class="o">:</span> <span class="n">invokespecial</span> <span class="err">#</span><span class="mi">3</span>                  <span class="c1">// Method "&lt;init&gt;":()V</span>
         <span class="mi">7</span><span class="o">:</span> <span class="n">areturn</span>
      <span class="nl">LineNumberTable:</span>
        <span class="n">line</span> <span class="mi">5</span><span class="o">:</span> <span class="mi">0</span>
      <span class="nl">LocalVariableTable:</span>
        <span class="nc">Start</span>  <span class="nc">Length</span>  <span class="nc">Slot</span>  <span class="nc">Name</span>   <span class="nc">Signature</span>
            <span class="mi">0</span>       <span class="mi">8</span>     <span class="mi">0</span>  <span class="k">this</span>   <span class="nc">Lorg</span><span class="o">/</span><span class="n">wiyi</span><span class="o">/</span><span class="n">java9</span><span class="o">/</span><span class="n">generic</span><span class="o">/</span><span class="nc">Dog</span><span class="o">;</span>

  <span class="n">org</span><span class="o">.</span><span class="na">wiyi</span><span class="o">.</span><span class="na">java9</span><span class="o">.</span><span class="na">generic</span><span class="o">.</span><span class="na">Animal</span> <span class="nf">getAnimal</span><span class="o">();</span> <span class="c1">//bigbyto's note: the second getAnimal</span>
    <span class="nl">descriptor:</span> <span class="o">()</span><span class="nc">Lorg</span><span class="o">/</span><span class="n">wiyi</span><span class="o">/</span><span class="n">java9</span><span class="o">/</span><span class="n">generic</span><span class="o">/</span><span class="nc">Animal</span><span class="o">;</span>
    <span class="nl">flags:</span> <span class="o">(</span><span class="mh">0x1040</span><span class="o">)</span> <span class="no">ACC_BRIDGE</span><span class="o">,</span> <span class="no">ACC_SYNTHETIC</span>
    <span class="nl">Code:</span>
      <span class="n">stack</span><span class="o">=</span><span class="mi">1</span><span class="o">,</span> <span class="n">locals</span><span class="o">=</span><span class="mi">1</span><span class="o">,</span> <span class="n">args_size</span><span class="o">=</span><span class="mi">1</span>
         <span class="mi">0</span><span class="o">:</span> <span class="n">aload_0</span>
         <span class="mi">1</span><span class="o">:</span> <span class="n">invokevirtual</span> <span class="err">#</span><span class="mi">4</span>                  <span class="c1">// Method getAnimal:()Lorg/wiyi/java9/generic/Dog;</span>
         <span class="mi">4</span><span class="o">:</span> <span class="n">areturn</span>
      <span class="nl">LineNumberTable:</span>
        <span class="n">line</span> <span class="mi">3</span><span class="o">:</span> <span class="mi">0</span>
      <span class="nl">LocalVariableTable:</span>
        <span class="nc">Start</span>  <span class="nc">Length</span>  <span class="nc">Slot</span>  <span class="nc">Name</span>   <span class="nc">Signature</span>
            <span class="mi">0</span>       <span class="mi">5</span>     <span class="mi">0</span>  <span class="k">this</span>   <span class="nc">Lorg</span><span class="o">/</span><span class="n">wiyi</span><span class="o">/</span><span class="n">java9</span><span class="o">/</span><span class="n">generic</span><span class="o">/</span><span class="nc">Dog</span><span class="o">;</span>
<span class="o">}</span>
<span class="nl">SourceFile:</span> <span class="s">"Dog.java"</span>
</code></pre></div></div>

<p>Look at the second <code class="language-plaintext highlighter-rouge">getAnimal</code> method: its descriptor is still <code class="language-plaintext highlighter-rouge">()Lorg/wiyi/java9/generic/Animal;</code>, the same as the method declared in Animal, and its flags now include <strong>ACC_BRIDGE</strong> and <strong>ACC_SYNTHETIC</strong>. Together these two flags mark it as a bridge method generated by the compiler.</p>
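
<p>As a small sanity check on the flags value, the sketch below decodes <code class="language-plaintext highlighter-rouge">0x1040</code> with the constants from the JVM specification's method access-flags table. The class name <code class="language-plaintext highlighter-rouge">MethodFlags</code> and the helper <code class="language-plaintext highlighter-rouge">has</code> are made up for this illustration and are not part of any library.</p>

```java
class MethodFlags {
    // Values taken from the method access_flags table in the JVM specification.
    static final int ACC_PUBLIC = 0x0001;
    static final int ACC_BRIDGE = 0x0040;
    static final int ACC_SYNTHETIC = 0x1000;

    // True when every bit of `flag` is already set in `accessFlags`:
    // OR-ing in a flag that is present leaves the value unchanged.
    static boolean has(int accessFlags, int flag) {
        return (accessFlags | flag) == accessFlags;
    }

    public static void main(String[] args) {
        int bridgeFlags = 0x1040; // flags of the second getAnimal in the javap output
        System.out.println(has(bridgeFlags, ACC_BRIDGE));    // true
        System.out.println(has(bridgeFlags, ACC_SYNTHETIC)); // true
        System.out.println(has(bridgeFlags, ACC_PUBLIC));    // false
    }
}
```

<p>The last line prints false because neither getAnimal method in the listing is declared public.</p>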

<p>The second <code class="language-plaintext highlighter-rouge">getAnimal</code> uses the invokevirtual instruction, which means the body of this bridge method does little more than call the first <code class="language-plaintext highlighter-rouge">getAnimal</code> and return its result. In pseudocode:</p>

<div class="language-java highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kd">class</span> <span class="nc">Dog</span> <span class="kd">extends</span> <span class="nc">Animal</span> <span class="o">{</span>
  <span class="nc">Dog</span> <span class="nf">getAnimal</span><span class="o">()</span> <span class="o">{</span>
    <span class="k">return</span> <span class="k">new</span> <span class="nf">Dog</span><span class="o">();</span>
  <span class="o">}</span>
  
  <span class="n">synthetic</span> <span class="n">bridge</span> <span class="nc">Animal</span> <span class="nf">getAnimal</span><span class="o">()</span> <span class="o">{</span>
    <span class="k">return</span> <span class="o">((</span><span class="nc">Dog</span><span class="o">)</span><span class="k">this</span><span class="o">).</span><span class="na">getAnimal</span><span class="o">();</span>
  <span class="o">}</span>
<span class="o">}</span>
</code></pre></div></div>
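
<p>The bridge method can also be observed without reading bytecode, through reflection. The following is a minimal sketch, assuming both classes are compiled with a Java 5 or later compiler. <code class="language-plaintext highlighter-rouge">BridgeInspector</code> and <code class="language-plaintext highlighter-rouge">countBridgeMethods</code> are hypothetical names introduced here, while <code class="language-plaintext highlighter-rouge">Method.isBridge()</code> and <code class="language-plaintext highlighter-rouge">Method.isSynthetic()</code> are standard <code class="language-plaintext highlighter-rouge">java.lang.reflect</code> methods.</p>

```java
import java.lang.reflect.Method;

class Animal {
    Animal getAnimal() {
        return new Animal();
    }
}

class Dog extends Animal {
    @Override
    Dog getAnimal() { // covariant return type: Dog instead of Animal
        return new Dog();
    }
}

class BridgeInspector {
    // Counts the methods declared directly in Dog that the compiler marked as bridge methods.
    static int countBridgeMethods() {
        int bridges = 0;
        for (Method m : Dog.class.getDeclaredMethods()) {
            if (m.isBridge()) {
                bridges++;
            }
        }
        return bridges;
    }

    public static void main(String[] args) {
        // getDeclaredMethods() also returns compiler-generated methods,
        // so both getAnimal variants show up here.
        for (Method m : Dog.class.getDeclaredMethods()) {
            System.out.println(m.getReturnType().getSimpleName() + " " + m.getName()
                    + "() bridge=" + m.isBridge() + " synthetic=" + m.isSynthetic());
        }
        System.out.println("bridge methods: " + countBridgeMethods());
    }
}
```

<p>The order in which <code class="language-plaintext highlighter-rouge">getDeclaredMethods()</code> returns methods is unspecified, so the two getAnimal lines may appear in either order. Only the variant returning Animal should report itself as a bridge.</p>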

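<p>These results show that the Java language and the JVM define their type systems somewhat differently. In this article's example, Dog's getAnimal method changes the original return type from Animal to Dog. In the Java language this is a covariant return type, which is one part of polymorphism. The JVM, however, identifies a method by its name together with its full descriptor, so it sees two different methods, and the bridge method is what connects the two type systems.</p>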
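
<p>To make that connection concrete, here is a minimal sketch of a caller that only knows about Animal. The class <code class="language-plaintext highlighter-rouge">CovariantDemo</code> and the method <code class="language-plaintext highlighter-rouge">runtimeTypeVia</code> are hypothetical names used only for this illustration.</p>

```java
class Animal {
    Animal getAnimal() {
        return new Animal();
    }
}

class Dog extends Animal {
    @Override
    Dog getAnimal() {
        return new Dog();
    }
}

class CovariantDemo {
    // This call site is compiled against Animal's getAnimal, whose descriptor returns Animal.
    // When `animal` is a Dog, virtual dispatch selects Dog's bridge method,
    // which forwards to the getAnimal that returns Dog.
    static String runtimeTypeVia(Animal animal) {
        return animal.getAnimal().getClass().getSimpleName();
    }

    public static void main(String[] args) {
        System.out.println(runtimeTypeVia(new Animal())); // Animal
        System.out.println(runtimeTypeVia(new Dog()));    // Dog
    }
}
```

<p>The caller never mentions Dog, yet it receives a Dog instance, which is the behavior the override is supposed to provide.</p>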

<p>Once we understand what a bridge method really is, reading Oracle's <a href="https://docs.oracle.com/javase/tutorial/java/generics/bridgeMethods.html">Effects of Type Erasure and Bridge Methods</a> makes it clear why bridge methods are so helpful for generics.</p>

<p><strong>References:</strong> <br />
<a href="https://www.youtube.com/watch?v=kOBHtmqavXc&amp;list=WL&amp;index=8&amp;t=705s">JVM Bridge Methods with Dan Heidinga</a> <br />
<a href="https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-5.html#jvms-5.4.5">https://docs.oracle.com/javase/specs/jvms/se8/html/jvms-5.html#jvms-5.4.5</a></p>]]></content><author><name>kikcat</name></author><category term="多态" /><summary type="html"><![CDATA[A bridge method is a kind of synthetic method: the Java compiler generates it automatically, it does not appear in the source code, and it cannot be called explicitly. Let's first get an intuitive feel for bridge methods through an example. // Code 1-1 class Animal { public Animal getAnimal() { return new Animal(); } } class Dog extends Animal { public Dog getAnimal() { return new Dog(); } } The code above defines two classes, Animal and Dog. Dog is a subclass of Animal and overrides Animal's getAnimal method. If you try to compile Dog now, it will most likely compile without any trouble, but compiling it with JDK 1.4 or earlier is another story. javac org/wiyi/bridge/Dog.java org/wiyi/bridge/Dog.java:6: getAnimal() in org.wiyi.bridge.Dog cannot override getAnimal() in org.wiyi.bridge.Animal; attempting to use incompatible return type found : org.wiyi.bridge.Dog required: org.wiyi.bridge.Animal public Dog getAnimal() { ^ 1 error]]></summary></entry></feed>