Rails ancestry 和 closure_tree 的批量读对比

mizuhashi · 发布于 2017年04月17日 · 最后由 playmonkey 回复于 2017年05月17日 · 2424 次阅读
23529
本帖已被设为精华帖!

本次测试中使用了细粒度缓存插件 second_level_cache,测试数据是一个7个节点的二叉树。


批量读parent:

ancestry: 
pry(main)> Catalog.all.map(&:parent)
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs"
Cache read: slc/catalogs/1/0/4b739449bcf124e7ace371e00a28ea69
Cache read: slc/catalogs/1/0/4b739449bcf124e7ace371e00a28ea69
Cache read: slc/catalogs/2/0/4b739449bcf124e7ace371e00a28ea69
Cache read: slc/catalogs/2/0/4b739449bcf124e7ace371e00a28ea69
Cache read: slc/catalogs/3/0/4b739449bcf124e7ace371e00a28ea69
Cache read: slc/catalogs/3/0/4b739449bcf124e7ace371e00a28ea69
closure_tree:
pry(main)> Catalog.includes(:parent).all.map(&:parent)
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs"
Cache read_multi: ["slc/catalogs/1/0/e1a68710d2b02b4e6409f3222c9e280e", "slc/catalogs/2/0/e1a68710d2b02b4e6409f3222c9e280e", "slc/catalogs/3/0/e1a68710d2b02b4e6409f3222c9e280e"]
missed ids -> [] | hitted ids -> [1, 2, 3]

ancestry和closure_tree都可以用到slc,但closure_tree使用includes可从multi_get中受益,若不includes,则和ancestry完全一致。


查询children:

ancestry:
pry(main)> Catalog.all.map(&:children)
  Catalog Load (0.2ms)  SELECT "catalogs".* FROM "catalogs"
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."ancestry" = '1'
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."ancestry" = '1/2'
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."ancestry" = '1/3'
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."ancestry" = '1/2/4'
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."ancestry" = '1/2/5'
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."ancestry" = '1/3/6'
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."ancestry" = '1/3/7'
closure_tree:
pry(main)> Catalog.includes(:children).all.map(&:children)
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs"
  Catalog Load (0.2ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."parent_id" IN (1, 2, 3, 4, 5, 6, 7)

双方都不能命中缓存,但closure_tree可受益于includes。


查询ancestors:

ancestry:
pry(main)> Catalog.all.map(&:ancestors)
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs"
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs" WHERE 1=0 ORDER BY coalesce("catalogs"."ancestry", '')
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."id" IN (1) ORDER BY coalesce("catalogs"."ancestry", '')
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."id" IN (1) ORDER BY coalesce("catalogs"."ancestry", '')
  Catalog Load (0.3ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."id" IN (1, 2) ORDER BY coalesce("catalogs"."ancestry", '')
  Catalog Load (0.4ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."id" IN (1, 2) ORDER BY coalesce("catalogs"."ancestry", '')
  Catalog Load (0.3ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."id" IN (1, 3) ORDER BY coalesce("catalogs"."ancestry", '')
  Catalog Load (0.2ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."id" IN (1, 3) ORDER BY coalesce("catalogs"."ancestry", '')
closure_tree:
pry(main)> Catalog.includes(:self_and_ancestors).all.map{|x| x.self_and_ancestors[0..-2]}
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs"
  CatalogHierarchy Load (0.2ms)  SELECT "catalog_hierarchies".* FROM "catalog_hierarchies" WHERE "catalog_hierarchies"."descendant_id" IN (1, 2, 3, 4, 5, 6, 7) ORDER BY "catalog_hierarchies".generations asc
Cache read_multi: ["slc/catalogs/1/0/e1a68710d2b02b4e6409f3222c9e280e", "slc/catalogs/2/0/e1a68710d2b02b4e6409f3222c9e280e", "slc/catalogs/3/0/e1a68710d2b02b4e6409f3222c9e280e", "slc/catalogs/4/0/e1a68710d2b02b4e6409f3222c9e280e", "slc/catalogs/5/0/e1a68710d2b02b4e6409f3222c9e280e", "slc/catalogs/6/0/e1a68710d2b02b4e6409f3222c9e280e", "slc/catalogs/7/0/e1a68710d2b02b4e6409f3222c9e280e"]
missed ids -> [] | hitted ids -> [1, 2, 3, 4, 5, 6, 7]

closure_tree 没直接将ancestors作为关联,所以includes的时候要用self_and_ancestors,并自己去掉self。因为关联排过序了,所以可以直接用range取值。查询descendants的情况与ancestors的类似。


查询siblings:

ancestry:
pry(main)> Catalog.all.map(&:siblings)
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs"
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."ancestry" IS NULL
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."ancestry" = '1'
  Catalog Load (0.0ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."ancestry" = '1'
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."ancestry" = '1/2'
  Catalog Load (0.0ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."ancestry" = '1/2'
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."ancestry" = '1/3'
  Catalog Load (0.0ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."ancestry" = '1/3'
closure_tree:
pry(main)> Catalog.includes(parent: :children).all.map{|x| x.parent&.children&.reject{|y| y == x}}
  Catalog Load (0.1ms)  SELECT "catalogs".* FROM "catalogs"
Cache read_multi: ["slc/catalogs/1/0/e1a68710d2b02b4e6409f3222c9e280e", "slc/catalogs/2/0/e1a68710d2b02b4e6409f3222c9e280e", "slc/catalogs/3/0/e1a68710d2b02b4e6409f3222c9e280e"]
missed ids -> [] | hitted ids -> [1, 2, 3]
  Catalog Load (0.2ms)  SELECT "catalogs".* FROM "catalogs" WHERE "catalogs"."parent_id" IN (1, 2, 3)

因为siblings依赖于不同查询条件,不能直接使用includes,但是可以用parent.children实现。


总结:

我们的使用情景是对产品批量查出他们的catalogs,并且要用到parent和grandparent。从生成的查询来看,有理由相信closure_tree的性能更好。ancestry所有查询都无法使用includes,也只有parent可以用到缓存,而closure_tree在缓存使用上占优,其他查询也大多可以用includes和eager_load进行优化。

tips: 使用closure_tree,如果需要特定depth的parent和children,可以自己定义一个关联:

has_one :grandparent, class_name: 'Catalog', through: :grandparent_hierarchy, source: :ancestor
has_one :grandparent_hierarchy, ->{ where(generations: 2) }, class_name: 'CatalogHierarchy', foreign_key: :descendant_id

但是对于grandparent没必要这么做,使用includes(parent: :parent)会更好。

共收到 2 条回复
De6df3 huacnlee 将本帖设为了精华贴 04月18日 14:02
1770

@darkbaby123 PostgreSQL 的话直接用 ltree 也是不错

需要 登录 后方可回复, 如果你还没有账号请点击这里 注册