重构 Simple Rails history pattern (ActiveRecord)
如果你的应用里需要保留 Model 的历史记录，可以尝试一下 Strikingly 团队实践后的方法：
Many times in my career, I had to face cases where I had to keep a history of one or many tables. Most of the time, the need of history comes after the main feature itself is developed and ends up in this kind of architecture:
| Post | | PostHistory | |==============| |===============| | author_id | | post_id | | title | | title | | content | | content | | published | |---------------| | published_at | |--------------|
It comes as the most natural and painless way to add the history feature to an existing entity. You can also find gems doing this job for you, in which case, they will serialize your Post in a generic history table.
Both of those approaches sound sexy because they’re simple to implement and they don’t introduce a “risk” for the Post model. Problem is, they’re very hard to maintain.
In the first case, every time we run a migration on the post table, we will have to run it on the PostHistory too. Double the work. If you have special columns (enum, serialized stuff in any manner), you also have to maintain the same code in both classes. Or you have to extract everything in modules to share it between classes, which makes the code less readable. If you use services, it can lead to even more complicated situations.
In the second case, if your Post model changes, restoring the serialized version from history will become hard, if not impossible. If you want to historize only a subset of properties from Post, it’s also inconvenient. Plus, if you have serialized columns in Post table, you will end with serialized data IN serialized data in the history table. Something you don’t want to deal with in case of bugs.
The other main problem is if you need to use history in another context than just reading values and showing them to a human, you will have a schizophrenic code, using sometimes Post, sometimes PostHistory.
This architecture also come with the classic problem “when do we create a new entry?”:
- before saving will imply that the last version is not in history, force you to read from 2 tables to have the complete history
- after saving will imply that the last version is in two different places (content duplicated)
Let’s take a simple example, restore values from history:
# services/posts/historize_service.rb class HistorizeService def run(post) history = PostHistory.create!( title: post.title, content: post.content ) end end # services/posts/restore_from_history_service.rb class RestoreFromHistoryService attr_reader :historize_service def initialize(historize_service:) @historize_service = historize_service end def run(post, post_history) Post.transaction do historize_service.run(post) post.title = post_history.title post.content = post_history.content # this list can be very long post.save! end end end
Also consider that in this specific case, published state is ambiguous. And if many versions in the history have different meanings, it’s hard to identify which one has what meaning. We could transfer “publish” informations into the history table but that would mean history would carry contextual informations and we would have to maintain the whole history to ensure only one version is published at a time.
Separate “meta” and “content”
A more flexible solution you can consider is to separate in two separate models what is content (ie: what we need to keep track of over time) and what’s immutable over time.
| Post | | Post::Content | |======================| |=============| | author_id | | post_id | | current_content_id | | title | | published_content_id | | content | | last_published_at | | created_at | |----------------------| |-------------|
Using this design, whatever version you use, you will always manipulate the same type of object. You don’t have to double the logic and migrations. You keep your data stable over time.
To give specific meaning to specific versions, we will rely on foreign keys. Post::Content should not know it’s role in outside models. Let’s keep it as small as possible and focus on data.
Your post model will look like this:
# models/post.rb class Post < ActiveRecord::Base # Associations has_many :contents has_one :current_content, class_name: 'Post::Content' has_one :published_content, class_name: 'Post::Content' # Methods def published? published_content_id? end end
Create a new entry in history is as simple as that:
# services/posts/historize_service.rb class HistorizeService def run(post) Post.transaction do new_content = post.current_content.dup new_content.save! post.update_attributes!(current_content: new_content) end end end
Compare to the previous version, we don’t need to modify this service every time we add (or remove) a property. It’s stable over time, that’s all we want.
The simplest way to deal with versions is to consider our Post::Content objects as immutable
# services/posts/content/update_service.rb class UpdateService # params # # * post: an instance of Post # * properties: a hash containing the following keys # * title: (string) optional # * content: (string) optional def run(post, properties) new_content = post.current_content.dup new_content.attributes = properties new_content.save! post.update_attributes!(current_content: new_content) end end
This service will create a new content entry every time you want to update content properties (title and/or content). If you want to prevent blank entries that would create useless history entry, you can use a form object to validate the data before calling the service.
Let’s keep going with restoring an entry
# services/posts/restore_content_from_history_service.rb class RestoreContentFromHistoryService def run(post, content) Post.transaction do # in order to keep the structure coherent, we need to # create a new entry instead of linking the old one. new_content = content.dup new_content.save! post.update_attributes!(current_version: new_content) end end end
Let’s see how we do the publishing now
# services/posts/publish_service.rb class PublishService def run(post) post.update_attributes!( published_content: post.current_content, last_published_at: Time.now ) end end
And voila! This design is flexible enough to add many features that won’t compromise the structure.
Making it a bit more transparent
# models/post.rb class Post < ActiveRecord::Base # Delegations delegate :title, :content, to: :current_content, allow_nil: true end # allow you to access directly post.title # and masking the complexity of post.current_content.title # (see demeter law https://en.wikipedia.org/wiki/Law_of_Demeter)
Adding version numbers, for example
# models/posts/content.rb class Post::Content < ActiveRecord::Base # Callbacks before_create :tag_version_number # Methods private def tag_version_number self.version = generate_version_number end def generate_version_number # optimistic solution (self.class.where(post_id: post_id) .maximum(:version) || 0) + 1 end end
This could also be extracted to a module, exposed as a method like
has_versions_number scope: :post_id
You can even disable update at model level to ensure no one is changing the history
# lib/active_record/immutable_model.rb module ActiveRecord module ImmutableModel class UpdateImmutableModelException < Exception def initialize(msg) super("Immutable model can't be updated") end end extend ActiveSupport::Concern included do before_update :prevent_update end private def prevent_update raise UpdateImmutableModelException.new end end end # models/post/content.rb class Post::Content < ActiveRecord::Base include ActiveRecord::ImmutableModel end
To update Post from forms, I discourage you to use ActiveRecord nested_attributes pattern as long as it’s a very horrible pattern (even if it looks very convenient). Instead, you would prefer to use a form object and hide the complexity of content versioning from your user.
This solution solves some problem cases but still have cons, like any other one. It has a higher cost at beginning but offers more flexibility over time, especially if you want to add features based on your history. It also allows you to easily “tag” and access some versions.