We recently had to write a one-off script to update some data in a database. A simple “find and replace” in multiple columns across multiple tables.
We ultimately chose to do something like this:
def process!(old_thing, new_thing) do
  update_value(User, :first_name, old_thing, new_thing)
  update_value(User, :last_name, old_thing, new_thing)
  update_value(Post, :title, old_thing, new_thing)
  update_value(Comment, :body, old_thing, new_thing)
  update_value(Article, :content, old_thing, new_thing)
  update_value(Article, :excerpt, old_thing, new_thing)
end
def update_value(schema, field, old_thing, new_thing) do
  schema
  |> Repo.all()
  |> Enum.each(fn row ->
    value = Map.get(row, field)
    new_value = String.replace(value, old_thing, new_thing)

    row
    |> Changeset.change()
    |> Changeset.put_change(field, new_value)
    |> Repo.update!()
  end)
end
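The heavy lifting in `update_value/4` is just `String.replace/3`, which by default swaps every occurrence of the old substring for the new one (the sample strings below are made up for illustration):

```elixir
# String.replace/3 replaces all occurrences of the pattern by default.
String.replace("old corp is old", "old", "new")
# => "new corp is new"
```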
One might take a look at this code and say, “Well, that’s not very DRY, is it?” Shouldn’t we map out our schemas and fields in a nice list like so:
[
  %{schema: User, fields: [:first_name, :last_name]},
  %{schema: Post, fields: [:title]},
  %{schema: Comment, fields: [:body]},
  %{schema: Article, fields: [:content, :excerpt]}
]
and then iterate over it to make sure we don’t repeat ourselves?
@list [
  %{schema: User, fields: [:first_name, :last_name]},
  %{schema: Post, fields: [:title]},
  %{schema: Comment, fields: [:body]},
  %{schema: Article, fields: [:content, :excerpt]}
]

def process!(old_thing, new_thing) do
  Enum.each(@list, fn %{schema: schema, fields: fields} ->
    schema
    |> Repo.all()
    |> Enum.each(fn row ->
      changeset =
        Enum.reduce(fields, Changeset.change(row), fn field, acc ->
          value = Map.get(row, field)
          new_value = String.replace(value, old_thing, new_thing)
          Changeset.put_change(acc, field, new_value)
        end)

      Repo.update!(changeset)
    end)
  end)
end
I’m not a big fan of the second one.
The added complexity feels unnecessary to me, especially for a one-off script where we don’t care whether two fields in the same schema are processed separately. It’s an extra abstraction the task doesn’t need.
I think the first one strikes just the right balance of DRY.
But I guess these things are always a bit subjective 🙂