Data Remediation Strategies
Data Refinery Workbench supports a few strategies for importing client data and transferring it to the Live Data store.
Strategy 1: Import to Workflows and Remediate
With this method, the client decides that Live Data will store only “clean” data. Data records are therefore imported as Workflows, and Data Objects are created only after the Workflow End Transition is applied.
This means that all data is imported as Workflows for remediation before it transitions to Live Data. Under this approach, the process is linear and all data is “cleaned” before being stored as Live Data. However, the Workflow backlog can be large.
This strategy may be useful when performing a one-time remediation of a set of data, since it’s easy to track how many items remain to be processed before the remediation effort is complete, and all data in Live Data can be considered “finished.”
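The linear flow of Strategy 1 can be sketched in a few lines of Python. This is an illustrative model only, not the Workbench API: the `Workflow` and `Strategy1Project` classes and their methods are hypothetical names, and records are assumed to be plain dictionaries keyed by a record ID.

```python
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Hypothetical stand-in for a Workflow holding one imported record."""
    record_id: str
    data: dict
    done: bool = False


@dataclass
class Strategy1Project:
    """Toy model: everything is imported as Workflows; Live Data holds only finished records."""
    workflows: dict = field(default_factory=dict)
    live_data: dict = field(default_factory=dict)

    def import_records(self, records):
        # Every record starts as a Workflow; nothing goes to Live Data yet.
        for record_id, data in records.items():
            self.workflows[record_id] = Workflow(record_id, dict(data))

    def apply_end_transition(self, record_id, cleaned):
        # Applying the End Transition creates the Data Object in Live Data.
        workflow = self.workflows[record_id]
        workflow.data.update(cleaned)
        workflow.done = True
        self.live_data[record_id] = dict(workflow.data)

    def remaining(self):
        # Backlog size: Workflows that have not reached the End Transition.
        return sum(1 for w in self.workflows.values() if not w.done)


project = Strategy1Project()
project.import_records({"r1": {"name": " acme "}, "r2": {"name": "Globex"}})
project.apply_end_transition("r1", {"name": "Acme"})
print(project.remaining(), "record(s) left to remediate")
```

Because Live Data only ever receives records that have passed the End Transition, the backlog count is a direct measure of how much of the remediation effort remains.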
Strategy 2: Import to Live and Refine with Workflows
Under Strategy 2, the client uploads data records as Data Objects and then creates Workflows from those Data Objects on an “as-needed” basis. The data in the Live Data store is a mix of unremediated and fully remediated data. The team continually reviews the data and creates Workflows as flaws are identified. The edited data is merged back into the Data Object when the End Transition is applied. In comparison to Strategy 1, this method of data remediation is cyclical, and the backlog of Workflows remains manageable while the Live Data becomes cleaner.
This strategy is the best approach for managing longer-term remediation efforts, where the team attempts to keep the data clean after it has been imported into Workbench. It’s likely that most remediation projects will fall into this category, since high-quality data is useful beyond the scope of an initial remediation effort.
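The cycle described for Strategy 2 (create a Workflow from a Data Object, edit it, merge it back on the End Transition) might be modeled as follows. This is a minimal sketch under stated assumptions, not Workbench's actual interface: the `live_data` and `workflows` dictionaries and both functions are hypothetical.

```python
import copy

# Hypothetical Live Data store: Data Objects keyed by ID, imported as-is.
live_data = {
    "obj-1": {"name": "Acme", "phone": "555 0100"},
    "obj-2": {"name": "Globex", "phone": "555-0199"},
}
# Open Workflows, keyed by the ID of the Data Object they were created from.
workflows = {}


def create_workflow(object_id):
    """Open a Workflow on a working copy, leaving the Data Object untouched."""
    workflows[object_id] = copy.deepcopy(live_data[object_id])
    return workflows[object_id]


def apply_end_transition(object_id):
    """Merge the edited copy back into the Data Object and close the Workflow."""
    edited = workflows.pop(object_id)
    live_data[object_id].update(edited)


# A flaw is spotted in obj-1, so a Workflow is created only for that record.
draft = create_workflow("obj-1")
draft["phone"] = "555-0100"
apply_end_transition("obj-1")
print(live_data["obj-1"])
```

Only records with identified flaws ever enter the Workflow backlog, which is why the backlog stays small under this strategy.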
Strategy 3: Import Clean Data to Live and Remediate with Workflows
For Strategy 3, the client imports data records whose quality it has high confidence in as Data Objects, while records that do not meet the quality threshold are imported as Workflows. After the initial import, Workflows are created for new records on an as-needed basis. Data can also be exported, updated, and then re-imported using Data Object IDs at the data manager’s discretion. With this method, the team draws on its experience to gain maximum flexibility in the overall remediation project.
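The split at import time can be illustrated with a small routing function. The scoring rule here (the fraction of required fields that are filled in) and the names `quality_score`, `route_records`, and `threshold` are assumptions made for illustration; the source does not specify how the quality threshold is defined, and this is not a Workbench API.

```python
def quality_score(record, required_fields):
    """Assumed metric: fraction of required fields that hold a non-blank value."""
    filled = 0
    for field_name in required_fields:
        value = record.get(field_name)
        if value is not None and str(value).strip():
            filled += 1
    return filled / len(required_fields)


def route_records(records, required_fields, threshold=0.8):
    """Split records into Data Object imports and Workflow imports."""
    data_objects, workflows = [], []
    for record in records:
        if quality_score(record, required_fields) >= threshold:
            data_objects.append(record)  # high confidence: import to Live Data
        else:
            workflows.append(record)     # below threshold: import for remediation
    return data_objects, workflows


records = [
    {"id": 1, "name": "Acme", "email": "ops@acme.test"},
    {"id": 2, "name": "", "email": None},
]
data_objects, workflows = route_records(records, ["name", "email"], threshold=1.0)
print(len(data_objects), "Data Object(s),", len(workflows), "Workflow(s)")
```

In practice the team would substitute its own rule for what counts as high-confidence data; the point of the sketch is only that the decision is made per record at import time.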