How To Enrich Scanned Data With External Information

Oct 3, 2018 | by Maik M.


In many past migration projects, I have come across the requirement to add information from an external data source before (or while) creating the documents in the new platform.
There are various reasons for this, and there are two ways to do it.

Option 1

The easiest way to do that is with mapping lists. In MC3, a mapping list is a simple key/value list.
A simple example is a user mapping, where you replace old usernames with new ones or set a default user if old users no longer exist in the new system.
With a little bit of “hacking” you can do more: a mapping list can also be treated as a multi-dimensional list (which will finally become a built-in feature in MC4).


Table 1: Example multi-value mapping list

On the left side is an attribute that serves as a unique identifier for the scanned documents; on the right side are the additional attributes, stored as comma-separated values.
With the following transformation function, you can get the desired value by its number (its position in the array).

 


Table 2: Example mapping to get the first item from the values array

You use the same combination of functions for each attribute and only change the index that is returned (at #2).
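Outside of migration-center, the idea behind this lookup can be sketched in a few lines of Python. This is only an illustration of the technique, not the actual transformation function: the mapping keys, the values, the function name `get_mapped_value` and the 0-based index are all assumptions, and migration-center may count positions differently.

```python
# Hypothetical multi-value mapping list: unique key -> comma-separated values.
# Keys, values and their meaning are invented for this illustration.
MAPPING_LIST = {
    "DOC-001": "Contract,Legal,Confidential",
    "DOC-002": "Invoice,Finance,Internal",
}


def get_mapped_value(mapping, key, index, default=None):
    """Return the item at `index` (0-based) of the comma-separated value for `key`.

    Falls back to `default` if the key is unknown or the index is out of range.
    """
    raw = mapping.get(key)
    if raw is None:
        return default
    parts = [part.strip() for part in raw.split(",")]
    if 0 <= index < len(parts):
        return parts[index]
    return default


# Same lookup for every attribute; only the index changes.
doc_type = get_mapped_value(MAPPING_LIST, "DOC-001", 0)
department = get_mapped_value(MAPPING_LIST, "DOC-001", 1)
```

The default value plays the same role as the default user in the username example: a document whose key is missing from the list still gets a defined value.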

Advantage: You have all the information in one big mapping list. The functions and rules are re-used, and the information is maintained in one place.
Disadvantage: If the information is “living” data that can still change, you need to keep updating the mapping list accordingly.
Hint: We also provide a database query/script to insert or update the mapping directly in the database. That makes sense especially if you have many values (100k+) in there or if you want to set up an automated way to keep the information up to date.
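The database script itself is not shown here. As a rough sketch of the preparation step that would come before it, the following Python turns records exported from an external source into key/value pairs in the comma-separated layout described above. The record fields and the function name `build_mapping_rows` are hypothetical, and how the pairs are finally loaded into the mapping list is left open.

```python
def build_mapping_rows(records, key_field, value_fields):
    """Turn external records into {key: "v1,v2,..."} pairs for a mapping list."""
    rows = {}
    for record in records:
        values = [str(record.get(field, "")) for field in value_fields]
        # A comma inside a value would shift every later position on lookup,
        # so this sketch rejects such records instead of guessing an escape rule.
        if any("," in value for value in values):
            raise ValueError(f"value for {record[key_field]!r} contains a comma")
        rows[record[key_field]] = ",".join(values)
    return rows


# Invented sample data standing in for an export from the external system.
external_records = [
    {"doc_id": "DOC-001", "doc_type": "Contract", "department": "Legal"},
    {"doc_id": "DOC-002", "doc_type": "Invoice", "department": "Finance"},
]
rows = build_mapping_rows(external_records, "doc_id", ["doc_type", "department"])
```

Regenerating the pairs on a schedule is one way to keep a “living” data source and the mapping list in sync.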

Option 2

In case the first option isn’t suitable for you or isn’t convenient enough, we have another option. It involves a little customization inside the MC database, but thanks to the stable framework this is doable within a few hours: create a new transformation function that collects information “in real time” during the transformation of metadata.


Table 3: Example of a newly added transformation function that gets a value from another source

To do so, the external data source must be accessible from the MC database (Oracle), because the function is executed inside it. The function itself is straightforward: you write a SQL query that selects the desired results from the source table(s). The parameter (used in the “where” condition) is a unique identifier; it can be a single ID or a combination of several attributes if needed. You are free to implement the function’s behavior as you need it.


Table 4: Example of a custom transformation query to get other objects information
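The real function lives inside the Oracle database and is not reproduced here. To illustrate the pattern only (one query, a unique identifier as the “where” condition, one value returned), here is a minimal Python sketch that uses an in-memory SQLite database as a stand-in. The table `external_docs`, its columns and the function `lookup_external_value` are all made up for this example.

```python
import sqlite3

# In-memory SQLite stands in for the external source. In migration-center the
# query would run inside the Oracle database instead.
source_conn = sqlite3.connect(":memory:")
source_conn.execute(
    "CREATE TABLE external_docs (doc_id TEXT PRIMARY KEY, owner TEXT, department TEXT)"
)
source_conn.executemany(
    "INSERT INTO external_docs VALUES (?, ?, ?)",
    [("DOC-001", "alice", "Legal"), ("DOC-002", "bob", "Finance")],
)

# Column names cannot be bound as query parameters, so they are checked
# against a fixed whitelist before being placed into the SQL text.
ALLOWED_COLUMNS = {"owner", "department"}


def lookup_external_value(connection, doc_id, column, default=None):
    """Fetch one attribute of one document from the external table."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"unsupported column: {column!r}")
    row = connection.execute(
        f"SELECT {column} FROM external_docs WHERE doc_id = ?", (doc_id,)
    ).fetchone()
    return row[0] if row is not None else default


owner = lookup_external_value(source_conn, "DOC-001", "owner")
```

Because the value is fetched at the moment the function is called, each new transformation run sees whatever the source table contains at that time.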

The “mapping” or enrichment takes place during the transformation step.
Remember the order of the steps: analyze (scan), organize, transform, validate (correct) and import.

Advantage: You get up-to-date, real-time information. Whenever you run the transformation, fresh data is pulled. That is extremely useful when the migration process across DEV, STAGE and PRD takes a while. Also, you don’t need to maintain the data yourself, because it is normally maintained automatically by another system.
Disadvantage: You have to implement a custom function. For some customers, especially in highly regulated environments, that could be a no-go or showstopper. Because it isn’t an out-of-the-box feature, it has to be specified and built for each project individually.

Additionally, and that is what you see in the screenshots, you can also get information about other objects inside the MC database. That is a real benefit if you need attributes from other documents, versions or relations. For example, to set the “security class” attribute of a document, you might need to read the security class of its parent.
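The parent example can be pictured as a self-join on the table that holds the scanned objects. The following Python sketch again uses in-memory SQLite as a stand-in; the table `objects`, its columns and the function `security_class_from_parent` are hypothetical and do not reflect the actual migration-center schema.

```python
import sqlite3

# Invented object table: each object may reference a parent object.
mc_conn = sqlite3.connect(":memory:")
mc_conn.execute(
    "CREATE TABLE objects (object_id TEXT PRIMARY KEY, parent_id TEXT, security_class TEXT)"
)
mc_conn.executemany(
    "INSERT INTO objects VALUES (?, ?, ?)",
    [
        ("FOLDER-1", None, "restricted"),
        ("DOC-001", "FOLDER-1", None),
        ("DOC-002", "FOLDER-1", "public"),
    ],
)


def security_class_from_parent(connection, object_id):
    """Return the security class of the object's parent, or None if it has no parent."""
    row = connection.execute(
        """
        SELECT parent.security_class
        FROM objects AS child
        JOIN objects AS parent ON parent.object_id = child.parent_id
        WHERE child.object_id = ?
        """,
        (object_id,),
    ).fetchone()
    return row[0] if row is not None else None


inherited = security_class_from_parent(mc_conn, "DOC-001")
```

Whether the parent’s value should override a value the document already carries is a project decision; this sketch simply returns what the parent has.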
Last but not least, whichever option you choose: the beauty of migration-center is that the validation step is always mandatory. In case you wondered whether those custom implementations are risky: at the end of the day, the validation will reveal any errors. migration-center will never let you import unvalidated documents.

If you have any questions or want to implement such a function, please don’t hesitate to contact anyone on our team. We are glad to help.
