A Delta Migration occurs when you have records that already exist in the Akamai Identity Cloud but might require an update from a legacy system. (For example, the Identity Cloud might include records for beta testers who also have user accounts in your legacy system.) A delta migration functions the same as a regular migration, but with the following additional logic:
- IF a record does NOT exist THEN it will be created using the normal data migration process.
- IF a record DOES exist (as detected by a unique key violation response from entity.bulkCreate) THEN an attempt will be made to update the existing record using entity.update with a defined primary key (such as email).
In other words, suppose you are using email as your primary key and your datafile consists of a single record that has email@example.com as the email address. When you run the delta migration, dataload.py checks whether your user profile store already contains a record that has the email address email@example.com. If it does not, a new record is written. However, if the user profile store already contains a record with that email address, the existing record is overwritten with the data from the record in the datafile. This behavior occurs only during a Delta Migration. If you do not do a Delta Migration, the existing record is left alone and the new record is not written to the user profile store.
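The create-then-update logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the actual dataload.py code: the in-memory `store` dictionary, the `UniqueViolation` exception, and the `create`, `update`, and `delta_migrate` functions are all hypothetical stand-ins for the user profile store and for the entity.bulkCreate and entity.update calls.

```python
class UniqueViolation(Exception):
    """Hypothetical stand-in for the unique key violation from entity.bulkCreate."""


def create(store, record, pk):
    # Hypothetical stand-in for record creation; not the real API client.
    key = record[pk]
    if key in store:
        raise UniqueViolation(key)
    store[key] = dict(record)


def update(store, record, pk):
    # Hypothetical stand-in for entity.update, located by the primary key.
    store[record[pk]].update(record)


def delta_migrate(store, records, pk="email"):
    """Create each record, falling back to an update when it already exists."""
    created, updated = [], []
    for record in records:
        try:
            create(store, record, pk)
            created.append(record[pk])
        except UniqueViolation:
            update(store, record, pk)
            updated.append(record[pk])
    return created, updated


# One record already exists in the (simulated) user profile store.
store = {"email@example.com": {"email": "email@example.com", "givenName": "Old"}}
datafile = [
    {"email": "email@example.com", "givenName": "New"},
    {"email": "other@example.com", "givenName": "Other"},
]
created, updated = delta_migrate(store, datafile)
print(created)  # ['other@example.com']
print(updated)  # ['email@example.com']
```

In a regular migration, the `except` branch would simply record the failure and leave the existing record untouched.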
It is important to note that duplicates are not ignored. Because of that, duplicate records in the CSV import file could trigger an update that might not be desirable (i.e., overwriting an existing record that you really didn't want changed). Make sure your import CSV file is free from duplicate records (records containing the same primary key, usually email). Keep in mind as well that plural structures addressed in an update will be replaced with the new information contained in the import CSV. That means that existing InternalIds associated with plural entries (used for indexing and iteration) will be reassigned. It is possible, albeit very unlikely, that you could have downstream third-party systems that depend on existing plural IDs. You should verify this is not the case before initiating a Delta Migration.
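One way to check an import file for duplicate primary keys before a Delta Migration is a short script like the one below. It is a minimal sketch that uses only the Python standard library; the function name `find_duplicate_keys` is hypothetical, and the choice to compare values case-insensitively and with surrounding whitespace removed is an assumption that may not match how uniqueness is enforced in your schema.

```python
import csv
import io
from collections import Counter


def find_duplicate_keys(csv_file, pk="email"):
    """Return primary key values that appear more than once in the CSV."""
    reader = csv.DictReader(csv_file)
    # Assumption: keys are compared case-insensitively, ignoring outer whitespace.
    counts = Counter(row[pk].strip().lower() for row in reader)
    return sorted(key for key, n in counts.items() if n > 1)


# Small in-memory sample; a real run would pass an open file instead.
sample = io.StringIO(
    "email,givenName\n"
    "email@example.com,Ann\n"
    "other@example.com,Bob\n"
    "Email@example.com,Ann again\n"
)
duplicates = find_duplicate_keys(sample)
print(duplicates)  # ['email@example.com']
```

An empty list means no duplicate keys were found under those comparison rules.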
Delta Migrations produce two additional artifacts (update_success.csv and update_fail.csv) that indicate the status of records that were updated during the execution of the script. As with other log files, a timestamp is appended to these file names.
To run a Delta Migration, add the arguments -dm and -pk when executing dataload.py. For example:
Remember that the attribute passed to -pk (the primary key) must be designated unique in the schema and must be present in the import CSV file. This attribute is used to find the existing record and perform the entity.update API call on it, overwriting the old record with data supplied from the corresponding record in the import CSV file. In the example above, the email attribute has been designated as the primary key. Other attributes can be used as the primary key, such as a customer loyalty number, but any such attribute:
- Must also be present in the import CSV and present in the schema.
- Must be unique, in that no two records will contain the same value for the attribute. The customer loyalty number can’t be used as a primary key if multiple customers (e.g., all the members of a family) share the same loyalty number.
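The CSV side of these requirements can be checked before running the migration. The sketch below is an illustration only: `check_primary_key_column` is a hypothetical helper, `loyaltyNumber` and `phone` are made-up column names, and the script cannot confirm that the attribute is designated unique in the schema, which still has to be verified separately.

```python
import csv
import io


def check_primary_key_column(csv_file, pk):
    """Return a list of problems with the primary key column; empty if none."""
    reader = csv.DictReader(csv_file)
    if pk not in (reader.fieldnames or []):
        return [f"column '{pk}' is missing from the CSV header"]
    problems = []
    for row in reader:
        # A short row yields None for missing fields, so fall back to "".
        if not (row[pk] or "").strip():
            problems.append(f"line {reader.line_num}: empty value for '{pk}'")
    return problems


# Hypothetical sample data with one blank loyalty number.
sample_text = (
    "loyaltyNumber,email\n"
    "1001,a@example.com\n"
    ",b@example.com\n"
)
blank_check = check_primary_key_column(io.StringIO(sample_text), "loyaltyNumber")
missing_check = check_primary_key_column(io.StringIO(sample_text), "phone")
print(blank_check)    # ["line 3: empty value for 'loyaltyNumber'"]
print(missing_check)  # ["column 'phone' is missing from the CSV header"]
```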