Duplicate management is an additional licensed service which allows you to identify possible duplicate records in your account.
Within the software, a ‘record’ is identified by the email address associated with it; this is in effect the unique identifier for each supporter/constituent within your database. A duplicate is an instance where multiple records exist for what is seemingly the same individual. Such duplicates can occur for many reasons, including mistyped email addresses on page submission, old email addresses that are no longer in use and so on. De-duplication is a way of tidying up your data so that it only contains records that are actually valid, and disposing of records that are old or invalid.
If you have added this feature to your license, you can access it from the ‘Data & Reports’ tab in the dashboard. Select ‘Data & Reports > Manage duplication’ and you will be taken to the Duplicate management interface. Here you will see a short overview of the total number of supporters in your database, as well as the total number of duplicates that have already been identified by the service. On the first run, this number will be 0.
Before any duplicates can be identified, you will need to add some instructions in the form of rules that will tell the service what to look for when it runs. Rules can be set for any non-transactional fields that form your Account data structure, such as First Name, Last Name, Postcode, Country etc.
Click on ‘Set up rules’ under ‘Matching rules’ to start setting up the service to identify potential duplicate records. Once you have turned on the ‘Enable matching rules’ switch, you will need to decide which fields in your account’s data structure will be checked to identify individuals, then decide for each field whether you want an exact match or a ‘fuzzy’ match. You can add multiple fields; these are always combined with ‘AND’ logic, so two records must match on every field in your rules to be flagged. Blank values in a field will always be ignored, and will not constitute a match.
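The way rules combine can be pictured with a short Python sketch. This is only an illustration of the behaviour described above, not the service's implementation: the rule format, the function names, the 0.8 similarity threshold and the use of Python's `difflib` for fuzzy comparison are all assumptions, and the sketch reads "blank values do not constitute a match" as "a blank value makes that rule fail".

```python
from difflib import SequenceMatcher

# Hypothetical rule list: (field name, match mode). Not the platform's own format.
RULES = [("First Name", "fuzzy"), ("Last Name", "exact"), ("Postcode", "exact")]

FUZZY_THRESHOLD = 0.8  # assumed similarity cut-off, chosen for illustration only


def field_matches(a, b, mode):
    """Compare two field values under a single rule."""
    a = (a or "").strip()
    b = (b or "").strip()
    if not a or not b:
        return False  # a blank value never constitutes a match
    if mode == "exact":
        return a == b
    # 'fuzzy': similarity ratio from the standard library, as a stand-in
    return SequenceMatcher(None, a, b).ratio() >= FUZZY_THRESHOLD


def is_potential_duplicate(rec_a, rec_b, rules=RULES):
    """Every rule must match (AND logic) for the pair to be flagged."""
    return all(field_matches(rec_a.get(f), rec_b.get(f), mode) for f, mode in rules)
```

With these rules, "Jonathan Smith, AB1 2CD" and "Jonathon Smith, AB1 2CD" would be flagged, while the same pair with a blank Postcode on one side would not.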
Working with potential duplicates
The service will run every night to identify more potential duplicate records in your account. A maximum of 1000 newly identified records will be added to the queue on each run. After the first run, you should have a few matches and can begin processing them. Any user with access to this function is able to process duplicates, and you will see an overview of the most recent activity on the Duplicate management landing page. The overview includes information about which of your users processed a duplicate, as well as which record was affected.
Once a number of duplicates have been identified, you can open the list of identified potential duplicates and begin reviewing them. You have two ways of starting to process them: you can either use the button to start from the first matching potential duplicate record, or you can look through the list and use the merge icon to start with a particular record.
The number of potential duplicates will be displayed in the ‘Potential Dups’ column. If this number is high, you will likely encounter situations where some records have already been dealt with as you move through the list.
Any records that have already been deleted by a previous merge operation will still appear in the queue as you move through it. When you reach one of these records in the workflow, you will see an ‘empty’ display that shows only two column headers: ‘Field’ and ‘Record to save’.
If you are looking at a record that is the end result of a merge operation, you may only see the ‘Field’, ‘Record to save’ and ‘Initial match’ columns, and no potential duplicate entries as these have already been deleted. There may still be a number for potential duplicates displayed for the record, even if the duplicates have already been dealt with.
You can use ‘Isolate differences’ to shorten the list of fields that need investigating. Any fields where the value is exactly the same for each of the matching records will be hidden. If some of the records have blank values for a field, they will not be considered identical to records that have a value for that field.
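As a rough illustration of what ‘Isolate differences’ does, the Python sketch below keeps only the fields whose values differ between records. The function name and the record layout are hypothetical, and hiding a field that is blank on every record is an assumption, since that case is not described above.

```python
def isolate_differences(records, fields):
    """Return the fields whose values are not identical across all records."""
    differing = []
    for field in fields:
        # Treat a missing or whitespace-only value as blank ("").
        values = {(rec.get(field) or "").strip() for rec in records}
        # A blank next to a real value gives two distinct entries, so the
        # field stays visible. A field that is blank everywhere collapses
        # to one entry and is hidden (assumption, not confirmed behaviour).
        if len(values) > 1:
            differing.append(field)
    return differing
```

For two records that share a first and last name but where only one has a postcode, only the Postcode field would remain in the list.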
When you are looking at a record with potential duplicates, the ‘Initial match’ is the record you are currently working on. Each potential match will be displayed as a separate column, but matches are not returned in any particular order.
For each ‘Potential match’ record, you will have two icons available. These are used either to select all fields for that record, or to remove the record from the merging process. To select all fields for a potential match, use the ‘Select all’ icon. This means that you are effectively selecting this record to be the ‘Record to save’. You can also create a composite of field values by clicking on the individual value for each field in matching supporters. Note that you can only select one value per field from all the potential matches.
The second icon will allow you to remove a supporter from the merge queue. It will not affect the record in any way; it simply removes it from the process. It is important to remove records that clearly belong to a different individual supporter, or supporter records you do not want to be part of the process.
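The selection model described above (one chosen value per field, with removed records taking no part) can be sketched in Python as follows. Everything here is hypothetical and for illustration only: the function, the labels such as "m1", and the assumption that the ‘Record to save’ starts out as a copy of the ‘Initial match’.

```python
def build_record_to_save(initial, matches, choices, excluded=()):
    """Compose a 'Record to save' from the initial match and potential matches.

    initial  -- the 'Initial match' record (field -> value)
    matches  -- label -> potential match record
    choices  -- field -> label of the record whose value was clicked;
                a dict allows exactly one choice per field
    excluded -- labels of records removed from the merge
    """
    sources = {"initial": initial}
    sources.update({k: v for k, v in matches.items() if k not in excluded})

    record = dict(initial)  # assumed starting point: the initial match
    for field, label in choices.items():
        if label not in sources:
            raise ValueError(f"{label!r} is not part of this merge")
        record[field] = sources[label].get(field)
    return record
```

Choosing every field from the same label corresponds to using ‘Select all’ on that record; choosing a value from an excluded record is rejected, because that record is no longer part of the merge.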
When reviewing potential matches, the Email Address field may have an alert icon associated with it. This means that the email address for the potential match has already been suppressed. Suppressed email addresses cannot be reached by broadcast emails sent through the platform, and are likely no longer valid.
Completing a merge
You can start the merge once you are satisfied that you have selected the correct field values from the available matches, excluded any records that should not be part of the merging operation, and have a complete set of values for the record you wish to keep. The process will delete any duplicate records that remain listed as ‘Potential matches’. Any transactions attributed to them will be re-assigned to the record listed in the ‘Record to save’ column. Note that the re-assignment does not happen in real time: it is scheduled as a job, so the merged record may not be complete immediately after the merge operation has finished.
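The two-stage nature of a merge (duplicates deleted straight away, transactions moved later by a scheduled job) can be pictured with the Python sketch below. It is a conceptual model under stated assumptions, not the platform's code: the function names and the in-memory data structures are invented for illustration, and records are keyed by email address as described at the start of this article.

```python
def merge_records(supporters, transactions, duplicate_emails):
    """Delete the duplicate records and return the transactions that still
    need re-assigning. Nothing is re-assigned at this point."""
    for email in duplicate_emails:
        supporters.pop(email, None)  # duplicate record is deleted
    return [t for t in transactions if t["supporter"] in duplicate_emails]


def run_reassignment_job(pending, keep_email):
    """Stand-in for the scheduled job: attribute each pending transaction
    to the record that was saved."""
    for transaction in pending:
        transaction["supporter"] = keep_email
```

Between the call to `merge_records` and the later run of `run_reassignment_job`, the saved record exists but does not yet carry the transactions of the deleted duplicates, which matches the delay described above.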
Changing matching rules
It is possible to modify the rules used to match potential duplicates by clicking on the icon under ‘Matching rules’. This will allow you to delete existing rules or set up new ones. If you need to modify the rule for a field, for example to change from exact to fuzzy matching, you will need to delete the entry first and then re-add it with your new rule selection.
If you are making changes to the rules after de-duplication has been running for a while, it may be advisable to clear the current list of possible duplicates. This lets you start again, hopefully with a better and shorter list of possible matches. After clearing, the count restarts from 0, and the de-duplication job will look through the first 1000 supporters on the next nightly run.