Send Enigma a File for Batch Append

If you have had a conversation with the Enigma team about your use case and wish to evaluate Enigma data further or begin an integration, then sending Enigma an input file to batch append may be your best option.

Preparing Input File

Once you are ready to send Enigma a file for batch append, the next step is to prepare your input file. The only supported format for input files is .csv. If your file is in another format and cannot easily be converted to .csv, please reach out to a member of the Enigma team.

If enriching at the business level, please ensure that at least one of the following combinations of fields is included in the input file and clearly marked with a column header:

  • Website URL (with or without any other information)
  • Name + Address + Person
  • Name + Address
  • Name + Person
  • Person + Address

If enriching at the business location level, please ensure that at least one of the following combinations of fields is included in your file and clearly marked with a column header:

  • Name + Address + Person
  • Name + Address
  • Name + Person
  • Person + Address
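Before sending a file, it can help to confirm that its header row covers at least one accepted combination. The following is a minimal sketch, not an Enigma tool. The header names (website, name, address, person) are hypothetical; substitute the headers your file or the blank CSV template actually uses.

```python
import csv
import io

# Hypothetical header names; replace them with the headers in your own file.
WEBSITE, NAME, ADDRESS, PERSON = "website", "name", "address", "person"

# Accepted field combinations from the lists above, per entity type.
LOCATION_COMBOS = [
    {NAME, ADDRESS, PERSON},
    {NAME, ADDRESS},
    {NAME, PERSON},
    {PERSON, ADDRESS},
]
BUSINESS_COMBOS = [{WEBSITE}] + LOCATION_COMBOS


def satisfied_combos(csv_text, entity_type):
    """Return the accepted combinations fully covered by the CSV's header row."""
    combos = BUSINESS_COMBOS if entity_type == "business" else LOCATION_COMBOS
    reader = csv.reader(io.StringIO(csv_text))
    headers = {h.strip().lower() for h in next(reader)}
    return [combo for combo in combos if combo <= headers]


sample = "name,address,phone\nAcme Hardware,1 Main St,555-0100\n"
# Name + Address is covered, so the file is usable for either entity type.
print(satisfied_combos(sample, "business_location"))
# A website-only file is accepted at the business level only.
print(satisfied_combos("website\nexample.com\n", "business_location"))
```

An empty result means that no accepted combination is present and the file needs more columns before it is sent.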

Download our blank CSV template to help get you started. If you’d like the Enigma team to aid in predictive analysis, you can also provide additional fields like marketing or delinquency outcomes.

Transmitting the File

The best way to transmit the file to Enigma is via the File Manager System feature in the Console. Simply drag and drop a file into this tool and it will be securely transmitted to Enigma. To learn more about how Enigma manages security, see the security page.

Requesting the Desired Output

After transmitting the file, tell the Enigma team how the output should be produced so that the completed batch append meets the requirements of your use case. The options to consider are described below.

Entity Type

Indicate whether the enrichment of input records should happen at the business or business location level.

Attributes

Before discussing a batch append with a member of the Enigma team, it helps to familiarize yourself with the Enigma attribute dictionary, which explains the attributes available to enrich input records.

Some attributes are available as a monthly time-series (currently, only the Merchant Transaction Signals). Let the Enigma team know if you’d like to receive the most recent month only, or if you’d like to receive a historical time series as well.

Some Enigma attributes are array types, meaning that a list of values is associated with the attribute. One example is the industries attribute, since a business can be classified in multiple closely related industries. If you are just starting out with Enigma data or are unfamiliar with the attribute in question, we recommend requesting only the first value for such attributes. If you need more values, let the Enigma team know in advance of the batch append.

Output Format

The final enriched output can be delivered in CSV or Parquet format.

File Structure

The structure of the output file is also something that can be customized.

Enigma data is stored as a collection of attributes. Some of these attributes are represented as objects with properties associated with them. The file can be structured so that all such attributes are either flattened or unflattened.

A flattened file will contain no columns with nested values inside them. Every property of an object type attribute will be represented as its own column in the file. Choose a flattened structure if the output file(s) will be opened in a spreadsheet-like tool and then analyzed.

An unflattened file will contain columns with nested values inside them. For example, an attribute like industries will have properties like classification_type and classification_description, amongst others. These properties will not be represented as separate columns but will instead be nested inside a single column called “industries”. Choose an unflattened file if the output file(s) will be ingested into a data pipeline and efficient ingestion and programmatic parsing are a priority.
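To make the difference concrete, the sketch below reads the same industries value from both structures. It rests on several assumptions: the values are made up, enigma_id is a hypothetical header for the Enigma ID column, the flattened headers follow the double-underscore naming described under Interpreting the Output, and the nested column in an unflattened CSV is assumed to be serialized as JSON text (confirm the actual encoding with the Enigma team).

```python
import csv
import io
import json

# Illustrative values only; "enigma_id" is a hypothetical header name.
industry = {
    "classification_type": "NAICS",
    "classification_description": "Hardware Stores",
}

# Flattened: every property of the first array element gets its own column.
flattened_csv = (
    "enigma_id,industries__0__classification_type,"
    "industries__0__classification_description\n"
    "E123,NAICS,Hardware Stores\n"
)

# Unflattened: the whole attribute sits in a single "industries" column.
# Assumption: the nested value is written as JSON text.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["enigma_id", "industries"])
writer.writerow(["E123", json.dumps([industry])])
unflattened_csv = buffer.getvalue()


def first_industry_from_flattened(row):
    """Rebuild the first industry object from its per-property columns."""
    return {
        "classification_type": row["industries__0__classification_type"],
        "classification_description": row["industries__0__classification_description"],
    }


def first_industry_from_unflattened(row):
    """Parse the nested column and take the first element of the array."""
    return json.loads(row["industries"])[0]


flat_row = next(csv.DictReader(io.StringIO(flattened_csv)))
nested_row = next(csv.DictReader(io.StringIO(unflattened_csv)))
print(first_industry_from_flattened(flat_row))
print(first_industry_from_unflattened(nested_row))
```

Both calls recover the same object. The flattened file is readable without any parsing step, while the unflattened file keeps the column count fixed no matter how many elements an array holds.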

Number of Output Files

In most cases, Enigma recommends sending the output back in one file. There are two exceptions:

  1. If a user wants both firmographics attributes and multiple months of history for a time-series attribute, Enigma recommends putting the firmographics in one file and the time series in another. The Enigma ID can serve as a matching key between the two files in these scenarios.
  2. If a user wants to see matches for business and matches for business locations separately, Enigma recommends splitting those into two separate files.
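When firmographics and time-series data arrive as two files, they can be recombined on the Enigma ID. The sketch below uses only the Python standard library. The header enigma_id and the other column names and values are hypothetical; use the headers in the files you actually receive.

```python
import csv
import io

ID_COLUMN = "enigma_id"  # hypothetical header for the Enigma ID column


def join_on_enigma_id(firmographics_csv, time_series_csv):
    """Attach each monthly time-series row to its firmographics row."""
    firmographics = {
        row[ID_COLUMN]: row
        for row in csv.DictReader(io.StringIO(firmographics_csv))
    }
    joined = []
    for ts_row in csv.DictReader(io.StringIO(time_series_csv)):
        firm_row = firmographics.get(ts_row[ID_COLUMN])
        if firm_row is None:
            continue  # time-series row without a firmographics match
        joined.append({**firm_row, **ts_row})
    return joined


# Made-up sample data: one firmographics row per ID, one row per ID and month.
firmographics_csv = "enigma_id,name\nE1,Acme Hardware\nE2,Bolt Supply\n"
time_series_csv = (
    "enigma_id,month,card_revenue\n"
    "E1,2024-01,100\n"
    "E1,2024-02,120\n"
)

joined = join_on_enigma_id(firmographics_csv, time_series_csv)
for row in joined:
    print(row)
```

In this sample, E2 has no rows in the time-series file, so it does not appear in the joined result.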

Receiving the File Back

There are three ways to receive the file back from Enigma:

Secure File Exchange
The enriched file can be deposited back into the File Manager System of the Console. Once the file is ready, a member of the Enigma team will send a notification.

We recommend this method of receiving the file if you are evaluating Enigma data prior to a full integration.

SFTP
If you are planning an integration with Enigma data that requires files to be deposited on a server via SFTP, please reach out to the Enigma team.

Enigma will set up an SFTP server with SSH-key-based authentication. To get started, provide the Enigma team with a public key (RSA format preferred) for Enigma to use for file transfers. See the AWS SFTP user guide and SSH key documentation for reference.

The Enigma team will then share details of the server where files will be deposited via SFTP. These details will contain the server address and a username to use. You can then connect to the SFTP server using the private key corresponding to the public key you provided to the Enigma team.

S3 Replication
If you use Amazon Web Services to manage your cloud infrastructure, then you may be able to receive files from Enigma via S3 replication. This method is suitable if, after a conversation with the Enigma team, it is determined that the amount of data you are receiving is large enough to warrant setting up S3 replication.

Once you are ready to receive files via S3 replication, please follow these steps:

  1. Create a destination bucket in AWS where Enigma will replicate files.
  2. Provide the Enigma team with the destination bucket name and the AWS account ID that hosts the bucket.
  3. Request that the Enigma team send over the access policy that needs to be applied to the destination bucket.
  4. Apply the access policy sent over by Enigma to the bucket (for examples of how to apply S3 policies, see the AWS documentation).
  5. Notify the Enigma team once the bucket is created with the policy applied.
  6. Enigma will send over a test file to ensure that replication is working.

Interpreting the Output

The output file will begin with the original input columns, with each original column header prefixed with input_.
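As a small illustration, the input_ prefix makes it easy to separate your own columns from the columns Enigma adds. This is a sketch under assumptions: the example headers are made up, and it presumes that none of the added columns also starts with input_.

```python
INPUT_PREFIX = "input_"


def split_columns(headers):
    """Split output headers into original input names and appended columns."""
    original = [
        h[len(INPUT_PREFIX):] for h in headers if h.startswith(INPUT_PREFIX)
    ]
    appended = [h for h in headers if not h.startswith(INPUT_PREFIX)]
    return original, appended


# Hypothetical header row from an output file.
headers = ["input_name", "input_address", "enigma_id", "names__0"]
original, appended = split_columns(headers)
print(original)
print(appended)
```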

The Enigma ID corresponding to each matched input record will be included next.

If matching on the business level, Enigma also provides the Enigma ID for up to five business locations associated with that business.

While all column names can be renamed by the Enigma team upon request, by default any additional column headers are named according to the following rules:

  • For attributes that are represented as arrays, each element of the array is added as a separate column whose header is suffixed with __X, where X = 0, 1, 2, …. E.g. names__0
  • For attributes that are represented as an array of objects, each object in the array is distinguished by __X, where X = 0, 1, 2, …, followed by the property name. E.g. addresses__0__street_address1
  • For object-type attributes containing nested properties, the attribute name and each property name are joined by a double underscore __. E.g. card_revenue_growth__3m__rate_sa
  • For attributes containing nested properties that are themselves arrays, the property name is joined to the attribute name by a double underscore, and each element of the array is then added as a separate column whose header is suffixed with __X, where X = 0, 1, 2, …

Please take a look at the Enigma attribute dictionary to check the type of the attribute being appended.
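The naming rules above amount to joining every nesting level with a double underscore, using the property name for objects and the position for arrays. The function below is a minimal sketch of that idea, not Enigma's implementation. The record is made up, and the nested shape of card_revenue_growth is inferred from the example header rather than taken from the attribute dictionary.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into {column_header: value} using "__"."""
    if isinstance(value, dict):
        items = value.items()
    elif isinstance(value, list):
        items = enumerate(value)  # array positions become 0, 1, 2, ...
    else:
        return {prefix: value}  # scalar: the accumulated header names it
    columns = {}
    for key, child in items:
        header = f"{prefix}__{key}" if prefix else str(key)
        columns.update(flatten(child, header))
    return columns


# Made-up record with an assumed structure, for illustration only.
record = {
    "names": ["Acme Hardware", "Acme"],
    "addresses": [{"street_address1": "1 Main St"}],
    "card_revenue_growth": {"3m": {"rate_sa": 0.05}},
}

for header, value in flatten(record).items():
    print(header, "=", value)
```

For this record the headers are names__0, names__1, addresses__0__street_address1, and card_revenue_growth__3m__rate_sa, matching the examples in the list above.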

If a time-series attribute is selected (only merchant transaction attributes currently have a time-series component), then there will be multiple rows of data for each Enigma ID, with each row corresponding to one month of the time series. Please note the following characteristics of time-series attributes:

  • If there is transaction presence in ANY month since Jan 2017, Enigma will return the entire history starting from Jan 2017, even if most attribute values are null.
  • In other words, any business with transaction presence at any time will have one record per month going back to Jan 2017 by default.
  • If there is no transaction presence, the Enigma record will not appear at all in the time series file.
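Because the history always starts in Jan 2017, the number of rows to expect per Enigma ID follows directly from the most recent month delivered. The sketch below lists the expected months and reports any gaps. It assumes that the month values in the file have already been parsed into Python date objects and that you know the latest month in the delivery; the encoding of the month column is not specified here.

```python
from datetime import date

HISTORY_START = date(2017, 1, 1)  # Jan 2017, per the characteristics above


def expected_months(latest):
    """List the first day of every month from Jan 2017 through latest's month."""
    months = []
    year, month = HISTORY_START.year, HISTORY_START.month
    while (year, month) <= (latest.year, latest.month):
        months.append(date(year, month, 1))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return months


def missing_months(observed, latest):
    """Return the expected months that have no row for a given Enigma ID."""
    observed_keys = {(d.year, d.month) for d in observed}
    return [
        m for m in expected_months(latest)
        if (m.year, m.month) not in observed_keys
    ]


# A full history through Dec 2017 has 12 monthly rows per Enigma ID.
print(len(expected_months(date(2017, 12, 31))))
# Made-up observation with a gap in Feb 2017.
print(missing_months([date(2017, 1, 1), date(2017, 3, 1)], date(2017, 3, 31)))
```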

By default, additional columns are appended to the file to indicate the confidence of each match (match_confidence) and the fields that were matched.

Please get in touch with the Enigma team if you feel sending Enigma a file is the best option for you. Our team will be happy to get you set up with either a one-off or recurring delivery of files based on your requirements.