File Delivery

If a List Generation workflow is suitable for the use case at hand, the file delivery service is the best way to access Enigma data.

Please reach out to the Enigma team if you need to generate a list. The steps below outline how to best prepare for that conversation.

Requesting the Desired Output

It’s important to be aware of the choices available when generating a list so that the output file meets the requirements of your use case. The following are some things to consider.

Entity Type
Indicate whether the desired file should contain a list of businesses or business locations. Each file contains only one of these entity types.

Provide the Enigma team with an idea of the type of business or business location being targeted. To get an idea of the dimensions along which the Enigma data can be filtered, see the Enigma attribute dictionary.

Before discussing list generation with a member of the Enigma team, it helps to familiarize yourself with the Enigma attribute dictionary. It explains the attributes that can be included in the output file alongside the basic identifiers of the business or business location, such as business name, address, and Enigma ID.

Some attributes are available as a monthly time series (currently, only the Merchant Transaction Signals). Let the Enigma team know whether you are interested in only the most recently available month of transaction-related data or in multiple months of history.

Output Format
The final output can be in CSV or Parquet format.

File Structure
The structure of the output file can also be customized.

Enigma data is stored as a collection of attributes. Some of these attributes are represented as objects with properties associated with them. The file can be structured so that all such attributes are either flattened or unflattened.

A flattened file will contain no columns with nested values inside them. Every property of an object-type attribute will be represented as its own column in the file. Choose a flattened structure if the output file(s) will be opened and analyzed in a spreadsheet-like tool.

An unflattened file will contain columns with nested values inside them. For example, an attribute like industries will have properties like classification_type and classification_description, amongst others. These properties will not be represented as separate columns but will instead be nested inside a single column called “industries”. Choose an unflattened file if the output file(s) will be ingested into a data pipeline and the efficiency of ingesting and programmatically parsing the file is a priority.

Number of output files
In most cases, Enigma recommends sending the output back in one file. There are a few instances where this is not the case.

  1. If a user wants both firmographics attributes and multiple months of history for a time-series attribute, Enigma recommends splitting the firmographics into one file and the time series into another. The Enigma ID can serve as a matching key between the two files in these scenarios.
  2. If a user wants to see matches for businesses and matches for business locations, Enigma recommends splitting those into two separate files.
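
As an illustration of using the Enigma ID as a matching key, the following is a minimal sketch that joins a firmographics file to a time-series file with Python's standard csv module. The column names (enigma_id, name, period, card_revenue) and the sample rows are assumptions made up for this example; the actual headers depend on the attributes you request.

```python
import csv
import io

# Hypothetical sample data standing in for the two delivered files.
# Column names, including "enigma_id", are assumptions for illustration.
FIRMOGRAPHICS_CSV = """enigma_id,name
B1,Acme Bakery
B2,Bolt Hardware
"""

TIME_SERIES_CSV = """enigma_id,period,card_revenue
B1,2024-01,1000
B1,2024-02,1100
B2,2024-01,500
"""


def join_on_enigma_id(firmographics_file, time_series_file, key="enigma_id"):
    """Attach firmographic columns to every time-series row sharing the key."""
    # Index the firmographics rows by ID (one row per ID is assumed).
    firmographics = {row[key]: row for row in csv.DictReader(firmographics_file)}
    joined = []
    for ts_row in csv.DictReader(time_series_file):
        firm_row = firmographics.get(ts_row[key])
        if firm_row is None:
            continue  # no firmographics for this ID; skipped in this sketch
        joined.append({**firm_row, **ts_row})
    return joined


rows = join_on_enigma_id(io.StringIO(FIRMOGRAPHICS_CSV), io.StringIO(TIME_SERIES_CSV))
for row in rows:
    print(row)
```

With real files, the io.StringIO objects would be replaced by files opened with open(path, newline="").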

Receiving the File Back

There are three ways to receive the file back from Enigma:

Secure File Exchange
The file containing the list can be deposited back into the Secure File Exchange feature of the console. Once the file is ready, a notification will be sent by a member of the Enigma team. The file can be found in the Received folder within the feature. You can download the file locally from this location.

We recommend this method of receiving the file if you are evaluating Enigma data prior to a full integration.

SFTP
If you are planning an integration with Enigma data that requires files to be deposited on a server via SFTP, please reach out to the Enigma team.

Enigma will set up an SFTP server with SSH-key-based authentication. To get started, provide the Enigma team with a public key (RSA format preferred) for Enigma to use for file transfers. See the AWS SFTP user guide and SSH key documentation for reference.

The Enigma team will then share details of the server where files will be deposited via SFTP. These details will contain the server address and a username to use. You can then connect to the SFTP server using the private key that corresponds to the public key you provided to the Enigma team.

S3 Replication
If you use Amazon Web Services to manage your cloud infrastructure, you may be able to receive files from Enigma via S3 replication. This method is suitable if, after a conversation with the Enigma team, it is determined that the amount of data you are receiving is large enough to warrant setting up S3 replication.

Once you are ready to receive files via S3 replication, please follow these steps:

  1. Create a destination bucket in AWS where Enigma will replicate files.
  2. Provide the Enigma team with the destination bucket name and the AWS account ID that hosts the bucket.
  3. Request that the Enigma team send over the access policy that needs to be applied to the destination bucket.
  4. Apply the access policy sent over by Enigma to the bucket (for examples of how to apply S3 policies, see the AWS documentation).
  5. Notify the Enigma team once the bucket is created with the policy applied.
  6. Enigma will send over a test file to ensure that replication is working.

Interpreting the Output

The file containing the generated list will contain a number of columns, each representing an attribute that describes a business or business location.

The Enigma ID corresponding to each record will be included first.

While all additional column names can be renamed by the Enigma team upon request, by default column headers are named according to the following rules:

  • For attributes that are represented as arrays, each element of the array is added as a separate column with the header appended with __X where X = 0,1,2,… E.g., names__0
  • For attributes that are represented as an array of objects, the column header representing each object in the array is distinguished by __X where X = 0,1,2,…, followed by the property name. E.g., addresses__0__street_address1
  • For object-type attributes containing nested properties, the column header for each property is the attribute name followed by a double underscore __ and the property name. E.g., card_revenue_growth__3m__rate_sa
  • For attributes containing nested properties that are themselves arrays, the property name is appended to the column header after a double underscore. Then each element of the array is added as a separate column with the header appended with __X where X = 0,1,2,…
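
The naming rules above can be sketched as a small recursive function. This is an illustration of the convention only, not Enigma's actual implementation. The sample record, its values, and the city property are hypothetical, and the nested shape of card_revenue_growth is inferred from the example header above.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into {column_header: scalar} pairs.

    Object properties are joined with "__"; array elements get "__X",
    where X is the zero-based position in the array.
    """
    columns = {}
    if isinstance(value, dict):
        for prop, child in value.items():
            header = f"{prefix}__{prop}" if prefix else prop
            columns.update(flatten(child, header))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            columns.update(flatten(child, f"{prefix}__{index}"))
    else:
        columns[prefix] = value
    return columns


# Hypothetical unflattened record; values are made up for illustration.
record = {
    "names": ["Acme Bakery", "Acme Baking Co"],
    "addresses": [{"street_address1": "1 Main St", "city": "Springfield"}],
    "card_revenue_growth": {"3m": {"rate_sa": 0.05}},
}

flat = flatten(record)
for header, cell in flat.items():
    print(header, "=", cell)
```

Running the sketch on the sample record yields headers such as names__0, addresses__0__street_address1, and card_revenue_growth__3m__rate_sa, which match the patterns described in the rules.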

If a time-series attribute was selected (only merchant transaction attributes currently have a time-series component), there will be multiple rows of data for each Enigma ID, with each row corresponding to one month of the time series.
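
Because a time-series file repeats each Enigma ID once per month, it is often convenient to group the rows by ID before analysis. The sketch below does this with the standard library. The column names (enigma_id, period, card_revenue), the YYYY-MM month format, and the sample rows are all assumptions for illustration; check the delivered file for the actual headers.

```python
import csv
import io
from collections import defaultdict

# Hypothetical time-series file: one row per Enigma ID per month.
SAMPLE_CSV = """enigma_id,period,card_revenue
B1,2024-02,1100
B1,2024-01,1000
B2,2024-01,500
"""


def group_by_enigma_id(csv_file, key="enigma_id", month_column="period"):
    """Return {enigma_id: [rows sorted by month]} for a time-series file."""
    series = defaultdict(list)
    for row in csv.DictReader(csv_file):
        series[row[key]].append(row)
    for monthly_rows in series.values():
        # Assumes months are zero-padded YYYY-MM strings, which sort
        # chronologically when compared as text.
        monthly_rows.sort(key=lambda r: r[month_column])
    return dict(series)


series = group_by_enigma_id(io.StringIO(SAMPLE_CSV))

# Most recent month available for each Enigma ID.
latest = {enigma_id: rows[-1] for enigma_id, rows in series.items()}
print(latest)
```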

Please take a look at the Enigma attribute dictionary to check the type of the attribute being appended.