File Delivery

If a List Generation workflow is suitable for the use case at hand, the file delivery service is the best way to access Enigma data.

Please reach out to the Enigma team if you need to generate a list. The steps below outline how to best prepare for that conversation.

Requesting the Desired Output

It’s important to be aware of the choices available when generating a list so that the output file meets the requirements of your use case. The following are some things to consider.

Entity Type
Indicate whether the desired file should contain a list of businesses or business locations. One file will contain a list of only one of these entity types.

Filters
Provide the Enigma team with an idea of the type of business or business location being targeted. To get an idea of the dimensions along which the Enigma data can be filtered, see the Enigma attribute dictionary.

Attributes
Before discussing list generation with a member of the Enigma team, it helps to familiarize yourself with the Enigma attribute dictionary. It explains the attributes that can be included in the output file alongside the basic identifiers of the business or business location, such as business name, address, and Enigma ID.

Some attributes are available as a monthly time series (currently, only the Merchant Transaction Signals). Let the Enigma team know whether you are interested in only the most recently available month of transaction-related data or in multiple months of history.

Output Format
The final output can be in CSV or Parquet format.

File Structure
The structure of the output file can also be customized.

Enigma data is stored as a collection of attributes. Some of these attributes are represented as objects with properties associated with them. The file can be structured so that all such attributes are either flattened or unflattened.

A flattened file will contain no columns with nested values inside them. Every property of an object-type attribute will be represented as its own column in the file. Choose a flattened structure if the output file(s) will be opened and analyzed in a spreadsheet-like tool.

An unflattened file will contain columns with nested values inside them. For example, an attribute like industries will have properties like classification_type and classification_description, among others. These properties will not be represented as separate columns but will instead be nested inside a single column called “industries”. Choose an unflattened file if the output file(s) will be ingested into a data pipeline and the efficiency of ingesting and programmatically parsing the file is a priority.

Number of output files
In most cases, Enigma recommends sending the output back in one file. There are a few instances where this is not the case.

  1. If a user wants both firmographics attributes and multiple months of history for a time-series attribute, Enigma recommends splitting the firmographics into one file and the time series into another. The Enigma ID can serve as a matching key between the two files in these scenarios.
  2. If a user wants to see matches for businesses and matches for business locations, Enigma recommends splitting those into two separate files.
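
When firmographics and time-series data arrive as two files, they can be recombined on the Enigma ID. The following is a minimal sketch using only the Python standard library. The column header enigma_id, the other column names, and the sample rows are assumptions made up for illustration; check the headers in the delivered files before relying on them.

```python
import csv
import io

# Hypothetical header for the Enigma ID column; confirm it in your own file.
ID_COLUMN = "enigma_id"

# Made-up sample data standing in for the two delivered CSV files.
firmographics_csv = """enigma_id,name
B1,Acme Coffee
B2,Bolt Hardware
"""
time_series_csv = """enigma_id,month,card_revenue
B1,2024-01,100
B1,2024-02,120
B2,2024-02,80
"""


def join_on_id(firmographics_file, time_series_file, id_column=ID_COLUMN):
    """Attach firmographics columns to every monthly time-series row."""
    # Index the firmographics file: one row per Enigma ID.
    by_id = {row[id_column]: row for row in csv.DictReader(firmographics_file)}
    joined = []
    for row in csv.DictReader(time_series_file):
        match = by_id.get(row[id_column])
        if match is not None:  # skip months with no firmographics record
            joined.append({**match, **row})
    return joined


rows = join_on_id(io.StringIO(firmographics_csv), io.StringIO(time_series_csv))
```

With real files, pass open file handles instead of the io.StringIO objects used here.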

Receiving the File Back

There are two ways to receive the file back from Enigma. For more information on how to self-serve delivery using these sources, see the Console File Manager.

Interpreting the Output

The output file containing the generated list will include a number of columns, each representing an attribute that describes a business or business location.

The Enigma ID corresponding to each record will be included first.

While all additional column names can be renamed by the Enigma team upon request, by default column headers are named according to the following rules:

  • For attributes that are represented as arrays, each element of the array is added as a separate column with the header appended with __X where X = 0,1,2,… E.g., names__0
  • For attributes that are represented as an array of objects, the column header representing each object in the array is distinguished by __X where X = 0,1,2,…, followed by the property name. E.g., addresses__0__street_address1
  • For object-type attributes containing nested properties, the attribute name and each property name in the column header are joined by a double underscore __. E.g., card_revenue_growth__3m__rate_sa
  • For attributes containing nested properties that are themselves arrays, the property name is joined to the attribute name by a double underscore. Then each element of the array is added as a separate column with the header appended with __X where X = 0,1,2,…
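
The naming rules above can be illustrated with a short sketch that flattens a nested record into double-underscore column headers. This is not Enigma's implementation, only an approximation of the default naming convention described here; the flatten function and the sample values are hypothetical.

```python
def flatten(value, prefix=""):
    """Flatten nested dicts and lists into one dict keyed by __-joined paths."""
    columns = {}
    if isinstance(value, dict):
        # Object: append each property name after a double underscore.
        for key, child in value.items():
            path = f"{prefix}__{key}" if prefix else key
            columns.update(flatten(child, path))
    elif isinstance(value, list):
        # Array: append the element index (0, 1, 2, ...) after a double underscore.
        for index, child in enumerate(value):
            columns.update(flatten(child, f"{prefix}__{index}"))
    else:
        # Scalar: the accumulated path is the column header.
        columns[prefix] = value
    return columns


# Made-up record using attribute names that appear in the examples above.
record = {
    "names": ["Acme Coffee", "Acme Coffee LLC"],
    "addresses": [{"street_address1": "1 Main St"}],
    "card_revenue_growth": {"3m": {"rate_sa": 0.12}},
}
flat = flatten(record)
```

For this record, the resulting headers are names__0, names__1, addresses__0__street_address1, and card_revenue_growth__3m__rate_sa, matching the examples in the list above.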

If a time-series attribute was selected (only merchant transaction attributes currently have a time-series component), then there will be multiple rows of data for each Enigma ID, with each row corresponding to one month of the time series.
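
Because a time-series file repeats each Enigma ID once per month, it is often convenient to regroup the rows into one series per ID. The sketch below assumes hypothetical column headers enigma_id and month, and a month value such as 2024-01 that sorts correctly as text; the actual headers and date format in a delivered file may differ.

```python
from collections import defaultdict


def group_by_id(rows, id_column="enigma_id", month_column="month"):
    """Collect monthly rows into one chronologically sorted list per Enigma ID."""
    series = defaultdict(list)
    for row in rows:
        series[row[id_column]].append(row)
    for monthly_rows in series.values():
        # Assumes the month value sorts chronologically as a string.
        monthly_rows.sort(key=lambda r: r[month_column])
    return dict(series)


# Made-up rows, as they might be read from a time-series file.
rows = [
    {"enigma_id": "B1", "month": "2024-02", "card_revenue": "120"},
    {"enigma_id": "B2", "month": "2024-02", "card_revenue": "80"},
    {"enigma_id": "B1", "month": "2024-01", "card_revenue": "100"},
]
series = group_by_id(rows)
```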

Please take a look at the Enigma attribute dictionary to check the type of the attribute being appended.