Entity Exploration
ENTITY_ID values in your database will most likely differ from those shown here, as they depend on load order. Use the ENTITY_ID values returned by your commands in subsequent steps. If you are using the truth set, DATA_SOURCE and RECORD_ID values will be the same.
The following examples demonstrate basic exploration of the truth set using sz_explorer, located in the Senzing project’s /bin directory.
Run sz_explorer from the command line.
Run help to view the available commands.

This page focuses on the adhoc commands listed above. These commands can be used at any time, with or without a snapshot. Common use cases include:
- Investigating how a match was made or why one was not, in support of end-user inquiries.
- Exporting data and capturing screenshots when reporting issues to Senzing Support.
The snapshot and audit commands shown in the help output are covered in Snapshot Analysis and Auditing.
why and how. Type show_last_call after any command to see what call was made to Senzing, what flags were used, and what Senzing returned.
Using search
Run help search to view the usage details.

Run search robert smith to find matching entities:

The search returns 3 entities that satisfy the search criteria:
- "ENTITY_ID": 3 is a Robert Smith with 4 customer records.
- "ENTITY_ID": 200002 is a Robert Smith who is on the WATCHLIST.
- "ENTITY_ID": 16 is a Robert E Smith Sr who has both a CUSTOMERS and a WATCHLIST record.
In addition:
- The MATCH_KEY column shows which attributes matched (+name) and which principle was satisfied. Senzing performs Principle Based Entity Resolution. Principles are covered in detail later; the most important part at this stage is the MATCH_KEY.
- The match score is a simple scoring algorithm that ensures the strongest matches appear first. It sums the scores of each searched attribute, giving more weight to the name. Since this search used only a name, the match score equals the name score.
Using compare
Run compare search to see the entities returned from the prior search side-by-side.

The side-by-side view shows:
- The two customers share the same ADDRESS, but their DOB values are about 24 years apart. The difference is consistent with a father and son relationship.
- The WATCHLIST entity on the end does not appear to be related to either customer.
- A status line like lines 1-46/46 (END) in the lower left corner indicates a scrolling window. Use the arrow keys to navigate and press Q to quit.
When the compare table is wider than the terminal window, it opens in a scrollable pager. Use the left and right arrow keys to scroll horizontally and see all columns:

To retrieve a specific entity, use the get command.
Using get
Run help get to view the usage details.

Entity detail
Run get 3 detail to retrieve the detail view of "ENTITY_ID": 3.

The output starts with a grid of the records that belong to the entity:
- The first column shows which data source and record IDs were resolved to this entity as well as the MATCH_KEY and rule that fired when the record was loaded.
- The second column shows the data on each record that was used for resolution. These are the features used for resolution, such as NAME, DOB, ADDRESS, EMAIL, PHONE, and other identifiers.
- The third column shows all the other data for each record.
Underneath the records is a tree view of the entities related to this one by match level. For more on match levels, see Understanding match levels.
The ADDITIONAL_DATA column shows:
- Robert has one active and three inactive records.
- The earliest record date is 1/5/15, the latest is 1/2/18.
- The pattern of repeated inactivation and re-registration with different identifying information each year warrants investigation.
- Both his father and spouse appear on the WATCHLIST.
Resolution decisions
"DATA_SOURCE":"CUSTOMERS", "RECORD_ID":"1001" and "DATA_SOURCE":"CUSTOMERS", "RECORD_ID":"1005" have very little in common. The how command shows the resolution path for this entity.
Run how 3 to view the decision tree that determined all these records belong to the same entity.
The how decision tree view:
- Read the decision tree from the bottom up.
- Most features have one score, but names can have up to 3: first, last, and combined, in that order.
- In step 1, "DATA_SOURCE":"CUSTOMERS", "RECORD_ID":"1002" and "DATA_SOURCE":"CUSTOMERS", "RECORD_ID":"1004" came together first and created virtual entity V2-S1, which was used in step 2 to match "DATA_SOURCE":"CUSTOMERS", "RECORD_ID":"1001", and so on. In each step, new features may be learned for use in the next step. In this case, the entity gained the EMAIL used in step 2 and the PHONE used in step 3.
The how report is covered in detail in the how section below. For now, press enter to continue.
Using why
The why command explains why two entities did not resolve to the same entity. There are two possible reasons:
- The entities scored below the resolution threshold, though they may still be related.
- The entities could not find each other because they had no matching candidate keys, or all candidate keys went generic.
A why result helps explain why two records are related but not resolved to the same entity. For additional assistance, contact Senzing Support.
Run help why to view the usage details.

The previous search for Robert Smith returned three different entities. The why command explains why the first and third did not resolve to the same entity.
Feature scoring
Run why 3 16 to compare them.
The upper portion of the table shows "ENTITY_ID": 3 on the left and "ENTITY_ID": 16 on the right.
- The data sources row shows that "ENTITY_ID": 3 has 4 CUSTOMERS records and "ENTITY_ID": 16 has 1 CUSTOMERS and 1 WATCHLIST record.
- The why result shows the current MATCH_KEY and rule between the two entities.
- The MATCH_KEY shows the list of features that contributed to the match, both positively and negatively. The principle is also displayed and is the actual reason for the match. For questions about a specific principle, contact Senzing Support. For more detail, see Principle Based Entity Resolution.
- The cross relation is what is stored in the database and should always equal the why result. In rare cases the two differ, and reevaluating the entities corrects the discrepancy. If this occurs, contact Senzing Support.
- Below the header are the features for each entity, with the best scoring pair on top.
- On the NAME row:
  - Robert Smith ("ENTITY_ID": 3) was compared with Robbie Smith ("ENTITY_ID": 16), with a full name score of 97. The surname scored 100 (exact match), and the given name scored 95 (recognized nickname).
  - The [2] in brackets after Robert Smith on the left indicates 2 entities share this exact name.
  - Bob J Smith on the left is another name for "ENTITY_ID": 3. The Bob Smith and B smith names are grayed out and have a # sign in the bracket indicator because they are suppressed in favor of a more complete name.
- The DOB row is colored red because it scored 58 and detracted from the match. It is also red in the why_result above.
- On the ADDRESS row, the best matching address scored 99 and contributed to the match.
- On the remaining rows, only the entity on the left had a PHONE and only the entity on the right had a DRLIC, so there was nothing to match.
Candidate keys
Before entities can be scored, they must find each other through candidate keys. The lower portion of the why output shows the keys that placed them on a short list of candidates for comparison.

The lower portion of the why screen shows the candidate keys that were created for each entity:
- Highlighted in blue are the keys that matched.
- To keep the system fast, keys can “go generic”, which means they are no longer used to generate candidates.
- The "NAME_KEY":"RPRT|SM0" is a metaphone for Robert Smith (and also for Robbie Smith), and [3] different entities share this key.
- If there is an exclamation point in front of the number like [!120], that key is no longer being used to find candidates.
Although the NAME_KEY for Robert Smith might go generic, Senzing also creates composite keys like NAMEADDR_KEY and NAMEDATE_KEY. It is far less likely that all of these would go generic.
To learn more about how Senzing Entity Resolution works, see Entity Resolution Processes.
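The effect of a key going generic can be illustrated with a small lookup index. This is only a conceptual sketch, not Senzing's implementation: the threshold value, the `build_index` and `find_candidates` helpers, and every key value other than RPRT|SM0 are assumptions made up for the example.

```python
from collections import defaultdict

# Hypothetical cutoff: a key held by more entities than this is treated as generic.
GENERIC_THRESHOLD = 100


def build_index(entity_keys):
    """Map each candidate key to the set of entity IDs that carry it."""
    index = defaultdict(set)
    for entity_id, keys in entity_keys.items():
        for key in keys:
            index[key].add(entity_id)
    return index


def find_candidates(search_keys, index, threshold=GENERIC_THRESHOLD):
    """Return entity IDs reachable through at least one non-generic key."""
    candidates = set()
    for key in search_keys:
        holders = index.get(key, set())
        if len(holders) > threshold:
            continue  # generic key: too common to be used for candidate lookup
        candidates |= holders
    return candidates


# Illustrative data only; the composite key value is made up.
entity_keys = {
    3: {"NAME_KEY:RPRT|SM0", "NAMEADDR_KEY:example-a"},
    16: {"NAME_KEY:RPRT|SM0", "NAMEADDR_KEY:example-a"},
    200002: {"NAME_KEY:RPRT|SM0"},
}
index = build_index(entity_keys)

# With the default threshold, the name key alone reaches all three entities.
print(sorted(find_candidates({"NAME_KEY:RPRT|SM0"}, index)))

# With a threshold of 2 the name key is generic, but the composite key still finds candidates.
print(sorted(find_candidates({"NAME_KEY:RPRT|SM0", "NAMEADDR_KEY:example-a"}, index, threshold=2)))
```

The second call shows why composite keys matter: the search still reaches two entities even though the name key on its own no longer produces any candidates.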
Using how
This section covers the how command and its views in detail.
Run help how to view the usage details.

The following example uses an entity with a more complex resolution path.
Run search maria sentosa

Decision tree view
Run how 24 to see how those 5 records resolved to the same entity.

The how decision tree is the default view:
- Read the decision tree from the bottom up. The two interim entities created along the way are combined in the last step to form the final entity.
- Each step shows the scores of all compared features, along with the MATCH_KEY and the principle that was satisfied.
- Each step has one of three types:
  - “creating a virtual entity” by combining 2 records,
  - “adding a record to a virtual entity”, or
  - “combining virtual entities”.
In straightforward cases, two records create a virtual entity in step 1 and additional records are added to it. In more complex cases, two or more interim entities are created before they accumulate enough attributes to be joined. The Maria entity above follows the more complex path.
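The three step types can be told apart by what sits on each side of a step. The sketch below is a hedged illustration, not how sz_explorer classifies steps: it assumes virtual entity IDs look like V2-S1 (as in the Robert Smith example earlier), writes records as a hypothetical `DATA_SOURCE:RECORD_ID` string, and the `V3-S2` ID in the last pair is invented.

```python
def is_virtual(member_id: str) -> bool:
    """Assumption: virtual entity IDs look like 'V2-S1'; records look like 'CUSTOMERS:1002'."""
    return member_id.startswith("V") and ":" not in member_id


def step_type(left: str, right: str) -> str:
    """Classify one how step by counting the virtual entities it involves."""
    virtual_sides = sum(is_virtual(side) for side in (left, right))
    if virtual_sides == 2:
        return "combining virtual entities"
    if virtual_sides == 1:
        return "adding a record to a virtual entity"
    return "creating a virtual entity"


# The first two pairs follow the Robert Smith walkthrough; the last pair is hypothetical.
steps = [
    ("CUSTOMERS:1002", "CUSTOMERS:1004"),
    ("V2-S1", "CUSTOMERS:1001"),
    ("V2-S1", "V3-S2"),
]
for number, (left, right) in enumerate(steps, start=1):
    print(f"step {number}: {step_type(left, right)}")
```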
The how output is a series of why comparisons. Instead of showing why two entities did not match, each step shows how each record entered the entity.
Columnar view
Press C at the prompt to see the columnar view.

- Read this view from left to right.
- The first two columns show step 1.
- The NAME is highlighted in yellow because it did not score high enough for a close name match. However, the given name scored 100, producing a partial name match. This is why the MATCH_KEY starts with PNAME. Principle 110 allows a partial name match when several other important features match, including DOB, ADDRESS, and EMAIL.
- Step 1 reveals a more complete NAME and a new ADDRESS, which were used to match records in the remaining two steps.
why can be very wide. If it scrolls off the screen, use the arrow keys to scroll left and right, up and down, and press q to quit.
The columnar view shows what is learned at each step. It only shows how each record enters the entity. Steps that combine virtual entities are not included.
Summary view
Press S at the prompt to display the summary view.
The summary view provides a comprehensive overview of the entity and its resolution.

The resolution summary at the top summarizes the decision tree.
- It lists the number and type of steps required.
- It highlights steps of interest, including any low-scoring names and steps that combine virtual entities. For large entities with many steps, this section identifies the most significant ones.
- It concludes with the principles and match keys that fired.
The entity summary below shows the record count and feature breakdown.
- Of the 4 NAME values, 3 are grayed out with a [#1]. The # indicates a suppressed name. Senzing computes the most complete name and identifies which others are derivatives of it.
  - For matching, if Barry Smith and Betty Smith both have an aka of B smith, the more complete name is used even if B Smith matches exactly.
  - This information is also used in best-name calculation.
- After each feature, the number in [] indicates how many entities share that exact value. The number in blue () indicates how many records in that entity reported that value.
- Looking at ADDRESS values, the blue (3) shows that 9304 W 15th is the most common, useful for a best-address calculation. The bracketed [2] indicates another entity shares the 638 Downey St address.
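The two counters shown next to each feature can be reproduced with a simple tally. This sketch treats the bracketed number as the count of entities carrying the value, which is how the [2] examples on this page read. The `feature_usage` helper, the way the five records are spread across addresses, and the ID 999 for the second entity are all assumptions for illustration.

```python
from collections import Counter


def feature_usage(entity_id, records, feature):
    """Return {value: (records_in_entity, entities_with_value)} for one entity.

    records is a list of (entity_id, {feature_name: value}) pairs, one per record.
    """
    in_entity = Counter()
    holders = {}
    for owner, features in records:
        value = features.get(feature)
        if value is None:
            continue  # this record did not report the feature
        holders.setdefault(value, set()).add(owner)
        if owner == entity_id:
            in_entity[value] += 1
    return {value: (count, len(holders[value])) for value, count in in_entity.items()}


# Illustrative records only; 999 is a hypothetical ID for the entity sharing the address.
records = [
    (24, {"ADDRESS": "9304 W 15th"}),
    (24, {"ADDRESS": "9304 W 15th"}),
    (24, {"ADDRESS": "9304 W 15th"}),
    (24, {"ADDRESS": "638 Downey St"}),
    (24, {}),
    (999, {"ADDRESS": "638 Downey St"}),
]
usage = feature_usage(24, records, "ADDRESS")
for value, (record_count, entity_count) in usage.items():
    print(f"{value} [{entity_count}] ({record_count})")
```

The value with the highest record count is a natural candidate for a best-address calculation, while an entity count above 1 flags a value worth investigating across entities.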
Investigating shared features
Run search addr_full = 638 Downey St, Salem, OR to find entities at that address.
Then run compare search to see them side-by-side.

Maria is on the WATCHLIST and shares an ADDRESS with Susan. This overlap raises several questions:
- Whether the address was used fraudulently.
- Whether Susan is connected to the same activity.
- Whether they simply occupied the address at different times.
Senzing tracks feature usage across all entities, supporting both entity resolution and the identification of potential threats and fraud patterns.
Using why with search
The why command can also be used with search results. Refer to the previous section or run help search to review the syntax.
No keys matched
Run search barry smith to check for matches.

No entities were found, meaning no keys matched. Run why search to view the keys it generated:

The [0] indicates no entities exist with the name Barry Smith, nor any of its metaphone NAME_KEY values.
A NAME alone may not be sufficient to find a record. Adding a date of birth to the search may improve results.
Found but scored too low
Run search bubby smith | date_of_birth: 12/11/1978

No results are returned, but the message changes to “entities were found but did not score high enough”.
- Run why search 3 if the expected result was "ENTITY_ID": 3.
- Run why search if the expected entity ID is unknown.

The result shows that Bubby vs Bob J does not score high enough. The low name score indicates the search name is too distant. Searching for Bobby instead may yield better results.
Refining the search
Run search bobby smith | date_of_birth: 12/11/1978

The search returns two entities. The top result has the same DOB.
Using tree
The tree command displays relationships at multiple degrees of separation.
Run help tree to view the usage details.

The following example demonstrates the tree view using an organization entity.
Run search universal exports

The search returns 4 entities. Retrieve the Worldwide entity to examine the hierarchy.
Run get 103 to view the entity.

This entity is the global parent of the other three, and its relationships include ownership information. The get command shows a one-degree tree view. To see two degrees:
Run tree 103 degree 2

The tree also shows the principals behind Universal Exports USA. The tree command uses a single call to the Senzing SDK.
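The idea of degrees of separation can be pictured as a breadth-first walk over relationships. The sketch below is conceptual only: sz_explorer obtains the tree from the Senzing SDK rather than walking a graph like this, and the `within_degrees` helper, the relationship map, and all labels other than entity 103 are hypothetical.

```python
from collections import deque


def within_degrees(graph, start, max_degree):
    """Breadth-first walk returning {node: degree} for nodes within max_degree of start."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if seen[current] == max_degree:
            continue  # do not expand beyond the requested degree
        for neighbor in graph.get(current, ()):
            if neighbor not in seen:
                seen[neighbor] = seen[current] + 1
                queue.append(neighbor)
    return seen


# Hypothetical relationship map: 103 is the parent, with made-up labels for the rest.
graph = {
    103: ["subsidiary-a", "subsidiary-b", "subsidiary-c"],
    "subsidiary-a": [103, "principal-1", "principal-2"],
    "subsidiary-b": [103],
    "subsidiary-c": [103],
}

one_degree = within_degrees(graph, 103, 1)
two_degrees = within_degrees(graph, 103, 2)
print(len(one_degree) - 1, "related at one degree")
print(len(two_degrees) - 1, "related within two degrees")
```

Raising the degree from 1 to 2 is what brings the principals behind a subsidiary into view.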
Using show_last_call
Run show_last_call to see the SDK calls made by the last command.

The last command used the find_network_by_entity_id SDK call. For full API documentation, see https://www.senzing.com/docs/.
Using export
The export command extracts the original JSON records that make up an entity. Exported records can be loaded into a test system for further debugging, or attached to a Senzing support ticket for investigation.
Run help export to view the usage details.

Run export 3, 16 to /tmp/export.jsonl.

The exported file contains the original JSON records for both entities, suitable for loading into another system for testing or debugging.
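Before loading an export elsewhere or attaching it to a ticket, it can help to check what the file contains. This is a minimal sketch that assumes each line of the file is one JSON record carrying DATA_SOURCE and RECORD_ID keys. The `summarize_export` helper and the sample records written to the temporary file are made up for the demonstration.

```python
import json
import os
import tempfile
from collections import Counter


def summarize_export(path):
    """Count records per DATA_SOURCE in a JSON Lines file, skipping blank lines."""
    counts = Counter()
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            counts[record.get("DATA_SOURCE", "UNKNOWN")] += 1
    return counts


# Write a small made-up sample so the sketch runs without a real export.
sample = [
    {"DATA_SOURCE": "CUSTOMERS", "RECORD_ID": "1001"},
    {"DATA_SOURCE": "CUSTOMERS", "RECORD_ID": "1002"},
    {"DATA_SOURCE": "WATCHLIST", "RECORD_ID": "example-1"},
]
with tempfile.NamedTemporaryFile("w", suffix=".jsonl", delete=False, encoding="utf-8") as tmp:
    for record in sample:
        tmp.write(json.dumps(record) + "\n")
    sample_path = tmp.name

summary = summarize_export(sample_path)
os.remove(sample_path)
for data_source, count in sorted(summary.items()):
    print(data_source, count)
```

For a real export, pass the path given to the export command, such as /tmp/export.jsonl, to `summarize_export`.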
The export command is also used for building truth sets. Because the best truth sets are based on real data, complex examples of entities that matched or did not match can be exported as truth set records. See How to create an entity resolution truth set.
Next steps
If you have any questions, contact Senzing Support. Support is 100% FREE!