Questions? We have answers.

Data Validation: Overview

  • Updated

Summary

The Data Validation step of the file import process provides an overview of the file contents and high-level details about the data's quality. The different sections on this step can help you determine whether to proceed with the import or reassess the file for accuracy and completeness.

Note: If the data appears in a grouped format, MindBridge will not be able to analyze it. Select Format data to automatically convert the data into an ungrouped format, then proceed with the import process.
Note: MindBridge will only analyze data on the first sheet of the file. If you have relevant data across multiple sheets, use the split and merge tools in the file manager ( File manager icon ) to extract and combine sheets without affecting the original file.

File details

This pane provides information on key attributes that can help you verify whether or not the imported data aligns with your expectations.

  • Current period (or Prior period): The start date and end date for the given period.
  • Type: The intended file type (e.g., General ledger, or Opening balance, etc.), based on the slot that the file was imported into on the Data page.
  • Name: The name of the file that was imported.
  • Total number of columns: The total count of columns detected within the file.
  • Number of rows: The total count of rows detected within the file.
  • Number of numeric columns: The total count of columns that contain only numeric data.
  • Number of date columns: The total count of columns that contain dates.
  • Number of text columns: The total count of columns that contain text, which may include a combination of letters, numbers, and symbols.
  • Delimiter: Displays the character used to indicate where columns begin and end throughout the dataset. Also allows you to edit the delimiter.

Screenshot highlighting the file details pane

Edit the delimiter

  1. Select Edit to update the delimiter. A selection window will appear.
  2. Use the menu to select the delimiter MindBridge should look for or enter a custom delimiter in the text field.
    Use the Preview data columns section to ensure the data is being read properly.
  3. When you are satisfied, Confirm the change.
    The file being imported is automatically re-validated.


Integrity checks

These checks include information about different aspects of the data's quality. They may indicate issues with the file, but will not prevent you from continuing the import process.

Screenshot highlighting the integrity checks pane

What MindBridge looks for

Why?

The presence of numeric columns.

The data must contain at least 1 column with only numeric values, or MindBridge will not have enough data to perform an analysis.

Note: If MindBridge detects a non-numeric character in a numeric column, you will not be able to proceed. To resolve the issue, remove non-numeric characters from the numeric column and re-import the file.

Learn about numeric columns

Whether the file contains multiple sheets.

MindBridge will only analyze data from the first sheet in the file. Consolidate the relevant data into the first sheet, then re-import the data.

If you need assistance with getting the relevant data onto the first sheet of the file, learn about submitting a data formatting request.

An inconsistent number of columns across the rows.

If 700 rows have 6 columns, and 1 row has 5 columns, it could indicate missing or erroneous information in the data.

Learn about column inconsistency

The presence of abnormally large cells.

This could indicate erroneous data, merged cells, or improper escaping.

Learn about abnormally large cells

Whether any columns contain over 50% blank cells.

This could indicate that there is data missing from the file.

The presence of NULL values in the data.

MindBridge is unable to analyze cells that contain the word "NULL" instead of a value.

Remove the word “NULL” from the data, or replace it with the appropriate data.

Whether values are expressed in scientific notations.

Scientific notations often appear as a number below 10 with 2 decimal points and a power of 10, so the calculation may be less precise than the actual data.

For example, the scientific notation of 4,354,765,455 is 4.35E9. MindBridge would calculate this as 4.35 x 10^9 = 4,350,000, which is much less than the original value.

MindBridge can read scientific notations that appear in monetary columns in both Excel and .CSV files.

In Excel, scientific notations appear for presentation purposes. This means that Excel treats scientific notations found in monetary columns* as numerals.

In .CSV files, scientific notations are taken as presented, regardless of which column they appear in. This means that although MindBridge can ingest them without issue, when they are present in a .CSV file, this validation warning appears.

*Note: In Excel, scientific notations found in non-numeric columns are not readable by MindBridge.

The presence of special characters.

MindBridge cannot analyze some special characters, including extended character sets (such as copyright symbols), non-printable and control characters (such as formatting marks).

Inconsistent, ambiguous, or unsupported date formats.

Dates appearing in the same column should be formatted consistently and correctly.

Learn about supported date formats

 


Column preview

These statistics can help verify your expectations at a glance, and may assist you in finding anomalous data right away.

Use the chevron icon ( Expand chevron icon ) to expand the section and see a preview of your columns.

GIF showing how to access the column preview.gif

Column name

Description and value

Name

The column header as it appears in the original file.

Refer to this when viewing the statistics in each row.

Data type

Indicates the type of data detected in each column of the original file. These may be:

  • Dates (such as 01/19/2020)
  • Numeric (numbers only)
  • Text (a combination of letters, numbers, and special characters)

Some columns are expected to have certain data types. For example, if MindBridge detects text in the "Entered date" column, it may be indicative of erroneous data.

Lowest value

The lowest numeric value found in a cell within each column.

If the value falls outside of the expected range, it may be indicative of anomalous data.

Highest value

The highest numeric value found in a cell within each column.

If the value falls outside of the expected range, it may be indicative of anomalous data.

# of unique values

An estimate of the number of unique values in each column.

Some columns are expected to have certain values. For example, if MindBridge detects that the "Entered date" has 365 unique values, but the "Effective date" column has 400, it may be indicative of additional out of period data.

Sum

The total amount, resulting from the summation of all numeric values in each column.

Refer to this to check whether debits and credits balance.

Lowest # of characters

The lowest number of characters found in a cell within each column.

Refer to this column to verify your expectations. For example, if MindBridge detects a cell with only 2 characters in the "Debit" column, it may be indicative of erroneous or missing data.

Highest # of characters

The highest number of characters found in a cell within each column.

Refer to this column to verify your expectations. For example, if MindBridge detects a cell with 10,000 characters in the "Debit" column, it may be indicative of erroneous data, or improper escaping.

# of blank cells

The number of blank cells that appear in each column.

If you notice that a required column such as Debit contains blank cells, it may be indicative of missing data.

 


Anything else on your mind? Chat with us or submit a request for further assistance.


Related articles

Was this article helpful?