Data requirements

This topic covers the input data requirements for NLG Apps and Arria Answers. If you get unexpected results, please check that your data meets the requirements stated below.

The topic is organized as follows:

Note

The topic assumes the data is being uploaded in tabular form (either a CSV or XLSX file).

Measures

A measure is a numeric, quantitative field in your data, such as Sales, Profit, or Temperature. Measures in the sample dataset include Sales and Profit.

The requirements for measures data are:

  • Decimal numbers must use a period for the decimal separator.

  • If using thousand grouping separators, they must be commas.

  • Aggregation type must be one of the following: Don't summarize, Sum, Average, Minimum, Maximum, or Count*.

Tip

*Arria Answers currently supports only the Don't summarize, Sum, Average, Minimum, and Maximum aggregation types.

See Query types for guidance on which aggregation types are required for each type of query.

Tip

Not all aggregation and entity type combinations are supported by every NLG app. For details of the combinations supported by a particular app, see the relevant page in the NLG Apps Directory.

 

Top

Dimensions

A dimension is a categorical, qualitative field in your data, such as Segment or Product. Dimensions in the sample dataset include Country and Product.

The values in a dimension field are known as dimension instances. For example, Canada and Germany may be instances of the Country dimension.

There are no formal restrictions for categorical dimensions. The values will typically be textual or alphanumeric rather than purely numeric; however, purely numeric outputs are allowed.

In Arria for Power BI, the term time dimension refers to a special type of dimension that contains date values. Time dimensions are used to analyze your data by Day, Month, Quarter, or Year. The data requirements for time dimensions are described below.

 

Top

Time dimensions

A time dimension is a special type of dimension that contains date values. You can use time dimensions to analyze your data by year, quarter, month, or day.

This section provides guidance on valid field names and data values when using time dimensions with NLG Apps and Arria Answers.

Tip

When Arria for Power BI identifies a dimension as a time dimension, the NLG Apps Entity Type field is automatically set to DateTime. You can set the entity type manually if required.

 

There are two ways to store dates in your raw data:

Date columns

Date columns in your raw data can be used in two ways: as a Power BI date hierarchy, or as a single date field.

Power BI date hierarchy

Power BI automatically creates a date hierarchy from a single date column when you upload your data. As shown below, this creates fields for Year, Quarter, Month, and Day.

arria-date-columns.png
powerbi-arria-date-hierarchy.png

Important

Each field in the date hierarchy is counted as a single time dimension. This is important to know when considering the minimum number of time dimensions required by each NLG app. See the relevant Data Requirements section for each app for further information (NLG Apps directory).

 

Supported field names

Arria for Power BI supports the field names created by Power BI: Day, Month, Quarter, and Year.

 

Single date field

If you prefer to use a single date field instead of a hierarchy of date fields, you can disable the Power BI Auto date/time option that automatically creates date hierarchies.

 

Supported field names

A single field containing date values must be named as follows:

Naming Rule

Example Field Names

*date*

date, Date, DATE, Sales_Date, SALE DateTime

The asterisks (*) are wildcards that represent one or more spaces, hyphens, or underscores. These rules are case-insensitive.

Tip

You can use the Alias field in NLG Apps to override date field names and comply with date field naming rules.

 

Supported date formats

The following date formats are supported:

Pattern

Example Value

yyyy-mm-dd

2020-07-31

mm/dd/yyyy

07/31/2020

mm-dd-yyyy

07-31-2020

mmmm dd, yyyy

July 31, 2020

dd-mm-yyyy

31-07-2020

dd mmmm yyyy

31 July 2020

 

Date components across multiple columns

An alternative to a single date column is to spread your date components (year, quarter, month, day) across multiple columns.

Power BI creates a field for each of these columns in your raw data, as shown below.

qlik-arria-separate-date-columns.png
powerbi-arria-date-columns.png

NLG Apps and Arria Answers use the following date components:

Years

Supported field names

A single field containing year values must be named as follows:

Naming Rule

Example Field Names

y

y, Y

*yr*

yr, YR, Yr_Name, YR of Sales

*year*

year, YEAR, Year_Name, Year of Sales

The asterisks (*) are wildcards that represent one or more spaces, hyphens, or underscores. These rules are case-insensitive.

Tip

You can use the Alias field in Step 1 of the NLG Apps wizard to override date field names and comply with date field naming rules.

 

Supported values

A year value must be a two-digit or four-digit number (for example, "22" or "2022").

Important

Set the resulting field's Summarization property to Don't summarize. If you don't, Power BI will treat the field as a measure.

 

Top

Quarters

Supported field names

A single field containing quarter values must be named as follows:

Naming Rule

Example Field Names

q

q, Q

*qt*

qt, QT, Qt_Name, target qt

*qtr*

qtr, QTR, Qtr_Name, target qtr

*quarter*

quarter, QUARTER, Quarter_Name, target quarter

The asterisks (*) are wildcards that represent one or more spaces, hyphens, or underscores. These rules are case-insensitive.

Tip

You can use the Alias field in Step 1 of the NLG Apps wizard to override date field names and comply with date field naming rules.

 

Supported values

A quarter value can be given in alphanumeric or numeric form:

Quarter

Example Alphanumeric Values

Example Numeric Values

Quarter 1

Q1, Qt1, Qtr 1, Quarter 1

1

Quarter 2

Q2, Qt2, Qtr 2, Quarter 2

2

Quarter 3

Q3, Qt3, Qtr 3, Quarter 3

3

Quarter 4

Q4, Qt4, Qtr 4, Quarter 4

4

Any combination of the strings "Q", "Qt", "Qtr", and "Quarter" plus a number, with or without spaces, is acceptable for alphanumeric values. Therefore, "Q4", "Q 4", and "4Q" are equally valid. Also, any mixture of cases is acceptable. Therefore, "QUARTER 1" and "qUARTER 1" (to give two alternatives) are valid alternatives to "Quarter 1".

Important

Set the resulting field's Summarization property to Don't summarize. If you don't, Power BI will treat the field as a measure.

 

Top

Months

Supported field names

A single field containing month values must be named as follows:

Naming Rule

Example Field Names

m

m or M

*mon*

mon, MON, Mon_Name, payment mon

*mth*

mth, MTH, Mth_Name, payment mth

*month*

month, MONTH, Month_Name, payment month

The asterisks (*) are wildcards that represent one or more spaces, hyphens, or underscores. These rules are case-insensitive.

Tip

You can use the Alias field in Step 1 of the NLG Apps wizard to override date field names and comply with date field naming rules.

 

Supported values

A month value can be given in text or numeric form:

Month

Example Text Values

Example Numeric Values

January

January, Jan

01, 1

February

February, Feb

02, 2

March

March, Mar

03, 3

April

April, Apr

04, 4

May

May

05, 5

June

June, Jun

06, 6

July

July, Jul

07, 7

August

August, Aug

08, 8

September

September, Sep

09, 9

October

October, Oct

10

November

November, Nov

11

December

December, Dec

12

This requirement is case-insensitive, so "JAN", "JAn" and "jan" are valid alternatives to "Jan".

Important

Set the resulting field's Summarization property to Don't summarize. If you don't, Power BI will treat the field as a measure.

 

Top

Days

Supported field names

A single field containing day values must be named as follows:

Naming Rule

d

d and D

*day*

day, Day, DAY, day_NAME, Day Name

The asterisks (*) are wildcards that represent one or more spaces, hyphens, or underscores. These rules are case-insensitive.

Tip

You can use the Alias field in Step 1 of the NLG Apps wizard to override date field names and comply with date field naming rules.

 

Supported values

A day value must be a one-digit or two-digit number (e.g. 5 or 05). Ordinal numbers (e.g. 5th) are invalid.

Important

Set the resulting field's Summarization property to Don't summarize. If you don't, Power BI will treat the field as a measure.

 

Top