From 7af3e752da6e94d79d134b88452b96764bd57b78 Mon Sep 17 00:00:00 2001
From: BenHowland <ben.howland@ons.gov.uk>
Date: Thu, 11 Apr 2024 12:01:30 +0100
Subject: [PATCH 1/8] Code-lists.md feedback Fixes #35 Adding time, area, age
 and sex

---
 code-lists.md | 76 +++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)
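
Reviewer note: the `period_type` and `period_code` pairs added below could be checked mechanically. The following is a minimal sketch only. The patterns are inferred from the example rows in this patch, and the names `PERIOD_CODE_PATTERNS` and `is_valid_period_code` are hypothetical, not part of the application profile.

```python
import re

# Hypothetical patterns, inferred only from the example rows in this patch.
PERIOD_CODE_PATTERNS = {
    "day": re.compile(r"\d{4}-\d{2}-\d{2}"),
    "month": re.compile(r"\d{4}-\d{2}"),
    "quarter": re.compile(r"\d{4}-Q[1-4]"),
    "year": re.compile(r"\d{4}"),
    "government-year": re.compile(r"\d{4}-\d{4}"),
    # Start timestamp, a slash, then a duration such as P3M or P1Y.
    "gregorian-interval": re.compile(
        r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}/P\d+[YMD]"
    ),
}


def is_valid_period_code(period_type: str, period_code: str) -> bool:
    """Return True if period_code has the shape shown for period_type."""
    pattern = PERIOD_CODE_PATTERNS.get(period_type)
    if pattern is None:
        # Unknown period types are rejected rather than guessed at.
        return False
    return pattern.fullmatch(period_code) is not None
```

The check is purely syntactic: it would accept `2020-13` as a month, so a stricter version would also need to parse the dates.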

diff --git a/code-lists.md b/code-lists.md
index 14552c4..db50147 100644
--- a/code-lists.md
+++ b/code-lists.md
@@ -19,6 +19,10 @@
     - [Analysis function guidance on symbols and shorthand in tables](#analysis-function-guidance-on-symbols-and-shorthand-in-tables)
     - [Themes](#themes)
     - [Media types](#media-types)
+    - [Time periods](#time-periods)
+    - [Area code, label and type](#area-code-label-and-type)
+    - [Age code and label](#age-code-and-label)
+    - [Sex code and label](#sex-code-and-label)
 
 ## Codelists
 
@@ -513,3 +517,75 @@ Data providers should adopt the [analytical function guidance](https://analysisf
 | CSV    | `http://www.w3.org/ns/iana/media-types/text/csv#Resource`         |
 | JSON   | `http://www.w3.org/ns/iana/media-types/application/json#Resource` |
 | Turtle | `http://www.w3.org/ns/iana/media-types/text/turtle#Resource`      |
+
+
+### Time periods
+
+There are a varieety of different ways that time can be represented in your data. Below are some examples:
+
+| period_type        | period_code             | period_label |
+| ------------------ | ----------------------- | ------------ |
+| gregorian-interval | 2001-04-01 00:00:00/P3M | Apr-Jun 2001 |
+
+Gregorian interval can be used if the time frame of your data does not conform to a standard time frame. This can be used for monthly, quarterly and yearly data. You need to enter the start date of when your dataset starts. Using the example above it is the 1st April 2001. The P3M refers to how much time has been captured. Using the example it is 3 months. You can add P1Y for yearly data to show the data is being captured for a year period.
+
+| period_type | period_code | period_label |
+| ----------- | ----------- | ------------ |
+| month       | 2020-01     | January-2020 |
+
+For monthly data covering a calendar month, the `period_type` must be `month`. The `period_code` is the four-digit year followed by the two-digit month number. The `period_label` is intended to be human readable, so it shows the month's full name and the year.
+
+| period_type | period_code | period_label |
+| ----------- | ----------- | ------------ |
+| quarter     | 2020-Q1     | 2020-Q1      |
+
+For quarterly data covering a calendar quarter, the `period_type` must be `quarter`. The `period_code` and `period_label` are identical: the year followed by the quarter.
+
+| period_type | period_code | period_label |
+| ----------- | ----------- | ------------ |
+| year        | 2020        | 2020         |
+
+For calendar year data, the `period_type` must be `year`. The `period_code` and `period_label` are identical: just the year.
+
+| period_type     | period_code | period_label |
+| --------------- | ----------- | ------------ |
+| government-year | 2020-2021   | 2020-2021    |
+
+For government year which starts in April we require the `period_type` to be government-year. In the `period_code` and `period_label` we require the field to be the same. The year the period starts and the period where it ends.
+
+| period_type | period_code | period_label     |
+| ----------- | ----------- | ---------------- |
+| day         | 1999-12-31  | 31-December-1999 |
+
+For calendar day data we require the `period_type` to be day. In the `period_code` we require the year, the month followed by the day. For `period_label` we require the field to be the day, the month written fully and then the year. This will help with human readability.
+
+### Area code, label and type
+
+| area_code | area_label     | area_type                         |
+| --------- | -------------- | --------------------------------- |
+| K02000001 | United Kingdom | Country                           |
+| E92000001 | England        | Nation                            |
+| E12000001 | North East     | Region                            |
+| E06000047 | County Durham  | County or Unitary Authority       |
+| E08000037 | Gateshead      | Local Authority District          |
+| E47000006 | Tees Valley    | Combined Authority or City Region |
+
+### Age code and label
+
+| age_code | age_label              |
+| -------- | ---------------------- |
+| Y_GE16   | Aged 16 years and over |
+| Y16T24   | Aged 16 to 24          |
+| Y25T34   | Aged 25 to 34          |
+| Y35T44   | Aged 35 to 44          |
+| Y45T54   | Aged 45 to 54          |
+| Y55T74   | Aged 55 to 74          |
+| Y_GE75   | Aged 75 and over       |
+
+
+### Sex code and label
+
+| sex_code | sex_label |
+| -------- | --------- |
+| F        | Female    |
+| M        | Male      |

From b7f48fb0ebb775c123d52d0391e2e12806f42b29 Mon Sep 17 00:00:00 2001
From: BenHowland <ben.howland@ons.gov.uk>
Date: Fri, 12 Apr 2024 13:11:12 +0100
Subject: [PATCH 2/8] Code-lists.md feedback Fixes #35 adding of time, area,
 age and sex

---
 code-lists.md | 36 ++++++++++++++++++++++++------------
 1 file changed, 24 insertions(+), 12 deletions(-)

diff --git a/code-lists.md b/code-lists.md
index db50147..6cc0bbb 100644
--- a/code-lists.md
+++ b/code-lists.md
@@ -19,7 +19,8 @@
     - [Analysis function guidance on symbols and shorthand in tables](#analysis-function-guidance-on-symbols-and-shorthand-in-tables)
     - [Themes](#themes)
     - [Media types](#media-types)
-    - [Time periods](#time-periods)
+  - [Reusable concepts in a CSV](#reusable-concepts-in-a-csv)
+    - [Periods of time](#periods-of-time)
     - [Area code, label and type](#area-code-label-and-type)
     - [Age code and label](#age-code-and-label)
     - [Sex code and label](#sex-code-and-label)
@@ -97,8 +98,8 @@ For example:
 | Property           | Requirement level | Notes                                                                     |
 | ------------------ | ----------------- | ------------------------------------------------------------------------- |
 | `skos:inScheme`    | mandatory         | See [codelists](#codelists)                                               |
-| `rdfs:label`       | mandatory         | See [titles](style.md#titles)                                                     |
-| `skos:prefLabel`   | mandatory         | See [titles](style.md#titles)                                                     |
+| `rdfs:label`       | mandatory         | See [titles](style.md#titles)                                             |
+| `skos:prefLabel`   | mandatory         | See [titles](style.md#titles)                                             |
 | `skos:notation`    | mandatory         |                                                                           |
 | `skos:broader`     | recommended       | See [hierarchical codelists](#hierarchical-codelists)                     |
 | `skos:narrower`    | recommended       | See [hierarchical codelists](#hierarchical-codelists)                     |
@@ -328,7 +329,7 @@ Statisticians may wish to report statistics against multiple classifications. Do
 
 For example, consider a dataset which mixes codes from the NUTS geography codelist with codes from the ONS geography codelist.
 
-| geography | geography_label     | value |
+| area_code | area_label          | value |
 | --------- | ------------------- | ----- |
 | UKC       | North East, England | ...   |
 | UKD       | North West, England | ...   |
@@ -336,7 +337,7 @@ For example, consider a dataset which mixes codes from the NUTS geography codeli
 
 The NUTS codes have IRIs which are maintained by Eurostat, such as `http://data.europa.eu/nuts/code/UKC`, whereas the ONS geography codes are maintained by the ONS at the `http://statistics.data.gov.uk/id/statistical-geography/E92000001` namespace.
 
-We map the cells of the dataset to RDF by using the `valueUrl` CSVW property. Only a single `valueUrl` can be applied to all the cells in a column. This is problematic, as the IRIs we wish to map to have different bases. Setting `valueUrl` to `http://data.europa.eu/nuts/code/{geography}` would result in a non-existant identifier `http://data.europa.eu/nuts/code/E92000001` appearing in the RDF output.
+We map the cells of the dataset to RDF by using the `valueUrl` CSVW property. Only a single `valueUrl` can be applied to all the cells in a column. This is problematic, as the IRIs we wish to map to have different bases. Setting `valueUrl` to `http://data.europa.eu/nuts/code/{area_code}` would result in a non-existant identifier `http://data.europa.eu/nuts/code/E92000001` appearing in the RDF output.
 
 We address this by creating new identifiers for each of the codes under a shared namespace, and using `skos:exactMatch` relations to relate these new identifiers to the more commonly used identifiers. For example,
 
@@ -512,14 +513,15 @@ Data providers should adopt the [analytical function guidance](https://analysisf
 
 > TODO: Cover media types from [IANA](https://www.w3.org/ns/iana/media-types/)
 
-| Label  | IRI                                                                |
-| ------ | ------------------------------------------------------------------ |
+| Label  | IRI                                                               |
+| ------ | ----------------------------------------------------------------- |
 | CSV    | `http://www.w3.org/ns/iana/media-types/text/csv#Resource`         |
 | JSON   | `http://www.w3.org/ns/iana/media-types/application/json#Resource` |
 | Turtle | `http://www.w3.org/ns/iana/media-types/text/turtle#Resource`      |
 
+## Reusable concepts in a CSV
 
-### Time periods
+### Periods of time
 
 There are a varieety of different ways that time can be represented in your data. Below are some examples:
 
@@ -570,6 +572,8 @@ For calendar day data we require the `period_type` to be day. In the `period_cod
 | E08000037 | Gateshead      | Local Authority District          |
 | E47000006 | Tees Valley    | Combined Authority or City Region |
 
+The table above shows the variety of area types that can be represented in your data. The important thing is that in the area code column each entry has its own identifiable code.
+
 ### Age code and label
 
 | age_code | age_label              |
@@ -582,10 +586,18 @@ For calendar day data we require the `period_type` to be day. In the `period_cod
 | Y55T74   | Aged 55 to 74          |
 | Y_GE75   | Aged 75 and over       |
 
+The examples in the table above show the best way to represent different age categories.
 
 ### Sex code and label
 
-| sex_code | sex_label |
-| -------- | --------- |
-| F        | Female    |
-| M        | Male      |
+| sex_code | sex_label      |
+| -------- | -------------- |
+| F        | Female         |
+| M        | Male           |
+| _N       | Non response   |
+| _O       | Other          |
+| -U       | Unknown        |
+| _Z       | Not applicable |
+
+The examples in the table above show the best way to represent different sex categories.
+

From cb5439c6043f0691aed9b9f7d6c591505acd5b74 Mon Sep 17 00:00:00 2001
From: BenHowland <ben.howland@ons.gov.uk>
Date: Fri, 24 May 2024 13:25:32 +0100
Subject: [PATCH 3/8] Code-lists.md feedback Fixes #35

---
 code-lists.md | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
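
Reviewer note: the age codes in the table this patch annotates follow two visible shapes, `Y<lower>T<upper>` and `Y_GE<lower>`. The sketch below shows how a consumer might turn them into bounds. It assumes only those two shapes exist, which the patch does not state, and `parse_age_code` is a hypothetical helper name.

```python
import re
from typing import Optional, Tuple

# Shapes assumed from the example table only: Y16T24 and Y_GE75.
AGE_RANGE = re.compile(r"Y(\d+)T(\d+)")
AGE_AT_LEAST = re.compile(r"Y_GE(\d+)")


def parse_age_code(code: str) -> Optional[Tuple[int, Optional[int]]]:
    """Return (lower, upper) bounds in years; upper is None when open-ended.

    Returns None for codes that match neither assumed shape.
    """
    match = AGE_RANGE.fullmatch(code)
    if match:
        return int(match.group(1)), int(match.group(2))
    match = AGE_AT_LEAST.fullmatch(code)
    if match:
        return int(match.group(1)), None
    return None
```

Returning `None` for unrecognised codes keeps the helper from silently inventing bounds for codes outside the examples.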

diff --git a/code-lists.md b/code-lists.md
index 6cc0bbb..ef5a583 100644
--- a/code-lists.md
+++ b/code-lists.md
@@ -586,7 +586,7 @@ The table above shows the variety of area types that can be represented in your
 | Y55T74   | Aged 55 to 74          |
 | Y_GE75   | Aged 75 and over       |
 
-The examples in the table above show the best way to represent different age categories.
+The examples in the table above show the best way to represent different age categories. his has come from the Statistical Data and Metadata eXchange (SDMX) guidelines [^machine]
 
 ### Sex code and label
 
@@ -599,5 +599,9 @@ The examples in the table above show the best way to represent different age cat
 | -U       | Unknown        |
 | _Z       | Not applicable |
 
-The examples in the table above show the best way to represent different sex categories.
+The examples in the table above show the best way to represent different sex categories. This has come from the Statistical Data and Metadata eXchange (SDMX) guidelines [^machine]
+
+
+[^machine]: <https://sdmx.org/?page_id=3215>
+[^machine]: <https://sdmx.org/?page_id=3215>
 

From 86846ce66a414bb5f1876c7dac0fa37317b67087 Mon Sep 17 00:00:00 2001
From: BenHowland <ben.howland@ons.gov.uk>
Date: Fri, 24 May 2024 13:28:37 +0100
Subject: [PATCH 4/8] updated footnotes

---
 code-lists.md | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/code-lists.md b/code-lists.md
index ef5a583..87821f9 100644
--- a/code-lists.md
+++ b/code-lists.md
@@ -586,7 +586,7 @@ The table above shows the variety of area types that can be represented in your
 | Y55T74   | Aged 55 to 74          |
 | Y_GE75   | Aged 75 and over       |
 
-The examples in the table above show the best way to represent different age categories. his has come from the Statistical Data and Metadata eXchange (SDMX) guidelines [^machine]
+The examples in the table above show the best way to represent different age categories. This has come from the Statistical Data and Metadata eXchange (SDMX) guidelines [^1]
 
 ### Sex code and label
 
@@ -599,9 +599,9 @@ The examples in the table above show the best way to represent different age cat
 | -U       | Unknown        |
 | _Z       | Not applicable |
 
-The examples in the table above show the best way to represent different sex categories. This has come from the Statistical Data and Metadata eXchange (SDMX) guidelines [^machine]
+The examples in the table above show the best way to represent different sex categories. This has come from the Statistical Data and Metadata eXchange (SDMX) guidelines [^2]
 
 
-[^machine]: <https://sdmx.org/?page_id=3215>
-[^machine]: <https://sdmx.org/?page_id=3215>
+[^1]: <https://sdmx.org/?page_id=3215>
+[^2]: <https://sdmx.org/?page_id=3215>
 

From fbf55fc3312acc46de504274543ebc2f941a6c67 Mon Sep 17 00:00:00 2001
From: BenHowland <ben.howland@ons.gov.uk>
Date: Fri, 24 May 2024 13:31:07 +0100
Subject: [PATCH 5/8] altered the time period layout

---
 code-lists.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/code-lists.md b/code-lists.md
index 87821f9..b7a9243 100644
--- a/code-lists.md
+++ b/code-lists.md
@@ -525,11 +525,11 @@ Data providers should adopt the [analytical function guidance](https://analysisf
 
 There are a varieety of different ways that time can be represented in your data. Below are some examples:
 
-| period_type        | period_code             | period_label |
-| ------------------ | ----------------------- | ------------ |
-| gregorian-interval | 2001-04-01 00:00:00/P3M | Apr-Jun 2001 |
+| period_type | period_code | period_label     |
+| ----------- | ----------- | ---------------- |
+| day         | 1999-12-31  | 31-December-1999 |
 
-Gregorian interval can be used if the time frame of your data does not conform to a standard time frame. This can be used for monthly, quarterly and yearly data. You need to enter the start date of when your dataset starts. Using the example above it is the 1st April 2001. The P3M refers to how much time has been captured. Using the example it is 3 months. You can add P1Y for yearly data to show the data is being captured for a year period.
+For calendar day data, the `period_type` must be `day`. The `period_code` is the year, then the month, then the day. The `period_label` is the day, the month written in full, and then the year, which helps human readability.
 
 | period_type | period_code | period_label |
 | ----------- | ----------- | ------------ |
@@ -555,11 +555,11 @@ For calendar year data we require the `period_type` to be year. In the `period_c
 
 For government year which starts in April we require the `period_type` to be government-year. In the `period_code` and `period_label` we require the field to be the same. The year the period starts and the period where it ends.
 
-| period_type | period_code | period_label     |
-| ----------- | ----------- | ---------------- |
-| day         | 1999-12-31  | 31-December-1999 |
+| period_type        | period_code             | period_label |
+| ------------------ | ----------------------- | ------------ |
+| gregorian-interval | 2001-04-01 00:00:00/P3M | Apr-Jun 2001 |
 
-For calendar day data we require the `period_type` to be day. In the `period_code` we require the year, the month followed by the day. For `period_label` we require the field to be the day, the month written fully and then the year. This will help with human readability.
+A Gregorian interval can be used when the time frame of your data does not match one of the standard period types. It can be used for monthly, quarterly and yearly data. Enter the start date of the period covered; in the example above this is 1 April 2001. The `P3M` suffix states how much time is covered, in this case three months. Use `P1Y` for yearly data to show that the period covers one year.
 
 ### Area code, label and type
 

From 26b24e0a808dbc8758abbb4e34f797c483786af4 Mon Sep 17 00:00:00 2001
From: BenHowland <ben.howland@ons.gov.uk>
Date: Wed, 29 May 2024 10:19:33 +0100
Subject: [PATCH 6/8] reordering of types in csv.md

---
 csv.md | 35 +++++++++++++++++++----------------
 1 file changed, 19 insertions(+), 16 deletions(-)

diff --git a/csv.md b/csv.md
index af95143..1b7322d 100644
--- a/csv.md
+++ b/csv.md
@@ -43,12 +43,6 @@ TODO: Provide an example of a geography code, geography label, and geography typ
 
 There are five types of columns, and each CSV file should contain at least two of them.
 
-#### Observation
-
-Observation columns must only contain numbers. Suppressed or missing values must be left blank. If a value is suppressed, there should be a related column explaining the suppressed value. This is referred to as an observation status column (i.e. a special kind of attribute column called "observation status"). When dealing with whole numbers (i.e. counts of people) and where there isn't scaling (i.e. thousands, millions, etc.), the number should be expressed as an integer. When dealing with decimal numbers (i.e. percentages, indexes, scaled currency counts), the number should be expressed as a decimal number.
-
-**Note:** Try and keep the same number of decimal places for all values in a given column. This will make it easier to read, and not imply false precision.
-
 #### Dimension
 
 Dimension columns (otherwise known as factors or concepts) are used to identify the observation through a combination of concepts. Where each dimension in a CSV is filtered to a specific value there should only be one observation. In relational databases terminology all dimensions combine to a composite key. Some examples of dimensions are:
@@ -63,7 +57,7 @@ A quick way to check if a column only contains related data and unique identifia
 
 For example the three columns prefixed with `area_` are related in the table below, filtering on any two of the three would only ever result in one value for the remaining column. The area_code column is a unique identifier for each geography.
 
-| area_code | area_label        | Area_type              | value | ... |
+| area_code | area_label        | area_type              | value | ... |
 | --------- | ----------------- | ---------------------- | ----- | --- |
 | E08000006 | Salford           | Metropolitan Districts | 42    | ... |
 | E92000001 | England           | Country                | 1337  | ... |
@@ -71,17 +65,11 @@ For example the three columns prefixed with `area_` are related in the table bel
 
 **Note:** Dimension columns must contain values for every row in the CSV file and not be blank (i.e. they must be dense)
 
-#### Attributes
-
-Attribute columns are used to qualify the observation. Most commonly the attribute columns are used to describe the absence or quality of an observation, these are commonly called "observation status" columns. There are two types of attribute columns, literal and resource columns.
-
-##### Literal attributes
-
-Literal attributes are used to describe the observation. When providing point estimates, often there are additional values which help provide context. For example, when providing a point estimate for the number of people in a given area, there may be a confidence interval (of which there are two values, the upper and lower bounds), a sample size, and a standard deviation. These values are all literal attributes.
+#### Observation
 
-##### Observation status columns
+Observation columns must only contain numbers. Suppressed or missing values must be left blank. If a value is suppressed, there should be a related column explaining the suppressed value. This is referred to as an observation status column (i.e. a special kind of attribute column called "observation status"). When dealing with whole numbers (i.e. counts of people) and where there isn't scaling (i.e. thousands, millions, etc.), the number should be expressed as an integer. When dealing with decimal numbers (i.e. percentages, indexes, scaled currency counts), the number should be expressed as a decimal number.
 
-When creating observation status columns a naming convention helps users understand how they relate to the observation to the qualification. In this case for a given column name containing observations, the observation status column should have the same name as the observation column plus `_status` as a suffix. For example an observation column called `observation` should have a corresponding observation status called `observation_status`.
+**Note:** Try and keep the same number of decimal places for all values in a given column. This will make it easier to read, and not imply false precision.
 
 #### Measure columns
 
@@ -119,6 +107,18 @@ When scaling units take the base unit and suffix the multiplication factor prece
 - `ratio_0.001` for per thousands (used in SOME PUBLICATION)
 - `L_100` for hectolitres (used in HMRC Alcohol Bulletin)
 
+#### Attributes
+
+Attribute columns are used to qualify the observation. Most commonly the attribute columns are used to describe the absence or quality of an observation, these are commonly called "observation status" columns. There are two types of attribute columns, literal and resource columns.
+
+##### Literal attributes
+
+Literal attributes are used to describe the observation. When providing point estimates, often there are additional values which help provide context. For example, when providing a point estimate for the number of people in a given area, there may be a confidence interval (of which there are two values, the upper and lower bounds), a sample size, and a standard deviation. These values are all literal attributes.
+
+##### Observation status columns
+
+When creating observation status columns a naming convention helps users understand how they relate to the observation to the qualification. In this case for a given column name containing observations, the observation status column should have the same name as the observation column plus `_status` as a suffix. For example an observation column called `observation` should have a corresponding observation status called `observation_status`.
+
 ### Ordering
 
 Ensuring that users can understand your CSV files is important. To help with this, the columns should be ordered as follows:
@@ -133,6 +133,9 @@ Ensuring that users can understand your CSV files is important. To help with thi
 6. Observation status column (if necessary).
 7. All other attribute columns.
 
+| period_code | period_type | period_label | area_code | area_type | observation | measure | unit | observation_status |
+| ----------- | ----------- | ------------ | --------- | --------- | ----------- | ------- | ---- | ------------------ |
+
 ## Overall principles
 
 Concept Clarity: Ensure that the concepts used in the CSV files are clear and easily understandable to enhance human readability.

From 4fdd937e737420b0e275ca833ec3d99241fb7a85 Mon Sep 17 00:00:00 2001
From: BenHowland <ben.howland@ons.gov.uk>
Date: Wed, 5 Jun 2024 11:28:18 +0100
Subject: [PATCH 7/8] updated some areas

---
 csv.md | 32 ++++++++++++++++++++++++++++----
 1 file changed, 28 insertions(+), 4 deletions(-)
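
Reviewer note: the header rules touched by this patch (lowercase snake case, unique names) lend themselves to an automated check. This is a minimal sketch under assumptions: the regular expression is one reading of "lowercase and snake case", and `header_problems` is a hypothetical helper name, not an existing tool.

```python
import re
from typing import List

# One interpretation of "lowercase and snake case": lowercase letters and
# digits, in groups separated by single underscores (e.g. sic_2007).
SNAKE_CASE = re.compile(r"[a-z0-9]+(?:_[a-z0-9]+)*")


def header_problems(headers: List[str]) -> List[str]:
    """Return human-readable problems found in a list of CSV column headers."""
    problems = []
    seen = set()
    for header in headers:
        if not SNAKE_CASE.fullmatch(header):
            problems.append(f"not lowercase snake case: {header}")
        if header in seen:
            problems.append(f"duplicate header: {header}")
        seen.add(header)
    return problems
```

Passing the header row of the ordering example in this patch should produce an empty list, while a header such as `Area_type` would be reported.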

diff --git a/csv.md b/csv.md
index 1b7322d..a4fdfcd 100644
--- a/csv.md
+++ b/csv.md
@@ -33,10 +33,23 @@ CSV files used in our service should be saved as UTF-8 encoded text files with a
 
 Column headers should be in lowercase and snake case (e.g. `column_header`). This is to ensure consistency and readability. Column headers should also be unique, and should not contain any special characters (e.g. `!@#$%^&*()`). This even includes the pound sign (i.e. `£`), which should be replaced with `gbp` when appropriate.
 
-Related columns should have the same prefix (e.g. `area_code`, `area_label`, `area_type` or `time_period_type`, `time_period_code`, `time_period_label`, or even `observation` and `observation_status`), and should be adjacent. This is to ensure that related columns are grouped together when sorted alphabetically, and to make it easier to find concepts whose values are spread across multiple columns.
+Related columns should have the same prefix (e.g. `geography_code`, `geography_label`, `geography_type` or `period_type`, `period_code`, `period_label`, or even `observation` and `observation_status`), and should be adjacent. This is to ensure that related columns are grouped together when sorted alphabetically, and to make it easier to find concepts whose values are spread across multiple columns.
 
 **Note** When expressing a dimension which has a label and a code, the code should come first, followed by the label; in the case of area geography, you can add an additional value which helps disambiguates geography labels by providing the geography type which would only be disambiguated by the geography code.
 
+Below is an example of how we would like code, label and type to be represented.
+
+| period_code             | period_label      | period_type        | geography_code | geography_label |
+| ----------------------- | ----------------- | ------------------ | -------------- | --------------- |
+| 1999-12-31              | 31-December-1999  | day                | K02000001      | United Kingdom  |
+| 2020-01                 | January-2020      | month              | E92000001      | England         |
+| 2020-Q1                 | 2020-Q1           | quarter            | E12000001      | North East      |
+| 2020                    | 2020              | year               | E06000047      | County Durham   |
+| 2020-2021               | 2020-2021         | government-year    | E07000088      | Gosport         |
+| 2001-04-01 00:00:00/P3M | Apr-Jun 2001      | gregorian-interval | E14001252      | Gosport         |
+
+
+
 TODO: Provide an example of a geography code, geography label, and geography type where the geography type/code is required to disambiguate the geography label.
 
 ### Types
@@ -47,7 +60,7 @@ There are five types of columns, and each CSV file should contain at least two o
 
 Dimension columns (otherwise known as factors or concepts) are used to identify the observation through a combination of concepts. Where each dimension in a CSV is filtered to a specific value there should only be one observation. In relational databases terminology all dimensions combine to a composite key. Some examples of dimensions are:
 
-- `time_period_code` with one value being `2019-2020` (i.e. the period of `April 2019 to March 2020`)
+- `period_code` with one value being `2019-2020` (i.e. the period of `April 2019 to March 2020`)
 - `geography_code` with one value being `E09000001` (i.e. the nation of `England`)
 - `sic_2007` with one value being `01.11` (i.e. the concept `Growing of cereals (except rice), leguminous crops and oil seeds`)
 
@@ -63,6 +76,8 @@ For example the three columns prefixed with `area_` are related in the table bel
 | E92000001 | England           | Country                | 1337  | ... |
 | K04000001 | England and Wales | England and Wales      |       | ... |
 
+If you need further help configuring dimensions such as period, geography, age and sex, see the code lists guidance. [^1]
+
 **Note:** Dimension columns must contain values for every row in the CSV file and not be blank (i.e. they must be dense)
 
 #### Observation
@@ -119,6 +134,8 @@ Literal attributes are used to describe the observation. When providing point es
 
 When creating observation status columns a naming convention helps users understand how they relate to the observation to the qualification. In this case for a given column name containing observations, the observation status column should have the same name as the observation column plus `_status` as a suffix. For example an observation column called `observation` should have a corresponding observation status called `observation_status`.
 
+**Note:** An `observation_status` column is not required if all the cells in the `observation` column have data.
+
 ### Ordering
 
 Ensuring that users can understand your CSV files is important. To help with this, the columns should be ordered as follows:
@@ -133,8 +150,12 @@ Ensuring that users can understand your CSV files is important. To help with thi
 6. Observation status column (if necessary).
 7. All other attribute columns.
 
-| period_code | period_type | period_label | area_code | area_type | observation | measure | unit | observation_status |
-| ----------- | ----------- | ------------ | --------- | --------- | ----------- | ------- | ---- | ------------------ |
+## Example
+
+Below is a basic example of how the columns should be ordered and shown.
+
+| period_code | period_type | period_label | geography_code | geography_label | observation | measure | unit | observation_status |
+| ----------- | ----------- | ------------ | -------------- | --------------- | ----------- | ------- | ---- | ------------------ |
 
 ## Overall principles
 
@@ -143,3 +164,6 @@ Concept Clarity: Ensure that the concepts used in the CSV files are clear and ea
 Unique Addressability: Each observation should be uniquely addressable by filtering all dimension columns to a value.
 
 Value Completeness: All columns should have values for every observation, except for the observation, observation status, or attribute type columns.
+
+
+[^1]: <https://github.com/GSS-Cogs/application-profile/blob/draft/code-lists.md>
\ No newline at end of file

From 081d5276398a70ae7552a930876948bb878d2eef Mon Sep 17 00:00:00 2001
From: Andrew Fergusson <andrew.fergusson@ons.gov.uk>
Date: Fri, 2 Aug 2024 14:57:30 +0100
Subject: [PATCH 8/8] typo

---
 code-lists.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/code-lists.md b/code-lists.md
index b7a9243..4faad2e 100644
--- a/code-lists.md
+++ b/code-lists.md
@@ -596,7 +596,7 @@ The examples in the table above show the best way to represent different age cat
 | M        | Male           |
 | _N       | Non response   |
 | _O       | Other          |
-| -U       | Unknown        |
+| _U       | Unknown        |
 | _Z       | Not applicable |
 
 The examples in the table above show the best way to represent different sex categories. This has come from the Statistical Data and Metadata eXchange (SDMX) guidelines [^2]