Environments and Indices
Proposal
After our initial data conversation with Algolia on 02/06/23, we have come up with the following proposal.
Below are the questions we posed before the meeting and the answers received, which will shape how we structure data and applications.
Q & A
Single vs multiple indexes
For Rapha, products are made up of three levels: base, variant and size. Customers are only interested in base and variant, as they are displayed in a single hit. Is a single index or multiple indices recommended?
The recommendation here is to have two indexes per locale: one for base, where most updates will take place, and one for variants. We will then group records together into a single hit.
Grouping records usually refers to the process of combining multiple records into a single result, or consolidating many similar records into two or three results. This kind of deduplication or aggregation of results has three primary use cases:
- Item Variations, where any item with variations is displayed only once. A t-shirt that comes in five colours should only appear once in the results, with all five colour options displayed somewhere in the description.
- Large Records, where you first break up a large record into smaller sub-records, and then during the search, if several of these sub-records match, you display the most relevant one.
- Grouping by attribute, where you group records depending on the value of one of their attributes.
For us the most important use case is the item variations. More in-depth information can be found here: https://www.algolia.com/doc/guides/managing-results/refine-results/grouping/#handling-item-variations.
Algolia's distinct feature
Algolia’s distinct feature solves the item variation use case. Distinct is a term borrowed from the SQL world. Algolia is not meant to be used as a traditional database as it’s a search engine. Still, it’s sometimes useful to borrow some concepts from the database world. Two of those concepts are distinct and group by; both can be achieved using the distinct feature.
For a full breakdown of the solution please read the item variation documentation.
TL;DR
[
  {
    "objectID": "WBL03SSBLK", // SKU at base or variant level
    "description": "Women's Merino Base Layer - Short Sleeve",
    "colour": "blue",
    "thumbnail_url": "tshirt-B-blue.png",
    "color_variants": ["orange", "teal", "yellow", "red", "green"]
  },
  {
    "type": "t-shirt",
    "sku": "B",
    "colour": "orange",
    "thumbnail_url": "tshirt-B-orange.png",
    "color_variants": ["blue", "teal", "yellow", "red", "green"]
  },
  ...
]
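To make the grouping behaviour concrete, the Python sketch below mimics locally what distinct does: it keeps only the first (best-ranked) record for each value of a grouping attribute. It is an illustration under assumptions, not Algolia's implementation. The `group_hits` helper, the `base_sku` key and the sample records are hypothetical. In Algolia itself the grouping is configured with the `attributeForDistinct` index setting together with the `distinct` parameter.

```python
from typing import Any


def group_hits(hits: list[dict[str, Any]], key: str) -> list[dict[str, Any]]:
    """Keep the first hit per distinct value of `key`, preserving rank order."""
    seen: set[Any] = set()
    grouped: list[dict[str, Any]] = []
    for hit in hits:
        value = hit.get(key)
        if value is None:
            # Assumption in this sketch: records without the key stay ungrouped.
            grouped.append(hit)
            continue
        if value in seen:
            continue  # a better-ranked record of this group was already kept
        seen.add(value)
        grouped.append(hit)
    return grouped


# Hypothetical, already-ranked variant records that share a base SKU.
ranked_hits = [
    {"objectID": "A-blue", "base_sku": "A", "colour": "blue"},
    {"objectID": "A-orange", "base_sku": "A", "colour": "orange"},
    {"objectID": "B-red", "base_sku": "B", "colour": "red"},
]

grouped_hits = group_hits(ranked_hits, key="base_sku")
print([hit["objectID"] for hit in grouped_hits])  # ['A-blue', 'B-red']
```

Each base product appears once, represented by its best-ranked variant, which is the behaviour we want for the item variation use case.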
Do we split indices by environment?
We assume we would have separate indices for things like query suggestions and environments, but is this common practice?
The recommendation here is to have one application per environment. Algolia will set up a sandbox application from which we can run the POC, as this does not count towards our account usage. See below for the recommendation:
| Application / Environment | Usage |
|---|---|
| Sandbox | For testing against our POC. Does not count towards account usage. |
| UAT | For building and testing our integrations. |
| Production | Production application pointing to production integrations and containing production-ready indexes. |
How do we structure data for colour refinement?
- In our records, colour attributes should have a title and a hexadecimal code separated by a semicolon (;). We can customise this with any separator we see fit.
- We can also use a URL instead of the hexadecimal code, for example if we want to display a pattern.
- The colour attribute should be added to attributesForFaceting in our configuration.
Examples
- black;#000
- red;#f00
- yellow;#ffff00
- pattern;https://example.com/images/pattern.png
The hexadecimal code can be 3 or 6 characters long (excluding the # symbol). I propose we sync without the # and add it in the micro front end.
{
  "objectID": 0,
  "sku": "PIN02RG",
  "colour": "black;#000"
}
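As a rough illustration of how a consumer such as the micro front end might read this format, the Python sketch below splits a colour value into its title and swatch. It accepts hexadecimal codes with or without the leading #, and passes URLs through unchanged. The `ColourFacet` type and `parse_colour` function are hypothetical names for this example only, and the validation rules are an assumption based on the format described above.

```python
from dataclasses import dataclass

HEX_DIGITS = set("0123456789abcdefABCDEF")


@dataclass(frozen=True)
class ColourFacet:
    title: str
    swatch: str  # CSS hex colour with a leading '#', or an image URL


def parse_colour(value: str, separator: str = ";") -> ColourFacet:
    """Split a 'title;value' facet string into a title and a swatch."""
    title, _, raw = value.partition(separator)
    if not title or not raw:
        raise ValueError(f"expected 'title{separator}value', got {value!r}")
    if raw.startswith(("http://", "https://")):
        return ColourFacet(title, raw)  # pattern image: keep the URL as-is
    hex_code = raw.lstrip("#")
    if len(hex_code) not in (3, 6) or not set(hex_code) <= HEX_DIGITS:
        raise ValueError(f"invalid hexadecimal colour in {value!r}")
    return ColourFacet(title, f"#{hex_code}")  # add the '#' if it was not synced


print(parse_colour("black;#000"))
print(parse_colour("yellow;ffff00"))
print(parse_colour("pattern;https://example.com/images/pattern.png"))
```

Normalising the # on read means the front end behaves the same whether or not the symbol is included when records are synced.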
Outcomes
Indices
The outcome of all the above is to structure our indexes as follows:
| Application (Sandbox) | Usage |
|---|---|
| base_en-GB | Base index containing individual base records for en-GB |
| variant_en-GB | Variants index containing individual variant records for en-GB |
| base_fr-FR | Base index containing individual base records for fr-FR |
| variant_fr-FR | Variants index containing individual variant records for fr-FR |
| Application (UAT) | Usage |
|---|---|
| base_en-GB | Base index containing individual base records for en-GB |
| variant_en-GB | Variants index containing individual variant records for en-GB |
| base_fr-FR | Base index containing individual base records for fr-FR |
| variant_fr-FR | Variants index containing individual variant records for fr-FR |
| Application (Production) | Usage |
|---|---|
| base_en-GB | Base index containing individual base records for en-GB |
| variant_en-GB | Variants index containing individual variant records for en-GB |
| base_fr-FR | Base index containing individual base records for fr-FR |
| variant_fr-FR | Variants index containing individual variant records for fr-FR |
Please note that we would need indexes for the following nine locales, resulting in 18 indexes (one base and one variant index per locale) in each application.
- en-GB
- en-US
- en-AU
- fr-FR
- de-DE
- es-ES
- ko-KR
- ja-JP
- zh-TW
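The index names for one application can be derived from the locale list, as in this short Python sketch. The `<level>_<locale>` naming pattern is taken from the tables above; the `index_names` helper is a hypothetical name used only for this example.

```python
LOCALES = [
    "en-GB", "en-US", "en-AU", "fr-FR", "de-DE",
    "es-ES", "ko-KR", "ja-JP", "zh-TW",
]
LEVELS = ["base", "variant"]


def index_names(locales: list[str], levels: list[str]) -> list[str]:
    """Return one index name per (locale, level) pair, e.g. 'base_en-GB'."""
    return [f"{level}_{locale}" for locale in locales for level in levels]


names = index_names(LOCALES, LEVELS)
print(len(names))  # 18
print(names[:2])   # ['base_en-GB', 'variant_en-GB']
```

Nine locales and two levels give 18 index names, and the same list would apply to each application.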
Record Structure
Base
| Key | Value | Translated |
|---|---|---|
| sku | CLJ04XX | |
| variants | ["CLJ04XXBLK", "CLJ04XXBLU"] | |
| name | Men's Classic Jersey II | * |
| productType | raphacore | |
| categories | { lvl0: "Men's", lvl1: "Jersey" } | |
| gender | "Men's" | |
| sleeveLength | | |
| legLength | | |
| isRccProduct | true/false | |
| description | A bestseller for over a decade, ... | * |
| summary | Premium Merino Jersey | * |
| marketingMessage | | * |
| rating | 4.4 | |
Variant
| Key | Value |
|---|---|
| sku | AAO01XXARS |
| base_sku | AAO01XX |
| displayColour | Light Blue/Teal |
| swatchColour | #879BA3 |
| groupColour | Blue |
| colour | Blue;#879BA3 |
| image | https://media.rapha.cc/image/upload/archive/amplience-image/CLJ04XX_BLK_Product_H1-19_01 |
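A variant record of this shape could be assembled as in the Python sketch below. Two things here are assumptions rather than confirmed decisions: that `colour` is derived by joining `groupColour` and `swatchColour` with the separator (as the sample values in the table suggest), and that `objectID` is set to the variant SKU. The `VariantRecord` class is a hypothetical name for this example.

```python
from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class VariantRecord:
    sku: str
    base_sku: str
    displayColour: str
    swatchColour: str
    groupColour: str
    image: str

    def to_record(self, separator: str = ";") -> dict[str, Any]:
        """Return the dictionary that would be sent to the variant index."""
        record = asdict(self)
        record["objectID"] = self.sku  # assumption: objectID is the variant SKU
        # Assumption: the facet value is '<groupColour><separator><swatchColour>'.
        record["colour"] = f"{self.groupColour}{separator}{self.swatchColour}"
        return record


variant = VariantRecord(
    sku="AAO01XXARS",
    base_sku="AAO01XX",
    displayColour="Light Blue/Teal",
    swatchColour="#879BA3",
    groupColour="Blue",
    image="https://media.rapha.cc/image/upload/archive/amplience-image/CLJ04XX_BLK_Product_H1-19_01",
)
record = variant.to_record()
print(record["colour"])  # Blue;#879BA3
```

Deriving `colour` at sync time keeps the facet value consistent with the two source fields instead of maintaining it by hand.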