Environments And Indices

Proposal

After our initial data conversation with Algolia on 02/06/23, we have come up with the following proposal.

Below are the questions we posed before the meeting and the answers received, which will shape how we structure our data and applications.

Q & A

Single vs multiple indexes

Rapha products are made up of three levels: base, variant and size. Customers are only interested in base and variant, as these are displayed in a single hit. Is there a recommendation to have a single index vs multiple indices?

The recommendation here is to have two indexes per locale: one for base records, where most updates will take place, and one for variants. We will then group records together into a single hit.

Grouping records usually refers to the process of combining multiple records into a single result, or consolidating many similar records into two or three results. This kind of deduplication or aggregation of results has three primary use cases:

  • Item Variations, where any item with variations is displayed only once. A t-shirt that comes in five colours should only appear once in the results, with all five colour options displayed somewhere in the description.
  • Large Records, where you first break up a large record into smaller sub-records, and then during the search, if several of these sub-records match, you display the most relevant one.
  • Grouping by attribute, where you group records depending on the value of one of their attributes.

For us, the most important use case is item variations. More in-depth information can be found here: https://www.algolia.com/doc/guides/managing-results/refine-results/grouping/#handling-item-variations.

Algolia's distinct feature

Algolia’s distinct feature solves the item variation use case. Distinct is a term borrowed from the SQL world. Algolia is not meant to be used as a traditional database as it’s a search engine. Still, it’s sometimes useful to borrow some concepts from the database world. Two of those concepts are distinct and group by; both can be achieved using the distinct feature.

For a full breakdown of the solution please read the item variation documentation.

tldr
[
  {
    "objectID": "WBL03SSBLK", // sku at base or variant level
    "description": "Women's Merino Base Layer - Short Sleeve",
    "colour": "blue",
    "thumbnail_url": "tshirt-B-blue.png",
    "color_variants": ["orange", "teal", "yellow", "red", "green"]
  },
  {
    "type": "t-shirt",
    "sku": "B",
    "colour": "orange",
    "thumbnail_url": "tshirt-B-orange.png",
    "color_variants": ["blue", "teal", "yellow", "red", "green"]
  },
  ...
]
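
As a rough illustration of what distinct does, the sketch below mimics the de-duplication locally in Python. It does not call Algolia and is not Algolia's implementation; in a real index the equivalent behaviour would come from the `attributeForDistinct` and `distinct` index settings. The `base_sku` grouping key, the helper name `dedupe_by_attribute` and the sample hits are assumptions made for illustration only.

```python
from typing import Any


def dedupe_by_attribute(hits: list[dict[str, Any]], attribute: str) -> list[dict[str, Any]]:
    """Keep only the first (highest-ranked) hit for each value of `attribute`.

    Hits that lack the attribute are kept as-is, since there is nothing
    to group them on.
    """
    seen: set[Any] = set()
    result: list[dict[str, Any]] = []
    for hit in hits:
        key = hit.get(attribute)
        if key is None:
            result.append(hit)
            continue
        if key in seen:
            continue  # a better-ranked record from the same group was already kept
        seen.add(key)
        result.append(hit)
    return result


# Hypothetical hits, already ordered by relevance.
ranked_hits = [
    {"objectID": "CLJ04XXBLK", "base_sku": "CLJ04XX", "colour": "black"},
    {"objectID": "CLJ04XXBLU", "base_sku": "CLJ04XX", "colour": "blue"},
    {"objectID": "AAO01XXARS", "base_sku": "AAO01XX", "colour": "blue"},
]

grouped = dedupe_by_attribute(ranked_hits, "base_sku")
print([hit["objectID"] for hit in grouped])  # ['CLJ04XXBLK', 'AAO01XXARS']
```

Each base product appears once, represented by its best-matching variant, which is the behaviour we want for item variations.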

Do we split indices by environment?

We assume we would have separate indices for things like query suggestions and environments, but is this common practice?

The recommendation here is to have one application per environment. Algolia will set up a sandbox application that we can run the POC from, as this doesn't count towards our account usage. See below for the recommendation:

Application / Environment | Usage
Sandbox | For testing against our POC. Does not count towards account usage.
UAT | For building and testing our integrations.
Production | Production application pointing to production integrations as well as including production-ready indexes.

How do we structure data for colour refinement?

  • In our records, colour attributes should have a title and a hexadecimal code separated by a semicolon ; (we can replace this with any separator we see fit).
  • We can also use a URL instead of the hexadecimal code, for example if we want to display a pattern.
  • The colour attribute should be added to attributesForFaceting in our index configuration.

Examples

  • black;#000
  • red;#f00
  • yellow;#ffff00
  • pattern;https://example.com/images/pattern.png

The hexadecimal code can be 3 or 6 characters long (excluding the # symbol). I propose we sync the code without the # and add it in the micro front end.

{
  "objectID": 0,
  "sku": "PIN02RG",
  "colour": "black;#000"
}
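
To make the facet format concrete, here is a minimal Python sketch of how a consumer such as the micro front end might split a colour value into its title and its hex code or URL, adding the # when it was synced without one. The function name `parse_colour_facet`, the `ColourFacet` type and the validation rules are assumptions for illustration, not part of Algolia's API.

```python
from typing import NamedTuple

HEX_DIGITS = set("0123456789abcdefABCDEF")


class ColourFacet(NamedTuple):
    title: str
    value: str    # hex colour with a leading '#', or an image URL
    is_url: bool


def parse_colour_facet(raw: str, separator: str = ";") -> ColourFacet:
    """Split 'title;value' where value is a 3/6-digit hex code or a URL."""
    title, sep, value = raw.partition(separator)
    title, value = title.strip(), value.strip()
    if not sep or not title or not value:
        raise ValueError(f"expected 'title{separator}value', got {raw!r}")

    # URLs are passed through untouched (used for patterns).
    if value.startswith(("http://", "https://")):
        return ColourFacet(title, value, True)

    # Accept the hex code with or without '#', and always return it with '#'.
    digits = value[1:] if value.startswith("#") else value
    if len(digits) not in (3, 6) or not set(digits) <= HEX_DIGITS:
        raise ValueError(f"invalid hex colour in {raw!r}")
    return ColourFacet(title, "#" + digits, False)


print(parse_colour_facet("black;000"))
print(parse_colour_facet("yellow;#ffff00"))
print(parse_colour_facet("pattern;https://example.com/images/pattern.png"))
```

Accepting both forms means the parser keeps working whether or not the # is present in the synced data.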

Outcomes

Indices

The outcome of all the above is to structure our indexes as follows:

Application (Sandbox) | Usage
base_en-GB | Base index containing individual base records for en-GB
variant_en-GB | Variants index containing individual variant records for en-GB
base_fr-FR | Base index containing individual base records for fr-FR
variant_fr-FR | Variants index containing individual variant records for fr-FR

Application (UAT) | Usage
base_en-GB | Base index containing individual base records for en-GB
variant_en-GB | Variants index containing individual variant records for en-GB
base_fr-FR | Base index containing individual base records for fr-FR
variant_fr-FR | Variants index containing individual variant records for fr-FR

Application (Production) | Usage
base_en-GB | Base index containing individual base records for en-GB
variant_en-GB | Variants index containing individual variant records for en-GB
base_fr-FR | Base index containing individual base records for fr-FR
variant_fr-FR | Variants index containing individual variant records for fr-FR

Please note that we would need indexes for the following locales, resulting in 18 indexes per application (9 locales × 2 index types: base and variant).

  • en-GB
  • en-US
  • en-AU
  • fr-FR
  • de-DE
  • es-ES
  • ko-KR
  • ja-JP
  • zh-TW
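
The naming scheme and the count can be checked with a short Python sketch. The `<type>_<locale>` pattern and the locale list come from this page; the helper name `index_names` is an assumption for illustration.

```python
from itertools import product

LOCALES = ["en-GB", "en-US", "en-AU", "fr-FR", "de-DE", "es-ES", "ko-KR", "ja-JP", "zh-TW"]
INDEX_TYPES = ["base", "variant"]


def index_names(locales: list[str] = LOCALES, index_types: list[str] = INDEX_TYPES) -> list[str]:
    """Return one '<type>_<locale>' index name per locale and index type."""
    return [f"{index_type}_{locale}" for locale, index_type in product(locales, index_types)]


names = index_names()
print(names[:2])   # ['base_en-GB', 'variant_en-GB']
print(len(names))  # 18
```

The same 18 names would be created in each of the three applications (Sandbox, UAT and Production).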

Record Structure

Base

Key | Value | Translated
sku | CLJ04XX |
variants | ["CLJ04XXBLK", "CLJ04XXBLU"] |
name | Men's Classic Jersey II | Yes
productType | raphacore |
categories | { lvl0: "Men's", lvl1: "Jersey" } |
gender | "Men's" |
sleeveLength | |
legLength | |
isRccProduct | true/false |
description | A bestseller for over a decade, ... | Yes
summary | Premium Merino Jersey | Yes
marketingMessage | | Yes
rating | 4.4 |

Variant

Key | Value
sku | AAO01XXARS
base_sku | AAO01XX
displayColour | Light Blue/Teal
swatchColour | #879BA3
groupColour | Blue
colour | Blue;#879BA3
image | https://media.rapha.cc/image/upload/archive/amplience-image/CLJ04XX_BLK_Product_H1-19_01
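
One way the two record types could be combined into a single hit is sketched below in Python. This is an assumption about how the grouping might be done on our side, not a confirmed Algolia mechanism: it joins variants to their base record on `base_sku` and orders them according to the base record's `variants` list. The helper name `build_hit` and the sample values are hypothetical.

```python
from typing import Any


def build_hit(base: dict[str, Any], variants: list[dict[str, Any]]) -> dict[str, Any]:
    """Return the base record with its matching variant records attached."""
    # Keep only the variants that point back at this base record.
    own = [v for v in variants if v.get("base_sku") == base["sku"]]

    # Follow the order declared on the base record; unknown SKUs go last.
    order = {sku: position for position, sku in enumerate(base.get("variants", []))}
    own.sort(key=lambda v: order.get(v["sku"], len(order)))

    return {**base, "variants": own}


# Hypothetical records following the Base and Variant structures above.
base_record = {
    "sku": "CLJ04XX",
    "name": "Men's Classic Jersey II",
    "variants": ["CLJ04XXBLK", "CLJ04XXBLU"],
}
variant_records = [
    {"sku": "CLJ04XXBLU", "base_sku": "CLJ04XX", "colour": "Blue;#879BA3"},
    {"sku": "AAO01XXARS", "base_sku": "AAO01XX", "colour": "Blue;#879BA3"},
    {"sku": "CLJ04XXBLK", "base_sku": "CLJ04XX", "colour": "Black;#000"},
]

hit = build_hit(base_record, variant_records)
print([v["sku"] for v in hit["variants"]])  # ['CLJ04XXBLK', 'CLJ04XXBLU']
```

Keeping `base_sku` on every variant record is what makes this join possible, so it should be treated as a required key when syncing.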

Resources