Daisy-chaining Processes | Real-world Scenarios
Firstly, what exactly do we mean by daisy-chaining processes & what are the rules?
Daisy-chaining simply means a series or sequence of something. In this case it is a series of data transformation rules, applied one after the other.
Refining column data
Here is a short video to show how a set of transformations (or replace rules) can be applied to some data and how this affects the resulting output.
So from this you can see how we have built a column that contains only relevant size information. The rules applied were:
- Finding the ‘Default Title’ text that applies to singular products & replacing this with blank/null.
- Finding the option title ‘Filled Cushion’. This variant title does not correspond to a size definition, so again we replace this with blank/null.
- We are then left with data where the pattern is [product size] / [product colour]. We only want everything before the forward slash character, so the third transformation is a regular expression that extracts everything from the start of the string up to the forward slash.
We could employ a very similar rule-set to build a colour column. Only the final regular expression would instead extract the values after the slash character.
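The chain described above can be sketched outside the builder in a few lines of Python. This is only an illustration of the idea, not how Feed Donkey itself is implemented: the sample variant titles, the helper names and the choice to return a blank when the regular expression finds no match are all assumptions.

```python
import re

# Hypothetical variant titles, made up for illustration.
variant_titles = ["Default Title", "Filled Cushion", "Large / Red", "Small / Blue"]


def regex_extract(pattern: str, value: str) -> str:
    """Return capture group 1, or a blank if nothing matches (assumed behaviour)."""
    match = re.search(pattern, value)
    return match.group(1) if match else ""


def remove_non_size_titles(value: str) -> str:
    # Rule 1: blank out the placeholder title used for singular products.
    value = value.replace("Default Title", "")
    # Rule 2: blank out an option title that is not a size definition.
    return value.replace("Filled Cushion", "")


def build_size(value: str) -> str:
    # Rule 3: keep everything from the start of the string up to the slash.
    return regex_extract(r"^([^/]*?)\s*/", remove_non_size_titles(value))


def build_colour(value: str) -> str:
    # Same first two rules; the final expression keeps what follows the slash.
    return regex_extract(r"/\s*(.*)$", remove_non_size_titles(value))


sizes = [build_size(title) for title in variant_titles]
colours = [build_colour(title) for title in variant_titles]
print(sizes)    # ['', '', 'Large', 'Small']
print(colours)  # ['', '', 'Red', 'Blue']
```

Each rule receives the output of the rule before it, which is all that daisy-chaining means here.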
This is a slightly more advanced concept, but there is a very clear use case in the potential setup of a Google Merchant Centre feed.
The first thing to consider is that where Shopify holds data in an array format, we also treat it as an array once it has been copied to Feed Donkey.
For example, the field Store > Products > Product_Images*.*Src is where all the product image urls can be retrieved from, and this is data in array format.
http://abc.com/main_img_a, http://abc.com/altimg_a_1, http://abc.com/altimg_a_2
So this is a single product record row, and you can see that it is a 3-value array: there is a main image url, and 2 additional images that have been added to this particular product.
Now, what we mean when we say this data is treated as an array in Feed Donkey is that any transformation rules are applied to each of the values in the array. For example, suppose you applied a rule that truncates each value to leave just its first 4 characters (a LEFT function).
This would return:
http, http, http
After this first process, the results are joined as a string. Either this string is output against the product record in the column or, if you add another process, it becomes the variable that is used in subsequent transformations.
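As a rough sketch of this behaviour (not Feed Donkey's actual code), the Python below applies a first rule to every value in the array, joins the results, and then hands the joined string to a second rule. The `left` helper, the ", " separator (taken from the output shown above) and the length of 7 used for the second step are assumptions chosen purely for illustration.

```python
# The example image array from above.
image_urls = [
    "http://abc.com/main_img_a",
    "http://abc.com/altimg_a_1",
    "http://abc.com/altimg_a_2",
]


def left(value: str, length: int) -> str:
    """Hypothetical stand-in for a LEFT rule: keep the first `length` characters."""
    return value[:length]


# First process: the rule runs once per value in the array.
first_pass = [left(url, 4) for url in image_urls]

# The per-value results are then joined into a single string.
joined = ", ".join(first_pass)
print(joined)  # http, http, http

# A second process no longer sees an array, only the joined string.
# The length of 7 is an arbitrary choice for this sketch.
second_pass = left(joined, 7)
print(second_pass)  # http, h
```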
Just for illustration purposes, say you added a 2nd LEFT function
The output against this record would now be:
Turning to the real-world application of this behaviour: for additional/alternate image urls, it is advantageous for us to be able to manipulate this data in string format. I will illustrate.
For additional images, the Google Merchant Centre feed specification requires that you supply one or more ‘additional_image_link’ columns, like this
We can achieve this, but we need the data as a single string in which we know the order of the urls. This is important because we need to be able to ‘carve up’ the string to extract what we need.
We need to pick the 2nd url in the string to populate column X, and the 3rd to populate column Y.
In summary, the method is:
- REGEX EXTRACT where the expression matches all the urls in the array
- This will then return all the urls as a comma-separated string
- A further REGEX EXTRACT function whose expression uses Group 1 parentheses to identify the url that is required for output.
So this would look something like this in the builder
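For readers who prefer to see the logic outside the builder, here is a minimal Python sketch of the same two-step method. The regular expressions are illustrative assumptions rather than the exact expressions used in Feed Donkey, and they assume that the urls themselves contain no commas.

```python
import re

# The example image array from earlier in the article.
image_urls = [
    "http://abc.com/main_img_a",
    "http://abc.com/altimg_a_1",
    "http://abc.com/altimg_a_2",
]

# Step 1: an expression that matches every url, applied to each array value.
url_pattern = re.compile(r"https?://\S+")
matched = [m.group(0) for m in map(url_pattern.search, image_urls) if m]

# The per-value results come back as one comma-separated string.
joined = ", ".join(matched)

# Step 2: a further extract where Group 1 captures the url that is wanted.
second_url_pattern = r"^[^,]*,\s*([^,]+)"         # skip 1 url, capture the 2nd
third_url_pattern = r"^(?:[^,]*,\s*){2}([^,]+)"   # skip 2 urls, capture the 3rd


def extract_group_1(pattern: str, value: str) -> str:
    """Return capture group 1, or a blank when there is no such url (assumed)."""
    match = re.search(pattern, value)
    return match.group(1) if match else ""


column_x = extract_group_1(second_url_pattern, joined)
column_y = extract_group_1(third_url_pattern, joined)
print(column_x)  # http://abc.com/altimg_a_1
print(column_y)  # http://abc.com/altimg_a_2
```

A product with only one additional image would leave the second column blank in this sketch, because the third-url expression finds nothing to capture.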