# Write an OnExtraction Middleware
This article explains the steps to implement a Middleware with the OnExtraction hook, using the Poll Daddy Widget Provider as an example.
Poll Daddy, now Crowd Signal, is a popular platform for creating polls and quizzes that you can embed into your website.
When a website has articles that contain polls, like this one, we need to integrate them into the Marfeel version.
# Poll Daddy Widget Provider
At Marfeel, there is already a Widget Provider for Poll Daddy available. If we take a look at its schema, we'll see that it has two properties: `pollId`, which is required, and `mainColor`.
While `mainColor` is a static value and can be configured through `widgets.json`, `pollId` needs to be dynamic, as it changes depending on the page it's placed on.
Implementing a Middleware will allow us to dynamically retrieve the `pollId` value.
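To make the split between static and dynamic values concrete, here is a minimal sketch. The `PollDaddyPropsSketch` interface is a hypothetical shape inferred from the schema description above; the real type is exported by the Widget Provider package and may differ. The merge order is also an assumption made for illustration.

```typescript
// Hypothetical shape inferred from the schema description;
// the real PollDaddyProps type may differ.
interface PollDaddyPropsSketch {
  pollId: string;      // required, changes from page to page
  mainColor?: string;  // optional, static value
}

// Static part, as it could be configured in widgets.json.
const staticParameters = { mainColor: '#F56565' };

// Dynamic part, as a Middleware could return it for one page.
const extracted = { pollId: '10582522' };

// Assumption: extracted values are layered on top of the static ones.
const props: PollDaddyPropsSketch = { ...staticParameters, ...extracted };
```

The point of the sketch is only that `mainColor` can be fixed once, while `pollId` has to be computed for each page.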
# Find the pollId
The first step is to find where the `pollId` is in the tenant's page.
Although this may differ from case to case, in this example you can find it within the URL of the `src` attribute of a `<script>` tag:
<script
  type="text/javascript"
  charset="utf-8"
  src="https://secure.polldaddy.com/p/10582522.js"
>
</script>
Now that you know where to find the `pollId` value, we'll create a Middleware that extracts it.
# Pick a Middleware type
First, we'll need to decide which type of Middleware is required: OnExtraction or OnBrowser.
Since we need to retrieve data from the HTML of the page, OnExtraction is the right type.
# Implementation guide
The easiest way to work with Middleware is by using a test-driven approach.
# Create the test
Start by downloading a copy of the page's HTML which we can use as an input for our test.
TIP
Use the `npm run create:fixtures` command to automatically generate the fixtures:
npm run create:fixtures https://example.com/article-with-cute-kittens/
This command downloads the HTML of the desired page into the `fixtures` folder, within the `src` folder of the tenant.
You'll also need to install the Middleware test package:
npm i --save-dev @marfeel/middlewares-test-utils
As with any test-driven development, start by creating the test file in `src/middlewares/widgets/poll-daddy/` and name it `on-extraction.test.ts`.
TIP
If we used the OnBrowser hook, the file would be named `on-browser.test.ts`.
In it, load the fixture we just extracted into a document that our Middleware can work on.
The `loadFixture` method does this for you.
Import it from the `middlewares-test-utils` package and call it, passing the filename of your fixture file as a parameter:
import { loadFixture } from '@marfeel/middlewares-test-utils';
const document = await loadFixture('how-to-improve-ad-viewability.html');
The `loadFixture` method loads your fixture into a document.
Next, you need to set up the Middleware execution. The `runMiddleware` method allows you to run a Middleware on a given document.
`runMiddleware` requires two arguments:
- Document: The document the Middleware will be executed on, in this case, the output of the `loadFixture` method.
- onExtractionMiddleware: The Middleware to execute.
Middleware
At this point you haven't created the Middleware yet. Create an empty file next to `on-extraction.test.ts` and name it `on-extraction.ts`.
Then, import it in your test file.
import { onExtraction } from './on-extraction';
const result = await runMiddleware(document, onExtraction);
To finish the test, wrap it in a describe block and validate the result of the Middleware execution by comparing it to the expected result.
The whole test will look something like this:
import { loadFixture, runMiddleware } from '@marfeel/middlewares-test-utils';
import { onExtraction } from './on-extraction';
describe('Poll Daddy', () => {
  describe('OnExtraction', () => {
    test('returns the pollId', async () => {
      const document = await loadFixture('how-to-improve-ad-viewability.html');
      const result = await runMiddleware(document, onExtraction);
      expect(result).toEqual({
        pollId: '10582522'
      });
    });
  });
});
If you try to run the test now (using `npm test`), it will fail because we haven't implemented the Middleware hook yet. Let's do that next.
# Set up OnExtraction Middleware
Open the `on-extraction.ts` file you created earlier and add the following skeleton:
import { onExtractionFunction } from '@marfeel/middlewares-types';
import { PollDaddyProps } from '@marfeel/widget-providers-poll-daddy';
export const onExtraction: onExtractionFunction<PollDaddyProps> = async ({ document }): Promise<PollDaddyProps | undefined> => {
  return {
    pollId: ''
  };
};
The `onExtractionFunction` import provides the OnExtraction type, which describes the `onExtraction` function and its expected return value.
The `PollDaddyProps` import is required to connect the Middleware to your Widget Provider.
TIP
For these imports to work you need to install them as dependencies:
- Poll-daddy Widget Provider:
npm i @marfeel/widget-providers-poll-daddy
- Middleware Types package, as a development dependency:
npm i --save-dev @marfeel/middlewares-types
At this point, you can run the test. It will still fail though, as the Middleware is returning an empty value.
# Implement OnExtraction Middleware
Now you need to configure the OnExtraction Middleware to retrieve the target data.
First, you need to find the correct script tag. As the Document object of the page is an argument of the Middleware, you can use it to query its elements and filter out the one containing the ID.
const url = Array.from(document.querySelectorAll('script'))
  .map(element => element.getAttribute('src'))
  .filter(url => url)
  .find(url => url.toLowerCase().includes('https://secure.polldaddy.com/p/'));
This example iterates over all the scripts of the document, looking for one whose `src` attribute contains the Poll Daddy script URL prefix, `https://secure.polldaddy.com/p/`.
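Because `getAttribute` returns `string | null`, a strict TypeScript configuration may need help to see that the empty values are gone before `toLowerCase()` is called. The following is a minimal sketch of the same lookup with an explicit type guard. It works on a plain array of `src` values so it can run without a DOM; `isNonEmptyString` and `findPollScriptSrc` are hypothetical helper names, not part of any Marfeel package.

```typescript
// Type guard: narrows `string | null` to `string` and drops empty values.
const isNonEmptyString = (value: string | null): value is string =>
  typeof value === 'string' && value.length > 0;

const POLL_SCRIPT_PREFIX = 'https://secure.polldaddy.com/p/';

// Receives the raw `src` values of the page's scripts
// (null when a script tag has no src attribute).
function findPollScriptSrc(srcValues: Array<string | null>): string | undefined {
  return srcValues
    .filter(isNonEmptyString)
    .find(src => src.toLowerCase().includes(POLL_SCRIPT_PREFIX));
}

const srcValues = [
  null,
  'https://example.com/app.js',
  'https://secure.polldaddy.com/p/10582522.js'
];
const found = findPollScriptSrc(srcValues);
```

In a Middleware, the input array would come from mapping the script elements to their `src` attribute, as in the snippet above.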
Now that you have located the `https://secure.polldaddy.com/p/<pollID>.js` URL, you have to extract the ID from it.
You can achieve this using string manipulation functions.
const filename = url.split('/').pop();
const pollId = filename.replace('.js', '');
return {
  pollId
};
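The same string manipulation can be packaged as a small standalone function, which is easy to unit test. This is a sketch under assumptions: `extractPollId` and `cutAt` are hypothetical helpers, and the handling of query strings or fragments covers URL variants that do not appear in this article.

```typescript
// Returns the part of `value` before `marker`, or `value` if absent.
const cutAt = (value: string, marker: string): string => {
  const index = value.indexOf(marker);
  return index === -1 ? value : value.slice(0, index);
};

const SCRIPT_SUFFIX = '.js';

function extractPollId(scriptUrl: string): string | undefined {
  // Drop a query string or fragment, if present (hypothetical variants).
  const path = cutAt(cutAt(scriptUrl, '?'), '#');
  // Last path segment, for example "10582522.js".
  const filename = path.split('/').pop() ?? '';
  if (!filename.endsWith(SCRIPT_SUFFIX)) {
    return undefined;
  }
  const pollId = filename.slice(0, filename.length - SCRIPT_SUFFIX.length);
  // An empty ID is treated as "not found".
  return pollId.length > 0 ? pollId : undefined;
}

const pollIdFromArticle = extractPollId('https://secure.polldaddy.com/p/10582522.js');
```

Returning `undefined` for unexpected shapes keeps the caller's check simple: either there is a usable ID or there is nothing to render.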
Regex
You could use a Regex to parse the string, but avoid it when possible, as a Regex is more complex and has a higher performance cost than string manipulation methods.
With all the pieces in place, your Middleware should look like this:
import { onExtractionFunction } from '@marfeel/middlewares-types';
import { PollDaddyProps } from '@marfeel/widget-providers-poll-daddy';
export const onExtraction: onExtractionFunction<PollDaddyProps> = async ({ document }): Promise<PollDaddyProps | undefined> => {
  // Find the src of the first script that points to a Poll Daddy poll.
  const url = Array.from(document.querySelectorAll('script'))
    .map(element => element.getAttribute('src'))
    .filter(url => url)
    .find(url => url.toLowerCase().includes('https://secure.polldaddy.com/p/'));
  // No poll on this page: return nothing.
  if (!url) {
    return;
  }
  // "…/p/10582522.js" -> "10582522.js" -> "10582522"
  const filename = url.split('/').pop();
  const pollId = filename.replace('.js', '');
  return {
    pollId
  };
};
Now you can run `npm test` and the test will pass.
We should also test that our new Middleware works correctly in production.
To do so, compile the code using `npm run build` (don't forget the `react` option if needed!) and use the Middleware command to test it in the production microservice.
npm run middleware poll-daddy https://example.com/article-containing-poll-daddy/
This gives you the following output, which means the Middleware works correctly when extracting the data directly from the website.
{
  "result": {
    "pollId": "10582522"
  },
  "error": null
}
# Connect the pieces
Now that you have a working Middleware, let's hook it all up.
- Activate the Middleware: Add the `invokeMiddleware` flag with the value `true` to `features.json`.
{
  "features": {
    "invokeMiddleware": true
  }
}
- Configure the Widget Provider: Add the poll-daddy provider to `widgets.json`.
{
  "widgets": [
    {
      "type": "provider",
      "id": "poll-daddy",
      "name": "poll-daddy",
      "selector": ".mrf-polldaddy",
      "middleware": "poll-daddy",
      "parameters": {
        "pollId": "",
        "mainColor": "#F56565"
      }
    }
  ]
}
TIP
If there's no `widgets.json` file in your tenant, create one in the root folder of the tenant.
The value of the `middleware` key is the name of the folder the Middleware is in.
Done! Your Middleware is working and fully integrated with the poll-daddy Widget Provider. It's time to send the PR.
Once the code is merged to production, you'll be able to visit the Marfeel version of the site and see the Provider in action.
TIP
As a safety check, it's recommended to test your middleware in several articles. The tenant might use different configurations and your Middleware should cover them all.
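One way to keep track of such variants is a small table of inputs and expected IDs. The sketch below applies the string logic from this article to a single `src` value and runs it over a few cases. Only the first `src` comes from the article; the other variants are hypothetical and were invented to show how a different tenant configuration could slip through. `naiveExtract` and `SrcCase` are illustrative names.

```typescript
// The string logic from this article, applied to one src value.
function naiveExtract(src: string): string | undefined {
  if (!src.toLowerCase().includes('https://secure.polldaddy.com/p/')) {
    return undefined;
  }
  const filename = src.split('/').pop() ?? '';
  return filename.replace('.js', '');
}

interface SrcCase {
  name: string;
  src: string;
  expected: string | undefined;
}

// Hypothetical variants; only "plain" appears in the article.
const cases: SrcCase[] = [
  { name: 'plain', src: 'https://secure.polldaddy.com/p/10582522.js', expected: '10582522' },
  { name: 'upper-case host', src: 'https://SECURE.POLLDADDY.COM/p/10582522.js', expected: '10582522' },
  { name: 'query string', src: 'https://secure.polldaddy.com/p/10582522.js?ver=2', expected: '10582522' },
  { name: 'protocol-relative', src: '//secure.polldaddy.com/p/10582522.js', expected: '10582522' },
  { name: 'unrelated script', src: 'https://example.com/app.js', expected: undefined }
];

// Names of the cases where the logic does not return the expected ID.
const failing = cases
  .filter(({ src, expected }) => naiveExtract(src) !== expected)
  .map(({ name }) => name);

console.log(failing);
```

With these invented inputs, the query-string and protocol-relative cases are reported as failing, which is the kind of gap that testing several real articles is meant to reveal.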