The Branding Store | Logo Design, Web Design and E-commerce specialists.| Pembroke Pines, Florida.

18 May

How We Improved Our Core Web Vitals (Case Study)

by TBS
Categories: News

Last year, Google started emphasizing the importance of Core Web Vitals and how they reflect a person’s real experience when visiting sites around the web. Performance is a core feature of our company, Instant Domain Search—it’s in the name. Imagine our surprise when we found that our vitals scores were not great for a lot of people. Our fast computers and fiber internet masked the experience real people have on our site. It wasn’t long before a sea of red “poor” and yellow “needs improvement” notices in our Google Search Console needed our attention. Entropy had won, and we had to figure out how to clean up the jank—and make our site faster.

I founded Instant Domain Search in 2005 and kept it as a side-hustle while I worked on a Y Combinator company (Snipshot, W06), before working as a software engineer at Facebook. We’ve recently grown to a small group based mostly in Victoria, Canada and we are working through a long backlog of new features and performance improvements. Our poor web vitals scores, and the looming Google Update, brought our focus to finding and fixing these issues.

When the first version of the site was launched, I’d built it with PHP, MySQL, and XMLHttpRequest. Internet Explorer 6 was fully supported, Firefox was gaining share, and Chrome was still years from launch. Over time, we’ve evolved through a variety of static site generators, JavaScript frameworks, and server technologies. Our current front-end stack is React served with Next.js, with a backend service built in Rust to answer our domain name searches. We try to follow best practices by serving as much as we can over a CDN, avoiding as many third-party scripts as possible, and using simple SVG graphics instead of bitmap PNGs. It wasn’t enough.

Next.js lets us build our pages and components in React and TypeScript. When paired with VS Code, the development experience is amazing. Next.js generally works by transforming React components into static HTML and CSS. This way, the initial content can be served from a CDN, and then Next can “hydrate” the page to make elements dynamic. Once the page is hydrated, our site turns into a single-page app where people can search for and generate domain names. We do not rely on Next.js to do much server-side work; the majority of our content is statically exported as HTML, CSS, and JavaScript to be served from a CDN.

When someone starts searching for a domain name, we replace the page content with search results. To make the searches as fast as possible, the front-end directly queries our Rust backend which is heavily optimized for domain lookups and suggestions. Many queries we can answer instantly, but for some TLDs we need to do slower DNS queries which can take a second or two to resolve. When some of these slower queries resolve, we will update the UI with whatever new information comes in. The results pages are different for everyone, and it can be hard for us to predict exactly how each person experiences the site.

The Chrome DevTools are excellent, and a good place to start when chasing performance issues. The Performance view shows exactly when HTTP requests go out, where the browser spends time evaluating JavaScript, and more.

There are three Core Web Vitals metrics that Google will use to help rank sites in their upcoming search algorithm update. Google bins experiences into “Good”, “Needs Improvement”, and “Poor” based on the LCP, FID, and CLS scores real people have on the site:

LCP, or Largest Contentful Paint, defines the time it takes for the largest content element to become visible.
FID, or First Input Delay, relates to a site’s responsiveness to interaction—the time between a tap, click, or keypress in the interface and the response from the page.
CLS, or Cumulative Layout Shift, tracks how elements move or shift on the page in the absence of actions like a keypress or click.

Chrome is set up to track these metrics across all logged-in Chrome users, and sends anonymous statistics summarizing a customer’s experience on a site back to Google for evaluation. These scores are accessible via the Chrome User Experience Report, and are shown when you inspect a URL with the PageSpeed Insights tool. The scores represent the 75th percentile experience for people visiting that URL over the previous 28 days. This is the number they will use to help rank sites in the update.

A 75th percentile (p75) metric strikes a reasonable balance for performance goals. Taking an average, for example, would hide a lot of bad experiences people have. The median, or 50th percentile (p50), would mean that half of the people using our product were having a worse experience. The 95th percentile (p95), on the other hand, is hard to build for as it captures too many extreme outliers on old devices with spotty connections. We feel that scoring based on the 75th percentile is a fair standard to meet.
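To make the comparison concrete, here is a minimal sketch that computes the mean, p50, p75, and p95 for a handful of made-up CLS samples. It uses the nearest-rank definition of a percentile, which is an assumption for illustration and not necessarily the method Google applies to field data. Notice how the single bad outlier drags the mean far above the median while p75 stays representative of most visits.

```javascript
// Nearest-rank percentile: the smallest sample such that at least p percent
// of all samples are less than or equal to it.
function percentile(samples, p) {
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

function mean(samples) {
  return samples.reduce((sum, x) => sum + x, 0) / samples.length;
}

// Hypothetical CLS samples from eight page views (not real data).
const clsSamples = [0.01, 0.02, 0.03, 0.05, 0.1, 0.2, 0.25, 0.9];

console.log({
  mean: mean(clsSamples),
  p50: percentile(clsSamples, 50), // 0.05
  p75: percentile(clsSamples, 75), // 0.2
  p95: percentile(clsSamples, 95), // 0.9
});
```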

To get our scores under control, we first turned to Lighthouse, excellent tooling that is built into Chrome and also hosted at web.dev/measure/ and at PageSpeed Insights. These tools helped us find some broad technical issues with our site. We saw that the way Next.js was bundling our CSS slowed our initial rendering time, which affected our FID. The first easy win came from an experimental Next.js feature, optimizeCss, which helped improve our general performance score significantly.

Lighthouse also caught a cache misconfiguration that prevented some of our static assets from being served from our CDN. We are hosted on Google Cloud Platform, and the Google Cloud CDN requires that the Cache-Control header contains “public”. Next.js does not allow you to configure all of the headers it emits, so we had to override them by placing the Next.js server behind Caddy, a lightweight HTTP proxy server implemented in Go. We also took the opportunity to make sure we were serving what we could with the relatively new stale-while-revalidate support in modern browsers which allows the CDN to fetch content from the origin (our Next.js server) asynchronously in the background.
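As a rough sketch of the kind of header value involved, the helper below assembles a Cache-Control string that includes “public” and an optional stale-while-revalidate window. The function name and the durations are hypothetical and chosen only for illustration; they are not the values used on the site or the Caddy configuration itself.

```javascript
// Build a Cache-Control value that a shared cache (such as a CDN) may store.
// The durations passed in below are illustrative only.
function buildCacheControl({ maxAgeSeconds, staleWhileRevalidateSeconds }) {
  const directives = ["public", `max-age=${maxAgeSeconds}`];
  if (staleWhileRevalidateSeconds > 0) {
    // Lets a cache serve a stale copy while it refreshes in the background.
    directives.push(`stale-while-revalidate=${staleWhileRevalidateSeconds}`);
  }
  return directives.join(", ");
}

console.log(
  buildCacheControl({ maxAgeSeconds: 60, staleWhileRevalidateSeconds: 600 })
);
// "public, max-age=60, stale-while-revalidate=600"
```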

It’s easy, maybe too easy, to add almost anything you need to your product from npm. It doesn’t take long for bundle sizes to grow. Big bundles take longer to download on slow networks, and the 75th percentile mobile phone will spend a lot of time blocking the main UI thread while it tries to make sense of all the code it just downloaded. We liked BundlePhobia, a free tool that shows how many dependencies and bytes an npm package will add to your bundle. This led us to eliminate a number of react-spring powered animations or replace them with simpler CSS transitions.

Through the use of BundlePhobia and Lighthouse, we found that third-party error logging and analytics software contributed significantly to our bundle size and load time. We replaced these tools with our own client-side logging that takes advantage of modern browser APIs like sendBeacon and ping. We send logging and analytics to our own Google BigQuery infrastructure where we can answer the questions we care about in more detail than any of the off-the-shelf tools could provide. This also eliminates a number of third-party cookies and gives us far more control over how and when we send logging data from clients.
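A client-side logger built on sendBeacon can be very small. The sketch below is an assumption about what such a logger could look like, not the code running on the site: the `createLogger` name, the event shape, and the `/api/log` endpoint in the usage note are hypothetical. Only `navigator.sendBeacon(url, data)` is the standard browser API.

```javascript
// Sketch of a tiny client-side logger (hypothetical names and event shape).
function createLogger(endpoint, transport) {
  // Default to navigator.sendBeacon in browsers. A transport can be injected
  // for tests or for environments where sendBeacon is unavailable.
  const send =
    transport ||
    ((url, body) =>
      typeof navigator !== "undefined" &&
      typeof navigator.sendBeacon === "function"
        ? navigator.sendBeacon(url, body)
        : false);

  return function logEvent(name, data = {}) {
    const payload = JSON.stringify({ name, data, timestamp: Date.now() });
    // sendBeacon queues the request and returns true if it was accepted.
    return send(endpoint, payload);
  };
}
```

In a page this might be used as `const logEvent = createLogger("/api/log")` followed by `logEvent("search", { query: "example" })`. Because the browser queues the beacon, the request can still go out while the page is unloading.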

Our CLS score still had the most room for improvement. The way Google calculates CLS is complicated: layout shifts are grouped into “session windows”, where consecutive shifts are less than 1 second apart and each window is capped at 5 seconds, and the score is the largest window total. Shifts that happen shortly after a keyboard or click interaction are not counted. This penalizes many types of overlays and popups that appear just after you land on a site, for instance ads that shift content around or upsells that appear when you start scrolling past ads to reach content. If you’re interested in reading more deeply into this topic, there are excellent guides that explain how the CLS score is calculated and the reasoning behind it.
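The session window idea is easier to follow in code. The sketch below is a simplified model, not Chrome’s implementation: it takes layout shift entries shaped like the browser’s `layout-shift` performance entries (`startTime` in milliseconds, `value`, `hadRecentInput`), assumes they are already sorted by time, and returns the largest session window total. The sample entries are made up.

```javascript
// Simplified CLS: largest sum of shift values within any session window.
// A window ends after a 1 second gap or once it spans 5 seconds.
function computeCLS(entries) {
  let cls = 0;
  let windowValue = 0;
  let windowStart = 0;
  let lastShift = 0;

  for (const entry of entries) {
    // Shifts that follow recent user input do not count toward CLS.
    if (entry.hadRecentInput) continue;

    const sameWindow =
      windowValue > 0 &&
      entry.startTime - lastShift < 1000 &&
      entry.startTime - windowStart < 5000;

    if (sameWindow) {
      windowValue += entry.value;
    } else {
      windowValue = entry.value;
      windowStart = entry.startTime;
    }
    lastShift = entry.startTime;
    cls = Math.max(cls, windowValue);
  }
  return cls;
}

// Hypothetical entries: two shifts close together, one input-driven shift,
// and a later shift that starts a new window.
const sampleShifts = [
  { startTime: 0, value: 0.25, hadRecentInput: false },
  { startTime: 500, value: 0.125, hadRecentInput: false },
  { startTime: 600, value: 0.5, hadRecentInput: true },
  { startTime: 3000, value: 0.25, hadRecentInput: false },
];

console.log(computeCLS(sampleShifts)); // 0.375
```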

We are fundamentally opposed to this kind of digital clutter, so we were surprised to see how much room for improvement Google said we had. Chrome has a built-in Web Vitals overlay that you can access by using the Command Menu to “Show Core Web Vitals overlay”. To see exactly which elements Chrome considers in its CLS calculation, we found the Chrome Web Vitals extension’s “Console Logging” option in settings more helpful. Once enabled, this extension shows your LCP, FID, and CLS scores for the current page. From the console, you can see exactly which elements on the page are connected to these scores.

Of the three metrics, CLS is the only one that accumulates as you interact with a page. The Web Vitals extension has a logging option that will show exactly which elements cause CLS while you are interacting with a product. For example, the CLS metrics add up as we scroll on Smashing Magazine’s home page.

The best way to track progress from one deploy to the next is to measure page experiences the same way Google does. If you have Google Analytics set up, an easy way to do this is to install Google’s web-vitals module and hook it up to Google Analytics. This provides a rough measure of your progress and makes it visible in a Google Analytics dashboard.
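The hookup usually comes down to turning each metric object into an analytics event. Below is a minimal sketch of that mapping, assuming the metric objects carry `name`, `delta`, and `id` as the web-vitals module reports them. The `toAnalyticsEvent` helper is a hypothetical name, and the event fields follow a commonly used pattern rather than anything specific to this site.

```javascript
// Convert a web-vitals metric object into Google Analytics event parameters.
function toAnalyticsEvent({ name, delta, id }) {
  return {
    eventName: name,
    params: {
      event_category: "Web Vitals",
      // Event values are integers. CLS is a small decimal, so scale it up.
      value: Math.round(name === "CLS" ? delta * 1000 : delta),
      // The id is unique to the page load, so deltas can be summed per id.
      event_label: id,
      non_interaction: true,
    },
  };
}

// In the browser, assuming web-vitals and gtag are loaded, this could be
// wired up roughly like so (onCLS is named getCLS in older releases):
//   onCLS((metric) => {
//     const { eventName, params } = toAnalyticsEvent(metric);
//     gtag("event", eventName, params);
//   });
```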

This is where we hit a wall. We could see our CLS score, and while we’d improved it significantly, we still had work to do. Our CLS score was roughly 0.23 and we needed to get this below 0.1—and preferably down to 0. At this point, though, we couldn’t find something that told us exactly which components on which pages were still affecting the score. We could see that Chrome exposed a lot of detail in their Core Web Vitals tools, but that the logging aggregators threw away the most important part: exactly which page element caused the problem.

To capture all of the detail we need, we built a serverless function to capture web vitals data from browsers. Since we don’t need to run real-time queries on the data, we stream it into Google BigQuery’s streaming API for storage. This architecture means we can inexpensively capture about as many data points as we can generate.

After learning some lessons while working with Web Vitals and BigQuery, we decided to bundle up this functionality and release these tools as open-source at vitals.dev.

Using Instant Vitals is a quick way to get started tracking your Web Vitals scores in BigQuery.

[Figure: example of a BigQuery table schema created by Instant Vitals]

Integrating with Instant Vitals is easy. You can get started by integrating with the client library to send data to your backend or serverless function:

import { init } from "@instantdomain/vitals-client";
init({ endpoint: "/api/web-vitals" });

Then, on your server, you can integrate with the server library to complete the circuit:

import fs from "fs";
import { init, streamVitals } from "@instantdomain/vitals-server";
// Google libraries require service key as path to file
const GOOGLE_SERVICE_KEY = process.env.GOOGLE_SERVICE_KEY;
process.env.GOOGLE_APPLICATION_CREDENTIALS = "/tmp/goog_creds";
fs.writeFileSync(
  process.env.GOOGLE_APPLICATION_CREDENTIALS,
  GOOGLE_SERVICE_KEY
);
const DATASET_ID = "web_vitals";
init({ datasetId: DATASET_ID }).then().catch(console.error);
// Request handler
export default async (req, res) => {
  const body = JSON.parse(req.body);
  await streamVitals(body, body.name);
  res.status(200).end();
};

Simply call streamVitals with the body of the request and the name of the metric to send the metric to BigQuery. The library will handle creating the dataset and tables for you.

After collecting a day’s worth of data, we ran a query like this one:

SELECT
  `<project_name>.web_vitals.CLS`.Value,
  Node
FROM
  `<project_name>.web_vitals.CLS`
JOIN
  UNNEST(Entries) AS Entry
JOIN
  UNNEST(Entry.Sources)
WHERE
  Node != ""
ORDER BY
  value
LIMIT
  10

This query produces results like this:

Value	Node
`4.6045324800736724E-4`	`/html/body/div[1]/main/div/div/div[2]/div/div/blockquote`
`7.183070668914928E-4`	`/html/body/div[1]/header/div/div/header/div`
`0.031002668277977697`	`/html/body/div[1]/footer`
`0.035830703317463526`	`/html/body/div[1]/main/div/div/div[2]`
`0.035830703317463526`	`/html/body/div[1]/footer`
`0.035830703317463526`	`/html/body/div[1]/main/div/div/div[2]`
`0.035830703317463526`	`/html/body/div[1]/main/div/div/div[2]`
`0.035830703317463526`	`/html/body/div[1]/footer`
`0.035830703317463526`	`/html/body/div[1]/footer`
`0.03988482067913317`	`/html/body/div[1]/footer`

This shows us which elements on which pages have the most impact on CLS. It created a punch list for our team to investigate and fix. On Instant Domain Search, it turns out that slow or bad mobile connections will take more than 500ms to load some of our search results. One of the worst contributors to CLS for these users was actually our footer.

The layout shift score is calculated as a function of the size of the element moving and how far it goes. In our search results view, if a device took more than a certain amount of time to receive and render search results, the results view would collapse to zero height, bringing the footer into view. When the results came in, they pushed the footer back to the bottom of the page. A big DOM element moving this far added a lot to our CLS score. To work through this properly, we need to restructure the way the search results are collected and rendered. In the meantime, we decided to just remove the footer in the search results view as a quick hack that’d stop it from bouncing around on slow connections.
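For a sense of scale, a single layout shift is scored as the impact fraction (how much of the viewport the moving element touches before and after the move) multiplied by the distance fraction (how far it moved relative to the viewport’s larger dimension). The sketch below is a simplification that handles one element moving vertically and staying fully inside the viewport. The dimensions are made up and the function name is hypothetical.

```javascript
// Score for one element that moves vertically and stays inside the viewport.
function layoutShiftScore({
  viewportWidth,
  viewportHeight,
  elementWidth,
  elementHeight,
  moveY,
}) {
  const distance = Math.abs(moveY);
  // Impact region: union of the element's area before and after the move.
  const overlap = Math.max(elementHeight - distance, 0);
  const unionHeight = 2 * elementHeight - overlap;
  const impactFraction =
    (elementWidth * unionHeight) / (viewportWidth * viewportHeight);
  // Distance fraction: move distance over the viewport's larger dimension.
  const distanceFraction = distance / Math.max(viewportWidth, viewportHeight);
  return impactFraction * distanceFraction;
}

// A full-width, 200px tall footer pushed down by 200px on a 400x800 viewport.
console.log(
  layoutShiftScore({
    viewportWidth: 400,
    viewportHeight: 800,
    elementWidth: 400,
    elementHeight: 200,
    moveY: 200,
  })
); // 0.125
```

Under these assumptions, a single footer bounce like this already exceeds the 0.1 threshold, which is consistent with a large element moving a long way being so costly.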

We now review this report regularly to track how we are improving — and use it to fight declining results as we move forward. We have witnessed the value of extra attention to newly launched features and products on our site and have operationalized consistent checks to be sure core vitals are acting in favor of our ranking. We hope that by sharing Instant Vitals we can help other developers tackle their Core Web Vitals scores too.

Google provides excellent performance tools built into Chrome, and we used them to find and fix a number of performance issues. We learned that the field data provided by Google offered a good summary of our p75 progress, but did not have actionable detail. We needed to find out exactly which DOM elements were causing layout shifts and input delays. Once we started collecting our own field data—with XPath queries—we were able to identify specific opportunities to improve everyone’s experience on our site. With some effort, we brought our real-world Core Web Vitals field scores down into an acceptable range in preparation for June’s Page Experience Update. We’re happy to see these numbers go down and to the right!

Articles on Smashing Magazine — For Web Designers And Developers

06 May

Reducing HTML Payload With Next.js (Case Study)

by TBS
Categories: News

I know what you are thinking. Here’s another article about reducing JavaScript dependencies and the bundle size sent to the client. But this one is a bit different, I promise.

This article is about a couple of problems that Bookaway faced and how we (as a company in the travel industry) managed to optimize our pages so that the HTML we send is smaller. Smaller HTML means less time for Google to download and process those long strings of text.

Usually, the HTML code size is not a big issue, especially for pages that are small, not data-intensive, or not SEO-oriented. However, for our pages the case was different, as our database stores lots of data and we need to serve thousands of landing pages at scale.

You may be wondering why we need such a scale. Well, Bookaway works with 1,500 operators and provides over 20k services in 63 countries, with 200% growth year over year (pre Covid-19). In 2019, we sold 500k tickets, so our operations are complex and we need to showcase them with our landing pages in an appealing and fast manner, both for Google bots (SEO) and for actual clients.

In this article, I’ll explain:

how we found that the HTML size was too big;
how it got reduced;
the benefits of this process (i.e. creating improved architecture, improving code organization, providing a straightforward job for Google to index tens of thousands of landing pages, and serving far fewer bytes to the client, which is especially helpful for people with slow connections).

But first, let’s talk about the importance of speed improvement.

Why Is Speed Improvement Necessary To Our SEO Efforts?

Meet “Web Vitals”, but in particular, meet LCP (Largest Contentful Paint):

“Largest Contentful Paint (LCP) is an important, user-centric metric for measuring perceived load speed because it marks the point in the page load timeline when the page’s main content has likely loaded — a fast LCP helps reassure the user that the page is useful.”

The main goal is to have as small an LCP as possible. Part of having a small LCP is letting the user download as little HTML as possible. That way, the browser can start painting the largest content element ASAP.

While LCP is a user-centric metric, reducing it should also be a big help to Google bots, as Google states:

“The web is a nearly infinite space, exceeding Google’s ability to explore and index every available URL. As a result, there are limits to how much time Googlebot can spend crawling any single site. The amount of time and resources that Google devotes to crawling a site is commonly called the site’s crawl budget.”

— “Advanced SEO,” Google Search Central Documentation

One of the best technical ways to improve the crawl budget is to help Google do more in less time:

Q: “Does site speed affect my crawl budget? How about errors?”

A: “Making a site faster improves the users’ experience while also increasing the crawl rate. For Googlebot, a speedy site is a sign of healthy servers so that it can get more content over the same number of connections.”

To sum it up, Google bots and Bookaway clients have the same goal — they both want to get content delivered fast. Since our database contains a large amount of data for every page, we need to aggregate it efficiently and send something small and thin to the clients.

Investigating ways to improve led us to find a big JSON embedded in our HTML, making the HTML chunky. To understand why it is there, we need to understand React hydration.

React Hydration: Why There Is A JSON In HTML

That happens because of how server-side rendering works in React and Next.js:

1. When the request arrives at the server, it needs to build the HTML based on a data collection. That collection of data is the object returned by getServerSideProps.
2. Once React has the data, it kicks into play on the server. It builds the HTML and sends it.
3. When the client receives the HTML, it is immediately painted in front of the user. In the meantime, React’s JavaScript is being downloaded and executed.
4. When JavaScript execution is done, React kicks into play again, now on the client. It builds the HTML again and attaches event listeners. This action is called hydration.
5. As React builds the HTML again for the hydration process, it requires the same data collection used on the server (look back at 1.).
6. This data collection is made available by inserting the JSON inside a script tag with the id __NEXT_DATA__.

What Pages Are We Talking About Exactly?

As we need to promote our offerings in search engines, the need for landing pages has arisen. People usually don’t search for a specific bus line’s name, but more like, “How to get from Bangkok to Pattaya?” So far, we have created four types of landing pages that should answer such queries:

City A to City B
All the lines that run from a station in City A to a station in City B. (e.g. Bangkok to Pattaya)
City
All lines that go through a specific city. (e.g. Cancun)
Country
All lines that go through a specific country. (e.g. Italy)
Station
All lines that go through a specific station. (e.g. Hanoi-airport)

Now, A Look At Architecture

Let’s take a high-level and very simplified look at the infrastructure powering the landing pages we are talking about. The interesting parts are steps 4 and 5 of the diagram; that’s where the waste happens.

[Figure: simplified diagram of the landing page infrastructure]

Key Takeaways From The Process

The request hits the getInitialProps function. This function runs on the server, and its responsibility is to fetch the data required to construct a page.
The raw data returned from the REST servers is passed as-is to React.
React first runs on the server. Since the non-aggregated data was transferred to React, React is also responsible for aggregating the data into something that can be used by UI components (more about that in the following sections).
The HTML is sent to the client together with the raw data. React then kicks into play again on the client and does the same job, because hydration is needed (more about that in the following sections). So React does the data aggregation job twice.

The Problem

Analyzing our page creation process led us to find a big JSON embedded inside the HTML. Exactly how big is difficult to say. Each page is slightly different because each station or city has to aggregate a different data set. However, it is safe to say that the JSON could be as big as 250kb on popular pages, and on some pages it was hanging around 200-300kb. That is big. It was later reduced to around 5kb-15kb, a considerable reduction.

The big JSON is embedded inside a script tag with the id __NEXT_DATA__:

<script id="__NEXT_DATA__" type="application/json">
  // Huge JSON here.
</script>

If you want to easily copy this JSON into your clipboard, try this snippet in the browser’s DevTools console on your Next.js page:

copy($('#__NEXT_DATA__').innerHTML)
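To put a number on it, a small helper can report how many bytes of a page’s HTML the embedded JSON accounts for. This is only a sketch: the `measureNextData` name is hypothetical, and the regular expression assumes the script tag is written with the `id` attribute first, as in the example above.

```javascript
// Extract the JSON that Next.js embeds in <script id="__NEXT_DATA__"> from an
// HTML string and report how much of the document it accounts for.
function measureNextData(html) {
  const match = html.match(
    /<script id="__NEXT_DATA__"[^>]*>([\s\S]*?)<\/script>/
  );
  const encoder = new TextEncoder();
  const htmlBytes = encoder.encode(html).length;
  const jsonBytes = match ? encoder.encode(match[1]).length : 0;
  return {
    htmlBytes,
    jsonBytes,
    // Share of the HTML payload taken up by the embedded JSON (0 to 1).
    share: htmlBytes ? jsonBytes / htmlBytes : 0,
  };
}
```

The function can be run against the HTML string returned by fetching a page, or against `document.documentElement.outerHTML` in the browser console.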

A question arises.

Why Is It So Big? What’s In There?

A great tool, JSON Size analyzer, knows how to process a JSON and shows where most of the bulk resides.
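The same idea can be approximated in a few lines if you just want a quick look. The sketch below is not how that tool works internally; it simply serializes each top-level key and sorts the keys by size. The `sizeByKey` name is hypothetical.

```javascript
// Report the serialized size of each top-level key, largest first.
function sizeByKey(obj) {
  const encoder = new TextEncoder();
  return Object.entries(obj)
    .map(([key, value]) => ({
      key,
      // JSON.stringify returns undefined for values such as functions,
      // so those are counted as 0 bytes.
      bytes: encoder.encode(JSON.stringify(value) ?? "").length,
    }))
    .sort((a, b) => b.bytes - a.bytes);
}

console.log(sizeByKey({ a: "x", b: [1, 2, 3], c: { d: "hello" } }));
// c is 13 bytes, b is 7 bytes, a is 3 bytes
```

Calling it on a nested object (for example, the page props inside the parsed JSON) narrows the search one level at a time.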

Our initial findings from examining a station page revealed two issues:

Data is not aggregated.
Our HTML contains the complete list of granular products. We don’t need them for painting on screen, but we do need them for aggregation. For example, we fetch a list of all the lines passing through this station. Each line has a supplier, but we only need to reduce the list of lines into an array of 2 suppliers. That’s it. We’ll see an example later.
Unnecessary fields.
When drilling down into each object, we saw some fields we don’t need at all, neither for aggregation nor for painting. That’s because we fetch the data from a REST API, so we can’t control what data we get.

Those two issues showed that the pages needed an architecture change. But wait. Why do we need a data JSON embedded in our HTML in the first place? 🤔

Architecture Change

The issue of the very big JSON had to be solved with a neat, layered solution. How? Well, by adding a few layers to the architecture.

[Figure: updated architecture diagram, with the added layers marked in green]

A few things to note:

Double data aggregation was removed and consolidated so that it happens just once, on the Next.js server only;
GraphQL server layer added. That makes sure we get only the fields we want. The database can grow with many more fields for each entity, but that won’t affect us anymore;
PageLogic function added in getServerSideProps. This function gets non-aggregated data from back-end services, then aggregates and prepares the data for the UI components. (It runs only on the server.)

Data Flow Example

We want to render a section of a station page that shows the suppliers operating in that station. To do that, we need to fetch all lines from the lines REST endpoint. That’s the response we got (for example purposes; in reality, it was much larger):

[
  {
    id: "58a8bd82b4869b00063b22d2",
    class: "Standard",
    supplier: "Hyatt-Mosciski",
    type: "bus",
  },
  {
    id: "58f5e40da02e97f000888e07a",
    class: "Luxury",
    supplier: "Hyatt-Mosciski",
    type: "bus",
  },
  {
    id: "58f5e4a0a02e97f000325e3a",
    class: "Luxury",
    supplier: "Jones Ltd",
    type: "minivan",
  },
];

As you can see, we got some irrelevant fields. Fields like pictures and id are not going to play any role in the section. So we’ll call the GraphQL server and request only the fields we need. Now it looks like this:

[
  {
    supplier: "Hyatt-Mosciski",
    type: "bus",
  },
  {
    supplier: "Hyatt-Mosciski",
    type: "bus",
  },
  {
    supplier: "Jones Ltd",
    type: "minivan",
  },
];

Now that’s an easier object to work with. It is smaller, easier to debug, and takes less memory on the server. But, it is not aggregated yet. This is not the data structure required for the actual rendering.
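For illustration, a request for just those two fields might look like the sketch below. The schema here is an assumption: the `lines` field, the `stationId` argument, and the `StationLines` operation name are hypothetical and not taken from Bookaway’s actual API. Only the `{ query, variables }` request body shape is the usual GraphQL-over-HTTP convention.

```javascript
// Hypothetical query: `lines`, `stationId`, and `StationLines` are assumed names.
const STATION_LINES_QUERY = `
  query StationLines($stationId: ID!) {
    lines(stationId: $stationId) {
      supplier
      type
    }
  }
`;

// Build the JSON body for a GraphQL-over-HTTP POST request.
function buildStationLinesRequest(stationId) {
  return JSON.stringify({
    query: STATION_LINES_QUERY,
    variables: { stationId },
  });
}
```

Because the query names only `supplier` and `type`, new fields added to the underlying entity never reach the page unless the query is changed to ask for them.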

Let’s send it to the PageLogic function to crunch it and see what we get:

[
  { supplier: "Hyatt-Mosciski", amountOfLines: 2, types: ["bus"] },
  { supplier: "Jones Ltd", amountOfLines: 1, types: ["minivan"] },
];

This small data collection is sent to the Next.js page.

Now that’s ready-made for UI rendering. No more crunching and preparations are needed. Also, it is now very compact compared to the initial data collection we have extracted. That’s important because we’ll be sending very little data to the client that way.
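The article does not show the PageLogic function itself, so the following is only a sketch of the kind of aggregation it would need to perform for this section. The `aggregateSuppliers` name is hypothetical. It groups the trimmed lines by supplier, counts them, and collects the distinct vehicle types, producing the compact structure shown above.

```javascript
// Group lines by supplier, counting lines and collecting distinct types.
function aggregateSuppliers(lines) {
  const bySupplier = new Map();
  for (const { supplier, type } of lines) {
    let entry = bySupplier.get(supplier);
    if (!entry) {
      entry = { supplier, amountOfLines: 0, types: [] };
      bySupplier.set(supplier, entry);
    }
    entry.amountOfLines += 1;
    if (!entry.types.includes(type)) {
      entry.types.push(type);
    }
  }
  // A Map preserves insertion order, so suppliers appear in first-seen order.
  return [...bySupplier.values()];
}

const lines = [
  { supplier: "Hyatt-Mosciski", type: "bus" },
  { supplier: "Hyatt-Mosciski", type: "bus" },
  { supplier: "Jones Ltd", type: "minivan" },
];

console.log(aggregateSuppliers(lines));
```

Running this kind of function only inside getServerSideProps means the client receives the two aggregated rows rather than every line.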

How To Measure The Impact Of The Change

Reducing HTML size means there are fewer bits to download. When a user requests a page, they get the fully formed HTML in less time. This can be measured as the “content download” time of the HTML resource in the network panel.

Conclusions

Delivering thin resources is essential, especially when it comes to HTML. If HTML is turning out big, we have no room left for CSS resources or JavaScript in our performance budget.

It is best practice to assume many real-world users won’t be using an iPhone 12, but rather a mid-level device on a mid-level network. It turns out that the performance budget is pretty tight, as a highly regarded article suggests:

“Thanks to progress in networks and browsers (but not devices), a more generous global budget cap has emerged for sites constructed the “modern” way. We can now afford ~100KiB of HTML/CSS/fonts and ~300-350KiB of JS (gzipped). This rule-of-thumb limit should hold for at least a year or two. As always, the devil’s in the footnotes, but the top-line is unchanged: when we construct the digital world to the limits of the best devices, we build a less usable one for 80+% of the world’s users.”

Performance Impact

We measure the performance impact by the time it takes to download the HTML under slow 3G throttling. That metric is called “Content Download” in Chrome DevTools.

Here’s a metric example for a station page:

	HTML size (before gzip)	HTML download time (slow 3G)
Before	370kb	820ms
After	166kb	540ms
Total change	204kb decrease	34% decrease
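
The table reports the size change in kilobytes and the time change as a percentage. To compare the two on the same scale, the arithmetic can be sketched with a small helper (percentDecrease is our own name, used only for illustration):

```javascript
// Relative decrease between two measurements, as a percentage.
function percentDecrease(before, after) {
  return ((before - after) / before) * 100;
}

const sizeDrop = percentDecrease(370, 166); // HTML size in kb, before gzip
const timeDrop = percentDecrease(820, 540); // download time in ms on slow 3G

console.log(`${Math.round(sizeDrop)}% smaller, ${Math.round(timeDrop)}% faster`);
```

In other words, the uncompressed HTML shrank by roughly 55%, while the download time dropped by 34%, so the two do not move in lockstep.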

Layered Solution

The architecture changes included additional layers:

GraphQL server: helps with fetching exactly what we want.
Dedicated function for aggregation: runs only on the server.

Those changes, apart from pure performance improvements, also offered much better code organization and debugging experience:

All the logic for reducing and aggregating data is now centralized in a single function;
The UI functions are now much more straightforward. No aggregation, no data crunching. They are just getting data and painting it;
Debugging server code is more pleasant since we extract only the data we need—no more unnecessary fields coming from a REST endpoint.

Articles on Smashing Magazine — For Web Designers And Developers

06 May

Reducing HTML Payload With Next.js (Case Study)

by TBS | Categories: News

I know what you are thinking. Here’s another article about reducing JavaScript dependencies and the bundle size sent to the client. But this one is a bit different, I promise.

In this article, I’ll explain:

how we found that the HTML size was too big;
how it got reduced;
the benefits of this process (i.e. creating improved architecture, improving code organization, providing a straightforward job for Google to index tens of thousands of landing pages, and serving far fewer bytes to the client, which especially suits people with slow connections).

But first, let’s talk about the importance of speed improvement.

Why Is Speed Improvement Necessary To Our SEO Efforts?

Meet “Web Vitals”, but in particular, meet LCP (Largest Contentful Paint):

“Largest Contentful Paint (LCP) is an important, user-centric metric for measuring perceived load speed because it marks the point in the page load timeline when the page’s main content has likely loaded — a fast LCP helps reassure the user that the page is useful.”

While LCP is a user-centric metric, reducing it should also be a big help to Google’s bots, as Google states:

“The web is a nearly infinite space, exceeding Google’s ability to explore and index every available URL. As a result, there are limits to how much time Googlebot can spend crawling any single site. Google’s amount of time and resources to crawling a site is commonly called the site’s crawl budget.”

— “Advanced SEO,” Google Search Central Documentation

One of the best technical ways to improve the crawl budget is to help Google do more in less time:

Q: “Does site speed affect my crawl budget? How about errors?”

A: “Making a site faster improves the users’ experience while also increasing the crawl rate. For Googlebot, a speedy site is a sign of healthy servers so that it can get more content over the same number of connections.”

While investigating ways to improve, we found a big JSON embedded in our HTML, making the HTML chunky. To see why it is there, we need to understand React hydration.

React Hydration: Why There Is A JSON In HTML

That happens because of how server-side rendering works in React and Next.js:

1. When the request arrives at the server, it needs to build the HTML based on a data collection. That collection of data is the object returned by getServerSideProps.
2. React gets the data and kicks into play on the server. It builds the HTML and sends it.
3. When the client receives the HTML, it is immediately painted in front of the user. In the meantime, the React JavaScript is being downloaded and executed.
4. When JavaScript execution is done, React kicks into play again, now on the client. It builds the HTML again and attaches event listeners. This action is called hydration.
5. As React builds the HTML again for the hydration process, it requires the same data collection used on the server (look back at step 1).
6. This data collection is made available by inserting the JSON inside a script tag with id __NEXT_DATA__.
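
The steps above can be pictured with a toy model. This is not how Next.js is implemented internally; renderDocument and readEmbeddedProps are hypothetical helpers, and the payload shape is simplified (the real __NEXT_DATA__ carries more fields and escapes the JSON). It only shows why the same props travel twice: once baked into the markup and once as JSON for hydration.

```javascript
// Toy model of the server side: markup plus the props serialized next to it.
function renderDocument(markup, pageProps) {
  const payload = JSON.stringify({ props: { pageProps } });
  return (
    `<div id="__next">${markup}</div>` +
    `<script id="__NEXT_DATA__" type="application/json">${payload}</script>`
  );
}

// Toy model of the client side: read the props back before hydrating.
function readEmbeddedProps(html) {
  const match = html.match(/<script id="__NEXT_DATA__"[^>]*>([\s\S]*?)<\/script>/);
  return match ? JSON.parse(match[1]).props.pageProps : null;
}

const html = renderDocument("<h1>Hanoi-airport</h1>", { suppliers: ["Jones Ltd"] });
const hydratedProps = readEmbeddedProps(html);
```

Whatever is returned as props ends up in the HTML a second time, so every unused field inflates the document.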

What Pages Are We Talking About Exactly?

City A to City B
All the lines stretched from a station in City A to a station in City B. (e.g. Bangkok to Pattaya)
City
All lines that go through a specific city. (e.g. Cancun)
Country
All lines that go through a specific country. (e.g. Italy)
Station
All lines that go through a specific station. (e.g. Hanoi-airport)

Now, A Look At Architecture

Let’s take a high-level and very simplified look at the infrastructure powering the landing pages we are talking about. The interesting parts lie at 4 and 5; that’s where the waste happens:

Key Takeaways From The Process

The request hits the getInitialProps function. This function runs on the server. Its responsibility is to fetch the data required for the construction of a page.
The raw data returned from the REST servers is passed as is to React.
React first runs on the server. Since the non-aggregated data was transferred to React, React is also responsible for aggregating the data into something that can be used by UI components (more about that in the following sections).
The HTML is sent to the client, together with the raw data. Then React kicks into play again on the client and does the same job, because hydration is needed (more about that in the following sections). So React is doing the data aggregation job twice.

The Problem

The big JSON is embedded inside a script tag with the id __NEXT_DATA__:

<script id="__NEXT_DATA__" type="application/json"> // Huge JSON here. </script>

If you want to easily copy this JSON into your clipboard, try this snippet in the browser DevTools console on your Next.js page:

copy($('#__NEXT_DATA__').innerHTML)
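
Before digging into the JSON itself, it helps to know how much of the document it accounts for. The following is a minimal sketch under our own assumptions: nextDataShare is a hypothetical helper that takes the page’s HTML as a string and compares the byte size of the embedded JSON with the byte size of the whole document.

```javascript
// Hypothetical helper: how many bytes of the HTML belong to the __NEXT_DATA__ JSON?
function nextDataShare(html) {
  const match = html.match(/<script id="__NEXT_DATA__"[^>]*>([\s\S]*?)<\/script>/);
  const encoder = new TextEncoder(); // counts UTF-8 bytes rather than characters
  const htmlBytes = encoder.encode(html).length;
  const jsonBytes = match ? encoder.encode(match[1]).length : 0;
  return { htmlBytes, jsonBytes, share: jsonBytes / htmlBytes };
}

// In a browser console, on the page being measured:
// nextDataShare(document.documentElement.outerHTML);
```

A share close to 1 means most of the HTML weight is hydration data rather than markup.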

A question arises.

Why Is It So Big? What’s In There?

A great tool, JSON Size analyzer, knows how to process a JSON and shows where most of the bulk of size resides.
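
The idea behind such a tool can be approximated in a few lines. This sketch is not the analyzer’s actual code; sizeByKey is a hypothetical helper that measures how many bytes each top-level key contributes once serialized, largest first.

```javascript
// Hypothetical helper: serialized byte size of each top-level key, largest first.
function sizeByKey(value) {
  const encoder = new TextEncoder();
  return Object.entries(value)
    .map(([key, child]) => ({
      key,
      bytes: encoder.encode(JSON.stringify(child)).length,
    }))
    .sort((a, b) => b.bytes - a.bytes);
}

// In a browser console, on a Next.js page:
// sizeByKey(JSON.parse(document.getElementById("__NEXT_DATA__").textContent).props.pageProps);
```

Running it again on the largest child is a quick way to drill down to the fields that carry most of the weight.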

Those were our initial findings while examining a station page:

The analysis revealed two issues:

Data is not aggregated.
Our HTML contains the complete list of granular products. We don’t need them for painting on-screen purposes. We do need them for aggregation methods. For example, we are fetching a list of all the lines passing through this station. Each line has a supplier. But we need to reduce the list of lines into an array of 2 suppliers. That’s it. We’ll see an example later.
Unnecessary fields.
When drilling down into each object, we saw some fields we don’t need at all, neither for aggregation purposes nor for painting methods. That’s because we fetch the data from a REST API, so we can’t control which fields we get.

Those two issues showed that the pages needed an architecture change. But wait. Why do we need a data JSON embedded in our HTML in the first place? 🤔

Architecture Change

The issue of the very big JSON had to be solved with a neat, layered solution. How? Well, by adding the layers marked in green in the following diagram:

A few things to note:

Double data aggregation was removed; aggregation now happens just once, on the Next.js server only;
A GraphQL server layer was added. That makes sure we get only the fields we want. The database can grow with many more fields for each entity, but that won’t affect us anymore;
A PageLogic function was added in getServerSideProps. This function gets non-aggregated data from back-end services, then aggregates and prepares the data for the UI components. (It runs only on the server.)
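
Put together, the new layers could be wired up roughly like this. Treat it as a sketch under stated assumptions: the query, its field names, the stationId parameter, and the helpers requestGraphql and makeGetServerSideProps are all hypothetical. Only the shape of getServerSideProps (receiving a context and returning an object with props) follows the Next.js contract.

```javascript
// Hypothetical query: ask the GraphQL layer for just the two fields the section needs.
const LINES_QUERY = `
  query StationLines($stationId: ID!) {
    lines(stationId: $stationId) {
      supplier
      type
    }
  }
`;

// The GraphQL client and the aggregation function are injected so that the
// sketch stays self-contained and easy to test.
function makeGetServerSideProps({ requestGraphql, pageLogic }) {
  return async function getServerSideProps(context) {
    const data = await requestGraphql(LINES_QUERY, {
      stationId: context.params.stationId,
    });
    // Aggregate on the server so only the compact result reaches the page props.
    return { props: { suppliers: pageLogic(data.lines) } };
  };
}
```

In a page file, the returned function would be exported as getServerSideProps. Since only the aggregated result is returned as props, only that result is serialized into the HTML.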

Data Flow Example

We want to render this section from a station page:

[
  {
    id: "58a8bd82b4869b00063b22d2",
    class: "Standard",
    supplier: "Hyatt-Mosciski",
    type: "bus",
  },
  {
    id: "58f5e40da02e97f000888e07a",
    class: "Luxury",
    supplier: "Hyatt-Mosciski",
    type: "bus",
  },
  {
    id: "58f5e4a0a02e97f000325e3a",
    class: "Luxury",
    supplier: "Jones Ltd",
    type: "minivan",
  },
];

[
  { supplier: "Hyatt-Mosciski", amountOfLines: 2, types: ["bus"] },
  { supplier: "Jones Ltd", amountOfLines: 1, types: ["minivan"] },
];

As you can see, we got some irrelevant fields: pictures and id are not going to play any role in the section. So we’ll call the GraphQL server and request only the fields we need. Now it looks like this:

[
  {
    supplier: "Hyatt-Mosciski",
    type: "bus",
  },
  {
    supplier: "Hyatt-Mosciski",
    type: "bus",
  },
  {
    supplier: "Jones Ltd",
    type: "minivan",
  },
];

Let’s send it to the PageLogic function to crunch it and see what we get:

[   { supplier: "Hyatt-Mosciski", amountOfLines: 2, types: ["bus"] },   { supplier: "Jones Ltd", amountOfLines: 1, types: ["minivan"] }, ];

This small data collection is sent to the Next.js page.

How To Measure The Impact Of The Change

Reducing HTML size means there are fewer bytes to download. When a user requests a page, they get the fully formed HTML in less time. This can be measured as the “Content Download” time of the HTML resource in the network panel.

Conclusions

Delivering thin resources is essential, especially when it comes to HTML. If the HTML turns out big, we have no room left for CSS resources or JavaScript in our performance budget.

“Thanks to progress in networks and browsers (but not devices), a more generous global budget cap has emerged for sites constructed the “modern” way. We can now afford ~100KiB of HTML/CSS/fonts and ~300-350KiB of JS (gzipped). This rule-of-thumb limit should hold for at least a year or two. As always, the devil’s in the footnotes, but the top-line is unchanged: when we construct the digital world to the limits of the best devices, we build a less usable one for 80+% of the world’s users.”

Performance Impact

We measure the performance impact by the time it takes to download the HTML under slow 3G throttling. That metric is called “Content Download” in Chrome DevTools.

Here’s a metric example for a station page:

	HTML size (before gzip)	HTML download time (slow 3G)
Before	370kb	820ms
After	166kb	540ms
Total change	204kb decrease	34% decrease

Layered Solution

The architecture changes included additional layers:

GraphQL server: helps with fetching exactly what we want.
Dedicated function for aggregation: runs only on the server.

Those changes, apart from pure performance improvements, also offered much better code organization and debugging experience:

All the logic for reducing and aggregating data is now centralized in a single function;
The UI functions are now much more straightforward. No aggregation, no data crunching. They are just getting data and painting it;
Debugging server code is more pleasant since we extract only the data we need—no more unnecessary fields coming from a REST endpoint.

Articles on Smashing Magazine — For Web Designers And Developers

31 Mar

Choosing A New Serverless Database Technology At An Agency (Case Study)

by TBS | Categories: News

Adopting a new technology is one of the hardest decisions for a technologist in a leadership role. This is often a large and uncomfortable area of risk, whether you are building software for another organization or within your own.

Over the last twelve years as a software engineer, I’ve found myself having to evaluate new technology with increasing frequency. This may be the next frontend framework, a new language, or even entirely new architectures like serverless.

The experimentation phase is often fun and exciting. It is where software engineers are most at home, embracing the novelty and euphoria of “aha” moments while grokking new concepts. As engineers, we like to think and tinker, but with enough experience, every engineer learns that even the most incredible technology has its blemishes. You just haven’t found them yet.

Now, as the co-founder of a creative agency, my team and I are often in a unique position to use new technologies. We see many greenfield projects, which become the perfect opportunity to introduce something new. These projects also see a level of technical isolation from the larger organization and are often less burdened by prior decisions.

That being said, a good agency lead is entrusted to care for someone else’s big idea and deliver it to the world. We have to treat it with even more care than we would our own projects. Whenever I’m about to make the final call on a new technology, I often ponder this piece of wisdom from Stack Overflow co-founder Joel Spolsky:

“You have to sweat and bleed with the thing for a year or two before you really know it’s good enough or realize that no matter how hard you try you can’t…”

This is the fear; this is the place where no tech lead wants to find themselves. Choosing a new technology for a real-world project is hard enough, but as an agency, you have to make these decisions with someone else’s project, someone else’s dream, someone else’s money. At an agency, the last thing you want is to find one of those blemishes near the deadline for a project. Tight timelines and budgets make it nearly impossible to reverse course after a certain threshold is crossed, so finding out too late into a project that a technology can’t do something critical, or is unreliable, can be catastrophic.

Throughout my career as a software engineer, I’ve worked at SaaS companies and creative agencies. When it comes to adopting a new technology for a project these two environments have very different criteria. There is overlap in criteria, but by and large, the agency environment has to work with rigid budgets and rigorous time constraints. While we want the products we build to age well over time, it’s often more difficult to make investments in something less proven or to adopt technology with steeper learning curves and rough edges.

That being said, agencies also have some unique constraints that a single organization may not have. We have to bias for efficiency and stability. The billable hour is often the final unit of measurement when a project is complete. I’ve been at SaaS companies where spending a day or two on setup or a build pipeline is no big deal.

At an agency, this type of time cost puts strain on relationships as finance teams see narrowing profit margins for little visible result. We also have to consider the long-term maintenance of a project and, conversely, what happens if a project needs to be handed back to the client. We therefore must bias for efficiency, learning curve, and stability in the technology we choose.

When evaluating a new piece of technology I look at three overarching areas:

The Technology
The Developer Experience
The Business

Each of these areas has a set of criteria I like met before I start really diving into the code and experimenting. In this article, we’ll take a look at these criteria and use the example of considering a new database for a project and review it at a high level under each lens. Taking a tangible decision like this will help demonstrate how we can apply this framework in the real world.

The Technology

The very first thing to take a look at when evaluating a new technology is if that solution can solve the problems it claims to solve. Before diving into how a technology can help our process and business operations, it’s important to first establish that it is meeting our functional requirements. This is also where I like to take a look at what existing solutions we are using and how this new one stacks up against them.

I’ll ask myself questions like:

Does it at a minimum solve the problem my existing solution does?
In what ways is this solution better?
In what ways is it worse?
For areas where it is worse, what will it take to overcome those shortcomings?
Will it take the place of multiple tools?
How stable is the technology?

Our Why?

At this point, I also want to review why we are seeking another solution. A simple answer is we are encountering a problem that existing solutions don’t solve. However, this is rarely the case. We have solved many software problems over the years with all of the technology we have today. What typically happens is that we get turned onto a new technology that makes something we are currently doing easier, more stable, faster, or cheaper.

Let’s take React as an example. Why did we decide to adopt React when jQuery or Vanilla JavaScript was doing the job? In this case, using the framework highlighted how this was a much better way to handle stateful frontends. It became faster for us to build things like filtering and sorting features by working with data structures instead of direct DOM manipulation. This was a saving in time and increased stability of our solutions.
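To make the contrast concrete, here is a minimal sketch of the data-first approach, written in plain TypeScript rather than React itself. The `Product` shape, the `visibleProducts` helper, and the sample catalog are all hypothetical names invented for illustration. The point is that filtering and sorting become a pure function over an array, and the view simply renders whatever that function returns.

```typescript
// Hypothetical product shape, used only for illustration.
interface Product {
  name: string;
  category: string;
  price: number;
}

type SortKey = "name" | "price";

// Derive the visible list from state instead of hiding or reordering DOM nodes.
// The input array is never mutated.
function visibleProducts(
  products: readonly Product[],
  category: string | null,
  sortKey: SortKey
): Product[] {
  const filtered =
    category === null
      ? [...products]
      : products.filter((p) => p.category === category);
  return filtered.sort((a, b) =>
    sortKey === "price" ? a.price - b.price : a.name.localeCompare(b.name)
  );
}

const catalog: Product[] = [
  { name: "Mug", category: "kitchen", price: 12 },
  { name: "Lamp", category: "office", price: 40 },
  { name: "Kettle", category: "kitchen", price: 35 },
];

// Kitchen items ordered by price: Mug, then Kettle.
const kitchenByPrice = visibleProducts(catalog, "kitchen", "price");
console.log(kitchenByPrice.map((p) => p.name));
```

In a component, the filter and sort key would live in state, and the rendered list would be recomputed from them on every change, so there is no separate DOM state to keep in sync.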

TypeScript is another example: we adopted it because it made our code more stable and maintainable. With new technologies, there often isn’t a clear problem we are looking to solve; rather, we are looking to stay current and, in the process, discover more efficient and stable solutions than the ones we are currently using.

In the case of a database, we were specifically considering moving to a serverless option. We had seen a lot of success with serverless applications and deployments reducing our overhead as an organization. One area where we felt this was lacking was our data layer. We saw services like Amazon Aurora, Fauna, Cosmos and Firebase that were applying serverless principles to databases and wanted to see if it was time to take the leap ourselves. In this case, we were looking to lower our operational overhead and increase our development speed and efficiency.

It’s important at this level to understand your why before you start diving into new offerings. This may be because you are solving a novel problem, but far more often you are looking to improve your ability to solve a type of problem you are already solving. In that case, you need to take inventory of where you have been to figure out what would provide a meaningful improvement to your workflow. Building upon our example of looking at serverless databases, we’ll need to take a look at how we are currently solving problems and where those solutions fall short.

Where we have been…

As an agency, we have previously used a wide range of databases including but not limited to MySQL, PostgreSQL, MongoDB, DynamoDB, BigQuery, and Firebase Cloud Storage. The vast majority of our work centered around three core databases though: PostgreSQL, MongoDB, and Firebase Realtime Database. Each one of these does, in fact, have semi-serverless offerings, but some key features of newer offerings had us re-evaluating our previous assumptions. Let’s take a look at our historical experience with each of these first and why we are left considering alternatives in the first place.

We typically chose PostgreSQL for larger, long-term projects, as this is the battle-tested gold standard for almost everything. It supports classic transactions, normalized data, and is ACID compliant. There are a wealth of tools and ORMs available in almost every language and it can even be used as an ad-hoc NoSQL database with its JSON column support. It integrates well with many existing frameworks, libraries and programming languages making it a true go-anywhere workhorse. It is also open-source and therefore doesn’t get us locked into any one vendor. As they say, nobody ever got fired for choosing Postgres.

That being said, we have gradually found ourselves using PostgreSQL less and less as we became more of a Node-oriented shop. We found the ORMs for Node lackluster and in need of more custom queries (although this has become less problematic now), and NoSQL felt like a more natural fit when working in a JavaScript or TypeScript runtime. Even so, we often had projects, like e-commerce workflows, that could be done quite quickly with classic relational modeling. However, dealing with the local setup of the database, unifying the testing flow across teams, and dealing with local migrations were things we didn’t love and were happy to leave behind as cloud-based NoSQL databases became more popular.

MongoDB was increasingly our go-to database as we adopted Node.js as our preferred back end. Working with MongoDB Atlas made it easy to have quick development and testing databases that our team could use. For a while, MongoDB was not ACID compliant, didn’t support transactions, and discouraged too many inner join-like operations, thus for e-commerce applications we still were using Postgres most often. That being said, there are a wealth of libraries that go with it and Mongo’s query language and first-class JSON support gave us speed and efficiency we had not experienced with relational databases. MongoDB has added support for ACID transactions recently, but for a long time, this was the chief reason we would opt for Postgres instead.

MongoDB also introduced us to a new level of flexibility. In the middle of an agency project, requirements are bound to change. No matter how hard you defend against it, there is always a last-minute data requirement. With NoSQL databases, in general, the flexibility of the data structure made those types of changes less harsh. We didn’t end up with a folder full of migration files to manage that added and removed and added columns again before a project even saw daylight.

As a service, Mongo Atlas was also pretty close to what we desired in a database cloud service. I like to think of Atlas as a semi-serverless offering since you still have some operational overhead in managing it. You have to provision a certain size database and select an amount of memory upfront. These things will not scale for you automatically so you will need to monitor it for when it is time to provide more space or memory. In a truly serverless database, this would all happen automatically and on-demand.

We also utilized Firebase Realtime Database for a few projects. This was indeed a serverless offering where the database scales up and down on-demand, and with pay-as-you-go pricing, it made sense for applications where the scale was not known upfront and the budget was limited. We used this instead of MongoDB for short-lived projects that had simple data requirements.

One thing we did not enjoy about Firebase was that it felt further from the typical relational model built around normalized data that we were used to. Keeping the data structures flat meant we often had more duplication, which could turn a bit ugly as a project grew. You end up having to update the same data in multiple places or trying to join together different references, resulting in multiple queries that can become hard to reason about in the code. While we liked Firebase, we never really fell in love with the query language and sometimes found the documentation to be lackluster.
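As a rough illustration of that duplication cost, the sketch below models a flat, denormalized store as plain objects and collects every path that has to change when a single user is renamed. It does not use the Firebase SDK. The `FlatStore` layout, the path strings, and the `renameUserUpdates` helper are hypothetical and exist only to show the shape of the problem.

```typescript
// Hypothetical flat layout: the author's display name is copied into every
// post so that a post can be rendered with a single read.
interface FlatStore {
  users: Record<string, { name: string }>;
  posts: Record<string, { authorId: string; authorName: string; title: string }>;
}

// Collect every path that must change when one user is renamed.
function renameUserUpdates(
  store: FlatStore,
  userId: string,
  newName: string
): Record<string, string> {
  const updates: Record<string, string> = {
    [`users/${userId}/name`]: newName,
  };
  for (const [postId, post] of Object.entries(store.posts)) {
    if (post.authorId === userId) {
      updates[`posts/${postId}/authorName`] = newName;
    }
  }
  return updates;
}

const flatStore: FlatStore = {
  users: { u1: { name: "Ada" }, u2: { name: "Grace" } },
  posts: {
    p1: { authorId: "u1", authorName: "Ada", title: "First" },
    p2: { authorId: "u2", authorName: "Grace", title: "Second" },
    p3: { authorId: "u1", authorName: "Ada", title: "Third" },
  },
};

// One logical change touches three paths: the user record plus two posts.
const renameUpdates = renameUserUpdates(flatStore, "u1", "Ada L.");
console.log(Object.keys(renameUpdates));
```

The number of writes grows with the number of copies, and forgetting one of them leaves the data inconsistent, which is the kind of application-layer bookkeeping a normalized model avoids.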

In general, both MongoDB and Firebase had a similar focus on denormalized data, and without access to efficient transactions, we often found that workflows that were easy to model in relational databases led to more complex code at the application layer with their NoSQL counterparts. If we could get the flexibility and ease of these NoSQL offerings with the robustness and relational modeling of a traditional SQL database, we would really have found a great match. We felt MongoDB had the better API and capabilities, but Firebase had the truly serverless model operationally.

Our Ideal

At this point, we can start looking at what new options we will consider. We’ve clearly defined our previous solutions and we’ve identified the things that are important for us to have at a minimum in our new solution. We not only have a baseline or minimum set of requirements, but we also have a set of problems that we’d like the new solution to alleviate for us. Here are the technical requirements we have:

Serverless operationally with on-demand scale
Flexible modeling (schemaless)
No reliance on migrations or ORMs
ACID compliant transactions
Supports relationships and normalized data
Works with both serverless and traditional backends

So now that we have a list of must-haves we can actually evaluate some options. It may not be important that the new solution nails every target here. It may just be that it hits the right combination of features where existing solutions are not overlapping. For instance, if you wanted schemaless flexibility, you had to give up ACID transactions. (This was the case for a long time with databases.)

An example from another domain: if you want TypeScript validation in your template rendering, you need to be using TSX and React. If you go with options like Svelte or Vue, you can have this partially, but not completely, through the template rendering. So a solution that gave you the tiny footprint and speed of Svelte with the template-level type checking of React and TypeScript could be enough for adoption even if it were missing another feature. The balance of wants and needs is going to change from project to project. It is up to you to figure out where the value is going to be and decide how to tick the most important points in your analysis.
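One way to keep that balancing act honest is to write the must-haves down as a weighted checklist and score each candidate against it. The sketch below is only an illustration: the requirement names mirror our list above, but the weights and the two candidates ("Option A" and "Option B") are made up and do not describe any real database.

```typescript
// Requirement names mirror the must-have list; the weights are illustrative
// assumptions and should change from project to project.
const requirementWeights = {
  serverlessScale: 3,
  schemaless: 2,
  noMigrationsOrOrms: 1,
  acidTransactions: 3,
  relationalModeling: 2,
  worksWithAnyBackend: 2,
} as const;

type Requirement = keyof typeof requirementWeights;

interface Candidate {
  name: string;
  meets: ReadonlySet<Requirement>;
}

// Sum the weights of the requirements a candidate satisfies.
function score(candidate: Candidate): number {
  let total = 0;
  for (const req of Object.keys(requirementWeights) as Requirement[]) {
    if (candidate.meets.has(req)) {
      total += requirementWeights[req];
    }
  }
  return total;
}

// Hypothetical candidates, not real products.
const optionA: Candidate = {
  name: "Option A",
  meets: new Set<Requirement>(["serverlessScale", "schemaless", "noMigrationsOrOrms"]),
};
const optionB: Candidate = {
  name: "Option B",
  meets: new Set<Requirement>([
    "serverlessScale",
    "acidTransactions",
    "relationalModeling",
    "worksWithAnyBackend",
  ]),
};

// Highest score first.
const ranked = [optionA, optionB].sort((a, b) => score(b) - score(a));
for (const candidate of ranked) {
  console.log(`${candidate.name}: ${score(candidate)}`);
}
```

A score like this is a conversation starter rather than a verdict. A candidate that misses a heavily weighted requirement may still be ruled out no matter how many smaller boxes it ticks.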

We can now take a look at a solution and see how it measures up against our desired solution. Fauna is a serverless database solution that boasts on-demand scale with global distribution. It is a schemaless database that provides ACID-compliant transactions and supports relational queries and normalized data as a feature. Fauna can be used in both serverless applications as well as more traditional backends and provides libraries to work with the most popular languages. Fauna additionally provides workflows for authentication as well as easy and efficient multi-tenancy. These are both solid additional features to note because they could be the swaying factors when two technologies are nose to nose in our evaluation.

Now after looking at all of these strengths we have to evaluate the weaknesses. One is that Fauna is not open source. This does mean that there are risks of vendor lock-in, or business and pricing changes that are out of your control. Open source can be nice because you can often up and take the technology to another vendor if you please or potentially contribute back to the project.

In the agency world, vendor lock-in is something we have to watch closely, not so much because of the price but because the viability of the underlying business is important. Having to change databases on a project, whether it is in the middle of development or a few years old, is disastrous for an agency. Often a client will have to foot the bill for this, which is not a pleasant conversation to have.

One other weakness we were concerned with is the focus on JAMstack. While we love JAMstack, we find ourselves building a wide variety of traditional web applications more often. We want to be sure that Fauna continues to support those use cases. We had a bad experience in the past with a hosting provider that went all-in on JAMstack and we ended up having to migrate a rather large swath of sites from the service, so we want to feel confident that all use cases will continue to see solid support. Right now, this seems to be the case, and the serverless workflows provided by Fauna actually can complement a more traditional application quite nicely.

At this point, we’ve done our functional research and the only way to know if this solution is viable is to get down and write some code. In an agency environment, we can’t just take weeks out of the schedule for people to evaluate multiple solutions. This is the nature of working in an agency vs. a SaaS environment. In the latter, you might build a few prototypes to try to get to the right solution. In an agency, you will get a few days to experiment, or maybe the opportunity to do a side project but by and large we really have to narrow this down to one or two technologies at this stage and then put the fingers to the keyboard.

The Developer Experience

Judging the experience side of a new technology is perhaps the most difficult of the three areas since it is by nature subjective. It will also have variability from team to team. For example, if you asked a Ruby programmer, a Python programmer, and a Rust programmer about their opinions on different language features, you will get quite an array of responses. So, before you begin to judge an experience, you must first decide what characteristics are most important to your team overall.

For agencies I think there are two major bottlenecks that come up with regard to developer experience:

Setup time and configuration
Learnability

Both of these affect the long-term viability of a new technology in different ways. Keeping transient teams of developers in sync at an agency can be a headache. Tools that have lots of upfront setup costs and configurations are notoriously difficult for agencies to work with. The other is learnability and how easy it is for developers to grow with the new technology. We’ll go into these in more detail and why they are my base when starting to evaluate developer experience.

Setup Time And Configuration

Agencies tend to have little patience and time for configuration. For me, I love sharp tools, with ergonomic designs, that allow me to get to work on the business problem at hand quickly. A number of years ago I worked for a SaaS company that had a complex local setup that involved many configurations and often failed at random points in the setup process. Once you were set up, the conventional wisdom was not to touch anything, and hope that you weren’t at the company long enough to have to set it up again on another machine. I’ve met developers that greatly enjoyed configuring each little piece of their emacs setup and thought nothing of losing a few hours to a broken local environment.

In general, I have found agency engineers have a disdain for these types of things in their day-to-day work. At home they may tinker with these types of tools, but when on a deadline there’s nothing like tools that just work. At agencies, we typically would prefer to learn a few new things that work well and consistently rather than be able to configure each piece of tech to each individual’s personal taste.

One thing that is good about working with a cloud platform that is not open source is they own the setup and configuration entirely. While a downside of this is vendor lock-in, the upside is that these types of tools often do the thing they are set up to do well. There is no tinkering with environments, no local setups, and no deployment pipelines. We also have fewer decisions to make.

This is inherently the appeal of serverless. Serverless in general has a greater reliance on proprietary services and tools. We trade the flexibility of hosting and source code so that we can gain greater stability and focus on the problems of the business domain we are trying to solve. I’ll also note that when I’m evaluating a technology and I get the feeling that migrating off of a platform might be needed, this is often a bad sign at the outset.

In the case of databases, the set-it-and-forget-it setup is ideal when working with clients where the database needs can be ambiguous. We’ve had clients who were unsure how popular a program or application would be. We’ve had clients that we technically were not contracted to support in this way but nonetheless called us in a panic when they needed us to scale their database or application.

In the past, we’d always have to factor in things like redundancy, data replication, and sharding to scale when we crafted our SOWs. Trying to cover each scenario while also being prepared to move a full book of business around in the event a database wasn’t scaling is an impossible situation to prepare for. In the end, a serverless database makes these things easier.

You never lose data, and you don’t have to worry about replicating data across a network or provisioning a larger database and machine to run it on. It all just works. We only focus on the business problem at hand; the technical architecture and scale will always be managed. For our development team, this is a huge win: we have fewer fire drills, less monitoring, and less context switching.

Learnability

Learnability is a classic user experience measure that I think applies to developer experience as well. When designing for a certain user experience, we don’t just look at whether something is apparent or easy on the first try. Technology just has more complexity than that most of the time. What is important is how easily a new user can learn and master the system.

When it comes to technical tools, especially powerful ones, it would be a lot to ask for there to be zero learning curve. Usually what we look for is for there to be great documentation for the most common use cases and for that knowledge to be easily and quickly built upon when in a project. Losing a little time to learning on the first project with a technology is okay. After that, we should see efficiency improve with each successive project.

What I look for specifically here is how we can leverage knowledge and patterns we already know to help shorten the learning curve. For instance, with serverless databases, there is going to be virtually zero learning curve for getting them set up in the cloud and deployed. When it comes to using the database one of the things I like is when we can still leverage all the years of mastering relational databases and apply those learnings to our new setup. In this case, we are learning how to use a new tool but it’s not forcing us to rethink our data modeling from the ground up.

As an example of this, when using Firebase, MongoDB, and DynamoDB we found that it encouraged denormalized data rather than trying to join different documents. This created a lot of cognitive friction when modeling our data as we needed to think in terms of access patterns rather than business entities. On the other side of this Fauna allowed us to leverage our years of relational knowledge as well as our preference for normalized data when it came to modeling data.

The part we had to get used to was using indexes and a new query language to bring those pieces together. In general, I’ve found that preserving concepts that are a part of larger software design paradigms makes it easier on the development team in terms of learnability and adoption.
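To show what "normalized data plus indexes" means in practice, here is a small in-memory sketch. It deliberately does not use Fauna's query language or any database driver. The `Customer` and `Order` entities and both helper functions are hypothetical, and a `Map` stands in for an index. Each business entity is stored once, and the index is what brings the pieces together at read time.

```typescript
// Hypothetical business entities, each stored once and linked by id.
interface Customer {
  id: string;
  name: string;
}

interface Order {
  id: string;
  customerId: string;
  total: number;
}

// Build a lookup from customer id to that customer's orders.
// This plays the role an index plays in a database.
function indexOrdersByCustomer(orders: readonly Order[]): Map<string, Order[]> {
  const index = new Map<string, Order[]>();
  for (const order of orders) {
    const bucket = index.get(order.customerId) ?? [];
    bucket.push(order);
    index.set(order.customerId, bucket);
  }
  return index;
}

// Join a customer to their orders through the index and sum the totals.
function totalSpent(customer: Customer, index: Map<string, Order[]>): number {
  return (index.get(customer.id) ?? []).reduce((sum, o) => sum + o.total, 0);
}

const ada: Customer = { id: "c1", name: "Ada" };
const orders: Order[] = [
  { id: "o1", customerId: "c1", total: 30 },
  { id: "o2", customerId: "c2", total: 15 },
  { id: "o3", customerId: "c1", total: 20 },
];

const ordersByCustomer = indexOrdersByCustomer(orders);
console.log(`${ada.name} spent ${totalSpent(ada, ordersByCustomer)}`);
```

Renaming the customer here touches exactly one record, because orders refer to the customer by id instead of carrying a copy of the name.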

How do we know that a team is adopting and loving a new technology? I think the best sign is when we find ourselves asking whether a given tool integrates with the new technology. When a new technology gets to a level of desirability and enjoyment that the team is searching for ways to incorporate it into more projects, that is a good sign you have a winner.

The Business

In this section, we have to look at how a new technology meets our business needs. These include questions like:

How easily can it be priced and integrated into our support plans?
Can we transition it to clients easily?
Can clients be onboarded to this tool if need be?
How much time does this tool actually save if any?

The rise of serverless as a paradigm fits agencies well. When we talk about databases and DevOps, the need for specialists in these areas at agencies is limited. Often we are handing off a project when we are done with it or supporting it in a limited capacity long term. We tend to bias toward full-stack engineers as these needs outnumber DevOps needs by a large margin. If we hired a DevOps engineer they would likely be spending a few hours deploying a project and many more hours hanging out waiting for a fire.

In this regard, we always have some DevOps contractors at the ready, but do not staff for these positions full time. This means we cannot rely on a DevOps engineer to be ready to jump on an unexpected issue. We know we can get better rates on hosting by going to AWS directly, but we also know that by using Heroku we can rely on our existing staff to debug most issues. Unless we have a client we need to support long term with specific backend needs, we like to default to managed platforms as a service.

Databases are no exception. We love leaning on services like Mongo Atlas or Heroku Postgres to make this process as easy as possible. As we started to see more and more of our stack head into serverless tools like Vercel, Netlify, or AWS Lambda — our database needs had to evolve with that. Serverless databases like Firebase, DynamoDB, and Fauna are great because they integrate well with serverless apps but also free our business completely from provisioning and scaling.

These solutions also work well for more traditional applications, where we don’t have a serverless application but we can still leverage serverless efficiencies at the database level. As a business, it is more productive for us to learn a single database that can apply to both worlds than to context switch. This is similar to our decision to adopt Node and isomorphic JavaScript (and TypeScript).

One of the downsides we have found with serverless has been coming up with pricing for clients we manage these services for. In a more traditional architecture, flat rate tiers make it very easy to translate those into a rate for clients with predictable circumstances for incurring increases and overages. When it comes to serverless this can be ambiguous. Finance people don’t typically like hearing things like we charge 1/10th of a penny for every read beyond 1 million, and so on and so forth.

This is hard to translate into a fixed number even for engineers, as we are often building applications whose usage we cannot predict. We often have to create tiers ourselves, but the many variables that go into the cost calculation of a lambda can be hard to wrap your head around. Ultimately, for a SaaS product these pay-as-you-go pricing models are great, but for agencies the accountants like more concrete and predictable numbers.
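The arithmetic behind turning metered usage into a flat tier can be sketched in a few lines. Everything below is an assumption made for illustration: the included reads and per-read rate come from the hypothetical "1/10th of a penny beyond 1 million" example above, and the tier names, prices, and 1.5x safety buffer are invented, not any provider's actual pricing.

```typescript
// All numbers here are illustrative assumptions, not any provider's rates.
const INCLUDED_READS = 1_000_000;
const MILLS_PER_EXTRA_READ = 1; // 1 mill = 1/10th of a cent

// Metered cost in dollars for a month with the given number of reads.
function meteredCostDollars(reads: number): number {
  const extraReads = Math.max(0, reads - INCLUDED_READS);
  return (extraReads * MILLS_PER_EXTRA_READ) / 1000;
}

interface Tier {
  name: string;
  monthlyPrice: number; // flat price in dollars charged to the client
}

// Hypothetical flat tiers, sorted from cheapest to most expensive.
const tiers: Tier[] = [
  { name: "Starter", monthlyPrice: 50 },
  { name: "Growth", monthlyPrice: 250 },
  { name: "Scale", monthlyPrice: 1000 },
];

// Pick the cheapest flat tier that covers the metered cost plus a buffer.
// Returns undefined when usage is beyond every tier and needs a custom quote.
function pickTier(expectedReads: number, buffer = 1.5): Tier | undefined {
  const target = meteredCostDollars(expectedReads) * buffer;
  return tiers.find((tier) => tier.monthlyPrice >= target);
}

// 100,000 reads over the allowance cost $100, so the buffered target is $150.
console.log(meteredCostDollars(1_100_000), pickTier(1_100_000)?.name);
```

The buffer is the part worth arguing about with finance: it is what absorbs a client's unexpectedly busy month before anyone has to renegotiate the plan.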

When it came to Fauna, this was definitely more ambiguous to figure out than say a standard MySQL database that had flat-rate hosting for a set amount of space. The upside was that Fauna provides a nice calculator that we were able to use to put together our own pricing schemes.

Another difficult aspect of serverless can be that many of these providers do not allow for easy breakdown of each application being hosted. For instance, the Heroku platform makes this quite easy by creating new pipelines and teams. We can even enter a client’s credit card for them in case they don’t want to use our hosting plans. This can all be done within the same dashboard as well so we didn’t need to create multiple logins.

When it came to other serverless tools this was much more difficult. Among the serverless databases we evaluated, Firebase supports splitting payments by project. In the case of Fauna or DynamoDB, this is not possible, so we do have to do some work to monitor usage in their dashboard, and if the client wants to leave our service, we would have to transfer the database over to their own account.

Ultimately, serverless tools provide great business opportunities in terms of cost savings, management, and process efficiency. However, often they do prove challenging for agencies when it comes to pricing and account management. This is one area where we have had to leverage cost calculators to create our own predictable pricing tiers or set clients up with their own accounts so they can make the payments directly.

Conclusion

It can be a difficult task to adopt a new technology as an agency. While we are in a unique position to work with new, greenfield projects that have opportunities for new technologies, we also have to consider the long-term investment of these. How will they perform? Will our people be productive and enjoy using them? Can we incorporate them into our business offering?

You need to have a firm grasp of where you have been before you figure out where you want to go technologically. When evaluating a new tool or platform it’s important to think of what you have tried in the past and figure out what is most important to you and your team. We took a look at the concept of a serverless database and passed it through our three lenses — the technology, the experience, and the business. We were left with some pros and cons and had to strike the right balance.

After we evaluated serverless databases, we decided to adopt Fauna over the alternatives. We felt the technology was robust and ticked all of our boxes for our technology filter. When it came to the experience, virtually zero configuration and being able to leverage our existing knowledge of relational data modeling made this a winner with the development team. On the business side, serverless provides clear wins in efficiency and productivity; however, on the pricing and account management side there are still some difficulties. We decided the benefits in the other areas outweighed the pricing difficulties.

Overall, we highly recommend giving Fauna a shot on one of your next projects. It has become one of our favorite tools and our go-to database of choice for smaller serverless projects and even more traditional large backend applications. The community is very helpful, the learning curve is gentle, and we believe you’ll find levels of productivity you hadn’t realized before with existing databases.

When we first use a new technology on a project, we start with something either internal or on the smaller side. We try to mitigate the risk by wading into the water rather than leaping into the deep end by trying it on a large and complex project. As the team builds understanding of the technology, we start using it for larger projects but only after we feel comfortable that it has handled similar use cases well for us in the past.

In general, it can take up to a year for a technology to become a ubiquitous part of most projects, so it is important to be patient. Agencies have a lot of flexibility but are also required to ensure stability in the products they produce; we don’t get a second chance. Always be experimenting and pushing your agency to adopt new technologies, but do so carefully and you will reap the benefits.

How To Run The Right Kind Of Research Study With The Double-Diamond Model

by TBSCategories: News

Steve Bromley

2020-05-29

Product and design teams make a lot of decisions. Early on in the development of a product, they will be thinking about features, such as what the product should do and how each feature should work. Later on, those decisions become more nuanced, such as ‘what should this button say?’ Each decision introduces an element of risk: if a bad decision is made, it will reduce the chance for the product to be successful.

The people making these decisions rely on a variety of information sources to improve the quality of their decisions. This includes intuition, an understanding of the market, as well as an understanding of user behavior. Of these, the most valuable source of information to put evidence behind decisions is understanding our users.

Being armed with an understanding of the appropriate user research methods can be very valuable when developing new products. This article will cover some appropriate methods and advice on when to deploy them.

A Process For Developing Successful Products

The double diamond is a model created by the UK’s Design Council which describes a process for making successful products. It describes taking time to understand a domain, picking the right problem to solve, and then exploring potential ideas in that space. This should prove that the product is solving real problems for users and that the implementation of the product works for users.

[Figure: The Design Council’s Double Diamond model]

To succeed at each stage of the process requires understanding some information about your users. Some of the information we might want to understand from users when going through the process is as follows:

[Figure: Some research questions appropriate for each stage of the double diamond]

Each stage has some user research methods that are best suited to uncovering that information. In this article, we’ll refer to the double diamond to highlight the appropriate research method throughout product development.

Diamond 1: Exploring The Problem And Deciding What To Fix

The first diamond describes how to come up with a suitable problem that a new product or feature should fix. It requires understanding what problems users have, and prioritizing them to focus on a high-value area. This avoids the risk of building something that no-one is going to use.

The most effective way of understanding the problem is to get true first-hand experience of users performing real tasks in context. This is best done by applying ethnographic and observational methods to identify the range of problems that exist, then prioritizing them using methods such as surveys.

Double Diamond Phase	Appropriate Method	Why?
Explore the problem	Ethnographic and Observational studies	Gives deep insight into what problems people have that can inspire product decisions
Decide what to fix	Surveys	Discovers how representative problems are, and helps prioritise them

We’ll review each method, in turn, to describe why it’s appropriate.

Explore The Problem With Ethnographic And Observational Methods

The first phase of the double diamond is to ‘explore the problem’. User research can build up an understanding of how people act in the real world and the problems they face. This allows the problem space to be fully explored.

The double diamond image with ‘Explore the problem’ highlighted.

This valuable behavioral information is uncovered only by watching people do real tasks and asking them questions to uncover their motivations and issues. Doing early qualitative research will help identify the problems that people have. These problems can inspire product ideas and features, and help teams understand how to successfully solve users’ problems. This information will also help discard poor product ideas by revealing that there is no real need for them. This leads to a more useful product being developed and increases the chance of success.

The most appropriate methods for doing this are ethnographic. This can include diary studies, where a user’s interaction with the subject matter is captured over a number of weeks. This reveals issues that wouldn’t turn up in a single session or that people wouldn’t remember to talk about in a lab-based interview.

This isn’t the only way of uncovering this kind of in-depth information though. Other suitable observational methods include watching people use existing software or products, either in the lab or in the wild. These are quicker and easier to run than diary studies, but are limited to capturing a single interaction or what the participant will remember when prompted. For some subject matters, this might be enough (e.g. navigating an online shop can be done and explored in a single session). More complex interactions over time, such as behavior with fitness trackers, would be more sensible to track as a diary study.

Decide What To Fix With Surveys

The second half of the first diamond comes next. Once real users’ contexts and problems are understood, they can be documented and prioritized to ‘decide what to fix’.

The double diamond image with ‘Decide what to fix’ highlighted.

This prioritization will be done by product managers who take into account many factors, such as “what do we have the technical ability to do” and “what meets our business goals”. However, user research can also add valuable information by uncovering the size of the issues users have. Surveys are a sensible approach for this, informed by the true understanding of user behavior uncovered in the earlier studies. This allows teams to size the uncovered issues and reveal how representative the behaviors discovered are.
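As a minimal sketch of what sizing issues from survey data could look like, the Python below ranks problems by the share of respondents who report them. It assumes a hypothetical multiple-choice question where each respondent ticks the problems they have experienced; the function name, the problem labels and the responses are invented for illustration and are not taken from any real study.

```python
from collections import Counter


def rank_problems(responses):
    """Return (problem, share of respondents) pairs, most widespread first."""
    total = len(responses)
    if total == 0:
        return []
    counts = Counter()
    for reported in responses:
        # Use a set so a respondent who lists a problem twice is counted once.
        counts.update(set(reported))
    # Sort by count (descending), then alphabetically to break ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(problem, count / total) for problem, count in ranked]


# Hypothetical answers from four respondents (one reported no problems).
responses = [
    ["forgets to log workouts", "battery runs out"],
    ["forgets to log workouts"],
    ["battery runs out", "forgets to log workouts"],
    [],
]

for problem, share in rank_problems(responses):
    print(f"{problem}: {share:.0%}")
```

The shares only describe the people who answered, so they say how representative a problem is only to the extent that the survey sample resembles the real user base.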

Combining quantitative methods with generative user research studies helps inspire early decisions about what a product should do. For example, Spotify’s discovery work on how people consume music analyzed primary research fieldwork to create personas and inform their development work. This allows a team to complete the first diamond with a clear understanding of what problem their product is trying to solve.

Diamond 2: Test And Refine Potential Solutions

The second diamond describes how to arrive at a successful implementation of a product to fix the problem. Having decided which problem to fix, research can then explore different ways of fixing that problem, and help refine the best method.

Double Diamond Phase	Appropriate Method	Why?
Test potential solutions	Moderated usability testing	Creates a deep understanding of why the solution works, to inform iteration
Refine final solution	Unmoderated usability testing	Can get quick results on small questions, such as with the UI

Test Potential Solutions With Moderated Usability Testing

The second diamond in the double diamond design process starts with evaluating a variety of solutions in order to decide the best possible implementation of a product. Achieving this with rigor requires usability testing: creating representative prototypes and then observing whether users can successfully complete tasks using them.

The double diamond image with ‘Test potential solutions’ highlighted.

This kind of study takes time to do properly, and requires attention to each individual’s experience to understand what causes the behavior observed during usability testing. In a moderated session, with the researcher present, the moderator can ask probing questions to uncover things that participants won’t articulate unprompted, such as “what are you thinking currently?” or “why did you decide to do that?”. Because the moderator can ask these questions, each study yields more data that can be used to evaluate and iterate on the product. A single moderated research session potentially reveals more useful information than a series of unmoderated tests.

This kind of in-depth exploration of the problem was a key part of Airbnb’s early success. In 2009, the company was close to bankruptcy and desperate to understand why people were not booking rooms. By spending time with users reviewing the ads on their website, they were able to uncover that the pictures were the problem. This then allowed them to focus their iteration on the process for gathering photos of rooms, which put them on the path to changing hotel booking forever. As the global pandemic changes people’s holiday behavior, in-depth qualitative research will be essential as they continue to adapt to new challenges.

This doesn’t mean that the moderator has to be in the same room as the participant. Often it can be very valuable to find participants who are geographically remote, and avoid over-sampling people who live in major cities, which is often where research teams are based. Screen sharing software, such as Google Hangouts or Zoom, can make remote sessions possible while still having the session run live with a moderator.

Refine Final Solution With Unmoderated Usability Testing

The final stage of the double diamond describes refining the final solution, which can require a lot of small iterative tests. A shortcut to the deep insight from moderated testing is remote unmoderated research. This includes tools like usertesting.com, which allow teams to put their software in front of users with little effort. A team sends a website URL to the tool’s panel of users and receives back videos of participants using the website and commenting on their experience.

The double diamond image with ‘Refine final solution’ highlighted.

This method can be popular because it is quick (multiple sessions can run simultaneously without a moderator present) and cheap (participants aren’t paid a huge amount to take part). Because of this, it is often considered an appropriate tool by companies looking to start doing user research. However, care needs to be taken.

This method has constraints that mean it’s only sensible later on in the design process. Because the participants on these kinds of websites are all people who test multiple websites regularly, they become increasingly different from a normal user. This changes their behavior while using websites and makes it dangerous to draw conclusions from their behavior about what other users would understand or do. This is called a sampling bias: a difference between ‘real’ users and the users being tested.

Because of these risks, these studies may be most appropriate late in development, testing content or UI changes, when the risks of getting decisions wrong are much lower. Iterative studies ensure that users understand what has been created, and are able to use it in the way the designer intended. An example of this is the iterative usability testing the UK’s Government Digital Service ran to ensure citizens could successfully identify themselves and access government services.

After The Double Diamond

As we’ve covered, it is important to select the right method to de-risk product decisions before launch. When a product is launched, it will be immediately obvious whether there is an audience for it, and whether people understand and can use the product — both through how well the product sells, and through reviews and customer feedback.

The double diamond image with ‘Solution delivered’ highlighted: after the double diamond.

Nevertheless, launching the right product doesn’t mean that the opportunity for research is over. New opportunities to explore real user behavior will continue to inspire adding or removing features, or changes to how the product works.

Double Diamond Phase	Appropriate Method	Why?
Solution delivered	Analytics + moderated usability testing combined	Inform future updates post-launch with qualitative and quantitative insight.

Combining some of the methods we’ve described previously with new data from analytics will continue to drive high-quality decision making.

Research After The Solution Is Delivered With Analytics

Post-launch analytics are an important part of building a complete understanding of the behavior of users.

Analytics will reveal what people are doing on a website. However, this information is most valuable when combined with an understanding of why that behavior is occurring. It is also important to be aware that analytics only see a short section of a user’s experience: the part that happens on your website. Their whole end-to-end journey also includes a lot of things that happen off the site, or in the real world. Building a research strategy that combines insight from analytics with an understanding of motivations from qualitative studies gives a powerful basis for decision making.

This requires close collaboration between the analytics team and the user research team — regular community events, skills sharing and project updates will create awareness of the priorities of each team, the type of research questions they can support one another with and identify opportunities to work together, leading to a stronger combined team.

Optimize Your Research Process

In this article, we’ve covered some appropriate methods to use to inform product development. However, there can still be resistance to running the right kind of study.

New research teams may be asked to cut corners. This can include suggesting participants who are convenient, such as friends, without taking the time to screen them to ensure they represent real users. This can be suggested by colleagues who are unaware of the risks caused by taking decisions based on unrepresentative research.

In addition to running research studies, a researcher has to be an educator and advocate for running the right kind of studies and help their colleagues understand the differences in quality between the type of information gathered from different research methods. Presentations, roadshows, and creating posters are some techniques that can help achieve this.

Incorporating user research into decision making can be quite radical at some organizations, particularly those with a history of deferring to client wishes or listening to the highest-paid person in the room. A lot of hard work and creativity are needed to bring about change in how people work. This requires understanding the decision maker’s current incentives, and describing the benefits of research in a way that shows how it makes their life easier.

If an organization understands and accepts why running studies using appropriate methods matters, it shows a real desire to improve the quality of decision making within the organization. This is an encouraging sign that a new research team has the potential to be successful.

The next step for new researchers will be to establish the logistics of running research, including creating a research process, building out the tools and software needed, and identifying the highest priority research questions for your organization. There is a lot of great guidance from the research community on techniques to do this, for example, the work being done by the research ops community.

Articles on Smashing Magazine — For Web Designers And Developers

02 Sep

Moving From Sketch To Figma: A Case Study Of Migrating Design Systems

by TBSCategories: News

Buzz Usborne

2 September 2019

For the past year, every time I got frustrated with Sketch, a colleague of mine suggested I try Figma. Then, when I wrote an article about building our design system in Sketch, I got a bunch of feedback from people telling me to try Figma. And recently, Linda, our Head of Design at Help Scout, asked me, “Hey Buzz, shouldn’t we be using Figma?”

I couldn’t fight it anymore… I just had to try Figma!

This isn’t a love letter to Figma or a harsh review of Sketch. Instead, it’s a cautionary tale for anyone who is thinking of moving tools. This is the story of how everything panned out, and the specifics of migrating a design system from one platform to another.

Understanding The Cost

The first thing to consider is that there’s a cost involved in switching tools — a consideration not usually factored into the conversation whenever there’s a #designtwitter pile-on. Only one-person teams can afford to change design tools at will; for busy teams, it’s not so easy.

The difficulty for us at Help Scout was the fact that our design system is built as multiple, interdependent Sketch Libraries managed with GitHub. We also have multiple in-flight projects, processes and vast documentation that all depend on Sketch files. And don’t forget the monumental effort involved in training and moving an entire team onto a new tool whilst simultaneously doing actual work!

Screenshot of the original Help Scout design system in GitHub. Contributing to Help Scout’s design system happened through GitHub.

There’s also a financial cost involved in someone (in this case, that’d be me) taking the time away from business-as-usual work to research and document all this good stuff. Point is, if you work in an established design team, you’ll know that changing tools is about as easy as moving offices.

But that’s how this works. Tools are “sticky” just by virtue of being hard to leave. Suffice to say, this wasn’t going to be a decision we made lightly.

Kicking The Tires

With the understanding that my decision would have an impact on the whole team and organization, I started by spending two full days exploring Figma. I watched videos, I spoke to other designers who use it often and I played with the tool… a lot! Essentially, I explored how easy it would be to move our Sketch components over. One question came to mind: would it be as easy as opening a .sketch file in Figma?

Unsurprisingly, no.

It turns out that Figma and Sketch — while similar in layout and functionality — have some key differences in how they allow components to be overridden. This was the kicker. Figma allows for color, type and effects (shadows, etc.) to be customized by the user, whereas Sketch will only allow pre-determined overrides. Because of the limitations Sketch imposes on overriding components, we’d built our original design system around that — allowing full color, border and style control using a complex system of masks and building-block components.

Over-complicated? Yes. But it worked great for us.

Here’s a simple card symbol in Sketch which was made from five nested symbols that were necessary in order for us to achieve the level of flexibility we required. This is the exact kind of thing that doesn’t import well into Figma.

A side-by-side image of a component and its available overrides in Sketch: a preview of how we brought Figma-level overrides to Sketch.

While this complexity in Sketch allowed us the level of flexibility Figma offers out of the box, it meant that almost any component imported from Sketch brought an unnecessary level of complexity along with it. In order for us to use Figma, we’d need to rebuild everything from scratch to strip each component down to the essentials.

Decision Time!

Given the above, my initial decision was that although I thought Figma was the better tool, the stronger company, and the safer long-term bet, it was going to be too difficult and costly to switch. Re-building entire libraries is a big job! It was like breaking up before we’d even given it a chance.

“It’s not you, it’s us.”

But as it happens, Figma are Help Scout’s customers. On hearing our decision to stick with Sketch, our Head of Sales set up a call with the Figma product team, not necessarily to change anyone’s mind, but to share our experiences, as friends do. They were understandably cool about the whole thing, but asked whether they could talk to me about my decision. And that was an opportunity not to be missed!

In the days leading up to my conversation with the Figma team, I decided to jump back into the tool — at the very least to give myself enough understanding to be able to talk with confidence and not look like a total amateur in front of people who knew a lot more in this area than me. By the time I spoke with the team, I was a convert — in just those couple of extra days, I realized how much more productive and collaborative we’d be as a team with these kinds of features at our disposal. The cost of switching hadn’t changed, but my opinion of whether the cost was worth it had. Help Scout’s Head of Design made a compelling point to that effect too: If we feel like we’ll make the switch someday, then why not today?

So my conversation with Figma ended up being more along the lines of, “Give me some advice on how to make this work,” which they graciously did.

How To Switch

So it’s possible that you might be in the same spot I was; you want to move tools but are faced with the monumental task of rebuilding hundreds of components, styles, and a load of documentation. Well, friend, you’re going to need to take a deliberate and systematic approach to this. Your mileage may vary, but this is how I moved Help Scout’s entire design system to Figma in just a week.

1. Split Your Libraries
2. Lean Heavily On Styles (+ Documentation)
3. Show How Components Extend
4. Organize Properly
5. Importing vs. Re-Building
6. Get Your Team On Board
7. Go All In

1. Split Your Libraries

This applies to creating Sketch libraries too, but I strongly suggest splitting design systems into separate sub-libraries that cover different parts of your ecosystem. In our case, we have Core which contains components applicable to any designer (brand assets, illustrations, icons, etc.), then domain-specific documents. This approach makes migration a bit easier to handle when you’re moving things over in organized chunks.

Thumbnails of the four Help Scout design-system libraries: our design libraries, separated by team.

In our case, migrating to Figma involved beginning with the Core elements — which were then used to build out subsequent libraries.

2. Lean Heavily On Styles (+ Documentation)

Figma has “Styles” that work in the same way you’re used to seeing Type Styles working in Sketch, but with the added benefit that these also apply to color and effects. With this in mind, it’s really useful to define all your colors and shared elements in one single library, then document them.

Documentation showing a selection of shadows available with the Library: an example of how styles are documented within each library.

3. Show How Components Extend

Since Figma allows much greater control over how components can be extended, you’ll probably end up with fewer components than you had in Sketch — instead of “button solid color” and “button outlined,” in Figma you’ll just need “button”. Because of this, I found that it was important to document the different ways a component can be extended directly within the library itself.

For example, only one component is required to re-create an entire two-sided chat conversation in Figma. But a new designer would never know what overrides to apply, so it’s important to visually demonstrate whenever it’s possible. Here’s the same component being used in six different ways:

An example conversation built with components to demonstrate correct use: an example of how a single Figma component can construct an entire conversation.

4. Organize Properly

I quickly abandoned trying to replicate the naming structure I had in the original Sketch files because of subtle differences in how Figma’s file system works. Ultimately, the aim is to make sure components are in a logical place and easy to find, and the best way I found to achieve that was to carefully organize my Pages by category (e.g., Forms), Frames by group classification (e.g., Inputs) and Components by individual element (e.g., Error). Being specific about naming makes components super easy to find — especially by people who didn’t originally create them.
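To make that hierarchy concrete, here is a minimal sketch in Python. It is only a model of the naming scheme, not a Figma API or file format; the `ComponentLocation` class, the `find` helper and every entry other than Forms, Inputs and Error are hypothetical names made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComponentLocation:
    page: str       # category, e.g. "Forms"
    frame: str      # group classification, e.g. "Inputs"
    component: str  # individual element, e.g. "Error"

    def path(self) -> str:
        """Human-readable location, from broadest level to most specific."""
        return f"{self.page} / {self.frame} / {self.component}"


def find(locations, term):
    """Case-insensitive search across every level of the hierarchy."""
    needle = term.lower()
    return [loc for loc in locations if needle in loc.path().lower()]


# A tiny, made-up slice of a library.
library = [
    ComponentLocation("Forms", "Inputs", "Default"),
    ComponentLocation("Forms", "Inputs", "Error"),
    ComponentLocation("Forms", "Checkboxes", "Checked"),
]

for loc in find(library, "inputs"):
    print(loc.path())
```

Because every level carries a specific name, a search for any one of them narrows the results quickly, which is the property that matters for people who didn’t create the components.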

Side-by-side of Frame names and related components. Naming is important!

5. Importing vs. Re-Building

Phew, I wish I had better news here about the physical act of importing Sketch components. For a lot of things, namely individual elements like icons, you can import from Sketch and it’ll all work out great. However, for more complex components (especially ones that involve masks and nested symbols), you’re better off re-creating the components from scratch. Yes, it’s extremely painful, but on the upside, you’ll get really good at using Figma in a very short time!

My workflow in Figma for re-creating the more complex Sketch components was literally to screenshot then “trace” them in Figma. As ridiculous as this sounds, it turned out to be much faster than importing from Sketch and removing the unnecessary elements. I’m a little bit ashamed to say that I love this kind of work, but it also turned out to be the more effective workflow.

(But of course, if you’re migrating simpler components like icons, then Figma’s importing capabilities will serve you just fine.)

A timelapse of making a Figma component from a Sketch symbol: an insight into my day.

6. Get Your Team On Board

As a 100% remote team, most things we do at Help Scout are well communicated — this was no different. So while the team was aware of the impending tool switch, it wasn’t until I had finished the design system that they got the nudge.

At this point, I shared a 20-minute demo video explaining Figma, some basics on how to use it, and some of the cool improvements they’d find in their workflow when using components. This turned out to be a hit and definitely softened the blow for people who were perhaps a little hesitant about the move at first.

The original video that I shared with my team

7. Go All In

Part of my initial research involved seeing whether we could maintain our design system in Sketch and Figma simultaneously. I’m certain it can be done, but it’s a bit of a stretch for us given our fairly small team size and the fact we have no single person or team dedicated to the upkeep of our libraries. So instead of keeping what we had in place, I decided to go all-in on Figma.

This meant creating and updating all documentation and employee onboarding to reference the new stuff which forced me to address the migration of anything that referenced the old stuff — including existing development processes and designer hand-off. Ultimately, drawing a line in the sand meant that we were all committed to making this a success.

Of course, the Sketch libraries still exist; they’re just no longer documented or updated. And in terms of migration, in-flight projects continue to use Sketch files (although some designers have chosen to migrate their work to Figma), whereas new projects use Figma. It’s a clean break.

Conclusion: Make A Plan!

It’s hard to conclude an article like this without sounding like I have all the answers — which I most certainly do not. But my advice to anyone switching tools is to take it slow. Put in the research, make a plan of attack, figure out the cost then weigh up whether you’re prepared to pay it — this applies whether you’re moving to Figma, Sketch, InVision Studio, Adobe XD, Framer X or some other trendy new tool I haven’t heard of yet.

For us, time will tell, but I’m still pretty confident we made the right call!

Tag: Study

Why Is Speed Improvement Necessary To Our SEO Efforts?

React Hydration: Why There Is A JSON In HTML

What Pages Are We Talking About Exactly?

Now, A Look At Architecture

Key Takeaways From The Process

The Problem

Why Is It So Big? What’s In There?

Architecture Change

Data Flow Example

How To Measure The Impact Of The Change

Conclusions

Performance Impact

Layered Solution

The Technology

Our Why?

Where we have been…

Our Ideal

The Developer Experience

Setup Time And Configuration

Learnability

The Business

Conclusion

Further Reading

A Process For Developing Successful Products

Diamond 1: Exploring The Problem And Deciding What To Fix

Explore The Problem With Ethnographic And Observational Methods

Decide What To Fix With Surveys

Diamond 2: Test And Refine Potential Solutions

Test Potential Solutions With Moderated Usability Testing

Refine Final Solution With Unmoderated Usability Testing

After The Double Diamond

Research After The Solution Is Delivered With Analytics

Optimize Your Research Process

Understanding The Cost

Kicking The Tires

Decision Time!

How To Switch

1. Split Your Libraries

2. Lean Heavily On Styles (+ Documentation)

3. Show How Components Extend