The other side of the moon

[philiptellis] /bb|[^b]{2}/
Never stop Grokking


Friday, August 06, 2021

Safely passing secrets to a RUN command in a Dockerfile

There may be cases where you need to pass secrets to a RUN command in a Dockerfile, and it's very important that these secrets not be leaked into the environment or the image. In particular, these secrets should not be stored in the image (either on disk or in the environment, not even in intermediate layers), and they should not show up in the output of docker history.

While working on this topic, I found many blog posts that point to pieces of the solution, but nothing that pulls it all together, so I decided to write this post with everything I've found. I'll provide a list of references at the end.

In my case, I needed to temporarily pass a valid odbc.ini file to my Julia code so that I could build a SysImage with the appropriate database query and result parsing functions compiled. I did not want the odbc.ini file available in the image.

Step 0: Make sure you have docker >= 18.09

docker version

You most likely have a new enough version of docker, but in the odd chance that you're running a version older than 18.09, please upgrade. My tests were run on 19.03 and 20.10.
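
If you'd like to script the check, here's one way to do the comparison in the shell (a sketch; it assumes a sort implementation that supports version sort, i.e. sort -V):

```shell
# Secret mounts need docker 18.09 or newer (the first release with
# BuildKit support for them).
required="18.09"
current="$(docker version --format '{{.Client.Version}}' 2>/dev/null || echo 0)"

# version_ge A B -> succeeds if version A >= version B
version_ge() {
  [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

if version_ge "$current" "$required"; then
  echo "docker $current is new enough"
else
  echo "docker $current is too old, please upgrade" >&2
fi
```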

Developing the Dockerfile

Step 1: Specify the Dockerfile syntax

At the top of your Dockerfile (this has to be the absolute first line), add the following:

# syntax=docker/dockerfile:1

This tells docker build to use the latest 1.x version of the Dockerfile syntax.

There are various docs that specify using 1.2 or 1.0-experimental. These values were valid when the docs were written, but are dated at this point. Specifying version 1 tells docker build to use whatever is latest on the 1.x tree, so you can still use 1.3, 1.4, etc. Specifying 1.2 restricts it to the 1.2.x tree.

Step 2: Mount a secret file where you need it

At the RUN command where you need a secret, --mount it as follows:

RUN --mount=type=secret,id=mysecret,dst=/path/to/secret.key,uid=1000 your-command-here

There are a few things in here, which I'll explain one by one.

type=secret
This tells docker that we're mounting a secret file from the host (as opposed to a directory or something else).
id=mysecret
This is any string you'd like. It has to match the id passed in on the docker build command line.
dst=/path/to/secret.key
This is where you'd like the secret file to be accessible. Any file already at this location will be temporarily hidden while the secret file is mounted, so it's safe to use a location that your code will expect at run time.
uid=1000
This is the userid that should own the file. It defaults to 0 (root), so set it if your command reads the file as a different user. You can also specify a gid.

The full list of supported parameters for secret mounts is available on the BuildKit GitHub page.

You can add the same --mount at different locations in your Dockerfile, and with different dst and uid values. The file is mounted only for the duration of that RUN command and not persisted to any layers.
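
Putting steps 1 and 2 together, a minimal Dockerfile might look something like this (the base image, secret path, and build command are placeholders for illustration, not the exact ones I used):

```dockerfile
# syntax=docker/dockerfile:1
FROM julia:1.6

COPY . /app
WORKDIR /app

# The odbc.ini secret is visible only while this RUN command executes;
# it is never written to a layer and won't appear in docker history.
RUN --mount=type=secret,id=mysecret,dst=/root/.odbc.ini \
    julia --project -e 'include("build_sysimage.jl")'
```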

Running docker build

Step 3: Set the environment variable

This step is optional on newer versions of Docker.

Once you're ready to run docker build, tell docker to use BuildKit

DOCKER_BUILDKIT=1

You can either prefix the docker build command with it on the same line, or export it into your shell.

Step 4: Run docker build with your secret file
docker build --secret id=mysecret,src=/full/path/to/secret.key .

It's important to note that tilde (~) expansion does not work here. You can use an absolute or relative path, but you cannot use expansion.

That's IT!!!

Jenkins

If you run your docker builds through Jenkins, you'll need a few more steps. The bulk of it is documented in this CloudBees CI article about injecting secrets.

Once you've gotten your secret file into Jenkins, and bound it to an environment variable in your Build Environment, you have to update the docker build command to use this variable instead.

For example, if we bound the secret file to a variable called MYSECRETFILE, then we'd change our build command to:

docker build --secret id=mysecret,src=${MYSECRETFILE} .
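
If you use a Pipeline job rather than the freestyle Build Environment binding, the equivalent sketch uses the Credentials Binding plugin's file binding (the credentialsId here is a placeholder):

```groovy
pipeline {
  agent any
  stages {
    stage('Build') {
      steps {
        // Bind the secret file credential to MYSECRETFILE for this block only
        withCredentials([file(credentialsId: 'mysecret-file', variable: 'MYSECRETFILE')]) {
          sh 'DOCKER_BUILDKIT=1 docker build --secret id=mysecret,src=$MYSECRETFILE .'
        }
      }
    }
  }
}
```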

References

These links were very useful in figuring out this solution.

Wednesday, February 17, 2021

Recovering from Big Sur upgrade snafu

Apple recently pushed out a new release of MacOS called Big Sur. Unfortunately, the upgrade process is problematic. Specifically, the upgrader does not check for the required disk space before starting the upgrade, and if the target system doesn't have enough disk space (35GB or so), then the upgrade fails partway through, leaving your system in a mostly unusable state.

This is what happened to me.

My environment

  • The system was a 13" MacBook with a 128GB SSD. 128GB is pretty small and doesn't leave much space for large files.
  • The system had just a single user.
  • At the start of the upgrade, the system had about 13GB of free disk space (>10%).
  • Desktop, Documents and Photos were backed up to iCloud, but Downloads weren't, and some very large photos & videos had been removed from iCloud to save space there, so they only existed locally.

Prior discussion

Mr. Macintosh has published a very detailed explanation of the issue and various ways to get around it without any data loss. It's an excellent article that got me quite far in my investigation. I was lucky that the latest updates had been posted just a few hours before I hit the problem myself.

Unfortunately, none of the suggested fixes worked for me.

  1. I couldn't mount the drive in Target Disk Mode as my password wouldn't work (the password still worked when logging in locally, but that took me back to the upgrade loop).
  2. I couldn't start up the system in Recovery Mode as it wanted a password, but again, wouldn't accept the password (the same password that worked when fully booting up).
  3. I couldn't access the disk when booting from an external startup disk because of the same issue.

Many posts I found online seemed to suggest that a firmware password was required, but I'd never set this up.

Single User Mode

Eventually, what showed the most promise was booting into Single User Mode and then fiddling around with all available disk devices.

Password worked for Single User Mode
  1. To start up in Single User Mode, press Cmd+S when starting up until the Apple logo shows up.
  2. The system prompts you for a password, and my password did in fact work in this mode.
  3. After signing in, you're dropped into a unix shell.
  4. There's only a basic file system mounted, which contains a limited number of unix commands and none of your data.

Mount the Data partition

Once in single user mode, I had to mount my data partition. I first used the mount command to see what was already mounted. It showed that the only mounted device was /dev/disk1s1. I assumed that my Data partition would be /dev/disk1s2 and that it would have the same filesystem, and I chose a convenient mount point:

# mount -t apfs /dev/disk1s2 /System/Volumes/Data

Miraculously, this did not ask me for a password, and mounted my Data partition. I was able to look through the files and identify potential targets to remove. I also noticed that the disk was now completely full (0 bytes free). This was due to the Big Sur installer, which took up 11GB, and then added a few files, using up the entire 13GB that I had available.

Things were getting a little cumbersome here as most of the unix commands I needed to use were not on the primary partition, but on the mounted partition, so I added the appropriate folders to the unix PATH environment variable:

PATH="$PATH":/System/Volumes/Data/usr/bin

I was starting to see that choosing a 3 level deep path as my mount point perhaps wasn't a great idea. I also learned that while the screen is quite wide, the terminal environment is set to show 80 columns of text, and goes into very weird line wrapping issues if you type past that. It's even worse if you try tab completion at this point.

Transferring large files

Some of the large files & folders I identified were downloaded packages that could be removed. Unfortunately this only got me 2GB back. To get enough space back, I'd have to remove some photos and videos that weren't stored on iCloud. I figured I'd copy them over to an SD card and then could delete them.

I popped in the SD Card, and the kernel displayed some debug messages on the terminal. It told me that the card was in /dev/disk4, so I tried mounting that at a random empty directory:

# mount -t exfat /dev/disk4 /System/VM

This did not work!

No SD Cards in Single User Mode

By default, SD Cards are formatted with an EXFAT file system (the kind used by Windows and all digital cameras). Unfortunately, you cannot mount an EXFAT filesystem in Single User Mode as the exfatfs driver isn't compiled into the kernel. It's loaded up as a dynamic module when required. This only works when booting in standard mode with a kernel that allows dynamic loading. Single User Mode does not.

Reformat the SD Card

This was a brand new SD Card, so I decided to reformat it with an Apple file system. I used a different MacBook to do this; however, my first attempt didn't work. It isn't sufficient to just format the SD Card; you also need to partition it, and that's where the filesystem is created.

I created a single APFS partition across the entire SD Card and then tried mounting it.

Unfortunately, now it was no longer at /dev/disk4 even though that's what the kernel debug messages said. Looking at /dev/disk* showed me that /dev/disk5s1 was a potential candidate.

# mount -t apfs /dev/disk5s1 /System/VM

Finally, this worked. I was able to copy my files over, and remove them from the Data partition. This freed up about 45GB, which allowed me to continue with the upgrade.

After the upgrade completed, I appeared to have 75GB free. I haven't had a chance to check where the extra space came from. I also plan to permanently use the SD Card (256GB) as an external hard drive.

Wednesday, November 18, 2020

Understanding Emotion for Happy Users

How does your site make your users feel?

Introduction

So you’ve come here for a post about performance, but here I am talking about emotion… what gives? I hope that if you haven’t already, then as this post progresses, you’ll see that performance and emotion are closely intertwined.

While we may be web builders, our goal is to run a business that provides services or products to real people. The website we build is a means of connecting people to that service or product.

The way things are…

The art and science of measuring the effects of signal latency on real users is now about 250 years old. We now call this Real User Measurement, or RUM for short, and it’s come a long way since Steve Souders’ early work at Yahoo.

Browsers now provide us with many APIs to fetch performance metrics that help site owners make sites faster. Concurrently, the Core Web Vitals initiative from Google helps identify metrics that most affect the user experience.

These metrics, while useful operationally, don’t give us a clear picture of the user experience, or why we need to optimise them for our site in particular. They don’t answer the business or human questions of, “Why should we invest in web performance?” (versus, for example, a feature that customers really want), or even more specifically, “What should we work on first?”.

Andy Davies recently published a post about the link between site speed and business outcomes…

Context influences experience,
Experience influences behaviour,
Behaviour influences business outcomes.

All of the metrics we collect and optimise for deal with context, and we spend very little time measuring and optimising the rest of the flow.

Switching Hats

Over the last decade working on boomerang and mPulse, we slowly came to the realisation that we’d been approaching performance metrics from a developer-centric view. We’d been drawing on our experience as developers – users who have browser dev tools shortcuts committed to muscle memory. We were measuring and optimising the metrics that were useful and easy to collect from a developer’s point of view.

Once we switched hats to draw on our experiences as consumers of the web, the metrics that really matter became clearer. We started asking better questions...

  • What does it mean that performance improved by 100ms?
  • Are all 100ms the same?
  • Do all users perceive time the same way?
  • Is performance all that matters?

In this post, we’ll talk about measuring user experience and its effects on behaviour, what we can infer from that behaviour, and how it affects business outcomes.

Delight & Frustration

In Group Psychology and the Analysis of the Ego, Freud notes that “Frustration occurs when there is an inhibiting condition that interferes with or stops the realization of a goal.”

Users visit our sites to accomplish a goal. Perhaps they’re doing research to act on later, perhaps they want to buy something, perhaps they’re looking to share an article they read a few days ago.

Anything that slows down or prevents the user from accomplishing this goal can cause frustration. On the other hand, making their goal easy to find and achieve can be delightful.

How a user feels when using our site affects whether they’ll come back and “convert” into customers (however you may define convert).

The Link Between Latency & Frustration

In 2013, Tammy Everts and her team at Radware ran a usability lab experiment. The study hooked participants up to EEG devices and asked them to shop on certain websites. Half the users had an artificial delay added to their browsing experience, and neither group was made aware of the performance changes. They all believed they were testing the usability of the sites. The study showed that...

A 500ms connection speed delay resulted in up to a 26% increase in peak frustration and up to an 8% decrease in engagement.

Similarly in 2015, Ericsson ConsumerLab neuro research studied the effects of delayed web pages on mobile users and found that “Delayed web pages caused a 38% rise in mobile users' heart rates — equivalent to the anxiety of watching a horror movie alone.”

This may not be everyone’s cup of tea, and the real implication is that users make a conscious or unconscious decision on whether to stick around, return, or leave the site.

Cognitive Bias

Various cognitive biases affect how individual experiences shape perception and behaviour. Understanding these biases, and intervening when an experience trends negative, can improve the overall experience.

Perceptual Dissonance

Also known as Sensory Dissonance, Perceptual Dissonance results from unexpected outcomes of common actions.

The brain’s predictive coding is what helps you do things like “figure out if a car coming down the road is going slow enough for you to cross safely”. A perceptual violation of this coding is useful in that it helps us learn new things, but if that violation breaks long-standing “truths”, or if violations are inconsistent, it makes learning impossible and leads to psychological stress and frustration.

On the web, users expect websites to behave in a certain way. Links should be clickable, sites should in general scroll vertically, etc. Things like jank while scrolling, nothing happening when a user clicks a link (dead clicks), or a click target moving as the user attempts to click on it (layout shift) cause perceptual dissonance and frustration.

If these bad experiences are consistent, then users come to expect them. Our data shows that users from geographies where the internet is slower than average tend to be more patient with web page loads.

Survivorship Bias

We only measure users who can reach our site. For some users, a very slow experience is better than an unreachable site.

In 2012, after YouTube made their site lighter, Chris Zacharias found that aggregate performance had gotten worse. On delving into the data, they found that new users who were previously unable to access the site were now coming in at the long tail. The site appeared slower in aggregate, but the number of users who could use it had gone up.

Negativity Bias

Users are more likely to remember and talk to their friends about their bad experiences with a site than about the good ones. We need only run a Twitter search for “$BRAND_NAME slow” to see complaints about bad experiences.

Bad experiences are also perceived as far more intense than equivalent good experiences. To end up with a neutral overall experience, bad experiences need to be balanced out by more intense good experiences. A single bad experience over the course of a session makes it much harder for the session to end in overall delight.

Active Listening

Research shows that practicing Active Listening can have a big impact on countering Negativity Bias. Simply acknowledging when you’ve screwed up and didn’t meet the user’s expectations can alleviate negative perception. If we detect, via JavaScript, that the page is taking too long to transition between loading states, we could perhaps display a message that acknowledges and apologizes for things going slower than expected.
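
As a sketch of what that might look like in the browser (the 4-second budget, the banner markup, and the wiring are assumptions for illustration, not a measured threshold or a real product):

```javascript
// Decide whether we've blown our loading budget and owe the user
// an acknowledgement. Kept as a pure function so it's easy to test.
function shouldApologize(elapsedMs, budgetMs) {
  return elapsedMs > budgetMs;
}

// Build the apology banner described above.
function apologyBanner() {
  const el = document.createElement("div");
  el.className = "slow-load-apology";
  el.textContent =
    "Hey, we realise that it's taking a little longer than expected " +
    "to get to what you want. We're sorry and hope you'll stick around a bit.";
  return el;
}

// Browser-only wiring: some time after navigation starts, check whether
// the page has finished loading; if not, acknowledge the delay.
if (typeof document !== "undefined") {
  const budgetMs = 4000; // assumed budget
  setTimeout(() => {
    if (document.readyState !== "complete" &&
        shouldApologize(performance.now(), budgetMs)) {
      document.body.appendChild(apologyBanner());
    }
  }, budgetMs);
}
```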

Hey, we realise that it’s taking a little longer than expected to get to what you want. You deserve better. We’re sorry and hope you’ll stick around a bit.

Users will be more forgiving if their pain is acknowledged.

Measuring Emotion

There are many ways we could measure the emotional state of users using our site. These range from active engagement to completely creepy. Naturally not all of these will be applicable for websites...

  • Use affective computing (facial analysis, EEGs, pulse tracking, etc.)
  • Ask the user via a survey popover
  • Business outcomes of behaviour
  • Behavioural analysis

Affective Computing

For website owners, affective computing isn’t really in play. Things like eye tracking, wireless brain interfaces, and other affective computing methodologies are too intrusive. They work well in a lab environment where users consent to this kind of tracking and can be hooked up to measurement devices, but they’re both inconvenient and creepy to run on the web.

Ask the user

Asking the user can be effective as shown by a recent study from Wikipedia. The study used a very simple Yes/No/No Comment style dialog with randomized order. They found that users’ perceived quality of experience is inversely proportional to median load time. A 4% temporary improvement to page load time resulted in an equally temporary 1% extra satisfied users.

[Area chart of two timeseries: median loadEventEnd and satisfaction ratio (positive/total), covering one year from Oct 2019 to Oct 2020.]

This method requires active engagement by the user and suffers from selection bias and the Hawthorne effect.

It’s hard to quantify what kinds of experiences would reduce the effects of selection bias and result in users choosing to answer the survey, or how you’d want to design the popover to increase self-selection.

The Hawthorne effect, on the other hand, suggests that individuals change the way they react to stimuli if they know they’re being measured or observed.

Business Outcomes

Measuring business outcomes is necessary but it can be hard to identify what context resulted in an outcome. One needs to first understand the intermediate steps of experience and behaviour. Did a user bounce because the experience was bad, or did they just drop in to do some research and will return later to complete a purchase?

Behavioural analysis

Applying the results of lab based research to users actively using a website can help tie experience to behaviour. We first need to introduce some new terms that we’ll define in the paragraphs that follow.

Rage Clicks, Wild Mouse, Scrandom, and Backtracking are behavioural signals we can use. In conjunction with when in a page’s life cycle users typically expect different events to take place, they can paint a picture of user expectations and behaviour.

Correlating these metrics with contextual metrics like Core Web Vitals on one hand, and business outcomes on the other can help us tell a more complete story of which performance metrics we should care about and why.

Rage, Frustration & Confusion

To measure Rage, Frustration & Confusion, we look at Rage Clicks, Wild Mouse, Scrandom, and Backtracking.

Rage Clicks

Rage Clicks occur when users rapid-fire click on your site. It is the digital equivalent of cursing to release frustration. We’ve probably all caught ourselves rage clicking at some point. Click once, nothing happens, click again, still nothing, and then on and on. This could be a result of interaction delays, or of users expecting something to be clickable when it isn't.

Rage clicks can be measured easily and non-intrusively, and are easy to analyse.
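
As an illustration of how simple the measurement can be, here's a toy detector (the three-clicks-within-a-second threshold is an assumption for this sketch, not an industry-standard definition):

```javascript
// Returns true if the sorted click timestamps (in ms) contain
// `minClicks` or more clicks within any `windowMs` span -
// our working definition of a rage-click burst.
function isRageClick(clickTimesMs, minClicks = 3, windowMs = 1000) {
  for (let i = 0; i + minClicks - 1 < clickTimesMs.length; i++) {
    if (clickTimesMs[i + minClicks - 1] - clickTimesMs[i] <= windowMs) {
      return true;
    }
  }
  return false;
}
```

In a real page you'd record event.timeStamp for clicks on each target and run the detector over that target's recent click history.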

Fullstory has some great resources around Rage Clicks.

Wild Mouse

Research shows that people who are angry are more likely to use the mouse in a jerky and sudden, but surprisingly slow fashion.

People who feel frustrated, confused or sad are less precise in their mouse movements and move it at different speeds.

There are several mouse movement patterns we expect while a user traverses a website. Horizontal and vertical reading patterns suggest that the user is engaged with your content.

On the other hand, random patterns, or jumping between options in a form can suggest confusion, doubt, and frustration. See Churruca, 2011 for the full study.

The JavaScript library Dawdle.js can help classify these mouse patterns.

Scrandom

Scrandom is the act of randomly scrolling the page up and down with no particular scroll target. This can indicate that a user is unsure of the content, the page is too long, or is waiting for something to happen and making sure that the page is still responsive without accidentally clicking anything.

Backtracking

Backtracking is the process of hitting the back button on the web. Users who are confused or lost on your site may hit the back button often to get back to a safe space. This behaviour may manifest itself in different ways, but can often be identified with very long sessions that appear to loop.

Tie this into the Page Load Timeline

In his post on Web Page Usability, Addy Osmani states that loading a page is a progressive journey with four key moments to it: Is it happening? Is it useful? Is it usable? and Is it delightful? And he includes this handy graphic to explain it:

[Page load timeline illustrating First Paint, First Contentful Paint, and Time to Interactive, annotated with “When did the user feel they could interact?” and “When could they interact?”]

The first three are fairly objective. With only minor differences between browsers, it’s straightforward to pull this information out of standard APIs, and possibly supplement it with custom APIs like User Timing.

We’ve found that over 65% of users expect a site to be usable after elements have started becoming visible but before it is actually Interactive. Contrast that with 30% who will wait until after the onload event has fired.

Correlating Rage with Loading Events

Comparing the points in time when users rage click with the loading timeline above, we see some patterns.

[Relative time series showing the intensity of rage clicks tied to when users first interact with a page relative to page load, with First Input Delay as a separate series and 25th-75th percentile bands for the First Paint, Largest Contentful Paint, Visually Ready, and Interactive times relative to Page Load.]
The horizontal axis on this chart is time as a relative percent of the full page load time. -50 indicates half of the page load time while +50 is 1.5x the page load time. The vertical axis indicates intensity of rage while point radius indicates probability of rage clicks at that time point. The coloured bars indicate 25th to 75th percentile ranges for the particular timer relative to full page load with the line going through indicating the median.

We see a large amount of rage between content becoming visible and the page becoming interactive. Users expect to be able to interact with the page soon after content becomes visible, and if that expectation isn’t met, it results in rage clicking.

We also see a small stream of rage clicks after the page has completed loading, caused by interaction delays.

There’s a small gap just before the onload event fires. The onload event is when many JavaScript event handlers run, which in turn result in Long Tasks, and increased Interaction Delays. What we’re seeing here is not the absence of any interaction, but survivorship bias where the interactions that happen at that time aren’t captured until later.

The horizontal axis on this chart is relative time along the page load timeline. We looked at various combinations of absolute and relative time across multiple timers, and it was clear that relative time is the stronger model, which brings us to a new metric based on relative timers...

Frustration Index

The frustration index, developed by Tim Vereecke, is a measure based on the relation between loading phases. We’ve seen that once one event occurs, users expect the next to happen within a certain amount of time. If we miss that expectation, the user's perception is that something is stopping or inhibiting their ability to complete their task, resulting in frustration.

The Frustration Index encapsulates that relationship. The formula we use is constantly under development as research brings new things to light, but it’s helpful to visit the website to understand exactly how it works and see some examples.

So how do we know that this is a good metric to study?

Correlating Rage & Frustration

It turns out that there is a strong correlation (ρ=0.91) between the intensity of rage (vertical axis) that a user expresses and the calculated frustration index (horizontal axis) of the page.

[Scatter plot of Frustration Index (horizontal axis) against intensity of rage clicks (vertical axis); the two variables have a Pearson’s correlation coefficient of 0.91.]
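
For reference, the coefficient here is plain Pearson's r, which can be computed like this (a generic sketch, not our analysis pipeline):

```javascript
// Pearson correlation coefficient between two equal-length samples.
// Returns a value in [-1, 1]; 0.91 indicates a strong positive
// linear relationship.
function pearson(xs, ys) {
  const n = xs.length;
  const mean = (a) => a.reduce((sum, v) => sum + v, 0) / n;
  const mx = mean(xs);
  const my = mean(ys);
  let num = 0, dx2 = 0, dy2 = 0;
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - mx;
    const dy = ys[i] - my;
    num += dx * dy;   // covariance numerator
    dx2 += dx * dx;   // variance of xs (unnormalised)
    dy2 += dy * dy;   // variance of ys (unnormalised)
  }
  return num / Math.sqrt(dx2 * dy2);
}
```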

Rather than looking at individual timers in isolation when optimizing, it is better to consider all timers together. Improving one of them changes the user’s expectation of when the other events should happen, and missing that expectation results in frustration.

Further, the formula is something we can apply client-side to determine whether we’re meeting expectations, and to practice active listening when we’re not.

Correlating Frustration & Business Outcomes

Looking at the correlation between Frustration Index and certain business metrics also shows a pattern.

[Double scatter plot of Frustration Index (horizontal axis) against bounce rate (first vertical axis) and average session duration in minutes (second vertical axis).]
  • Bounce Rate is proportional to the frustration index with a sharp incline around what we call the LD50 point (for this particular site). ρb=0.65
  • Average time spent on the site goes down as frustration increases, again sharply at first and then tapering off. ρt=-0.49

LD50

The LD50, or Median Lethal Dose is a term borrowed from the biological sciences. Buddy Brewer first applied the term to web performance in 2012, and we’ve been using it ever since.

In biology, it’s the dosage of a toxin that kills off 50% of the sample, be it tumour cells, or mice.

On the web, we think of it more in terms of when 50% of users decide not to move on in their journey. We could apply it to bounce rate, or retention rate, or any other rate that’s important to your site, and the “dose”, may be a timer value, or frustration index, or anything else. Depending on the range of the metric in question, we may also use a percentile other than the median, for example, LD25 or LD75.

This isn’t a single magic number that works for all websites. It isn’t even a single number that works for all pages on a site or for all users. Different pages and sites have different levels of importance to a user, and a user’s emotional state, or even the state of their device (e.g., low battery) when they visit your site, can affect how patient they are.

[Column chart of the LD25 frustration index value for users from different geos: US: 26, Germany: 10, Japan: 18, Australia: 42, Canada: 44.]

Patience is also a Cultural Thing

People from different parts of the world have a different threshold for frustration.

Many of our customers have international audiences and they have separate sites customized for each locale. We find that users from different global regions have different expectations of how fast a site should be.

In this chart, looking at 5 high GDP countries (that we have data for), we see a wide distribution in LD25 value across them, ranging from a value of 10 for Germany to the 40s for Australia and Canada. It’s not shown in this chart, but the difference is even wider when we look at LD50, with Germany at 14 and Canada at 100.

So how fast should our site be?

We’ve heard a lot about how our site’s performance affects the user experience, and consequently how people feel when using our site. We’ve seen how the “feel” of a site can affect the business, but what does all of that tell us about how to build our sites?

  • How fast should we be to reduce frustration?
  • What should we be considering in our performance budgets?
  • How do we leave our users feeling happy?

I think these may be secondary questions…

A better question to start with, is:

Will adding a new feature delight or frustrate the user?

References

Acknowledgements

Thanks to Andy Davies, Nic Jansma, Paul Calvano, Tim Vereecke, and Cliff Crocker for feedback on an earlier draft of this post.

Thanks also to the innumerable practitioners whose research I've built upon to get here including Addy Osmani, Andy Davies, Gilles Dubuc, Lara Hogan, Nicole Sullivan, Silvana Churruca, Simon Hearne, Tammy Everts, Tim Kadlec, Tim Vereecke, the folks from Fullstory, and many others that I'm sure I've missed.
