Hugo Regular Expressions (featuring emoji codes & ignoreFiles⁠🙈)
Updated  by  nm  2024-March-21


Prerequisites

This article assumes you know the basics of the Hugo static site generator, emoji codes such as :see_no_evil:,1 and regular expressions, including the following syntax.

Expression     Matches…
^              beginning of string
$              end of string
.              any single character
\.             a literal period (.)
*              zero or more of previous item
+              one or more of previous item
?              zero or one of previous item
[a-z]          a single character in abcdefghijklmnopqrstuvwxyz
[-+0-9_a-z]    a single character in -+0123456789_abcdefghijklmnopqrstuvwxyz
[.]            a literal period (.)


Introduction

In Hugo, you can use regular expressions…

Below I describe some Hugo regular expressions that I use to help maintain the Infinite Ink website.


Regular expression examples

Example 1: Ignoring files with the ignoreFiles config parameter

Infinite Ink’s primary config file, hugo.yaml, includes this:

ignoreFiles:
  - \.bak$
  - IGNORE

Each item in this ignoreFiles list2 is a regular expression. The first item, \.bak$, matches any path that ends with .bak.3 The second item, IGNORE, matches any path that contains all-⁠uppercase IGNORE.
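Because Hugo is written in Go and uses Go's regexp package, the two patterns above can be tried out in a small standalone Go program. The following is a minimal sketch, not Hugo's actual implementation: it assumes each pattern is tested against a path with an unanchored search, and the helper isIgnored and the example paths are made up for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// ignorePatterns mirrors the two ignoreFiles entries shown above.
var ignorePatterns = []*regexp.Regexp{
	regexp.MustCompile(`\.bak$`),
	regexp.MustCompile(`IGNORE`),
}

// isIgnored is a hypothetical helper (not part of Hugo). It reports whether
// any of the patterns matches somewhere in the given path.
func isIgnored(path string) bool {
	for _, re := range ignorePatterns {
		if re.MatchString(path) {
			return true
		}
	}
	return false
}

func main() {
	// These paths are invented for illustration.
	paths := []string{
		"content/notes.md.bak",    // ends with .bak
		"content/IGNORE-draft.md", // contains uppercase IGNORE
		"content/bak.md",          // does not end with .bak
		"content/ignore-me.md",    // lowercase ignore only
	}
	for _, p := range paths {
		fmt.Printf("%-26s ignored=%v\n", p, isIgnored(p))
	}
}
```

In this sketch only the first two paths are reported as ignored. The third path shows why the $ anchor matters, and the fourth shows that the match is case sensitive.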


note

To learn about ignoreFiles, see gohugo.io/getting-started/configuration/#ignore-content-and-data-files-when-rendering.


Example 2: Highlighting emoji codes with the replaceRE function

As of Hugo v0.120.0, emoji codes1 are supported only in Goldmark Markdown. This means that I need to find and replace the emoji codes in Infinite Ink’s AsciiDoc and Pandoc Markdown source files. I’m doing this by having Hugo automatically find and highlight each emoji code and then I manually replace it. Here are the details:

  1. Use Hugo’s replaceRE function to find and highlight each emoji code (for example, :dragon:).
  2. Manually change each emoji code to its literal Unicode character. For example, I manually change :dragon: to 🐉 in the source file.4

Here is the code I use in a Hugo layout file to do #1 (find and highlight):

{{- if hugo.IsDevelopment -}}
 {{ .Content
    | replaceRE
      `(:[-+0-9_a-z]+:)`
      `<span style="font-size: 10em; color: Green;"><nobr>⚠$1 EmojiCode</nobr></span>`
    | safeHTML
 }}
{{- else -}}
 {{ .Content }}
{{- end -}}

And here is what is rendered when I’m in development mode and the source5 contains the string :dragon::

⚠:dragon: EmojiCode

I make this huge and use <nobr></nobr>6 so that any rendered page with an emoji code will have a horizontal scroll bar, which lets me know there’s an issue from anywhere on the page (because the horizontal scroll bar appears on the whole page).

To learn about replaceRE, see gohugo.io/functions/strings/replacere/.

note

This replaceRE example uses a capture group and a back reference. The part in parentheses, :[-+0-9_a-z]+:, is captured, and $1 is the back reference to it.
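The capture group and $1 behave the same way in plain Go, so the replacement can be tested outside a Hugo layout. The following is a minimal sketch that uses the same pattern with Go's ReplaceAllString. The helper highlightEmojiCodes and the shortened <mark> replacement are assumptions made for brevity and are not the markup used on Infinite Ink.

```go
package main

import (
	"fmt"
	"regexp"
)

// emojiCode is the same pattern used in the layout code above.
var emojiCode = regexp.MustCompile(`(:[-+0-9_a-z]+:)`)

// highlightEmojiCodes is a hypothetical helper. It wraps every emoji code in
// a <mark> element; $1 is replaced by the text captured by the parentheses.
func highlightEmojiCodes(s string) string {
	return emojiCode.ReplaceAllString(s, `<mark>⚠$1 EmojiCode</mark>`)
}

func main() {
	fmt.Println(highlightEmojiCodes("Here be :dragon: and :+1: codes."))

	// Go reads the longest run of letters, digits, and underscores after $
	// as the group name, so $1x means a group named "1x", which does not
	// exist and expands to nothing. Braces make the intent explicit.
	fmt.Println(emojiCode.ReplaceAllString(":dragon:", `[$1x]`))   // prints "[]"
	fmt.Println(emojiCode.ReplaceAllString(":dragon:", `[${1}x]`)) // prints "[:dragon:x]"
}
```

The replacement in the layout code works with a bare $1 because the reference is followed by a space. If the text after the reference started with a letter, digit, or underscore, ${1} would be the safer form.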


More examples

For more examples of Hugo regular expressions, see Infinite Ink’s Configuring Security in Hugo⁠🚥.


Case sensitive and insensitive regular expressions

By default, Hugo regular expressions are case sensitive. For example, if Infinite Ink’s hugo.yaml includes…

ignoreFiles:
  - IGNORE

…then this article (whose source file is hugo-regular-expressions-ignorefiles.md) will be built, because the lowercase ignore in its file name does not match the uppercase pattern IGNORE.


If I want Hugo to ignore paths that contain the case insensitive string IGNORE (IGNORE, ignore, IgNoRe, iGnOrE, etc.), then I specify this in Infinite Ink’s hugo.yaml:

ignoreFiles:
  - (?i)IGNORE
  # ^^^^
  #  👆
  # means case insensitive match

With this setting, this article will not be built (because its source file is hugo-regular-expressions-ignorefiles.md, which contains ignore).
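The difference between the two settings can be checked with Go's regexp package, which understands the same (?i) flag. The following is a minimal sketch. The helper matchesAnywhere is hypothetical, and the sketch assumes the pattern is searched for anywhere in the file name.

```go
package main

import (
	"fmt"
	"regexp"
)

// matchesAnywhere is a hypothetical helper. It compiles the pattern and
// reports whether it matches somewhere in the given path.
func matchesAnywhere(pattern, path string) bool {
	return regexp.MustCompile(pattern).MatchString(path)
}

func main() {
	path := "hugo-regular-expressions-ignorefiles.md"

	// Case sensitive: uppercase IGNORE does not occur in the file name.
	fmt.Println(matchesAnywhere(`IGNORE`, path)) // prints "false"

	// Case insensitive: (?i) lets IGNORE match the lowercase ignore.
	fmt.Println(matchesAnywhere(`(?i)IGNORE`, path)) // prints "true"
}
```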



See also

For more about gohugo, see Infinite Ink’s…


Endnotes



  1. In Hugo, “emoji shortcodes” are known as “emoji codes” (because “shortcode” has another meaning). An example is :see_no_evil:, which is the emoji code for 🙈. To learn about emoji codes, see emojipedia.org/shortcodes/. The emoji codes supported by Goldmark Markdown are listed on gohugo.io/quick-reference/emojis/.

  2. In Hugo, a “list” is also known as a “slice.”

  3. To learn about .bak files, see wikipedia.org/wiki/Bak_file.

  4. On Windows, I insert a literal emoji character (as opposed to an emoji code) with the Windows built-in emoji keyboard, which can be launched with Win+. (the Windows key plus the period key).

  5. More precisely, this is what is rendered when the .Content (not the source) contains the string :dragon:. To learn about Hugo’s .Content page variable, see Infinite Ink’s Hugo RawContent and Content Fingerprints⁠🆔.

  6. The <nobr> HTML tag means “no break” or “no wrap.” For details, see developer.mozilla.org/en-US/docs/Web/HTML/Element/nobr.

