GLOB(7)                   (2020-08-13)                    GLOB(7)

     NAME
          glob - globbing pathnames

     DESCRIPTION
          Long ago, in UNIX V6, there was a program /etc/glob that
          would expand wildcard patterns.  Soon afterward this became
          a shell built-in.

          These days there is also a library routine glob(3) that will
          perform this function for a user program.

          The rules are as follows (POSIX.2, 3.13).

        Wildcard matching
          A string is a wildcard pattern if it contains one of the
          characters aq?aq, aq*aq or aq[aq.  Globbing is the operation that
          expands a wildcard pattern into the list of pathnames match-
          ing the pattern.  Matching is defined by:

          A aq?aq (not between brackets) matches any single character.

          A aq*aq (not between brackets) matches any string, including
          the empty string.

          Character classes

          An expression "[...]" where the first character after the
          leading aq[aq is not an aq!aq matches a single character, namely
          any of the characters enclosed by the brackets.  The string
          enclosed by the brackets cannot be empty; therefore aq]aq can
          be allowed between the brackets, provided that it is the
          first character.  (Thus, "[][!]" matches the three charac-
          ters aq[aq, aq]aq and aq!aq.)

          Ranges

          There is one special convention: two characters separated by
          aq-aq denote a range.  (Thus, "[A-Fa-f0-9]" is equivalent to
          "[ABCDEFabcdef0123456789]".)  One may include aq-aq in its
          literal meaning by making it the first or last character
          between the brackets.  (Thus, "[]-]" matches just the two
          characters aq]aq and aq-aq, and "[--0]" matches the three char-
          acters aq-aq, aq.aq, aq0aq, since aq/aq cannot be matched.)

          Complementation

          An expression "[!...]" matches a single character, namely
          any character that is not matched by the expression obtained
          by removing the first aq!aq from it.  (Thus, "[!]a-]" matches
          any single character except aq]aq, aqaaq and aq-aq.)

     Page 1                        Linux             (printed 5/22/22)

     GLOB(7)                   (2020-08-13)                    GLOB(7)

          One can remove the special meaning of aq?aq, aq*aq and aq[aq by
          preceding them by a backslash, or, in case this is part of a
          shell command line, enclosing them in quotes.  Between
          brackets these characters stand for themselves.  Thus,
          "[[?*\]" matches the four characters aq[aq, aq?aq, aq*aq and aq\aq.

        Pathnames
          Globbing is applied on each of the components of a pathname
          separately.  A aq/aq in a pathname cannot be matched by a aq?aq
          or aq*aq wildcard, or by a range like "[.-0]".  A range con-
          taining an explicit aq/aq character is syntactically incor-
          rect.  (POSIX requires that syntactically incorrect patterns
          are left unchanged.)

          If a filename starts with a aq.aq, this character must be
          matched explicitly.  (Thus, rm * will not remove .profile,
          and tar c * will not archive all your files; tar c . is bet-
          ter.)

        Empty lists
          The nice and simple rule given above: "expand a wildcard
          pattern into the list of matching pathnames" was the origi-
          nal UNIX definition.  It allowed one to have patterns that
          expand into an empty list, as in

              xv -wait 0 *.gif *.jpg

          where perhaps no *.gif files are present (and this is not an
          error).  However, POSIX requires that a wildcard pattern is
          left unchanged when it is syntactically incorrect, or the
          list of matching pathnames is empty.  With bash one can
          force the classical behavior using this command:

              shopt -s nullglob

          (Similar problems occur elsewhere.  For example, where old
          scripts have

              rm `find . -name "*ti"`

          new scripts require

              rm -f nosuchfile `find . -name "*ti"`

          to avoid error messages from rm called with an empty argu-
          ment list.)

     NOTES
        Regular expressions
          Note that wildcard patterns are not regular expressions,
          although they are a bit similar.  First of all, they match
          filenames, rather than text, and secondly, the conventions

     Page 2                        Linux             (printed 5/22/22)

     GLOB(7)                   (2020-08-13)                    GLOB(7)

          are not the same: for example, in a regular expression aq*aq
          means zero or more copies of the preceding thing.

          Now that regular expressions have bracket expressions where
          the negation is indicated by a aqhaaq, POSIX has declared the
          effect of a wildcard pattern "[ha...]" to be undefined.

        Character classes and internationalization
          Of course ranges were originally meant to be ASCII ranges,
          so that "[ -%]" stands for "[ !"#$%]" and "[a-z]" stands for
          "any lowercase letter".  Some UNIX implementations general-
          ized this so that a range X-Y stands for the set of charac-
          ters with code between the codes for X and for Y.  However,
          this requires the user to know the character coding in use
          on the local system, and moreover, is not convenient if the
          collating sequence for the local alphabet differs from the
          ordering of the character codes.  Therefore, POSIX extended
          the bracket notation greatly, both for wildcard patterns and
          for regular expressions.  In the above we saw three types of
          items that can occur in a bracket expression: namely (i) the
          negation, (ii) explicit single characters, and (iii) ranges.
          POSIX specifies ranges in an internationally more useful way
          and adds three more types:

          (iii) Ranges X-Y comprise all characters that fall between X
          and Y (inclusive) in the current collating sequence as
          defined by the LC_COLLATE category in the current locale.

          (iv) Named character classes, like

          [:alnum:]  [:alpha:]  [:blank:]  [:cntrl:]
          [:digit:]  [:graph:]  [:lower:]  [:print:]
          [:punct:]  [:space:]  [:upper:]  [:xdigit:]

          so that one can say "[[:lower:]]" instead of "[a-z]", and
          have things work in Denmark, too, where there are three let-
          ters past aqzaq in the alphabet.  These character classes are
          defined by the LC_CTYPE category in the current locale.

          (v) Collating symbols, like "[.ch.]" or "[.a-acute.]", where
          the string between "[." and ".]" is a collating element
          defined for the current locale.  Note that this may be a
          multicharacter element.

          (vi) Equivalence class expressions, like "[=a=]", where the
          string between "[=" and "=]" is any collating element from
          its equivalence class, as defined for the current locale.
          For example, "[[=a=]]" might be equivalent to "[a'a�a:a^a]",
          that is, to "[a[.a-acute.][.a-grave.][.a-umlaut.][.a-
          circumflex.]]".

     SEE ALSO

     Page 3                        Linux             (printed 5/22/22)

     GLOB(7)                   (2020-08-13)                    GLOB(7)

          sh(1), fnmatch(3), glob(3), locale(7), regex(7)

     COLOPHON
          This page is part of release 5.10 of the Linux man-pages
          project.  A description of the project, information about
          reporting bugs, and the latest version of this page, can be
          found at https://www.kernel.org/doc/man-pages/.

     Page 4                        Linux             (printed 5/22/22)