General Options
--help
|
Print a help message briefly summarizing command-line options, and exit.
|
-V, --version
|
Print the version number of grep, and exit.
|
Match Selection Options
-E, --extended-regexp
|
Interpret PATTERN as an extended regular expression (see Basic vs. Extended Regular Expressions).
|
-F, --fixed-strings
|
Interpret PATTERN as a list of fixed strings, separated by newlines, that is to be matched.
|
-G, --basic-regexp
|
Interpret PATTERN as a basic regular expression (see Basic vs. Extended Regular Expressions). This is the default option when running grep.
|
-P, --perl-regexp
|
Interpret PATTERN as a Perl regular expression. This functionality is still experimental, and may produce warning messages.
|
Matching Control Options
-e PATTERN, --regexp=PATTERN
|
Use PATTERN as the pattern to match. This can be used to specify multiple search patterns, or to protect a pattern beginning with a dash (-).
|
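As a small illustration of -e, the sketch below feeds three made-up lines to grep. The sample text and the variable names are assumptions chosen for the example.

```shell
# Hypothetical sample input: three lines, one of which starts with a dash.
input=$(printf '%s\n' 'alpha' '-beta' 'gamma')

# Two -e options: a line is selected if it matches either pattern.
multi=$(printf '%s\n' "$input" | grep -e 'alpha' -e 'gamma')

# -e stops the leading dash of the pattern from being parsed as an option.
dashed=$(printf '%s\n' "$input" | grep -e '-beta')

printf '%s\n' "$multi" "$dashed"
```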
-f FILE, --file=FILE
|
Obtain patterns from FILE, one per line.
|
-i, --ignore-case
|
Ignore case distinctions in both the PATTERN and the input files.
|
-v, --invert-match
|
Invert the sense of matching, to select non-matching lines.
|
-w, --word-regexp
|
Select only those lines containing matches that form whole words. The test is that the matching substring must either be at the beginning of the line, or preceded by a non-word constituent character. Or, it must be either at the end of the line or followed by a non-word constituent character. Word-constituent characters are letters, digits, and underscores.
|
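The whole-word test described for -w can be tried on a few invented lines. This is only a sketch; the input strings are assumptions picked to show a match at the start of a line, a match inside a longer word, and a match followed by an underscore.

```shell
# Hypothetical input: "cat" alone, inside a word, before "_", and before ".".
input=$(printf '%s\n' 'cat' 'concatenate' 'the cat_sat' 'a cat.')

# Only lines where "cat" is bounded by non-word characters (or line edges)
# are selected; "_" counts as a word constituent, "." does not.
words=$(printf '%s\n' "$input" | grep -w 'cat')

printf '%s\n' "$words"
```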
-x, --line-regexp
|
Select only matches that exactly match the whole line.
|
-y
|
The same as -i.
|
General Output Control
-c, --count
|
Instead of the normal output, print a count of matching lines for each input file. With the -v, --invert-match option (see above), count non-matching lines.
|
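A minimal sketch of -c with and without -v, using an invented five-line log as input (the log lines and variable names are assumptions):

```shell
# Hypothetical log: two lines contain "error", three do not.
input=$(printf '%s\n' 'error: disk' 'ok' 'error: net' 'ok' 'ok')

# Count of matching lines.
matching=$(printf '%s\n' "$input" | grep -c 'error')

# With -v the count covers the non-matching lines instead.
non_matching=$(printf '%s\n' "$input" | grep -c -v 'error')

printf 'matching=%s non_matching=%s\n' "$matching" "$non_matching"
```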
--color[=WHEN], --colour[=WHEN]
|
Surround the matched (non-empty) strings, matching lines, context lines, file names, line numbers, byte offsets, and separators (for fields and groups of context lines) with escape sequences to display them in color on the terminal. The colors are defined by the environment variable GREP_COLORS. The older environment variable GREP_COLOR is still supported, but its setting does not have priority. WHEN is never, always, or auto.
|
-L, --files-without-match
|
Instead of the normal output, print the name of each input file from which no output would normally have been printed. The scanning will stop on the first match.
|
-l, --files-with-matches
|
Instead of the normal output, print the name of each input file from which output would normally have been printed. The scanning will stop on the first match.
|
-m NUM, --max-count=NUM
|
Stop reading a file after NUM matching lines. If the input is standard input from a regular file, and NUM matching lines are output, grep ensures that the standard input is positioned to just after the last matching line before exiting, regardless of the presence of trailing context lines. This enables a calling process to resume a search. When grep stops after NUM matching lines, it outputs any trailing context lines. When the -c or --count option is also used, grep does not output a count greater than NUM. When the -v or --invert-match option is also used, grep stops after outputting NUM non-matching lines.
|
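The following sketch shows -m stopping after a fixed number of matching lines, and how it caps the count reported by -c. The four input lines are made up for the example.

```shell
# Hypothetical input: three lines contain "a", one does not.
input=$(printf '%s\n' 'a1' 'b' 'a2' 'a3')

# Stop reading after the second matching line.
first_two=$(printf '%s\n' "$input" | grep -m 2 'a')

# Combined with -c, the reported count never exceeds NUM.
capped_count=$(printf '%s\n' "$input" | grep -c -m 2 'a')

printf '%s\n' "$first_two" "$capped_count"
```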
-o, --only-matching
|
Print only the matched (non-empty) parts of a matching line, with each such part on a separate output line.
|
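To see -o print each matched part on its own line, the sketch below extracts fields from one invented line. The key=value text is an assumption used only for illustration.

```shell
# Hypothetical input line with two "id=" fields.
line='id=12 id=345 name=x'

# Each non-empty match is printed on a separate output line.
ids=$(printf '%s\n' "$line" | grep -o 'id=[0-9]*')

printf '%s\n' "$ids"
```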
-q, --quiet, --silent
|
Quiet; do not write anything to standard output. Exit immediately with zero status if any match is found, even if an error was detected. Also see the -s or --no-messages option.
|
-s, --no-messages
|
Suppress error messages about nonexistent or unreadable files.
|
Output Line Prefix Control
-b, --byte-offset
|
Print the 0-based byte offset within the input file before each line of output. If -o (--only-matching) is specified, print the offset of the matching part itself.
|
-H, --with-filename
|
Print the file name for each match. This is the default when there is more than one file to search.
|
-h, --no-filename
|
Suppress the prefixing of file names on output. This is the default when there is only one file (or only standard input) to search.
|
--label=LABEL
|
Display input actually coming from standard input as input coming from file LABEL. This is especially useful when implementing tools like zgrep, e.g., gzip -cd foo.gz | grep --label=foo -H something. See also the -H option.
|
-n, --line-number
|
Prefix each line of output with the 1-based line number within its input file.
|
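The prefix options above can be compared side by side on a small invented input. The label name "sample" and the four input lines are assumptions made for this sketch.

```shell
# Hypothetical input: "hit" appears on lines 2 and 4.
input=$(printf '%s\n' 'x' 'hit' 'y' 'hit')

# -n prefixes the 1-based line number.
numbered=$(printf '%s\n' "$input" | grep -n 'hit')

# -b prefixes the 0-based byte offset of the start of each output line.
offsets=$(printf '%s\n' "$input" | grep -b 'hit')

# -H forces a file-name prefix; --label names standard input.
labeled=$(printf '%s\n' "$input" | grep -H --label=sample 'hit')

printf '%s\n' "$numbered" "$offsets" "$labeled"
```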
-T, --initial-tab
|
Make sure that the first character of actual line content lies on a tab stop, so that the alignment of tabs looks normal. This is useful with options that prefix their output to the actual content: -H, -n, and -b. To improve the probability that lines from a single file will all start at the same column, this also causes the line number and byte offset (if present) to be printed in a minimum size field width.
|
-u, --unix-byte-offsets
|
Report Unix-style byte offsets. This switch causes grep to report byte offsets as if the file were a Unix-style text file, i.e., with CR characters stripped off. This will produce results identical to running grep on a Unix machine. This option has no effect unless the -b option is also used; it has no effect on platforms other than MS-DOS and MS-Windows.
|
-Z, --null
|
Output a zero byte (the ASCII NUL character) instead of the character that normally follows a file name. For example, grep -lZ outputs a zero byte after each file name instead of the usual newline. This option makes the output unambiguous, even in the presence of file names containing unusual characters like newlines. This option can be used with commands like find -print0, perl -0, sort -z, and xargs -0 to process arbitrary file names, even those that contain newline characters.
|
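A sketch of pairing grep -lZ with xargs -0 so that a file name containing a space survives intact. The temporary directory, file names, and search word are all assumptions created for the example, and it presumes an xargs that supports -0.

```shell
# Hypothetical files in a scratch directory; one name contains a space.
tmpdir=$(mktemp -d)
printf 'needle\n' > "$tmpdir/with space.txt"
printf 'hay\n'    > "$tmpdir/plain.txt"

# -l lists matching files; -Z ends each name with a NUL byte, so xargs -0
# passes "with space.txt" to basename as a single argument.
listed=$(grep -lZ 'needle' "$tmpdir"/*.txt | xargs -0 -n 1 basename)

rm -rf "$tmpdir"
printf '%s\n' "$listed"
```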
Context Line Control
-A NUM, --after-context=NUM
|
Print NUM lines of trailing context after matching lines. Places a line containing a group separator (--) between contiguous groups of matches. With the -o or --only-matching option, this has no effect and a warning is given.
|
-B NUM, --before-context=NUM
|
Print NUM lines of leading context before matching lines. Places a line containing a group separator (--) between contiguous groups of matches. With the -o or --only-matching option, this has no effect and a warning is given.
|
-C NUM, -NUM, --context=NUM
|
Print NUM lines of output context. Places a line containing a group separator (--) between contiguous groups of matches. With the -o or --only-matching option, this has no effect and a warning is given.
|
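The group separator is easiest to see with two matches that are far enough apart that their context does not touch. This is a sketch on invented input, assuming GNU grep.

```shell
# Hypothetical input: "HIT" on lines 3 and 8.
input=$(printf '%s\n' 'l1' 'l2' 'HIT' 'l4' 'l5' 'l6' 'l7' 'HIT' 'l9')

# One line of trailing context per match; the two groups are not adjacent,
# so a "--" separator line is printed between them.
context=$(printf '%s\n' "$input" | grep -A 1 'HIT')

printf '%s\n' "$context"
```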
File and Directory Selection
-a, --text
|
Process a binary file as if it were text; this is equivalent to the --binary-files=text option.
|
--binary-files=TYPE
|
If the first few bytes of a file indicate that the file contains binary data, assume that the file is of type TYPE. By default, TYPE is binary, and grep normally outputs either a one-line message saying that a binary file matches, or no message if there is no match. If TYPE is without-match, grep assumes that a binary file does not match; this is equivalent to the -I option. If TYPE is text, grep processes a binary file as if it were text; this is equivalent to the -a option. Warning: grep --binary-files=text might output binary garbage, which can have nasty side effects if the output is a terminal and if the terminal driver interprets some of it as commands.
|
-D ACTION, --devices=ACTION
|
If an input file is a device, FIFO or socket, use ACTION to process it. By default, ACTION is read, which means that devices are read just as if they were ordinary files. If ACTION is skip, devices are silently skipped.
|
-d ACTION, --directories=ACTION
|
If an input file is a directory, use ACTION to process it. By default, ACTION is read, i.e., read directories just as if they were ordinary files. If ACTION is skip, silently skip directories. If ACTION is recurse, read all files under each directory, recursively, following symbolic links only if they are on the command line. This is equivalent to the -r option.
|
--exclude=GLOB
|
Skip files whose base name matches GLOB (using wildcard matching). A file-name glob can use *, ?, and [...] as wildcards, and \ to quote a wildcard or backslash character literally.
|
--exclude-from=FILE
|
Skip files whose base name matches any of the file-name globs read from FILE (using wildcard matching as described under --exclude).
|
--exclude-dir=DIR
|
Exclude directories matching the pattern DIR from recursive searches.
|
-I
|
Process a binary file as if it did not contain matching data; this is equivalent to the --binary-files=without-match option.
|
--include=GLOB
|
Search only files whose base name matches GLOB (using wildcard matching as described under --exclude).
|
-r, --recursive
|
Read all files under each directory, recursively, following symbolic links only if they are on the command line. This is equivalent to the -d recurse option.
|
-R, --dereference-recursive
|
Read all files under each directory, recursively. Follow all symbolic links, unlike -r.
|
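The file-selection options can be combined in a recursive search. The sketch below builds a throwaway directory tree (all names are assumptions) and looks for a marker only in C sources outside a build directory.

```shell
# Hypothetical tree: src/main.c, src/notes.txt, build/gen.c all contain TODO.
tmpdir=$(mktemp -d)
mkdir "$tmpdir/src" "$tmpdir/build"
printf 'TODO: fix\n' > "$tmpdir/src/main.c"
printf 'TODO: doc\n' > "$tmpdir/src/notes.txt"
printf 'TODO: gen\n' > "$tmpdir/build/gen.c"

# -r recurses, --include keeps only *.c base names,
# --exclude-dir skips the build directory, -l prints file names only.
hits=$(grep -r -l --include='*.c' --exclude-dir='build' 'TODO' "$tmpdir")

# Strip the scratch-directory prefix so only the base name remains.
hit_name=${hits##*/}

rm -rf "$tmpdir"
printf '%s\n' "$hit_name"
```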
Other Options
--line-buffered
|
Use line buffering on output. This can cause a performance penalty.
|
--mmap
|
If possible, use the mmap system call to read input, instead of the default read system call. In some situations, --mmap yields better performance. However, --mmap can cause undefined behavior (including core dumps) if an input file shrinks while grep is operating, or if an I/O error occurs.
|
-U, --binary
|
Treat the file(s) as binary. By default, under MS-DOS and MS-Windows, grep guesses the file type by looking at the contents of the first 32 KB read from the file. If grep decides the file is a text file, it strips the CR characters from the original file contents (to make regular expressions with ^ and $ work correctly). Specifying -U overrules this guesswork, causing all files to be read and passed to the matching mechanism verbatim; if the file is a text file with CR/LF pairs at the end of each line, this will cause some regular expressions to fail. This option has no effect on platforms other than MS-DOS and MS-Windows.
|
-z, --null-data
|
Treat the input as a set of lines, each terminated by a zero byte (the ASCII NUL character) instead of a newline. Like the -Z or --null option, this option can be used with commands like sort -z to process arbitrary file names.
|
Regular Expressions
A regular expression is a pattern that describes a set of strings. Regular expressions are constructed analogously to arithmetic expressions, by using various operators to combine smaller expressions.
grep understands three different versions of regular expression syntax: "basic" (BRE), "extended" (ERE) and "perl" (PCRE). In GNU grep, there is no difference in available functionality between basic and extended syntaxes. In other implementations, basic regular expressions are less powerful. The following description applies to extended regular expressions; differences for basic regular expressions are summarized afterwards. Perl regular expressions give additional functionality.
The fundamental building blocks are the regular expressions that match a single character. Most characters, including all letters and digits, are regular expressions that match themselves. Any meta-character with special meaning may be quoted by preceding it with a backslash.
The period (.) matches any single character.
Character Classes and Bracket Expressions
A bracket expression is a list of characters enclosed by [ and ]. It matches any single character in that list; if the first character of the list is the caret ^ then it matches any character not in the list. For example, the regular expression [0123456789] matches any single digit.
Within a bracket expression, a range expression consists of two characters separated by a hyphen. It matches any single character that sorts between the two characters, inclusive, using the locale's collating sequence and character set. For example, in the default C locale, [a-d] is equivalent to [abcd]. Many locales sort characters in dictionary order, and in these locales [a-d] is typically not equivalent to [abcd]; it might be equivalent to [aBbCcDd], for example. To obtain the traditional interpretation of bracket expressions, you can use the C locale by setting the LC_ALL environment variable to the value C.
Finally, certain named classes of characters are predefined within bracket expressions, as follows. Their names are self-explanatory, and they are [:alnum:], [:alpha:], [:cntrl:], [:digit:], [:graph:], [:lower:], [:print:], [:punct:], [:space:], [:upper:], and [:xdigit:]. For example, [[:alnum:]] means the character class of numbers and letters in the current locale. In the C locale and ASCII character set encoding, this is the same as [0-9A-Za-z]. (Note that the brackets in these class names are part of the symbolic names, and must be included in addition to the brackets delimiting the bracket expression.) Most meta-characters lose their special meaning inside bracket expressions. To include a literal ], place it first in the list. Similarly, to include a literal ^, place it anywhere but first. Finally, to include a literal -, place it last.
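The bracket-expression rules can be checked on a few invented lines. The sketch below pins the locale to C, as the text suggests, and uses made-up sample strings and variable names.

```shell
# Hypothetical input lines.
input=$(printf '%s\n' 'abc123' '---' 'x_y' '2024-01-01')

# A named class inside a bracket expression: with -x the whole line must
# consist of letters and digits only.
alnum_only=$(printf '%s\n' "$input" | LC_ALL=C grep -x '[[:alnum:]]*')

# A literal ] placed first and a literal - placed last: this matches any
# line containing "]" or "-".
punct=$(printf '%s\n' "$input" | LC_ALL=C grep '[]-]')

printf '%s\n' "$alnum_only" "$punct"
```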
Anchoring
The caret ^ and the dollar sign $ are meta-characters that respectively match the empty string at the beginning and end of a line.
The Backslash Character and Special Expressions
The symbols \< and \> respectively match the empty string at the beginning and end of a word. The symbol \b matches the empty string at the edge of a word, and \B matches the empty string provided it's not at the edge of a word. The symbol \w is a synonym for [_[:alnum:]] and \W is a synonym for [^_[:alnum:]].
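A brief sketch of the word-edge operators on invented input, assuming GNU grep (these backslash expressions are extensions):

```shell
# Hypothetical input: "foo" at the start of a word, as a word, and at the end.
input=$(printf '%s\n' 'foobar' 'foo bar' 'barfoo')

# \< anchors the match to the beginning of a word.
starts=$(printf '%s\n' "$input" | grep '\<foo')

# \b on both sides requires a word edge before and after "foo".
whole=$(printf '%s\n' "$input" | grep '\bfoo\b')

printf '%s\n' "$starts" "$whole"
```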
Repetition
A regular expression may be followed by one of several repetition operators:
?
|
The preceding item is optional and matched at most once.
|
*
|
The preceding item will be matched zero or more times.
|
+
|
The preceding item will be matched one or more times.
|
{n}
|
The preceding item is matched exactly n times.
|
{n,}
|
The preceding item is matched n or more times.
|
{n,m}
|
The preceding item is matched at least n times, but not more than m times.
|
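The repetition operators can be compared on one small input. This is a sketch using extended syntax (-E); the four sample strings are assumptions.

```shell
# Hypothetical input: "a", then zero to three "b", then "c".
input=$(printf '%s\n' 'ac' 'abc' 'abbc' 'abbbc')

# ?  : the "b" is optional (zero or one).
optional=$(printf '%s\n' "$input" | grep -E '^ab?c$')

# +  : one or more "b".
one_plus=$(printf '%s\n' "$input" | grep -E '^ab+c$')

# {2,3} : at least two and at most three "b".
interval=$(printf '%s\n' "$input" | grep -E '^ab{2,3}c$')

printf '%s\n' "$optional" "$one_plus" "$interval"
```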
Concatenation
Two regular expressions may be concatenated; the resulting regular expression matches any string formed by concatenating two substrings that respectively match the concatenated expressions.
Alternation
Two regular expressions may be joined by the infix operator |; the resulting regular expression matches any string matching either alternate expression.
Precedence
Repetition takes precedence over concatenation, which in turn takes precedence over alternation. A whole expression may be enclosed in parentheses to override these precedence rules and form a subexpression.
Back References and Subexpressions
The back-reference \n, where n is a single digit, matches the substring previously matched by the nth parenthesized subexpression of the regular expression.
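As an illustration, a back-reference can select lines whose second half repeats the first. The sketch assumes GNU grep, which accepts back-references with -E, and uses made-up four-character strings.

```shell
# Hypothetical input: some lines repeat their first two characters.
input=$(printf '%s\n' 'abab' 'abba' 'xyxy' 'abcd')

# (..) captures two characters; \1 must match the same two characters again.
repeated=$(printf '%s\n' "$input" | grep -E '^(..)\1$')

printf '%s\n' "$repeated"
```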
Basic vs. Extended Regular Expressions
In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
Traditional versions of egrep did not support the { meta-character, and some egrep implementations support \{ instead, so portable scripts should avoid { in grep -E patterns and should use [{] to match a literal {.
GNU grep -E attempts to support traditional usage by assuming that { is not special if it would be the start of an invalid interval specification. For example, the command grep -E '{1' searches for the two-character string {1 instead of reporting a syntax error in the regular expression. POSIX allows this behavior as an extension, but portable scripts should avoid it.
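The difference between the two syntaxes is clearest with the alternation operator. The sketch below assumes GNU grep, where \| is available in basic syntax, and uses three invented lines.

```shell
# Hypothetical input, including a line that literally contains "a|b".
input=$(printf '%s\n' 'cat' 'dog' 'a|b')

# Basic syntax: an unescaped | is an ordinary character.
bre_literal=$(printf '%s\n' "$input" | grep 'a|b')

# Basic syntax: the backslashed form \| is the alternation operator.
bre_escaped=$(printf '%s\n' "$input" | grep 'cat\|dog')

# Extended syntax: | is the alternation operator without a backslash.
ere=$(printf '%s\n' "$input" | grep -E 'cat|dog')

printf '%s\n' "$bre_literal" "$bre_escaped" "$ere"
```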
Environment Variables
The behavior of grep is affected by the following environment variables.
The locale for category LC_foo is specified by examining the three environment variables LC_ALL, LC_foo, and LANG, in that order. The first of these variables that is set specifies the locale. For example, if LC_ALL is not set, but LC_MESSAGES is set to pt_BR, then the Brazilian Portuguese locale is used for the LC_MESSAGES category. The C locale is used if none of these environment variables are set, if the locale catalog is not installed, or if grep was not compiled with national language support (NLS).
Other variables of note:
GREP_OPTIONS
|
This variable specifies default options to be placed in front of any explicit options. For example, if GREP_OPTIONS is '--binary-files=without-match --directories=skip', grep behaves as if the two options --binary-files=without-match and --directories=skip had been specified before any explicit options. Option specifications are separated by whitespace. A backslash escapes the next character, so it can be used to specify an option containing whitespace or a backslash.
|
GREP_COLOR
|
This variable specifies the color used to highlight matched (non-empty) text. It is deprecated in favor of GREP_COLORS, but still supported. The mt, ms, and mc capabilities of GREP_COLORS have priority over it. It can only specify the color used to highlight the matching non-empty text in any matching line (a selected line when the -v command-line option is omitted, or a context line when -v is specified). The default is 01;31, which means a bold red foreground text on the terminal's default background.
|
GREP_COLORS
|
Specifies the colors and other attributes used to highlight various parts of the output. Its value is a colon-separated list of capabilities that defaults to ms=01;31:mc=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36 with the rv and ne boolean capabilities omitted (i.e., false). Supported capabilities are as follows:
sl=
|
SGR substring for whole selected lines (i.e., matching lines when the -v command-line option is omitted, or non-matching lines when -v is specified). However, if the boolean rv capability and the -v command-line option are both specified, it applies to context matching lines instead. The default is empty (i.e., the terminal's default color pair).
|
cx=
|
SGR substring for whole context lines (i.e., non-matching lines when the -v command-line option is omitted, or matching lines when -v is specified). However, if the boolean rv capability and the -v command-line option are both specified, it applies to selected non-matching lines instead. The default is empty (i.e., the terminal's default color pair).
|
rv
|
Boolean value that reverses (swaps) the meanings of the sl= and cx= capabilities when the -v command-line option is specified. The default is false (i.e., the capability is omitted).
|
mt=01;31
|
SGR substring for matching non-empty text in any matching line (i.e., a selected line when the -v command-line option is omitted, or a context line when -v is specified). Setting this is equivalent to setting both ms= and mc= at once to the same value. The default is a bold red text foreground over the current line background.
|
ms=01;31
|
SGR substring for matching non-empty text in a selected line. (This is only used when the -v command-line option is omitted.) The effect of the sl= (or cx= if rv) capability remains active when this kicks in. The default is a bold red text foreground over the current line background.
|
mc=01;31
|
SGR substring for matching non-empty text in a context line. (This is only used when the -v command-line option is specified.) The effect of the cx= (or sl= if rv) capability remains active when this kicks in. The default is a bold red text foreground over the current line background.
|
fn=35
|
SGR substring for file names prefixing any content line. The default is a magenta text foreground over the terminal's default background.
|
ln=32
|
SGR substring for line numbers prefixing any content line. The default is a green text foreground over the terminal's default background.
|
bn=32
|
SGR substring for byte offsets prefixing any content line. The default is a green text foreground over the terminal's default background.
|
se=36
|
SGR substring for separators that are inserted between selected line fields (:), between context line fields, (-), and between groups of adjacent lines when nonzero context is specified (--). The default is a cyan text foreground over the terminal's default background.
|
ne
|
Boolean value that prevents clearing to the end of line using Erase in Line (EL) to Right (\33[K) each time a colorized item ends. This is needed on terminals on which EL is not supported. It is otherwise useful on terminals for which the back_color_erase (bce) boolean terminfo capability does not apply, when the chosen highlight colors do not affect the background, or when EL is too slow or causes too much flicker. The default is false (i.e., the capability is omitted).
|
Note that boolean capabilities have no =... part. They are omitted (i.e., false) by default and become true when specified.
See the Select Graphic Rendition (SGR) section in the documentation of the text terminal that is used for permitted values and their meaning as character attributes. These substring values are integers in decimal representation and can be concatenated with semicolons. grep takes care of assembling the result into a complete SGR sequence (\33[...m). Common values to concatenate include 1 for bold, 4 for underline, 5 for blink, 7 for inverse, 39 for default foreground color, 30 to 37 for foreground colors, 90 to 97 for 16-color mode foreground colors, 38;5;0 to 38;5;255 for 88-color and 256-color modes foreground colors, 49 for default background color, 40 to 47 for background colors, 100 to 107 for 16-color mode background colors, and 48;5;0 to 48;5;255 for 88-color and 256-color modes background colors.
|
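A hedged sketch of overriding one GREP_COLORS capability for a single command, assuming GNU grep. The value ms=01;32 (bold green) and the sample text are assumptions chosen for the example; ne is added so that no Erase in Line sequences are emitted.

```shell
# Force color even though the output is captured rather than a terminal,
# and set the matched-text color to bold green for this one invocation.
colored=$(printf 'foo bar\n' | GREP_COLORS='ms=01;32:ne' grep --color=always 'foo')

# With --color=never the same setting is ignored and the line is plain.
plain=$(printf 'foo bar\n' | GREP_COLORS='ms=01;32:ne' grep --color=never 'foo')

# The colored result carries SGR escape sequences around "foo".
printf '%s\n' "$colored" "$plain"
```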
LC_ALL, LC_COLLATE, LANG
|
These variables specify the locale for the LC_COLLATE category, which determines the collating sequence used to interpret range expressions like [a-z].
|
LC_ALL, LC_CTYPE, LANG
|
These variables specify the locale for the LC_CTYPE category, which determines the type of characters, e.g., which characters are whitespace.
|
LC_ALL, LC_MESSAGES, LANG
|
These variables specify the locale for the LC_MESSAGES category, which determines the language that grep uses for messages. The default C locale uses American English messages.
|
POSIXLY_CORRECT
|
If set, grep behaves as POSIX requires; otherwise, grep behaves more like other GNU programs. POSIX requires that options that follow file names must be treated as file names; by default, such options are permuted to the front of the operand list and are treated as options. Also, POSIX requires that unrecognized options be diagnosed as "illegal", but since they are not really against the law the default is to diagnose them as "invalid". POSIXLY_CORRECT also disables _N_GNU_nonoption_argv_flags_, described below.
|
_N_GNU_nonoption_argv_flags_
|
(Here N is grep's numeric process ID.) If the ith character of this environment variable's value is 1, do not consider the ith operand of grep to be an option, even if it appears to be one. A shell can put this variable in the environment for each command it runs, specifying which operands are the results of file name wildcard expansion and therefore should not be treated as options. This behavior is available only with the GNU C library, and only when POSIXLY_CORRECT is not set.
|
Exit Status
The exit status is 0 if selected lines are found, and 1 if not found. If an error occurred the exit status is 2.
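The three exit statuses can be observed directly from a shell. In this sketch the scratch directory and the missing file name are assumptions, and -q and -s are used only to keep the output quiet.

```shell
# A path that is known not to exist, inside a fresh scratch directory.
tmpdir=$(mktemp -d)
missing_file="$tmpdir/no-such-file"

# 0: a selected line was found.
found=0;     printf 'a\n' | grep -q 'a' || found=$?

# 1: no line was selected.
not_found=0; printf 'a\n' | grep -q 'b' || not_found=$?

# 2: an error occurred (the file cannot be opened; -s hides the message).
failed=0;    grep -s 'a' "$missing_file" || failed=$?

rm -rf "$tmpdir"
printf 'found=%s not_found=%s failed=%s\n' "$found" "$not_found" "$failed"
```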