cut OPTION... [FILE]...
Options
-b, --bytes=LIST
    Select only the bytes from each line as specified in LIST. LIST specifies a byte, a set of bytes, or a range of bytes; see Specifying LIST below.
-c, --characters=LIST
    Select only the characters from each line as specified in LIST. LIST specifies a character, a set of characters, or a range of characters; see Specifying LIST below.
-d, --delimiter=DELIM
    Use character DELIM instead of a tab for the field delimiter.
-f, --fields=LIST
    Select only these fields on each line; also print any line that contains no delimiter character, unless the -s option is specified. LIST specifies a field, a set of fields, or a range of fields; see Specifying LIST below.
-n
    This option is ignored, but is included for compatibility reasons.
--complement
    Complement the set of selected bytes, characters, or fields.
-s, --only-delimited
    Do not print lines that contain no delimiter.
--output-delimiter=STRING
    Use STRING as the output delimiter string. The default is to use the input delimiter.
--help
    Display a help message and exit.
--version
    Output version information and exit.
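As a rough illustration of how -s and --complement change what -f selects, the sketch below runs cut on a small made-up input (the variable names and sample text are not part of the original documentation). It assumes a cut that supports --complement, which is a GNU extension.

```shell
# Hypothetical sample: two colon-delimited lines and one line with no delimiter.
sample=$(printf 'a:b:c\nno delimiter here\nd:e:f\n')

# Without -s, the delimiter-free line is passed through unchanged.
with_plain=$(printf '%s\n' "$sample" | cut -d ':' -f 2)

# With -s, lines that contain no delimiter are suppressed.
only_delimited=$(printf '%s\n' "$sample" | cut -d ':' -f 2 -s)

# --complement (assumed available, GNU cut) keeps everything except field 2.
complement=$(printf '%s\n' "$sample" | cut -d ':' -f 2 -s --complement)

printf '%s\n---\n%s\n---\n%s\n' "$with_plain" "$only_delimited" "$complement"
```

The first result includes the line "no delimiter here"; the second drops it; the third keeps the first and third fields of the delimited lines.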
Usage Notes
When invoking cut, specify exactly one of the -b, -c, or -f options.
If no FILE is specified, cut reads from the standard input.
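Because cut falls back to standard input, it can be placed in a pipeline. A minimal sketch, using a made-up tab-separated line and a hypothetical variable name:

```shell
# No FILE operand is given, so cut reads the line that printf writes to the pipe.
first_field=$(printf 'one\ttwo\tthree\n' | cut -f 1)
printf '%s\n' "$first_field"
```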
Specifying LIST
Each LIST is made up of an integer, a range of integers, or multiple integer ranges separated by commas. Selected input is written in the same order that it is read, and is written to output exactly once. A range consists of:
N
    the Nth byte, character, or field, counted from 1.
N-
    from the Nth byte, character, or field, to the end of the line.
N-M
    from the Nth to the Mth byte, character, or field (inclusive).
-M
    from the first to the Mth byte, character, or field.
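The four range forms, and the rule that selected input is written in input order and only once, can be sketched as follows. The comma-separated sample line and the variable names are made up for illustration.

```shell
line='a,b,c,d,e'   # hypothetical comma-separated sample line

single=$(printf '%s\n' "$line" | cut -d ',' -f 2)      # N
from_n=$(printf '%s\n' "$line" | cut -d ',' -f 4-)     # N-
n_to_m=$(printf '%s\n' "$line" | cut -d ',' -f 2-4)    # N-M
up_to_m=$(printf '%s\n' "$line" | cut -d ',' -f -2)    # -M

# Listing fields out of order does not reorder them: output follows input order.
reordered=$(printf '%s\n' "$line" | cut -d ',' -f 3,1)

# Overlapping ranges still write each field only once.
overlap=$(printf '%s\n' "$line" | cut -d ',' -f 1-3,2-4)

printf '%s\n' "$single" "$from_n" "$n_to_m" "$up_to_m" "$reordered" "$overlap"
```

Note that -f 3,1 yields the first field followed by the third; cut cannot be used to swap columns.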
For example, let's say you have a file named data.txt which contains the following text:
one two three four five
alpha beta gamma delta epsilon
In this example, each of these words is separated by a tab character, not spaces. The tab character is the default delimiter of cut, so it will by default consider a field to be anything delimited by a tab.
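If you want to reproduce the examples, one way to create data.txt with real tab characters is to let printf expand \t. This is only a sketch: the temporary directory and the tab_count variable are assumptions added for illustration.

```shell
# Work in a throwaway directory so an existing data.txt is not overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# printf expands \t to a real tab, which is what cut splits on by default.
printf 'one\ttwo\tthree\tfour\tfive\n' > data.txt
printf 'alpha\tbeta\tgamma\tdelta\tepsilon\n' >> data.txt

# Count the tabs on the first line to confirm the separators are tabs, not spaces.
tab_count=$(head -n 1 data.txt | tr -cd '\t' | wc -c | tr -d ' ')
echo "tabs on line 1: $tab_count"
```

Typing the file by hand in an editor that converts tabs to spaces would leave each line as a single field, so every -f example would print the whole line.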
To "cut" only the third field of each line, use the command:
cut -f 3 data.txt
...which will output the following:
three
gamma
If instead you want to "cut" only the second through fourth fields of each line, use the command:
cut -f 2-4 data.txt
...which will output the following:
two three four
beta gamma delta
If you want to "cut" only the first through second and fourth through fifth fields of each line (omitting the third field), use the command:
cut -f 1-2,4-5 data.txt
...which will output the following:
one two four five
alpha beta delta epsilon
Or, let's say you want the third field and every field after it, omitting the first two fields. In this case, you could use the command:
cut -f 3- data.txt
...which will output the following:
three four five
gamma delta epsilon
Specifying a range with LIST also applies to cutting characters (-c) or bytes (-b) from a line. For example, to output only the third through twelfth characters of every line of data.txt, use the command:
cut -c 3-12 data.txt
...which will output the following:
e two thre
pha beta g
Remember that the gap between each pair of words is a single tab character, so each line of output contains ten characters: eight letters and two tabs. In other words, cut skips the first two characters of each line, outputs characters three through twelve (counting each tab as one character), and discards everything after the twelfth.
Counting bytes instead of characters will result in the same output in this case, because in an ASCII-encoded text file, each character is represented by a single byte (eight bits) of data. So the command:
cut -b 3-12 data.txt
...will, for our file data.txt, produce exactly the same output:
e two thre
pha beta g
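The equivalence holds only while every character is one byte. As a hedged illustration, the sketch below feeds cut a made-up line containing "é", which takes two bytes in UTF-8, and measures how many bytes -b lets through. Only -b is shown, because how -c treats multibyte characters can differ between cut implementations.

```shell
# \303\251 is the two-byte UTF-8 encoding of "é", written in octal so the
# example does not depend on the encoding of this script.
first_byte_len=$(printf '\303\251t\303\251\n' | cut -b 1 | wc -c | tr -d ' ')
first_two_len=$(printf '\303\251t\303\251\n' | cut -b 1-2 | wc -c | tr -d ' ')

# Each count includes the trailing newline that cut writes.
echo "$first_byte_len $first_two_len"
```

Selecting byte 1 alone yields half of the character, so the complete "é" only appears once both of its bytes are selected with -b 1-2.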
Specifying A Delimiter Other Than Tab
The tab character is the default delimiter that cut uses to determine what constitutes a field. So, if your file's fields are already delimited by tabs, you don't need to specify a different delimiter character.
However, you can specify any single character as the delimiter. For instance, the file /etc/passwd contains information about each user on the system, one user per line, with the fields of each line delimited by a colon (":"). For example, the line of /etc/passwd for the root user may look like this:
root:x:0:0:root:/root:/bin/bash
These fields contain the following information, in this order:
- Username
- Password (shown as x when the hashed password is stored in /etc/shadow)
- User ID number (UID)
- Group ID number (GID)
- Comment field (used by the finger command)
- Home directory
- Shell
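To see how the field numbers map onto that layout, the following sketch applies cut to the sample root line from above rather than to the real file. The variable names are hypothetical.

```shell
entry='root:x:0:0:root:/root:/bin/bash'   # the sample line shown above

# Fields 6 and 7 are the home directory and the shell.
home_and_shell=$(printf '%s\n' "$entry" | cut -d ':' -f 6,7)

# Fields 3 through 4 are the UID and the GID.
uid_gid=$(printf '%s\n' "$entry" | cut -d ':' -f 3-4)

printf '%s\n%s\n' "$home_and_shell" "$uid_gid"
```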
The username is the first field on the line, so to display each username on the system, use the command:
cut -f 1 -d ':' /etc/passwd
...which will output, for example:
root
daemon
bin
sys
chope
(There are many more user accounts on a typical system, including many accounts specific to system services, but for this example we will pretend there are only five users.)
The third field of each line in the /etc/passwd file is the UID (user ID number), so to display each username and user ID number, use the command:
cut -f 1,3 -d ':' /etc/passwd
...which will output the following, for example:
root:0
daemon:1
bin:2
sys:3
chope:1000
As you can see, the output is delimited, by default, with the same delimiter character specified for the input. In this case, that's the colon character (":"). However, you can choose a different delimiter for the output with the --output-delimiter option. So, if you wanted to run the previous command, but have the output delimited by a space, you could use the command:
cut -f 1,3 -d ':' --output-delimiter=' ' /etc/passwd
...which will output the following, for example:
root 0
daemon 1
bin 2
sys 3
chope 1000
But what if you want the output to be delimited by a tab? Specifying a tab character on the command line is a bit more complicated, because the shell normally treats whitespace as a separator between arguments. To pass a tab to cut, you must quote it so the shell leaves it intact. How this is done depends on which shell you're using; in bash, the default shell on many Linux distributions, you can write the tab character as $'\t'. So the command:
cut -f 1,3 -d ':' --output-delimiter=$'\t' /etc/passwd
...will output the following, for example:
root 0
daemon 1
bin 2
sys 3
chope 1000
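In shells that do not support the $'\t' syntax, one possible alternative is to capture a tab from printf in a variable and pass that instead. The sketch below uses a made-up two-line sample in /etc/passwd format in place of the real file, and it assumes a cut that provides --output-delimiter (a GNU cut option).

```shell
# Store a literal tab in a variable; command substitution only strips newlines.
tab=$(printf '\t')

# Hypothetical two-line sample in /etc/passwd format, used instead of the real file.
sample=$(printf 'root:x:0:0:root:/root:/bin/bash\nchope:x:1000:1000::/home/chope:/bin/bash\n')

# --output-delimiter is assumed to be available (GNU cut).
result=$(printf '%s\n' "$sample" | cut -d ':' -f 1,3 --output-delimiter="$tab")
printf '%s\n' "$result"
```

Quoting "$tab" matters here: unquoted, the shell would split on the tab and cut would receive an empty delimiter argument.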