Let's say you have two text files, recipe.txt and shopping-list.txt.
recipe.txt contains these lines:
All-Purpose Flour
Baking Soda
Bread
Brown Sugar
Chocolate Chips
Eggs
Milk
Salt
Vanilla Extract
White Sugar
And shopping-list.txt contains these lines:
All-Purpose Flour
Bread
Brown Sugar
Chicken Salad
Chocolate Chips
Eggs
Milk
Onions
Pickles
Potato Chips
Soda Pop
Tomatoes
White Sugar
As you can see, the two files are different, but many of the lines are the same. Not all of the recipe ingredients are on the shopping list, and not everything on the shopping list is part of the recipe.
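To follow along, the two files can be recreated as below. This is only a sketch: the scratch directory from mktemp is our own choice, not part of the original example. It also checks that both files are sorted, since comm expects sorted input, which these lists already satisfy.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir"

# One item per line, already in sorted order.
printf '%s\n' 'All-Purpose Flour' 'Baking Soda' 'Bread' 'Brown Sugar' \
    'Chocolate Chips' 'Eggs' 'Milk' 'Salt' 'Vanilla Extract' \
    'White Sugar' > recipe.txt

printf '%s\n' 'All-Purpose Flour' 'Bread' 'Brown Sugar' 'Chicken Salad' \
    'Chocolate Chips' 'Eggs' 'Milk' 'Onions' 'Pickles' 'Potato Chips' \
    'Soda Pop' 'Tomatoes' 'White Sugar' > shopping-list.txt

# sort -c exits with status 0 only if the file is already sorted.
# LC_ALL=C pins the comparison to plain byte order.
LC_ALL=C sort -c recipe.txt && LC_ALL=C sort -c shopping-list.txt \
    && echo "both files are sorted"
```

If a file of your own is not sorted, running it through sort first (for example, sort unsorted.txt > sorted.txt) should make it usable with comm.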
If we run the comm command on the two files, it will read both files and give us three columns of output:
comm recipe.txt shopping-list.txt
		All-Purpose Flour
Baking Soda
		Bread
		Brown Sugar
	Chicken Salad
		Chocolate Chips
		Eggs
		Milk
	Onions
	Pickles
	Potato Chips
Salt
	Soda Pop
	Tomatoes
Vanilla Extract
		White Sugar
Here, each line of output has either zero, one, or two tabs at the beginning, separating the output into three columns:
- The first column (zero tabs) is lines that only appear in the first file.
- The second column (one tab) is lines that only appear in the second file.
- The third column (two tabs) is lines that appear in both files.
(The columns overlap visually because in this case, our terminal prints a tab as eight spaces. It might look different on your screen.)
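Because the leading tabs are invisible, it can help to swap them for a visible character. The sketch below is one way to do that, using tr to turn every tab into a > sign. The scratch directory and the choice of > as a marker are our own assumptions for illustration.

```shell
# Recreate the two example files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' 'All-Purpose Flour' 'Baking Soda' 'Bread' 'Brown Sugar' \
    'Chocolate Chips' 'Eggs' 'Milk' 'Salt' 'Vanilla Extract' \
    'White Sugar' > recipe.txt
printf '%s\n' 'All-Purpose Flour' 'Bread' 'Brown Sugar' 'Chicken Salad' \
    'Chocolate Chips' 'Eggs' 'Milk' 'Onions' 'Pickles' 'Potato Chips' \
    'Soda Pop' 'Tomatoes' 'White Sugar' > shopping-list.txt

# Replace each tab with '>' so the column of every line is easy to see:
#   no '>'  = only in recipe.txt
#   one '>' = only in shopping-list.txt
#   two '>' = in both files
marked=$(LC_ALL=C comm recipe.txt shopping-list.txt | tr '\t' '>')
printf '%s\n' "$marked"
```

With the example files, the first line printed should be >>All-Purpose Flour (both files), followed by Baking Soda (recipe only).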
Next, let's look at how we can bring our separated data into a spreadsheet.
Creating a CSV file for spreadsheets
One useful way to use comm is to output to a CSV file, which can then be read by a spreadsheet program. CSV files are just text files that use a certain character, usually a comma, tab, or semicolon, to delimit data in a way that can be read as a spreadsheet. Because comm separates its columns with tabs, its output already fits this format. By convention, CSV file names have the extension .csv.
For instance, let's run the same command, but this time let's redirect the output to a file called output.csv by using the > operator:
comm recipe.txt shopping-list.txt > output.csv
This time there is no output on the screen. Instead, output is sent to a file called output.csv. To check that it worked correctly, we can cat the contents of output.csv:
cat output.csv
		All-Purpose Flour
Baking Soda
		Bread
		Brown Sugar
	Chicken Salad
		Chocolate Chips
		Eggs
		Milk
	Onions
	Pickles
	Potato Chips
Salt
	Soda Pop
	Tomatoes
Vanilla Extract
		White Sugar
To bring this data into a spreadsheet, we can open it in LibreOffice Calc.
Before it opens the file, LibreOffice asks us how to interpret the file data.
We want the column delimiter to be tab characters, which is already checked by default. (There are no commas or semicolons in our data, so we don't have to worry about the other checkboxes.) It also gives us a preview of how the data will look, given the options we selected.
Everything looks good, so we can click OK, and LibreOffice will import our data into a spreadsheet.
Now if we wanted to, we could save the spreadsheet in another format such as a Microsoft Excel file, or an XML file, or even HTML.
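If a program expects commas rather than tabs, the comm output can be converted first. The following is a minimal sketch, assuming none of the items contain a comma (true for our lists). The file name output-commas.csv and the header names are hypothetical, chosen only for this illustration.

```shell
# Recreate the two example files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' 'All-Purpose Flour' 'Baking Soda' 'Bread' 'Brown Sugar' \
    'Chocolate Chips' 'Eggs' 'Milk' 'Salt' 'Vanilla Extract' \
    'White Sugar' > recipe.txt
printf '%s\n' 'All-Purpose Flour' 'Bread' 'Brown Sugar' 'Chicken Salad' \
    'Chocolate Chips' 'Eggs' 'Milk' 'Onions' 'Pickles' 'Potato Chips' \
    'Soda Pop' 'Tomatoes' 'White Sugar' > shopping-list.txt

# Split each line on tabs and rejoin it with commas.
# Assigning to $3 pads short lines, so every row ends up with three fields.
LC_ALL=C comm recipe.txt shopping-list.txt \
    | awk -F'\t' -v OFS=',' '
        BEGIN { print "recipe_only,shopping_only,both" }  # hypothetical header
        { $3 = $3; print }
      ' > output-commas.csv

cat output-commas.csv
```

Padding every row to three fields keeps the file rectangular, which stricter CSV readers tend to prefer. Data that can contain commas would need proper quoting, which this sketch does not attempt.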
Suppressing columns
If you only want to output specific columns, you can specify the column numbers to suppress in the command, preceded by a dash. For instance, this command will suppress columns 1 and 2, displaying only column 3 — lines shared by both files. This isolates the items on the shopping list that are also part of the recipe:
comm -12 recipe.txt shopping-list.txt
All-Purpose Flour
Bread
Brown Sugar
Chocolate Chips
Eggs
Milk
White Sugar
The next command will suppress columns 2 and 3, displaying only column 1: lines in the recipe that are not in the shopping list. These are the ingredients we are not planning to buy, presumably because we already have them in our cupboard:
comm -23 recipe.txt shopping-list.txt
Baking Soda
Salt
Vanilla Extract
And the next command will suppress column 3, displaying only columns 1 and 2 — the items in the recipe that are not on the shopping list, and the items on the shopping list that are not in the recipe, each in their own column.
comm -3 recipe.txt shopping-list.txt
Baking Soda
	Chicken Salad
	Onions
	Pickles
	Potato Chips
Salt
	Soda Pop
	Tomatoes
Vanilla Extract
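The same pattern covers the one combination not shown above: suppressing columns 1 and 3 leaves only column 2, the shopping list items that are not part of the recipe. The sketch below runs that command and then counts the lines in each category with wc. The variable names and the scratch directory are our own, used only for illustration.

```shell
# Recreate the two example files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' 'All-Purpose Flour' 'Baking Soda' 'Bread' 'Brown Sugar' \
    'Chocolate Chips' 'Eggs' 'Milk' 'Salt' 'Vanilla Extract' \
    'White Sugar' > recipe.txt
printf '%s\n' 'All-Purpose Flour' 'Bread' 'Brown Sugar' 'Chicken Salad' \
    'Chocolate Chips' 'Eggs' 'Milk' 'Onions' 'Pickles' 'Potato Chips' \
    'Soda Pop' 'Tomatoes' 'White Sugar' > shopping-list.txt

# Column 2 only: items on the shopping list that the recipe does not need.
only_shopping=$(LC_ALL=C comm -13 recipe.txt shopping-list.txt)
printf '%s\n' "$only_shopping"

# Count the lines in each category; tr strips the padding some wc versions add.
shared_count=$(LC_ALL=C comm -12 recipe.txt shopping-list.txt | wc -l | tr -d ' ')
recipe_only_count=$(LC_ALL=C comm -23 recipe.txt shopping-list.txt | wc -l | tr -d ' ')
shopping_only_count=$(LC_ALL=C comm -13 recipe.txt shopping-list.txt | wc -l | tr -d ' ')

echo "shared: $shared_count"
echo "recipe only: $recipe_only_count"
echo "shopping list only: $shopping_only_count"
```

Because only one column is left in each case, the output has no leading tabs and can be piped straight into other tools such as wc.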