How to find a text file which contains a specific word inside (not in its name)

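Question

I want to find a text file on my hard disk that contains a specific word.

Prior to Ubuntu 12.04 I used to start an application from the Dash, I think it was called "Search for files…", whose icon was a magnifying glass. I can't find that simple application any more.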

Best answer

You can use the grep command in a terminal:

 grep -r word *

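This command finds every occurrence of "word" in all files under the current directory and its subdirectories. Because the shell expands * before grep runs, hidden files and directories at the top level are skipped; use grep -r word . to include them.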
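If only the names of the matching files are wanted, which is what the question asks for, grep can print each name once instead of every matching line. The following is a minimal sketch, assuming GNU grep; the function name files_containing is made up for illustration and is not part of the original answer.

```shell
# Hypothetical helper (not from the original answer):
# list each file under directory $2 (default: current directory) that contains $1.
#   -r  recurse into subdirectories
#   -I  skip binary files
#   -l  print only the names of matching files, once per file
#   -F  treat the pattern as a fixed string, not a regular expression
files_containing() {
    grep -rIlF -- "$1" "${2:-.}"
}
```

For example, files_containing word ~/Documents would list the text files under ~/Documents that contain "word".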

Second answer

Install gnome-search-tool.

sudo apt-get install gnome-search-tool

Open Search for files and choose Select More Options.

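Third answer

The question is quite old… anyway… currently (2016) there is a GNOME app called tracker (you can find it in the Ubuntu repositories) that you can install to search for text inside files (tried with odt, ods, odp and pdf). The package comes with 4 other packages to be installed (tracker-extract, tracker-gui, tracker-miner-fs, tracker-utils). Namastè :)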

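Fourth answer

Yes, I know you were looking for a GUI application and this is an old post, but maybe this helps someone. I found the ack-grep utility. First install it with sudo apt-get install ack-grep, then run ack-grep what_you_looking_for in the directory you want to search. This shows every file that contains your text, along with a preview of the matching lines. That was very important to me.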

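Fifth answer

Here is an overview of several different methods for searching files for a specific text string, with a few options added specifically so that only text files are searched and binary/application files are ignored.

One should note, however, that searching for a word can get a bit complex, because most line-matching tools will try to find the word anywhere on the line. If by "word" we mean a string that can appear at the beginning or end of a line, alone on a line, or surrounded by spaces and/or punctuation, that is when we need regular expressions, especially the ones that come from Perl. Here, for example, we can use -P in grep to get Perl regular expressions and surround the word with \b word-boundary anchors.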

$ printf "A-well-a don't you know about the bird?\nWell, everybody knows that the bird is a word" | grep -noP '\bbird\b'
1:bird
2:bird
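When Perl regular expressions are not available, grep's -w option gives a similar whole-word match. The sketch below uses made-up sample text (the variable name sample and its contents are illustrative only) to contrast a plain substring search with a whole-word search.

```shell
# Made-up sample text: only the first line contains "bird" as a separate word.
sample='the bird is a word
birdsong at dawn
blackbird singing'

# Substring search: matches all three lines.
printf '%s\n' "$sample" | grep -n 'bird'

# Whole-word search (-w): matches only the first line.
printf '%s\n' "$sample" | grep -nw 'bird'
```

With -w, "bird" inside "birdsong" or "blackbird" no longer counts as a match, because the neighbouring characters are letters.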

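Simple grep

$ grep -rIH 'word' .

-r for recursive search down from the current directory
-I to ignore binary files
-H to output the name of the file where the match is found

Suitable for searching only.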
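The same search can be narrowed to files with a particular name pattern, which is another way to keep it to text files. This is a sketch that assumes GNU grep, whose --include option takes a file-name glob; the function name grep_in_named_files is hypothetical.

```shell
# Hypothetical helper: search for $1 only in files whose names match the glob $2,
# starting from directory $3 (default: current directory).
# --include is a GNU grep option; the glob must be quoted by the caller
# so that the shell does not expand it before grep sees it.
grep_in_named_files() {
    grep -rIH --include="$2" -- "$1" "${3:-.}"
}
```

For example, grep_in_named_files 'word' '*.txt' prints matching lines, prefixed with the file name, from .txt files only.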

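find + grep

$ find . -type f -exec grep -IH 'word' {} \;

find does the recursive search part
-I option is to ignore binary files
-H outputs the name of the file where the line is found

A good approach for combining with other commands inside a subshell, such as:

$ find . -type f -exec sh -c 'grep -IHq "word" "$1" && echo "Found in $1"' sh {} \;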
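The commands above start one grep process per file. As a sketch of an alternative, find can hand many file names to a single grep by ending -exec with + instead of \;, and grep -l then prints each matching file name once. The function name find_files_containing is made up for illustration, and -I assumes GNU grep.

```shell
# Hypothetical helper: print the name of every regular file under $2
# (default: current directory) that contains the fixed string $1.
# "{} +" makes find pass many file names to each grep invocation
# instead of starting one grep per file as "{} \;" does.
find_files_containing() {
    find "${2:-.}" -type f -exec grep -IlF -- "$1" {} +
}
```

On large directory trees this usually runs noticeably faster, since far fewer processes are started.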

Perl

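#!/usr/bin/env perl
use strict;
use warnings;
use File::Find;

# Called by find() for every entry; with no_chdir, $_ holds the full path.
sub find_word {
    return unless -f;    # skip directories and other non-regular files
    # three-argument open, so odd file names are never treated as a mode
    open(my $fh, '<', $File::Find::name) or return;    # skip unreadable files
    while (my $line = <$fh>) {
        if ($line =~ /\bword\b/) {
            # print the file name once, at the first matching line
            printf "%s\n", $File::Find::name;
            last;
        }
    }
    close($fh);
}

# this assumes we're going down from current working directory
find({ wanted => \&find_word, no_chdir => 1 }, ".");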

Poor man's recursive grep in a bash script

This is the "bash way". If you have grep or perl installed, there is probably no good reason to use it.

#!/usr/bin/env bash
shopt -s globstar
#set -x
grep_line(){
    # note that this is simple pattern matching 
    # If we wanted to search for whole words, we could use
    # word|word\ |\ word|\ word\ )
    # although when we consider punctuation characters as well - it gets more
    # complex
    case "$1" in
        *word*) printf "%s\n" "$2";;
    esac
}
readlines(){
    #  line count variable can be used to output on which line match occurred

    #line_count=1
    # "|| [ -n "$line" ]" also handles a last line that has no trailing newline
    while IFS= read -r line || [ -n "$line" ];
    do
        grep_line "$line" "$1"
        #line_count=$(($line_count+1))
    done < "$1"
}

is_text_file(){
    # alternatively, mimetype command could be used
    # with *\ text\/* as pattern in case statement
    case "$(file -b --mime-type "$1")" in
        text\/*) return 0;;
        *) return 1;;
    esac
}

main(){
    for filename in ./**/*
    do
        if [ -f "$filename" ] && is_text_file "$filename"
        then
            readlines "$filename"
        fi
    done
}
main "$@"
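The script above prints the file name once for every matching line. If one line per file is enough, the line-by-line loop can stop at the first match. The following is a sketch in the same spirit, using only shell built-ins; the function name file_contains is hypothetical and the match is a plain substring test, as in grep_line.

```shell
# Hypothetical helper: succeed as soon as any line of file $2 contains substring $1.
file_contains() {
    # "|| [ -n "$line" ]" keeps a final line that lacks a trailing newline
    while IFS= read -r line || [ -n "$line" ]; do
        case "$line" in
            *"$1"*) return 0 ;;   # first match found: stop reading
        esac
    done < "$2"
    return 1                      # reached end of file without a match
}
```

Inside main, the call to readlines could then be replaced by something like file_contains word "$filename" && printf '%s\n' "$filename".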

References

How to find a text file which contains a specific word inside (not in its name)