<?xml version="1.0" encoding="UTF-8"?><rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>TextSearch &amp;mdash; Paul Sutton</title>
    <link>https://personaljournal.ca/paulsutton/tag:TextSearch</link>
    <description>Personal Blog</description>
    <pubDate>Tue, 05 May 2026 16:29:10 +0000</pubDate>
    <item>
      <title>Bash scripting 12 – Files and Grep</title>
      <link>https://personaljournal.ca/paulsutton/bash-scripting-12-files-and-grep</link>
      <description>&lt;![CDATA[Bash scripting 12 – Files and Grep&#xA;&#xA;Rather than make a video for this, I decided to write a blog post so that I could include components you can download, or at least copy and paste.&#xA;&#xA;The name grep comes from the ed editor command g/re/p: globally search for a regular expression and print the matching lines. Among other things, it can read a file and report on its contents or, as in this post, find a specific string of text.&#xA;&#xA;Lorem Ipsum is the printing industry&#39;s standard dummy text, used to fill space on a page. I have pasted an example below, which happens to explain further.&#xA;&#xA;Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry&#39;s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.&#xA;&#xA;If you copy and paste the above and save it in a text file called lorem.txt, we can do some neat stuff with grep and a few other commands. I am not an expert at this, so this is some of what I picked up while researching this post.&#xA;&#xA;Firstly, we can find a specific word (or string) in the text with:&#xA;&#xA;cat lorem.txt | grep the&#xA;&#xA;We can also do:&#xA;&#xA;grep the lorem.txt&#xA;&#xA;Both search the file lorem.txt and print every line that contains the string &#39;the&#39;. Because the sample is a single long line, the whole text is printed. If colour output is enabled (for example with grep --color=auto), each match is highlighted.&#xA;&#xA;This is great, so what else can we do?&#xA;&#xA;In a short file a word may appear fewer than 5 or 10 times, so we could just count manually. As discussed in a previous video, the command wc (word count) counts the lines, words and bytes in its input; with -l it reports only the number of lines.&#xA;&#xA;So I found the following:&#xA;&#xA;cat lorem.txt | grep -o the | wc -l&#xA;&#xA;This gives the output 6, which is how many times the string &#39;the&#39; appears in the text. The -o option makes grep print each match on its own line, and wc -l counts those lines. Matches inside longer words are counted too; add -w to match whole words only.&#xA;&#xA;As with other commands, each has a man page, so&#xA;&#xA;man grep&#xA;and&#xA;man wc&#xA;should provide useful information. You can also search for information with DuckDuckGo, and there are numerous tutorials online.&#xA;&#xA;Hope this is useful.&#xA;&#xA;Chat&#xA;&#xA;I am on the Devon and Cornwall Linux User Group mailing list and also their Matrix channel as zleap. It is better to ask there, as that way others can answer too.&#xA;&#xA;Tags&#xA;&#xA;#Bash,#Bashscripting,#Files,#TextSearch,#StringSearch,#Grep,#wc,#WordCount&#xA;&#xA;AI statement: Consent is NOT granted to use the content of this blog for the purposes of AI training or similar activity. Consent CANNOT be assumed, it has to be granted.&#xA;]]&gt;</description>
      <content:encoded><![CDATA[<p>Bash scripting 12 – Files and Grep</p>

<p>Rather than make a video for this, I decided to write a blog post so that I could include components you can download, or at least copy and paste.</p>

<p>The name grep comes from the ed editor command g/re/p: globally search for a regular expression and print the matching lines. Among other things, it can read a file and report on its contents or, as in this post, find a specific string of text.</p>

<p>Lorem Ipsum is the printing industry&#39;s standard dummy text, used to fill space on a page. I have pasted an example below, which happens to explain further.</p>

<pre><code>Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry&#39;s standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.
</code></pre>

<p>If you copy and paste the above and save it in a text file called lorem.txt, we can do some neat stuff with grep and a few other commands. I am not an expert at this, so this is some of what I picked up while researching this post.</p>

<p>Firstly, we can find a specific word (or string) in the text with:</p>

<pre><code>cat lorem.txt | grep the
</code></pre>

<p>We can also do:</p>

<pre><code>grep the lorem.txt
</code></pre>

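<p>Both search the file lorem.txt and print every line that contains the string &#39;the&#39;. Because the sample is a single long line, the whole text is printed. If colour output is enabled (for example with grep --color=auto), each match is highlighted.</p>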
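
<p>To see that grep prints only the matching lines, here is a small sketch using a made-up three-line file called sample.txt (my own example, not part of lorem.txt). It assumes mktemp is available to create a scratch directory. Both forms of the command should give the same result.</p>

```shell
# Work in a throwaway directory so no existing files are touched.
cd "$(mktemp -d)"

# sample.txt is a hypothetical three-line file used only for this illustration.
printf '%s\n' 'over the hill' 'no match here' 'then we rest' > sample.txt

# Both forms read the same file, so they print the same matching lines.
piped=$(cat sample.txt | grep the)
direct=$(grep the sample.txt)

# Show the lines grep selected.
echo "$direct"
```

<p>Only the first and third lines are printed. The word &#39;then&#39; matches as well, because grep looks for the string rather than the whole word.</p>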

<p>This is great, so what else can we do?</p>

<p>In a short file a word may appear fewer than 5 or 10 times, so we could just count manually. As discussed in a previous video, the command <strong>wc</strong> (word count) counts the lines, words and bytes in its input; with -l it reports only the number of lines.</p>
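
<p>As a rough illustration of what wc counts, the sketch below uses a made-up two-line file called words.txt (an assumption for this example only) and asks wc for lines, words and bytes in turn.</p>

```shell
# Work in a throwaway directory so no existing files are touched.
cd "$(mktemp -d)"

# words.txt is a hypothetical two-line file used only for this illustration.
printf '%s\n' 'one two three' 'four five' > words.txt

lines=$(cat words.txt | wc -l)   # 2 lines
words=$(cat words.txt | wc -w)   # 5 words
bytes=$(cat words.txt | wc -c)   # 24 bytes, newlines included

printf 'lines=%d words=%d bytes=%d\n' "$lines" "$words" "$bytes"
```

<p>The -l option is the one that matters for the rest of this post, since it counts lines rather than words.</p>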

<p>So I found the following:</p>

<pre><code>cat lorem.txt | grep -o the | wc -l
</code></pre>

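<p>This gives the output <em>6</em>, which is how many times the string &#39;the&#39; appears in the text. The -o option makes grep print each match on its own line, and wc -l counts those lines. Matches inside longer words are counted too; add -w to match whole words only.</p>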
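
<p>The options change what gets counted, so here is a small sketch comparing them. It uses a made-up one-line file called sample.txt (not the lorem.txt text) and assumes GNU grep; BSD grep should behave the same for these options.</p>

```shell
# Work in a throwaway directory so no existing files are touched.
cd "$(mktemp -d)"

# Hypothetical sample line; "The" and "other" are included on purpose.
printf '%s\n' 'The cat and the other cat saw the dog' > sample.txt

# -o prints each match on its own line, so wc -l counts matches.
# This also counts the "the" inside "other": 3 matches.
substring_count=$(grep -o the sample.txt | wc -l)

# -w matches whole words only, so "other" is skipped: 2 matches.
word_count=$(grep -ow the sample.txt | wc -l)

# -i ignores case, so the leading "The" is counted too: 3 matches.
any_case_count=$(grep -oiw the sample.txt | wc -l)

# -c counts matching lines, not matches; this file has one line.
line_count=$(grep -c the sample.txt)

printf '%d %d %d %d\n' "$substring_count" "$word_count" "$any_case_count" "$line_count"
```

<p>This is why grep -c on its own is not enough for lorem.txt: the whole text is one line, so it would report 1 however many times the word appears.</p>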

<p>As with other commands, each has a man page, so</p>

<pre><code>man grep
</code></pre>

<p>and</p>

<pre><code>man wc
</code></pre>

<p>should provide useful information. You can also search for information with <a href="https://www.duckduckgo.com" rel="nofollow">DuckDuckGo</a>, and there are numerous tutorials online.</p>

<p>Hope this is useful.</p>

<p><strong>Chat</strong></p>

<p>I am on the Devon and Cornwall Linux User Group mailing list and also their <a href="https://matrix.to/#/%23dcglug:matrix.org" rel="nofollow">Matrix channel</a> as zleap. It is better to ask there, as that way others can answer too.</p>

<p><strong>Tags</strong></p>

<p><a href="/paulsutton/tag:Bash" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">Bash</span></a>,<a href="/paulsutton/tag:Bashscripting" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">Bashscripting</span></a>,<a href="/paulsutton/tag:Files" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">Files</span></a>,<a href="/paulsutton/tag:TextSearch" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">TextSearch</span></a>,<a href="/paulsutton/tag:StringSearch" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">StringSearch</span></a>,<a href="/paulsutton/tag:Grep" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">Grep</span></a>,<a href="/paulsutton/tag:wc" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">wc</span></a>,<a href="/paulsutton/tag:WordCount" class="hashtag" rel="nofollow"><span>#</span><span class="p-category">WordCount</span></a></p>

<hr>

<p><table>
<thead>
<tr><td><a href="https://qoto.org/@zleap" rel="nofollow">Mastodon</a></td>
<td><a href="https://wiki.ircnow.org/?n=Shelllabs.Intro" rel="nofollow">ShellLabs</a></td>
<td><a href="https://joinmastodon.org/" rel="nofollow">Join Mastodon</a></td></tr></thead></table>

AI statement : <b> Consent is NOT granted to use the content of this blog for the purposes of AI training or similar activity.  Consent CANNOT be assumed, it has to be granted. </b>
</p>

<p><a href="https://liberapay.com/PaulSutton/donate" rel="nofollow"><img alt="Donate using Liberapay" src="https://liberapay.com/assets/widgets/donate.svg"></a></p>
]]></content:encoded>
      <guid>https://personaljournal.ca/paulsutton/bash-scripting-12-files-and-grep</guid>
      <pubDate>Fri, 11 Apr 2025 14:51:20 +0000</pubDate>
    </item>
  </channel>
</rss>