Binary Files & Encoding

One of the most frustrating surprises with grep comes when you search a directory and your terminal suddenly floods with lines like:

Binary file /var/log/wtmp matches
Binary file /usr/bin/python matches

Why This Happens

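When grep opens a file, it inspects the data it reads to decide whether the file is text or binary. If it encounters a null byte (\0), it assumes the file is binary. GNU grep also treats bytes that are not valid in the current locale's encoding as a sign of binary data.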

If grep finds your search pattern inside a binary file, it does not print the matching line. Raw binary data can contain control sequences that garble the terminal's state, requiring a reset command to fix. Instead, grep prints only the "Binary file ... matches" message. (GNU grep 3.5 and later word it as "grep: FILE: binary file matches" and send it to standard error.)
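To see this for yourself, create a file that is plain text apart from one null byte and search it. This is a small sketch that assumes GNU grep; demo.txt and the temporary directory are made-up names for illustration.

```shell
# Work in a throwaway directory (demo.txt is a hypothetical file name).
tmpdir=$(mktemp -d)

# One line of otherwise ordinary text containing a single NUL byte.
printf 'hello\0world\n' > "$tmpdir/demo.txt"

# grep finds the pattern but reports a binary match instead of the line.
# 2>&1 is needed because newer GNU grep sends the message to stderr.
result=$(grep 'hello' "$tmpdir/demo.txt" 2>&1)
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

The exact wording of the message depends on the grep version, but in either form the matching line itself is withheld.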

Solution 1: Ignore Binary Files (-I)

If you are searching a codebase or a log directory, you probably don't care about matches inside .jpg files or compiled .so libraries.

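The -I (capital i) flag tells grep to treat every binary file as if it contained no match, so binary files are silently skipped. In GNU grep it is shorthand for --binary-files=without-match.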

grep -r -I "error" /project/
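The effect of -I is easiest to see side by side. The sketch below builds a tiny directory with one text file and one file containing a null byte, then lists the matching files with and without -I. The names app.log and blob.bin are invented for the demonstration.

```shell
tmpdir=$(mktemp -d)

# A normal text file and a "binary" file (it contains a NUL byte).
# Both contain the word "error".
printf 'error: disk full\n' > "$tmpdir/app.log"
printf 'error\0in a blob\n' > "$tmpdir/blob.bin"

# -l lists every file with a match, binary or not.
all_files=$(grep -r -l 'error' "$tmpdir" | sort)

# Adding -I makes grep treat blob.bin as if it had no match.
text_only=$(grep -r -I -l 'error' "$tmpdir")

printf 'without -I:\n%s\n\nwith -I:\n%s\n' "$all_files" "$text_only"

rm -rf "$tmpdir"
```

Only app.log should appear in the second list.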

Solution 2: Force Text Extraction (-a)

Sometimes, a text file gets corrupted with a single null byte, or a log file uses a strange encoding (like UTF-16), causing grep to mistakenly classify it as binary.

If you know the file contains readable text and you want to force grep to process and print it anyway, use the -a (--text) flag.

# Force grep to read the binary systemd journal as text
# (-r is needed because /var/log/journal/ contains one subdirectory per machine ID)
grep -r -a "failed" /var/log/journal/

(Note: A better way to search the systemd journal is to pipe journalctl into grep, for example journalctl | grep "failed".)
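For the more common case of a text file with one stray null byte, a safer pattern is to combine -a with tr so the null byte never reaches your terminal. This is a sketch with an invented app.log; the tr step is an optional addition, not something grep requires.

```shell
tmpdir=$(mktemp -d)

# A log whose second line has one stray NUL byte in the middle.
printf 'started\nconnection failed\0 retrying\n' > "$tmpdir/app.log"

# -a makes grep print the matching line despite the NUL byte;
# tr -d '\000' then strips the NUL from the output.
line=$(grep -a 'failed' "$tmpdir/app.log" | tr -d '\000')
printf '%s\n' "$line"

rm -rf "$tmpdir"
```

Without -a, the same search would print only the binary-file message.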

Solution 3: The Encoding Problem

A file created on a Windows machine may be encoded in UTF-16. grep interprets its input in the current locale's encoding, which on most modern systems is UTF-8 (a superset of ASCII); it does not detect UTF-16.

If you run grep "text" windows_file.txt, it will likely find nothing or treat the file as binary, because UTF-16 stores each ASCII character as two bytes, one of which is a null byte. The bytes of "text" are therefore never adjacent in the file.
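You can inspect those interleaved null bytes directly. The sketch below encodes the word "text" as UTF-16LE (little-endian, the byte order Windows tools commonly use) and dumps the result in hex; it assumes iconv and od are installed.

```shell
# Encode the four ASCII letters of "text" as UTF-16LE and dump the bytes in hex.
# od -An -tx1 prints one hex pair per byte; tr removes the spacing.
hex=$(printf 'text' | iconv -f UTF-8 -t UTF-16LE | od -An -tx1 | tr -d ' \n')
printf '%s\n' "$hex"
# prints 7400650078007400
```

Each letter (74, 65, 78, 74) is followed by a 00 byte, which is exactly what trips grep's binary detection.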

The Fix: You must convert the file encoding before searching.

# Use iconv to convert UTF-16 to UTF-8, then pipe to grep
# (if the file has no byte order mark, name the byte order explicitly, e.g. -f UTF-16LE)
iconv -f UTF-16 -t UTF-8 windows_file.txt | grep "text"
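To try the whole round trip without a real Windows file, you can generate a UTF-16 sample yourself. In this sketch the file name and contents are invented, and the sample is written as UTF-16LE without a byte order mark, so the byte order is named explicitly when converting back.

```shell
tmpdir=$(mktemp -d)

# Build a small UTF-16LE sample, similar to what a Windows tool might write.
printf 'status: text found\n' | iconv -f UTF-8 -t UTF-16LE > "$tmpdir/windows_file.txt"

# Searched directly, the letters of "text" are separated by NUL bytes,
# so grep counts zero matching lines (|| true absorbs grep's exit status 1).
direct=$(grep -c 'text' "$tmpdir/windows_file.txt" || true)

# Converted to UTF-8 first, the same search succeeds.
converted=$(iconv -f UTF-16LE -t UTF-8 "$tmpdir/windows_file.txt" | grep 'text')

printf 'direct matches: %s\nafter iconv:    %s\n' "$direct" "$converted"

rm -rf "$tmpdir"
```

The direct search reports 0 matching lines, while the converted stream yields the original line.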