How can Python compare the differences between two files?
To compare two files, you can follow these steps:
- Open both files and read their contents line by line.
- Store the lines of each file in its own list.
- Compare these two lists using the SequenceMatcher class in the difflib module.
- Use the get_opcodes() method to obtain a list of opcodes that describe how to convert the first list into the second. Each opcode is a tuple holding a type plus the start and end indices of a range of lines in each list.
- Iterate over the opcodes and check the type of each one.
- If the type is 'replace', the two files contain different lines in that range.
- If the type is 'delete', the first file has lines in that range that are missing from the second file.
- If the type is 'insert', the second file has lines in that range that are missing from the first file.
- If the type is 'equal', the range is identical in both files and can be skipped.
- Write the differing lines to a new file.
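Before working with real files, it may help to see what the opcodes look like. The sketch below runs SequenceMatcher on two small in-memory lists. The helper name list_opcodes and the sample fruit lists are made up for illustration and are not part of difflib.

```python
import difflib

def list_opcodes(lines1, lines2):
    # Hypothetical helper: return the opcodes that turn lines1 into lines2.
    matcher = difflib.SequenceMatcher(None, lines1, lines2)
    return matcher.get_opcodes()

# Sample data, invented for this illustration.
old = ["apple\n", "banana\n", "cherry\n"]
new = ["apple\n", "blueberry\n", "cherry\n", "date\n"]

for tag, start1, end1, start2, end2 in list_opcodes(old, new):
    # Each opcode names a range old[start1:end1] and a range new[start2:end2].
    print(tag, old[start1:end1], new[start2:end2])
```

With these lists, the loop reports an 'equal' range for "apple", a 'replace' range for "banana" versus "blueberry", another 'equal' range for "cherry", and an 'insert' range for "date".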
Here is some sample code:
import difflib

def compare_files(file1, file2, output_file):
    # Read both files into lists of lines (line endings are kept).
    with open(file1, 'r') as f1, open(file2, 'r') as f2:
        lines1 = f1.readlines()
        lines2 = f2.readlines()

    # Compute the opcodes that turn lines1 into lines2.
    matcher = difflib.SequenceMatcher(None, lines1, lines2)
    opcodes = matcher.get_opcodes()

    with open(output_file, 'w') as output:
        for tag, start1, end1, start2, end2 in opcodes:
            # 'equal' ranges are identical in both files, so they are skipped.
            if tag == 'replace':
                output.write(f'Different lines in file1: {lines1[start1:end1]}\n')
                output.write(f'Different lines in file2: {lines2[start2:end2]}\n')
            elif tag == 'delete':
                output.write(f'Extra lines in file1: {lines1[start1:end1]}\n')
            elif tag == 'insert':
                output.write(f'Extra lines in file2: {lines2[start2:end2]}\n')

file1 = 'file1.txt'
file2 = 'file2.txt'
output_file = 'diff.txt'
compare_files(file1, file2, output_file)
The code above compares the contents of two files, file1.txt and file2.txt, and saves the differing lines to a file called diff.txt. Ranges that are identical in both files are not written. You can modify the file names and paths as needed.
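If a conventional diff layout is preferred over the labelled ranges, the same difflib module also provides unified_diff. This is a different technique from the opcode loop shown earlier, not a replacement for it. The sketch below is a minimal example; the function name unified_diff_text and the default file labels are assumptions chosen for illustration.

```python
import difflib

def unified_diff_text(lines1, lines2, name1="file1.txt", name2="file2.txt"):
    # Hypothetical helper: join the lines produced by difflib.unified_diff
    # into one string. The input lines are expected to keep their newlines,
    # as returned by readlines().
    diff = difflib.unified_diff(lines1, lines2, fromfile=name1, tofile=name2)
    return "".join(diff)

# Sample data, invented for this illustration.
old = ["apple\n", "banana\n", "cherry\n"]
new = ["apple\n", "blueberry\n", "cherry\n", "date\n"]
print(unified_diff_text(old, new))
```

Removed lines are prefixed with "-" and added lines with "+". When the two lists are identical, the result is an empty string.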