Creating MD5 digests with Ruby
06 Mar 04
Andrew L. Johnson
MD5 is a one-way hashing algorithm for creating digest
"signatures" or checksums of strings (usually entire files).
Ruby’s standard library includes MD5 as part of the Digest::
set of extension classes.
Creating an MD5 checksum is a simple matter of requiring the
digest/md5 library and calling either the
Digest::MD5.digest or Digest::MD5.hexdigest class method
on a given string. The first returns the raw bytes of the digest and the
second returns them as hexadecimal characters. We will use hexdigests in
this article as they are printable:
#!/usr/bin/ruby -w
require 'digest/md5'
digest = Digest::MD5.hexdigest("Hello World\n")
puts digest
__END__
e59ff97941044f85df5297e1c302d260
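To make the difference between the two class methods concrete, here is a small sketch (not part of the original script; the variable names are only illustrative). It assumes nothing beyond the digest/md5 library: digest returns the raw 16-byte string, and hexdigest returns the same bytes written as 32 hexadecimal characters.

```ruby
require 'digest/md5'

data = "Hello World\n"

raw = Digest::MD5.digest(data)     # 16 raw bytes, generally not printable
hex = Digest::MD5.hexdigest(data)  # the same bytes as 32 hex characters

puts raw.bytesize                  # 16
puts hex.length                    # 32

# Hex-encoding the raw digest by hand gives the hexdigest.
puts(raw.unpack('H*').first == hex)  # true
```

The raw form is handy when the digest is stored or transmitted as binary data; the hex form is the one you usually see published next to a download.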
MD5 digests are 128-bit (16-byte) signatures and have long been a common way
of providing checksums for files available on the net. To create a checksum
of an entire file you need only pass in the file's contents as a string. The
following will print the filename and MD5 digest of each file passed to it on
the command line:
#!/usr/bin/ruby -w
require 'digest/md5'
ARGV.each do |f|
  # binread reads raw bytes, so binary files hash the same on every platform
  digest = Digest::MD5.hexdigest(File.binread(f))
  puts "#{f}: #{digest}"
end
__END__
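A natural next step is comparing a computed digest with a published one. The sketch below is one way to do that. The helper name md5_matches? is hypothetical, and a temporary file stands in for a real download so that the example runs on its own.

```ruby
require 'digest/md5'
require 'tempfile'

# Hypothetical helper: true if the file's MD5 hexdigest equals the expected one.
# The comparison ignores case and surrounding whitespace, since published
# checksums are sometimes upper-case or copied with a trailing newline.
def md5_matches?(path, expected_hex)
  actual = Digest::MD5.hexdigest(File.binread(path))
  actual == expected_hex.strip.downcase
end

# Demonstration with a temporary file standing in for a download.
Tempfile.create('download') do |tmp|
  tmp.binmode
  tmp.write("Hello World\n")
  tmp.flush

  puts md5_matches?(tmp.path, 'E59FF97941044F85DF5297E1C302D260')  # true
  puts md5_matches?(tmp.path, '0' * 32)                             # false
end
```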
Sometimes you want to do more than calculate the checksum of a single
string: maybe you have a large file and want to calculate the digest
in small, memory-friendly chunks, or maybe you are calculating a digest
from a stream of input. In such cases you can create a Digest::MD5 object
and use the #update (alias: #<<), #digest,
and #hexdigest methods.
As an example, we will compute a digest incrementally by reading a test
file one line at a time, and also compute it all at once for comparison.
I will use the source for this article as the test file:
#!/usr/bin/ruby -w
require 'digest/md5'
filename = 'MD5.rdoc'
all_digest = Digest::MD5.hexdigest(File.read(filename))
incr_digest = Digest::MD5.new
# The block form of File.open closes the file when the block ends
File.open(filename, 'r') do |file|
  file.each_line do |line|
    incr_digest << line
  end
end
puts incr_digest.hexdigest
puts all_digest
__END__
Saving this as md.rb and running it produced the following output:
~$ ruby md.rb
a075a1debea63e5d7073d9eed19ce031
a075a1debea63e5d7073d9eed19ce031
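Reading line by line suits a text file, but a binary file may contain very long "lines" or none at all. A sketch of the same idea using fixed-size chunks follows. The helper name chunked_md5 and the 8192-byte chunk size are arbitrary choices for illustration, and a temporary file is used so the example is self-contained. The digest library also provides Digest::MD5.file, which reads a file incrementally for you.

```ruby
require 'digest/md5'
require 'tempfile'

CHUNK_SIZE = 8192  # arbitrary; any positive size gives the same digest

# Hypothetical helper: MD5 hexdigest of a file, read in fixed-size chunks.
def chunked_md5(path, chunk_size = CHUNK_SIZE)
  md5 = Digest::MD5.new
  File.open(path, 'rb') do |file|
    # IO#read(n) returns nil at end of file, which ends the loop.
    while (chunk = file.read(chunk_size))
      md5 << chunk
    end
  end
  md5.hexdigest
end

Tempfile.create('sample') do |tmp|
  tmp.binmode
  tmp.write('x' * 100_000)  # larger than one chunk
  tmp.flush

  puts chunked_md5(tmp.path)
  puts Digest::MD5.file(tmp.path).hexdigest  # library shortcut, same result
end
```

Both lines printed should be identical, just as in the line-by-line example above.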
In addition to providing checksums for files you make available, or
checking files and packages you’ve downloaded, another use is
fingerprinting sensitive directories on your system. Creating a database of
MD5 digests for sensitive files or directories means you can periodically
cross-check your sensitive data against the database to see if anything has
been changed without your knowledge. This gives you an easily
implemented addition to your intrusion detection tools.
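A minimal sketch of such a fingerprint check might look like the following. The helper names fingerprint and compare are hypothetical, the "database" is just an in-memory Hash (a real one would be saved to disk, for instance with Marshal or YAML), and a temporary directory stands in for the sensitive one. Note that this glob pattern skips dotfiles.

```ruby
require 'digest/md5'
require 'tmpdir'

# Hypothetical helper: map every regular file under dir to its MD5 hexdigest.
def fingerprint(dir)
  Dir.glob(File.join(dir, '**', '*')).sort.each_with_object({}) do |path, db|
    db[path] = Digest::MD5.file(path).hexdigest if File.file?(path)
  end
end

# Hypothetical helper: report what differs between two fingerprint hashes.
def compare(old_db, new_db)
  {
    added:   new_db.keys - old_db.keys,
    removed: old_db.keys - new_db.keys,
    changed: (old_db.keys & new_db.keys).reject { |p| old_db[p] == new_db[p] }
  }
end

Dir.mktmpdir do |dir|
  File.write(File.join(dir, 'a.conf'), "setting = 1\n")
  File.write(File.join(dir, 'b.conf'), "setting = 2\n")
  baseline = fingerprint(dir)

  File.write(File.join(dir, 'a.conf'), "setting = 99\n")  # simulate tampering
  File.write(File.join(dir, 'c.conf'), "new file\n")

  report = compare(baseline, fingerprint(dir))
  report.each do |kind, paths|
    puts "#{kind}: #{paths.map { |p| File.basename(p) }.join(', ')}"
  end
end
```

In this run the report lists a.conf as changed and c.conf as added, while b.conf is untouched and so does not appear.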