Markov Chains in Ruby
Writing text and playing with text is all good fun. Using Markov chains, one can generate text that is almost readable, almost understandable, but not quite. A Markov chain records how often each word follows another in a body of sample text.
Analysing text basically means counting which words follow each other. Producing text means picking a starting word, looking up which words came after it in the text samples, picking one of those at random, and then using this new word as the basis for determining the next one.
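The two steps can be sketched in a few lines. This is only a minimal illustration and not the ported code: the sample sentence and the names `followers` and `walk` are made up for this example, and `Array#sample` assumes Ruby 1.9 or later.

```ruby
# Minimal sketch of a first-order, word-level Markov chain.
# sample_text, followers and walk are hypothetical names for illustration only.
sample_text = "the cat sat on the mat and the cat ran off."

# Analysis: for every word, collect the words that followed it.
# Duplicates are kept on purpose, so frequent followers are picked more often.
def followers( text )
  table = Hash.new { | hash, key | hash[ key ] = [] }
  text.split.each_cons( 2 ) { | current, following | table[ current ] << following }
  table
end

# Generation: start at a word, then repeatedly pick a random recorded follower.
# Stops at a word that has no followers, or after max_words words.
def walk( table, start, max_words = 20 )
  result = [ start ]
  while result.size < max_words && table.key?( result.last )
    result << table[ result.last ].sample
  end
  result.join( " " )
end

table = followers( sample_text )
puts table[ "the" ].inspect   # → ["cat", "mat", "cat"]
puts walk( table, "the" )     # random, e.g. a sentence ending in "off."
```

Because "cat" appears twice in the follower list of "the", it is twice as likely to be picked as "mat". Storing duplicates is the simplest way to make the random choice follow the observed frequencies.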
For a game project we are working on, I need this functionality. Because Amuda is written in Ruby, using Ruby on Rails as its MVC framework, I also wanted to do the Markov chains in Ruby. Because there doesn’t seem to be any Ruby implementation, I ported Gary Burd’s Markov Chain Generator. For those interested, read on to see the code. And if any Ruby gurus are out there, please help me improve my “Ruby-Fu”; I’m still a beginner in this language.
Here we go:
# Markov Chain Generator
# based on the Python version by Gary Burd: https://gary.burd.info/2003/11/markov-chain-generator.html
# Released into the public domain, please keep this notice intact
# (c) InVisible GmbH
# https://www.invisible.ch
#
# the library name must be lower case, "YAML" fails on case-sensitive file systems
require "yaml"

class Array
  # return a random element of the array, similar to random.choice in Python
  # (Ruby 1.9 and later ship Array#sample, which does the same thing)
  def choice
    self[ rand( size ) ]
  end
end
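
class MarkovTool
  attr_accessor :markov_data

  def initialize( markov_data = nil )
    # use an unlikely combination for end of paragraph marker
    @nlnl = "#-#-"
    # fall back to an empty table unless a Hash was passed in
    @markov_data = markov_data.is_a?( Hash ) ? markov_data : Hash.new
  end

  # the next lookup key: the paragraph marker after a blank line,
  # the old key if there is no word, otherwise the word itself
  def new_key( key, word )
    return @nlnl if word == "\n"
    return key unless word
    word
  end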

  # record, for every key, the words that followed it in the sample
  def markov_data_from_words( words )
    key = @nlnl
    words.each do | word |
      @markov_data[ key ] ||= Array.new
      @markov_data[ key ] << word
      key = new_key( key, word )
    end
  end
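
  # walk the chain from the paragraph marker and collect the words
  def words_from_markov_data
    key = @nlnl
    result = Array.new
    word = ""
    # repeat until we hit a newline or a word ending in a full stop;
    # remove the last clause to get whole paragraphs
    while word && word != "\n" && !word.end_with?( "." )
      # an unknown key means we ran off the end of the sample data
      candidates = @markov_data[ key ]
      word = candidates ? candidates.choice : nil
      key = new_key( key, word )
      # skip the nil that signals the end of the data
      result << word if word
    end
    result
  end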

  # split a line into words; a blank line becomes a paragraph break
  def words_from_string( line )
    words = line.split
    words.empty? ? [ "\n" ] : words
  end

  # read a file and return all of its words as one flat array
  def words_from_file( f )
    result = Array.new
    File.foreach( f ) do | line |
      result.concat( words_from_string( line ) )
    end
    result
  end

  # build a paragraph out of the result array
  def paragraph_from_words( words )
    words.join( " " )
  end

  # return a complete paragraph
  def get_paragraph
    paragraph_from_words( words_from_markov_data )
  end
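
  # write the markov data to an open file (or any IO) as YAML
  def store_in_yaml( f )
    YAML.dump( @markov_data, f )
  end

  # replace the markov data with the YAML read from an open file
  def load_from_yaml( f )
    @markov_data = YAML.load( f )
  end
end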

if __FILE__ == $0
  m = MarkovTool.new
  # read existing markov data
  # File.open( "markov.yaml" ) { | yf | m.load_from_yaml( yf ) }
  if ARGV[0]
    # if we got a filename, read it, process it and store the markov data
    w = m.words_from_file( ARGV[0] )
    m.markov_data_from_words( w )
    File.open( "markov.yaml", "w" ) { | yf | m.store_in_yaml( yf ) }
  end
  # create a paragraph and display it
  # (named paragraph rather than p, so Kernel#p is not shadowed)
  paragraph = m.get_paragraph
  puts paragraph
end
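
The YAML round trip behind store_in_yaml and load_from_yaml can be tried on its own. The following is a minimal sketch, assuming a small hand-written table in the same shape as the markov data (a Hash mapping each key to an Array of follower words). The StringIO stands in for the markov.yaml file, and the variable names are made up for this example.

```ruby
require "yaml"
require "stringio"

# Hypothetical table in the same shape as the markov data:
# each key maps to the list of words seen after it.
data = { "#-#-" => [ "Hello" ], "Hello" => [ "world.", "there." ] }

# Dump into an in-memory IO instead of a real file.
io = StringIO.new
YAML.dump( data, io )

# Rewind and load it back; the result should equal the original Hash.
io.rewind
restored = YAML.load( io )
puts restored == data
```

Since the table contains only strings, arrays and a hash, it should load without extra options even on Ruby versions where YAML.load restricts which classes it will deserialize.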