Graal Forums  

06-30-2008, 11:19 PM
DrakilorP2P
Random name generator using Markov Chains

This script, given a database of example names, will produce new random names using what it learned by parsing the example names. Some assembly required.

<tech>
The next character in the string is a random function of the previous character(s). For instance, if the script finds that the letter "f" often follows the letter "e" and that "q" almost never follows "e", it will replicate this by more often choosing "f" than "q" when faced with an "e". This is a first order chain. A second order chain will do the same, except choose the next character based on the two previous characters, such that "y" could often follow "to".
</tech>
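To make the counting step concrete, here is a minimal sketch in Python (not GS2, so it can be run outside Graal). The function name `learn` and the three sample names are assumptions for illustration; the `#` padding mirrors the start/end marker the script in this post uses.

```python
from collections import Counter, defaultdict

def learn(names, order=1, pad="#"):
    """Count which character follows each context of `order` characters."""
    table = defaultdict(Counter)
    for name in names:
        # Leading pads mark the start of a name, the trailing pad marks its end.
        text = pad * order + name + pad
        for i in range(len(text) - order):
            context = text[i:i + order]
            table[context][text[i + order]] += 1
    return table

sample = ["tesla", "fowler", "kramers"]

first = learn(sample, order=1)
print(dict(first["e"]))    # what follows "e" in a first order chain

second = learn(sample, order=2)
print(dict(second["er"]))  # what follows "er" in a second order chain
```

In this tiny sample "e" is followed by "r" twice and by "s" once, so a first order chain would pick "r" about two times out of three.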

Fun fact: Markov chains have been used for spam filtering. They have also been used to elude spam filters.

In order to produce original names, you will need to input a fairly large database of example names. By customizing the database, you choose what kind of names you generate: if you want English-sounding names, input English names; if you want alien-sounding names, input alien names.
There are some things you need to consider, however. If the database is too large, the infamous loop limit will eat you if the performance demon doesn't. If it's too small, you'll often find yourself looking at unoriginal names or blatant copies of names already in the database.

You'll be able to choose what order the chains should be. This also gives you some options. A first order chain will often produce gibberish, while a fourth order chain will often produce unoriginal results. I've found that a third order chain will often produce non-gibberish with only occasional copies.
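The effect of the order is easy to try out. The following is a rough Python sketch (again not GS2) of frequency-weighted generation; `learn`, `generate`, `NAMES`, the `max_len` cap and the seed are all assumptions made for the example, and the loop is iterative rather than recursive like the script in this post.

```python
import random
from collections import Counter, defaultdict

PAD = "#"  # same start/end marker the script in this post uses

def learn(names, order):
    """Count followers for every context of `order` characters."""
    table = defaultdict(Counter)
    for name in names:
        text = PAD * order + name + PAD
        for i in range(len(text) - order):
            table[text[i:i + order]][text[i + order]] += 1
    return table

def generate(table, order, rng, max_len=40):
    """Walk the chain from the start marker until the end marker is drawn."""
    context = PAD * order
    word = ""
    while len(word) < max_len:  # hypothetical safety cap against long cycles
        followers = table.get(context)
        if not followers:
            break
        chars = list(followers)
        weights = [followers[ch] for ch in chars]
        nxt = rng.choices(chars, weights=weights)[0]  # frequent followers win more often
        if nxt == PAD:
            break
        word += nxt
        context = (context + nxt)[1:]  # slide the window by one character
    return word

NAMES = ["tesla", "heisenberg", "curie", "richardson", "piccard", "einstein"]
rng = random.Random(2008)
for order in (1, 3):
    table = learn(NAMES, order)
    print(order, [generate(table, order, rng) for _ in range(5)])
```

With a database this small, the order-1 output should look like letter soup while the order-3 output should mostly reassemble pieces of the input names.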

Fun fact: Markov chains are being used in Google's PageRank formula.

These names were generated with order-3 chains from a database of a thousand medium-length names:
Quote:
rebe lucy kristine jessie danicher orah alba cara rita wilagros tessie wina son claudra alexis
These, on the other hand, are also order-3, but were generated from only a small database of famous scientists. As you can see, most are duplicates.
Quote:
piccardson ein heisenberg curie shroedinger ein langmeir fowler shroedingevin tesla piccard vers planck bohr henriot planck heisenberg kramerschaffelt fowler shroedinger
Bottom line: the quality of the names depends entirely on your input data.
I suggest a decently large database with a fair share of longer names and second or third order chains.
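One way to put a number on "unoriginal" is to count how many generated names are verbatim copies of database entries. This is a Python sketch under assumptions: `copy_rate`, the sample count and the seed are made up for the example, and the list is the scientist database from this post.

```python
import random
from collections import Counter, defaultdict

PAD = "#"
SCIENTISTS = ["tesla", "heisenberg", "curie", "richardson", "bohr", "brillouin",
              "fowler", "shroedinger", "langmeir", "planck", "lorentz", "henriot",
              "kramers", "piccard", "verschaffelt", "einstein", "langevin"]

def learn(names, order):
    table = defaultdict(Counter)
    for name in names:
        text = PAD * order + name + PAD
        for i in range(len(text) - order):
            table[text[i:i + order]][text[i + order]] += 1
    return table

def generate(table, order, rng, max_len=40):
    context, word = PAD * order, ""
    while len(word) < max_len:
        followers = table.get(context)
        if not followers:
            break
        chars = list(followers)
        nxt = rng.choices(chars, weights=[followers[c] for c in chars])[0]
        if nxt == PAD:
            break
        word += nxt
        context = (context + nxt)[1:]
    return word

def copy_rate(names, order, samples=1000, seed=0):
    """Fraction of generated names that already exist in the database."""
    rng = random.Random(seed)
    table = learn(names, order)
    known = set(names)
    made = [generate(table, order, rng) for _ in range(samples)]
    return sum(word in known for word in made) / samples

for order in (1, 2, 3, 4):
    print(order, round(copy_rate(SCIENTISTS, order), 2))
```

For this particular list no four-letter sequence is shared between two names, so an order-4 chain can only walk back along an existing name. That is the "blatant copies" problem in its purest form.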

Code:
//#CLIENTSIDE
function onCreated()
{
  doNotCallThisFunction();
  
  //Use this.nameArray as the database. Generate third order chains.
  parseAndLearnArray(this.nameArray, 3);
  
  echo("---------------");
  echo("Markov: " @ generate());
  echo("Markov: " @ generate());
  echo("Markov: " @ generate());
  echo("Markov: " @ generate());
  echo("Markov: " @ generate());
}

function generate(c, currentWord)
{
  //First call: start from a context made only of "#" start markers
  if (temp.c == null) {
    temp.c = "";
    for (temp.i=0; temp.i<this.order; temp.i++) temp.c @= "#";
  }
  
  if (temp.currentWord == null) temp.currentWord = "";
  
  //Sum the frequencies of every character seen after this context
  temp.total = 0;
  for (temp.character: makevar("this.chain_key_" @ temp.c.substring(0, this.order))) {
    temp.frequency = makevar("this.chain_key_" @ temp.c.substring(0, this.order) @ "_" @ temp.character);
    temp.total += temp.frequency;
  }
  
  if (temp.total > 0) {
    //Pick a follower at random, weighted by frequency
    temp.r = int(random(1, temp.total+1));
    for (temp.character: makevar("this.chain_key_" @ temp.c.substring(0, this.order))) {
      temp.frequency = makevar("this.chain_key_" @ temp.c.substring(0, this.order) @ "_" @ temp.character);
      temp.r -= temp.frequency;
      
      if (temp.r <= 0) {
        //"#" is the end marker: the name is complete
        if (temp.character == "#") return temp.currentWord;
        
        temp.currentWord @= temp.character;
        //Slide the context one character forward
        temp.c = temp.c.substring(1, temp.c.length() - 1) @ temp.character;
        break;
      }
    }
  }
  else {
    return temp.currentWord;
  }
  
  return generate(temp.c, temp.currentWord);
}

function parseAndLearnText(text)
{
  //Pad with "#": this.order start markers in front, one end marker behind
  temp.bakText = temp.text;
  temp.text = "";
  for (temp.i=0; temp.i<this.order; temp.i++) temp.text @= "#";
  temp.text @= temp.bakText @ "#";
  
  for (temp.i=0; temp.i<temp.text.length() - this.order; temp.i++) {
    temp.tmpText = temp.text.substring(temp.i, this.order);
    temp.nextCharacter = temp.text.substring(temp.i + this.order, 1);
    
    //List of characters seen after this context
    if (makevar("this.chain_key_" @ temp.tmpText) == null) {
      makevar("this.chain_key_" @ temp.tmpText) = {};
    }
    //Frequency counter for this context/character pair
    if (!makevar("this.chain_key_" @ temp.tmpText @ "_" @ temp.nextCharacter)) {
      makevar("this.chain_key_" @ temp.tmpText @ "_" @ temp.nextCharacter) = 0;
      makevar("this.chain_key_" @ temp.tmpText).add(temp.nextCharacter);
    }
    makevar("this.chain_key_" @ temp.tmpText @ "_" @ temp.nextCharacter) += 1;
  }
}

function parseAndLearnArray(textArray, order)
{
  //Default to third order chains
  if (temp.order == null) temp.order = 3;
  this.order = temp.order;
  
  for (temp.line: temp.textArray) {
    parseAndLearnText(temp.line);
  }
}

//It's most likely better to read these from a file
function doNotCallThisFunction()
{
  this.nameArray = {
    "tesla",
    "heisenberg",
    "curie",
    "richardson",
    "bohr",
    "brillouin",
    "fowler",
    "shroedinger",
    "langmeir",
    "planck",
    "lorentz",
    "henriot",
    "kramers",
    "piccard",
    "verschaffelt",
    "einstein",
    "langevin"
  };
}

As previously mentioned, some assembly required. You probably want to load all the names from a text file rather than putting them directly in an array.
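The file-loading part is not covered by the script, so here is only a sketch of the clean-up such a loader might do, in Python rather than GS2. The helper `load_names` and the one-name-per-line format are assumptions. Names containing `#` are skipped because the script uses that character as its start/end marker.

```python
import io

def load_names(lines, pad="#"):
    """Turn raw text lines into a clean list of lowercase names."""
    names = []
    for line in lines:
        name = line.strip().lower()
        # Skip blank lines and anything that would collide with the marker.
        if not name or pad in name:
            continue
        names.append(name)
    return names

# Stand-in for a text file with one name per line.
sample_file = io.StringIO("Tesla\n\n  Curie \nbad#name\nBohr\n")
print(load_names(sample_file))  # ['tesla', 'curie', 'bohr']
```

Any iterable of lines works, so an opened text file can be passed in directly in place of the `io.StringIO` stand-in.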

Fun fact: Markov chains have been used to create chatbots.

Some more examples follow, using larger databases than Graal can handle.

More than five thousand names, second order chains. More original but also some gibberish.
Quote:
la marrick gan ca dana weladian marri edee ma lorenan deth mina lina ro wey ti rodee shanitherlin ronna vice da monda rolannmarlisomashon
Just over 80,000 dictionary words, third order chains. Extremely large databases produce more gibberish.
Quote:
chats oute debillintakeoff efteer matzu roomplose prediallowsers buness mutumustudity gondor papery quinteles cust unbei antrangedly hoopf ves divis rollier sad synecked earchy ryn***ed cowps butting ding vinyls pees uncent mail amp
Fun fact: J.R.R Tolkien used Markov chains.