I have started writing a simple php script to read the data out of an xml file and into a database.
Simple right? Well here are a few more details.
The current xml file is about 9.4 million lines long.
I dont have a full description for each field, meaning i dont what fields share common values or what those values might be.
It needs to be efficient enough so that it wont take hundreds of hours to import.
So where im at...
Currently i have a basic script that will loop through all the xml data there is and format it nicely into a very large associative array:
Code: Select all
<?php
class xmlImport{
protected $xmlArr;
protected $xml;
public function xmlImport($file){
$this->xml = simplexml_load_file($file);
$this->xmlArr = $this->processXML($this->xml);
print "<pre>";print_r($this->xmlArr);print "</pre>";
return $this->xmlArr;
}
private function processXML(&$xml){
foreach($xml->children() as $child){
if(count($child) > 1){
$arr[][$child->getName()] = $child;
}else{
$arr[$child->getName()] = $this->processXML($child);
}
}
return $arr;
}
private function importData(&$arr){
}
}
$xmlImport = new xmlImport("example.xml");
?>Code: Select all
<?xml version="1.0" encoding="UTF-8"?>
<Items>
<Item>
<ItemKey>D9B6982B01010101</ItemKey>
<Name>
<English>A Scrap of Paper</English>
<French>Morceau de papier</French>
<German>Ein Stück Papier</German>
</Name>
<Description>
<English>It's beginning to crumble</English>
<French>Le papier commence à s'effriter.</French>
<German>Beginnt zu verfallen.</German>
</Description>
<Value>1</Value>
<Rarity>Trash</Rarity>
<SoulboundTrigger>BindOnPickup</SoulboundTrigger>
<Icon>item_icons\blank_parchment</Icon>
<FirstLootedBy>
<EU_Akala>
<Name>Eleonas</Name>
<Guild>Old Heroes</Guild>
<Date>2011-02-24T22:28:14</Date>
</EU_Akala>
<EU_Argent>
<Name>Welf</Name>
<Date>2011-02-24T21:38:13</Date>
</EU_Argent>
<EU_Blightweald>
<Name>Doradan</Name>
<Guild>Daedalus</Guild>
<Date>2011-02-25T01:03:56</Date>
</EU_Blightweald>
</FirstLootedBy>
</Item>
<Item>
<ItemKey>B6BBBD1F01D2D6F7D805663C</ItemKey>
<Name>
<English>Templar's Bow of the Sinister</English>
<French>Arc de Templier du Sinistre</French>
<German>Templerbogen</German>
</Name>
<Value>2350</Value>
<Rarity>Uncommon</Rarity>
<Icon>item_icons\ranged_bow_043</Icon>
<Slot>Weapon_Ranged</Slot>
<WeaponType>2h_ranged_bow</WeaponType>
<MinimumDamage>42</MinimumDamage>
<MaximumDamage>78</MaximumDamage>
<Speed>2.600000</Speed>
<Range>30.000000</Range>
<OnEquip>
<Strength>11</Strength>
<Dexterity>15</Dexterity>
<Endurance>9</Endurance>
</OnEquip>
<RequiredLevel>49</RequiredLevel>
<RunebreakSkillLevel>226</RunebreakSkillLevel>
<FirstLootedBy>
<EU_Blightweald>
<Name>Mels</Name>
<Guild>Desire</Guild>
<Date>2011-03-06T00:56:43</Date>
</EU_Blightweald>
<EU_Bloodiron>
<Name>Miaki</Name>
<Guild>Serenity</Guild>
<Date>2011-04-01T14:35:38</Date>
</EU_Bloodiron>
</FirstLootedBy>
</Item>
</Items>Basically i dont know how many fields there will possibly be inside the xml file with no full structure available.
Is it possible to create tables/fields as the script goes along and finds new children in the xml?
So basically if it finds a new child, it creates a new database field and inserts the data, else it goes on to the next one/inserts data.
However, here in lies my third problem, efficiency.
I would like to optimize the database into several tables, to help reduce the size of a single table.
So in the example.xml you see above, OnEquip will have several sub-fields that to my knowledge, can possibly have up to 21 individual children. This is the point where i would simply make OnEquip (in the database) a pointer to another table with these fields in it.
Open to all thoughts and suggestions, if you need more info just ask and hopefully i can provide