What's the point of XML?

cj5
Forum Commoner
Posts: 60
Joined: Tue Jan 17, 2006 3:38 pm
Location: Long Island, NY, USA

for those in the ignorant

Post by cj5 »

Well, if you consider the background of XML and its roots, it was intended to bridge the gap between the varying electronic document formats and give developers a way to use the information inside those documents for application development. It's the crux of information sharing. What if you walked into a library and saw the shelves filled with sheets of paper just shoved on there in no particular order, with no page numbers? Take SGML, for instance (the parent of XML), and consider its document type definition. Would you want to search and retrieve documents of all varying types without placing them into a predefined infrastructure? From the Library of Congress' MARC XML Design Considerations (http://www.loc.gov/standards/marcxml/ma ... esign.html), the last note given is:
Extensibility

By using XML as the structure for MARC records, users of the MARC in the XML framework can more easily write their own tools to consume, manipulate, and convert MARC data.
You need a central point at which various users can access and manipulate data sources.
Benjamin
Site Administrator
Posts: 6935
Joined: Sun May 19, 2002 10:24 pm

Post by Benjamin »

jshpro2 wrote:agtlewis,

care to mock up a 6 dimensional CSV file for me to show me how that would work?
Whoa no way I don't think that would work very well.
It's basically a data abstraction thing. Say you are downloading weather data for multiple zip codes from a weather service, for example:

Code: Select all

<weather>
  <zip code="33458">
    <temps>
      <high>82</high>
      <low>75</low>
    </temps>
  </zip>
  <zip code="90210">
    <temps>
      <high>80</high>
      <low>73</low>
    </temps>
  </zip>
</weather>
It's just a really convenient way to move data from place to place, to store hierarchical data, etc.
Ok, I understand that. But looking at that example the first thing I see is that there is probably 4 or 5 times more markup than data. I'm sure if someone put their mind to it, they could develop something much more efficient, possibly even with mime types so it can support binary data as well. I'm really not impressed with it.
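For anyone following along, here is a rough sketch of what consuming the weather example above might look like. It is written in Python with the standard library's ElementTree purely for illustration; nobody in the thread posted this, and the function name `temps_by_zip` is made up.

```python
import xml.etree.ElementTree as ET

# The weather document from the post above, inlined so the sketch is self-contained.
WEATHER_XML = """
<weather>
  <zip code="33458">
    <temps>
      <high>82</high>
      <low>75</low>
    </temps>
  </zip>
  <zip code="90210">
    <temps>
      <high>80</high>
      <low>73</low>
    </temps>
  </zip>
</weather>
"""


def temps_by_zip(xml_text):
    """Return {zip_code: (high, low)} for every <zip> element (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    result = {}
    for zip_el in root.findall("zip"):
        # Paths are relative to the current <zip> element.
        high = int(zip_el.findtext("temps/high"))
        low = int(zip_el.findtext("temps/low"))
        result[zip_el.get("code")] = (high, low)
    return result


if __name__ == "__main__":
    for code, (high, low) in temps_by_zip(WEATHER_XML).items():
        print(code, high, low)
```

The point is only that the nesting in the markup maps directly onto lookups by path, which is the hierarchical part a flat CSV file does not give you for free.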
feyd
Neighborhood Spidermoddy
Posts: 31559
Joined: Mon Mar 29, 2004 3:24 pm
Location: Bothell, Washington, USA

Post by feyd »

psst, it does support binary data.
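The thread does not say how the binary data would be carried. One common approach, assumed here, is to Base64-encode the bytes and store them as element text, with the MIME type in an attribute. This is a sketch only; the `attachment` element and its `type` and `encoding` attributes are invented for the example and are not part of any standard mentioned above.

```python
import base64
import xml.etree.ElementTree as ET


def wrap_binary(payload: bytes, mime_type: str) -> str:
    """Serialize raw bytes into a hypothetical <attachment> element."""
    el = ET.Element("attachment", {"type": mime_type, "encoding": "base64"})
    # Base64 output is plain ASCII, so it is safe as XML character data.
    el.text = base64.b64encode(payload).decode("ascii")
    return ET.tostring(el, encoding="unicode")


def unwrap_binary(xml_text: str) -> bytes:
    """Recover the original bytes from the element produced by wrap_binary()."""
    el = ET.fromstring(xml_text)
    return base64.b64decode(el.text)


# Arbitrary bytes, including values that are not legal XML characters on their own.
raw = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0xFF])
doc = wrap_binary(raw, "image/png")

if __name__ == "__main__":
    print(doc)
    print(unwrap_binary(doc) == raw)
```

The cost is size: Base64 grows the payload by roughly a third, so this suits small attachments better than large files.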
cj5
Forum Commoner
Posts: 60
Joined: Tue Jan 17, 2006 3:38 pm
Location: Long Island, NY, USA

Post by cj5 »

agtlewis wrote:Ok, I understand that. But looking at that example the first thing I see is that there is probably 4 or 5 times more markup than data. I'm sure if someone put their mind to it, they could develop something much more efficient, possibly even with mime types so it can support binary data as well. I'm really not impressed with it.
I think you're not impressed because you are not aware of the various tools out there that can produce this. Sure, I can show you how to mark up a CSV file with PHP. It's very simple, in fact. The important thing I must emphasize is that most major websites that offer up XML in one format or another usually don't have static XML lying around in a file. Instead, they generate it dynamically from various sources, whether those are electronic documents (text, CSV, Excel, PDF) or databases. If you think developers type out XML documents by hand, then you need to do more research on this topic. I'd suggest you look into things like the PEAR XML packages, NuSOAP, and the many other PHP scripts that can easily produce XML. I use some of them to allow people to access my database information. I can create PDFs on the fly by importing XML into a PDF format, and spreadsheets too.

XML does support binary data as well, but interpreting XML as a programming language is a misinterpretation. To draw a picture for you, look at it as a central data format. Say I am building a site with Java using an Oracle database, and I want to access information from another website built with PHP and MySQL. XML offers a bridge to that data, because both languages can parse, generate, and query XML without having to labor through reinventing the wheel via parallel data access coding.

Hope that helps.
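To make the "generate it dynamically" point concrete, here is a minimal sketch of turning CSV rows into the weather markup used earlier in the thread. It is in Python rather than the PHP packages named above, and the column names (`zip`, `high`, `low`) and the function `csv_to_weather_xml` are assumptions made up for the example.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed input: one row per zip code, with a header row naming the columns.
CSV_TEXT = "zip,high,low\n33458,82,75\n90210,80,73\n"


def csv_to_weather_xml(csv_text):
    """Build a <weather> document from CSV text (hypothetical helper)."""
    root = ET.Element("weather")
    for row in csv.DictReader(io.StringIO(csv_text)):
        zip_el = ET.SubElement(root, "zip", code=row["zip"])
        temps = ET.SubElement(zip_el, "temps")
        ET.SubElement(temps, "high").text = row["high"]
        ET.SubElement(temps, "low").text = row["low"]
    # encoding="unicode" returns a str instead of bytes.
    return ET.tostring(root, encoding="unicode")


xml_text = csv_to_weather_xml(CSV_TEXT)

if __name__ == "__main__":
    print(xml_text)
```

The same loop would work over database rows; only the row source changes, and the library takes care of escaping and closing tags.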
CoderGoblin
DevNet Resident
Posts: 1425
Joined: Tue Mar 16, 2004 10:03 am
Location: Aachen, Germany

Post by CoderGoblin »

cj5 : I remember SGML....

Wasn't that and AECMA going to allow people to have a paperless office by the year 2000...

Oops too late...
cj5
Forum Commoner
Posts: 60
Joined: Tue Jan 17, 2006 3:38 pm
Location: Long Island, NY, USA

Post by cj5 »

CoderGoblin wrote:cj5 : I remember SGML....

Wasn't that and AECMA going to allow people to have a paperless office by the year 2000...

Oops too late...
What's your source for this information?
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

agtlewis wrote:Ok, I understand that. But looking at that example the first thing I see is that there is probably 4 or 5 times more markup than data. I'm sure if someone put their mind to it, they could develop something much more efficient, possibly even with mime types so it can support binary data as well. I'm really not impressed with it.
Why is it a problem that there is more markup than data? HTML usually has more markup than content and I don't hear an outcry about how unimpressive or unsuccessful it has been. I hope you are not thinking of performance issues in the abstract. The markup is meaningful and allows very general support for very rich data in thousands of tools and programs. Unlike "more efficient" formats, when you get XML it is pretty self-explanatory what the data is.

I'm not sure what to say about "I'm really not impressed with it" as the likes of IBM, Sun and Microsoft (and pretty much everyone else) have all standardized on it. XML does for data interchange what HTML did for page layout: it makes the powerful easy.
(#10850)
Roja
Tutorials Group
Posts: 2692
Joined: Sun Jan 04, 2004 10:30 pm

Post by Roja »

agtlewis wrote:But looking at that example the first thing I see is that there is probably 4 or 5 times more markup than data. I'm sure if someone put their mind to it, they could develop something much more efficient, possibly even with mime types so it can support binary data as well. I'm really not impressed with it.
Efficiency isn't the goal. Portability and consistency are.

Just like you could probably write a custom parser for Yahoo Finance that scrapes *only* the numbers you need, making it very "efficient". But when you have to change that custom parser once a week, redoing almost all your work, because they change layouts, suddenly consistency becomes a much higher priority.

Now imagine trying to incorporate data from a dozen sources per page, like the personalized pages do. It would be flat out impossible without XML.

If that doesn't get it across to you, picture doing sales through online retailers like Barnes & Noble, Amazon, and others. Trying to create the information each one needs, and to parse the information they return, would be a nightmare. With XML, you can simply import the XML feed and target the element in the tree you are looking for, like it's a row from a DB.

XML is fairly efficient for a completely consistent data exchange format. It can carry binary data (usually Base64-encoded as text; CDATA sections only hold unescaped character data), and millions of websites have embraced it.

Feel free to not be impressed or use it. The rest of the world has, does, and is FAR better because of it.

Eventually, you'll probably find something compelling about it.
m3mn0n
PHP Evangelist
Posts: 3548
Joined: Tue Aug 13, 2002 3:35 pm
Location: Calgary, Canada

Post by m3mn0n »

Bottom line is XML is very useful if you're a web developer, and the more you understand about it and the more you work with it, the more you appreciate it.

Especially when you learn about making your own markup language for a data source you manage, about XHTML, and about syndication (Atom/RSS, for example), and start using these things effectively.
Christopher
Site Administrator
Posts: 13596
Joined: Wed Aug 25, 2004 7:54 pm
Location: New York, NY, US

Post by Christopher »

There are reasonable alternatives to XML for some cases, JSON being an example. For language-specific tasks you can take shortcuts. I hear that Yahoo is providing some data in PHP serialize() format, for example.
(#10850)
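For a rough feel of the JSON alternative mentioned above, here is a sketch that writes the same weather readings both ways and compares them. It is Python for illustration only; the dictionary layout and the helper names `to_json` and `to_xml` are assumptions, not anything Yahoo or anyone in this thread published.

```python
import json
import xml.etree.ElementTree as ET

# The same readings as the weather example earlier in the thread.
readings = {
    "33458": {"high": 82, "low": 75},
    "90210": {"high": 80, "low": 73},
}


def to_json(data):
    """Compact JSON with no optional whitespace."""
    return json.dumps(data, separators=(",", ":"))


def to_xml(data):
    """The <weather> layout used earlier in the thread, with no whitespace."""
    root = ET.Element("weather")
    for code, temps in data.items():
        zip_el = ET.SubElement(root, "zip", code=code)
        temps_el = ET.SubElement(zip_el, "temps")
        for name, value in temps.items():
            ET.SubElement(temps_el, name).text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(len(to_json(readings)), "characters as JSON")
    print(len(to_xml(readings)), "characters as XML")
```

Both formats can be read by parsers in most languages, which is the real difference from a language-specific format like serialize().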
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Post by josh »

arborint wrote: I hear that Yahoo is providing some data in PHP serialize() format for example.
And if I'm using Perl, I have to re-implement unserialize()?
Roja
Tutorials Group
Posts: 2692
Joined: Sun Jan 04, 2004 10:30 pm

Post by Roja »

jshpro2 wrote:
arborint wrote: I hear that Yahoo is providing some data in PHP serialize() format for example.
And if I'm using Perl, I have to re-implement unserialize()?
Tada. The value of standardized data formats. :)
Gambler
Forum Contributor
Posts: 246
Joined: Thu Dec 08, 2005 7:10 pm

Post by Gambler »

Speaking about flawed logic... Excel tables are the best document format in the world. They are used by many large businesses and there are many tools that generate/manipulate them. Also, they are created by Microsoft, which by itself makes them a business standard. Who cares about the rest? We should all use Excel tables. We should not reinvent the wheel. We should not consider using better formats, because Excel already exists and it works.

Those are the same arguments everyone uses to defend pretty much any existing technology that is fairly popular.
m3mn0n
PHP Evangelist
Posts: 3548
Joined: Tue Aug 13, 2002 3:35 pm
Location: Calgary, Canada

Post by m3mn0n »

Gambler wrote:Speaking about flawed logic... Excel tables are the best document format in the world. They are used by many large businesses and there are many tools that generate/manipulate them. Also, they are created by Microsoft, which by itself makes them a business standard. Who cares about the rest? We should all use Excel tables. We should not reinvent the wheel. We should not consider using better formats, because Excel already exists and it works.

Those are the same arguments everyone uses to defend pretty much any existing technology that is fairly popular.
LOL

Were you joking? Or are you serious?
josh
DevNet Master
Posts: 4872
Joined: Wed Feb 11, 2004 3:23 pm
Location: Palm beach, Florida

Post by josh »

Gambler wrote:Those are the same arguments everyone uses to defend pretty much any existing technology that is fairly popular.
Are you saying that I am defending XML? I did no such thing; I am defending standards. Excel is not a standard, it is a proprietary format, and thus cannot be compared to the current topic (serialize() vs. XML). Although serialize() is a data format that can be implemented in other languages, it is not a standard. Nothing tells us that the format serialize() produces won't change in the next PHP version (although I doubt it will, because I bet a lot of people serialize() data for long-term storage).