You Are Not Allowed to Request Data Again Xml
During my second lecture to an XML class at a local
community college, I explained how XML lets you define your own markup language with custom tags and attributes. I had finished defining a simple markup language for use
with a list of amateur sports clubs, and had displayed a sample document
written with that markup. At that point, one student asked:
"Isn't it inefficient to have to type all those tags for
every item? What good is this? It looks nice, but what can I
do with this document? How can I put this in a web page or use it with
other programs? Wouldn't it be easier to just use HTML or a
database/word processor/fill-in-the-blank?"
The reason that we use XML instead of a specific application is that
XML is not just a pretty face, living in isolation from the rest
of the computing world. XML is more than a rulebook for generating
custom markup languages. It is part of a family of technologies, which,
working together, make your XML-based documents very useful indeed. To
demonstrate what I mean, I decided to create a new XML-based markup
language from scratch, and show what you can do with a document written
in that language, using off-the-shelf tools.
Creating a New Markup Language
The language that I created stores the nutritional
information that you find on food labels in the United States. The
document starts with a <nutrition> tag, followed by
a <daily-values> element that gives the maximum
amounts of fat, sodium, etc. for a 2000-calorie-a-day diet, and the
units in which the amount is measured.
The daily values are followed by a series of
<food> elements, each of which gives data
about a specific food and its nutritional categories. Because the
<daily-values> element has already defined the units
in which each category is measured, we don't need to repeat them
for every food; we just enter the numbers for that particular
food's total fat, sodium, etc. After the last food, we close the
document with a closing </nutrition> tag.
    <nutrition>
    <!-- Establish the daily values -->
    <daily-values>
      <total-fat units="g">65</total-fat>
      <saturated-fat units="g">20</saturated-fat>
      <cholesterol units="mg">300</cholesterol>
      <sodium units="mg">2400</sodium>
      <carb units="g">300</carb>
      <fiber units="g">25</fiber>
      <protein units="g">50</protein>
    </daily-values>
    <!-- Now list the individual foods -->
    <food>
      <name>Avocado Dip</name>
      <mfr>Sunnydale</mfr>
      <serving units="g">29</serving>
      <calories total="110" fat="100"/>
      <total-fat>11</total-fat>
      <saturated-fat>3</saturated-fat>
      <cholesterol>5</cholesterol>
      <sodium>210</sodium>
      <carb>2</carb>
      <fiber>0</fiber>
      <protein>1</protein>
      <vitamins>
        <a>0</a>
        <c>0</c>
      </vitamins>
      <minerals>
        <ca>0</ca>
        <fe>0</fe>
      </minerals>
    </food>
    <!-- etc. -->
    </nutrition>
You may see the entire document
that is used for the examples in this article. All the numbers
are real; only the manufacturers' names have been changed
to protect the innocent and avoid lawsuits.
A quick note: vitamins and minerals are measured in percentages, not
grams or milligrams. That's why we don't need to establish
any units or maximums for them in the <daily-values>
element.
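Because the units are stated only once, in <daily-values>, a program reading the file can look them up per category instead of expecting them on every food. The following is a minimal sketch of that lookup using Python's standard xml.etree.ElementTree module; the abbreviated SAMPLE string and the read_units helper are illustrative assumptions, not part of the article's files.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample in the article's nutrition markup (two categories only).
SAMPLE = """\
<nutrition>
  <daily-values>
    <total-fat units="g">65</total-fat>
    <sodium units="mg">2400</sodium>
  </daily-values>
  <food>
    <name>Avocado Dip</name>
    <total-fat>11</total-fat>
    <sodium>210</sodium>
  </food>
</nutrition>
"""


def read_units(root):
    """Map each category in <daily-values> to its units attribute."""
    return {el.tag: el.get("units") for el in root.find("daily-values")}


root = ET.fromstring(SAMPLE)
units = read_units(root)

# Print each food's amounts with the units borrowed from <daily-values>.
for food in root.findall("food"):
    for el in food:
        if el.tag in units:
            print(food.findtext("name"), el.tag, el.text.strip(), units[el.tag])
```

Elements such as <name>, which have no entry in <daily-values>, are simply skipped by the lookup.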
I entered the data by hand using the nedit program on
Linux. I could have used any editor that lets me save files
as plain ASCII text; notepad on Windows or vi on Linux would have done
equally well. To make data entry easier, I created an empty
"template" for a food, which you see at the bottom of the
file. I copied and pasted it for each new food, so that I didn't
have to type the tags over and over again.
Immediate Benefits
What have we bought by creating this XML file in a text
editor rather than creating an HTML document or a spreadsheet or
database? First, the data is structured; it's not just a mass of
numbers in an HTML table or a text file of tab-separated values.
Because of the custom tags, it's something that humans can read
and understand. It's also open; we don't need some
expensive, proprietary software to extract the information from a
binary file. So, as a transport medium, XML already serves us
nicely.
Validating the Document
Even if you're the only person who ever enters
data into the document, you'd like to be able to check that you
haven't left out any information or added extra tags.
Additionally, you'd like to be sure that your percentages are all
between 0 and 100.
This becomes even more important if many people enter data. Even if
you give other folks instructions on the proper format, they may ignore
them or make errors. In short, you would like to have the computer help
you determine that the data in your documents is valid.
You do this by creating a machine-readable grammar which
specifies which tags and attributes are valid, and in what
combinations, and what values your tags and attributes may contain.
You then hand your document and the grammar to a program called a
validator, and it checks that the document matches your
specifications.
One machine-readable form of specifying such a grammar is a notation
called Relax NG. Relax NG is, itself, an XML-based markup
language. Its purpose is to specify what is valid in other
markup languages. This isn't as crazy or impossible as it
sounds. After all, books that tell you how to use English grammar
correctly are also written in English.
For example, one of the specifications of our nutritional markup
language is that the <calories> element is an empty
element, and it has two attributes, the total attribute
and the fat attribute. These must both have decimal
numbers in them. We say this in Relax NG as follows:
    <element name="calories">
      <empty/>
      <attribute name="total">
        <data type="decimal"/>
      </attribute>
      <attribute name="fat">
        <data type="decimal"/>
      </attribute>
    </element>
When we pass nutrition documents through the validator with this
grammar, the validator will tell us that the first tag below
is correct, but the second one isn't.
    <calories total="100" fat="10"/>
    <calories total="217" fat="don't ask!"/>
You may see the entire grammar
specification for the nutrition markup here. You may
also find
out more about Relax NG. By the way,
Relax NG is not the only game in town if you want to specify
grammar. You may use something called a DTD (Document
Type Definition), which is not as powerful
as Relax NG; or you may use XML Schema, which is
about as powerful as Relax NG, but far more complex to learn.
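To make the rule for <calories> concrete, here is a hand-written check in Python that approximates what the grammar fragment asks of a single element: no content, exactly the total and fat attributes, and decimal values in both. It is only a sketch of the idea, not a Relax NG validator; the calories_is_valid function and the DECIMAL pattern (a simplified reading of the decimal datatype) are assumptions made for illustration.

```python
import re
import xml.etree.ElementTree as ET

# Simplified lexical form of a decimal number: optional sign, digits,
# optional fractional part. Exponents and words like "NaN" are rejected.
DECIMAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")


def calories_is_valid(xml_text):
    """Return True if xml_text is an empty <calories> element whose
    total and fat attributes both hold decimal numbers."""
    el = ET.fromstring(xml_text)
    if el.tag != "calories":
        return False
    # The element must be empty: no child elements and no text.
    if len(el) or (el.text or "").strip():
        return False
    # Exactly the two expected attributes, no more and no fewer.
    if set(el.attrib) != {"total", "fat"}:
        return False
    return all(DECIMAL.fullmatch(v.strip()) for v in el.attrib.values())


print(calories_is_valid('<calories total="100" fat="10"/>'))
print(calories_is_valid("""<calories total="217" fat="don't ask!"/>"""))
```

A real validator does this for the whole document from one declarative grammar, which is the point of writing the rules in Relax NG rather than in program code.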
Try it!
If you are feeling adventurous, you may want to try these
files yourself. You will need some XML tools in order to
do this. Here is how to set up the tools
for Windows, and here's the setup for Linux.
To validate a file, go to the command prompt if you are using
Windows, or go to a console window and get a shell prompt if you
are using Linux. Then use the batch/shell file described
in the setup instructions to invoke
the Multi-Schema Validator:
msvalidate nutrition.rng nutrition.xml
Now What?
Although we can enter readable data and check to see if it's
OK, we still can't do anything with it. If we display it
in a browser, we simply see the text all squeezed together. That's
because the browser doesn't know how to display a
<food> or <vitamins> tag.
Displaying the XML
If you are using the very latest browsers, you can
attach a stylesheet to the XML file. We have done that in
this example by putting this line at the top of file
nutrition.xml:
    <?xml version="1.0"?>
    <?xml-stylesheet type="text/css" href="nutrition.css"?>
    <nutrition>
    <!-- etc. -->
    </nutrition>
The style sheet that we write for file nutrition.css
looks very much like the style sheets that you use with your HTML
files. The difference is that we assign styles to our new nutrition
tags, not to the standard HTML tags. For example, to say that a
food's manufacturer should appear in 16 point italic type without
starting a new line, you would write:
    mfr {
      display: inline;
      font-size: 16pt;
      font-style: italic;
    }
Once you have created
the entire stylesheet in the same
directory as the XML file, you can open the XML file in a
modern browser such as Mozilla, and it will display the information.
Transformation: A Better Way
The problems with the stylesheet are that:
- It only works with the very latest browsers that handle
Cascading Style Sheets Level 2.
- It can't extract all the information (for example, the units
don't show up in the output document because they are
"hidden" in the attribute values).
- It can't calculate percentages.
Additionally, the markup we've invented here is data-oriented;
it is designed to describe data to be stored or to be transmitted to
other programs. In these documents, the order of elements and the type
of data in each element is fairly rigid. Stylesheets work better with
narrative-oriented markup documents. These are documents which are
mostly meant for human reading, and are more "free-form"
than data-oriented documents. Examples of narrative-oriented markup are
XHTML, DocBook (a markup for writing books and articles), and NewsML
(for writing news reports).
In order to get around these problems, we can use XSLT,
Extensible Stylesheet Language Transformations, to convert the
nutrition file into other forms. XSLT is, again, another XML-based
markup language. Its purpose is to describe how to take input from one
XML file (the "source document") and output it to a result
document. XSLT has the flexibility to extract information from attributes as
well as element content, and it can do calculation and sorting upon the
data in the source document.
This power makes XSLT a key technology in the XML family of
technologies. For a good introduction, read
Norman Walsh's excellent presentation on the subject or
this
hands-on tutorial.
Transformation to HTML
The first
XSLT file, which you may see here, converts the nutrition document
into a very plain HTML file suitable for display on any browser on a
desktop or PDA. To do the transformation, you'd type this
command:
transform nutrition.xml nutrition_plain.xslt nutrition_plain.html
The result of the transformation is an HTML file named nutrition_plain.html,
which you may open in any browser you like. Even this simple
transformation has done two things that we could not do with CSS:
it uses the data in attributes to display the units for each
nutritional category, and it calculates percentages of the daily
values.
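The two things CSS could not do, reading units from attributes and computing a percentage of the daily value, amount to a small calculation. The sketch below shows that calculation in Python rather than XSLT, so it is not the article's stylesheet; the abbreviated SAMPLE, the percent_daily_values helper, and the choice to round to the nearest whole percent are all assumptions.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample in the article's nutrition markup.
SAMPLE = """\
<nutrition>
  <daily-values>
    <total-fat units="g">65</total-fat>
    <sodium units="mg">2400</sodium>
  </daily-values>
  <food>
    <name>Avocado Dip</name>
    <total-fat>11</total-fat>
    <sodium>210</sodium>
  </food>
</nutrition>
"""


def percent_daily_values(root, food):
    """Return (category, amount, units, percent) for each category of
    one food that also appears in <daily-values>."""
    daily = {el.tag: el for el in root.find("daily-values")}
    rows = []
    for el in food:
        if el.tag in daily:
            amount = float(el.text)
            maximum = float(daily[el.tag].text)
            units = daily[el.tag].get("units")  # units come from the attribute
            rows.append((el.tag, amount, units, round(100 * amount / maximum)))
    return rows


root = ET.fromstring(SAMPLE)
for category, amount, units, percent in percent_daily_values(root, root.find("food")):
    print(f"{category}: {amount:g} {units} ({percent}% of daily value)")
```

For the avocado dip, 11 g of fat against a 65 g maximum works out to about 17%, and 210 mg of sodium against 2400 mg to about 9%.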
Fancy Transformation
OK, so maybe you want something a bit fancier. Here's a more complex
transformation which sorts the data by the ratio of fat calories to
total calories per serving; sort of a "healthiness
index."
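The sort key is simply the fat attribute of <calories> divided by its total attribute. Here is a minimal Python sketch of that ordering, shown outside XSLT for clarity. Only the Avocado Dip figures come from the sample document; the two "Example" foods are invented for illustration, and sorting in ascending order is an assumption about what the fancy stylesheet does.

```python
import xml.etree.ElementTree as ET

# "Example Crackers" and "Example Soup" are made-up entries for illustration.
SAMPLE = """\
<nutrition>
  <food><name>Avocado Dip</name><calories total="110" fat="100"/></food>
  <food><name>Example Crackers</name><calories total="120" fat="30"/></food>
  <food><name>Example Soup</name><calories total="90" fat="0"/></food>
</nutrition>
"""


def fat_ratio(food):
    """Fraction of a food's calories that come from fat."""
    calories = food.find("calories")
    total = float(calories.get("total"))
    fat = float(calories.get("fat"))
    return fat / total if total else 0.0  # guard against a zero total


def names_by_fat_ratio(root):
    """Food names ordered from lowest to highest fat-calorie ratio."""
    return [f.findtext("name") for f in sorted(root.findall("food"), key=fat_ratio)]


root = ET.fromstring(SAMPLE)
print(names_by_fat_ratio(root))
```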
If you have saved the XSLT in a file called
nutrition_fancy.xslt you can type this command:
transform nutrition.xml nutrition_fancy.xslt nutrition_fancy.html
That produces a file named
nutrition_fancy.html,
which looks remarkably different from the plain version. It uses
Cascading Style Sheets to produce the little bar graphs; you'll
need a modern browser like Internet Explorer 5+ or Mozilla/Netscape 6
to see the effect. Notice that XSLT lets you pick and choose the data
you want to display; the information about carbohydrates, fiber,
vitamins, and minerals is omitted in the fancy version. (It
could, of course, be added by changing the XSLT file.)
We have used XSLT to take the source XML file and transform
it to two different HTML files: a plain version that is suitable for
display on old browsers and PDAs, and a fancier version that is
suitable for use with desktop computers and modern browsers.
Non-HTML Transformation
But wait, maybe you don't want HTML;
there's more than just browsers in the world, you know. You might
want to take the data and convert it to a text file of tab-separated
values for import into a spreadsheet or database program.
Here is a
transformation file that does this, using this command:
transform nutrition.xml nutrition_csv.xslt nutrition.csv
And here's the resulting text
file.
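The shape of such an export is one header row followed by one tab-separated row per food. The sketch below produces that shape with Python's csv module instead of XSLT; the COLUMNS list and the to_tsv helper are assumptions, and the column selection in the article's own transformation file may differ.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Abbreviated sample in the article's nutrition markup.
SAMPLE = """\
<nutrition>
  <food>
    <name>Avocado Dip</name>
    <mfr>Sunnydale</mfr>
    <total-fat>11</total-fat>
    <sodium>210</sodium>
  </food>
</nutrition>
"""

# Which child elements to export, in column order (an illustrative choice).
COLUMNS = ["name", "mfr", "total-fat", "sodium"]


def to_tsv(root):
    """Return the foods as tab-separated text with a header row."""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)
    for food in root.findall("food"):
        # A missing element becomes an empty cell.
        writer.writerow([(food.findtext(col) or "").strip() for col in COLUMNS])
    return out.getvalue()


root = ET.fromstring(SAMPLE)
print(to_tsv(root), end="")
```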
Conversion to Print
Let's say you want to create a PDF file from your
XML. That's possible by using a transformation to change the XML to
another markup language: XSL-FO (Extensible Stylesheet
Language – Formatting Objects). This is a page layout language. A tool
called FOP (Formatting Objects to PDF) takes that markup and
creates PDF files for you.
Here is a transformation file which
takes the nutrition data and converts it to formatting objects. If
you save it in nutrition_fo.xslt, you can use FOP to do
the conversion to PDF:
fop -xml nutrition.xml -xsl nutrition_fo.xslt -pdf nutrition.pdf
The result is a PDF file; it
produces pages that are approximately 8 centimeters wide and 9
centimeters high, which fit comfortably into a shirt pocket.
Generating Graphics
Finally, you may wish to create an interactive, graphic
version of the data. Another XML-based markup,
SVG (Scalable Vector Graphics), gives you this
capability. SVG has elements like the following, which draw a black
diagonal line and a yellow circle with a green outline:
    <line x1="0" y1="0" x2="50" y2="50"
      style="stroke: black;"/>
    <circle cx="100" cy="100" r="30"
      style="fill: yellow; stroke: green;"/>
By using a transformation file that
produces SVG, we can construct a graphic that shows a bar graph for
the food whose name you click. Here's what you type:
transform nutrition.xml nutrition_svg.xslt nutrition.svg
You may display the result with the SVG browser that is part of the
Batik toolkit. If you have installed Batik as per the instructions
given for Linux or for Windows, you type
batik nutrition.svg. I have not tested the file with
the latest version of the Adobe SVG
Viewer, but it should work nicely. Here is a screenshot;
click it to see it full size.
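Since SVG is itself XML, a bar graph is just a series of <rect> elements whose widths are proportional to the values being shown. The following is a rough Python sketch of generating such bars; it leaves out the click interaction entirely, and the bar_chart_svg function, its bar_height and scale parameters, and the colors are assumptions rather than anything taken from the article's SVG transformation.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def bar_chart_svg(percentages, bar_height=12, scale=2):
    """Build an SVG document with one horizontal bar per (label, percent)
    pair; each bar is percent * scale units wide."""
    svg = ET.Element("svg", {
        "xmlns": SVG_NS,
        "width": str(100 * scale),
        "height": str(bar_height * len(percentages)),
    })
    for i, (label, percent) in enumerate(percentages):
        rect = ET.SubElement(svg, "rect", {
            "x": "0",
            "y": str(i * bar_height),
            "width": str(percent * scale),
            "height": str(bar_height - 2),  # leave a small gap between bars
            "style": "fill: yellow; stroke: green;",
        })
        # A <title> child carries the label for each bar.
        ET.SubElement(rect, "title").text = label
    return ET.tostring(svg, encoding="unicode")


print(bar_chart_svg([("total-fat", 17), ("sodium", 9)]))
```

Saving the printed text to a file with an .svg extension gives something an SVG viewer can open.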
Other Ways to Use the XML Tools
In this article, we've used the Multi-Schema
Validator, Xalan Transformer, FOP converter, and Batik viewer from the
command prompt. That's the fastest and easiest way to get things
working so that you can get a feel for what XML can do.
The batch or shell file approach would work in a production
environment where you generate a whole website's worth of HTML
files from one or more XML files at regular intervals. You just
set up a batch job to run at scheduled times (a cron job
in Unix terms) to generate the files you need.
What if you need to generate HTML pages or PDF files dynamically
in response to user requests? Obviously, you don't want the overhead
of starting a Java process every time a request comes in, and a
static batch file certainly won't do the trick. Both the
Multi-Schema Validator and Xalan have an API (Application Program
Interface) and can thus become part of a Java servlet running on
your server and handling dynamic user requests. Once a servlet is
loaded, it stays in memory, so there is no extra overhead for
subsequent uses of a transformation.
If you are interested in running servlets, one option is to use the
Jakarta Tomcat servlet container. It can run as a stand-alone server for testing or as a
module for either Apache or Microsoft IIS.
Timing
There are two aspects to timing: how long it takes to
write the grammars and transformations, and how fast they run.
Designing the markup
language took me about 25 minutes, and entering the data took me
another 25 minutes, some of it running out to the kitchen to grab items
from the shelf or refrigerator. Writing and testing the Relax NG
grammar required 30 minutes.
The Cascading Style Sheet for displaying the XML directly in Mozilla
took all of 15 minutes to write. The "plain HTML"
transformation took about 50 minutes, including time for looking up
some XSLT constructs and doing some experimentation. The
"fancy" transformation took 45 minutes. I needed 20 minutes
to figure out how to do the bar graphs with stylesheets in the first
place, and I used another 5 minutes for minor aesthetic adjustments.
The file for conversion to tab-separated values was a 15-minute
task.
The transformation for PDF took an hour. The first time through, I
designed it for paper the size of a compact disc insert. I thought
better of it, and decided to reduce it to shirt-pocket size. That took
another 30 to 45 minutes of tweaking and getting the font sizes just
the way I wanted them. I also had to make some changes to avoid using
parts of XSL Formatting Objects that FOP does not implement yet.
Finally, the SVG transformation took an hour and a half to write.
About half that time was experimenting to get everything positioned
nicely and making the ECMAScript interaction work properly.
You don't have to be an expert at Relax NG, XSLT, XSL
Formatting Objects, or SVG to do this. I don't use any of these
technologies on a daily basis. I just know enough about each of them
to get things to work. In this case, my philosophy was "the first
way you think of that works is the right way." That is why XSLT
experts will be shocked when they see an inefficient construct like
this in the plain HTML transform file:
    select="/nutrition/daily-values/*[name(.)=name($node)]/@units"
This is not to say that there is no learning involved here; you will
need to spend some time on that. You don't need to spend a
lifetime on it, though. It is definitely possible to learn enough about
these technologies to put them to effective use in a short time.
Performance
I tested all of these files on a 400MHz AMD K-6 with
128MB of memory running SuSE Linux
7.2. For the transformations, I modified the SimpleTransform.java
sample program that comes with Xalan. This program records the total
time to generate the output and the time involved in transformation
after the XSLT file has been parsed. If you are running transformations
on a server, you can cache the parsed XSLT file, so the overhead for
parsing occurs only once.
| Transformation | Total time (seconds) | Transform time (seconds) |
|---|---|---|
| Plain HTML | 3.691 | 1.018 |
| Fancy HTML | 4.057 | 1.409 |
| Tab-separated Values | 3.057 | 0.548 |
| SVG | 3.386 | 0.689 |
I measured the time for the PDF transformation with the
Linux time command. Generating the file took
15.115 seconds real time, with 10.920 seconds of user CPU time.
Of course, these are not the only tools available. There are other
XSLT processors and other programs for converting XSL Formatting
Objects to PDF. I chose MSV, Xalan, FOP, and Batik because they are
free, easy to use, and I was already familiar with them.
Summary
- Using XML-based markup gives your document structure,
and makes it readable and open.
- XML is part of a family of technologies.
- You can use grammar markup languages like Relax NG
or XML Schema to validate
your documents.
- You can use XSLT transformations to repurpose a document.
A single document can serve as the source for XHTML, plain text,
PDF, or other XML markup languages like SVG.
- Programs which do validation and transformation are freely
available and easy to use.
These capabilities exist right now, and they are easy to learn and
use. That is why XML is good, and why people are so excited about
it once they start to use it.
You may download the
XML files and the resulting HTML, text, and PDF files.
Source: https://alistapart.com/article/usingxml/